Re: why Rust references are not pointers


#1

Continuing the discussion from Bikeshed: Rename `catch` blocks to `fallible` blocks:

In C/C++ there’s a clear distinction in the syntax between passing things by reference (pointer) and by value, but there’s no distinction between owned and borrowed values.

In Rust it’s the opposite. The syntax exists to separate borrowed vs owned, but the distinction between values and pointers is vague, and handled entirely differently (& may be a 2-usize struct, but Box is just a raw pointer, usually).

C is super explicit about dereferencing, but the concept of lifetimes is left entirely up to the programmer’s imagination.
OTOH Rust is explicit about borrows and lifetimes, but dereferencing is often hidden.

So given all that, if you think Rust reference == C pointer, you’ll be looking at them from completely the wrong perspective.

For example, in C it’s common to return objects by pointer, because that’s how malloc works, and there’s no other way to have a private type than via an opaque struct pointer. But in Rust, if you try to allocate something and return a reference to it, the borrow checker will tell you it doesn’t make any sense.
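A minimal sketch of that difference, assuming a made-up `Config` type (it is hypothetical, not from any real API). The reference-returning version is left in a comment because it does not compile; the two functions after it show what one would probably write instead.

```rust
// Hypothetical type, used only for illustration.
#[derive(Debug, PartialEq)]
struct Config {
    retries: u32,
}

// Rejected by the compiler: the local `c` is dropped when the function
// returns, so there would be nothing left for the reference to point at.
//
// fn make_config_ref() -> &Config {
//     let c = Config { retries: 3 };
//     &c
// }

// Return the value itself; ownership moves to the caller.
fn make_config() -> Config {
    Config { retries: 3 }
}

// If a heap allocation is actually wanted, return the owning pointer.
fn make_boxed_config() -> Box<Config> {
    Box::new(Config { retries: 3 })
}

fn main() {
    let a = make_config();
    let b = make_boxed_config();
    assert_eq!(a, *b);
    println!("{:?} {:?}", a, b);
}
```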

In C it’s usual to put pointers in structs. In Rust, references in structs are a special case of limited usefulness.
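To illustrate why they are a special case, here is a small sketch with two hypothetical types, `OwnedName` and `NameView`. The struct holding a reference has to carry a lifetime parameter, and it can only exist while the value it borrows from is alive.

```rust
// Hypothetical types, used only for illustration.
struct OwnedName {
    name: String,
}

// Holding a reference ties the struct to a lifetime 'a: a NameView
// cannot outlive the String it borrows from.
struct NameView<'a> {
    name: &'a str,
}

// Builds a temporary view that borrows from `owner`.
fn view_of(owner: &OwnedName) -> NameView<'_> {
    NameView { name: &owner.name }
}

fn main() {
    let owner = OwnedName { name: String::from("ferris") };
    let view = view_of(&owner);
    // `owner` must stay alive and unmodified for as long as `view` is used.
    assert_eq!(view.name, "ferris");
}
```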

So I think it’s better to think of Rust’s references as temporary read/write locks that are implemented at compile time. That fixes thinking from “I want to return this by pointer” to “why would I give a temporary read-only lock to the object, rather than the object itself?”
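Roughly what the lock analogy looks like in code. This is only a sketch; `total` and `push_twice` are made-up helper names.

```rust
// Takes a shared borrow: a "read lock" held for the duration of the call.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

// Takes an exclusive borrow: a "write lock" held for the duration of the call.
fn push_twice(values: &mut Vec<i32>, x: i32) {
    values.push(x);
    values.push(x);
}

fn main() {
    let mut v = vec![1, 2, 3];

    // Any number of shared borrows may coexist.
    let a = &v;
    let b = &v;
    assert_eq!(total(a) + total(b), 12);

    // The exclusive borrow is allowed here because `a` and `b` are not used again.
    push_twice(&mut v, 4);
    assert_eq!(total(&v), 14);

    // Rejected by the compiler: a shared borrow still in use across a mutation.
    // let first = &v[0];
    // v.push(5);
    // println!("{}", first);
}
```

Unlike a runtime lock, nothing blocks or panics here: a conflicting "lock" is simply a compile error.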


#2

I don’t have time for a detailed response atm, but I do want to emphasize that I was comparing Rust references to C++ references, not to C pointers (or C++/Rust raw pointers). For me at least, most of what you’re saying is also an argument “why C++ references are not pointers”.

But totally independently of that, I think the terminology point you appear to be arguing for has some merit, and I’ve even seen people like niko express similar views. I just think that the terminology arguments to have are, say, whether a “shared reference” should be called a “readonly reference”, or whether a “mutable reference” should be called an “exclusive reference”; I don’t see any compelling alternatives to the “reference” part.


#3

I wouldn’t group C++ in with C here.

  • C++ distinguishes between owned (T) and borrowed (T & or const T&) values.
  • It is not at all common to return objects by pointer in modern C++. Easiest way to return something is by value (copy constructor, usually with the hope of the compiler performing copy elision); preferred way in modern code is through move constructors.
  • Putting raw pointers in structs in C++ is a special case of limited usefulness. (edit: well, perhaps not exactly, since in C++ there’s a lot you can do with a pointer. But most of it is dangerous, and that’s why ref-counted pointers are often preferred where possible)

#4

The way we explain these concepts in the book is that “pointer” is the most general idea, and “references” are a specific kind of pointer, one with more guarantees.


#5

That’s how I tend to think of them, and how I have always understood them (and I come from a C/Java background)…


#6

The thing I struggled with was that Box is a pointer, too. Because it doesn’t have a pointer-like syntax sigil it was hard for me to internalise that, and I severely overused references in the beginning.


#7

Out of curiosity, do you come from a C++ background? C++ has smart pointers too (e.g. std::shared_ptr and std::unique_ptr), though I’m not sure it has anything like Box. I ask because I imagine that people coming from different backgrounds would struggle with different aspects of references…


#8

I was always under the impression that Box<T> is analogous to std::unique_ptr<T>. They both express unique ownership of a T in the heap, and they both support move semantics and RAII cleanup. The biggest difference I know of is that Box<T> is not quite implementable without magic, but that’s only because of the “DerefMove” problem.
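A small sketch of the two properties mentioned, move semantics and RAII cleanup, using a hypothetical `Tracked` type that records when it is dropped:

```rust
use std::cell::Cell;

// Hypothetical type that flips a flag when it is dropped.
struct Tracked<'a> {
    dropped: &'a Cell<bool>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

// Taking the Box by value moves ownership in; the value is dropped and the
// heap allocation freed when `_b` goes out of scope at the end of the function.
fn consume(_b: Box<Tracked<'_>>) {}

fn main() {
    let flag = Cell::new(false);
    let boxed = Box::new(Tracked { dropped: &flag });
    assert!(!flag.get());

    consume(boxed); // ownership moves, much like std::move on a unique_ptr
    assert!(flag.get());

    // `boxed` can no longer be used here; the compiler rejects any access,
    // whereas a moved-from unique_ptr is left holding null.
}
```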


#9

I’ve written much more plain C than C++, so to me Box is a fancy malloc(), and I’m still baffled that my structs and the rest of the program cares whether something came from “malloc” or not.


#10

For a sized T, &T and Box<T> and *mut T have exactly the same in-memory representations. Box<str> and Box<[u8]> and Box<Trait> are fat pointers, just like &str and &[u8] and &Trait.
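One quick way to check the size part of this is `std::mem::size_of`. The sketch below only compares sizes, not the full representation; `thin_sizes` and `fat_sizes` are made-up helper names, and `Box<dyn Debug>` stands in for `Box<Trait>`.

```rust
use std::fmt::Debug;
use std::mem::size_of;

// Pointers to a sized type: each one is a single machine word.
fn thin_sizes() -> [usize; 3] {
    [
        size_of::<&u64>(),
        size_of::<Box<u64>>(),
        size_of::<*mut u64>(),
    ]
}

// Pointers to unsized types: a data pointer plus a length or a vtable pointer.
fn fat_sizes() -> [usize; 4] {
    [
        size_of::<&str>(),
        size_of::<Box<str>>(),
        size_of::<&[u8]>(),
        size_of::<Box<dyn Debug>>(),
    ]
}

fn main() {
    let word = size_of::<usize>();
    assert_eq!(thin_sizes(), [word; 3]);
    assert_eq!(fat_sizes(), [2 * word; 4]);
    println!("thin: {:?}, fat: {:?}", thin_sizes(), fat_sizes());
}
```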

The real difference is the ownership and borrowing semantics:

  • Box<T>: unique owning pointer
  • Rc<T>: shared owning pointer
  • &mut T: unique borrowed pointer
  • &T: shared borrowed pointer
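
The four kinds in the list above can be sketched in a few lines. The function names are made up for illustration.

```rust
use std::rc::Rc;

// Box<T>: unique owning pointer. Passing it by value hands over ownership.
fn unbox(b: Box<i32>) -> i32 {
    *b
}

// Rc<T>: shared owning pointer. Each clone is another owner of the same value.
fn owner_count(rc: &Rc<i32>) -> usize {
    Rc::strong_count(rc)
}

// &mut T: unique borrowed pointer. The caller keeps ownership.
fn bump(n: &mut i32) {
    *n += 1;
}

// &T: shared borrowed pointer. Several may point at the same value at once.
fn sum(a: &i32, b: &i32) -> i32 {
    *a + *b
}

fn main() {
    let boxed = Box::new(1);
    assert_eq!(unbox(boxed), 1); // `boxed` is moved and unusable afterwards

    let first = Rc::new(2);
    let second = Rc::clone(&first);
    assert_eq!(owner_count(&first), 2);
    drop(second);
    assert_eq!(owner_count(&first), 1);

    let mut n = 3;
    bump(&mut n);
    assert_eq!(sum(&n, &n), 8);
}
```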

#11

Nice way to put it. I’d think this statement alone would clarify a lot for most people.


#12

I like the term borrow, even if it is quite specific to Rust.