'ref' is a counter-intuitive keyword


#1

I’m not sure if this has been brought up before but using ‘ref’ in pattern matching can be quite counter-intuitive. I believe a more apt way of expressing reference binding should consist of something that has the opposite meaning to ‘ref’. Let me demonstrate:

let a = x; // Match a to x. No problem - understood.

let b = &x; // Match b to reference of x. No problem - understood.

let ref c = x; // Match reference of c to x? Hmm, if we destructure this, x is a reference to some value. Thus c should refer to that value right? WRONG.

let &c = x; // This is the form that actually has that meaning: x is a reference, and c binds the value behind it.
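For reference, here is a minimal sketch of the four forms side by side in current Rust, with the resulting types written out. The function name `four_bindings` and the extra reference `y` are only there for illustration.

```rust
// Returns the value reached through each of the four bindings.
fn four_bindings() -> (i32, i32, i32, i32) {
    let x: i32 = 5;
    let y: &i32 = &x;

    let a = x;     // a: i32  (i32 is Copy, so x is copied)
    let b = &x;    // b: &i32 (the expression on the right creates the reference)
    let ref c = x; // c: &i32 (the binding borrows x; same result as `let c = &x;`)
    let &d = y;    // d: i32  (the pattern strips the reference off y)

    // `*b` and `*c` only compile because b and c are references.
    (a, *b, *c, d)
}
```

Note that `let ref c = x;` adds a reference on the left-hand side, while `let &d = y;` removes one.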

It is too easy for a casual reader who is not very familiar with Rust to confuse ref and &. Even after an explanation of what ref does, it can still leave quite some dissonance. I would consider this a papercut. Now consider a ‘deref’ keyword instead:

let deref c = x; // Match the dereference of c to x. If we destructure this, x is a dereference of some reference. Thus c refers to that reference!

Doesn’t the above feel better to you? But there is a problem with using ‘deref’ like that. Rust has already established a convention that prefixing keywords do not take part in pattern matching, but are rather a property of the bound name. For example:

let mut a = b; // mut is a property of the bound name a, not that b is a mut value for some value which a binds to.

let ref c = x; // ref technically follows this convention, but as I mentioned, reading this can be a papercut to understanding Rust
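A small sketch of that convention, assuming nothing beyond the standard library (the function name `binding_modifiers` is hypothetical): both `mut` and `ref` say something about the name on the left, not about the value on the right.

```rust
fn binding_modifiers() -> (i32, i32) {
    let b = 1;
    let mut a = b; // `mut` describes the binding `a`; `b` itself stays immutable
    a += 1;

    let x = 10;
    let ref c = x;            // `ref` describes the binding `c`: it is bound by reference
    let same_as_c: &i32 = &x; // the more common spelling of the same thing

    // Both references point at the same place, namely `x`.
    assert!(std::ptr::eq(c, same_as_c));

    (a, *c)
}
```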

My proposal would be to use the following:

let *c = x; // * is already the dereference operator

let *mut c = x; // Mutable reference

Thoughts?


#2

Absolutely! My favourite part of match ergonomics default binding modes is that it makes ref far less necessary.
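To illustrate, here is a hedged sketch of the same match written with an explicit ref and with default binding modes. The function names `len_explicit` and `len_ergonomic` are made up for this example.

```rust
// Older style: dereference the scrutinee and ask for a by-reference binding.
fn len_explicit(opt: &Option<String>) -> usize {
    match *opt {
        Some(ref s) => s.len(), // s: &String
        None => 0,
    }
}

// Default binding modes: matching a `&Option<String>` against a
// non-reference pattern binds `s` as `&String` automatically.
fn len_ergonomic(opt: &Option<String>) -> usize {
    match opt {
        Some(s) => s.len(), // s: &String, no `ref` needed
        None => 0,
    }
}
```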


#3

Yes, several times. If I remember correctly, it was even discussed with almost the exact syntax you proposed, and it was immediately pointed out how it conflicts with existing pattern / expression syntax (I don’t remember the details but you can search for it).

It might not be immediately intuitive, but my experience was that I became accustomed to it after a couple of days spent writing Rust, and from that point onward it made perfect sense. Yes, Rust is a language that you probably have to put more learning effort into. I don’t think that adding another way to say ref would make things clearer or easier to learn (the opposite, actually).


#4

Yes, the ref problem was brought up many times before.

But & seems wrong to me. There used to be a proposal to use a star on the left side:

let *c = x;

It seems more correct to me, since the types on both sides of the equals sign have to be equivalent. If c is a reference, you have to dereference it to get x.


#5

Oh yeah, plenty of times.

I think I said somewhere before that making sense out of the ref keyword is actually the only part of the Rust language I ever struggled with. Everything else in the language was straightforward to me (at least in its basic use cases; things like the module system only got weird for me when I dove into the details).

But here I have to completely disagree.

The reason ref was hard for me has nothing to do with the choice of keyword. For me, Rust is the first language that both has proper support for sum types including pattern matching, and distinguishes between value and pointer/reference types. That immediately implies that a pattern match must contain not only a pattern to decide if any given value matches successfully or not, but also a variable binding to be created if the match is successful. Before Rust, the closest thing I had to pattern matching was destructuring assignment in JavaScript, but of course JavaScript doesn’t have static types at all, much less separate value and reference types, so the binding part was just a name.

ref was hard for me because I completely missed the fact that a match arm has to contain both a pattern match and variable binding (even if the latter is always partially implicit and often completely implicit). Somehow, we must distinguish between matching on a reference (i.e., what let &x = y; does) and matching on a value to create a binding to a reference (i.e., what let ref x = y; does). Once I finally understood this, ref became easy for me.
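A minimal sketch of the two halves in a match arm, with hypothetical function names (`label_len`, `value_or_zero`): in the first function the pattern matches a value and ref only changes how the name is bound; in the second the & is part of the pattern and matches a reference.

```rust
// Pattern: `Some(_)` against an owned Option<String>.
// Binding: `s` borrows the String, so `label` is not moved by the match.
fn label_len(label: Option<String>) -> (usize, Option<String>) {
    let len = match label {
        Some(ref s) => s.len(), // s: &String
        None => 0,
    };
    (len, label) // still usable here because the match only borrowed it
}

// Pattern: a reference to `Some(_)`.
// Binding: `n` is bound by value (a copy, since i32 is Copy).
fn value_or_zero(value: &Option<i32>) -> i32 {
    match value {
        &Some(n) => n, // n: i32
        &None => 0,
    }
}
```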

It is easy to imagine lots of alternative syntaxes, but none that would really move the needle here. The hard part was internalizing why we need two separate syntaxes here in the first place, and none of the suggestions I’ve seen come close to helping with that.

This actually feels much worse to me, because this explanation implies the binding and the match are entangled and cannot be reasoned about separately. Today’s ref simply means “the binding will be a reference” but has no effect on the match. When you want to figure out what values a pattern will or won’t match against, you simply ignore all the refs. That’s nice and orthogonal.
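As a rough illustration of that orthogonality (the function names are hypothetical), the two functions below differ only in their ref annotations and accept exactly the same values in each arm:

```rust
// Bindings by value.
fn describe_move(v: Option<(i32, i32)>) -> &'static str {
    match v {
        Some((0, _y)) => "on the y axis",
        Some((_x, _y)) => "elsewhere",
        None => "nothing",
    }
}

// Same patterns with `ref` added: the bindings become references,
// but which arm is taken for a given value does not change.
fn describe_ref(v: Option<(i32, i32)>) -> &'static str {
    match v {
        Some((0, ref _y)) => "on the y axis",
        Some((ref _x, ref _y)) => "elsewhere",
        None => "nothing",
    }
}
```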

I can see some parallels between this explanation and the details of how JavaScript destructuring works (e.g., that let [a] = [10]; assigns 10 to a, rather than making a an array), but I think that design for destructuring syntax makes sense precisely because JavaScript has no value vs ref types to worry about (not even with TypeScript added on), so it has significantly less complexity to fit into that crowded space.


And finally, as scott hinted at, part of the motivation for the “match ergonomics” RFC was that it should make ref so exceedingly rare in practice that novices typically won’t need to learn it. I think that’s a far more effective solution to the problem than tweaking the syntax.


#6

#7

To add a bit to the conclusion in that thread, ref x as a binding mode has a nice property that *x as a pattern does not: both forms only make sense applied to identifiers, and not nested patterns, so picking the one analogous to mut avoids any questions around “materializing” new references.

Or in other words, the &x pattern matches against an actual structural part of an object: a pointer and its associated data. A hypothetical *x pattern doesn’t have anything to match against. It’s trying to add structure (a new pointer that doesn’t exist in the object) while other patterns remove it.
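A short sketch of the difference, using made-up function names: the & pattern can wrap an arbitrary nested pattern because it peels off a pointer that really exists, while ref can only be attached to an identifier.

```rust
// `&(a, b)`: the `&` matches the existing reference, and the nested
// tuple pattern then destructures what it points to.
fn sum_pair(pair: &(i32, i32)) -> i32 {
    let &(a, b) = pair; // a: i32, b: i32
    a + b
}

// `ref s`: the binding mode is attached to the identifier `s` only.
// It creates a new reference to the String inside the tuple.
fn borrow_second(pair: &(i32, String)) -> &String {
    let &(_, ref s) = pair; // s: &String
    s
}
```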


#8

I just read ref as “referent” rather than “reference” and call it good, in the few cases where it’s still needed.


#9

The “correct” way to think about ref is as setting the “binding mode” for the pattern. A ref pattern binds by borrow; a ref mut pattern binds by mutable borrow; a hypothetical move pattern would bind by move. Pre-match-ergonomics, move would not have been useful. Now, since matching on something that types as &T will force ref mode, being able to switch back to move for, e.g., a Copy type would actually be useful. I really need to get around to writing the RFC for move patterns…
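A hedged sketch of the binding modes that exist today (function names are hypothetical). There is no move pattern in current Rust, so the last function shows one existing way to get a by-value binding for a Copy type: dereference the scrutinee first.

```rust
// `ref mut` binds by mutable borrow, so the value can be updated in place.
fn bump(counter: &mut Option<i32>) {
    if let Some(ref mut n) = *counter {
        *n += 1; // n: &mut i32
    }
}

// Matching on a `&Option<i32>` switches the default binding mode to `ref`.
fn by_borrow(opt: &Option<i32>) -> i32 {
    match opt {
        Some(n) => *n, // n: &i32, so it has to be dereferenced
        None => 0,
    }
}

// Dereferencing the scrutinee keeps the default mode at "move",
// which for a Copy payload is just a copy.
fn by_copy(opt: &Option<i32>) -> i32 {
    match *opt {
        Some(n) => n, // n: i32
        None => 0,
    }
}
```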