Is `ref` more powerful than match ergonomics?

Try this: :)

#![feature(move_ref_pattern)]

fn main ()
{
    let mut it = (Some(vec![42]), Some(vec![]));
    if let (Some(v), Some(ref mut vs)) = it {
        vs.push(v);
    }
    dbg!(&it.1);
}

This is indeed an example of where ref and ref mut are fundamentally more expressive, as default binding modes won't allow you to do partial moves out of enums.


I personally still use ref and ref mut when I start having to do too many *xs. In such a case, I find ref x to be more readable. Both default binding modes and ref mut? are useful, and have their place in the language.

So to be clear with respect to:

I definitely do not favor removing ref mut? (including because of the additional complexity involved in edition breaks).

4 Likes

I find the new model unintuitive, and generally speaking feel uncharacteristically disappointed by rustc's lack of help in enforcing types when I encounter it. Most commonly, this manifests as a type error reported later in my code about a reference that shouldn't have been a reference, when the more helpful error message from rustc would have been "you forgot a & in your pattern". When I get such an error, I end up having to carefully check the type of what I'm matching against the pattern (since rustc won't do it for me), and insert & and ref into the pattern as needed to get the correct types out.

Prior to this change to rustc, the compiler would immediately flag the actual error I made in the pattern, right where my pattern didn't match the type I was matching. That made it much easier to fix such errors.

This is a regular occurrence for me. The last time this happened was a couple of days ago.

13 Likes

IIRC, the following was discussed a long time ago when the new model was adopted. Rustaceans who prefer the lack of ambiguity of the old model could achieve that with a crate-level lint, were such to be RFC'd and approved. Enough time has passed since adoption of the new model to make it clear that the new model is problematic for at least part of the population. If they haven't "gotten used to it" yet, they are unlikely to do so in the future. Perhaps it's time to reconsider such a lint for those who want it.

7 Likes

I personally think of it as patterns "deconstructing" expressions, which seems less academic than being a "dual" of expressions. But I would be interested to explore this line of thinking further. Forget compatibility for a moment, and I'll try to suspend my intuition based on current patterns. What could patterns look like, if they mimicked expressions rather than deconstructing expressions? (Bear in mind that expressions themselves already deconstruct expressions, if you think about * and & in an expression context.)

For the record, I'm not opposed to the concept of a different pattern-matching model that more people find intuitive. I understand that some people found ref and ref mut unintuitive. I just don't want a syntax in which a common error (such as forgetting a &) is something rustc interprets as valid but has the wrong semantics, and then results in an error later that doesn't directly point to the source of the problem.

4 Likes

To avoid confusion with drop (pattern matching doesn't necessarily invoke the inverse of a constructor call), I've more often seen the term "destructuring" used, and I think it fits better in this context.

Thanks, that's the right term to use; "destructuring" is much more accurate.

This should "just" be a diagnostics issue. Would it resolve some of this concern if rustc, in addition to suggesting a *dereference of a name, went back to the pattern the name was defined at and said "reference introduced here" or similar?

Sketching it out with a trivial example:

fn example(x: &Option<S>) {
    match x {
        Some(val) => drop::<S>(val),
        None => (),
    }
}

error[E0308]: mismatched types
 --> src/lib.rs:6:32
  |
6 |         Some(val) => drop::<S>(val),
  |         ^^^^^^^^^              ^^^
  |         |                      |
  |         |                      expected struct `S`, found `&S`
  |         |                      help: consider dereferencing the borrow: `*val`
  |         note: borrow introduced here because pattern has type `&std::option::Option<crate::S>`

And also potentially making this work:

   |        help: consider destructuring the borrow here: `Some(&val)`

Edit: filed on the issue tracker

3 Likes

It occurred to me that I should try to give a minimal example of the kind of issue I regularly run into. So, here's a reduced version of what I encountered the other day.

Sample code (also available on play.rust-lang.org):

use std::collections::BTreeMap;

enum Thing {
    Variant1,
    Variant2(u32),
}

struct ComplexStructure {
    things: BTreeMap<u32, Thing>,
}

impl ComplexStructure {
    fn from_thin_air() -> Self {
        let mut things = BTreeMap::new();
        things.insert(11, Thing::Variant1);
        things.insert(22, Thing::Variant2(42));
        Self { things }
    }
    
    fn process_thing(&self, key: u32) {
        let thing = self.things.get(&key).expect("imagine .with_context(|| ...)? here");
        match thing {
            Thing::Variant1 => self.process_variant1(),
            Thing::Variant2(v) => self.process_variant2(v),
        }
    }

    fn process_variant1(&self) {
        println!("processing variant1");
    }

    fn process_variant2(&self, v: u32) {
        println!("processing variant2: {}", v);
    }

}

fn main() {
    let complex_structure = ComplexStructure::from_thin_air();
    complex_structure.process_thing(11);
}

Compiling this gives the following error:

error[E0308]: mismatched types
  --> src/main.rs:24:57
   |
24 |             Thing::Variant2(v) => self.process_variant2(v),
   |                                                         ^
   |                                                         |
   |                                                         expected `u32`, found `&u32`
   |                                                         help: consider dereferencing the borrow: `*v`

error: aborting due to previous error

I've had to learn, as a result of the new match behavior, to treat this kind of error as "I probably got a pattern wrong earlier". And indeed, the root cause of the error is "oh, right, .get returns Option<&V> not Option<V>".

One fix (which I tend to favor) would be:

        match thing {
            &Thing::Variant1 => self.process_variant1(),
            &Thing::Variant2(v) => self.process_variant2(v),
        }

Another fix, also valid but for some reason not what my intuition tends to reach for:

        match *thing {
            Thing::Variant1 => self.process_variant1(),
            Thing::Variant2(v) => self.process_variant2(v),
        }

rustc doesn't point to either of those. Instead, it points to a symptom, rather than the root cause.

EDIT:

It would help if rustc stopped suggesting a *dereference at the point it detects the error, and went back to identifying a type mismatch in the pattern, which it used to do. I have trouble thinking of any situation in which I would want to patch it up later; I want to fix the root cause of the problem.

EDIT 2: Yes, your subsequent addition of another suggested fix would be an improvement. That would at least make me feel more like rustc was helping me identify and solve the problem. It's still not ideal, but it seems like a net improvement.

4 Likes

This. Sadly, my corresponding feature request in Clippy was quickly dismissed with the rather arrogant line that "we're not going to lint a feature away" (which is not even consistent, because the whole point of a linter is to warn about potentially dangerous features).

2 Likes

Where did we get to with types in patterns, again?

I'm imagining an arm like Some(x: i32) or Some(x: &_) so that one has the option to over-specify if one would like, and thus not get the DWIM behaviour.

Still stuck on RFC: Generalized Type Ascription by Centril · Pull Request #2522 · rust-lang/rfcs · GitHub. I would be happy to split out an RFC that is purely about type ascription on patterns, as long as we do it generally (Pat ::= Pat ":" Type | ... ;, not some special case), but I'm not sure if consensus is achievable there?

(Note: The way I would implement x: i32 is to have PatKind::Type(Pat<'hir>, Ty<'hir>) use the AdjustMode::Pass, which means that the expected type and default binding mode given to type checking x: i32 are not altered before checking the pattern, and then both the type and the pattern are checked against expected: Ty<'tcx> (modulo details re. subtyping). From this it follows automatically due to recursion that the default-binding-mode algorithm will still peel reference types off e.g. Some(_) patterns.)

1 Like

One basic case that doesn't work without ref is when you want to move in one arm and borrow in another arm:

enum Either {
    A(Vec<u8>),
    B(Vec<u16>)
}

struct S {
    kind: Either,
}

fn x(s: S) {
    let mut option1 = None::<Vec<u8>>;
    let mut option2 = None::<S>;
    match s.kind {
        Either::A(data) => {
            option1 = Some(data);
        }
        Either::B(ref data) => {
            if data.contains(&0) {
                option2 = Some(s);
            }
        }
    }
}

Playground

Without ref, match s.kind doesn't work for the second arm, and match &s.kind doesn't work for the first arm.

3 Likes

I've got that error too a few times. But overall, I think it's an acceptable price to pay for not getting many other errors about missing ref or using Enum::Foo instead of &Enum::Foo and such. If I need a reference, I can match on a reference and the rest generally just works. Previously, it used to be a whack-a-mole of randomly adding & and ref until compiler stopped complaining.

I'm not programming to get the types right. It's the opposite: types exist to help me write correct programs. If the compiler can achieve the same result without quizzing me about type details, that's better.

9 Likes

It can't always guess correctly though, especially not in the context of unsafe code. There is an important distinction between values and pointers (references) that is lost.

Consider for example what happens if you have a C API that works with pointers (possibly of mixed levels, so also pointers-to-pointers), and it wants to be generic at the same time. This means that the pointers passed to some functions will need to be cast to/from *const c_void and the like. For example, the Core Foundation framework on macOS employs this technique extensively.

This playground demonstrates that instead of the correct pointer-to-key, functions get passed a pointer-to-pointer-to-key, which would lead to memory corruption given their real implementation.

This is possible because default binding modes kick in around the initialization of the for loop. The same happens in the match, although that doesn't cause the incorrect conversion, but it hides the hint that action is also a reference.

Note that the double casting would be unnecessary if key was just *const CFString, however, it's needed very often with other pointer-to-pointer APIs, and people do it all the time even if/when it's bad practice.
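The linked playground isn't reproduced here, so the following is only a minimal sketch of the kind of mistake being described, under stated assumptions: address_received is a hypothetical stand-in for a generic C function taking an untyped pointer, and u32 stands in for the real key type. It shows how a tuple pattern in a for loop over a reference binds key as a reference to the pointer, so that a double cast silently produces a pointer-to-pointer.

```rust
use std::ffi::c_void;

/// Hypothetical stand-in for a generic C function that receives an untyped
/// pointer. It only reports the address it was handed.
fn address_received(ptr: *const c_void) -> usize {
    ptr as usize
}

/// For each entry, returns the address passed when the key is dereferenced
/// first, paired with the address passed when the binding is cast directly.
fn compare_casts(entries: &[(*const u32, bool)]) -> Vec<(usize, usize)> {
    let mut seen = Vec::new();
    // Iterating over a reference yields `&(*const u32, bool)`. The tuple
    // pattern has no `&`, so default binding modes bind `key: &*const u32`
    // and `_flag: &bool`.
    for (key, _flag) in entries {
        // Intended: pass the key pointer itself.
        let intended = *key as *const c_void;
        // Accidental: the reference to the tuple field is cast, which gives
        // a pointer-to-pointer disguised as `*const c_void`.
        let accidental = key as *const *const u32 as *const c_void;
        seen.push((address_received(intended), address_received(accidental)));
    }
    seen
}
```

Both casts compile without any warning; only the first one hands the callee the address of the key itself. With an explicit pattern such as &(key, _flag), key would be a *const u32 and the accidental cast would not type-check in the same way.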

As someone who interacts with unsafe C APIs quite a lot, I find this especially scary and disturbing, and I definitely want a way to just turn off default binding modes and/or know when they are in action.

2 Likes

This feels like a concern I can definitely empathize with in general re. leaving fewer details to "inference" (let's put it generally like so, type variables and lifetime elision count as well I think) when dealing with unsafe { ... } code.

7 Likes

Speaking personally:

Like @matklad, I find I never use ref anymore. I generally prefer to have match &foo than to have ref and the like in the arms.

But it seems clear that there are some things that it would be good to follow up on.

In terms of error messages, it's useful to talk about confusing or misleading error messages. It'd be helpful to have examples and something like "here is the error I get, here is the error I wanted" (like, literally copy the text and try to sketch it out). It's always much easier to make improvements.

Of course, as far as I know, a lot of those errors come up in the same case: the toe-stub around matching Copy types and the resulting need to write *x when you don't really want to. Explaining the situation is good, but ideally, we'd make it so that *x is not required (and I guess the ideal explanation depends a bit on how you prefer to fix).

Myself, I tend to add the *, because I like the model of "matching against references yields references". So I don't mind the advice to add a *, for example. But I get the reasons that others don't (repetitive, etc).
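To make the Copy-type stumbling block concrete, here is a small sketch of the three spellings discussed in this thread. The function names and the bump helper are made up for illustration; all three variants compute the same result and differ only in where the reference is removed.

```rust
/// Hypothetical consumer that wants a value, not a reference.
fn bump(n: u32) -> u32 {
    n + 1
}

/// Default binding modes: `x` is `&u32`, so the use site needs `*x`.
fn deref_at_use(value: &Option<u32>) -> u32 {
    match value {
        Some(x) => bump(*x),
        None => 0,
    }
}

/// Explicit reference patterns: `x` is `u32`, copied out of the borrow.
fn explicit_reference_pattern(value: &Option<u32>) -> u32 {
    match value {
        &Some(x) => bump(x),
        &None => 0,
    }
}

/// Dereferencing the scrutinee: the patterns need no `&`, and `x` is `u32`.
fn deref_scrutinee(value: &Option<u32>) -> u32 {
    match *value {
        Some(x) => bump(x),
        None => 0,
    }
}
```

Forgetting the * in the first variant is what produces the "expected `u32`, found `&u32`" error at the use site rather than at the pattern.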

I would prefer to have a kind of coercion that will insert the references for me, but I think I would also be happy with the ability to write patterns like Some(&x) so as to move the copy into the pattern (as @varkor cited here).

I don't quite know what the breaking change aspect of that is, but it might be something to consider with an Edition if required, as it seems unlikely to me to affect a lot of crates. (Also, as a solution, it feels suboptimal to me, as it does "bring back" the idea of &x as a pattern, which has been a historical stumbling block for sure, but I don't know that I see the ideal solution yet.)

5 Likes

@CAD97 Unfortunately and somewhat disappointingly, that hint does not actually work. The "implicit references" that are introduced by match ergonomics are not actually references for the purpose of pattern matching, and as far as I know, it is impossible to deconstruct them in the pattern.

Specifically for Option, this can be handled via as_ref()/as_mut(), but that does not help when matching in another enum.
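A minimal sketch of the as_ref()/as_mut() workaround, with made-up function names: because as_ref() turns a &Option<T> into an Option<&T>, the scrutinee is no longer a reference, default binding modes do not kick in, and &x is an ordinary reference pattern.

```rust
/// Hypothetical consumer that wants a value, not a reference.
fn increment(n: u32) -> u32 {
    n + 1
}

/// `value.as_ref()` has type `Option<&u32>`, so `Some(&x)` destructures the
/// inner reference and binds `x: u32`.
fn via_as_ref(value: &Option<u32>) -> u32 {
    match value.as_ref() {
        Some(&x) => increment(x),
        None => 0,
    }
}

/// `value.as_mut()` has type `Option<&mut Vec<u32>>`, so `items` is a
/// mutable reference to the vector stored inside the option.
fn push_via_as_mut(value: &mut Option<Vec<u32>>, item: u32) {
    if let Some(items) = value.as_mut() {
        items.push(item);
    }
}
```

This only helps because Option provides these methods; an arbitrary enum has no equivalent unless its author writes one.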

This is the same problem that was brought up before by @kornel


I just made the experiment for Miri and was indeed able to get rid of all ref patterns. That required adding as_ref()/as_mut() in a bunch of places. (Turns out that was unnecessary.) I am not sure yet which style I like better.

I thought that I had recently encountered a case where I needed ref, but cannot find it any more... it probably was something like the ones mentioned above, where we want to move in some patterns and take a reference in others.

2 Likes

It seems to me that when I read the comments on this thread (not unlike many other threads) that there is a fundamental tension between those who prefer to optimize for writing and those who prefer to optimize for reading. I honestly think this is fundamentally a tension between those who have had to maintain their own (and others') code over many years and those who have primarily worked on their own new/greenfield software. I'm not 100% sure this characterization is accurate, but, I wonder if there is a way to measure that aspect. I think if that could be understood, it might help to better frame some of these disagreements and help each side better understand the perspectives presented.

I find Rust to be an absolutely unique and interesting language that has carved out a new space in software development. Every time I read something on these forums I feel like I've come away having learned something new and important from the community - even when I disagree with the conclusions. I do, however, tend to sense an undercurrent tendency to ignore or discount long-term usage "wisdom" sometimes in favor of theoretical conclusions that have a less than solid argument, that are argued well from a rhetorical standpoint, but ultimately are just opinions that lack long-term wisdom once you pick apart the analysis.

That being said, this could just be my own biases shading my reading of others' comments to some extent, but I really do think that ergonomic arguments are sometimes neglectful of the real concerns of those who favor explicitness for the sake of maintainability. It often seems that those who favor "explicitness" are constantly on the defensive, having to "prove" that a new ergonomic addition will cause harm, rather than a new ergonomic initiative having to "prove" that it won't cause harm. That seems a little bit backwards, but is consistent with how things go in life generally. It is easier to say, "Hey, look, here is this new artificial sweetener that your body doesn't absorb and so has 0 calories, and because your body doesn't absorb it, it can't possibly be harmful," than it is to have to recognize, "Hey, yeah, but if my small intestine doesn't absorb it, then it makes its way to my large intestine, where this new not-sugar is used by bacteria as energy, and because it isn't normal sugar, a rare bacterial variant is enriched that then begins to produce certain enzymes and by-products that ultimately end up absorbed in my lower intestine and then cross the blood-brain barrier, causing brain damage over time". In this analogy, removal of "ref/ref mut" would be the equivalent of "now that we have this perfectly acceptable sugar substitute, we can get rid of that diabetes-causing table sugar, can't we?"

9 Likes

I agree that it seems like a fundamental tension, but I don't like this framing. I find code using match ergonomics easier to read, and some folks find it harder to read. This isn't because "I prioritize writing code and they prioritize reading it," just that we find different things easy or hard to read in the first place.

I don't think this is accurate. I can find examples of people on both sides of this debate who are very experienced.

I go back to Not Explicit here. I think using "explicit" only means people talk past each other.

Please, please, please do not do this here.

17 Likes