Lived Experiences: Strange match ergonomics

Continuing the discussion from Allow disabling of ergonomic features on a per-crate basis?:

There are a few reports of match ergonomics that produce surprising or counter-intuitive results. It seems, in effect, that match ergonomics may fall into the same uncanny valley as 2015 modules. While guided by simple rules, the implications of those rules are non-obvious.

I’m interested in knowing which patterns everyone finds surprising. Even better (though it’s a lot of effort), would be a screencast or writeup that demonstrates you running into surprising behavior while editing code. If there are particular patterns that cause trouble, these could be used to improve compiler guidance, especially adding lints that offer guidance to bring non-obvious elision to the fore.

let S { ref y } = *x and reference wrapping

Thanks to by @novacrazy on the disable ergonomics thread.

Consider the following:

struct SomeStruct { y : SomeOtherStruct }

#[derive(Clone, Copy)]
struct SomeOtherStruct { z: i32 }

impl SomeOtherStruct { fn frob(&self) { println!("frobbed {:?}", self.z) } }

pub fn main() {
    let x = SomeStruct { y: SomeOtherStruct { z: 5 } };
    let SomeStruct { y } = x;
    y.frob();
    foo(y) // outputs "5"

fn foo(b: SomeOtherStruct) {
    println!("{}", b.z)
}

The binding to y is performed by copying SomeOtherStruct, and y.frob() is called using automatic reference mapping. The code compiles without error.

Now consider what occurs if x borrows SomeStruct instead of owning it:

pub fn main() {
    let x = &SomeStruct { y: SomeOtherStruct { z: 5 } };
    let SomeStruct { y } = x;
    y.frob();
    foo(y) // expected struct `SomeOtherStruct`, found reference
}

By simply changing adding character, the semantics of the match change. y is no longer a copy; it’s a borrow, because the match desugars to let &SomeStruct { ref y } = x;. Automatic reference mapping continues to obscure the reference, and the compiler emits an error when calling foo(y), which is surprising.

The change in match behavior is not entirely obvious, due to the interaction between the match, automatic reference mapping, and Copy.

In isolation, this case may not seem to be a big deal–the author of the code might be expected to understand the implications of the borrow. Real-world cases, however, suffer from additional complexity, especially in cases of refactoring deep copies into borrows.

Case :face_with_raised_eyebrow:: let Foo(x): &Foo<i32> = &&Foo(23);

@phaylon and I are also discussing a potentially related case on reddit. I am not experienced enough with Rust to tease it apart, and no variant removing ergonomics successfully typechecks.

11 Likes

I’ve refeactored a bit of code and ended up with such error:

fn foobar(&self) -> (&[…], &[…]) {…}
let (foo, bar) = &self.foobar(); // bug here

for f in foo.iter() {}
for f in foo {} // error here

Note the extra & in &self.foobar(). This has unexpectedly turned &[_] into &&[_], which then still worked through most of the function thanks to auto deref magic of method calls, and failed only at the point of use in for in, which complained about &&[]: Iterator not being satisfied.

In this case there was a significant distance between the root cause of the error and the error that the compiler has found.

Could this case be improved? e.g. can it warn about creating a double indirection?

6 Likes

In this case maybe auto-deref should work for for loops too. eg. if &T: IntoIterator and I give the for loop a &&T maybe it should auto-deref it to &T.

4 Likes

Ouch, that sounds really-really bad. :frowning:

No, that would actually make the situation even worse: more magic autoderef -> more distance between bug and cause. It would make the compiler error go away in this one case, but that's probably not a good thing, because it's the exact opposite of what we want (we want the compiler to reject otherwise ill-typed programs that suddenly became "well-typed" via pattern match "ergonomics").

2 Likes

It would make it consistent with auto-deref being used in other places, and it's the lack of consistency here which is partly to blame for the confusing bug.

I agree with your sentiment - except I think that pattern match ergonomics make sense and it's the auto-deref which is the problem here - but since we already have auto-deref we may as well make it useful everywhere, not just in some places.

5 Likes

The core idea is sound, but its a bit tricky because IntoIterator::into_iter takes self by value. I believe that's the reason we don't do any autoderef: if the type that implements IntoIterator is not Copy, it won't work.

It would work for us to do this for types that are shared references only or, more broadly, for types that implement IntoIterator + Copy.

Also, @kornel's bug is not particularly related to match ergonomics in my opinion. There are several alternatives which have the same form, without using match ergonomics:

fn foobar(&self) -> (&[…], &[…]) {…}
let (ref foo, ref bar) = self.foobar(); // bug here
fn foobar(&self) -> &[...] {...}
let foo = &self.foobar();

Its not clear to me how the fact that the extra reference passed through a tuple destructuring is the problem here. Fundamentally the problem is that a binding was accidentally declared as a double reference, which almost always works, but doesn't work for for loops.

There are several ways to declare a double reference that that are equally implicit as "match ergonomics" but don't use it; in general, thanks to type inference, you don't know what type a binding has unless you look at the signature of the functions called on the right hand side of its initialization.

2 Likes

It's exactly because I wrote (foo, bar), but got (ref foo, ref bar). I relied on assumption that functions return the exactly right amount of indirection for their types. By writing (foo, bar) I expected to get exactly the type returned by the function.

Sure, there are places in Rust where you can accidentally create too much indirection, but they're usually more complicated ones. And the problem of mostly getting away with accidental double indirection is technically separate, but here match ergonomics and magic autoderef conspired to create a problem together.

2 Likes

To me the solution would be to have let bindings (irrefutable bindings) treated a bit more strictly than match bindings (refutable bindings).

Even though they’re technically almost the same, I do have different uses and expectations around them. In match I dig into larger structures, and Option<&T> vs &Option<T> is a common pain point.

I have smaller expectations from let, which I basically use to get multiple return values, and AFAIR I’ve only ever used it with very straightforward destructuring of tuples, or at most a 1:1 mapping of a 2-3 field struct.

Would it be feasible to warn when irrefutable pattern causes double indirection due to match ergonomics? Something along lines of "Hey, you’ve created &&foo out of &bar, you’ve probably meant to get &foo"

2 Likes

But my second example is just about as simple as it gets:

let foobar = &self.foobar();

I think far more responsible than match ergonomics for the distance of the error from the bug here is the implicit promotion that lets you assign to a reference expression at all. Incidentally, this feature was just as controversial when it was introduced as match ergonomics are now.

Would it be feasible to warn when irrefutable pattern causes double indirection due to match ergonomics?

I think it would be reasonable to warn when the RHS is a reference expression and the bindings become double references, whether you are binding the whole thing or destructuring it. This sounds like a clippy lint though.

2 Likes

Could you go into more detail about what this means? Specifically, I don't understand the meaning of "implicit promotion".

When you write:

let foo = &bar();

We "promote" the lifetime of the return of bar() implicitly to be longer than the expression it is in. A more restrictive / "explicit" system would require you to write:

let bar = bar();
let foo = &bar;

Hence kornel's example would have errored unless they wrote:

let foobar = self.foobar();
let (foo, bar) = &foobar;

Which I think would have made the bug less likely, and would have also applied to the case I showed where one returns a single slice instead of a tuple of slices that is destructured.

However, I think the benefits of this promotion feature outweigh the bugs it could introduce (especially since they are the kinds of bugs that are caught, the error message is just a bit confusing).

5 Likes

As a variant on this, I encountered the following a few weeks ago:

pub fn new([...], config_custom_toolchains: Vec<config::CustomToolchain>) -> Self {
    [...]
    for ct in config_custom_toolchains.into_iter() {
        [...]
        let CustomToolchain { compiler_executable, archive } = ct;

        // use compiler_executable and archive

Altering the Vec in the code above to be a slice causes problems much further down in the usage of compiler_executable and archive because the types become incorrect.

This was quite confusing to me, likely because I’ve spent so long without match ergonomics that I’ve built up a particular mental model. Specifically, I see any form of let Struct { ... } = s as a checkpoint for my ‘type reasoning’ - scanning it instantly lets me know the exact types coming out of the expression (i.e. a kind of type annotation). With match ergonomics this is no longer the case and so I must either write something like let Struct { ... }: Struct = s, keep all the borrow state in my head or accept the loss of the checkpoint (so errors end up further away when I compile). I’ll be interested to see how I adapt longer term (this is my first real encounter in the wild).

I would’ve spent longer on this if I hadn’t already been aware of match ergonomics (as it’s such a sharp departure from previous behavior). It made me wonder if there are good ways to flag “the compiler made a type decision here that affects your later errors” - it’s probably not an issue for new users, but I expect initial confusion among Rust-using coworkers who (like me) need to re-examine how they reason about some bits of Rust code but (unlike me) have no idea about the match ergonomics feature.

9 Likes

That is the exact problem, thanks for pointing it out and phrasing it so well. The binding modes changing semantics (value -> reference) without a change in syntax (nothing -> ref) are doing exactly what is the anti-thesis of good language and API design: "things that look similar should be similar, and things that look different should be different". With the introduction of default binding modes, two bindings that look identical can radically differ in their behavior (owning vs borrowing), while two bindings that differ in the existence of the ref qualifier can behave identically (both borrowing).

This in turns breaks local reasoning (which I'm kind of puzzled why the lang team thinks is a good thing, given that the engineering of compilers is perhaps the most prevalent profession where the ability to perform local reasoning on code is paramount), which means exactly what you just complained about: you have to keep a much bigger chunk of state in your head in order to track types correctly. In other words, it makes correct reasoning harder, not easier. The only thing it makes easier is have some code compile (which would otherwise be trivial to adjust to compile anyway), but that's arguably a feat in itself.

2 Likes

I’ll bring this up again:

I personally expect this kind of behavior in the match expression, as it’s more often opening up a structure temporarily, and then whatever &ref are required are noise.

In a let statement, “match ergonomics” applying continues to surprise me every once in a while, as it’s primarily used to tear apart structures.

The feature was billed as “match ergonomics”. It applying to let patterns is somewhat surprising.

I still don’t know long-term whether having both patterns behave the same way is better, or if making let stricter would have been better. if let is another tricky case since what I want is pretty much 50/50.

I will say that my post is intended as an observation rather than a complaint - I wanted to express the concern for existing users and I'm genuinely interested in how I adapt longer term.

I don't think local reasoning is uniformly better (see Java variable annotations). It is possible that patterns were the 'just right' on top of function annotations to naturally give Rust the perfect reasoning scope when writing code, but that would be quite a coincidence. Idiomatic Rust code evolves over time, so I'm content to wait and see how it shakes out.

I have over 200 annotations in this file (16 in the 33 highlighted lines, and not even considering my 'match ergonomics' macro mast! to auto insert box patterns) - they're a mixture of mind-numbing manual labour and more tricky ones requiring careful consideration (most of which could be automatically inserted to avoid a match **x { X(ref mut y) => [...] dance). I wouldn't trivialise how much match ergonomics could help.

3 Likes

I'd like to highlight that confusion around let applies to all the examples cited so far. I'd be interested to know whether this is due to the discussion centering around individual patterns, or whether it's common to think of let as a more strict language construct.

Personally, I share @CAD97's opinion--I think of destructuring let as an explicit construct, and match as its more powerful (and implicit) cousin. My "inner type checker" wants if let to work the same way, for consistency, but due to its suspension between let and match, I fully expect neither approach to be a perfect fit.

Spitballing how I'd bridge the gap, I'd probably want something like a warn-level lint saying the idiom for let is explicit, and have the lint output the compiler-inferred ergonomics. For if let, I might introduce a "hint" or "info" lint that highlights ergonomics apply without "taking sides". In terms of RLS support, a refactoring would be presented that switches between ergonomic modes. The warn lint would present the refactoring as a one-way code fix, while the "hint" lint would be bidirectional.

1 Like

I think that's the essence of my complaint, too.

I think of let Struct {…} = s and let s: Struct = s as equivalent, so the mismatch between type annotation and destructuring is the uncanny valley.

5 Likes

My point was that however helpful it might be, it does more harm than good. However, I still stand by the point that all this confusion isn't worth it for the sake of avoiding a deref and a ref – I'm just not bothered of having to dereference a pointer and explicitly specify a ref binding. It's just not an issue for me, and apparently, for many of us.

This has been "broken" since before 1.0. let x = y; already "radically differs" in owning vs borrowing based on whether y is a reference. All match ergonomics does is allow this sort of thing to pass through patterns, by eliding part of the structure of types (the part built out of references).

Your jab about "local reasoning" in compilers is also incredibly disingenuous. None of the local reasoning a compiler does is at all thwarted by match ergonomics.

1 Like

No, it was an analogy. Match "ergonomics" breaks humans' local reasoning (obviously if it broke the compiler's, then it would have more immediately-noticeable and lower-level effects such as missed optimizations, failing to compile, or even unsoundness).

1 Like