Linting and match ergonomics

Here are some demonstrations of issues caused by match ergonomics. I hope they suffice to show the kinds of guarantees I’m after.

Case A: T to &mut T

Example

In this case a type changes from an owned value to a mutable reference. Even though there is an API change, the same code dealing with the data still compiles and runs, simply giving different results.

The version with an owned value has this behavior:

has resource 23
finalizing ptr
releasing resources
assume related resource dropped
reusing ptr

The version with a mutable reference of course has different drop semantics:

has resource 23
finalizing ptr
assume related resource dropped
reusing ptr
releasing resources

When we perform this change in a version of Rust without match ergonomics, we get a clear error message:

error[E0308]: mismatched types
  --> src/main.rs:23:17
   |
23 |                 Some(resource) => {
   |                 ^^^^^^^^^^^^^^ expected mutable reference, found enum `std::option::Option`
...
85 |     fn_handler!();
   |     -------------- in this macro invocation
   |
   = note: expected type `&mut std::option::Option<Resource>`
              found type `std::option::Option<_>`

While this example uses raw pointers to mark the relevant parts, the issue of unexpected changes to drop semantics is certainly not unknown in Rust. FFI resources are simply one of the more severe cases of this problem, due to issues with debuggability and things being outside of the view of the Rust compiler.

The example involves a change from an owned value to a mutable reference. But the other direction is just as undiscoverable and just as problematic. An owned value might get released earlier than expected, leading to dangling pointers.
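The behavior described in this case can be reproduced with a minimal sketch. Everything here is hypothetical and stands in for the linked example: `Resource`, the `handler_body!` macro, and the shared log are assumptions, and a `RefCell` log replaces the raw pointers so the ordering can be observed without FFI. The same match body is instantiated once for `Option<Resource>` and once for `&mut Option<Resource>`.

```rust
use std::cell::RefCell;

type Log = RefCell<Vec<String>>;

// Hypothetical stand-in for an FFI-backed resource. Dropping it records the
// release in a shared log so the ordering can be inspected.
struct Resource<'a> {
    id: u32,
    log: &'a Log,
}

impl Drop for Resource<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push("releasing resources".to_string());
    }
}

// One handler body, reused verbatim for both parameter types.
macro_rules! handler_body {
    ($slot:expr, $log:expr) => {{
        match $slot {
            Some(resource) => {
                $log.borrow_mut().push(format!("has resource {}", resource.id));
                $log.borrow_mut().push("finalizing ptr".to_string());
                // Owned: `resource` is a `Resource`, dropped at the end of this arm.
                // Borrowed: `resource` is a `&mut Resource`, nothing is dropped here.
            }
            None => {}
        }
        $log.borrow_mut().push("assume related resource dropped".to_string());
        $log.borrow_mut().push("reusing ptr".to_string());
    }};
}

fn handle_owned(slot: Option<Resource<'_>>, log: &Log) {
    handler_body!(slot, log);
}

fn handle_borrowed(slot: &mut Option<Resource<'_>>, log: &Log) {
    handler_body!(slot, log);
}

fn run_owned() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let slot = Some(Resource { id: 23, log: &log });
        handle_owned(slot, &log);
    }
    log.into_inner()
}

fn run_borrowed() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let mut slot = Some(Resource { id: 23, log: &log });
        handle_borrowed(&mut slot, &log);
        // The caller still owns `slot`; the release happens at the end of this scope.
    }
    log.into_inner()
}

fn main() {
    println!("owned:    {:?}", run_owned());
    println!("borrowed: {:?}", run_borrowed());
}
```

Both handlers compile with match ergonomics. Only the position of "releasing resources" in the log tells the two versions apart.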

Case B: Lifetime Semantics Change

Example

In this case a type changes from a reference to a tuple, to a tuple of two references. The same code operating on a single reference with a single lifetime still applies even though now two separate lifetimes are involved.

With match ergonomics, the code compiles and the change is not noticeable.

Without match ergonomics, Rust gives us a clear error for our initially written code, asking us to be explicit with our reference semantics:

error[E0308]: mismatched types
  --> src/main.rs:12:17
   |
12 |             let (a, b) = get(data);
   |                 ^^^^^^ expected &(i32, i32), found tuple
...
38 |     fn_handler!();
   |     -------------- in this macro invocation
   |
   = note: expected type `&(i32, i32)`
              found type `(_, _)`

If that fix is applied, the change to a flat tuple of references will cause another clear error:

error[E0308]: mismatched types
  --> src/main.rs:12:17
   |
12 |             let &(a, b) = get(data);
   |                 ^^^^^^^ expected tuple, found reference
...
59 |     fn_handler!();
   |     -------------- in this macro invocation
   |
   = note: expected type `(&i32, &i32)`
              found type `&_`
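A rough sketch of this case, assuming two hypothetical accessors `get_before` and `get_after` in place of the `get` from the linked example. With match ergonomics, the pattern `(a, b)` is accepted against both return types and yields `&i32` bindings either way, although the lifetimes behind them differ.

```rust
// Hypothetical "before" API: one reference to a whole tuple, one lifetime.
fn get_before(data: &(i32, i32)) -> &(i32, i32) {
    data
}

// Hypothetical "after" API: a tuple of two references with independent lifetimes.
fn get_after<'a, 'b>(left: &'a i32, right: &'b i32) -> (&'a i32, &'b i32) {
    (left, right)
}

fn sum_before(data: &(i32, i32)) -> i32 {
    // Default binding modes: the pattern looks through the reference,
    // so `a` and `b` are both `&i32` tied to the single lifetime of `data`.
    let (a, b) = get_before(data);
    *a + *b
}

fn sum_before_explicit(data: &(i32, i32)) -> i32 {
    // The explicit form states that a reference is being destructured;
    // `a` and `b` are copied out as plain `i32`.
    let &(a, b) = get_before(data);
    a + b
}

fn sum_after(left: &i32, right: &i32) -> i32 {
    // The very same pattern, now an ordinary tuple pattern over two references.
    let (a, b) = get_after(left, right);
    *a + *b
}

// Only possible with the "after" shape: `a` may outlive `right`.
fn first_after<'a>(left: &'a i32, right: &i32) -> &'a i32 {
    let (a, _b) = get_after(left, right);
    a
}

fn main() {
    let pair = (1, 2);
    println!("{}", sum_before(&pair));
    println!("{}", sum_before_explicit(&pair));
    println!("{}", sum_after(&pair.0, &pair.1));
    println!("{}", first_after(&pair.0, &pair.1));
}
```

The explicit `&(a, b)` pattern is the one that stops compiling when the return type changes to a tuple of references, which is exactly the signal the errors above provide.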

Communicating Intended Semantics

Example

This case isn’t about an issue due to code changing, but about debuggability and readability of the code.

It demonstrates a state enum containing multiple kinds of cases. It is a mix of values to mutate, values to only introspect, and primitives that are copied out.

The full-on match ergonomics version hides all those details. Copied, introspected and mutated parts of the state all look the same.

A third match is provided detailing how enforcement of having to write ref mut highlights the mutated part and makes it easier to spot something that should not be mutated.
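As a reduced sketch of the idea (not the linked example itself), assume a hypothetical `State` enum with a mutated `stream`, a copied `limit`, and an inspected `label`. The first function uses full match ergonomics, the second spells out the binding modes.

```rust
// Hypothetical types for illustration only.
struct Stream {
    bytes_read: usize,
}

enum State {
    Reading { stream: Stream, limit: usize, label: String },
    Done,
}

// Match-ergonomics style: all three bindings look alike, although they are
// `&mut Stream`, `&mut usize` and `&mut String` and only one is mutated.
fn step_implicit(state: &mut State) -> Option<usize> {
    match state {
        State::Reading { stream, limit, label } => {
            stream.bytes_read += label.len();
            Some(*limit)
        }
        State::Done => None,
    }
}

// Explicit style: `ref mut` marks the mutated part, `ref` the inspected
// part, and the bare binding is a copied-out primitive.
fn step_explicit(state: &mut State) -> Option<usize> {
    match *state {
        State::Reading { ref mut stream, limit, ref label } => {
            stream.bytes_read += label.len();
            Some(limit)
        }
        State::Done => None,
    }
}

fn bytes_read(state: &State) -> Option<usize> {
    match *state {
        State::Reading { ref stream, .. } => Some(stream.bytes_read),
        State::Done => None,
    }
}

fn main() {
    let mut state = State::Reading {
        stream: Stream { bytes_read: 0 },
        limit: 10,
        label: "abc".to_string(),
    };
    step_implicit(&mut state);
    step_explicit(&mut state);
    println!("{:?}", bytes_read(&state));
    println!("{:?}", bytes_read(&State::Done));
}
```

Both functions behave identically. The difference is only in what a reader can tell from the pattern without looking at the arm body.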

Some notes:

  • Detecting useless &mut _ bindings will not solve the issue, since we might be trying to debug accidental mutation.
  • A stored &mut _ could still allow hidden mutation, but I can force the consumer to write an identifier for the field name, like stream_ref or stream_mut. This isn’t possible with normal patterns.

It should also be noted that since this is a convention-based use-case, there is no way for a lint targeting specific problems to be a proper solution. The value is the enforcement of writing in the conventional, explicit style, even when not necessary. Given the above example, the value comes from being able to enforce that every stream handler change still follows the convention and communicates intent. Occasional uniformity is not uniformity.

Deref Coercions

I do believe that deref coercions fall into a similar category and might also be worth an optional lint. There is also the issue of the ergonomics semantics stacking up. After all, it’s the combination of match and deref ergonomics that gives us:

let Foo(x): &Foo<i32> = &&Foo(23);
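To make the stacking visible, here is a small sketch around that line, assuming a hypothetical tuple struct `Foo<T>(T)`. The type annotation triggers a deref coercion from `&&Foo<i32>` to `&Foo<i32>`, and the pattern then binds through the remaining reference.

```rust
// Hypothetical wrapper type for illustration.
struct Foo<T>(T);

fn demo() -> i32 {
    // Step 1 (deref coercion): `&&Foo<i32>` is coerced to the annotated `&Foo<i32>`.
    // Step 2 (match ergonomics): `Foo(x)` matches through the reference.
    let Foo(x): &Foo<i32> = &&Foo(23);
    // `x` ends up as `&i32`, which nothing in the pattern spells out.
    let y: &i32 = x;
    *y
}

fn main() {
    println!("{}", demo());
}
```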

In Defense of Defensive Programming

I think that there is lots of value in the ability to do defensive programming. I also believe that whenever certain guarantees no longer apply due to ergonomics or similar changes, a way to keep the defensive guarantees is of similar value.

We already have facilities for defensive programming in Rust:

  • We use .. when destructuring to mark that there are unmentioned fields in there.
  • We use mut to make bindings that change easily visible. Even with the existence of hidden &mut _ references or internal mutability, this is still a feature highly valued by the community.
  • We have must_use to ensure things are explicitly discarded.
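The three facilities listed above can be seen together in a short sketch. The `Config` struct and `bumped_retries` function are made up for illustration.

```rust
// Hypothetical configuration type for illustration.
struct Config {
    retries: u32,
    verbose: bool,
    name: String,
}

// `#[must_use]` makes the compiler warn when a caller silently ignores the result.
#[must_use]
fn bumped_retries(config: &Config) -> u32 {
    // `..` states openly that `verbose` and `name` are deliberately not inspected.
    let Config { retries, .. } = *config;
    // `mut` makes it visible at the binding that this value changes.
    let mut total = retries;
    total += 1;
    total
}

fn main() {
    let config = Config {
        retries: 2,
        verbose: false,
        name: String::from("demo"),
    };
    println!("{} (verbose: {})", config.name, config.verbose);
    // Discarding a `must_use` value has to be written out explicitly.
    let _ = bumped_retries(&config);
    println!("{}", bumped_retries(&config));
}
```

None of these are required for the program to be correct. They exist so that intent is stated in the code and checked by the compiler.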

I sometimes wonder: if we had started out with match ergonomics, would a lint requiring one to explicitly state the reference semantics be this contested?

Other Lints

This is certainly not the only useful lint. Others that would help include (non-exhaustive):

  • Linting against ref mut bindings that aren’t mutated.
  • Linting against plain match ergonomics bindings on &mut values when nothing or only parts are mutated.
  • Linting against treating single-variant enums as irrefutable.
  • Linting against deref coercions, as mentioned above.
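As an illustration of the single-variant item in the list above, here is a sketch with a hypothetical `Message` enum. The first function is what such a lint would flag, the second is the form it would ask for.

```rust
// Hypothetical enum that currently has exactly one variant.
enum Message {
    Text(String),
}

// Accepted today: with a single variant the pattern is irrefutable,
// so the enum is silently treated like a struct.
fn text_len(message: &Message) -> usize {
    let Message::Text(text) = message;
    text.len()
}

// The form a lint could require: a `match` that shows this is an enum,
// with the reference semantics written out.
fn text_len_explicit(message: &Message) -> usize {
    match *message {
        Message::Text(ref text) => text.len(),
    }
}

fn main() {
    let message = Message::Text("hello".to_string());
    println!("{}", text_len(&message));
    println!("{}", text_len_explicit(&message));
}
```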

Requiring explicit patterns is not an answer to all the problems, but it certainly is to some. Given any piece of code I write, pre-match-ergonomics Rust has more in-code guarantees than post-match-ergonomics Rust.

Summary

The examples highlight specific issues that can come up during refactoring, but also during prototyping or simply because one misremembers an API or data structure.

Other things to consider are:

  • There might be multiple contributors of various experience levels.
  • The code might not be touched a lot.
  • Hidden semantics are always harder to catch during reviews.
  • The involved pieces might be far apart, even in different crates.
  • The code above uses raw pointers but no unsafe. The actual unsafe usage might be far away.
  • Parts of the code could be obscured by macros.
  • Some things like wrong raw pointer usage might accidentally keep working for a while, until a seemingly unrelated change causes obscure failures.
  • There’s a lot of pattern matching going on in Rust.
  • Patterns can be deeply nested.

I already mentioned that deref coercions can further complicate things. Some things in the future like in-band lifetimes or patterns performing deref coercions themselves would complicate things even further. And they will certainly increase the cognitive burden for reviewing patches that will probably not include all relevant parts.

As a final note, thanks to @ahmedcharles for highlighting the fundamental issue here: Confidence in the code, guarantees that are upheld by the compiler, and minimizing the amount of features one has to keep in one’s head while understanding the current and possible future implications of any given piece of code.
