Stabilizing `non_exhaustive_omitted_patterns` or a better alternative

A language empowering everyone to build reliable and efficient software.

#[non_exhaustive] has been around for over 5 years, allowing crates to add enum variants without it being a breaking change. There even is a allow-by-default clippy lint for enums that are not marked non_exhaustive. But there is still no stable way to get a warning or error when missing a variant that already exists.

The closest is non_exhaustive_omitted_patterns (Tracking issue), which is not yet stabilized (no comments on the tracking issue for almost a year). But I think the approach this is going for is not the right one either. When thinking about whether your code is complete and you matched all (existing) variant combinations you effectively care about whether the wildcard/catch all arm is ever executed (while this only checks if all variants have been mentioned). The reason for this is apparently related to performance concerns, but Rust already does this analysis for exhaustive enums, so I don't see this as an issue if done in the right place:

In order for this lint to work properly with nested patterns, we would need to change the exhaustiveness algorithm to explore a lot more cases, which isn't feasible performance-wise.

Doing it this way may be useful, but in my opinion it doesn't achieve the goal of "I want to be sure this doesn't miss anything unless the dependency updates and adds a new variant".

Another one was #[warn(reachable)] or #[deny(reachable)] (here, here, here), which actually does what you want. But it seems to have been replaced by non_exhaustive_omitted_patterns.

I'm not an expert on compiler internals, so implementing the following may not be easy, but the compiler already has to do complete exhaustiveness checking, so I don't fully understand why there couldn't be an "if this branch ever ends up in the binary output a warning". The most naive way to do this could be to remove the catch-all and see if the code still compiles. If it doesn't: Print a warning and add the catch-all back in. That way you have the ability to add variants and warnings when you don't have one (unless you opt-into it being a compile error with #[deny(reachable)].

We even have the opposite #[warn(unreachable_patterns)] lint, but it doesn't work on non_exhaustive enums (since future versions can make this relevant of course). So a system for checking if all variants are matched already exists somewhere. There just isn't the logic/warning for handling missing arms of non_exhaustive enums.

My opinion

  • This isn't something that should be left open for over 5 years, with people talking about it even longer ago
  • I don't think the current non_exhaustive_omitted_patterns is the way to go, it only solves parts of the problem.
  • When compiling a crate you know the exact version of your dependency (rustc also knows this, not just cargo). You know which variants currently exist → Handling non_exhaustive enums like normal enums and only add/enable the wildcard/catch-all branch if needed [1] (and printing a warning) should work. Updating the dependency still wouldn't be a breaking change (unless you opt-in to having this be treated as an error).
    • This could even be extended to a general compile_warn!("message") macro, similar to compile_error!("message") instead of an attribute/lint. Both currently cannot be used for this, see example below.
  • I'm not sure if the warning could be a enabled by default or if you'd have to opt-in in some way.

Some ways it could be done (most are examples from the issues linked above)

// Current unstable implementation (doesn't emit a warning for this as
// long as Enum only has A and B). Useful in some situations, but IMO
// not in general for non-exhaustive enums.
#[warn(non_exhaustive_omitted_patterns)]
match (x, y) {
    (Enum::A, Enum::A) => true,
    (Enum::B, Enum::B) => true,
    _ => false,
}

// What I'd prefer for this, as it makes this match more like non
// `non_exhaustive` matches, only using the fallback if the dependency
// is updated and adds a new variant.
match expr {
    Expr::Array(e) => {...}
    Expr::Assign(e) => {...}
    #[warn(reachable)]
    _ => { /* some sane fallback */ }
}

Here is another option that unfortunately doesn't work as of today, even though it probably should. EDIT: You still want type checking, borrow checking and everything else in the unused branch, so maybe this isn't how it should be (though that doesn't really matter for compile_error. Therefore you'd still want to process/analyze/compile the unused branch. So to make this one work with compile_error you'd probably need to delay the "stop compilation" of compile_error until the "can this even be executed" check is done.

The compile_error is probably encountered and processed before the exhaustiveness checks and compilation always fails. I think this is the most readable in terms of name-clarity (avoiding the naming issue that seems to have stalled non_exhaustive_omitted_patterns) and one of the most flexible, but it is probably quite a bit harder to implement, as it relates to in which order things are done in the compiler.

match expr {
    Expr::Array(e) => {}
    Expr::Assign(e) => {}
    _ => {
        compile_error!("Missing non_exhaustive pattern")
    }
}
match expr {
    Expr::Array(e) => {}
    Expr::Assign(e) => {}
    _ => {
        compile_warn!("Missing non_exhaustive pattern")
    }
}

How is it that the only way to handle this properly on stable rust (with non_exhaustive existing for 5+ years) is a runtime panic? Am I missing something here?


  1. Or if compiling in such a way where the dependency isn't linked at compile time. ↩︎

5 Likes

It sounds to me like this goes directly against the purpose of adding #[non_exhaustive] in the first place, both on the specific enum and in the language in general. You can already do exhaustive matches on enums if you want extra correctness checks, and get a compile error if any variants are added.

The entire purpose of #[non_exhaustive] is to make adding a new variant a non-breaking change, i.e. the code which compiled before keeps compiling. If you add a feature to make non-exhaustive matches into an error, you disable that language feature and remove API evolution possibilities from library developers. And it doesn't matter much whether it is a hard error or a lint. Once dependents deny this lint, build will break, causing headache for the devs of the library providing that enum.

Enums where new variants cannot be reasonably safely ignored shouldn't have been #[non_exhaustive] in the first place.

2 Likes

This argument about breakage could equally well be applied to #[deprecated]. This program fails to compile today:

mod some_lib {
    #[deprecated]
    pub fn foo() {}
}

mod some_user {
    #![deny(deprecated)]

    fn user() {
        super::some_lib::foo();
    }
}

(But if some_user is a dependency of the current build, then it will still compile thanks to Cargo's use of cap-lints.)

The point of non_exhaustive_omitted_patterns, like the point of #[deprecated], is to alert developers of dependent crates that they should consider updating their code at their convenience, even though the change (deprecating an item or adding a variant) is not formally breaking and the code will continue functioning.

If they add deny(non_exhaustive_omitted_patterns) to their code and that causes build failures they’re not planning on, that's their workflow problem, just like any other deny(some_lint) or forbid(some_lint). Adding allow-by-default non_exhaustive_omitted_patterns does not create new problems, only give developers another tool which is misusable in the same way all lints are misusable.

11 Likes

#[deprecated] is added by library's authors, not consumers. The authors decide when to deal with the trouble of deprecating their APIs. Given that deprecated APIs are often subject to removal, it's breakage anyway, just a with a grace period.

#[non_exhaustive], on the other hand, is specifically added by authors to prevent breaking consumers who wouldn't know otherwise that adding variants isn't considered a breaking change. The proposal is to void their attempt at improved backwards compatibility. In that case we could just remove #[non_exhaustive] entirely, it would be serving no purpose.

If Cargo semver rules are followed and the removal is done across a major version bump, then there is no breakage, in the sense that no transitive dependents will stop compiling upon cargo update. (But that's not relevant to non_exhaustive, because there is no corresponding notion of eventual removal.)

Yes, this is true. And those consumers are not going to be broken. Let's name some parties:

  • foo::Enum is a #[non_exhaustive] enum.

  • bar is a crate that depends on foo and contains

    fn do_something(e: foo::Enum) {
        #![warn(non_exhaustive_omitted_patterns)]
        match e { ... }
    
  • baz is a crate that depends on bar.

If foo releases a new minor version with a new variant in foo::Enum, then:

  • baz's builds continue to succeed, for all users of baz.
  • baz’s behavior may or may not be as desired; it is up to the developers of foo to design things so that the new variant does not cause such problems. (This is the status quo — use of the flexibility that #[non_exhaustive] grants always has this consideration.)
  • Once bar’s developers update their lockfiles, their builds of bar will still succeed but have a new warning.
    • If they use #[deny(...)] or RUSTFLAGS=-Dwarnings or similar, then their builds fail.

The only case in which any builds fail is if bar uses #[deny(...)], RUSTFLAGS=-Dwarnings, or similar, in which case they opted in to that and are already subject to such failures any time they update a library or their Rust toolchain. There are lots of ways for non-breaking changes to cause new lints.

I claim that there is no breakage for any party here, if we accept “breakage” as meaning approximately “my build is failing after cargo update, and I can do nothing about it without forking some of my dependencies”.

5 Likes

The idea is about a warning, not an error. Warnings are not breaking.

4 Likes

As other's have mentioned: It's not about making this an error that prevents compilation, but like #[deprecated] a built-in way to inform dependencies that they may want to look into the new variant, even if it just to find out that there is a new feature they may want to use.

I disagree. There are use-cases for both in the same crate on the same type and different needs/desires in terms of stability vs correctness. Let me give you two examples why:

Example 1: Consumer

As a project depending on library X, which uses #[non_exhaustive] I WANT to be notified when there is a new variant. If I'm working on a security/safety critical application I may even WANT my build to fail with an error if a new variant is added (e.g. when using locked versions from cargo.lock). At the moment there is no way for me to opt-into those breaking change (also see this topic). Personally I haven't had this situation, but I really don't like the idea of NEEDING a catch-all branch without even getting notified when compiling that I'm missing something, with the only way to find it being a runtime panic. In my opinion this goes against one of Rusts core concepts, but I do see the need/desire for #[non_exhaustive] and the advantages this can bring. In fact that's exactly the issue I have in my second example:

Example 2: Producer and Consumer

For context: I have a types crate defining common types used across a HTTP API boundry on both client and server side. I am the one that wrote the server side using this (same workspace), but I'm not the (only) one that uses the types crate on the client-side.

Today I wanted to add #[non_exhaustive] to the enums in that crate, to allow adding more variants to them without breaking backwards compatibility to clients using an older version of types (no actual breaking change => no need to do a major version bump, at the moment the project is still in development, so not having #[non_exhaustive] yet is okay). Backwards compatibility on the API is not an issue, since the encoding (JSON in this case) does allow adding new variants to enums.

But on the server side, where I'm consuming these enums I do not want the catch-all. I do not want missing variants to go unnoticed until the server crashes in staging/production. At the very least I WANT warnings when new variants are added, too (granted, I'm usually the one adding them, but looking through all match statements to find them is still error prone). I'd actually PREFER this to be an error in this case and WANT to opt-in to these as errors caused by my own changes (or others working on the project).

Currently (on stable at least) this is like if you'd have to use a catch-all in every match statement in your own crate, which defined the enum (just that it's the same workspace and/or organisation in my case). This makes deciding whether or not I want to use #[non_exhaustive] or not a difficult decision (for which I'm currently tending towards no).

This specific example could be solved by having workspaces behave like crates in regards to requiring the catch-all (another option I've seen proposed while looking into this), but that would not help in example 1 and might not be what everyone with a workspace wants (e.g. in examples meant to show how to use the crate).

An alternative (hacky) workaround

The only ways I can think of to solve this (as of now) is to:

  • use nightly (don't want that for production unless absolutely necessary) or
  • use conditional compilation and a separate nightly run just to check exhaustiveness (which is still only semi-complete)
  • use conditional compilation (e.g. via env variable) to disable/remove the #[non_exhaustive] attribute while working on the project.

None of these is ideal.

2 Likes

There is one use case forgotten: binary installs. If a crate sets #[deny(non_exhaustive_omitted_patterns)] and their dependency adds a new variant, then cargo install will break. While perhaps the blame could be placed on the binary's authors, I'd say having this kind of breakage in the ecosystem is still undesirable. Note that the binary crate may itself be a dependency of other tools and applications, so it's basically the same "breaking dependents" problem.

This is not an issue with Clippy lints, since Clippy isn't run on compilation, only during development/CI by the devs who can directly fix any warnings and errors.

I don't know if it does, but it could cap-lints so that no warnings are fatal.

4 Likes

cargo install docs are unfortunately silent about the treatment of lints. But yes, if it ignores deny and forbid lint levels, and the default level for this lint is warn, then I guess my compatibility concerns don't apply. Any breakage would be purely opt-in by the crates' devs.

1 Like

This is already true for denying any other lint too, one very similar example would be #[deny(must_use)], that is another lint against an open set of situations controlled by your dependencies. Adding a new lint that authors can incorrectly deny is not a problem.

5 Likes

The performance-related quote you quote is not relevant, it was about an implementation strategy of non_exhaustive_omitted_patterns. You are correct that deny(reachable_ignoring_non_exhaustive) would be possible to implement.

I designed non_exhaustive_omitted_patterns at the time because I believed that was the lint that people actually needed, exactly for the case you mention. I may very much be wrong there. What do you feel is insufficient in it? Do you have a specific example where it fails to do what you need?

1 Like

It does achieve the goal of notifying you when a new variant was added, yes. And I think it is a good way for match statements where you'd want to use a catch-all branch like in the "both have the same variant" example.

But this likely gets problematic as soon as you do not want a catch-all. When you want to make sure you have caught all cases in a more complex match. When the only real way to handle the catch-all case would be to panic.

To me a "warn me if this arm ever reaches codegen" feels more intuitive and more closely matches the situation when #[non_exhaustive] isn't involved, than a "warn me if everything was mentioned", but that's probably subjective.

I must admit that I haven't come across the situation of multiple #[non_exhaustive] enums in the same match statement, yet, so I can't give a real example where there actually is a difference.

Another thing to consider (not sure how the current implementation handles this) is how patterns with nested (non-exhaustive) enums should be handled (especially when the patterns become more complicated):

match x {
    Outer::A(Inner::X),
    Outer::A(Inner::Y),
    Outer::B(Inner::X),
    // Forgot B(Y)
    _ => panic!(),
}

I think in these cases you wouldn't want to push the burden of checking exhaustiveness to the developer, as it removes a large benefit of match statements in the first place.

Perhaps you need both, for different situations.

Personally I'd be okay with both, but given that #[non_exhaustive] is already in use, sometimes recommended and has been stabilized for a while, something like this should probably be stabilized, regardless of how exactly it works.

The big problem with "warn if this arm ever reaches codegen" is that it's either non-deterministic, or you've said that pre-codegen optimizations must apply unconditionally.

It's a lot simpler to have a deterministic warning if you "simply" enumerate all possible enum variants, and warn if a variant is missed. For example:

match x {
    Outer::A(Inner::X) => ax,
    Outer::A(Inner::Y) => ay,
    Outer::B(Inner::X) => bx,
    Outer::B(Inner::Y) if we_are_frobnicating() => frobnicate(by),
    _ => unreachable!(),
}

Should the warning depend on whether or not the compiler is capable of determining that we_are_frobnicating() is unconditionally true or not?

This is why "warn if a variant is not mentioned" is preferable; such a warning could ignore Outer::B(Inner::Y) if we_are_frobnicating() (due to the guard - we already do this for exhaustive enums), and then notice that the only match for Outer::B(Inner::Y) is the unconditional pattern _.

1 Like

No, as this isn't done for exhaustive enums either. It doesn't check exhaustiveness before optimizations, it just requires that you handle all cases.

Granted: That may look a bit weird, but it clearly marks the we_are_frobnicating() == false case as handled, instead of bundling it in with the non_exhaustive enums:

match x {
    Outer::A(Inner::X) => ax,
    Outer::A(Inner::Y) => ay,
    Outer::B(Inner::X) => bx,
    Outer::B(Inner::Y) if we_are_frobnicating() => frobnicate(by),
    Outer::B(Inner::Y) => unreachable!(),
    #[warn(reachable)]
    _ => unreachable!(),
}

Effectively making this a "would it compile if we remove #[non_exhaustive] and all arms marked with #[warn(reachable)]". Maybe reachable isn't a good name either, as it does imply that the check runs after optimizations. Maybe #[warn(used_for_known_variant)], or #[unknown_variants_only] (though you couldn't decide if it should be warn or deny in that case).

But, in the example I gave, the arm for Outer::B(Inner::Y) if we_are_frobnicating() does not reach codegen - it's unconditionally true before we get to codegen, and therefore any warning that's dependent on "this branch reaches codegen" is going to not trigger.

Which is why we want something more like non_exhaustive_omitted_patterns, which fires even if the conditional turns out unconditionally true by the time we get to codegen - the goal is not "this arm unreachable at codegen time", but rather "if we take away irrefutable patterns, would this match still match everything, or are there variants not matched?"

Codegen is irrelevant, a static check that depends on optimizations is not the kind of thing we do in rust. "reachable" here means "reachable as far as we can statically tell from looking only at the patterns", it's not about codegen (which in fact very often thinks an arm is reachable despite exhaustiveness analysis proving that it isn't).

The current form of the lint is relevant for cases like:

match ... {
    (Enum::A, true) => ...,
    (Enum::B, true) => ...,
    (Enum::C, false) => ...,
    (Enum::D, false) => ...,
    _ => {}
}

As the author of this code, I would like to be warned if Enum gets a fifth variant. But the last branch is very much reachable regardless of whether Enum has more variants. So a lint that's based on exhaustiveness/reachability analysis is not the solution.

3 Likes

Effectively making this a "would it compile if we remove #[non_exhaustive] and all arms marked with #[warn(reachable)]".

"if we take away irrefutable patterns, would this match still match everything, or are there variants not matched?"

"reachable" here means "reachable as far as we can statically tell from looking only at the patterns",

I'm pretty sure all three of you already agree!

1 Like

I've looked through match statements using two values in the Rust repo today and the vast majority (if not all of them) do rely on a catch-all arm and would benefit from non_exhaustive_omitted_patterns but not reachable_ignoring_non_exhaustive.

So I must admit that non_exhaustive_omitted_patterns is (probably) more useful for match statements with tuples, while there is not a big difference for match statements that only have a single value.

Though I'm not completely convinced it is the best option, especially for match statements with more complex patterns. Consider the following (contrived) example:

#[non_exhaustive]
pub enum MyEnum {
    A,
    B(i32),
}
#[no_mangle]
pub fn whatever(num: MyEnum) -> i32 {
    use MyEnum::*;

    #[warn(non_exhaustive_omitted_patterns)]
    match num {
        A => {},
        B(..=0) => {},
        // B(1) => {},
        B(2..=10) => {}
        B(11..) => {}
        // _ => {}
    };
    0
}

If these are in different crates you need the catch-all branch in case a C variant is added. When that is the case you will get a warning, but there is no warning that this catch-all branch is already load bearing because B(1) was missed. Only the (probably) more common B(2..10) mistake would emit a (different) warning.

As far as I can tell the only way (besides not using non_exhaustive) to get the compiler to emit a warning when this is detected (which it can already do) would be to separate this into two nested match statements. While #[warn(reachable_ignoring_non_exhaustive)] still clearly tells the compiler that you never intended this to be a relevant arm and only need it due to the non_exhaustive.

I doubt there is a way to combine these two cases into a single attribute, hence there may be a need for both (though I don't know how relevant this actually is in practice or if there are other, similar cases).

1 Like

That example has me wondering if there's a tractable way to have two separate catch-all branches: one for "if there is an unknown variant for MyEnum, go here" and the existing "anything unmatched goes here".

To put a bikesheddable syntax on it, something like:

#[non_exhaustive]
pub enum MyEnum {
    A,
    B(i32),
}

pub fn whatever(num: MyEnum) -> i32 {
    match num {
        MyEnum::A => {},
        MyEnum::B(..=0) => {},
        // MyEnum::B(1) => {},
        MyEnum::B(2..=10) => {}
        MyEnum::B(11..) => {}
        MyEnum::_ => {}
        // _ => {}
    };
    0
}

With both MyEnum::B(1) => {} and _ => {} commented out, this generates the expected non-exhaustive patterns error, since MyEnum::_ will not match MyEnum::A or MyEnum::B(_) here, but it will match variants added since this code was first written, and in this case, MyEnum::B(1) is omitted.

In terms of semantics, as patterns in a set of match arms are considered, record the enum variants mentioned as known variants (in this case, MyEnum::A and MyEnum::B are known variants). Then MyEnum::_ matches all variants of MyEnum that are not recorded as known at the point you encounter MyEnum::_ in match arm order.

This would result in an unreachable match arm warning for the following code:

pub fn whatever(num: MyEnum) -> i32 {
    match num {
        MyEnum::A => {},
        MyEnum::_ => {}
        MyEnum::B(..=0) => {},
        // MyEnum::B(1) => {},
        MyEnum::B(2..=10) => {}
        MyEnum::B(11..) => {}
        // _ => {}
    };
    0
}

In this case, MyEnum::A was the only known variant. We then hit the MyEnum::_ pattern, which matches all variants of MyEnum other than the known ones, and then MyEnum::B is unreachable, since it's already matched by MyEnum::_.

For the purposes of non_exhaustive_omitted_patterns, MyEnum::_ does not match anything; and if I've thought this through properly, MyEnum::_'s job is to match anything that would trigger the non_exhaustive_omitted_patterns warning, but not things that would not trigger that warning.

4 Likes