Doubly-exhaustive match statement

It would be amazing if match statements could also have a compile-time error if the output isnt exhausted (obviously this will only work on enums, I can't see any reasonable/sane way to enforce this on e.g. i32), for example

fn bool_check(inp: bool)->Result<(), String> {
double_match<Result<(), String>> inp {
true => Ok(()),
false => panic!("incorrect input!"),
}}

In this case, the code would not compile, as the double_match has no output of Err.

1 Like

I understand how it is a massive changn, but can you please elaborate how it would be 'breaking'? it's not replacing the match statement, rather adding a new variant, so acc. to semver the minor is bumped, not the major?

1 Like

I can't say I've ever wished for such a syntax before. I'd be very interested to understand what use case(s) it would unlock for you?

I feel like this would be more generally useful, easier to explain, and certainly easier to sell as a language feature, if it were instead a lint-requesting attribute applicable to function definitions and to expressions. You'd write

#[bikeshed_all_variants_must_be_possible]
fn bool_check(inp: bool) -> Result<(), String> {
    match inp {
        true => Ok(()),
        false => panic!("incorrect input!"),
    }
}

and you'd get something like

error: function `bool_check` never returns `Err` variant of `Result`

This doesn't seem particularly useful for Result, because it's common to need to define functions that have to return a Result but there's no actual way they can fail. But I can imagine uses in other contexts. For example, take this enum and its FromStr impl:

#[derive(Debug, Copy, Clone, Eq, PartialEq, Hash)]
#[non_exhaustive]
// The short name of the Rivest-Shamir-Adleman cryptosystem is RSA, not Rsa.
// Similarly for DSA and ECDSA.
#[allow(clippy::upper_case_acronyms)]
pub enum HostKeyKind {
    RSA,
    DSA,
    ECDSA,
    Ed25519,
}

struct HostKeyKindParseError;

impl FromStr for HostKeyKind {
    type Err = HostKeyKindParseError;
    #[bikeshed_all_variants_must_be_possible(Self)]
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        use HostKeyKind::*;

        for (mnemonic, val) in [
            ("rsa", RSA),
            ("dsa", DSA),
            ("ecdsa", ECDSA),
            ("ed25519", Ed25519),
        ] {
            if s.eq_ignore_ascii_case(mnemonic) {
                return Ok(val);
            }
        }
        Err(HostKeyKindParseError)
    }
}

It would indeed be helpful for the compiler to catch it if I happened to leave out one of the entries in the (mnemonic, val) list. But notice how there's no match statement in the body of the function. Also notice how, in this case, what's important is whether the function can return all variants of Self, not whether it can return all variants of Result<Self, Self::Err>. FromStr is actually a perfect example of a function that has to return a Result (because that's the trait API) but in some cases (not this one) will never actually fail (because every possible input string corresponds to a valid value of the output).

5 Likes

In one instance, I have an enum of supported architectures, with helper functions to convert the arch into different strings, and then from strings back into the enum (depending on which API the program is connecting to). It will relieve a lot of stress if the compiler can tell me that one of the helpers hasn't been fully constructed instead of having to figure this out after a stressful debugging session. Thn suggestion below (with a ![flag] syntax) will also fit this use case - and many others that don't use the match statement.

Side note -

This might not have been what @jhpratt was thinking of, but, any addition of a new language keyword is a breaking change, because it takes that word away from people who were using it for something else. That's why it's done so rarely (in all programming languages - not just in Rust).

In Rust, we have the edition system, which allows adding a new keyword across an edition, so adding a keyword is possible and even straightforward — the only question is whether the new feature should have a new keyword. Syntax is often highly contentious yet arbitrary — in any new language feature, it would be best to focus on the semantics of the feature and let the syntax be a placeholder.

4 Likes

Since nothing in here is changing semantics, what you probably want is something more like an allow-by-default lint that you can turn on in particular places.

Then you could write something like

fn enum_from_string(x: &str) -> Option<Enum> {
    Some({
        #[warn(missing_variant_in_output)]
        match x {
            "foo" => Enum::Foo,
            "bar" => Enum::Bar,
            "qux" => Enum::Qux,
            _ => return None,
        }
    })
}

That said, generally the answer for that is to just generate it rather than writing it out by hand, either with a proc macro or passing the variants into a macro that generates the type and the functions so they can't be out-of-sync.

(I think writing this lint in a way that will make people happy will be way more annoying that people are expecting, since looking at the values coming out of a match like this isn't easy in anything but trivial cases.)

5 Likes

There has been an RFC about this before, which failed to gain enough traction and was closed: Add `exhaustive_match_output` RFC by Xaeroxe · Pull Request #3340 · rust-lang/rfcs · GitHub

7 Likes

What happens to nested enums? Do we need to exhaust all combinations, or just the most outer level?

Does this apply to bool? Do we have to exhaust true and false?

What happens when branch's last expression is a function call?

1 Like

Taking some inspiration from the above RFC:

  1. No by default, possibly opt-in (and then you choose how deep you want it go, e.g. only 3 nested enums).
  2. (same as number 1) Expanding though, you should be able to explicitly opt-out, say you have a nesting level of two (I'll use here Result<Option<T>, Option<E>>)so you can ensure there is a Ok(Some(T)) and Ok(None), but then #[sufficient] or whatnot the Err(None) so that the compiler/clippy won't complain about the missing Err(Some(E)) branch.
  3. No, it should apply only to enums, so like Result and Option. It would be too confusing otherwise, e.g. imagine a match branch x: i32 => x/2, - is the compiler expected to try and figure out that this will only cover the i16 range?
  4. see above
  5. Considering this lint is opt-in, it makes sense that the lint should apply here as well (i.e. since it's opt-in we don't skip parts to optimise it), unless you add #[sufficient] (or whatever they will call it) to that branch.
1 Like

Sometimes I wish there was a way to do the opposite and specify "this has to return a Result due to its trait, but it is and will stay Infallible" (changing that would be a breaking change):

struct Foo;

impl std::io::Write for Foo {
    fn write(&mut self, buf: &[u8]) -> Result<usize, std::io::Error> {
        // e.g. write to a vec
        Ok(buf.len())
    }
    // Compile error because the trait expects a std::io::Error.
    fn flush(&mut self) -> Result<(), std::convert::Infallible> {
        Ok(())
    }
}

Though in the background it has to behave like something that can return std::io::Error of course and in practice you likely run into issues because this "cannot fail" isn't propagated and a Wrapper<W: std::io::Write> would have to specify it doesn't add error paths.

2 Likes

There's a bunch of places in the stdlib that could use that! For example, AFAICT std::fs::Metadata::modified and a bunch of its friends were made to return io::Result<SystemTime> for purely theoretical reasons and they can't actually ever fail on any supported system.

To be most useful, though, this annotation would have to make it so the return type behaves like Result<T, !> for purpose of things like being able to call into_ok, and that's a weird wrinkle to add to the type system.

Isn't that more easily handled with fn modified(…) -> io::Result<SystemTime> { Ok(self.modified_impl()) } and an associated comment? (Other than the into_ok nice-to-have.)

1 Like

That's basically what we have now [it turns out there are some platforms where Metadata::modified returns an error -- specifically, an unconditional "not supported" error -- but that just means Metadata::modified wasn't the ideal example].

To me, almost all of the value would come from enabling use of into_ok by the caller.

I like the idea of using this on functions, not just match statements. I can think of uses of both in the compiler, and in other large Rust codebases, marking "when you update this enum, this is a place you need to update, and the compiler won't otherwise catch that".

It would be nice to catch this for things that 1) are not public API and 2) are not passed as functions to other things that are public API, so they don't have an externally imposed signature. Effectively, a lint detecting a potential refactor: "this function can never return an Err and nothing specifically needs it to return a Result".

4 Likes

unnecessary_wraps should be doing exactly that

1 Like

I'm in favour for this idea.

But I want to highlight, that some form of private constructors for public enums would solve a major part of the use-cases I see for this.

I feel like one wants a test for this. A pattern like:

#[test]
fn all_reachable() {
    let argpacks = [(…), …)];

    let mut matched: Vec<_> = std::iter::repeat_n(false, std::mem::variant_count<T>()).collect();
    for argpack in argpacks {
        match T::total(argpack.0, argpack.1) {
            T::Variant0 { .. } => matched[0] = true;
            T::Variant1 { .. } => matched[1] = true;
            // etc.
        }
    }
    assert!(matched.into_iter().all(|x| x));
}

This ensures that you're also actually reaching all mechanisms (and not satisfying the compiler with if !collatz() { return T::not_actually_made(); }.

1 Like

'test' as in I will need do write the code twice, once in the test and once for the actual code?