Recent change to make exhaustiveness and uninhabited types play nicer together

This is valid:

enum X { A, B }
fn x(a: &X) {
    match a {
        &X::A => (),
        &X::B => (),
    }
}

This is also valid:

enum X { A }
fn x(a: &X) {
    match a {
        &X::A => (),
    }
}

This should be valid, right? I’m not sure what I’m missing, if there’s some reason it shouldn’t be.

enum X { }
fn x(a: &X) {
    match a {
    }
}
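
For anyone who wants to poke at this on a current stable compiler, here is a minimal sketch. The names `Void`, `absurd_ref` and `unwrap_never` are made up for illustration. It goes through `*v` rather than matching on the reference itself, since that form is the one I'm sure compiles; whether a zero-arm match directly on `&X` should be accepted is the open question in this thread.

```rust
/// An enum with no variants: no value of this type can ever exist.
enum Void {}

/// A zero-arm match on the dereferenced place type-checks, and because
/// the match expression has type `!` it coerces to the `u32` return type.
fn absurd_ref(v: &Void) -> u32 {
    match *v {}
}

/// The `Err` arm can never run, but the type system still wants it handled.
fn unwrap_never(r: Result<u32, Void>) -> u32 {
    match r {
        Ok(n) => n,
        Err(v) => absurd_ref(&v),
    }
}
```

Calling `unwrap_never(Ok(42))` just hands back the payload; the `Err` arm exists only to satisfy exhaustiveness.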

You can construct it without unsafe code on nightly

Thanks for the info. Apparently, that won't work then?

I don’t think there’s any disagreement about these semantics in safe rust. The issue then is the current accepted pattern to misuse empty enums to represent foreign data. Rust needs to have a better way to denote that, similar to an extern pointer in C.

I don't think there's any disagreement about these semantics in safe rust.

I'd hope not, but there seems to be.

The issue then is the current accepted pattern to misuse empty enums to represent foreign data. Rust needs to have a better way to denote that, similar to an extern pointer in C.

Using a *const Void will still work. You can also make a type whose uninhabitedness is private like in the Opaque example I gave above.

Our previous consensus was that values must always have legal values, so Ok(42): Option<bool> is not possible to have even in unsafe code without UB, but say Ok(&42): Option<&bool> is possible, and matching on it results in UB.

@arielb1 So if I understand correctly, this code should handle an uninitialised &bool safely:

match bool_ref_result {
    Err(e) => ... ,

    // don't dereference
    // this is safe
    Ok(bool_ref) => ... ,
}

But this code should invoke undefined behaviour if the reference is invalid:

match bool_ref_result {
    Err(e) => ... ,

    // dereference and branch on each variant of the inner bool
    // invokes UB if the reference is invalid
    Ok(&true) => ... ,
    Ok(&false) => ... ,
}

Is that correct? If so, the equivalent with &Void is this:

match void_ref_result {
    Err(e) => ... ,

    // don't dereference
    // this is safe
    Ok(void_ref) => ... ,
}
match void_ref_result {
    Err(e) => ... ,

    // dereference and branch on each variant of the inner Void
    // invokes UB if the reference is invalid
}

Which is exactly what I’ve implemented.
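
To make the distinction concrete with valid references only, here is a small sketch. The function names are hypothetical, and nothing here exercises the invalid-reference case, which can't be demonstrated without UB. The first function inspects the referent, the other two only look at the outer `Result`.

```rust
/// Branches on the value behind the reference, so the `&bool` is read.
fn classify(r: Result<&bool, String>) -> &'static str {
    match r {
        Err(_) => "error",
        Ok(&true) => "true",
        Ok(&false) => "false",
    }
}

/// Only inspects the `Result` discriminant; the reference is bound
/// but never dereferenced.
fn is_ok(r: &Result<&bool, String>) -> bool {
    match r {
        Err(_) => false,
        Ok(_bool_ref) => true,
    }
}

enum Void {}

/// Same shape as `is_ok`, with `&Void` in place of `&bool`.
fn is_ok_void(r: &Result<&Void, String>) -> bool {
    match r {
        Err(_) => false,
        Ok(_void_ref) => true,
    }
}
```

Only `classify` has a counterpart that becomes an arm-less inner match once `bool` is swapped for `Void`.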


Here’s the PR to make &Void inhabited again: https://github.com/rust-lang/rust/pull/39151

That opaque type feels like moving pieces around just to make the previous pattern work. It still has the same logical inconsistency. We need something like:

extern type Foo;

That denotes that there is an inhabited type called Foo, just that it is outside Rust’s type system.


Yeah, that would be better really. But that would be a whole 'nother RFC.


At least currently, yes. The rules about uninitialized types are still in flux, though.

Obviously, your match implementation is safe as long as you don't do stupid things with recursive references - if you can reach an empty enum, you can trigger UB in safe code.

The compiler-team-recommended pattern to declare FFI types is

#[repr(C)]
pub struct FFIType(());

extern "C" {
    fn mk_ffitype() -> *mut FFIType;
}

The improper_ctypes lint is very annoying because it warns about this. That's a bug we should fix.


extern type Foo;

Thinking about this for the last 10 minutes: I really like this idea and I think we should add it right away.

I don’t see how it’s different from a zero-sized struct with a private field.

It reflects what’s actually going on rather than being a hack.


I’d guess it’d be treated somewhat like a DST, so it can’t be created, returned, moved, size_of’d, etc, even inside the declaring module.


Exactly, you shouldn’t be able to ptr::read() on an FFI type for example. But you can with @arielb1’s FFIType.

Edit[0]: As I see it, it would just be an opaque type which is ?Sized but which still has thin pointers.

Edit[1]: Although… with all the other DSTs we can at least calculate their size at runtime. With an extern type we couldn’t even do that. I dunno how much of a problem this would be. You’re supposed to be able to stick DSTs at the end of structs for example.
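
A rough sketch of the `ptr::read()` concern, assuming the `FFIType(())` shape from earlier in the thread. `read_ffi` and `ffi_type_size` are hypothetical helpers, and in real code the pointer would come from the C side rather than from a Rust allocation.

```rust
use std::mem::size_of;
use std::ptr;

/// Zero-sized stand-in for a foreign type, as in the pattern quoted above.
#[repr(C)]
pub struct FFIType(());

/// Copies an `FFIType` out from behind a raw pointer.
///
/// # Safety
/// `p` must be non-null. No bytes are actually read, because the type
/// is zero-sized, which is exactly why this tells us nothing about the
/// foreign object.
pub unsafe fn read_ffi(p: *const FFIType) -> FFIType {
    unsafe { ptr::read(p) }
}

/// The compiler believes it knows the size of the foreign type.
pub fn ffi_type_size() -> usize {
    size_of::<FFIType>()
}
```

Both of these compile, even though neither makes sense for a genuinely opaque C type. That is the gap an `extern type` would be meant to close.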


Yes, see the custom DST RFC. We need to split the trait so there’s “referent types” and actual DSTs (with {size, align}_of_val) as a subtrait. Opaque C types and C’s void-in-void* (plain void is unrelated and is () ahhh!!!) would use this.

So I basically agree with @glaebhoerl here, and think bool as an analogue is very compelling.

I'd like to thank @arielb1 for addressing the bool case:

Our previous consensus was that values must always have legal values, so Ok(42): Option<bool> is not possible to have even in unsafe code without UB, but say Ok(&42): Option<&bool> is possible, and matching on it results in UB.

Loosely inspired by Agda's absurd patterns, we can actually make a coherent policy out of this by forcing one to get the value in a "match arm stub", e.g.:

match void_ref_result {
    Err(e) => ... ,
    Ok(&_), // no arm needed
}

This would add clarity when the uninhabited type is deeply nested. I'd consider this an excellent compromise everywhere, or convenient sugar in unsafe code for

match void_ref_result {
    Err(e) => ... ,
    Ok(&r) => match r { },
}

Where one relies on UB to remove the extra branching instead of being explicit.
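
For reference, the explicit form can be written today roughly like this. It is only a sketch: `handle` is a made-up name, and it binds the reference and matches on `*void_ref` rather than destructuring with `&r`, because that is the spelling I'm confident compiles on stable.

```rust
enum Void {}

/// The inner zero-arm match has type `!`, so it coerces to `String`
/// and the `Ok` arm needs no further body.
fn handle(r: Result<&Void, String>) -> String {
    match r {
        Err(e) => format!("error: {}", e),
        Ok(void_ref) => match *void_ref {},
    }
}
```

The proposed `Ok(&_),` stub would express the same thing without spelling out the inner match.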

There is no way to call this function without triggering undefined behavior. Therefore it is better to make it impossible to define this function, statically. Note that this is true even without the match:

enum X { }
fn x(a: &X) { }

Regarding the match, the only clearly reasonable body of a match on such values is the empty body, which is a no-op. Therefore, it makes sense for the compiler to statically reject such matches too.

This would be clearer if there were Inhabited and Uninhabited traits. That would give a clear path forward for generic code that needs to be generic over possibly-uninhabited types, where the code for uninhabited types does something different (probably nothing) from the code for all inhabited types, as such code could just define separate implementations for T: Inhabited and T: Uninhabited. Then match expressions, function arguments, and related things could be defined to have an implicit T: Inhabited bound.

One reason to define such a function could be satisfying a trait:

use std::fmt;

enum NoError {}
// necessary to allow unwrap() on Result<T, NoError>
impl fmt::Debug for NoError {
    // The formatter is unused: this body can never actually run.
    fn fmt(&self, _f: &mut fmt::Formatter) -> Result<(), fmt::Error> {
        match *self {}
    }
}

(Usually better to use ! for this, but not in all cases, and your reasoning seems to work just as well for functions taking &!. There’s a proposal to have ! magically impl all the things, but that won’t work for many traits, such as those with static methods.)
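
As a quick check that the impl really is what unlocks `unwrap()`, here is a self-contained sketch. `always_ok` is a hypothetical function added purely for the demonstration.

```rust
use std::fmt;

enum NoError {}

// `Result::unwrap` requires `E: Debug`, so the impl has to exist
// even though `fmt` can never be called.
impl fmt::Debug for NoError {
    fn fmt(&self, _f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match *self {}
    }
}

/// Hypothetical operation that cannot fail.
fn always_ok() -> Result<u32, NoError> {
    Ok(7)
}
```

With the impl in place, `always_ok().unwrap()` type-checks; without it the call is rejected because `NoError` doesn't implement `Debug`.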

Also, it’s pretty easy for free generic functions to end up instantiated with arguments of uninhabited types. For example, in the following, futures::err is instantiated with an argument of uninhabited type if E is uninhabited:

fn dumb_result_to_future<T, E>(r: Result<T, E>) -> futures::BoxFuture<T, E> {
    match r {
        Ok(t) => futures::ok(t).boxed(),
        Err(e) => futures::err(e).boxed(),
    }
}

But there’s no reason to forbid either example. The functions can’t actually be called without undefined behavior, but they never will be; they’re just there to satisfy the type system.
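
The same situation can be reproduced without the futures crate. In this sketch all names are hypothetical: `describe_err` plays the role of `futures::err`, a generic function that gets instantiated at an uninhabited type and then never called.

```rust
enum Never {}

/// Generic helper; nothing about it cares whether `E` is inhabited.
fn describe_err<E>(_e: E) -> &'static str {
    "failed"
}

/// Ordinary generic code over `Result<T, E>`.
fn summarize<T, E>(r: Result<T, E>) -> &'static str {
    match r {
        Ok(_) => "ok",
        Err(e) => describe_err(e),
    }
}

/// Instantiates `summarize` (and so `describe_err`) with `E = Never`.
/// The `describe_err::<Never>` instance exists but is unreachable.
fn summarize_infallible(r: Result<u8, Never>) -> &'static str {
    summarize(r)
}
```

Rejecting functions that take uninhabited arguments would break `summarize_infallible`, even though nothing unsound can happen here.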

IMO, this would be better:

// No need to implement any methods for implementations of traits
// by uninhabited types, since there are no values of `Self` or
// `&Self`. Instead the compiler will automatically derive no-op
// implementations.
impl<T: Uninhabited> fmt::Debug for T {}
fn dumb_result_to_future<T, E: Inhabited>(r: Result<T, E>) -> futures::BoxFuture<T, E> {
    match r {
        Ok(t) => futures::ok(t).boxed(),
        Err(e) => futures::err(e).boxed(),
    }
}

fn dumb_result_to_future<T, E: Uninhabited>(r: Result<T, E>) -> futures::BoxFuture<T, E> {
    match r {
        Ok(t) => futures::ok(t).boxed(),
        // Err(e) is an impossible case for Uninhabited types.
    }
}
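
Something in this direction can be approximated in user code today, though only as a sketch. The `Uninhabited` trait below is an ordinary hand-written trait, not a compiler-known one, so it can't remove the `Err` arm or auto-derive impls; it only gives the arm a uniform body. All names are hypothetical.

```rust
/// Empty enum standing in for any uninhabited error type.
enum Never {}

/// Hand-rolled marker for uninhabited types. Implementors must
/// supply the elimination function themselves.
trait Uninhabited {
    fn absurd(self) -> !;
}

impl Uninhabited for Never {
    fn absurd(self) -> ! {
        match self {}
    }
}

/// Generic over any error type that has opted in as uninhabited.
fn unwrap_infallible<T, E: Uninhabited>(r: Result<T, E>) -> T {
    match r {
        Ok(t) => t,
        // Still has to be written, but can never be reached.
        Err(e) => e.absurd(),
    }
}
```

What this can't express is the other half of the proposal: a second definition of the same function for `E: Inhabited`, which would need negative reasoning or specialization that the language doesn't have.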