Recent change to make exhaustiveness and uninhabited types play nicer together

nikomatsakis · January 12, 2017, 6:44pm

So the recent change in PR 38069 to make the exhaustiveness checking consider uninhabited types specially caused a lot of fallout. A lot of this is from crates that use #[deny(warnings)], which now fail because arms that used to be considered inhabited are now considered uninhabited, but some of it is not. Lint failures are not in and of themselves worrisome, but I am a bit worried that we have changed the semantics of code in unacceptable ways. Here is a list of the issues opened tallying regressions that I am aware of:

https://github.com/rust-lang/rust/issues/38889 – Subtle breaking change with match patterns on uninhabited types
- Points out that empty enums were being used intentionally to make “placeholder” enum variants, but these placeholders are now recognized as impossible.
- There exists, I think, a better pattern that could be used here.
https://github.com/rust-lang/rust/issues/38969 – ICE in empty-0.0.4, Rust 1.16, unreachable for-loop pattern
- A problem because it is a hard error for a for loop pattern to be unreachable. This is fixable and should not be a hard error anyway, in my opinion.
https://github.com/rust-lang/rust/issues/38972 – Irrefutable while-let pattern in log4rs-rolling-file-0.2.0, Rust 1.16
- A problem because it is a hard error for a while let pattern to be irrefutable. This is fixable and should not be a hard error anyway, in my opinion.
https://github.com/rust-lang/rust/issues/38977 – Unreachable expression in guard-0.3.1, Rust 1.15
- A problem only because the code is used in macros that get exported to clients, which may have #[deny(warnings)]. This seems like a failure of cap-lints more than anything else, particularly since this crate could benefit from the feature in question. However, it does seem to be hard for @durka to find a formulation that achieves their goal of guaranteeing divergence and works across all versions of rustc.
https://github.com/rust-lang/rust/issues/38975 – Unreachable expression in void-1.0.2, Rust 1.16
- This doesn’t seem like a problem. It’s a contained lint error that won’t affect clients, and the existence of this crate can be considered a feature request for precisely the changes we have made. =)

Reviewing this list, I think there are two bugs, and we should probably just fix those, but most of the remaining impact seems all right. It’d be nice though to find a way for @durka to express the pattern they are looking for that checks without warnings on stable/nightly, I guess?

I do think in retrospect we didn’t spend enough energy evaluating the impact of this change. It’d be good to have phased it in more gently, at minimum by contacting crate owners. I think this is my fault as the reviewer of the PR for not double-checking that we had run crater and so forth.

That said, there is one thing that I am worried about. Some will recall this prior discussion about the best way to think about uninhabited types and mem::uninitialized. In that discussion, interestingly, we came to some conclusions that seem somewhat at odds with the current exhaustiveness checking changes. In particular, in that discussion, we talked about whether it ever made sense to have a value of type &!, and under what conditions that could be considered UB. The general consensus there was that it only became UB if the type was dereferenced – which implies that unsafe code CAN create values of type &! (say) so long as that reference never escapes to safe code and is never dereferenced by the unsafe code.

However, in nightly today, &! is considered uninhabited (as is &Void), which means that code like this typechecks:

enum Void { }
fn foo(x: Result<i32, &Void>) {
    let Ok(x) = x;
}

This seems wrong to me given the tentative outcome of the unsafe-code-guidelines discussion above, no? It also seems to be a backwards incompatibility hazard for crates that were using *const Void to represent void* pointers, even if they ought not to have been doing that. (But, given the outcome of the UCG discussion, this doesn’t seem wrong, though it’s not what I would do.)

cc @canndrew @arielb1 @eddyb @strega-nil @brson

brson · January 12, 2017, 6:52pm

Thanks for putting these issues in perspective @nikomatsakis. It does look to me like there’s more legwork to do here to smooth over the transition. Whatever the outcome, please let’s make it happen before beta branches on the 31st (I think).

glaebhoerl · January 12, 2017, 7:23pm

@nikomatsakis I think these might be two separate questions which are being conflated. I suspect that &! should be considered potentially inhabited to the optimizer (and, under a tootsie pop model, only in the vicinity of unsafe blocks!), but not to the typechecker.

cuviper · January 12, 2017, 7:31pm

I understand why &! and &Void are uninhabited, but why does that create problems for *const Void? Aren’t pointers allowed to have arbitrary/impossible values?

nikomatsakis · January 12, 2017, 7:35pm

In a meeting now, so writing hastily:

But I’m having some second thoughts. My feeling is that this experimentation with ! and inhabitedness is – in a sense! – going great. We’re encountering a lot of interesting questions. But I feel like we haven’t found the final answers yet, and I am wary that we are pushing this process forward a bit too chaotically.

I think we should consider trying to restore the old behavior around empty enums temporarily. The idea would be that true inhabitedness purely derives from ! for now (which is gated). Then we can get the semantics how we want them for ! and – when we feel ready – enable them for empty enums.

In the meantime, we can start doing warning periods around empty enums for things and patterns we expect to change.

But I am very wary that we will (e.g.) accept some matches now (perhaps some that use Result<(), &Void>) that we would not want to accept later.
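For comparison, a match can avoid depending on whether the checker treats &Void as uninhabited by spelling out the Err arm and dereferencing inside it. The following is only a sketch: the function name `value_or_absurd` is hypothetical, and the `allow` attribute is an assumption to silence a possible unreachable-pattern lint on compilers that do consider the arm impossible.

```rust
enum Void {}

// Hypothetical helper: handles both arms explicitly, so it does not rely on
// the exhaustiveness checker treating `&Void` as uninhabited.
#[allow(unreachable_patterns)]
fn value_or_absurd(r: Result<i32, &Void>) -> i32 {
    match r {
        Ok(x) => x,
        // `*e` has type `Void`, and an empty match on an empty enum is
        // exhaustive; the arm has type `!`, which coerces to `i32`.
        Err(e) => match *e {},
    }
}
```

Calling `value_or_absurd(Ok(7))` returns 7; safe code has no way to construct the Err case.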

nikomatsakis · January 12, 2017, 8:11pm

I think you are right that there is an interaction with unsafe code here, but I disagree that the type-checker should consider types like &! uninhabited. Perhaps it is the case (depending on what ultimate rules we decide upon) that the type-checker would consider a type like &! uninhabited but only in safe code, however. The interaction of all of this reasoning with the unsafe code guidelines is all the more reason, I think, to try and isolate it to !, so that we can experiment until we are happy, and not consider empty enums to be uninhabited yet until we know what we want.

nikomatsakis · January 12, 2017, 8:12pm

So actually *const Void seems to be inarguably inhabited (by null). I should probably have written &Void in my example.
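A minimal sketch of the null inhabitant, in safe code (the function name `null_void` is hypothetical):

```rust
enum Void {}

// A raw pointer to an empty enum can be created in safe code; null is one
// value of this type, even though no `Void` value exists to point at.
fn null_void() -> *const Void {
    std::ptr::null()
}
```

Nothing here dereferences the pointer, so no value of type Void is ever claimed to exist.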

brson · January 12, 2017, 9:11pm

I agree this is prudent. There doesn't seem much reason to rush this particular feature.

nikomatsakis · January 12, 2017, 9:19pm

After discussing in the compiler-team meeting, we were thinking that the way to do this is to have both the old + new exhaustiveness checking code, and to use the never_type feature gate to decide which one we use.

cramertj · January 13, 2017, 2:36am

I don't understand how, even in unsafe code, &! could be inhabited. When converting from *const ! to &'a !, aren't you promising to the compiler that *const ! is a valid pointer to a ! for all of 'a? I don't see how that could be possible if ! is uninhabited.

RalfJung · January 13, 2017, 2:45pm

Well, are you? That question has not yet been conclusively answered. Certainly, if you pass a value of type &! to an unknown external function, you are promising something along these lines. But if the value is only passed from one internal function to another, maybe no promises are made.

cramertj · January 13, 2017, 6:00pm

What would be the practical use to defining &T as anything other than a pointer to a value of type T? If the pointer may not currently be valid, why not keep it as *const T? It seems to me that there are also a number of possible optimizations that would be broken by relaxing this definition.

glaebhoerl · January 13, 2017, 7:07pm

@nikomatsakis So, as I just remembered - the thing with the memory model / unsafe code guidelines was that: for any type T, &T may not be assumed to always refer to a valid instance of T [by the optimizer, in the vicinity of unsafe]. &! is just an instantiation that brings it into particularly stark relief, because ! otherwise happens to have no valid instances. But the logic itself is uniform across all types.

Should then the type system also require me to add an extra wildcard _ => {} arm to this match?

fn example(arg: &bool) {
    match arg {
        &true => {},
        &false => {}
    }
}

After all, if &bool cannot be assumed to refer to a valid instance of bool, then that match is not really exhaustive.

And the logic for &! is exactly the same, the only difference is the number of valid instances the type does have, which is a free parameter of the whole question.

(I had been hoping to try to elucidate why it feels absurd to me to require the typechecker to follow the same rules here as the optimizer, but failing that for now, I hope this example at least shows that something is not quite right with that assumption.)

glaebhoerl · January 13, 2017, 7:39pm

Maybe the more general point is this: the purpose of the memory model / unsafe code guidelines is to figure out when the optimizer can and cannot trust the type system. Taking the answers to that question and then applying them to the type system, rather than the optimizer, seems like it gets things mixed up.

nikomatsakis · January 13, 2017, 7:39pm

I think that the type-checker and optimizer are certainly connected. If you are in unsafe code, after all, it is presumably possible (and legal) to have an Err(x) value (where x: &!), but in safe code that should never happen (because only unsafe code could construct such a value, and it ought not to have released it to you). Put another way, there are some circumstances in which the optimizer would be able to figure out that it would be UB if the Err were inhabited, and hence it can assume that it is not; in some subset of those circumstances, we might make the typechecker reason in the same fashion. I say subset because presumably the typechecker rules should be conservative and easier to explain.

briansmith · January 13, 2017, 9:25pm

It's better to just make it impossible to construct a value of type &!. There are no values of type ! so any *const ! or *mut ! must be NULL and so attempting to make a &! is statically detectable to be nonsense. (This is clear if *const T is defined to be isomorphic to Option<&T> and *mut T is defined to be isomorphic to Option<&mut T>.)
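The analogy with Option can be sketched using the standard library's `as_ref` method on raw pointers, which maps null to None. This is only an illustration with i32 rather than !, and the wrapper name `to_option` is hypothetical; it shows the null case only and says nothing about non-null pointers that do not point at a valid value.

```rust
// Hypothetical wrapper around the standard `as_ref` method on raw pointers.
//
// Safety: the caller must pass either a null pointer or a pointer to an
// `i32` that stays valid for the whole lifetime `'a`.
unsafe fn to_option<'a>(p: *const i32) -> Option<&'a i32> {
    // `as_ref` returns `None` for null and `Some(&*p)` otherwise.
    unsafe { p.as_ref() }
}
```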

My code does do enum Foo {} to define types that are opaque to Rust but not C (that is, C code can create an object of type Foo, return a pointer/reference to a heap-allocated Foo, free such an object given a pointer/reference, and inspect and manipulate its internals, but Rust cannot do anything except pass a pointer/reference to such a thing to C code). However, I think it is much more important for the type system to be logically sound, and I agree enum Foo {} is logically uninhabited. I would be happy to immediately change my code to stop using this pattern and I encourage other people to do the same, in order to help the language team give Rust reasonable semantics.

The only question I have is this: What exactly should wrappers around C libraries replace enum Foo {} with?

cuviper · January 13, 2017, 9:54pm

I don't think this relationship holds, given that nonsense like 42 as *const T is safe. Pointers can be anything, so I think even *const ! must be allowed to be non-NULL.
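A sketch of that cast with an empty enum standing in for ! (the name `arbitrary_void_ptr` is hypothetical); no unsafe block is needed because the pointer is never dereferenced:

```rust
enum Void {}

// Integer-to-pointer casts are safe; the result is a non-null raw pointer
// even though no `Void` value can live at that address.
fn arbitrary_void_ptr() -> *const Void {
    42usize as *const Void
}
```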

briansmith · January 13, 2017, 10:21pm

It depends on how the language is ultimately defined, especially regarding what happens when as_ref() or as &T or as &mut T is applied to a pointer that doesn't point to an object (of the right type). I would hope that we will be able to define at least a proper subset of Rust that operates only on references that actually refer to objects, i.e. a subset that doesn't contain as_ref() or as &T or as &mut T, or a subset that doesn't contain pointers at all.

Also, a Rust compiler could reasonably reject 42 as *const ! and 42 as *mut ! statically, since it knows there's no way an object of type ! can exist at address 42 by the definition of type !. Similarly, we could statically reject any extern function that returns *const ! or *mut ! since it isn't useful (AFAICT) to have an extern function that always returns a null pointer to an uninhabited type, and it wouldn't make sense for such a function to return anything other than null. More generally, we could say that by definition the only way to construct a *const ! or *mut ! is via core::ptr::null() or core::ptr::null_mut(), respectively.

hanna-kruppe · January 13, 2017, 10:29pm

Not only are arbitrary non-NULL raw pointers valid as @cuviper said (you can even construct them in safe code!), they also don’t carry the same aliasing guarantees as Option<&T> and Option<&mut T> even when they actually refer to valid values! Life is easier if you just accept that raw pointers have no inherent meaning and only have some limited obligations if they are actually dereferenced.

But let’s turn back to the question of &!. As weird as it feels to ever have a value of that type, the various partial proposals for unsafe code guidelines already imply that, in the presence of unsafe code, a reference may not carry all the same implications as a reference normally does. For example, several proposals care only about whether illegal memory accesses (for varying models of legality) actually happen at run time, so that a mutably aliased or dangling or otherwise “illegal” &T may very well exist, as long as it is not used.

The other question is whether it is useful to try and make provisions for unsafe code turning raw pointers to ! into references. The use of empty enums for opaque types would be one reason to do so. While it’s probably unavoidable that some currently existing code will be ruled UB by whatever rules are eventually adopted, this pattern is rather common and so it probably shouldn’t be broken without a measurable benefit (e.g., better optimization capabilities).

glaebhoerl · January 13, 2017, 11:10pm

Yes, because "normally" the optimizer can trust the results of the typechecker, and optimize based on them.

Right - in unsafe code, the optimizer can't always trust the typechecker, so although it should be able to assume that &! is unreachable, it refrains from doing so.

This feels backwards. The typechecker should always be able to reason that &! is uninhabited - that follows directly from the definition of those types. The optimizer not trusting the conclusions of the typechecker is an unfortunate (if highly prudent!) concession to practical reality. The typechecker not trusting itself seems plain insanity.

Again, I want to emphasize that the real question here is not "can the typechecker conclude that &! is uninhabited?", it's "when can the typechecker conclude that a match on &T is exhaustive?". And right now we say that it's exhaustive just as long as, inside the & pattern, it matches T itself exhaustively. Now, ! is not magical in this respect. It's just a type that happens to have two fewer inhabitants than bool, and one less than (). If a match on &bool needs two (non-wildcard) arms to be exhaustive and &() needs one, then &! should require zero. What is the motivation for treating it inconsistently? I contend there is none.
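The arm-counting argument can be sketched as follows, with an empty enum standing in for ! and hypothetical function names. Whether a zero-arm match directly on the reference should be accepted is exactly the point under debate, so this sketch assumes the conservative spelling and dereferences explicitly in the last function.

```rust
enum Void {}

// Two inhabitants behind the reference: two arms.
fn on_bool(arg: &bool) -> &'static str {
    match arg {
        &true => "true",
        &false => "false",
    }
}

// One inhabitant behind the reference: one arm.
fn on_unit(arg: &()) -> &'static str {
    match arg {
        &() => "unit",
    }
}

// Zero inhabitants: zero arms, written here as a match on the dereferenced
// place rather than on the reference itself.
fn on_void(arg: &Void) -> ! {
    match *arg {}
}
```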

Also again, the optimizer not trusting the validity of & values is not in any way specific to &! either - it is also for all &T. So that isn't a motivation. If we wanted to reflect this in the type system, we would also have to say that a match on &bool with &true and &false patterns is not exhaustive, which is plainly silly (along with backwards incompatible).

[I feel like we're not getting through to each other here, but don't know why?]
