Pattern binding modes +!

&! is an inhabited type. As such,

fn test(it: &!) {
    match it {}
}

fails to compile, as the patterns provided are nonexhaustive.

Even with match ergonomics' default binding modes, a bare binding pattern x matches at type &!, which is still inhabited.

But... a pattern &x dereferences the reference and observes the !. As such, you can make a case that we should consider the inhabitedness behind references when checking if a match scrutinee is covered by zero patterns.

This would allow replacing the

  = note: the matched value is of type `&!`
  = note: references are always considered inhabited

diagnostic with the code "just working."

But the question is: do we want to make this argument? It would paper over the "&! is still inhabited" point a bit, but it would also make it easy to accidentally write code that assumes it isn't. (Is there a soundness problem with assuming it's not inhabited when it's just unconstructable? There is in the other direction.) I'm not sure either way.
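For comparison, here is a minimal sketch of what already compiles on stable Rust, where `!` is not yet usable as a general type. `Void`, `absurd`, and `unwrap_infallible` are hypothetical names invented for this example, with `Void` standing in for `!`. Dereferencing the scrutinee by hand makes the zero-arm match legal, which is roughly what the proposal would have the compiler do implicitly.

```rust
/// Hypothetical stand-in for `!`, which is not a stable type.
enum Void {}

/// Accepted today: the scrutinee `*it` is a place of the uninhabited type
/// `Void`, so zero arms are exhaustive. `match it {}` would be rejected.
fn absurd(it: &Void) -> String {
    match *it {}
}

/// The same trick inside a larger pattern: the `Err` arm has to be written,
/// but its body can discharge itself with an explicit dereference.
fn unwrap_infallible(r: Result<usize, &Void>) -> usize {
    match r {
        Ok(v) => v,
        Err(e) => match *e {},
    }
}
```

Calling `unwrap_infallible(Ok(7))` returns the wrapped value; `absurd` can never actually be called from safe code, since no `&Void` can be produced safely.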

The question:

how could you call test?

A value of the never type cannot be created, so I don't know how one would ever create a &!.

You can't safely construct &!. However, the type &! itself is useful in a generic context for the same reason ! is.

let v: String = match it {};

If &! is inhabited, this is UB.
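To illustrate the generic-context point, here is a hedged sketch in stable Rust. The names `Void`, `Source`, `Counter`, and `error_of` are all invented for the example, and `Void` again stands in for `!`. An implementation that cannot fail picks `Void` as its error type, and generic code that hands out `&S::Error` then mentions `&Void` without anyone ever constructing one.

```rust
/// Hypothetical empty enum standing in for `!`.
enum Void {}

/// Hypothetical trait whose error type is chosen by the implementor.
trait Source {
    type Error;
    fn next(&mut self) -> Result<u8, Self::Error>;
}

struct Counter(u8);

impl Source for Counter {
    // This source cannot fail, so its error type is uninhabited.
    type Error = Void;

    fn next(&mut self) -> Result<u8, Void> {
        self.0 += 1;
        Ok(self.0)
    }
}

/// Generic helper that borrows the error out of a result.
/// For `Counter` the return type is `Option<&Void>`, which can only be `None`.
fn error_of<S: Source>(r: &Result<u8, S::Error>) -> Option<&S::Error> {
    r.as_ref().err()
}
```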

IMHO, since a &! can be created, it must be matched with something:

#![feature(strict_provenance)]
#![feature(never_type)]

fn test(it: &!) -> usize {
    println!("before test");
    let ans = match it {
        ans => {
            let ans = (ans as *const !).addr();
            println!("in match");
            ans
        }
    };
    println!("fine");
    drop(it);
    println!("after test");
    ans
}

fn wtf() {
    let a = 1usize;
    let b = (&a as *const usize).cast::<!>();
    unsafe {
        dbg!(test(&*b));
        return;
    };
}

fn main() {
    wtf();
    println!("wtf done")
}

Note that creating a &! is considered UB. The type is treated as inhabited to keep open the possibility of allowing that, but currently it is both inhabited and unconstructable.

My reading: being inhabited means you can write code that creates a value with said type, in this case, todo!() or &todo!(). That doesn't mean that code will run (to completion).

For example, you can't write code that constructs a value of enum Empty {}, even only at the type level. You need to resort to an extern function: Rust Playground

let e: Empty = todo!(); works just fine.

I still like the idea of being (able to be) explicit about not matching (match e { ! }) combining a ! pattern that matches a ! value, and being able to elide the code block for an unreachable match. Though that doesn’t help with something like let Ok(v): Result<usize, &!> = e; if the implicit Err(&!) branch isn’t considered unreachable.

Interesting, that does work, and so does let e: &Empty = todo!(), but let e: &Empty = &todo!() doesn't! I wonder what the logic is there.

Simple: the type of todo!() is !. ! can coerce to any type, but coercion is not recursive (because this is impossible in the general case), and thus &! cannot coerce to &Never. You can do let _: &Never = &(todo!() as _);.
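A small sketch of the coercions that do work, written for stable Rust. `Never` here is a hypothetical empty enum (as in the post above), and the function names are made up for illustration. The rejected form is left as a comment, as reported in the thread, rather than compiled.

```rust
/// Hypothetical empty enum, as in the post above.
enum Never {}

/// `todo!()` has type `!`, and the function body is a coercion site,
/// so `!` coerces directly to `&'static Never`.
fn never_ref() -> &'static Never {
    todo!()
}

/// The same coercion at a match arm, with an ordinary target type.
fn get_or_todo(x: Option<u32>) -> u32 {
    match x {
        Some(v) => v,
        None => todo!(),
    }
}

// By contrast, the thread reports that this form is rejected, because it
// would need `&!` to coerce to `&Never`:
//     let e: &Never = &todo!();
```

Both functions type-check; `never_ref` simply panics when it is called, which is the "that doesn't mean the code will run to completion" point made earlier.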

Code:

enum E{
   Safe(u8),
   Unsafe(u8,&!),
}

fn foo() {
   let data = E::Safe(10);
   //use unsafe to mutate discriminant, and co.
   //have an `E::Unsafe` variant of the enum
}

We can handle this by adding something like an unsafe match. This way we can both provide access to the data in unsafe code and have nice exhaustive matching in safe code:

fn safe(data: E) {
   let E::Safe(byte) = data; // if data is `E::Unsafe` - panic
}

unsafe fn usafe(data: E) {
   match data {
      E::Safe(byte) => {},
      E::Unsafe(byte, r) => {}, // note this variant
   }
}

Yes, I know that panicking out is not good at all... but it's the only easy solution.
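For what it's worth, stable Rust can already express the safe side of this without panicking and without new syntax, at the cost of one explicit arm. A sketch under some assumptions: `Void` is a hypothetical empty enum replacing `!`, a lifetime parameter is added so the enum compiles, and `byte_of` is an invented name.

```rust
/// Hypothetical empty enum standing in for `!`.
enum Void {}

enum E<'a> {
    Safe(u8),
    Unsafe(u8, &'a Void),
}

fn byte_of(data: E<'_>) -> u8 {
    match data {
        E::Safe(byte) => byte,
        // `*r` is a place of type `Void`, so a zero-arm match is exhaustive
        // and this arm needs neither a panic nor a dummy value.
        E::Unsafe(_, r) => match *r {},
    }
}
```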

That’s a contradiction in terms. The definition of an inhabited type is that it not only has an element, but you have the constructor to prove it. Put another way: if &! is inhabited, what is it inhabited by?

I can accept &! being called nonempty

I can construct an instance of Never or ! just as easily:

let never: Never = unsafe { std::mem::transmute(()) };
println!("{:p}", &never);
println!("This is obvious UB, don't do this");

[playground]

&! is a type which has valid values but no safe values. This is the difference between safety and validity invariants.

The validity invariant of references is still an open question, but it's generally accepted that the validity should be shallow, and not require the validity of the pointee. If you don't require the validity of ! as a pointee, you're left with a reference which must be dereferenceable for size_of::<!>() == 0 bytes and aligned to align_of::<!>() == 1 bytes. If you satisfy that, you have a valid value, and thus the type is inhabited.
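The size and alignment figures can be checked on stable Rust with `std::convert::Infallible`, the standard library's empty enum, used here as a stand-in for `!` (an assumption of this sketch; `layout` is an invented helper name).

```rust
use std::convert::Infallible;
use std::mem::{align_of, size_of};

/// Layout facts for an empty type and a reference to it.
fn layout() -> (usize, usize, usize) {
    (
        size_of::<Infallible>(),  // pointee size
        align_of::<Infallible>(), // pointee alignment
        size_of::<&Infallible>(), // the reference itself is an ordinary pointer
    )
}
```

So a shallow validity check on such a reference only has "non-null and 1-aligned" left to verify.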

So if you want to be specific, you could say that &! is inhabited but not safely inhabited, as there is no value which satisfies the safety invariant (which, for references, is that the referent satisfies its safety invariant, plus something to guarantee the borrow model is inviolable).

I think there is a safe thing you could do with an inhabited value of &!, which would be std::ptr::eq.

That doesn't really make any sense. Having a value of type T means that you can perform all operations which are defined on T. For references, it means that I can dereference it and get a value of the pointee, which is impossible with &!. This means that you can't construct an element of &!, which means it isn't inhabited.

Making an "unsafe" element of &! via unsafe code has about as much sense as finding a magical "unsafe" third value of bool.

What is inhabited is *mut !, since raw pointers don't guarantee any safe operations which could expose a pointee. Any non-null pointer is a valid element of *mut !, but dereferencing it is always UB.
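A sketch of the raw-pointer side in stable Rust, with `std::convert::Infallible` standing in for `!` (an assumption of this example; the helper names are invented). Nothing here needs `unsafe`, because nothing reads the pointee.

```rust
use std::convert::Infallible;
use std::ptr::{self, NonNull};

/// Builds raw pointers to an empty type without any `unsafe`.
fn raw_pointers() -> (*mut Infallible, *const Infallible) {
    // Well-aligned and non-null, but pointing at nothing in particular.
    let dangling: *mut Infallible = NonNull::<Infallible>::dangling().as_ptr();
    // The null pointer is an equally ordinary value of the pointer type.
    let null: *const Infallible = ptr::null();
    (dangling, null)
}

/// Comparing addresses never reads the pointee, so this is safe code.
fn same_address(a: *const Infallible, b: *const Infallible) -> bool {
    ptr::eq(a, b)
}
```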

This is true of the safety invariant, yes. If I have a str which is valid UTF-8, I can do anything that the API offers. If I have a str which is invalid UTF-8, though, which does not satisfy the safety invariant but does the validity invariant, I can't do many otherwise safe operations; for example, s.chars().last() could index out of bounds, causing UB.

It is the safety invariant that you can dereference &T and get a valid T. It is a validity invariant that you can dereference &T and get size_of::<T>() bytes, and that the reference is aligned to align_of::<T>() bytes.
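The str case can be sketched from the safe side. The checked constructor `std::str::from_utf8` establishes the safety invariant up front, which is what lets later safe operations skip their own checks; `std::str::from_utf8_unchecked` is the `unsafe` counterpart that leaves the obligation with the caller. The helper names below are invented for the example.

```rust
/// Safe constructor: rejects byte slices that are not valid UTF-8.
fn checked(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}

/// Safe to call on any input, because the invariant is verified first.
fn last_char(bytes: &[u8]) -> Option<char> {
    checked(bytes)?.chars().last()
}
```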

Of course, this is non-normative, but this is the general consensus of the unsafe code working group, and making validity recursive is really hard to check for basically no optimization benefit.

Making a bool which is not 0x00 or 0x01 is a different class of invalid, because that's a validity invariant of the type. Moreover, it's the "bitstring validity," which is used for layout optimizations.

&! could still potentially have no valid (not even unsafe-but-valid) values, if we make it part of the validity invariant that the pointee type is inhabited. But as currently informally defined, transmuting &() to &! gives you a valid but unsafe instance of &!. Since it does not fulfill the safety invariant of a reference, the only sound operations are to

  • do a typed copy,
  • turn it into a pointer, or
  • dereference to size_of::<T>() bytes (but not do a typed copy of said bytes, or otherwise use them at type T).

This ignores any retagging operations done by Stacked Borrows or any other memory model enforcing the borrow quality of references, as this does care about the type of the pointee (though similarly doesn't necessarily recurse into retagging the pointee's contained references).

We might end up renaming safety/validity invariants in the future, because informally "validity" can refer to an instance which satisfies the safety invariants. But those are the terms we currently have.

(And so is the null pointer.)

Note that it's only UB to dereference at type !. MaybeUninit<!> is fine, as is [u8; 0] or any other inhabited ZST. The UB from dereferencing a pointer to ! comes from doing a typed copy of an uninhabited type, whose validity invariant is false.
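A hedged sketch of that distinction on stable Rust, with `std::convert::Infallible` standing in for `!` and invented helper names. The safety comment reflects my understanding of the current, non-normative rules for zero-sized reads rather than a settled specification.

```rust
use std::convert::Infallible;
use std::mem::MaybeUninit;
use std::ptr::NonNull;

/// `MaybeUninit<Infallible>` is an inhabited zero-sized type,
/// so creating a value of it is safe.
fn uninit_never() -> MaybeUninit<Infallible> {
    MaybeUninit::uninit()
}

/// Reads through a pointer-to-empty-type at the inhabited ZST `[u8; 0]`.
fn read_as_empty_array() -> [u8; 0] {
    let p: *const Infallible = NonNull::<Infallible>::dangling().as_ptr();
    // SAFETY (as I understand the current rules): the pointer is non-null and
    // aligned for `[u8; 0]`, and a zero-sized read touches no memory.
    // Reading it at type `Infallible` instead would be UB.
    unsafe { p.cast::<[u8; 0]>().read() }
}
```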

I don't see why matching on a reference should be treated as a special case compared to any other operation that dereferences the reference. What is unique about this case?

Also, we allow

fn test(it: &bool) {
    match &it {
        &true => (),
        &false => (),
    }
}

even though it's possible to construct an &bool in unsafe code that doesn't uphold bool's validity invariants. Again, why should we have a special case?

&T pointing to a valid T is not currently part of the validity invariant, just the safety invariant. (This is not a guaranteed decided truth.) So this is a false equivalence here.

And if pointing to a valid T is part of the validity invariant of &T, then this is in support of match (x: &!) {} being valid, because there are no valid values of &!.

Well, matching doesn't dereference until it hits a pattern. match (x: &!) { ! => () } would dereference the reference, as would match x { &_ => () }, but match x { _ => () } doesn't dereference the reference.

The position of my OP is that we could, if we wanted, say that a match with no patterns does dereference the scrutinee, like a pattern could with default binding modes, and the result of that would be a zero-arm match on reference-to-uninhabited behaving the same as a zero-arm match on uninhabited.

Effectively, applying default binding modes to the "uninhabited pattern" to pattern match through references in the scrutinee.
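For readers less familiar with default binding modes, here is a small sketch on an ordinary inhabited type (function names invented for the example). The proposal amounts to letting a zero-arm match see through the reference the same way the non-reference pattern does in the first function.

```rust
/// Default binding modes: the scrutinee is `&Option<u32>`, the pattern is
/// written without `&`, and `n` is bound as `&u32`.
fn through_ref(opt: &Option<u32>) -> u32 {
    match opt {
        Some(n) => *n,
        None => 0,
    }
}

/// The explicit spelling: dereference the scrutinee and bind by reference.
fn explicit(opt: &Option<u32>) -> u32 {
    match *opt {
        Some(ref n) => *n,
        None => 0,
    }
}
```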

(To anyone: it's probably best to clarify explicitly whether a post is for or against &! being inhabited, and for or against a zero-arm match over &! being valid if &! is inhabited.)

To be explicit: my position is that

  • &T has a validity invariant that does not refer to the validity of T
  • Thus &! is inhabited
  • However, holding a &! is basically instant UB anyway, because
    • doing basically anything to &T asserts the validity of T
    • this includes minimally a retag of &T (which a typed copy does), as well as of course anything that actually uses T
    • the only thing that might be exempted is as casts
    • notably functions such as ptr::addr or ptr::eq retag and thus assert T validity
  • It would be a nice bit of ergonomics if match ergonomics default binding modes applied to the "uninhabited pattern" seeing through references
    • but since this only applies when the concrete type is known anyway, &! can just do *it instead of match it {}
    • so this basically only applies to custom uninhabited enums

To be clear, I understand that "T breaks validity invariants => &T breaks validity invariants" is likely too strong, and that "T breaks validity invariants => &T breaks safety invariants" is the current consensus. I see no reason to change that. But even given that latter standard, disallowing empty matches on references to uninhabited types seems strange to me, because in the &bool match example I gave, Rust makes no special allowances for the possibility of an &bool that breaks safety invariants, but in the case of &! it does. Or at least that's the impression I get.

I think this is the point where my intuition diverges from your rationale. I generally think of pattern matching in purely "declarative" terms, not in terms of performing operations like dereferencing. The idea that a specific pattern within a pattern match could have a side effect (in this case, a dereference invoking UB) just never occurred to me. I haven't written much complex unsafe code before, so maybe there are subtleties I'm missing. But I don't see why safe code should be encumbered just to remove 1 footgun for unsafe code that uses references to invalid values, even though such code would still have 9,999,999 other footguns to contend with (and should probably be using raw pointers anyway).