Let's discuss Inhabited trait

scottmcm · July 16, 2018, 11:32pm

I think this is a massive understatement.

Right now, it's common to just let x = unsafe { mem::uninitialized() };, and I don't blame anyone who does, as the interface pushes you in that direction. But that's completely backwards from where the risk actually is.

Creating uninitialized memory can, and should, be safe.

let x = MaybeUninit::default(); being safe is great. It's like how let p = 53629 as *const _; is safe. Then the rules for uninitialized memory become really similar to those of pointers, where it's unsafe to read instead of to create, since that's where things actually go wrong. It'll be great for RawVec<T> to be able to give a &[MaybeUninit<T>] instead of its current "well, here's a pointer" -- it could even Deref to that slice safely.
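As a minimal sketch of the "safe to create, unsafe to read" split, the following uses `MaybeUninit::uninit()`, the constructor that was later stabilized in `std::mem`, in place of the `default()` spelling used in the post. The function name `make_initialized` is hypothetical.

```rust
use std::mem::MaybeUninit;

// Hypothetical helper: create an uninitialized slot, fill it, then read it.
fn make_initialized() -> u32 {
    // Creating the uninitialized slot needs no `unsafe`.
    let mut slot: MaybeUninit<u32> = MaybeUninit::uninit();
    // Writing into the slot is safe as well.
    slot.write(42);
    // Only the claim "this is now initialized" is unsafe,
    // because that is where things can actually go wrong.
    unsafe { slot.assume_init() }
}
```

The `unsafe` block sits at the read, mirroring how creating a raw pointer is safe but dereferencing it is not.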

newpavlov · July 16, 2018, 11:32pm

Then it will be one more trait in the same row as Sync and co.

But why such a condition on unsafe? If I write a struct with private members and no constructors, I doubt we should consider it an uninhabited type. In my understanding you can construct a value of the Haskal type, but due to its recursive nature you either get a segfault (dereferencing an invalid pointer) or a stack overflow (if you get a reference cycle). I am not proficient in type theory so I don't know how to express it strictly; maybe it's a type with infinite value size (on an abstract machine without segfault traps)?

Hm, apologies then. Looks like I've misunderstood you somewhere.

newpavlov · July 16, 2018, 11:40pm

Good point, your post should probably be part of the RFC motivation. In my experience I usually use uninitialized within a single unsafe block in which I also fill in the data, but I understand the pitfall you describe. Nevertheless, I believe the issues around inhabited and uninhabited types remain, and that an Inhabited auto-trait is the best solution, which ideally should be in the language. In the MaybeUninit case it should be forbidden to create MaybeUninit<!>, which I don’t think can be done nicely without an Inhabited bound.

scottmcm · July 17, 2018, 12:10am

Why do you think that? Calling MaybeUninit::<!>::default() is totally fine and safe.

Preventing it would have major costs, too. Like my RawVec example; it's sound to have a Vec<!> today, which uses RawVec<!> internally, but if MaybeUninit<!> was disallowed, then RawVec<!> wouldn't work either.

(Obviously you won't be able to get a ! out unless someone put one in -- which of course they can't -- but the requirements on Vec for that to be the case are no different than for any other type: you can't get a String out of a Vec without putting one in, either. So there's no need for extra ! restrictions.)
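A small sketch of this point on stable Rust, using `std::convert::Infallible` (an empty enum) as a stand-in for `!`, and `MaybeUninit::uninit()` as the constructor. The function names are hypothetical.

```rust
use std::convert::Infallible;
use std::mem::MaybeUninit;

// Safe: creating the slot never claims that a value of the
// uninhabited type exists.
fn make_slot() -> MaybeUninit<Infallible> {
    MaybeUninit::uninit()
}

// A Vec of an uninhabited type can exist; it simply never holds elements,
// so the closure below can never actually be called.
fn drain_all(v: Vec<Infallible>) -> Vec<String> {
    v.into_iter()
        .map(|never| -> String { match never {} })
        .collect()
}
```

Nothing here needs an extra restriction for the uninhabited case: the only way to get an element out is to have put one in.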

notriddle · July 17, 2018, 12:14am

Just because you can unsafe your way past the compiler doesn't mean you've constructed a valid instance of the type. After all, you can transmute an instance of the enum Void {} type into existence, too, but we still call it uninhabited.

If constructing a box with zero is defined as having undefined behavior (which it is, for all boxes, not just the Haskal one), then zero does not inhabit the box type. If no values inhabit a type, then that type is uninhabited.

That's completely arbitrary. Inhabitedness is a degenerate case of the type system's existing rules; enum Void {} is uninhabited because enums are required to be one of the type's variants, and Void has no variants for it to be. Not because the size calculator deduces that no space needs to be allocated for it.

The fact that there are multiple ways to create an uninhabited type, beyond just the empty enums, is why I don't want to bake uninhabitedness into the type system.

newpavlov · July 17, 2018, 12:32am

Yes, but getting a value out of it does not make sense. The same goes for Vec<!>: I've asked several times, but no one has provided any practical examples of why such "pathological types" should be allowed.

I can construct such a Haskal with a Box that points to memory which stores a pointer to that same memory (i.e. the pointer and the stored value are equal). In practice it ends in a stack overflow, as mentioned in my previous post. Does it count as a value of the Haskal type? Yes, this value would require infinite memory, but theoretically it's a valid value, no?

So shouldn't we properly specify the ways to create uninhabited types as an improvement to Rust's type system instead of averting our eyes from the problem, and build fail-safes around some of the obvious UB?

notriddle · July 17, 2018, 12:50am

If that's true, then ! is inhabited. You just have to run an infinite loop to completion to construct it. Though, actually, I'm not sure if a Box is allowed to own itself (allocate some memory, write that memory's own address into it, then transmute). You sure can't drop it, but you might be allowed to forget it.

struct Endless<'a>(&'a mut Endless<'a>, &'a mut Endless<'a>);

Assuming we don't allow an infinite graph, the pointers need to either form a loop or dangle. It can't be dangling, because exclusive references are not allowed to dangle. It can't be a loop, because that would mean both endless.0 and endless.1 form paths to endless, violating the no-aliasing rule of mut-references.

Do you want to bake that kind of complicated reasoning into the type system? Or do you want to have the type system treat it as an inhabited type, even though it isn't?

scottmcm · July 17, 2018, 12:53am

Suppose you want to parse multiple things, and separate the parsed values and the errors. Something like this:

use std::str::FromStr;
pub fn parse_many<T: FromStr>(xs: &[&str]) -> (Vec<T>, Vec<T::Err>) {
    let mut successes = Vec::new();
    let mut errors = Vec::new();
    for x in xs {
        match x.parse() {
            Ok(v) => successes.push(v),
            Err(e) => errors.push(e),
        }
    }
    (successes, errors)
}

Well that uses Vec<Uninhabited> when T = String.
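To make that concrete on stable Rust, where the `FromStr` error type of `String` is `std::convert::Infallible` (an empty enum standing in for `!`), here is a usage sketch. `parse_many` is repeated so the block stands alone, with an explicit turbofish on `parse`; the two wrapper functions are hypothetical.

```rust
use std::convert::Infallible;
use std::str::FromStr;

pub fn parse_many<T: FromStr>(xs: &[&str]) -> (Vec<T>, Vec<T::Err>) {
    let mut successes = Vec::new();
    let mut errors = Vec::new();
    for x in xs {
        match x.parse::<T>() {
            Ok(v) => successes.push(v),
            Err(e) => errors.push(e),
        }
    }
    (successes, errors)
}

// With T = String the error vector has an uninhabited element type,
// so it is necessarily empty.
fn parse_strings() -> (Vec<String>, Vec<Infallible>) {
    parse_many::<String>(&["a", "b"])
}

// With T = i32 the same function collects real errors.
fn parse_ints() -> (Vec<i32>, usize) {
    let (ok, errs) = parse_many::<i32>(&["1", "x", "3"]);
    (ok, errs.len())
}
```

The generic function needs no special case for the uninhabited error type; forbidding such a `Vec` would make `parse_many::<String>` fail to compile.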

felix.s · July 17, 2018, 6:41pm

(Slightly off-topic idle question: wouldn't it suffice for the compiler to assume the type is uninhabited unless it succeeds at proving otherwise by induction over type structure?)

mcy · July 17, 2018, 6:59pm

Depends on how smart it has to be to prove that. For example, proving that every field of a struct is inhabited is insufficient, because of the &mut alias example.
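As a rough illustration of "assume uninhabited unless proven otherwise by induction", here is a toy checker over a hypothetical mini type language. `Ty`, `provably_inhabited`, and the two example functions are invented for this sketch and are not rustc's representation. It deliberately has no reference variant, so it says nothing about the &mut aliasing case.

```rust
// Hypothetical toy model of types; not how the compiler represents them.
enum Ty {
    Unit,
    Never,
    Struct(Vec<Ty>), // inhabited only if every field is
    Enum(Vec<Ty>),   // inhabited if some variant payload is
    Boxed(Box<Ty>),  // owning pointer: needs an inhabited pointee
    Named(usize),    // refers to defs[i], which allows recursion
}

// Returns true only when a finite proof of inhabitedness is found.
fn provably_inhabited(ty: &Ty, defs: &[Ty], visiting: &mut Vec<usize>) -> bool {
    match ty {
        Ty::Unit => true,
        Ty::Never => false,
        Ty::Struct(fields) => fields.iter().all(|f| provably_inhabited(f, defs, visiting)),
        Ty::Enum(variants) => variants.iter().any(|v| provably_inhabited(v, defs, visiting)),
        Ty::Boxed(inner) => provably_inhabited(inner, defs, visiting),
        Ty::Named(i) => {
            // Re-entering a definition on the current path yields no proof.
            if visiting.contains(i) {
                return false;
            }
            visiting.push(*i);
            let ok = provably_inhabited(&defs[*i], defs, visiting);
            visiting.pop();
            ok
        }
    }
}

// struct Haskal(Box<Haskal>);
fn haskal_is_provable() -> bool {
    let defs = vec![Ty::Struct(vec![Ty::Boxed(Box::new(Ty::Named(0)))])];
    provably_inhabited(&Ty::Named(0), &defs, &mut Vec::new())
}

// enum List { Nil, Cons(Box<List>) }
fn list_is_provable() -> bool {
    let defs = vec![Ty::Enum(vec![Ty::Unit, Ty::Boxed(Box::new(Ty::Named(0)))])];
    provably_inhabited(&Ty::Named(0), &defs, &mut Vec::new())
}
```

In this model the Haskal-style definition comes out unproven while the list comes out inhabited through its `Nil` case. A real checker would also have to handle references, generics, and privacy, which is where the difficulty described in the thread lies.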

notriddle · July 17, 2018, 7:28pm

You can't replace Rust's existing inhabitedness checker with one like that.

Right now, Rust allows you to use a match void {} to convert from uninhabited types to any other type. This has to be a conservative guess in the opposite direction: do not allow empty match unless void is definitely uninhabited, because empty match has the same type signature as transmute.
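A short sketch of why the empty match is as powerful as the paragraph above says: it produces a value of any type the caller asks for, so it is only sound when the scrutinee type really has no values. `Void`, `absurd`, and `unwrap_infallible` are illustrative names.

```rust
enum Void {}

// The empty match type-checks as returning any T at all,
// which is the same signature shape as a transmute.
fn absurd<T>(v: Void) -> T {
    match v {}
}

// Typical use: discharge an error branch that can never occur.
fn unwrap_infallible<T>(r: Result<T, Void>) -> T {
    match r {
        Ok(v) => v,
        Err(e) => absurd(e),
    }
}
```

If the checker ever accepted `match v {}` for a type that does have values, `absurd` would become a safe function that fabricates arbitrary values.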

I guess that you could define it so that the Inhabited trait uses a different analysis than the match checker, but that would mean APIs that have an Inhabited bound might not work with types that the user is able to construct and use in normal live code. That sounds like a really complex "feature" that would hit like an unexpected slap in the face. You'd be stuck fighting the inhabitedness checker just like you currently fight the borrow checker.

felix.s · July 17, 2018, 8:01pm

Not if you have to prove T is inhabited before establishing that &mut T is inhabited.

On the other hand though... it seems it's perfectly possible to at least construct &mut Haskal with some dextrous use of unsafe, essentially just as @notriddle described:

unsafe {
    let mut val: Box<Haskal> = Box::new(mem::uninitialized());
    ptr::write(&mut val.0, mem::transmute::<*mut Haskal, Box<Haskal>>(&mut *val) );
    let valp: *mut Haskal = &mut *val;
    mem::forget(val);
    mem::transmute::<_, &mut Haskal>(valp)
};

Of course, this is a silly example, and its validity (not to mention usefulness) is questionable. But hey, it compiles on playground and behaves as expected (which is to say, it blows the stack when attempting to print it out).

Uninhabited enums only, it seems. This fails to compile, even on nightly:

enum Null {}
struct Void(Null);

fn stare_into(void: Void) {
    match void {}
}

If you had in mind something like enum AmIInhabited<T> { X(T) }, I think it is already the case today that the type checker has to assume that any type parameter may be potentially used with an inhabited type, and so it cannot be assumed to be uninhabited.
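For a struct wrapper like the one in this post, one way to stay within the empty-enum rule is to match on the field rather than on the struct. This is only a sketch; `into_ok` is a hypothetical helper added to give the example something callable.

```rust
enum Null {}
struct Void(Null);

// Matching on the field works because `void.0` has the empty enum type.
fn stare_into(void: Void) -> ! {
    match void.0 {}
}

// The same projection lets an impossible error arm be discharged.
fn into_ok(r: Result<u32, Void>) -> u32 {
    match r {
        Ok(v) => v,
        Err(void) => stare_into(void),
    }
}
```

This keeps the reasoning local: the compiler only has to see that `Null` has no variants, not that `Void` as a whole is uninhabited.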

newpavlov · July 17, 2018, 8:03pm

Ideally I would like to have a smart compiler that recognizes type uninhabitedness and notifies the user. We could use an explicit #[uninhabited] attribute to silence the warning.

Can you provide another example of a non-trivial uninhabited type (i.e. not empty enums or their composites) which is not recursive?

@scottmcm

Hm, good example. In practice I am not sure why a user would take this approach instead of simply going with Vec<Result<T, T::Err>>, but nevertheless I can accept it as motivation for allowing Vec<!> (though personally I am still neutral-to-negative about this feature).

mcy · July 17, 2018, 8:55pm

This holds for all T:

use std::mem;

// Claims to produce a value of any T without ever constructing one.
fn materialize<T>() -> T {
    unsafe {
        mem::uninitialized()
    }
}

Of course, as with any other sketchy way of assembling a type, this is UB, and monomorphization for T = ! will cause the compiler to emit halt-and-catch-fire (though the compiler is free to not do this and return garbage instead).

felix.s · July 19, 2018, 7:12pm

My example is a bit more sophisticated than a mere mem::uninitialized::<T>(): it actually attempts to construct a value upholding the invariants of its type, and arguably it succeeds. It’s not conceptually very different from constructing a reference cycle with Rc.

It is true it does make some sketchy ABI assumptions (that Box<T>, &mut T and *mut T have the same memory representation and differ only in ownership semantics) and the intermediate state with a not-yet-leaked local val: Box<Haskal> binding holding a value which points back to itself violates the logical invariants of Box. But merely violating the logical invariants of a type is not UB by itself. You’d have to invoke code whose safety relies on those invariants. (That’s why RalfJung’s evil function doesn’t need unsafe.)

mcy · July 19, 2018, 7:40pm

I can see why you'd take that position, but I disagree. Rc explicitly allows cycles, and when working with them you need to be careful not to create them. After all, leaking memory is allowed in safe Rust (see Box::leak). On the other hand, here you've created a horror of horrors: an aliased Box; in this case, a Box which contains itself (which produces exciting results when printed, or, better yet, dropped!).
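For contrast, here is a sketch of the Rc side of that comparison: a reference cycle built entirely in safe code. `Node` and `make_cycle` are hypothetical names.

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

// Builds a node that points at itself and reports its strong count.
fn make_cycle() -> usize {
    let a = Rc::new(Node { next: RefCell::new(None) });
    // Store a second strong handle to `a` inside `a` itself.
    *a.next.borrow_mut() = Some(Rc::clone(&a));
    // Two strong references now exist: the local `a` and the one in `next`.
    Rc::strong_count(&a)
}
```

When `a` goes out of scope the count drops to one rather than zero, so the node is leaked but no invariant of Rc is broken, which is the distinction being drawn against a self-owning Box.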

Imagine the analogous C++ situation of std::unique_ptr and std::shared_ptr.

Depends on how much of that type is blessed... and Box is (for unfortunate, hilarious reasons) among the holiest of holies of such types. As listed in the nomicon's (exhaustive) list of UB types,

I think it is completely fair to consider that violating the invariants of a type foreign to your crate, which its unsafe interface does not permit (with or without strings attached), is UB. Especially for lang items like Box. Note that you did not use any of Box's unsafe interfaces, but instead performed great evils with std::{mem, ptr}.

Like, I get the point you're trying to make, and I agree that the compiler trying to prove that Haskal is uninhabited is going to result in... hilarious, unforeseeable problems (unless a repr hint is allowed to prevent this).

Centril · July 19, 2018, 9:16pm

I would like to reinforce this point: if a type is not in on you temporarily violating its invariant to build a safe interface around it, then invalidating its invariant is in fact the source of unsoundness and thus undefined behavior.

For example, say I define:

use std::mem;
use std::marker::PhantomData;

pub struct Id<S: ?Sized, T: ?Sized>(PhantomData<(*mut S, *mut T)>);

impl<T: ?Sized> Id<T, T> { pub const REFL: Self = Id(PhantomData); }

impl<S: ?Sized, T: ?Sized> Id<S, T> {
    /// Casts a value of type `S` to `T`.
    ///
    /// This is safe because the `Id` type is always guaranteed to
    /// only be inhabited by `Id<T, T>` types by construction.
    pub fn cast(self, value: S) -> T where S: Sized, T: Sized {
        unsafe {
            // Transmute the value;
            // This is safe since we know by construction that
            // S == T (including lifetime invariance) always holds.
            let cast_value = mem::transmute_copy(&value);
     
            // Forget the value;
            // otherwise the destructor of S would be run.
            mem::forget(value);
     
            cast_value
        }
    }
}

Say that you now, outside of the crate, take some ZST and transmute it to refl : Id<S, T> where S and T are nominally unequal types. If you do that, you can use refl.cast(expr) to cast any expr : S to type T, and thus you've bricked the type system and introduced UB.
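For reference, this is roughly what the intended, sound use looks like. It is a condensed restatement of the definition (Sized parameters only), placed in a hypothetical module `witness` so the private field matters; `roundtrip` is also a made-up name.

```rust
mod witness {
    use std::marker::PhantomData;
    use std::mem;

    // Private field: code outside this module cannot build an `Id` directly.
    pub struct Id<S, T>(PhantomData<(*mut S, *mut T)>);

    impl<T> Id<T, T> {
        // The only constructor exposed to safe code, and it forces S == T.
        pub const REFL: Self = Id(PhantomData);
    }

    impl<S, T> Id<S, T> {
        pub fn cast(self, value: S) -> T {
            unsafe {
                // Relies on the invariant that S == T for every `Id` value.
                let out = mem::transmute_copy(&value);
                // Avoid running the destructor of the moved-from value.
                mem::forget(value);
                out
            }
        }
    }
}

// Safe callers can only obtain Id<T, T>, so `cast` is an identity function.
fn roundtrip(s: String) -> String {
    witness::Id::<String, String>::REFL.cast(s)
}
```

The soundness of `cast` rests entirely on nobody conjuring an `Id<S, T>` with S != T, which is exactly the invariant a foreign transmute would break.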

earthengine · August 15, 2018, 2:03pm

Technically, we do have a way to declare that a type is “uninhabited”: by definition, a ! value can be coerced to any type automatically, but this is not true for empty enums. However, we can at least write

enum Void {}
impl From<Void> for ! {
    fn from(v:Void) -> ! {
        match v { }
    }
}

People will not want to write this manually; but when they write the trait bound T: Into<!>, the compiler can use it to assume T is uninhabited.
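Since `!` is not available as a full type on stable Rust, the same idea can be sketched with `std::convert::Infallible` in its place. This is an analogy under that assumption, not the proposal itself; `absurd` and `unwrap_never` are hypothetical names.

```rust
use std::convert::Infallible;

enum Void {}

// The "proof" that Void is uninhabited: a conversion into an empty enum.
impl From<Void> for Infallible {
    fn from(v: Void) -> Infallible {
        match v {}
    }
}

// Any type that converts into Infallible can be turned into anything.
fn absurd<T: Into<Infallible>, U>(t: T) -> U {
    let never: Infallible = t.into();
    match never {}
}

// Generic code can then treat the bound as "E has no values".
fn unwrap_never<U, E: Into<Infallible>>(r: Result<U, E>) -> U {
    match r {
        Ok(u) => u,
        Err(e) => absurd(e),
    }
}
```

Here the bound is honored by library code through the explicit conversion; whether the compiler itself should draw conclusions from such a bound is the open question in the post.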

system · March 25, 2019, 8:30am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
