Let's discuss the Inhabited trait

newpavlov · July 16, 2018, 2:57pm

RFC 1892 suggests deprecating std::mem::uninitialized. The main motivation is to solve the problem of unsoundness around creating values of uninhabited types (e.g. ! or Void). In my opinion it looks like plugging holes without solving the underlying issue: currently we don’t have a way to distinguish inhabited and uninhabited types at the type level. Thus constraints which would make code safer cannot be expressed properly. The RFC lists “the old Inhabited trait proposal” as an alternative, but unfortunately I couldn’t find it.

IIUC the main issue with introducing an Inhabited trait is that, to make it work, it will have to be an automatic trait bound, so e.g. Result will have to explicitly use a ?Inhabited bound on the types of its variants. But this change can’t be done in a straightforward fashion, as it can break a lot of code. The most common uses of uninhabited types include:

FFI types (arguably a dirty hack, should be replaced with extern types)
Marker types (think e.g. BigEndian and LittleEndian types from byteorder crate), these types can benefit from explicit !Inhabited bounds
Void like types (which will be superseded by never type), arguably the only sensible use for them is in enum variants, to denote “impossible” cases.

So how can the Inhabited trait be introduced in a backwards-compatible way? As I see it, the solution is to make a violation of the Inhabited auto-bound issue a warning instead of a compile-time error in the Rust 2018 edition, and make it a hard error in the next edition. While this approach raises difficult questions regarding how it can be implemented, I think that in the long run we are better off having a proper Inhabited trait than plugging holes here and there.

hanna-kruppe · July 16, 2018, 3:17pm

Why would we want an Inhabited trait? What (useful) code is enabled by having this trait? What other issues are lurking that can’t be fixed with the proposed MaybeUninitialized type? There’s mem::zeroed which has similar issues with uninhabited types, but that one can also produce invalid bit patterns for inhabited types, so it can’t soundly be used in generic code anyway (without extra bounds that are more restrictive than, and would imply, Inhabited).

Also keep in mind that one possible outcome of the unsafe code guidelines is that it’s straight up UB to ever use mem::uninitialized for anything except maybe ZSTs. In that case we’d want MaybeUninitialized anyway.

newpavlov · July 16, 2018, 3:39pm

Because uninitialized and zeroed are not the only ways to create values of uninhabited types, e.g. you can write:

let a: Void = unsafe { mem::transmute(ZstType) };

And I think there are other holes like that, which I can’t recall now. So shouldn’t we simply place Inhabited trait bounds on functions and let the type system handle it from there? And in some cases you may want to use a !Inhabited trait bound, as in marker types, or if we take Result:

// Inhabited bound is redundant, I'll use it for explicitness
impl<T: Inhabited, E: !Inhabited> Result<T, E> {
    /// Safely unwraps Ok variant
    fn always_ok(self) -> T {
        // I think it will not work today, but in future compiler may
        // prove that Result<T, !> is equivalent to T
        unsafe { mem::transmute(self) }
    }
}

impl<T: !Inhabited, E: Inhabited> Result<T, E> {
    fn always_err(self) -> E { .. }
}

One of the arguments which I’ve heard against the Inhabited trait bound is that it will make things like Box<[!]> require explicit ?Inhabited bounds, which can infect a lot of code bases. But I haven’t heard an explanation of why we need such strange types in the first place.

mcy · July 16, 2018, 3:45pm

Please elucidate on this. I think an automatic trait bound with this much churn is difficult to justify. Moreover, most functions never need to care that they're handling an uninhabited type, because those functions will get optimized away (since they can't be called usually). It is the uncommon case where we want to explicitly ban uninhabited types.

newpavlov · July 16, 2018, 3:47pm

Having a value of an uninhabited type is UB. Period. We simply can’t rely on “they’ll get optimized away”. So functions should very much take care not to get into such cases, and the auto-bound will handle this for most code, without changing much for most Rustaceans.

hanna-kruppe · July 16, 2018, 3:49pm

That's UB in any case, and introducing an Inhabited trait won't prevent all those misuses from occurring. Obviously unsafe code can easily cause horrible UB, but that's neither news nor specific to inhabitedness. Deprecating uninitialized is more of a lint, not a soundness fix.

We don't have negative bounds, though, and it's far from clear whether we'll ever get them.

newpavlov · July 16, 2018, 3:54pm

How would transmute<T1: Inhabited, T2: Inhabited>(v: T1) -> T2 not prevent misuse of transmuting a ZST into a value of an uninhabited type? The compiler will simply reject code which tries to use a ?Inhabited type for T2. Yes, it will not prevent all possible problems with regard to uninhabited types, but as I see it most of them will be handled. Authors will have to explicitly opt into the possibility of using uninhabited types and think about the consequences.

For the time being I thought it could work in the same way as Sync does.
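
A bound of this shape can be approximated on stable Rust, which may help make the idea concrete. This is only a sketch: a hand-written `unsafe trait Inhabited` with manual impls stands in for the proposed auto trait (there is no `?Inhabited` opt-out here), and `transmute_inhabited` and `u32_to_bytes` are hypothetical names, not part of std.

```rust
use std::mem;

/// Hypothetical marker meaning "this type has at least one value".
/// A plain unsafe trait with manual impls stands in for the proposed auto trait.
unsafe trait Inhabited {}

unsafe impl Inhabited for u32 {}
unsafe impl Inhabited for [u8; 4] {}
unsafe impl Inhabited for () {}
// Deliberately no impl for `std::convert::Infallible`, which is uninhabited.

/// Hypothetical wrapper: a bitwise reinterpretation that only accepts
/// inhabited source and destination types.
///
/// # Safety
/// The bound only rules out uninhabited types. The caller must still
/// guarantee that the bits of `src` form a valid `Dst`.
unsafe fn transmute_inhabited<Src: Inhabited, Dst: Inhabited>(src: Src) -> Dst {
    assert_eq!(mem::size_of::<Src>(), mem::size_of::<Dst>());
    // Copy the bits out, then forget the source so it is not dropped twice.
    let dst = unsafe { mem::transmute_copy::<Src, Dst>(&src) };
    mem::forget(src);
    dst
}

/// Every bit pattern of a u32 is a valid [u8; 4], so this call is sound.
fn u32_to_bytes(x: u32) -> [u8; 4] {
    unsafe { transmute_inhabited(x) }
}

// Rejected at compile time, because `Infallible: Inhabited` does not hold:
// let v: std::convert::Infallible = unsafe { transmute_inhabited(()) };
```

The bound catches the ZST-to-uninhabited case from the earlier post, but the function still has to be `unsafe`: inhabitedness says nothing about whether the bit pattern is valid for the destination type.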

crlf0710 · July 16, 2018, 3:55pm

I don’t think Rust has a tradition of making markers for storage representation?

I’d imagine something like this:

trait TypeInfo {
    const SIZE_OF: usize;
    const INHABITED: bool;
    ...
}
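
As a rough illustration of how such associated constants could be consumed, here is a sketch with manual impls. In the idea above the compiler would presumably provide the values; here they are written by hand, and `can_hold_value` is a hypothetical helper.

```rust
use std::convert::Infallible;
use std::mem;

/// Hand-written version of the trait sketched above (no compiler support).
trait TypeInfo {
    const SIZE_OF: usize;
    const INHABITED: bool;
}

impl TypeInfo for u32 {
    const SIZE_OF: usize = mem::size_of::<u32>();
    const INHABITED: bool = true;
}

// `Infallible` is an empty enum: zero-sized and without any values.
impl TypeInfo for Infallible {
    const SIZE_OF: usize = mem::size_of::<Infallible>();
    const INHABITED: bool = false;
}

/// Hypothetical helper: generic code can branch on the constant.
fn can_hold_value<T: TypeInfo>() -> bool {
    T::INHABITED
}
```

Unlike a trait bound, a constant like this is only a value that generic code can inspect; it does not by itself make the compiler reject an instantiation.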

mcy · July 16, 2018, 3:57pm

Having a value of uninhabited type indicates that that code cannot be reached safely and that the linker can safely delete that code from the binary. The typeck does not know about panicking or anything like that, only about types, which it manipulates abstractly. For example, we can materialize a ! with

fn make_never() -> ! { loop {} }

I can create a reference to it, and dereference it, because ! is Copy:

let ref_never: &! = &make_never();
let never = *ref_never;

None of this code is UB.

Of course, at this point the compiler can assume that this code will never run because it manipulates empty types. As you know, there is no safe way to return a value from make_never. There are unsafe ways, and that is how you get UB. Usually, the compiler will insert a halt-and-catch-fire into these functions in debug mode, but will completely remove them in release mode.

It is completely silly to ban empty types by default, since empty types can be used in a generic context to express something that never happens. For example, if we get an analogue to C++'s ptr-to-member, you could imagine that T::*U could be made uninhabited if T has no field of type U. This adds a safety guarantee for calling functions with generic ptr-to-members, which would, in your proposal, generate bizarre errors.
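
The "generic context" use of empty types can be seen on stable Rust with std::convert::Infallible as the empty type. The `Parser` trait, `LenParser` and `parse_infallibly` below are hypothetical names made up for this sketch.

```rust
use std::convert::Infallible;

/// Hypothetical generic interface whose error type is chosen by the implementor.
trait Parser {
    type Error;
    fn parse(&self, input: &str) -> Result<usize, Self::Error>;
}

/// An implementation that cannot fail says so by picking an empty error type.
struct LenParser;

impl Parser for LenParser {
    type Error = Infallible;

    fn parse(&self, input: &str) -> Result<usize, Infallible> {
        Ok(input.len())
    }
}

/// No unwrap and no panic path: the error arm is statically unreachable.
fn parse_infallibly<P: Parser<Error = Infallible>>(parser: &P, input: &str) -> usize {
    match parser.parse(input) {
        Ok(n) => n,
        // A match with zero arms is exhaustive for a type with zero values.
        Err(never) => match never {},
    }
}
```

Generic code written against `Parser` handles the uninhabited error type without any special bound, which is the kind of code a default `Inhabited` bound would have to accommodate.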

newpavlov · July 16, 2018, 4:10pm

Ah, I think I've got what you meant. I should've added "uninhabited type value in code which will run". Yes, your example is not UB, in the same way as error branch is not UB for E=!:

match result { Ok(v) => { .. }, Err(e) => { .. } }

So if I understand you correctly, your worry is that code in the error branch will have to use a ?Inhabited bound, is that right? It's a tricky one, but can't the typechecker treat e as an Inhabited type if it can be proved that the respective block will never be reached?

I am not familiar with C++, but why not make it an Option instead? To me your ptr-to-member example looks like an excellent way to shoot yourself in the foot.

hanna-kruppe · July 16, 2018, 4:14pm

This is backwards incompatible, including for code that has no UB. You can transmute from an uninhabited type to any other type -- it'll be dead code, but it'll be fine.

Furthermore, there are plenty of ways to write transmute-like operations without using transmute. Pointer casts are one option, unions are another.

And finally, I'll point out that catching a few misuses of transmute really doesn't seem like a sufficient reason to add a new magic trait and a new default bound. It would be much more compelling if the trait enabled any useful safe abstractions.
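
For reference, the two transmute-like routes mentioned above (unions and pointer casts) look roughly like this. The `Bits` union and both function names are made up for illustration, and the example sticks to u32 and f32, for which every bit pattern is valid, so nothing here is UB.

```rust
/// Both fields share the same four bytes of storage.
union Bits {
    int: u32,
    float: f32,
}

/// Reinterpret via a union: write one field, read the other.
fn via_union(x: u32) -> f32 {
    let bits = Bits { int: x };
    // Reading a union field is unsafe because the compiler cannot know
    // which field was last written.
    unsafe { bits.float }
}

/// Reinterpret via a raw pointer cast and a dereference.
fn via_pointer_cast(x: u32) -> f32 {
    // u32 and f32 have the same size and alignment, so this read is in bounds
    // and aligned.
    unsafe { *(&x as *const u32 as *const f32) }
}
```

Neither function mentions transmute, so a bound placed only on transmute would not see them.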

newpavlov · July 16, 2018, 4:24pm

This is why I've proposed to make the Inhabited check a compiler error only in post-2018 editions, and to implement it as a warning in the 2018 edition. With an auto-trait bound, code without UB will not have any problems migrating. Yes, arguably it's a bit too much magic for the liking of some, but I think the lack of an Inhabited trait will be more harmful in the long run considering Rust's safety priorities.

This is exactly why you want the Inhabited bound: to be able to reason about whether a cast is safe by knowing that the type will always be Inhabited. As for unions, if you do not allow ?Inhabited on unions, you will not have the problem, as you will not be able to construct a union with an uninhabited variant. Of course we have an unfixable (?) hole with an extern fn which returns !, but IMO one rare hole is better than 5 common ones.

Inhabited, as I see it, is not only, and not so much, about catching transmute misuses, but about the ability to properly encode invariants at the type-system level.

Centril · July 16, 2018, 5:28pm

If we wanted to support use cases such as:

// Inhabited bound is redundant, I'll use it for explicitness
impl<T: Inhabited, E: !Inhabited> Result<T, E> {
    /// Safely unwraps Ok variant
    fn always_ok(self) -> T {
        // I think it will not work today, but in future compiler may
        // prove that Result<T, !> is equivalent to T
        unsafe { mem::transmute(self) }
    }
}

impl<T: !Inhabited, E: Inhabited> Result<T, E> {
    fn always_err(self) -> E { .. }
}

then one way to do that perhaps is to introduce an auto trait Uninhabited which looks like this:

pub unsafe auto trait Uninhabited {
    // will need to work around:
    // error[E0380]: auto traits cannot have methods or associated items
    fn absurd<T>(self) -> T;
}

we can then write:

impl<T, E: Uninhabited> Result<T, E> {
    /// Safely unwraps Ok variant
    fn always_ok(self) -> T {
        match self {
            Ok(x) => x,
            Err(x) => x.absurd(),
        }
    }
}

...

This involves no negative bounds.
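
A stable-Rust approximation of this idea, offered only as a sketch: an ordinary (non-auto) unsafe trait sidesteps E0380 because it is implemented by hand, std::convert::Infallible stands in for !, and `AlwaysOk` is a hypothetical extension trait, needed because inherent methods cannot be added to Result from outside std.

```rust
use std::convert::Infallible;

/// Hypothetical stand-in for the proposed auto trait. Being an ordinary
/// trait, it may have methods but must be implemented manually.
///
/// # Safety
/// Implementors must have no values at all.
pub unsafe trait Uninhabited: Sized {
    fn absurd<T>(self) -> T;
}

unsafe impl Uninhabited for Infallible {
    fn absurd<T>(self) -> T {
        // An empty match is exhaustive for a type with no values.
        match self {}
    }
}

/// Hypothetical extension trait carrying `always_ok`.
pub trait AlwaysOk<T> {
    fn always_ok(self) -> T;
}

impl<T, E: Uninhabited> AlwaysOk<T> for Result<T, E> {
    /// Safely unwraps the Ok variant
    fn always_ok(self) -> T {
        match self {
            Ok(x) => x,
            Err(e) => e.absurd(),
        }
    }
}
```

With these definitions, `Ok::<u32, Infallible>(7).always_ok()` evaluates to 7 without any unsafe code at the call site.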

hanna-kruppe · July 16, 2018, 5:36pm

Sorry, I missed that, but then you're still breaking legitimate and possibly-useful transmutes in Rust 2018.

(I don't care much about transmute since I want it deprecated and replaced anyway, but it's not encouraging if there are such glaring false positives.)

The proposal seems to grow in scope, amount of churn, and backwards compatibility breaks with every back-and-forth here. I'm going to cut this short now and just say that I remain unconvinced this trait is worthwhile.

acmcarther · July 16, 2018, 5:39pm

I’ve reviewed the RFC in question and the OP here and I’m unconvinced that these proposals justify removing mem::uninitialized, with either the original proposal or your addendum. I’m one of those (likely few and reckless) people who actually use mem::uninitialized in my algorithms, usually in conjunction with mem::swap, and I’d be loath to hear that this tool was removed from my toolbox because it raised some questions in a corner of the type system (uninhabited types) that doesn’t actually affect my usage.

Perhaps I should comment on the RFC? It’s already sprawling and in its FCP and I suspect I wouldn’t be heeded anyway…

mcy · July 16, 2018, 5:39pm

I don’t know if the negation is what we want. I think it’s more helpful to be able to say “I want a type that is inhabited”; in an ideal world, everyone switches over to ! instead of messing about with empty enums or bizarre pathological types.

Centril · July 16, 2018, 5:45pm

There are a bunch of types which are uninhabited which are derived types of !, such as [!; N > 0] (which I assume is what you mean by “pathological”)… feels useful to be able to talk about all uninhabited types generally…

With mutually exclusive traits, you can also model the inhabited / uninhabited dichotomy perfectly.

mcy · July 16, 2018, 5:59pm

Yeah, this is what I figured you'd have to do. I'm not sure that there's a neat way to enforce this that isn't "make both traits unsafe".

newpavlov · July 16, 2018, 6:01pm

@Centril

Uninhabited will solve only a part of the problem, and it will be a significantly less general solution. (btw I don't really get why negative trait bounds or at least mutually exclusive traits haven't got much traction...) Also, what will happen with your proposal in the case of Result<!, !>?

Also, can you explain why we need those derived "pathological" types? As I wrote earlier, I don't see any real use cases for them.

@hanna-kruppe

Ehm, in my proposal a violation of the Inhabited trait bound in Rust 2018 will cause a warning, to allow a smooth transition, so it will not break anything. If it's too much magic, then the Inhabited trait can be introduced in post-2018 editions only and cause compilation errors from the start.

And can you elaborate on what you mean by "glaring false positives"?

How exactly does what I wrote "grow the proposal in scope"? The part about pointers is just a logical consequence of having the Inhabited auto-bound, so in generic code without ?Inhabited you can be sure that you can dereference without worrying about potential UB related to uninhabited types.

As for unions, not allowing uninhabited types in union variants seems quite logical to me; for me a union with an uninhabited variant is just another "pathological" case without any real use case.

@acmcarther

Can you elaborate why you don't like Inhabited trait approach?

acmcarther · July 16, 2018, 6:19pm

Can you elaborate why you don’t like Inhabited trait approach?

My usages don't usually cross function boundaries (that aren't FFI), and I suspect that the trait approach will reduce clarity in both cases. Usually I write Rust in one of two modes: "C-like" for FFI code or very low-level, data-structure-y code, and "actual Rust", where I'm building on top of that foundational code. I'd be likely to care about initialized-ness in the first mode and demand explicitness, but not in the second mode, where I'd accept the implicitness brought by trait magic.

I know that's not a well structured argument. I can supply some code snippets (:

Initializing an FFI object (potentially in a multi step process where partial initialization is expected)

github.com/acmcarther/void/blob/37e3c7c569c963d4b6d6cb9c39f032650a14eb15/core/rend/vk/sdl2_vulkan_interop.rs#L28-L37

let mut sys_wm_info = sdl2_sys::SDL_SysWMinfo {
  version: sdl2_sys::SDL_version {
    major: 2,
    minor: 0,
    patch: 7,
  },
  subsystem: sdl2_sys::SDL_SYSWM_TYPE::SDL_SYSWM_UNKNOWN,
  info: std::mem::uninitialized(),
};
if sdl2_sys::SDL_GetWindowWMInfo(self.window.raw(), &mut sys_wm_info)

The user doesn't always control the shape of the data structures they're working with. They also may not control the way that they're initialized. In this case, I'm instructed to pass a partially initialized struct to a foreign function. I think Rust should not diverge too much from what the C usage would look like in these cases, if possible...

Here's another, simpler example (that I think the trait might be able to handle better?)

github.com/acmcarther/void/blob/37e3c7c569c963d4b6d6cb9c39f032650a14eb15/core/net/low/netcode_client.rs#L284-L286

let mut rio_client_config: rio::reliable_config_t = std::mem::uninitialized();

rio::reliable_default_config(&mut rio_client_config);
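
For an out-parameter like this one, the type proposed in RFC 1892 (available in std as std::mem::MaybeUninit) would be used roughly as follows. This is a sketch under assumptions: `Config` and `default_config` are hypothetical stand-ins for rio::reliable_config_t and rio::reliable_default_config, written in Rust so the example is self-contained, and the field names and values are invented.

```rust
use std::mem::MaybeUninit;

/// Hypothetical stand-in for a C config struct.
#[repr(C)]
#[derive(Debug, PartialEq)]
struct Config {
    max_packet_size: u32,
    resend_ms: u32,
}

/// Hypothetical stand-in for a C function that fills in an out-parameter.
///
/// # Safety
/// `out` must be valid for writes of one `Config`.
unsafe extern "C" fn default_config(out: *mut Config) {
    // Write without reading or dropping whatever `out` pointed at before.
    unsafe { out.write(Config { max_packet_size: 1200, resend_ms: 100 }) }
}

fn load_default_config() -> Config {
    // Reserve space without claiming that it already holds a valid Config.
    let mut slot = MaybeUninit::<Config>::uninit();
    unsafe {
        default_config(slot.as_mut_ptr());
        // Sound only because `default_config` wrote every field.
        slot.assume_init()
    }
}
```

The call site stays close to the C idiom (declare, pass a pointer, use), with the "now it is initialized" step made explicit by `assume_init`.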

Internal data structure logic

I also use mem::uninitialized in contexts not related to FFI. Below is an example of an octree implementation where I perform a resize by extracting the "inner node" of the tree, and reinserting it as one octant of a larger octree:

github.com/acmcarther/void/blob/c03dcea1680e643bd543a9976051189e95d35211/core/alg/octree.rs#L230-L271

unsafe {
  // Take ownership of inner node without initializing a replacement
  let mut inner_node = std::mem::uninitialized();
  mem::swap(&mut inner_node, &mut self.inner_node);

  let mut new_outer_node = OctreeNode {
    config: inner_node.config.clone(),
    data: Vec::new(),
    center: new_center.clone(),
    // This looks confusing, but see `OctreeNode::insert` to understand how this is similar to
    // (but opposite from) insertion
    half_size: double_size.clone(),
    is_leaf: true,
    children: [None, None, None, None, None, None, None, None],
    population: 0,
    metadata: M::default(),
  };

  new_outer_node.population = inner_node.population;
  new_outer_node.is_leaf = false;

(This excerpt has been truncated.)

In this case, the "uninitialized"-ness doesn't cross any API boundary, and I don't really care to communicate whether it's initialized or not -- I just want to implement the algorithm.
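
For comparison, the same "take the inner node and re-insert it under a new root" step can be sketched without uninitialized memory by using std::mem::take. This swaps in a different technique from the code above and assumes the node type has a cheap Default, which the real OctreeNode may not; `Node`, `Tree` and `grow` are simplified, hypothetical names.

```rust
use std::mem;

/// Simplified, hypothetical node type. The real OctreeNode has more fields.
#[derive(Default)]
struct Node {
    population: usize,
    children: Vec<Node>,
}

struct Tree {
    inner: Node,
}

impl Tree {
    /// Wrap the current root in a new, larger root.
    fn grow(&mut self) {
        // Move the old root out, leaving a cheap default value behind.
        let old_inner = mem::take(&mut self.inner);

        let mut outer = Node {
            population: old_inner.population,
            children: Vec::new(),
        };
        outer.children.push(old_inner);

        // Overwrite the temporary default with the new root.
        self.inner = outer;
    }
}
```

The cost is constructing and dropping one throwaway default node, a trade-off that may or may not be acceptable in low-level code like the octree.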
