Restricted Enum Variants Newtype and Safe MaybeUninit


#1

Restricted enum newtypes allow the type system to guarantee that a value only uses some of the possible enum variants, while retaining a memory representation identical to that of the unrestricted type it’s derived from. The use cases include the obvious one, such as std::io functions returning only a subset of std::io::ErrorKind, but also using uninitialized memory safely by restricting an Option<T>-like type, and a third case I’ll get into later.

The definition and obvious case

struct ErrorKindGetMetadata(ErrorKind);

impl RestrictedEnum for ErrorKindGetMetadata {
  type Unrestricted = ErrorKind;
  fn restrict(x: &Self::Unrestricted) -> &Self {
    match x {
      ErrorKind::NotFound | ErrorKind::PermissionDenied => x,
      _ => panic!(),
    }
  }
}

This syntax is better read as a description of how type restriction works; an actual implementation would probably be terser (and wouldn’t allow arbitrary operations in restrict(..)). In theory it would compile to the code above, but in practice it would use compiler magic to reject any call where the panic!() case is not proven impossible, so restrict(..) would be a no-op at runtime.
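For comparison, the same restriction can be approximated in today’s Rust with a runtime check instead of a compile-time proof. This is only a minimal sketch: it converts by value rather than by reference, and the kind() accessor is a hypothetical name that is not part of the proposal.

```rust
use std::convert::TryFrom;
use std::io::ErrorKind;

/// Newtype that only ever holds `NotFound` or `PermissionDenied`.
/// `repr(transparent)` gives it the same layout as `ErrorKind`.
#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ErrorKindGetMetadata(ErrorKind);

impl TryFrom<ErrorKind> for ErrorKindGetMetadata {
    type Error = ErrorKind;

    /// Runtime stand-in for the proposed compile-time check: variants
    /// outside the subset are handed back as the error value.
    fn try_from(kind: ErrorKind) -> Result<Self, Self::Error> {
        match kind {
            ErrorKind::NotFound | ErrorKind::PermissionDenied => Ok(Self(kind)),
            other => Err(other),
        }
    }
}

impl ErrorKindGetMetadata {
    /// Hypothetical accessor: widen back to the unrestricted enum.
    pub fn kind(self) -> ErrorKind {
        self.0
    }
}
```

The difference from the proposal is that a caller matching on kind() still has to write a catch-all arm, which is exactly what the restricted type is meant to make unnecessary.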

Safe, uninitialized memory

…works with a new definition for MaybeUninit (perhaps by a different name)

// Notice that this is private
enum MaybeUninit<T> {
  Uninit,
  Init(T),
}
// Perhaps this colon syntax works
pub struct Uninit<T>(MaybeUninit<T>): Uninit;
pub struct Init<T>(MaybeUninit<T>): Init;

Newtypes named after enum variants (if allowed) should only be allowed if they restrict the enum to their eponymous variant.

let mut empty_buf: [Uninit<u8>; 1500] = [Uninit; 1500];
let (used, free): (&mut [Init<u8>], &mut [Uninit<u8>]) = recv(&mut empty_buf);

The compiler knows that the only possible value for each element of empty_buf is Uninit, just as it knows that x: () must be (), so it doesn’t have to read any memory to get the value. The MaybeUninit enum is private so that you can’t accept, return, store, etc. values of type MaybeUninit<T>, because it’s unsafe to read an Uninit value, and even if you could, there’s no tag distinguishing it from Init<T>. (MaybeUninit would therefore be both an “enum” and a union, and perhaps that could be generalized instead of just using pub/private access controls.)
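To show what the proposal would replace, here is roughly how the same used/free split looks with the existing std::mem::MaybeUninit. It is a sketch under assumptions: this recv is a hypothetical stand-in that copies from a byte slice rather than a socket, and it needs an unsafe block that the restricted-newtype version is meant to avoid.

```rust
use std::mem::MaybeUninit;

/// Hypothetical stand-in for `recv`: copies `data` into the front of `buf`
/// and returns the initialized prefix plus the still-uninitialized tail.
fn recv<'a>(
    buf: &'a mut [MaybeUninit<u8>],
    data: &[u8],
) -> (&'a mut [u8], &'a mut [MaybeUninit<u8>]) {
    // Never write past the end of the buffer.
    let n = data.len().min(buf.len());
    let (used, free) = buf.split_at_mut(n);
    for (slot, byte) in used.iter_mut().zip(data) {
        slot.write(*byte);
    }
    // SAFETY: every element of `used` was written in the loop above, and
    // `MaybeUninit<u8>` has the same layout as `u8`.
    let used = unsafe { &mut *(used as *mut [MaybeUninit<u8>] as *mut [u8]) };
    (used, free)
}
```

A caller would create the buffer with [MaybeUninit::<u8>::uninit(); 1500] and pass &mut buf; the returned slices borrow disjoint parts of it.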

The third case

…is having a struct with fields that may store erroneous values, but that shouldn’t need cloning or unnecessary unwrapping when changing between a struct with all valid fields and the other cases.

struct AllGood {
  x: Ok<u8, u16>,
  y: Ok<u16, u8>,
  z: Ok<u8, u8>,
  w: u8,
}
#[derive(Restrict<AllGood>)]
struct SomeErrors {
  x: Result<u8, u16>,
  y: Result<u16, u8>,
  z: Result<u8, u8>,
  w: u8,
}

The Ok types would implement Deref or something like it to transparently act like their contents, while possibly taking more space. The relationship between the two structs is marked with the derive thingy which would use the Restrict trait from each of the implementing fields in AllGood on the corresponding ones in SomeErrors. There could be some heuristic that allows AllGood to be entirely declared by the derive. Any functions implemented on AllGood could automatically return Option when called on SomeErrors. And the conversion of SomeErrors to AllGood can be optimized with a Result-like type with a shared flag field.

This may have some relation to Types for Enum Variants: https://github.com/rust-lang/rfcs/pull/1450


#2

Also related: https://github.com/rust-lang/rfcs/pull/2363 Several of the comments there even talk about a straw man alternative similar to yours.

If I understand both proposals correctly, this “explicit discriminants” one I linked has the advantages that:

  • it’s arguably not even adding any new syntax, merely lifting a restriction on existing syntax
  • it covers use cases where enums have only some variants in common, not just where one is a subset of the other
  • it’s fairly clear what the layout guarantees are supposed to be, since this is only useful for non-repr(Rust) enums

Whereas what you’re proposing in this thread has the advantages that:

  • no unsafe code is involved in using these features
  • works for repr(Rust) because the conversions may have runtime costs

Unfortunately, your 2nd and 3rd examples don’t make a whole lot of sense to me.

For the 3rd, since this clearly must have potential runtime costs, why not use From/Into? (or TryFrom/TryInto if it’s fallible, which is better than panicking)
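A rough sketch of what I mean, assuming AllGood simply stores the plain values instead of the proposed Ok<T, E> wrappers; the struct and field names are taken from the example above, everything else is illustrative:

```rust
use std::convert::TryFrom;

#[derive(Debug, PartialEq)]
struct AllGood {
    x: u8,
    y: u16,
    z: u8,
    w: u8,
}

#[derive(Debug, PartialEq)]
struct SomeErrors {
    x: Result<u8, u16>,
    y: Result<u16, u8>,
    z: Result<u8, u8>,
    w: u8,
}

impl TryFrom<SomeErrors> for AllGood {
    // On failure, hand the original struct back so nothing is lost.
    type Error = SomeErrors;

    fn try_from(s: SomeErrors) -> Result<Self, Self::Error> {
        match s {
            // Succeeds only when every fallible field is `Ok`.
            SomeErrors { x: Ok(x), y: Ok(y), z: Ok(z), w } => Ok(AllGood { x, y, z, w }),
            other => Err(other),
        }
    }
}
```

The conversion moves its argument, so no cloning is involved, and the single match does the unwrapping once for all fields.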

For the 2nd/MaybeUninit, there’s this handwavy stuff at the end about it being both an enum and a union somehow; how that’s supposed to work definitely needs to be in the proposal if this is a key use case. But even then, I don’t see the utility of a function returning [Init<T>; N]. If it really does guarantee every element will be initialised, why not return [T; N]? And while this isn’t relevant to your specific example, note that we’re also expecting the ! feature to come with support for things like Ok(x) irrefutably matching a Result<T, !> value, without any enum restriction machinery.

So when we only have the “obvious subset” case left, unless we’re worried about making layout guarantees to enable zero-cost conversions like that “explicit discriminants” RFC, it seems like a library with some macro magic can do the job.
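As a minimal sketch of that kind of macro, assuming a hypothetical restricted_enum! helper (not an existing crate) that works for Copy enums and checks the subset at runtime:

```rust
use std::convert::TryFrom;
use std::io::ErrorKind;

/// Hypothetical helper macro: declares a newtype over `$full` that can only
/// be built from the listed variants. Requires `$full` to be `Copy`.
macro_rules! restricted_enum {
    ($name:ident : $full:ident { $($variant:ident),+ $(,)? }) => {
        #[derive(Debug, Clone, Copy, PartialEq, Eq)]
        pub struct $name($full);

        impl TryFrom<$full> for $name {
            type Error = $full;

            fn try_from(value: $full) -> Result<Self, $full> {
                match value {
                    // Only the listed variants are accepted.
                    $($full::$variant)|+ => Ok($name(value)),
                    other => Err(other),
                }
            }
        }

        impl From<$name> for $full {
            // Widening back to the full enum is always valid.
            fn from(value: $name) -> $full {
                value.0
            }
        }
    };
}

// The type name here is illustrative only.
restricted_enum!(MetadataErrorKind: ErrorKind { NotFound, PermissionDenied });
```

This gives the subset guarantee through privacy of the wrapped field, but not exhaustive matching over only the allowed variants, and it makes no layout promises beyond what the newtype already has.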