A piece of code I often find myself wanting is the ability to e.g., have an array of slots, each of which is either occupied (and refers to something) or free (and refers to the next free slot for a free list). Often, both values can be assumed to fit inside a `u31` or `u63` or whatever.
In current Rust, there are 4 ways to do this:
- Just use the unsigned type and rely on other ways to know how to interpret it (e.g., it's free if you got to it through the free list or whatever), but this is unreliable and bug-prone and means, e.g., indices into the array of slots can't be validated as not referring to a free slot by accident.
- Use an `i32` or `i64` and rely on the sign to distinguish them, e.g., free slots are stored as their bitwise negation. This works fine and is reasonably ergonomic (since, e.g., you can just do a sign comparison to check things instead of more complicated bit twiddling) but fails to encode the semantics into the type system. This also doesn't work if you have more than 2 variants (e.g., 4 variants distinguished by the first 2 bits and each fitting in a `u<N-2>`).
- `enum Slot { Occupied(u32), Free(u32) }` has by far the best semantics and ergonomics but pads the integer to twice its size for no reason.
- `enum Slot { Occupied([u8; std::mem::size_of::<u32>() - 1]), Free([u8; std::mem::size_of::<u32>() - 1]) }` has good semantics and size but is extremely unwieldy and pointlessly restricts the range to `u<N-8>`.
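To make the trade-off between the sign trick and the plain enum concrete, here is a rough sketch in current Rust. The names `WideSlot`, `SignSlot`, and `sizes` are made up for illustration and are not part of any proposal.

```rust
use std::mem::size_of;

/// The plain enum: clear semantics, but the tag needs storage of its own.
#[allow(dead_code)]
enum WideSlot {
    Occupied(u32),
    Free(u32),
}

/// The sign trick (hypothetical wrapper): non-negative values are occupied
/// indices, negative values are the bitwise negation of the next free index.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct SignSlot(i32);

#[allow(dead_code)]
impl SignSlot {
    /// `index` must be non-negative, i.e. fit in 31 bits.
    fn occupied(index: i32) -> Self {
        assert!(index >= 0);
        SignSlot(index)
    }

    /// `next_free` must be non-negative; `!x` is always negative for `x >= 0`.
    fn free(next_free: i32) -> Self {
        assert!(next_free >= 0);
        SignSlot(!next_free)
    }

    fn is_free(self) -> bool {
        self.0 < 0
    }

    fn next_free(self) -> Option<i32> {
        if self.is_free() { Some(!self.0) } else { None }
    }

    fn occupied_index(self) -> Option<i32> {
        if self.is_free() { None } else { Some(self.0) }
    }
}

/// Sizes in bytes of the plain enum and the sign-encoded slot.
#[allow(dead_code)]
fn sizes() -> (usize, usize) {
    (size_of::<WideSlot>(), size_of::<SignSlot>())
}
```

`sizes()` comes out as `(8, 4)`: the enum needs a tag next to a full-range `u32`, and alignment rounds that up to two words. The wrapper stays at one word, but nothing in its type says what a negative value means.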
What I suggest is range-valued enums of some form:
```rust
#[repr(u32)]
enum Slot {
    ..(1 << 31) => Occupied,
    (1 << 31).. => Free,
}
```
which allow you to label ranges of an integer's value as enum variants. You could allow `u32 as Slot` in cases where the ranges cover the entire range of possible values, but simultaneously allow matching on the enum (with some choice for the syntax of how you extract the underlying value). There is precedent of a sort for this in the integer ranges already allowed in pattern matching.
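For comparison, this is roughly what you have to write by hand today to get the same behaviour: a one-word newtype plus a separate "view" enum produced by matching on ranges. It is only a sketch; `SlotView`, `view`, `OCCUPIED_MAX`, and `FREE_START` are hypothetical names, and having the view carry the raw value is just one possible answer to the extraction question.

```rust
/// Raw storage: exactly one `u32`, as the proposed `#[repr(u32)]` enum would be.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Slot(u32);

/// Matchable view of a `Slot`; each variant carries the underlying raw value.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SlotView {
    Occupied(u32),
    Free(u32),
}

const OCCUPIED_MAX: u32 = (1 << 31) - 1;
const FREE_START: u32 = 1 << 31;

impl Slot {
    /// Classify the raw value. The two ranges cover every `u32`,
    /// so the compiler accepts the match as exhaustive.
    fn view(self) -> SlotView {
        match self.0 {
            0..=OCCUPIED_MAX => SlotView::Occupied(self.0),
            FREE_START..=u32::MAX => SlotView::Free(self.0),
        }
    }
}

/// Stand-in for `u32 as Slot`: total, because the ranges are exhaustive.
impl From<u32> for Slot {
    fn from(raw: u32) -> Self {
        Slot(raw)
    }
}
```

The existing exhaustiveness check on integer range patterns is the piece of precedent doing the work here; the proposal would mostly move that check from the `match` to the enum definition.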
One potential consideration would be allowing an "offset" parameter that is applied when you extract the value, where writing `(1 << 31).. => #[offset(-(1 << 31))] Free` would effectively make it a 1-bit tagged union of 31-bit integers. It's unclear exactly how you would define the semantics for this though, so it's perhaps better avoided.
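One possible reading of the offset, written out by hand in current Rust: the offset is added when extracting a `Free` value and undone when storing one. This is an assumption about semantics that the proposal leaves open, and `TAG_BIT`, `Tagged`, `decode`, and `encode` are illustrative names only.

```rust
const TAG_BIT: u32 = 1 << 31;

/// Decoded form: both payloads are 31-bit values in `0..TAG_BIT`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Tagged {
    Occupied(u32),
    Free(u32),
}

/// Extract the variant, applying the offset of `-(1 << 31)` to `Free`.
fn decode(raw: u32) -> Tagged {
    if raw < TAG_BIT {
        Tagged::Occupied(raw)
    } else {
        Tagged::Free(raw - TAG_BIT)
    }
}

/// Inverse of `decode`; returns `None` if a payload does not fit in 31 bits.
fn encode(value: Tagged) -> Option<u32> {
    match value {
        Tagged::Occupied(v) if v < TAG_BIT => Some(v),
        Tagged::Free(v) if v < TAG_BIT => Some(v + TAG_BIT),
        _ => None,
    }
}
```

Even in this small case the awkward part shows up in `encode`: constructing a variant needs a range check that the un-offset version does not, which is one reason the semantics are hard to pin down.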
It seems reasonable to allow arbitrary patterns to be used:
```rust
#[repr(u32)]
enum Foo {
    0 => Zero,
    1..50 | 100..200 => Good,
    _ => Bad,
}
```
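A hand-written equivalent of the `Foo` classification in current Rust might look like the sketch below. It assumes `1..50` and `100..200` are half-open, as Rust ranges are, so they become `1..=49` and `100..=199`. Unlike the proposed enum, this version throws the raw value away.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Foo {
    Zero,
    Good,
    Bad,
}

/// Total conversion: the `_` arm makes the patterns exhaustive,
/// which is what would justify allowing a plain cast in the proposal.
impl From<u32> for Foo {
    fn from(raw: u32) -> Self {
        match raw {
            0 => Foo::Zero,
            1..=49 | 100..=199 => Foo::Good,
            _ => Foo::Bad,
        }
    }
}
```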
Allowing non-exhaustive patterns could also work to generalize the current unit enums. In that case though, you would not allow casting the base type into the enum and would need to write parsing code as you currently do. This would also allow current unit enums to just define a catch-all name for all bad values to allow casting to it.
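For reference, this is the parsing code you write today for a unit enum, next to a variant with a catch-all that makes the conversion total. `Opcode` and `OpcodeOrUnknown` are invented examples, and storing the raw byte in `Unknown` is my assumption about what a catch-all name would give you.

```rust
use std::convert::TryFrom;

/// A current-style unit enum: only two of the 256 `u8` values are valid.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
enum Opcode {
    Nop = 0,
    Halt = 1,
}

/// Non-exhaustive case: conversion can fail, so it has to be `TryFrom`.
impl TryFrom<u8> for Opcode {
    type Error = u8;

    fn try_from(raw: u8) -> Result<Self, Self::Error> {
        match raw {
            0 => Ok(Opcode::Nop),
            1 => Ok(Opcode::Halt),
            other => Err(other),
        }
    }
}

/// Same enum with a catch-all name for every other value.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum OpcodeOrUnknown {
    Nop,
    Halt,
    Unknown(u8),
}

/// Exhaustive case: every `u8` maps to some variant, so `From` is enough.
impl From<u8> for OpcodeOrUnknown {
    fn from(raw: u8) -> Self {
        match raw {
            0 => OpcodeOrUnknown::Nop,
            1 => OpcodeOrUnknown::Halt,
            other => OpcodeOrUnknown::Unknown(other),
        }
    }
}
```

Note that the hand-written `OpcodeOrUnknown` is no longer one byte wide, since it needs a tag next to the `u8` payload. A range-valued enum over `u8` would not pay that cost.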
Further, though dubiously useful, this could be a syntax for subdividing any base type into enums according to valid patterns, so you could also do
```rust
#[repr(str)]
enum Command {
    "push" => Push,
    "pull" => Pull,
    "add" | "plus" => Add,
    _ => Unknown,
}
```
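The string version already has a direct, if more verbose, counterpart in current Rust, since string literals are valid patterns. A minimal sketch, which keeps only the classification and not the original string:

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Command {
    Push,
    Pull,
    Add,
    Unknown,
}

/// Total conversion from a string slice; unmatched input becomes `Unknown`.
impl From<&str> for Command {
    fn from(text: &str) -> Self {
        match text {
            "push" => Command::Push,
            "pull" => Command::Pull,
            "add" | "plus" => Command::Add,
            _ => Command::Unknown,
        }
    }
}
```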