Idea: Catch-all enum variant

Anyone who's read a C enum in Rust has gone through the hassle of defining a "fake enum" as a wrapped integer type plus constants. What would make this much easier is the ability to define an enum variant whose representation is "any other bit pattern," for instance

#[repr(C, u32)]
enum ErrorCode {
    Success = 0,
    OutOfMemory = 1,
    FailedToFrobnicate = 2,
    Unknown = _,
}

So transmuting a u32 to an ErrorCode would always be defined behavior. What do you think?


Should it be able to carry payload?

If you want a wrapped integer type plus constants, why don't you define a wrapper type over integer with associated constants?

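// PartialEq + Eq are required so the constants below can be used as match patterns.
#[derive(PartialEq, Eq)]
struct ErrorCode(u32);

#[allow(non_upper_case_globals)] // only if you really want enum-style names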
impl ErrorCode {
    const Success: Self = Self(0);
    const OutOfMemory: Self = Self(1);
    const FailedToFrobnicate: Self = Self(2);
}

fn get_error_code() -> ErrorCode {
    if some_cond() {
        ErrorCode::Success
    } else if another_cond() {
        ErrorCode::OutOfMemory
    } else {
        ErrorCode(ffi_get_error_code())
    }
}

fn handle_error_code() {
    match get_error_code() {
        ErrorCode::Success => { /* ... */ }
        ErrorCode::OutOfMemory => { /* ... */ }
        ErrorCode(unknown) => { /* ... */ }
    }
}

You lose exhaustiveness checking.

This is also a problem I came across not too long ago. I ended up using the "integer wrapper" method described by @hyeonu because I wanted the value of the unknown variant to be stable (as in, not replaced by the compiler with a magic value when cast to an ErrorCode and back to a u32), which would not be the case with an enum that has an untagged variant, if I'm not mistaken.

I worked around the missing exhaustiveness checking by using the const_assert_eq macro from the static_assertions crate to check the number of defined variants at compile time in every function that used the newtype. It's really not practical to do it like this, but at least it gives you something resembling compile-time exhaustiveness checking.
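For anyone curious what that trick might look like, here is a minimal sketch using a plain const assertion instead of the static_assertions macro. The KNOWN_COUNT constant and the describe function are made up for illustration, and the count has to be kept in sync by hand, which is exactly why this is impractical.

```rust
// Hypothetical newtype mirroring the wrapper discussed above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ErrorCode(pub u32);

#[allow(non_upper_case_globals)]
impl ErrorCode {
    pub const Success: Self = Self(0);
    pub const OutOfMemory: Self = Self(1);
    pub const FailedToFrobnicate: Self = Self(2);

    /// Number of named constants (assumption: bumped by hand when one is added).
    pub const KNOWN_COUNT: usize = 3;
}

/// Hypothetical consumer that names every known code.
pub fn describe(code: ErrorCode) -> &'static str {
    // Stops compiling once KNOWN_COUNT changes, forcing this match to be revisited.
    const _: () = assert!(ErrorCode::KNOWN_COUNT == 3);
    match code {
        ErrorCode::Success => "success",
        ErrorCode::OutOfMemory => "out of memory",
        ErrorCode::FailedToFrobnicate => "failed to frobnicate",
        ErrorCode(_) => "unknown",
    }
}
```

Adding a fourth constant and bumping KNOWN_COUNT breaks the assertion in every function that carries it, which is the closest this approach gets to an exhaustiveness error.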

An enum with a catch-all variant also couldn't have true exhaustiveness checking (well, it's sort of possible, but really impractical, for a repr(u8) one).

What is the forward-compat story, if at some point in the future the enum has a new variant added to replace a value which was previously going to the Unknown variant? Either that is a breaking change (which it's not on the C side), or you have to use #[non_exhaustive] (and fully lose exhaustiveness checking, essentially ending up with something identical to using a wrapper type).


With an enum you also get the ability to have data-carrying variants.

The issue I have with this is that "any other bit pattern" implies that Unknown has a one-to-many relation to its repr values. Which of these, if any, is ErrorCode::Unknown as u32 supposed to be? It is currently possible to enumerate enums and do const equality assertions on them. If there is no single "as" representation, I don't see how this would work.

I could perhaps see it being able to specify a range, e.g. Reserved = .. or Reserved = 3..u32::MAX, which I would find more sensible from the perspective of giving proc_macros a fighting chance to figure it out.

Edit: I should spell it out more clearly perhaps,

const fn zero() -> u32 { 0 }
#[repr(C, u32)]
enum Foo {
    Z = zero(),
    Unknown = _,
}

Presumably at some time in the future with a range expression a proc_macro could generate some const code which iterates over the range of Unknown.

Edit2: This is, alas, not equivalent to the OP's proposal, as it wouldn't allow disjoint ranges; you would have to include a separate variant for each disjoint range. I don't find that a big deal. I cannot recall seeing a datasheet that folded disjoint ranges into a single field, at least.
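For comparison, range patterns on the raw integer already work in today's Rust, and integer matches are checked for exhaustiveness. A minimal sketch, where the Class enum and the classify function are hypothetical names and not part of any proposal:

```rust
/// Hypothetical classification of a raw C error code.
#[derive(Debug, PartialEq, Eq)]
pub enum Class {
    Success,
    OutOfMemory,
    FailedToFrobnicate,
    Reserved,
}

pub fn classify(raw: u32) -> Class {
    // No wildcard arm: the compiler verifies that the arms cover every u32.
    match raw {
        0 => Class::Success,
        1 => Class::OutOfMemory,
        2 => Class::FailedToFrobnicate,
        3..=u32::MAX => Class::Reserved,
    }
}
```

Disjoint reserved ranges would need either separate arms or an or-pattern, which matches the limitation noted above.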

Um, do you want to transmute u32 to an enum type with data-carrying variants? Also such enum can't be written as a C enum.

If the enum is not for FFI and doesn't have wrapped-integer semantics, it would be better to use Option<MyEnum> where the value can be missing than to add another variant that represents a missing value.


FWIW the num_enum crate has supported something similar since the arbitrary_enum_discriminant feature landed in Rust:

#[derive(FromPrimitive, IntoPrimitive)]
#[repr(u8)]
enum Enum {
    Zero = 0,
    #[num_enum(catch_all)]
    NonZero(u8),
}

It doesn't allow transmuting, but can implement From and Into both ways.
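To make "From and Into both ways" concrete, here is a hand-written sketch of roughly what such conversions look like. This is not the crate's generated code, just an illustration without the derive (and without the explicit repr and discriminant, to keep it short):

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Enum {
    Zero,
    NonZero(u8),
}

impl From<u8> for Enum {
    fn from(raw: u8) -> Self {
        match raw {
            0 => Enum::Zero,
            // Every other value lands in the catch-all and keeps its raw value.
            other => Enum::NonZero(other),
        }
    }
}

impl From<Enum> for u8 {
    fn from(value: Enum) -> Self {
        match value {
            Enum::Zero => 0,
            Enum::NonZero(raw) => raw,
        }
    }
}
```

The round trip preserves the raw value, but it is a conversion through a match, not a reinterpretation of the bytes.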


Just a clarifying question: this does not influence the permissible representations, it only adds a conversion method that falls back to that variant by default, right? Using the crate, it would still be UB to dereference a pointer to a valid int as a pointer to the enum. This distinction is important for (mutable) references to a data-carrying enum, which could be translated to a union in C API bindings only if that were valid.

I would very much enjoy being able to write, with specific annotations if necessary, an enum that can correctly derive bytemuck::Pod and similar marker traits, and then parse network headers by casting, without an extra copy to the stack.

How do you check for equality when two different bit patterns fall into your Unknown variant?


derive(PartialEq) uses std::mem::discriminant(). So we'd just have to make intrinsics::discriminant_value() return some stable value for this discriminant. It won't be zero-cost, but it'll work.

The principal problem isn't that it's non-zero-cost. The much bigger issue is that it's very much non-obvious, and any "solution" will be counter-intuitive to a large amount of people who would expect a different (likely still reasonable in some way) default.
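To illustrate how the reasonable defaults diverge, here is a small sketch on the newtype. The helper names are hypothetical, and "unknown" is assumed to mean any raw value above 2:

```rust
#[derive(Clone, Copy, Debug)]
pub struct ErrorCode(pub u32);

impl ErrorCode {
    /// Assumption: 0, 1 and 2 are the named codes; everything else is "unknown".
    pub fn is_unknown(self) -> bool {
        self.0 > 2
    }

    /// Default 1: compare raw bit patterns.
    pub fn eq_by_value(self, other: Self) -> bool {
        self.0 == other.0
    }

    /// Default 2: all unknown values collapse into a single variant.
    pub fn eq_by_variant(self, other: Self) -> bool {
        match (self.is_unknown(), other.is_unknown()) {
            (true, true) => true,
            (false, false) => self.0 == other.0,
            _ => false,
        }
    }
}
```

ErrorCode(7) and ErrorCode(9) are unequal under the first definition and equal under the second, and neither is obviously the one a reader would expect.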


Yes, exactly so - the representation doesn't change, so this is just for conversion (which may involve a stack copy), not for transmuting or casting.

My intuition is that it should work like NaN and be unequal to everything including itself.

This is a very strong argument against the idea imo, since other people in this thread seem to have different intuitions.

I'm a frequent float defender, but the weird float partial order, with NaNs not equal to themselves, is... honestly a mistake[1]. We should not replicate it here.


  1. Note that I don't think Rust made a mistake in following them -- that was the right call, it's just the semantics which have time and again proven confusing and easy to forget.


That is indeed the right way to translate a C enum to Rust. In C, an enum is just an integer type with some named constants and can carry arbitrary values. In Rust, that is fundamentally not the case (or else we would have to force you to always write a fall-back branch for each match). Rust using the name enum here is slightly unfortunate...

So transmuting a u32 to an ErrorCode would always be defined behavior.

This makes it not a Rust enum type any more, so IMO using enums here is just using the wrong tool. We already have a tool to express what you are asking for, and that's a newtyped u32.
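A minimal sketch of that tool, assuming a repr(transparent) wrapper; from_raw is a hypothetical helper name:

```rust
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ErrorCode(pub u32);

pub fn from_raw(raw: u32) -> ErrorCode {
    // SAFETY: ErrorCode is repr(transparent) over u32, so it has the same layout
    // and every u32 bit pattern is a valid ErrorCode.
    unsafe { std::mem::transmute::<u32, ErrorCode>(raw) }
}
```

In practice ErrorCode(raw) does the same thing without unsafe; the transmute is only there to show that the property asked for in the OP already holds for the newtype.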

Maybe we should have nicer syntax for that? This can probably be done as a macro though.
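As a rough idea of what such a macro could look like, here is a macro_rules sketch. The c_enum name and its input syntax are invented for illustration and not taken from any existing crate:

```rust
// Hypothetical macro: expands to a transparent newtype plus associated constants.
macro_rules! c_enum {
    ($vis:vis struct $name:ident($repr:ty) { $($variant:ident = $value:expr),* $(,)? }) => {
        #[repr(transparent)]
        #[derive(Clone, Copy, Debug, PartialEq, Eq)]
        $vis struct $name(pub $repr);

        #[allow(non_upper_case_globals)]
        impl $name {
            $(pub const $variant: Self = Self($value);)*
        }
    };
}

c_enum! {
    pub struct ErrorCode(u32) {
        Success = 0,
        OutOfMemory = 1,
        FailedToFrobnicate = 2,
    }
}
```

Values without a named constant stay representable as ErrorCode(n), so nothing is lost when a newer C library returns a code this binding doesn't know about.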

C enums don't have that ability, either, so this cannot come up when translating a C enum to Rust.

