Pre-RFC enum from integer

From my perspective (which may be colored by interacting with C a lot) conversion of integers to (field-less) enums is a frequent operation.

Rust has a built-in way to convert an enum to an integer, but not the other way. The current workarounds are pretty bad IMHO:

  • Write a match that maps each integer to its variant. This is unacceptably tedious, and the need to manually copy integers leaves room for errors.

  • Use two crates and a macro derive. I’m generally OK with using dependencies for small things, but that is overkill for merely checking whether an integer is in range.

  • mem::transmute(). I’ve used it, because that is the only one-liner available in Rust, and it has backfired terribly by causing undefined behavior and memory corruption elsewhere in the program.

I wonder if many people will make the same mistake with transmute, because especially with #[repr(C)] it feels like a harmless no-op integer cast, and not a program-destroying nasal demon.
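For reference, the first workaround from the list above (the hand-written match) looks roughly like this. It is only a sketch: `Color`, its values, and `color_from_u32` are made-up names for illustration, not anything from a real crate.

```rust
/// Hypothetical field-less enum mirroring a C enum (illustrative names and values).
#[repr(u32)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Color {
    Red = 1,
    Green = 2,
    Blue = 4,
}

/// Hand-written conversion: every discriminant has to be repeated here,
/// and nothing checks that the numbers stay in sync with the enum.
pub fn color_from_u32(n: u32) -> Option<Color> {
    match n {
        1 => Some(Color::Red),
        2 => Some(Color::Green),
        4 => Some(Color::Blue),
        _ => None, // out-of-range values are rejected instead of being UB
    }
}
```

It is safe, unlike transmute, but the duplicated literals are exactly the part that tends to rot when the enum changes.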

Since there isn’t a clearly good solution to this problem, Googling for solutions also brings poor results. Depending on how you phrase it you may find the old Rust book’s chapter about enums (which does not contain any solution), GitHub issues about the now-dead FromPrimitive, and some forum threads which suggest transmute or look outdated.

So I think Rust should have a single good solution for making a field-less enum from an integer.

Maybe we can add #[derive(TryFrom)] for enums? And I suggest this not try to cover all integer types, just whatever the enum’s repr is.
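To make the idea concrete, here is a sketch of what such a derive might expand to. No `#[derive(TryFrom)]` exists in std, so the impl below is written by hand, and the choice of error type (returning the rejected integer) is purely my assumption.

```rust
use std::convert::TryFrom;

#[repr(u32)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Foo {
    A = 300,
    B = 500,
}

// Roughly what a hypothetical #[derive(TryFrom)] could generate:
// a conversion from the enum's repr type only (u32 here).
impl TryFrom<u32> for Foo {
    // Assumption: hand the rejected value back to the caller.
    type Error = u32;

    fn try_from(n: u32) -> Result<Self, Self::Error> {
        // Take the discriminants from the enum itself, so no literal is copied.
        const RAW_A: u32 = Foo::A as u32;
        const RAW_B: u32 = Foo::B as u32;
        match n {
            RAW_A => Ok(Foo::A),
            RAW_B => Ok(Foo::B),
            other => Err(other),
        }
    }
}
```

Restricting the impl to the repr type keeps the generated code small; callers holding another integer type can convert to `u32` first.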

Do you mean this way?

#![allow(dead_code)]

#[repr(u32)]
enum Foo { A = 300, B = 500 }

fn main() {
    let a: u8 = Foo::A as u8;
    println!("{}", a);
}

It's not good enough: as you can see, the cast silently truncates (300 does not fit in a u8, so this prints 44), which could lead to bugs.

Yeah, the as operator is pretty bad in general, but at least there’s a way.

I suppose TryFrom could also add an implementation from enum to int (with Error=!), but I wouldn’t mind if that was out of scope for this proposal and left to be fixed as part of fixing the as operator.
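A sketch of the enum-to-int direction: since it cannot fail, a plain `From` impl is enough, and std's blanket impl then provides `TryFrom` with `Infallible` as the error type (the stable stand-in for `!`). `foo_to_u32` is a helper name I made up for the example.

```rust
use std::convert::{Infallible, TryFrom};

#[repr(u32)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Foo {
    A = 300,
    B = 500,
}

// Lossless conversion to the repr type; unlike `Foo::A as u8`,
// this cannot be pointed at a too-small integer by accident.
impl From<Foo> for u32 {
    fn from(f: Foo) -> u32 {
        f as u32
    }
}

// Hypothetical helper showing that TryFrom comes for free via the
// blanket impl `TryFrom<U> for T where U: Into<T>`.
pub fn foo_to_u32(f: Foo) -> Result<u32, Infallible> {
    u32::try_from(f)
}
```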

I ran into the same problem as @kornel already and I would love to see this implemented.

Please think about bit-flag enumerations when designing this. Bit-flag enumerations have some constants given explicit names and values with disjoint bit patterns, and then programs may use the logical OR of any set of the named constants, possibly with some exceptions (which, of course, C does not check for). They are very common in C low-level interfaces, for instance the second argument to the POSIX open primitive on Linux can be any combination of

enum OpenFlags {
  O_RDONLY    = 0x00000000,
  O_WRONLY    = 0x00000001,
  O_RDWR      = 0x00000002,
  O_ACCMODE   = 0x00000003,
  O_CREAT     = 0x00000040,
  O_EXCL      = 0x00000080,
  O_NOCTTY    = 0x00000100,
  O_TRUNC     = 0x00000200,
  O_APPEND    = 0x00000400,
  O_NONBLOCK  = 0x00000800,
  O_DSYNC     = 0x00001000,
  O_ASYNC     = 0x00002000,
  O_DIRECT    = 0x00004000,
  O_LARGEFILE = 0x00008000,
  O_DIRECTORY = 0x00010000,
  O_NOFOLLOW  = 0x00020000,
  O_NOATIME   = 0x00040000,
  O_CLOEXEC   = 0x00080000,
  O_SYNC      = 0x00101000,
  O_PATH      = 0x00200000,
  O_TMPFILE   = 0x00410000,
  O_NDELAY    = O_NONBLOCK,
  O_FSYNC     = O_SYNC
}

except that you must use exactly one of O_RDONLY, O_WRONLY, and O_RDWR, and some of the combinations of the others might not make a whole lot of sense.

Rust’s insistence that variables with a numeric-enum type can only have values equal to one of the named constants means that this kind of foreign interface is poorly handled.

Isn’t the bitflags crate solving this sufficiently well?

C conflates two different things, the enumeration and the bitwise flags, which have different semantics and different purposes. One can argue that a systems language should support both with different syntax (or using an external crate).

Also worth noting the enumflags crate

I feel like anything that the libc crate needs to do its job thoroughly should be in std.

(I should probably say that I’ve spent 95% of my professional coding time for years in Python, where “the batteries are included” is a watchword for the stdlib, and that’s trained me to find it irritating whenever I have to reach for a third-party library/crate.)

No, they are not, because they are just as illegal in C as they are in Rust. In fact while Rust insists an enum may only have one of the defined values, C and C++ may silently assume it.

See the type? It is int. Not an enum. It can't be, because even if the original constants were defined as enums—which they are not, asm-generic/fcntl.h defines them as macros—there is no | operator for enum types, so they get implicitly converted to int and the result of | is always int.

In C, flags are always (short/long/unsigned) int. C++ sometimes creates custom classes for them, e.g. the std::ios_base::openmode (there are a few more in ios_base). The same should be done in Rust—and that's what the bitflags and enumflags crates are for.
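A minimal hand-rolled sketch of that "custom struct" approach, using a few of the open flag values quoted earlier in the thread. This is not the API of the bitflags or enumflags crates; the method names `bits` and `contains` are just ones I picked for illustration.

```rust
use std::ops::BitOr;

/// Flags are a newtype over the integer, not an enum, so any
/// combination of bits is a valid value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct OpenFlags(u32);

impl OpenFlags {
    // Values copied from the Linux list quoted in this thread.
    pub const O_WRONLY: OpenFlags = OpenFlags(0x0000_0001);
    pub const O_CREAT: OpenFlags = OpenFlags(0x0000_0040);
    pub const O_TRUNC: OpenFlags = OpenFlags(0x0000_0200);

    /// Raw integer to hand to the C side.
    pub fn bits(self) -> u32 {
        self.0
    }

    /// True if every bit set in `other` is also set in `self`.
    pub fn contains(self, other: OpenFlags) -> bool {
        self.0 & other.0 == other.0
    }
}

impl BitOr for OpenFlags {
    type Output = OpenFlags;

    fn bitor(self, rhs: OpenFlags) -> OpenFlags {
        OpenFlags(self.0 | rhs.0)
    }
}
```

With this, `OpenFlags::O_WRONLY | OpenFlags::O_CREAT` is an ordinary value of the struct, and there is no enum invariant to violate.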

In C and C++, enum is sometimes used just to define the constants, because const may still be given memory and #defines are not scoped, so before constexpr, enum was the preferred way of defining compile-time constants that are guaranteed to be inlined. But usually in the anonymous form.

I think it would make sense to co-opt something like this into the standard library once the preferred design is clear. However, since such flag types are not, and cannot be, based on actual enums, it is completely orthogonal to the thing in the original post.

I actually think that one is a bad idea. Basing a flags type on an enum does not really make sense.

Who told you that? That's not true at all. A variable with enum type in C can legitimately hold any value that is valid for the "underlying type" of the enumeration. This is not explicitly stated in the standard, but follows from the rule that each enumerated type is compatible with some integer type (N1570 §6.7.2.2p4) and the explicit license to type-pun as long as the two types are compatible (6.5p7).

Ok, you are right. And actually it is explicitly stated in (at least) the C++ specification, because it says a cast maintains the value as long as it is within the range, and the range is defined so that it is the underlying type if fixed, and closed under | over the declared enumerators otherwise (the wording is completely different, but that’s the effect).

Still, because C++ converts enums to the underlying type implicitly, but not the other way around, the usual approach is to use enum to define the flags, but pass them around as an integer, or as a special type as the iostreams library does.

However, for Rust, this kind of use is incompatible with the match logic, so for Rust using enum is not appropriate. The custom structs are.

This is certainly a stance I can sympathize with. My own opinion would be that this is a limitation of the match logic, which could be extended to abstract data types using something like views, making the integer representation abstract but still allowing one to match on it. But that is probably a different discussion.

In the case of Rust it’s clear that an enum can hold strictly only one of its variants, and anything else leads to undefined behaviour. And it’s not even a theoretical problem. Since the latest enum optimisations have landed, it actually causes memory corruption if you force an enum to hold another value.

Hi,

Author of enum-primitive-derive here. I’d certainly love to improve the crate or make it easier to use until there is a solution in the language itself. I wanted something that provided an absolute guarantee when creating an FFI, so that’s why I made it. The other solutions out there were clunky, so I felt a custom derive was the easiest way to go. I couldn’t figure out how to avoid requiring two crates, since the trait that everyone expects is provided by num-traits and cannot be re-exported by a procedural macro crate.

Any suggestions are very much welcome!

On the Rust issue for C enums and undefined behavior, I suggested (if it’s possible) specifying a default for repr(C) enums, to be used when a value is out of range.

For many of the enums I worked with there was a simple and obvious choice of a default.

Being able to use enums directly in my bindings would massively simplify the bindings in many cases, an opinion that the creators of bindgen seem to share. It would be really nice if there were a good solution here which would enable that safely.

I keep running into this issue. I think in an ideal world TryFrom could be derived with a TryFromIntError as the error case. I also ran into what @bascule mentioned. We have a lot of C bindings for Rust crates, and there we always have a special value in the enum that acts as a stand-in for unknown values. However, I do not think that this default should be an actual Default. With TryFrom this pattern would work:

let foo = 42.try_into().unwrap_or(MyEnum::SomeDefault);
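Spelled out as a self-contained sketch, with the TryFrom impl written by hand in place of the wished-for derive. `MyEnum::SomeDefault` is from the line above; the other variants, the `OutOfRange` error type and `my_enum_or_default` are hypothetical. I use a custom error type because, as far as I know, TryFromIntError cannot be constructed outside std.

```rust
use std::convert::TryFrom;

#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MyEnum {
    SomeDefault = 0, // stand-in for unknown values coming from C
    First = 1,
    Second = 2,
}

/// Hypothetical error type carrying the rejected raw value.
#[derive(Debug, PartialEq, Eq)]
pub struct OutOfRange(pub u8);

impl TryFrom<u8> for MyEnum {
    type Error = OutOfRange;

    fn try_from(raw: u8) -> Result<Self, Self::Error> {
        match raw {
            0 => Ok(MyEnum::SomeDefault),
            1 => Ok(MyEnum::First),
            2 => Ok(MyEnum::Second),
            other => Err(OutOfRange(other)),
        }
    }
}

/// The fallback is chosen at the call site, not baked in via Default.
pub fn my_enum_or_default(raw: u8) -> MyEnum {
    MyEnum::try_from(raw).unwrap_or(MyEnum::SomeDefault)
}
```

Keeping the fallback out of the conversion itself means code that wants to reject unknown values can still see the error.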

(In particular I just now came across this issue again because I want to have something like an AtomicEnum and that’s pretty tricky to implement currently without some automatic two way conversion to integers)
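A rough sketch of what I mean by an AtomicEnum, written for one concrete enum on top of AtomicU8. `State`, `AtomicState` and `from_u8` are made-up names; a generic version would need exactly the automatic two-way conversion discussed in this thread.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Idle = 0,
    Running = 1,
    Stopped = 2,
}

impl State {
    // Hand-written integer-to-enum conversion (the part one would like derived).
    fn from_u8(raw: u8) -> Option<State> {
        match raw {
            0 => Some(State::Idle),
            1 => Some(State::Running),
            2 => Some(State::Stopped),
            _ => None,
        }
    }
}

/// Stores the discriminant in an AtomicU8 and converts on the way in and out.
pub struct AtomicState(AtomicU8);

impl AtomicState {
    pub fn new(s: State) -> Self {
        AtomicState(AtomicU8::new(s as u8))
    }

    pub fn store(&self, s: State, order: Ordering) {
        self.0.store(s as u8, order);
    }

    pub fn load(&self, order: Ordering) -> State {
        // Only valid discriminants are ever stored, so this cannot fail
        // unless the wrapper itself has a bug.
        State::from_u8(self.0.load(order)).expect("AtomicState holds a valid State")
    }
}
```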

Is this still alive?

It’s still a need =) but I’m not aware of any recent developments here. That said, the fact that custom derive exists seems like it could be used to solve this problem in “user land” quite readily.