[Pre-RFC]: Adding conversion to/from integer on enums with #[repr(i/u*)]

Summary

Add conversion to and from underlying integer type on enums with the #[repr(u*)] or #[repr(i*)] attribute.

Motivation

Enums with the #[repr(u*)] or #[repr(i*)] attributes are commonly used for wrapping primitive integer values accepted by, or returned from, an underlying library when doing FFI. As such, it becomes a very common task to convert these Rust enum variants to their underlying integer representation, and to convert the integer representation into an instance of the enum. There is currently no really good way of doing either, in my opinion.

Since the language already guarantees that an enum with a repr attribute will be represented in memory as the given integer type, I feel it’s quite natural that it would also provide ergonomic and safe access to this integer, along with conversion from it. This would improve the FFI interoperability of Rust for at least my use cases.

EDIT: Below here I propose implementing From/TryFrom. But further down in drawbacks I realize this has some problems. So see proposed trait in drawbacks for what might technically work better.

Converting to integer

An enum with #[repr(u8)] can already be easily converted into its integer value with my_enum as u8. But it has at least three drawbacks:

  • The repr type is not shown in the documentation for the type or anywhere else, so a user of this type must look at the code to know what to cast to.
  • If the type is ever changed to #[repr(u16)] or another larger type, my_enum as u8 will continue to compile without a single warning, but it might now be silently truncating values.
  • The user doing the conversion must explicitly state the target type.

Automatically adding an impl From<MyEnum> for u8 to this enum would solve all three problems mentioned above. Now it can be used as unsafe { ffi_fn(my_enum.into()) }. And to change the type one would only need to update the C library and the enum type declaration, none of the call sites.
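As a rough sketch of what such a generated impl could look like, here it is written by hand. The enum `Color` and the function `ffi_fn` are hypothetical stand-ins invented for illustration; a real FFI function would be declared in an extern block and called inside unsafe.

```rust
// Hypothetical enum, used only for illustration.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Color {
    Red = 1,
    Green = 2,
    Blue = 4,
}

// Hand-written stand-in for the impl the proposal would generate.
impl From<Color> for u8 {
    fn from(c: Color) -> u8 {
        // The cast is written exactly once, next to the repr attribute
        // it has to agree with.
        c as u8
    }
}

// Hypothetical stand-in for a foreign function taking the raw value.
fn ffi_fn(raw: u8) -> u8 {
    raw
}

pub fn send(c: Color) -> u8 {
    // No target type is spelled out here; it is inferred from ffi_fn.
    ffi_fn(c.into())
}
```

If the repr were later changed to u16, the impl would become `From<Color> for u16` and the call in `send` would fail to compile until `ffi_fn` was updated, instead of truncating silently.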

Converting from integer

This operation is fallible (unless the enum happens to cover all possible values) and the only way I know to correctly implement it is to enumerate all possible values / variants in a match statement or if-else chain. This is tedious and prone to copy paste errors. It’s very much the type of code you want generated for you.

All macro based libraries I have tried for doing this for me have some limitations, or even bugs.

Adding an impl TryFrom<X> for Y for every enum Y with #[repr(X)] would eliminate the need to bring in third-party libraries, with their own set of problems and possibly dependencies, for what is (in the type of code I write) a very common task.
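For reference, the hand-written version of this conversion looks roughly like the sketch below. The enum `Signal` is hypothetical, and returning the rejected integer as the error is an assumption made only for this example; the proposal does not settle on an error type.

```rust
use std::convert::TryFrom;

// Hypothetical enum, used only for illustration.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Signal {
    Start = 1,
    Stop = 2,
    Reset = 9,
}

impl TryFrom<u8> for Signal {
    // Assumption for this sketch: hand the rejected value back as the error.
    type Error = u8;

    fn try_from(raw: u8) -> Result<Self, Self::Error> {
        // Every variant has to be listed by hand. Forgetting one, or
        // pasting the wrong variant on the right-hand side, still compiles.
        match raw {
            x if x == Signal::Start as u8 => Ok(Signal::Start),
            x if x == Signal::Stop as u8 => Ok(Signal::Stop),
            x if x == Signal::Reset as u8 => Ok(Signal::Reset),
            other => Err(other),
        }
    }
}
```

This is the boilerplate that a generated impl, whether from the compiler or a derive, would take over.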

Drawbacks

  • These automatically derived trait implementations would become conflicting implementations for anyone who already manually implemented them. This could be fixed by requiring manually specifying #[derive(From, TryFrom)] or similar on the enums in question. Or by adding a new special trait to libstd for this very use case, so that no existing code already implements it:

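    trait Primitive: Sized {
        // The integer type named in the enum's #[repr(...)] attribute.
        type T;
        // Infallible: every variant has a value of type T.
        fn to_primitive(&self) -> Self::T;
        // Fallible: returns None when x matches no variant.
        fn try_from_primitive(x: Self::T) -> Option<Self>;
    }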
    
  • TryFrom is not yet stabilized, which severely limits the usefulness of at least half of this proposal for now. But it looks like TryFrom could very well be stabilized before this even becomes a thing. This is also a non-issue with the custom trait proposed just above.

  • TryFrom will return a Result<MyEnum, E>. So we would either need to set E to (), add a special error type for this to libstd, or emit a totally custom error type for each enum this is done for. It would likely be better to have the conversion from integer to enum return Option<MyEnum>, like most of the macro-based crates for this do. This is also a non-issue with the custom trait proposed.
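To make the custom-trait alternative concrete, here is a minimal sketch of the Primitive trait implemented by hand for one enum. The enum `Mode` is hypothetical, and the `Sized` bound is an addition needed so that `Option<Self>` is well-formed. A real version would presumably be generated rather than written out like this.

```rust
pub trait Primitive: Sized {
    type T;
    fn to_primitive(&self) -> Self::T;
    fn try_from_primitive(x: Self::T) -> Option<Self>;
}

// Hypothetical enum, used only for illustration.
#[repr(i32)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Off = 0,
    On = 1,
    Auto = -1,
}

impl Primitive for Mode {
    // Matches the repr attribute above.
    type T = i32;

    fn to_primitive(&self) -> i32 {
        *self as i32
    }

    fn try_from_primitive(x: i32) -> Option<Self> {
        // Linear search over all variants; returns None for unknown values.
        [Mode::Off, Mode::On, Mode::Auto]
            .iter()
            .copied()
            .find(|m| *m as i32 == x)
    }
}
```

Because the associated type T names the integer type, callers never write it themselves, and the Option return value sidesteps the question of which error type TryFrom should use.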

Prior art

There are a few macro-based crates that already do something similar to what this proposal includes:

Something very similar to this was discussed and merged a long time ago, but that solution has since been ripped out again: https://github.com/rust-lang/rust/pull/9250

I don't write this sort of code so I don't have much to say here other than possible textual improvements to the RFC (but that seems premature...). But...

...This bit is fixable by changes to rustdoc.

That doesn't discount the other parts of the motivation tho.

Having this be opt-in using derive sounds like a good option. And the word there doesn't need to match the trait name, so it could be something like ToDiscriminant and FromDiscriminant or something.

(Of course, this leads to the usual question of whether it could just be a crate...)

(While this is technically true; I think this should be discouraged... I think it's a design flaw in the language that non-traits can be derived -- having ToDiscriminant generate From impls would be utterly confusing imo..)

It indeed could be a crate. I link to the one you link to, and four more, under “prior art”. However, I have not been satisfied with the two I have tried. They either hide casting warnings or bring in extra dependencies. It feels like such a fundamental and simple thing that I don’t see why it would not be provided by default. It adds a tiny overhead to the language/libstd and strengthens the FFI story for Rust.

Automatically (as in "by compiler magic", not just derive) implementing a trait usually requires making it a lang item. The beauty of From and Into is exactly that they don't need built-in compiler support, they are just conventions. #[repr(uN)] enum -> uN conversion is already trivial in the language today (you just use as), and I'd rather not make a new addition (for the other direction) explicitly depend on a yet-unstable trait (TryFrom). So I think this should wait at least until some form of fallible conversion is stabilized.

Then why not:

  • try the others as well, and
  • open an issue about hiding cast warnings on the one you like the most / dislike the least?

Could you elaborate on these limitations?

I’m wondering if this is just a case where no one has written the perfect crate yet, or if there’s a fundamental reason why an external crate can’t do this as well as a core language feature could. I don’t personally write the kind of code that needs this, so I can’t really evaluate them myself.

I do believe that an external crate indeed can do it just as well as the core language. I should have been clearer on that. It’s just that I think there is one single, basically unopinionated, correct way of doing it. And when there is only one way of doing something, and it’s not likely that way will change, then I think it’s a good fit for the language itself. But others might have a completely different feeling about that. The fact that none of the existing crates for this implements my one “unopinionated” solution hints that I might be the odd one out. Which is why I want to raise the question here and see what people think :slight_smile:

Here are my thoughts on all the crates I know of that do something similar:

  • enum_primitive - Not a proc macro, so clogs up my normal Rust syntax more than it would need to, but see enum-primitive-derive for the newer version of the same thing.

  • enum-primitive-derive brings in the extra num-traits crate, so it’s quite unwieldy. And it implements conversion to and from every integer type, not just the single “correct” type that I would expect it to provide. So it does not solve the problem of telling the user what type they should aim for or can expect to work.

  • derive-try-from-primitive - Is one of the ones I have tried. It currently has this bug: https://github.com/JeffBelgum/derive-try-from-primitive/issues/1. And it only does conversion int -> enum, not the other way around, so not feature complete.

  • from-repr-enum-derive has only infallible conversion and thus brings in this concept of a forced default variant, something I feel is very opinionated and does not suit my use cases.

  • enum-repr does not make use of the #[repr(x)] attribute but instead requires its own: #[EnumReprType = "c_int"]. Because of this it’s not aware of the underlying representation of the enum in memory and thus has to do a lot of casting in the generated code. This also introduces silencing of warnings, something they write about in their documentation.

So to summarize: I think derive-try-from-primitive is on to something. I will try to contribute a fix to the bug and I will suggest they add conversion to the primitive as well, then they are basically exactly what I would like to see added to the language itself.

Yes, except it has the drawbacks I outlined. The user needs to know/type out the type, and it silently truncates if it's wrong/the enum ever changes representation.

I don't want to, and don't think people should, use as on enums for the same reasons outlined here: RFC: Make the `as` keyword consider `Into` Trait implementations by Kerollmops · Pull Request #2308 · rust-lang/rfcs · GitHub and written about here: Can't convert usize to u64? - #8 by kornel - The Rust Programming Language Forum. I'm not trying to do casting, I'm trying to do conversion with sane error checks in place.

Except that two out of three are simply untrue because of type inference in the context you brought this up.

Specifically, you brought this topic up because of FFI purposes. Doesn't this mean that you would primarily convert an enum to a primitive when passing it to an FFI function? If so, what would prevent you from using as _?

Edit: OK I see, there's also the "repr changing" which is the real problem. So that justifies the need. However, I'd still strongly prefer not baking From and Into into the compiler. This doesn't seem to be so complicated a problem that it would require this.

But let's put that away for a moment, because it's more of an aesthetic problem. The more concerning problem with an automatic implementation is: what if I want my #[repr(u8)] enum to have a custom From/Into impl? If the compiler automatically and unconditionally generates those, then I don't have a choice, I can't implement them by hand to enforce a different behavior.

I think adding a core language feature should have a higher barrier; in particular if a library can do it, and there is/should be a canonical way, then there's no need to add it to the language – but it's a good fit for the standard library. (Which is something I would absolutely support.)

I’ll just note that this has come up quite a few times in crates I’ve worked on, including rustls and quinn-proto. While a custom derive might be a nice solution, I think if Rust is going to have syntax to support defining discriminant, it would make sense for the language and/or std library to also give you ways (other than the opaque stuff we have now) to work with the discriminant.

My point exactly. If I specify the memory representation I obviously care about the content of the memory for this enum. If I care about it, I'm likely to want to access it in one way or another.

Ok. Sorry for being too specific then. Maybe I should have stopped my post after my summary section. What I do propose is a single, easy to access, no dependency way of doing this. Having it automatically derived was just a suggestion, not at all my only proposed way of doing this. I just want the feature somewhere more central than crates.io and my implementation details were just a few suggestions that could be discussed.

You don't specify which two, so I don't fully understand.

I had no idea as _ could be used in this context. Thanks, I learned something new. This indeed mitigates one of the three drawbacks under most circumstances I guess.

Again, using From and TryFrom was just one suggestion. Another one was to add a new custom trait just for enums with a specified repr. See the drawbacks section in my original post. It even states that I realize the custom trait is an even more viable alternative.

It does not need to be fully automatic and forced upon you. It could be an extra trait, and it could be something that has to be triggered by #[derive(...)] or similar.

I still don't see any way to do this without risk of truncating without warning. as _ still truncates without warning. This is a problem if either the enum representation or the function declaration changes, but also if they are simply wrong from the start. I have encountered a few C libraries where the constants and the functions expecting them are defined with different types.
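A small sketch of the mismatch described above, under the assumption that the enum is wider than the parameter it is passed to. The enum `Flag` and the function `ffi_takes_u8` are hypothetical stand-ins, and `send_checked` is just one possible way to surface the mismatch, not something proposed in this thread.

```rust
use std::convert::TryFrom;
use std::num::TryFromIntError;

// Hypothetical enum whose repr is wider than the function parameter below.
#[repr(u16)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Flag {
    Small = 0x0001,
    Large = 0x0100,
}

// Hypothetical stand-in for a foreign function declared with a narrower type.
fn ffi_takes_u8(raw: u8) -> u8 {
    raw
}

pub fn send_with_cast(f: Flag) -> u8 {
    // `as _` infers u8 from the callee and drops the high byte silently.
    ffi_takes_u8(f as _)
}

pub fn send_checked(f: Flag) -> Result<u8, TryFromIntError> {
    // Widen to the repr type first, then narrow with an explicit check.
    let raw = f as u16;
    u8::try_from(raw).map(ffi_takes_u8)
}
```

With the cast, `Flag::Large` reaches the function as 0. The checked version returns an error instead. With a From impl for the repr type, the mismatched call would not compile in the first place.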

Yep, that’s exactly what I forgot in the previous post, hence my edit.