Relaxed discriminant rules for `#[non_exhaustive]` ABI-safe enums

Currently, the semantics of #[non_exhaustive] on ABI-safe enums are surprising. The ABI of an enum depends on the number (and assigned discriminants) of its variants. This means that such types cannot be upgraded in ABI-stable contexts, such as plugin systems, even though the explicit #[repr] would seem to permit it.

I ran into this issue with lccc (GitHub: LightningCreations/lccc, the Lightning Creations compiler frontend), where much of the interface uses #[repr(uN)] enums that cannot be upgraded, despite many of them deliberately leaving upgrade space in the repr.

There are a couple solutions (other than leaving it as-is) that could be explored:

  1. Fix this specifically for #[non_exhaustive], allowing unmatched discriminants. The main problem is that this also affects the defining crate (which would no longer be required to match the enum exhaustively)
  2. Add a new feature for this purpose, such as #[non_exhaustive(abi)] (obviously bikesheddable) that guarantees this behaviour, and lint (possibly allow-by-default) on the combination of #[non_exhaustive] and #[repr({u,i}N)] or #[repr(C)].

This is mostly an issue in my case for data-carrying enums, as dataless (C-like) enums can (and in my case, do) use a macro (such as the fake-enum crate), but using this same emulation structure for data-carrying enums poses significant ergonomics issues (and ABI-stable Rust is already a significant annoyance; thanks, &dyn and &[T]).
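To make the ergonomics cost concrete, here is a minimal sketch of what hand-rolling an "open" data-carrying enum looks like today with a tag plus a union. All names (ExprRaw, ExprPayload, the stand-in field types) are illustrative, not lccc's actual definitions:

```rust
// A hand-rolled "open" data-carrying enum (sketch; names are illustrative):
#[derive(Clone, Copy)]
#[repr(C)]
struct ExitBlock {
    blk: u32,
    values: u16,
}

#[derive(Clone, Copy)]
#[repr(C)]
struct Branch {
    cond: u8, // stand-in for a real BranchCondition type
    target: u32,
}

#[repr(C)]
union ExprPayload {
    exit_block: ExitBlock,
    branch: Branch,
}

#[repr(C)]
pub struct ExprRaw {
    tag: u32, // 0 = ExitBlock, 1 = Branch, others reserved for future versions
    payload: ExprPayload,
}

// Every access goes through unsafe code and a manual tag check:
fn exit_values(e: &ExprRaw) -> Option<u16> {
    match e.tag {
        0 => Some(unsafe { e.payload.exit_block.values }),
        _ => None, // unknown or non-ExitBlock tags fall through safely
    }
}

fn main() {
    let e = ExprRaw {
        tag: 0,
        payload: ExprPayload { exit_block: ExitBlock { blk: 3, values: 7 } },
    };
    assert_eq!(exit_values(&e), Some(7));
}
```

Unknown future tags are handled gracefully here, which is exactly the point, but every field access needs unsafe and a manually maintained tag table.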


I'm strongly against having non_exhaustive do this. There are plenty of reasons for wanting Option<Foo> to layout-optimize even if Foo can grow more variants in future releases.
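As an illustration of the layout optimization at stake: on current rustc, an enum's unused discriminant values act as a niche that Option can reuse. This is observed behavior, not a documented guarantee:

```rust
use std::mem::size_of;

// A closed enum with unused discriminant values (a niche):
#[allow(dead_code)]
enum Foo {
    A,
    B,
    C,
}

fn main() {
    // Observed on current rustc (an optimization, not a layout guarantee):
    // the niche lets Option<Foo> reuse Foo's spare values, adding no bytes.
    assert_eq!(size_of::<Foo>(), 1);
    assert_eq!(size_of::<Option<Foo>>(), 1);
}
```

If #[non_exhaustive] silently reserved all unmatched discriminants, this optimization would have to be given up for every such type.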

It might be interesting to look at those ergonomics issues. For example, maybe we could figure out how to make "pattern items" of some sort -- that seems like it'd be handy in lots of places.

What about the second option, #[non_exhaustive(abi)], which indicates the enum is non-exhaustive for both API and ABI purposes? I do think it's a potential footgun that plain #[non_exhaustive] is entirely useless for an enum with a stable ABI.


Maybe a better syntax for this would be

#[repr(u8, non_exhaustive)]

so that it is clear that it is modifying which representations are valid, rather than putting a representation-related thing in another attribute (which is currently only a visibility-like semantic constraint) and introducing the new term abi for it.

I also think this would be an interesting feature, for another use case: enums which have no invalid representations would allow more statically safe transmutes (as provided by bytemuck::Pod) and enable the use of enums as "list of known protocol-defined numbers, and also any unknown" rather than having to resort to a bundle of integer constants.


I use integer wrapper types for FFI instead of enums. Enums are just too much of a foot-gun to be useful for FFI.

But integer wrapper types are annoying because they get extremely verbose:

pub struct MyEnum(i32);

impl MyEnum {
  pub const VARIANT_A: MyEnum = MyEnum(0);
  pub const VARIANT_B: MyEnum = MyEnum(1);
  pub const VARIANT_C: MyEnum = MyEnum(2);
  // ...
}

I wouldn't mind this if constants weren't so annoyingly verbose to write. I know macros and external crates can simplify this but I dislike unnecessary macros and dependencies.

Perhaps instead of pursuing enums for FFI, we could simplify integer wrapper types?
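The boilerplate above can already be compressed with a small declarative macro; here is a minimal sketch (int_enum! is hypothetical, not an existing crate):

```rust
// Hypothetical helper macro generating an integer wrapper type with
// named constants; a sketch of what "simplified wrapper types" could do.
macro_rules! int_enum {
    ($vis:vis struct $name:ident($ty:ty) { $($variant:ident = $val:expr),* $(,)? }) => {
        #[derive(Clone, Copy, PartialEq, Eq, Debug)]
        #[repr(transparent)]
        $vis struct $name(pub $ty);

        impl $name {
            $(pub const $variant: Self = Self($val);)*
        }
    };
}

int_enum! {
    pub struct MyEnum(i32) {
        VARIANT_A = 0,
        VARIANT_B = 1,
        VARIANT_C = 2,
    }
}

fn main() {
    assert_eq!(MyEnum::VARIANT_B, MyEnum(1));
    // Unknown values are still representable, which is the whole point:
    let unknown = MyEnum(42);
    assert_ne!(unknown, MyEnum::VARIANT_C);
}
```

Whether this belongs in a macro, a crate, or the language itself is exactly the question under discussion.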

I've noted that for dataless enum types, I already have and use a macro that does this. My problem is when the enum looks like

#[derive(Clone, Debug, Hash, PartialEq, Eq)]
pub enum Expr {
    ExitBlock { blk: u32, values: u16 },
    Branch { cond: BranchCondition, target: u32 },
    // ...
}

I'd like the ability to upgrade this in minor versions of xlang (lccc's intermediate architecture), but #[non_exhaustive] is insufficient for that, because the interface has to be ABI-stable for plugins.


I'm generally in favor of this.

It needs a new syntax, because niche optimizations are documented, and there could be crates relying on this.

What about the size of the whole enum, not just its discriminant? If you add a new data variant with [u8; 99999] that's going to break the ABI too.

Niche optimizations are not guaranteed for enum types. However, I will agree that they are beneficial to maintain as an optimization.

This is easy (or, at least, easier) to deal with. The discriminant is the biggest problem.

For enums with exactly two variants, where one contains no data and the other contains a non-nullable type, it is guaranteed that the enum is the same size as the non-nullable type, and that null is used to represent the variant containing no data. For example, Option<&T>, &T and *const T are ABI-compatible.
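The guarantee quoted above can be checked directly; a minimal demonstration:

```rust
use std::mem::size_of;

fn main() {
    // The null pointer optimization is a documented layout guarantee:
    // Option<&T> is the same size as &T, with null encoding None.
    assert_eq!(size_of::<Option<&u8>>(), size_of::<&u8>());
    assert_eq!(size_of::<Option<&u8>>(), size_of::<*const u8>());
}
```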

Because of RFC 2195 (really-tagged-unions), I would have thought #[repr(u8)] should have the desired effect. The virtual union in terms of which the layout of #[repr(u8)] enums with fields is described should be enough to inhibit niche optimizations, as the union doesn't have to have a bit representation matching any of its fields AFAIK. However, it seems that this isn't the case: niche optimization is only prevented if a dummy variant that doesn't contain the tag is added to the union. Using #[repr(C)] does prevent niche optimization, though.
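For reference, the RFC 2195 as-if layout can be spelled out and checked; a sketch (Tagged and TaggedRaw are illustrative names):

```rust
use std::mem::{align_of, size_of};

// A #[repr(u8)] enum with fields...
#[repr(u8)]
#[allow(dead_code)]
enum Tagged {
    A(u32),
    B,
}

// ...which RFC 2195 specifies is laid out as-if it were this union
// of per-variant #[repr(C)] structs, each starting with the tag:
#[repr(C)]
#[derive(Clone, Copy)]
struct VariantA {
    tag: u8, // always 0 for this variant
    field: u32,
}

#[repr(C)]
#[derive(Clone, Copy)]
struct VariantB {
    tag: u8, // always 1 for this variant
}

#[repr(C)]
union TaggedRaw {
    a: VariantA,
    b: VariantB,
}

fn main() {
    // The enum and its as-if union have identical size and alignment:
    assert_eq!(size_of::<Tagged>(), size_of::<TaggedRaw>());
    assert_eq!(align_of::<Tagged>(), align_of::<TaggedRaw>());
}
```

Whether this as-if description should also suppress niche optimization of containing types is the open question in the quoted comment.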

Yes, I'm aware that outer enums guarantee niche optimization (though, technically, the normative documentation applies this specifically to Option<T>). Inner enums do not, however, guarantee any niche optimization of containing objects, which was my point here.

That might be a decent argument, but I'm unsure whether it applies, as it's not dictating the exact definition, merely that it's laid out as-if defined that way.

Which #[repr]? #[non_exhaustive] only affects type checking, it is (deliberately) not written #[repr(non_exhaustive)]. IMO these type checking effects should not be conflated with ABI concerns which are controlled by #[repr].

However some of your comment sounds like you are not just concerned with ABI (which I assume refers to things such as field placement), but also the validity invariant of the type. None of the attributes on an enum change its validity invariant in terms of high-level values, the discriminant always needs to refer to an (inhabited) variant. Is that what you want to see changed?


Any ABI-safe repr, #[repr(C)], #[repr({u,i}n)], etc.

ABI concerns everything about the value as it is passed between domains. Adding a variant to an enum is an ABI breaking change, because the new variant cannot be passed to old code. I would like a way to express "This enum may have variants not declared here added in the future and passed to code compiled with the current definition (which would be mostly unable to handle it further than as a _ => arm)".
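The intent described here can today only be emulated with an integer newtype whose constants are matchable; a sketch (OpCode and describe are hypothetical names):

```rust
// Sketch of the desired semantics: known variants are matchable,
// unknown discriminants are still valid values handled by `_ =>`.
#[derive(Clone, Copy, PartialEq, Eq)]
#[repr(transparent)]
pub struct OpCode(pub u16);

impl OpCode {
    pub const NOP: OpCode = OpCode(0);
    pub const HALT: OpCode = OpCode(1);
}

fn describe(op: OpCode) -> &'static str {
    match op {
        OpCode::NOP => "nop",
        OpCode::HALT => "halt",
        // Discriminants added in future versions land here:
        _ => "unknown",
    }
}

fn main() {
    assert_eq!(describe(OpCode::HALT), "halt");
    assert_eq!(describe(OpCode(999)), "unknown");
}
```

The proposal is essentially to let a data-carrying enum behave this way without giving up enum ergonomics.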

Inner enum? Outer enum? Does this have a useful definition?

Note that if you have &Option<Option<T>>, you can get at &Option<T>. Thus as far as layout is concerned, it doesn't matter if an enum is inside some other enum, it has to have the same repr as when owned itself.

If you're referring to some other concept (serde's method of mapping enums into JSON ("variant": { data } versus { kind: "variant", data })? The difference between #[repr(u8)] and #[repr(C, u8)]?), it'd be good to clarify your terms here.

FWIW I have no idea what exactly this means, but also ABI as a term is ambiguous:

So, in the future it might be good to be more explicit here. But the rest of the comment clarifies it well enough for the purpose of this thread I think.

(I assume there is some reasonable constraint here about the size of the new variants as otherwise this could not possibly work.)

I think what you are describing is simply not an enum any more. An enum is algebraically a sum type, and being able to match on it and enumerate its variants is key to the concept of such a type. To be fair, non_exhaustive somewhat goes against that idea, but only in the sense of "the definition could change and this code should still compile" -- at codegen time, we still always have a well-defined fully known set of variants.

If you want something that is actually open-ended, you'll have to code that up yourself using an integer "discriminant" and a union for the fields. I could imagine a macro can help with that. I am not convinced that this is a job for enum.


Inner enum meaning the enum inside the niche-optimized enum. Enums inside other enums aren't guaranteed to provide a niche to the outer enum. I.e., Option<T> will be niche-optimized based on T, but is not guaranteed to apply niche optimization to an outer Option<Option<T>>. Option<Option<bool>> can have size_of == 2, even though Option<bool> has size_of == 1.

Yes, this is the trivial part.

I think it still is, really. After all, I'm not exceeding that definition (other than the at codegen time part, since it's really not known until load time now), I'm just saying that some variants aren't yet defined (but may be added in the future without breaking ABI).

The issue, as I mentioned, is the ergonomics of the union. I can no longer write simple code to handle the variants known to exist, and it can't be done safely. Rust is a huge ergonomics pain to work with in an ABI-stable context, and I'd ideally like to not make it worse.


This is entirely changing the type from being closed-world to being open-world -- so it is definitely exceeding the definition by removing its most characteristic component.

We can disagree on whether enum should support the open-world use-case (I am not convinced it should), but I will insist that this is a rather fundamental shift away from what enum currently is -- and I am backed by literal decades of research on category theory and type theory. If you don't care about the deep theory behind closed-world sum types, that is totally fair, of course. I am just stating a personal preference that I do not think these things should be mixed up lightly.

Conceptually, your open-world enums are much closer to dyn Trait than to our current enum. dyn Trait is currently the canonical way to express open-world runtime polymorphism in Rust. What you describe sounds basically like a variant of dyn Trait with the extra constraint that the underlying type has a certain maximal size, making the entire type sized? If we had that, then you could use downcasting to safely handle the variants (types) known to exist. That feels much more natural to me for an open-world usecase than trying to use enum for this. (This is coming purely from a type theoretic perspective. I understand you are coming from a low-level bits-and-bytes perspective, and that is why enum seems more natural to you. I am offering the type theoretic perspective in the hopes that it can be useful or at least educating.)


To me, it still feels like a natural extension. #[non_exhaustive] allows the type to be extended without breaking API. IMO, there should be a way to extend enums without breaking ABI.

dyn Trait :grimacing:. This is about the worst boilerplate to work with in an ABI-safe context, so I'd like to avoid it as much as possible. It also has absurd overhead, because emulating vtables and dynamic dispatch sucks (I have plans to optimize this, but I can only do it on lccc itself, because there I know the layout of trait objects and vtables). Downcasting is also going to be horrible for ABI, because I can't do TypeId myself.
Ideally, I'd avoid using trait objects as much as possible. Enums are far better for ergonomics and avoiding boilerplate.


I'd argue that most research on type theory has nothing to say about ABI, because most languages don't support ABI stability at all. In most languages, if you want to combine two pieces of code, you're just expected to feed them into the same compiler at the same time.

You might counter that some languages that don't talk about ABI still have concepts of open-world polymorphism that could be made to fit. But they would be a poor fit. The thing about this use case is that it's not really open-world. It's not as if any crate can define its own enum variants. There is still a single authoritative definition; it's just that @InfernoDeity wants to be able to extend the definition without breaking compatibility with client crates. If we were only talking about the API level – if "breaking compatibility" meant "requiring a code change" – then that would be exactly the purpose of #[non_exhaustive]. Instead we're talking about ABI, where "breaking compatibility" means "requiring a code change or rebuild". But that doesn't make it fundamentally different from a type theory perspective.

More concretely, dyn Trait is not a good fit. For one thing, it adds an extra indirection. But suppose we fixed that by coming up with some variant of dyn where the vtable pointer is part of the object itself, something that, as you probably know, has been requested in the past for other use cases. There would still be the problem that a vtable pointer is much larger than a 1- or 2-byte discriminant.

Even if efficiency weren't a concern, what would the trait even do? The desired use case is for clients to match against known variants while ignoring unknown ones. I guess one could use the visitor pattern – some convoluted thing like:

trait Foo {
    fn visit(&mut self, visitor: &mut dyn FooVisitor);
}

trait FooVisitor {
    // instead of A { bar: i32, baz: i64 }
    fn visit_variant_a(&mut self, bar: i32, baz: i64);

    // instead of B { bay: String }
    fn visit_variant_b(&mut self, bay: String);
}

But this is just like how people emulate sum types in Java or whatever – languages that have class-based polymorphism but don't support true sum types. It would be a depressing sight in Rust, which does support them.

Not to mention that traits aren't natively ABI-stable either, though they can be emulated.

Sidenote: I have a personal stake in this. I have a fairly ambitious project in mind for the future, which I would like to write in pure Rust. However, for my use case it will be critical for most boundaries between different crates to be ABI-stable. I even want to write an ABI-stable wrapper for a significant chunk of std and core.

My current plan is to do this with the abi_stable crate (or perhaps implement something similar myself), which essentially reimplements some language features inside procedural macros, to work around the fact that the real versions of the features don't support ABI stability. For example, there's no option to make trait objects ABI-stable, so abi_stable has an attribute macro you can apply to a trait to create a custom trait object (a repr(C) struct full of extern "C" function pointers) and generate the necessary plumbing between that and the actual trait. It also supports non-exhaustive enums!

But as you might expect, these workarounds have significant limitations and ergonomic penalties. So I would like the language to offer native ABI stability as an option for as many features as possible.


That is a good point, so my analogy with dyn Trait does not work. Thanks for pointing this out.

At one point I had started writing an RFC for a #[repr(non_exhaustive)] on enums (link to the very obviously unfinished RFC that I wrote in the past is here). ABI wasn't even the main concern, instead just FFI and deserialization, both of which would have been very useful to me at the time (and the FFI case would still be useful to me). The main points were:

  • Only applied to C-style enums, required that the repr include an explicit size (e.g. #[repr(C, non_exhaustive)] or #[repr(int, non_exhaustive)]) and required the #[non_exhaustive] attribute to be present.
  • Only relaxed the validity requirements for these enums. Didn't contain any method of going from int->enum (presumably safe transmute might cover that), but enum->int via as would do the obvious thing.
  • Probably others. It's been a year since I last worked on this.
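For illustration, the enum-to-int direction already does the obvious thing via `as` today, while int-to-enum needs a manual fallback; a sketch (Status and from_u8 are hypothetical names):

```rust
#[repr(u8)]
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Status {
    Ok = 0,
    NotFound = 4,
}

// int -> enum has no safe built-in today; a match is the usual fallback:
fn from_u8(n: u8) -> Option<Status> {
    match n {
        0 => Some(Status::Ok),
        4 => Some(Status::NotFound),
        _ => None,
    }
}

fn main() {
    // enum -> int via `as` already works for C-style enums:
    assert_eq!(Status::NotFound as u8, 4);
    assert_eq!(from_u8(4), Some(Status::NotFound));
    assert_eq!(from_u8(7), None);
}
```

Under the sketched #[repr(int, non_exhaustive)], the `_ => None` case would instead become a legitimately inhabited "unknown" state of the enum itself.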

Still, I'm surprised it's as controversial now as it seems. I guess the focus on ABI and the fact that it's not limited to C-like enums in this proposal is the difference?