Pre-RFC: Anonymous variant types

I decided I would be the latest to take a crack at the problem of designing ad-hoc sum types for Rust. The proposal de-emphasizes immediate conveniences in favor of a relatively minimal design, which may not be pleasant to use raw but should be more likely to make it far enough to be built on. I'm posting it here for feedback.

Now up to be pull requested!

Minor bikeshed: I think it would be cleaner if each variant of the anonymous enum were a single type. (() | f32, f64) would instead be written (() | (f32, f64)), and then you don’t have to worry about anything other than specifying which variant it is; the rest is just using that type.

I see the reasoning behind allowing multiple types in a variant, but this would be simpler and forward compatible with allowing them later.

I love this proposal

(f32) // syntax error

This isn't a syntax error now; it's the same as f32.


The never type is conceptually an enum with no variants, so I don't think we need another way to do that.


Could we use the enum keyword instead of (_|_)::0(NoneError), as in enum::0(NoneError)? It may be easier to read, and easier to add or remove variants, than the current way. (Similarly, pattern matching would use enum::0.)

I’m not sure I understand these lines:

let _: fn(i64, f64) -> (i64, f64 | &str) = (i64, f64 | &str)::0;
let _: fn(&str) -> (i64, f64 | &str) = (i64, f64 | &str)::1;

It seems to me you are defining an anonymous variable of type “function returning an anonymous variant”, but instead of assigning a function to this variable you assign a variant of the type. I would rather expect something like:

let _: fn(i64, f64) -> (i64, f64 | &str) = |x,y|(i64, f64 | &str)::0(x,y);
let _: fn(&str) -> (i64, f64 | &str) = |x|(i64, f64 | &str)::1(x);

In Rust right now you can do this:

let _: fn(i32) -> Option<i32> = Option::Some;

This is similar to what they are trying to show, i.e. that all of the variants are tuple-type variants.


This works for all tuple type variants of any enum.
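To make that concrete, here is a small sketch in today's Rust. The `Shape` enum and the `circles` helper are made up for illustration; the only point is that a tuple variant can be used wherever a function of the same signature is expected.

```rust
// A hypothetical enum used only for illustration.
#[derive(Debug, PartialEq)]
enum Shape {
    Circle(f64),
    Rect(f64, f64),
}

// A tuple variant is also a function from its fields to the enum,
// so it can be handed straight to `map`.
fn circles(radii: &[f64]) -> Vec<Shape> {
    radii.iter().copied().map(Shape::Circle).collect()
}

fn main() {
    // Plain type annotations; no special syntax is involved.
    let make_circle: fn(f64) -> Shape = Shape::Circle;
    let make_rect: fn(f64, f64) -> Shape = Shape::Rect;

    println!("{:?}", make_circle(1.0));
    println!("{:?}", make_rect(2.0, 3.0));
    println!("{:?}", circles(&[1.0, 2.0]));
}
```

The proposal's `(i64, f64 | &str)::0` would presumably behave like `Shape::Rect` does here: a constructor that coerces to a `fn` pointer.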

At first sight I like the general idea, because it could really help with error handling and also be a way to get rid of the weird (nested) either::Either in futures that is sometimes necessary. I like using this in languages like TypeScript, and overall I think it could be a very handy addition to Rust (like anonymous structs for named arguments … :smiley: but I don’t want to derail the discussion). Looking forward to it, and thanks for the time you put into this, researching the other RFCs and maybe finding a better way to get things going.

I don’t see anything about how to match on these enums, which seems odd because I thought how to match a variant when variants have no names was one of the harder parts of making anonymous enums viable. Was that a deliberate omission?

I think matching follows from what is described and can be done like

let x = (i32 | f32)::1(0.0);

match x {
    (i32 | f32)::0(-1) => { /* ... */ },
    (_ | _)::1(f) => { /* ... */ },
    _ => { /* ... */ },
}

However, I agree that it should be included in the pre-RFC.
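Since the proposed syntax does not compile today, here is a rough equivalent in current Rust, assuming a named stand-in enum. `IntOrFloat`, `V0`, `V1` and `describe` are hypothetical names; `V0` and `V1` play the role of the positional variants `::0` and `::1`.

```rust
// Hypothetical named stand-in for the proposed `(i32 | f32)`.
#[derive(Debug, Clone, Copy)]
enum IntOrFloat {
    V0(i32),
    V1(f32),
}

fn describe(x: IntOrFloat) -> String {
    match x {
        // Corresponds to `(i32 | f32)::0(-1)`.
        IntOrFloat::V0(-1) => String::from("minus one"),
        // Corresponds to `(_ | _)::1(f)`.
        IntOrFloat::V1(f) => format!("float {}", f),
        // Catch-all for every other integer in variant 0.
        _ => String::from("some other integer"),
    }
}

fn main() {
    println!("{}", describe(IntOrFloat::V1(0.5)));
    println!("{}", describe(IntOrFloat::V0(-1)));
    println!("{}", describe(IntOrFloat::V0(7)));
}
```

The match itself is ordinary enum matching; only the path in front of each arm would change under the proposal.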

Thank you, I did not know about this feature. It seems pretty useless to me to have a special syntax for such a thing.

It’s not special syntax. What @RustyYato wrote is just an ordinary type annotation in an ordinary let binding. It merely uses the fact that tuple enum variants are functions.

To the subject: I find automatic trait implementation highly problematic if it’s done by hard-wiring these traits into the compiler (in yet another way). In particular, would this require making Debug and Hash lang items too? Why wouldn’t we want to do what is currently done for tuples, and implement these traits in std (or core) for some anonymous sum types instead, possibly while waiting for a more general solution (which AFAICT would be variadic generics for tuples, maybe something else for anonymous sum types)?
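As a sketch of what the library-side route could look like, assuming generic stand-in enums since anonymous sum types do not exist yet: one macro invocation per arity, loosely mirroring how tuple impls are written per length. `Sum2`, `Sum3`, `impl_sum_traits!` and `hash_of` are all hypothetical names, and the Debug output format is an arbitrary choice.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fmt;
use std::hash::{Hash, Hasher};

// Hypothetical generic stand-ins for two- and three-variant anonymous sums.
#[allow(dead_code)]
enum Sum2<A, B> { V0(A), V1(B) }
#[allow(dead_code)]
enum Sum3<A, B, C> { V0(A), V1(B), V2(C) }

// Ordinary library code: no lang items, just one expansion per arity.
macro_rules! impl_sum_traits {
    ($name:ident; $( $idx:expr => $var:ident : $ty:ident ),+) => {
        impl<$($ty: fmt::Debug),+> fmt::Debug for $name<$($ty),+> {
            fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
                match self {
                    $( $name::$var(inner) => write!(f, "::{}({:?})", $idx, inner), )+
                }
            }
        }
        impl<$($ty: Hash),+> Hash for $name<$($ty),+> {
            fn hash<H: Hasher>(&self, state: &mut H) {
                match self {
                    $( $name::$var(inner) => {
                        // Hash the variant index first so that equal payloads
                        // in different variants feed different input.
                        ($idx as u8).hash(state);
                        inner.hash(state);
                    } )+
                }
            }
        }
    };
}

impl_sum_traits!(Sum2; 0 => V0: A, 1 => V1: B);
impl_sum_traits!(Sum3; 0 => V0: A, 1 => V1: B, 2 => V2: C);

// Helper for the demo: hash a value with the standard default hasher.
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    println!("{:?}", Sum2::<u32, &str>::V0(5));
    println!("{:?}", Sum3::<u8, bool, &str>::V2("x"));
    println!("{}", hash_of(&Sum2::<u32, u32>::V1(1)));
}
```

Like the tuple impls, this only covers the arities someone bothered to expand, which is the limitation a more general mechanism would have to lift.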

I would also like to see some real-world use cases for this proposal. (I hope I didn’t just miss them in the main text.)

I like this proposal a lot. A tuple can be seen as just a struct whose fields are named 0, 1, etc., and in the same way your types can be seen as just an enum whose variants are named 0, 1, etc.

I think it works quite well with the recently accepted or-patterns RFC. Together, these RFCs would allow for things like

let x = (u32 | u32, f32)::1(1, 0.0);

match x {
    (_ | _, _)::0(2 | 3) => { /* ... */ },
    (_ | _, _)::0(x) | (_ | _, _)::1(x, 0.0) => { /* ... */ },
    _ => { /* ... */ },
}
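For reference, the same shape of match works in current Rust with a named enum, now that or-patterns are stable. This is only a sketch: `Sample`, `V0`, `V1` and `classify` are hypothetical stand-ins for `(u32 | u32, f32)` and its positional variants. The nested or-pattern arm comes first here so that it stays reachable.

```rust
// Hypothetical stand-in for the proposed `(u32 | u32, f32)`.
// The f32 field is never read in this sketch, hence the allow.
#[allow(dead_code)]
enum Sample {
    V0(u32),
    V1(u32, f32),
}

fn classify(x: &Sample) -> String {
    match x {
        // Nested or-pattern inside a variant.
        Sample::V0(2 | 3) => String::from("variant 0 holding 2 or 3"),
        // Top-level or-pattern: both alternatives must bind `n`
        // with the same type.
        Sample::V0(n) | Sample::V1(n, _) => format!("first field is {}", n),
    }
}

fn main() {
    println!("{}", classify(&Sample::V0(3)));
    println!("{}", classify(&Sample::V0(7)));
    println!("{}", classify(&Sample::V1(1, 0.0)));
}
```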

An addition I would like to make is allowing enums like enum Foo(u32 | f32), like we currently allow structs like struct Foo(u32, f32). This would allow doing

enum Foo(u32 | u32, f32);

let x = Foo::1(1, 0.0);

match x {
    Foo::0(2 | 3) => { /* ... */ },
    Foo::0(x) | Foo::1(x, 0.0) => { /* ... */ },
    _ => { /* ... */ },
}

A simpler alternative here is to just have:

ty ::= ... | ty_senum ;

ty_senum : "(" (ty "|")+ (ty "|"?)? ")" ;

as @CAD97 has noted.

The changes to the expression and pattern grammars are not defined.

Ostensibly you could write if let _::0(x, y, ...) = expr { .. } instead, which seems more ergonomic, and is probably necessary to make this more ergonomic than creating new nominally typed enums.

This is not ambiguous, but requires at least backtracking with https://github.com/rust-lang/rfcs/pull/2535. Worse, the backtracking here is not a corner case that is unlikely to arise, rather, it is quite likely to occur.

The most powerful solution is probably frunk::coproduct, and it should be mentioned.
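For readers who have not seen the coproduct approach, here is a hand-rolled sketch of the idea: a sum type encoded as a nested "this type, or something in the rest of the list". This is not frunk's code. The names `Coproduct`, `Inl`, `Inr` and `CNil` are defined locally to echo that style, so check frunk's own documentation for its real API and helpers.

```rust
// Hand-rolled sketch of a nested coproduct.
enum Coproduct<H, T> {
    Inl(H), // the value has the head type
    Inr(T), // the value lives somewhere in the tail
}

// Uninhabited terminator for the list of alternatives.
enum CNil {}

// Stand-in for a hypothetical `(i32 | f32 | &'static str)`.
type IntFloatStr = Coproduct<i32, Coproduct<f32, Coproduct<&'static str, CNil>>>;

fn describe(value: &IntFloatStr) -> String {
    match value {
        Coproduct::Inl(i) => format!("i32: {}", i),
        Coproduct::Inr(Coproduct::Inl(f)) => format!("f32: {}", f),
        Coproduct::Inr(Coproduct::Inr(Coproduct::Inl(s))) => format!("str: {}", s),
        // `CNil` has no values, so this arm can never run.
        Coproduct::Inr(Coproduct::Inr(Coproduct::Inr(never))) => match *never {},
    }
}

fn main() {
    let a: IntFloatStr = Coproduct::Inl(3);
    let b: IntFloatStr = Coproduct::Inr(Coproduct::Inl(1.5));
    let c: IntFloatStr = Coproduct::Inr(Coproduct::Inr(Coproduct::Inl("hi")));
    println!("{}", describe(&a));
    println!("{}", describe(&b));
    println!("{}", describe(&c));
}
```

The nesting is what makes it structural: two crates that write the same list of types get the same type, at the cost of noisy construction and matching unless helper traits hide it.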

Another way to have structurally typed coproducts is to have something like:

type Foo = Bar | Bar(u8, f32) | Quux { field: String };

let a_foo: Foo = Bar(1, 1.0);

match a_foo {
    Bar => expr,
    Bar(x, y) => expr,
    Quux { field } => expr,
}

As you can see, there are zero changes to the pattern and expression grammars. Only the type grammar changes.

If I understand correctly, instead of an enum like this:

enum RgbOrRgba {
   RGB(Vec<RGB>),
   RGBA(Vec<RGBA>),
}

I could define an anonymous one like that:

type RgbOrRgba = (Vec<RGB>, Vec<RGBA>);

and then match rgb @ Vec<RGB> rather than RgbOrRgba::RGB(rgb). That would be slightly nicer, but not ground-breaking.

But given a function like:

fn foo(a: (Vec<RGB>, Vec<RGBA>)) {}

I’d want to call it foo(vec![rgb]), not foo((Vec<RGB>, Vec<RGBA>)::0(vec![rgb])), because replacing a redundant ::RGB with a weird ::0 does not seem better.
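For comparison, the named enum can already get fairly close to that call syntax with From impls, at the cost of some boilerplate. This is a sketch only: the `RGB` and `RGBA` structs are hypothetical pixel types defined so that it compiles, and `foo` returning a channel count is an arbitrary choice to give it observable behavior.

```rust
// Hypothetical pixel types; the fields are never read in this sketch.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
struct RGB { r: u8, g: u8, b: u8 }
#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
struct RGBA { r: u8, g: u8, b: u8, a: u8 }

enum RgbOrRgba {
    RGB(Vec<RGB>),
    RGBA(Vec<RGBA>),
}

impl From<Vec<RGB>> for RgbOrRgba {
    fn from(pixels: Vec<RGB>) -> Self {
        RgbOrRgba::RGB(pixels)
    }
}

impl From<Vec<RGBA>> for RgbOrRgba {
    fn from(pixels: Vec<RGBA>) -> Self {
        RgbOrRgba::RGBA(pixels)
    }
}

// Accepting `impl Into<RgbOrRgba>` lets callers pass either vector directly.
// Returns the total number of channels.
fn foo(a: impl Into<RgbOrRgba>) -> usize {
    match a.into() {
        RgbOrRgba::RGB(pixels) => pixels.len() * 3,
        RgbOrRgba::RGBA(pixels) => pixels.len() * 4,
    }
}

fn main() {
    let rgb = RGB { r: 1, g: 2, b: 3 };
    let rgba = RGBA { r: 1, g: 2, b: 3, a: 255 };
    println!("{}", foo(vec![rgb]));
    println!("{}", foo(vec![rgba; 2]));
}
```

This only works because the two payload types differ; with two variants of the same type the conversion would be ambiguous, which is one reason to index variants by position.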

Understandable. However, the point of this RFC is not to be fully featured upfront, but to have a better shot at actual implementation, so that conveniences such as the ones you describe can be added later. Specifying the variants by numerals rather than by types is part of this, as it avoids questions about how the “sum type operator” interacts with itself and the rest of the type system, for instance when two different sum types that share a variant type are themselves sum-typed together.

I’ve edited the post with a revision of the pre-RFC, taking into account the feedback from this thread. The major change is to make it so that variants are limited to one field, which simplifies the grammar somewhat and makes it more feasible for users to create blanket implementations for a practical subset of anonymous variant types.

This is lovely, almost exactly the proposal I sketched in some thread a few months ago! In general, I’m guessing you can write _::0 in most situations? My proposal didn’t include the pathspec on purpose, since I thought it was much too noisy. I can see from your examples how it might be necessary, though.

What does the following desugar to?

let x: fn(A) -> (A|B) = (A|B)::0;

I fear that this might result in a lot of work for the linker, since each crate will generate its own copies of these functions and the linker will need to dedup them.

Also, I think I’ve suggested the syntax enum(A, B) before? I think the extra keyword is fine, since anonymous sums are likely to be rarer than anonymous enums, and is more symmetric with extant syntax. Plus, again on the assumption that they’re rarer, a reader spotting enum(A, B) might have a better guess (and a more searchable name) for the syntax.

As an obvious corollary to the above paragraph, it might be worth it to introduce “sum enums” in lieu of tuple structs: enum Foo(A, B);, and by that token, “never-like enums”, e.g. enum Void;
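For context, current Rust already spells an uninhabited enum as an enum with an empty body, so `enum Void;` would be new surface syntax for something like the following. The names `Void`, `always_ok` and `unwrap_infallible` are made up for this sketch.

```rust
// An enum with no variants has no values.
enum Void {}

// A computation whose error type says "this cannot fail".
fn always_ok() -> Result<u32, Void> {
    Ok(7)
}

fn unwrap_infallible<T>(result: Result<T, Void>) -> T {
    match result {
        Ok(value) => value,
        // An empty match is exhaustive because `Void` has no variants.
        Err(never) => match never {},
    }
}

fn main() {
    println!("{}", unwrap_infallible(always_ok()));
}
```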

What does the following desugar to?

let x: fn(A) -> (A|B) = (A|B)::0;

That statement is roughly translated to something like the following (it's not exact, because the compiler can do selective identifier unification and hygiene better than we humans and our text files can):

enum __A_B_anon_variant {
    __variant_0(A),
    __variant_1(B)
}

let x: fn(A) -> __A_B_anon_variant = __A_B_anon_variant::__variant_0;

The whole point of making these anonymous variant types mirror the behavior of enums was to let existing enum machinery be reused for anonymous variant types, and thus reduce the effort needed to implement them.
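To check that the rough translation really is ordinary Rust, here it is filled in so that it compiles. `A` and `B` are hypothetical payload types, and the generated names are copied from the snippet above purely for illustration; nothing says the compiler would pick these.

```rust
// Hypothetical payload types standing in for `A` and `B`.
#[derive(Debug, PartialEq)]
struct A(u8);
#[derive(Debug, PartialEq)]
struct B(&'static str);

// What might be generated for `(A | B)`; the names are illustrative only.
#[allow(non_camel_case_types)]
#[derive(Debug, PartialEq)]
enum __A_B_anon_variant {
    __variant_0(A),
    __variant_1(B),
}

fn main() {
    // The variant constructors coerce to plain function pointers.
    let x: fn(A) -> __A_B_anon_variant = __A_B_anon_variant::__variant_0;
    let y: fn(B) -> __A_B_anon_variant = __A_B_anon_variant::__variant_1;

    println!("{:?}", x(A(1)));
    println!("{:?}", y(B("text")));
}
```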

I understand all that, since I wrote a very similar proposal a while back. But… where does __A_B_anon_variant::__variant_0 (i.e. the couple movs, xor, and ret that make it up) live? I’m worried about really intense proliferation of lots of functions like this that the linker has to go deduplicate.

But… where does __A_B_anon_variant::__variant_0 (i.e. the couple movs, xor, and ret that make it up) live?

In a special land which shares its resources with enum land but has distinct residents. Presumably any problems about the runtime needed for anonymous variant types would already have been surfaced as problems concerning the runtime needed for enums by now with the varied monomorphizations of Options, Results, and countless other common and single-shot enum types flying around in the Rust ecosystem.
