Concept RFC: Tuple Enums

Regarding the tuple struct vs tuple enum thing. I wanted to mention that you almost never need to use indices for tuple structs. For example:

// constructing
let pair = MyStruct(left, right);
// index notation
let right = pair.1;
// positional notation
let MyStruct(_, right) = pair;

Now with these tuple enums, I was thinking this benefit wouldn't apply. I realize technically this isn't true if the following syntax were considered:

// constructing
let place = MyEnum(_, right);
// destructuring
match place {
    MyEnum(left, _) => ...,
    MyEnum(_, right) => ...,
}

I suppose this was beyond my consideration earlier, because this syntax permits MyEnum(left, right) which is invalid.

I've gotta say, this is the first better-enums-related proposal I don't hate, which I think is an important step towards being able to fill out the (sum/product) x (nominal/structural) x (fields/indices) tensor. I think construction is our only remaining issue, and the main prior art, std::variant, has nothing helpful to offer (implicit coercion is the worst forever).

I'm tentatively a fan of the "overloaded" solution, or perhaps something like

enum Foo(A, B, C);

fn Foo<T: <synthetic>>(t: T) -> Foo;

where <synthetic> is an unnameable, sealed, compiler-generated trait that is implemented only by those types that are members of Foo. Thus, you could write:

Foo(a)
// or even
Foo::<A>(a)

In the future, once the type and trait namespaces are distinct, we could even name that trait, so that we have

fn Foo<T: Foo>(t: T) -> Foo;

where Foo (the trait) is any type that can be put into the Foo enum. Obviously, this all assumes the non-pathological case where Foo is (hygienically) a type union. Otherwise, you must use indices. I suppose the Foo trait could still exist but that feels a bit silly.
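For anyone who wants to play with the idea today, here is a minimal sketch that emulates the synthetic-trait constructor by hand in current Rust. Everything in it is an assumption for illustration: the payloads of A, B and C, the variant names V0/V1/V2, and the FooMember trait standing in for the unnameable `<synthetic>` trait. It relies on the fact that an enum only occupies the type namespace, so a function with the same name can live next to it.

```rust
// Hypothetical member types; the thread only calls them A, B and C.
#[derive(Debug, PartialEq)]
pub struct A(pub u8);
#[derive(Debug, PartialEq)]
pub struct B(pub bool);
#[derive(Debug, PartialEq)]
pub struct C(pub char);

// Hand-written stand-in for the proposed `enum Foo(A, B, C);`.
// The variant names V0/V1/V2 are made up for this sketch.
#[derive(Debug, PartialEq)]
pub enum Foo {
    V0(A),
    V1(B),
    V2(C),
}

mod sealed {
    // Supertrait in a private module: code elsewhere cannot implement it,
    // so the set of member types stays closed.
    pub trait Sealed {}
}

// Plays the role of the compiler-generated `<synthetic>` trait.
pub trait FooMember: sealed::Sealed {
    fn into_foo(self) -> Foo;
}

impl sealed::Sealed for A {}
impl sealed::Sealed for B {}
impl sealed::Sealed for C {}

impl FooMember for A {
    fn into_foo(self) -> Foo {
        Foo::V0(self)
    }
}
impl FooMember for B {
    fn into_foo(self) -> Foo {
        Foo::V1(self)
    }
}
impl FooMember for C {
    fn into_foo(self) -> Foo {
        Foo::V2(self)
    }
}

// The enum lives in the type namespace and this function in the value
// namespace, so both can be called `Foo`.
#[allow(non_snake_case)]
pub fn Foo<T: FooMember>(t: T) -> Foo {
    t.into_foo()
}
```

With this in place, both `Foo(A(1))` and `Foo::<B>(B(true))` type-check, which is roughly the call-site shape discussed above. It says nothing about how the real feature would treat generics or duplicated member types.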


As always, generics make you kind of sad:

let x = Either(foo);  // Hope that type inference can figure
                      // out R and L.
let x = Either::<u32>(foo);  // What does this mean...?
let x = Either::<u32, _>::<u32>(foo); // UHHHHHHHHH??
// aka Either::<u32, ?0>::<u32>(foo)

(Should the last one actually work if the inference variable ?0 is deduced to be u32?)

A nice thing about this feature is that we always have indices as a fallback; we can allow the "nice" construction form for the clearly non-pathological cases (concrete type unions) and allow more and more of them over time.


What about Either::<0>(foo)?

That looks like it could be confusing with const generics.

I agree with others who say that this is not sufficient motivation for a feature (checking boxes on type theory). What problem does this solve that isn't already solved by regular enums?

I think we can be more nuanced than this. Rust clearly could have said "what problem does if solve that isn't already solved by match?", but it didn't. (if is essentially positional syntax for matching where you don't need to write out the variant names, rather like is being discussed here.) It's true that filling out a tensor isn't in-and-of-itself a reason to do something, but the existence of the other features in that tensor that create that hole is evidence that there were analogous situations where something like this was worth having.
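To make the if/match analogy concrete, here is a tiny sketch (the function names are made up for illustration). Both functions do the same thing; the second spells out the two "variants" of bool by name, the first relies on position.

```rust
// `if`/`else` picks a branch by position: first block for true,
// second block for false.
pub fn sign_with_if(n: i32) -> &'static str {
    if n < 0 {
        "negative"
    } else {
        "non-negative"
    }
}

// The same logic as an explicit match on the bool, naming both cases.
pub fn sign_with_match(n: i32) -> &'static str {
    match n < 0 {
        true => "negative",
        false => "non-negative",
    }
}
```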

So I think I'd personally rather see this feedback more as "here's why, despite the example in the OP, this pattern for enums is materially less common than it is for structs, so even though it was worth it there, it's not worth it here" or "here's why the struct case was a big ergonomic win, and here's why the corresponding ergonomic win for the enum case with this proposal wouldn't be big enough to be worth introducing" or other things that the proponent here could respond to more directly and improve the proposal to address.


After re-reading the updated proposal with the new syntax I agree with the reasons behind the changes, but also feel that it lost a lot of the conciseness of the original version.

Comparisons

Comparing "vanilla" enums with tuple enums, declaration is still more concise with tuple enums:

// regular enums:
pub enum Expr {
    Array(ExprArray),
    Cast(ExprCast),
    ... 
// tuple enums:
pub enum Expr (
    ExprArray,
    ExprCast,
    ...

Matching, however, loses a lot of the benefits. Discrimination between members merely moves from outside the parens to inside.

// regular enums:
match expr {
    Expr::Cast(expr) => /* expr: ExprCast */,
    Expr::If(expr) => /* expr: ExprIf */,
    ... 
// tuple enums:
match expr {
    Expr::_(expr: ExprCast) => /* ... */,
    Expr::_(expr: ExprIf) => /* ... */,
    ...

Assignment is slightly nicer, but only if index-based syntax isn't required. Note: I'm not sure which variants of those below were intended in the proposal.

// regular enums:
let e = Expr::Cast(expr_cast);
// tuple enums:
let e: Expr = expr_cast;
let e = expr_cast as Expr;
let e = Expr::3(expr_cast);
let e = Expr::From(expr_cast);

Index-based syntax vs. editing

The more I think about index-based syntax the more I think it could create problems that are more pronounced with enums than with structs:

If I add a new entry in the middle of an enum, any subsequent entry will be unchanged with tags, whereas with index-based syntax I will have to update every instance of every following index in every file.

While this is true for index-based field access in structs as well, structs are more likely to be used with pattern-based destructuring. Furthermore, structs are very likely to have only a few members, whereas with enums this is not a given. The motivating syn example has over 40 entries. While currently they are all distinct, I wouldn't want to have to find the index of ExprTry if that should ever become necessary.

Furthermore, when creating an enum, such modifications might even create silent bugs:

// before editing:
struct S(u64, u64);
let s = S(1, 2);
enum E(u64, u64);
let e = E::1(5);
// adding a new entry, but forgetting to update the instantiation sites:
struct S(Added, u64, u64);
let s = S(1, 2);            // COMPILATION ERROR: wrong arity
enum E(Added, u64, u64);
let e = E::1(5);            // WHOOPS, still compiles!

Consider that the case where an enum has a duplicated type and thus is susceptible to mix-ups is also exactly the case where one would have a need to use the index-based syntax.

Avoiding index-based syntax

To address this, instead of using an index-based syntax I was going to suggest requiring tags for those entries that could lead to duplication, with some syntax sugar to smooth things out:

pub enum Foo<T: MyTrait, U>(
    First(u64),   // duplicated entry, order irrelevant
    u64,          // regular `u64` values go here
    T,            // `T` values go here. Or use auto-generated tag: Foo::T(7u64)
    Second(T),    // duplicate requires explicit tag
    Either<T, U>,  // assign directly, or use auto-generated tag: Foo::Either(...)
    Mirrored(Either<U, T>),            // may be duplicate if T == U, tag required
    Assoc(<T as MyTrait>::AssocType),  // explicit tag required
    ...

However, this gets increasingly close to regular enums, so I'm wondering if we couldn't rather extend the regular enum syntax. This would also side-step the enum { ... } vs. enum( ... ) issue.

pub enum Expr<M> {
    Sentinel,                                      // regular
    Error { kind: ErrorKind, message: String },    // regular
    use M,                 // equiv. to M(M) - see below
    Other(M),              // regular, tag required for disambiguation
    use ExprArray,         // equiv. to ExprArray(ExprArray) - see below
    use ExprCast,
    OtherCast(ExprCast),   // regular, required for disambiguation
    use ExprTry as Try,    // equiv. to Try(ExprTry)
    // ... more entries ...
}

let ec: ExprCast = ...;
let e1: Expr<ExprCast> = ec;                   // => `use ExprCast`
let e2: Expr<ExprCast> = Expr::OtherCast(ec);  // regular syntax for regular entries
let e3: Expr<ExprCast> = Expr::M(ec);          // ambiguous, tag required => `use M`
let e4: Expr<ExprCast> = Expr::Other(ec);      // => `Other(M)`.

let e5: Expr<u64> = 7u64;     // not ambiguous => variant `use M`

(Note: something along these lines was already suggested in the "structs as enum variants" thread)

Here, use <type> would auto-generate a tag with the same name as the type or type variable (e.g., use ExprTry is equivalent to ExprTry(ExprTry)). Tags of course still have to be unique.

Downside: this would make names of type variables a part of the API.

Where possible, for guaranteed non-ambiguous cases:

  • allow assignment of values of those types without a tag.
  • auto-implement std::convert::From<type>.

Matching: auto-generated tags remove the need for new match syntax. match blocks would rather use the generated tags throughout.
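As a rough illustration of what those `use` entries might expand to, here is a hand-written version in today's Rust. It is only a sketch: the payloads of ExprArray and ExprTry and the describe function are invented for the example, and the tag-free assignment (let e: Expr = value;) has no equivalent today, so .into() stands in for it.

```rust
// Hypothetical payload types; the fields are invented for the sketch.
#[derive(Debug, PartialEq)]
pub struct ExprArray(pub Vec<i64>);
#[derive(Debug, PartialEq)]
pub struct ExprTry(pub String);

#[derive(Debug, PartialEq)]
pub enum Expr {
    Sentinel,             // regular variant
    ExprArray(ExprArray), // what `use ExprArray` would generate
    Try(ExprTry),         // what `use ExprTry as Try` would generate
}

// The auto-implemented conversions for the unambiguous entries,
// written out by hand.
impl From<ExprArray> for Expr {
    fn from(v: ExprArray) -> Self {
        Expr::ExprArray(v)
    }
}
impl From<ExprTry> for Expr {
    fn from(v: ExprTry) -> Self {
        Expr::Try(v)
    }
}

// Matching uses the generated tags, so no new pattern syntax is needed.
pub fn describe(e: &Expr) -> &'static str {
    match e {
        Expr::Sentinel => "sentinel",
        Expr::ExprArray(_) => "array",
        Expr::Try(_) => "try",
    }
}
```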


Re syntax

  • especially regarding destructuring syntax

Original proposal:

enum Enum(Left, Right);
fn with_enum (expr: Enum) -> _
{
    match expr {
        left: Left => { ... },
        right: Right => { ... },
    }
}

current proposal:

enum Enum(Left, Right);
fn with_enum (expr: Enum) -> _
{
    use Enum::*;
    match expr {
        Enum::_(left: Left) => { ... },
        // or, thanks to the use above:
        _(right: Right) => { ... },
    }
}

Granted, the latter syntax is less nice, since it features less sugar, i.e., less magic.

As @scottmcm put it (although my post is answering most posts in this thread, not just theirs, of course :smile:):

But that's actually the point of the change. Let's not forget this is an RFC: the bigger or more magic the suggested change is, the lower its chances of being approved.

In other words, incremental changes are the way to go, so that we can also experiment with the feature and see what it actually wins us.

  • (and in an even more incremental fashion, we could even get away without the pattern-match type ascription (which is / could be an RFC on its own), by adding type inference hints within the => { ... } body).

That is, the initial syntax is not being rejected, it is just out of scope of this first proposal, but it would be the ideal candidate for a future proposal.


My personal interpretation of the above is that Tuple Enums, taken as this first draft, would have many advantages except when destructuring, where they would still be a bit cumbersome and offer little benefit compared to regular enums (especially if the ::_ elided variant notation were to be accepted for the latter, which would already be, imho, a big win for syn-like ASTs).

But that's fine! Indeed, that could be improved in a follow-up RFC, and there are still many other advantages with this proposal:

  • Overloaded constructor syntax

    justifying, in and of itself, the parenthesis syntax:

    enum MyEnum(Left, Right);
    // constructs:
    enum MyEnum { 0(Left), 1(Right) }
    // as well as: an **overloaded** `MyEnum` constructor
    const MyEnum: /* ... */ = /* ... */;
    impl Fn(Left) -> MyEnum for MyEnum { /* ... */ }
    impl Fn(Right) -> MyEnum for MyEnum { /* ... */ }
    
    // little magic, intuitive Rust mechanics 👌 
    

    This way, we can write:

    enum MyEnum(Left, Right);
    
    let my_enum =
        if random() {
            MyEnum(mk_left())
        } else {
            MyEnum(mk_right())
        }
    ;
    

    That, in and of itself, is already a neat syntactic / ergonomic win.

  • This can be enhanced with delegation

    (either through a community-provided macro, or built-in, with a follow-up RFC):

    use ::std::error::Error;
    
    macro_rules! throw { ($err:expr) => (
        return Err(ImplError($err));
    )}
    
    fn may_fail (...)
      -> Result<(), impl Error + Send + Sync + 'static>
    {
        #[delegate(Error)]
        enum ImplError (::std::io::Error, SomeOtherError);
    
        ::std::fs::some_func()
            .map_err(ImplError)? // or just `?` if `#[derive(From)]`
        ;
        if some_condition() { throw!(some_other_error); }
        Ok(())
    }
    
    • Thus leading to more ergonomic allocation-free "dynamic-like" error types.
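For comparison, this is roughly what one has to write by hand today to get the same allocation-free error type. Treat it as a sketch under assumptions: the variant names Io/Other, the contents of SomeOtherError, and the io_step parameter (standing in for the placeholder ::std::fs::some_func() call) are all invented here, and #[delegate(Error)] is not an existing attribute, so the forwarding impls are spelled out.

```rust
use std::error::Error;
use std::fmt;
use std::io;

// Hypothetical second error type.
#[derive(Debug)]
pub struct SomeOtherError;

impl fmt::Display for SomeOtherError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "some other error")
    }
}
impl Error for SomeOtherError {}

// Hand-written stand-in for `enum ImplError(io::Error, SomeOtherError);`.
#[derive(Debug)]
pub enum ImplError {
    Io(io::Error),
    Other(SomeOtherError),
}

// What the delegation would presumably generate: forward to the payload.
impl fmt::Display for ImplError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ImplError::Io(e) => write!(f, "{}", e),
            ImplError::Other(e) => write!(f, "{}", e),
        }
    }
}
impl Error for ImplError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            ImplError::Io(e) => e.source(),
            ImplError::Other(e) => e.source(),
        }
    }
}

// The conversions that make a bare `?` work.
impl From<io::Error> for ImplError {
    fn from(e: io::Error) -> Self {
        ImplError::Io(e)
    }
}
impl From<SomeOtherError> for ImplError {
    fn from(e: SomeOtherError) -> Self {
        ImplError::Other(e)
    }
}

// `io_step` stands in for the result of some fallible I/O call.
pub fn may_fail(io_step: io::Result<()>, some_condition: bool) -> Result<(), ImplError> {
    io_step?; // io::Error is converted to ImplError through From
    if some_condition {
        return Err(SomeOtherError.into());
    }
    Ok(())
}

// Same function with the concrete error type hidden from callers.
pub fn may_fail_opaque(
    io_step: io::Result<()>,
    some_condition: bool,
) -> Result<(), impl Error + Send + Sync + 'static> {
    may_fail(io_step, some_condition)
}
```

The boilerplate between the enum declaration and may_fail is the part the proposal (plus delegation) would take off your hands.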

Ambiguous duplicated types are out of the scope of this RFC, IIUC.

Nevertheless, I'd say that using tuple structs/enums to then prepend a new element is an antipattern, and can be a footgun with structs too:

// before:
#[derive(Default)]
struct S(u64, u64);
let s = S { 1: 42, ..Default::default() };

// after:
#[derive(Default)]
struct S(Added, u64, u64);
let s = S { 1: 42, ..Default::default() }; // WHOOPS, still compiles!
  • granted, such a pattern for construction is not that common, but the footgun is still there.

    And that pattern is definitely common in destructuring:

    let it: u64 = s.1;
    // or
    let S { 1: it, .. } = s;
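The struct footgun can be reproduced in current Rust, since tuple structs accept numeric field names in struct expressions. A small sketch, with Before/After standing in for the two versions of S (both names, and the payload of Added, are made up so that the two versions can sit side by side in one file):

```rust
// Hypothetical type that gets prepended later.
#[derive(Default, Debug, PartialEq)]
pub struct Added(pub u8);

// `S` before the edit.
#[derive(Default, Debug, PartialEq)]
pub struct Before(pub u64, pub u64);

// `S` after prepending `Added`.
#[derive(Default, Debug, PartialEq)]
pub struct After(pub Added, pub u64, pub u64);

pub fn build_before() -> Before {
    // Field 1 is the second u64 here.
    Before { 1: 42, ..Default::default() }
}

pub fn build_after() -> After {
    // Identical expression, still compiles, but field 1 is now the
    // first u64; the field that used to receive 42 silently stays 0.
    After { 1: 42, ..Default::default() }
}
```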
    

One problem with this is that use would accept a type in enums and an item everywhere else:

use std::borrow::Cow; // here, `use` accepts an item

enum Moo<'a> {
    use Cow<'a, str> as Cow, // here, `use` accepts a type
}

The other problem is that structural and generic types must be renamed with use X as Y, so they can be desugared; this is even more verbose than the current syntax Y(X). In my opinion, it makes the language more complex for very little gain; the same applies to the other ideas/proposals in this thread.


Here's another idea: Add a derive macro to implement From for all unambiguous tuple-like enum variants:

#[derive(FromVariants)]
enum Foo<'a> {
    Cow(Cow<'a, str>),
    String(String),
    Numbers(u8, u16, u32),

    A(i32),  // macro ignores variants
    B(i32),  // with ambiguous types

    C,               // macro ignores unit-like and
    D { inner: () }, // struct-like enum variants
}

// so these implementations are generated for the enum:

impl<'a> From<Cow<'a, str>>   for Foo<'a> {...}
impl<'a> From<String>         for Foo<'a> {...}
impl<'a> From<(u8, u16, u32)> for Foo<'a> {...}

If you have an enum where every variant contains a different type, then the macro gives you a From implementation for every variant.
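To show what the elided {...} bodies could look like, here is the expansion written out by hand. FromVariants is the hypothetical derive from this post, not an existing macro, so this is only a sketch of its possible output; the PartialEq/Debug derives are added just to make the result easy to check.

```rust
use std::borrow::Cow;

#[derive(Debug, PartialEq)]
pub enum Foo<'a> {
    Cow(Cow<'a, str>),
    String(String),
    Numbers(u8, u16, u32),

    A(i32), // skipped: same payload type as B
    B(i32),

    C,               // skipped: unit-like
    D { inner: () }, // skipped: struct-like
}

// Single-field tuple variants convert from their payload type.
impl<'a> From<Cow<'a, str>> for Foo<'a> {
    fn from(v: Cow<'a, str>) -> Self {
        Foo::Cow(v)
    }
}
impl<'a> From<String> for Foo<'a> {
    fn from(v: String) -> Self {
        Foo::String(v)
    }
}

// Multi-field tuple variants convert from the matching tuple.
impl<'a> From<(u8, u16, u32)> for Foo<'a> {
    fn from((a, b, c): (u8, u16, u32)) -> Self {
        Foo::Numbers(a, b, c)
    }
}
```

Call sites can then write Foo::from(value) or value.into() for the unambiguous variants, while A and B still need their explicit constructors.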

The existence of another feature also does not provide motivation for a completely unrelated feature. In fact, it's arguable that if let ... syntax is a net negative to the language because it makes it easier to ignore and not handle variants of an enum.

The OP Pre-RFC does not provide any motivation for the feature beyond "filling out the tensor" which I find odd. An RFC must have a motivation to have a chance of being accepted. None has been proposed (that I can see). I personally can't see any useful use-case. Also, the examples I've seen are the opposite of "ergonomic" IMHO.

Tuple structs are simply tuples with names. They really aren't "structs" at all. It would've been better had the language simply permitted creating "Named Tuple Types" (which is what they actually are). Instead of:

struct Foo ( Bar, Baz, Jazz )

it should've been

tuple Foo ( Bar, Baz, Jazz )

So, there really isn't "Structs with indexes instead of named components" in the language (despite how they're declared). What there really is is "Named Tuples", so this alone takes a lot of steam out of the argument that "because Tuple Structs exist, it makes sense to have Named Anonymous Variant Enums".

If one of the biggest wins of tuple enums is enabling the following:

let expr: Expr = /* ... */;
match expr {
    Expr::_(expr: ExprCast) => /* ... */,
    Expr::_(expr: ExprIf) => /* ... */,
    Expr::_(expr: ExprMethodCall) => /* ... */,
    /* ... */
}

Why not allow the use of _ for the enum the language already has, and rely on type inference to do the rest? For instance:

struct Foo(u16);
struct Bar(u32);

enum Qux {
    Foo(Foo),
    Bar(Bar),
}

// later

let qux: Qux = /* ... */;
match qux {
    Qux::_(foo: Foo) => /* ... */,
    Qux::_(bar: Bar) => /* ... */,
}

Skimming through other linked threads, it seems like this would kind of be like the type ascription in patterns shown by that comment.


I am not particularly in favour of this proposal (or against it), but could we please rename it to something like ‘positional enums’? Naming this construct after tuples is positively misleading.

