Concept RFC: Tuple Enums

CAD97 · August 5, 2020, 2:17am

Last updated 2020-05-08. See the edit history (the orange pencil) for details; there is a short changelog at the bottom.

Forked from Ideas around anonymous enum types: a potential new, non-anonymous enum form that I think could be exceedingly useful for certain design spaces.

Specifically, this is a proposal in the design space of "types as enum variants" (not "enum variants are types"). I am not wedded to the name "Tuple Enum"; it is chosen by analogy with Tuple Structs, but I am open to alternate naming that doesn't get too much into the weeds of type theory.

The RFC format is followed loosely here to introduce the concept in a semiformal manner. This is not yet really a proper RFC text nor even really "pre RFC" stage; this is just to get comments on the concept and proposal generally.

Additionally, please avoid discussing anonymous/structural enumerations in this thread. Ideas around anonymous enum types is where that discussion currently lives, and has a draft RFC for that feature.

Summary

Analogous to tuple structs (struct Foo(u32, u64);), which are product types with indexed fields, we add tuple enums (enum Foo(u32, u64);), which are pure sum types of indexed variants.

Guide-level explanation

Example

syn, the de-facto library for Rust syntax in procedural macros, uses what it describes as a syntax tree enum to store the typed syntax tree. For example, syn::Expr is roughly defined as

pub enum Expr {
    Array(ExprArray),
    Cast(ExprCast),
    If(ExprIf),
    MethodCall(ExprMethodCall),
    Tuple(ExprTuple),
    // many variants omitted
}

where each variant of the enum is just a tag around a separately defined type—what serde calls a newtype variant (as opposed to a tuple variant or a struct variant). The idiom for use is to match and rebind with a refined type for further processing:

let expr: Expr = /* ... */;
match expr {
    Expr::Cast(expr) => /* ... */,
    Expr::If(expr) => /* ... */,
    Expr::MethodCall(expr) => /* ... */,
    /* ... */
}
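As a minimal sketch of this idiom in today's Rust, the following uses tiny stand-in structs instead of syn's real node types. The fields `target_ty` and `has_else` and the function `describe` are illustrative assumptions, not part of syn's API.

```rust
// Hypothetical stand-ins for syn's node types; the real ones carry far more data.
#[derive(Debug)]
pub struct ExprCast {
    pub target_ty: String,
}

#[derive(Debug)]
pub struct ExprIf {
    pub has_else: bool,
}

// Each variant is a "newtype variant": a tag around a separately defined type.
#[derive(Debug)]
pub enum Expr {
    Cast(ExprCast),
    If(ExprIf),
}

// Match, then rebind `expr` with the refined type for further processing.
pub fn describe(expr: Expr) -> String {
    match expr {
        Expr::Cast(expr) => format!("cast to {}", expr.target_ty),
        Expr::If(expr) => format!("if (else branch: {})", expr.has_else),
    }
}
```

Inside each arm, `expr` shadows the outer binding and has the concrete node type, which is the "rebind with a refined type" step described above.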

As syn's syntax tree enums are "pure" sum types, in that each variant is a newtype variant, it could instead be written as a tuple enum to avoid the need to name each variant:

pub enum Expr (
    ExprArray,
    ExprCast,
    ExprIf,
    ExprMethodCall,
    ExprTuple,
    // many variants omitted
);

Matching on an Expr is now done by index:

let expr: Expr = /* ... */;
match expr {
    // NB: indexes are examples
    Expr::1(expr) => /* expr: ExprCast */,
    Expr::2(expr) => /* expr: ExprIf */,
    Expr::3(expr) => /* expr: ExprMethodCall */,
    /* ... */
}

More usefully, type ascription can be used in the pattern to infer which variant is meant:

let expr: Expr = /* ... */;
match expr {
    Expr::_(expr: ExprCast) => /* ... */,
    Expr::_(expr: ExprIf) => /* ... */,
    Expr::_(expr: ExprMethodCall) => /* ... */,
    /* ... */
}

When would I use this?

Any enum you'd write today with exclusively "newtype variants" of distinct types is a good candidate to be a tuple enum. It's a similar tradeoff to a regular struct versus a tuple struct; either would probably do in any case, but each one has certain cases where it can make more sense.

This is not "variant types"

In the variant types proposal, you might instead lift the ExprVariant types to be Expr::Variant variant types. This could look similar to

pub enum Expr {
    Array      { attrs: Vec<Attribute>, /* ... */ },
    Cast       { attrs: Vec<Attribute>, /* ... */ },
    If         { attrs: Vec<Attribute>, /* ... */ },
    MethodCall { attrs: Vec<Attribute>, /* ... */ },
    Tuple      { attrs: Vec<Attribute>, /* ... */ },
    // many variants omitted
}

However, there is one key difference between this proposal, "types as enum variants," and "enum variants are types." In "enum variants are types," e.g. Expr::Array is a type representing exactly the variant Array of Expr. This means that the size and alignment of each variant type are the same as those of the full enum, as the variant type is just the enum type with a known variant. (In other words, transmute::<&Enum::Variant, &Enum> is always sound.) With "types as enum variants," each type (e.g. ExprArray) is still its own type with its own size and alignment, which can differ from the enum's.
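The size and alignment point can be observed in today's Rust with an ordinary enum of newtype variants. This is only a sketch: `Small`, `Large`, and `Node` are hypothetical types chosen to make the difference obvious, and exact layouts are up to the compiler.

```rust
use std::mem::{align_of, size_of};

// Hypothetical member types of very different sizes.
pub struct Small(pub u8);
pub struct Large(pub [u64; 4]);

// "Types as enum variants": the enum wraps types that exist independently of it.
pub enum Node {
    Small(Small),
    Large(Large),
}

// Returns (size of the member type, size of the whole enum).
pub fn small_vs_enum_size() -> (usize, usize) {
    (size_of::<Small>(), size_of::<Node>())
}

// Returns (alignment of the member type, alignment of the whole enum).
pub fn small_vs_enum_align() -> (usize, usize) {
    (align_of::<Small>(), align_of::<Node>())
}
```

A standalone `Small` stays small, while `Node` must be large enough to hold a `Large`. Under "enum variants are types," a variant type for the small case would instead be as large as `Node` itself.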

This is not "type sets" / "type unions"

Simply, if you write enum Foo(u32, u32), this is distinct from enum Foo(u32) and from enum Foo(u32, u32, u32). When there is more than one variant of a given type, the type-refinement match syntax is not usable (as it would be ambiguous) and the indexed match syntax must be used instead.

Reference-level explanation

A brief type theory overview

A type is a set of values with that type. Additionally, any value has only one type. (E.g. the value 0u8 has the type u8, but is a distinct value from 0u16, which has type u16.)

A product type is the type that results from taking the product of two sets. Given type P with |P| potential values and type Q with |Q| potential values, the product type P×Q has |P| × |Q| potential values.

In Rust, a struct is a nominal product type (it is identified by its name) with named fields, a tuple struct is a nominal product type with indexed fields, and a tuple is a structural product type (it is identified by its structure) with indexed fields. (Structural records would be the fourth kind of type in the family: a structural product type with named fields.)

A sum type (or disjoint union or coproduct) is the type that results from taking the sum of two sets. Again given type P and type Q, the sum type P+Q has |P| + |Q| potential values.

In Rust, an enum is a sum type, but it is also more than a sum type. Instead, it's a sum type of anonymous unnamed product types for each variant. This is what that "variant types" proposal is exposing: the extra product type layer in the enum definition. In contrast, a tuple enum is just a sum type of existing types, and does not include the extra mechanics for introducing an extra product layer.

Strictly speaking, you cannot sum one set (type) with itself (e.g. u32 + u32), as the two sets are not disjoint (thus a sum being a disjoint union). A (type theoretical, not C) union type solves this by deduplication; the set u32 ∪ u32 is identical to the set u32. However, there is a standard solution to this problem -- just tag the values with from which set they originated.
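To make the sum versus union distinction concrete, here is a small sketch that enumerates both for bool with itself. The `Tagged` alias and the function names are illustrative only; the tag plays the role of "which set the value originated from."

```rust
use std::collections::HashSet;

// A tagged value: which summand it came from (0 or 1), plus the payload.
pub type Tagged = (u8, bool);

// Disjoint union (sum) bool + bool: tagging keeps the two copies apart.
pub fn sum_values() -> HashSet<Tagged> {
    let mut values = HashSet::new();
    for tag in [0u8, 1u8] {
        for payload in [false, true] {
            values.insert((tag, payload));
        }
    }
    values
}

// Set union of bool with bool: untagged values from both copies deduplicate.
pub fn union_values() -> HashSet<bool> {
    let left = [false, true];
    let right = [false, true];
    left.iter().chain(right.iter()).copied().collect()
}
```

The tagged sum has |bool| + |bool| = 4 values, while the union collapses to the 2 values of bool.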

For "struct like" enum (the existing enum), the difference between a sum and a union type doesn't matter, as it introduces the new product type wrapping any external types, and so every type summed is guaranteed distinct. Tuple enums follow the tagging behavior to keep the same behavior as an enum of just newtype variants (effectively, they desugar to them) and to maintain generic hygiene. In the simple case of nongeneric tuple enums with a known set of variants of disjoint types (which the author believes is the main use case for enums with anonymous members), there is no difference between a sum type and a union type.

Generic hygiene

Generic hygiene basically means that, given an Either type defined as a tuple enum

enum Either<Left, Right>(Left, Right);

and processing code of

fn example<Left, Right>(
    make_left: fn() -> Left,
    make_right: fn() -> Right,
) -> Either<Left, Right>
{
    let place: Either::<Left, Right>;
    if random() {
        place = Either::0(make_left());
    } else {
        place = Either::1(make_right());
    }
    match place {
        Either::_(_: Left ) => println!("left"),
        Either::_(_: Right) => println!("right"),
    }
    place
}

this is equivalent to the rewritten

fn example<Left, Right>(
    make_left: fn() -> Left,
    make_right: fn() -> Right,
) -> Either<Left, Right>
{
    let place: Either::<Left, Right>;
    if random() {
        place = Either::0(make_left());
        println!("left");
    } else {
        place = Either::1(make_right());
        println!("right");
    }
    place
}

as which arm is taken is based on the local type before monomorphization. This is the case no matter who constructs the enum; the arm is resolved to a specific index based solely on local type information.
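The same behavior can be checked in today's Rust by emulating the tuple enum with an ordinary enum. In this sketch the variant names `V0`/`V1` stand in for the proposed indices 0/1, a `choose_left` flag replaces the undefined `random()`, and `side` is a hypothetical helper that plays the role of the type-ascribed match.

```rust
// Emulation of `enum Either<Left, Right>(Left, Right);`; V0/V1 are hypothetical names.
pub enum Either<Left, Right> {
    V0(Left),
    V1(Right),
}

// The arm is fixed by variant index while the function is still generic,
// so instantiating with Left == Right cannot change which arm runs.
pub fn side<Left, Right>(place: &Either<Left, Right>) -> &'static str {
    match place {
        Either::V0(_) => "left",  // stands for `Either::_(_: Left)`
        Either::V1(_) => "right", // stands for `Either::_(_: Right)`
    }
}

pub fn example<Left, Right>(
    choose_left: bool,
    make_left: fn() -> Left,
    make_right: fn() -> Right,
) -> Either<Left, Right> {
    if choose_left {
        Either::V0(make_left())
    } else {
        Either::V1(make_right())
    }
}
```

Calling `example::<u32, u32>` still reports "left" or "right" according to which constructor ran, even though both sides carry the same type.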

As a second example, consider

enum Foo<T>(u8, T);

fn example<T>(foo: Foo<T>) {
    match foo {
        Foo::_(_: T) => println!("T"),
        Foo::_(_: u8) => println!("u8"),
    }
}

This is resolved to the indexed syntax of

enum Foo<T>(u8, T);

fn example<T>(foo: Foo<T>) {
    match foo {
        Foo::1(_: T) => println!("T"),
        Foo::0(_: u8) => println!("u8"),
    }
}

as arms are chosen based solely on local pre-monomorphization info. Even if a Foo<u8> is provided, for which type ascription matching is unusable due to being ambiguous on which variant is meant, the arms are unambiguous for this generic function, because each type is a different local type.
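A sketch of that resolved form in today's Rust, again using hypothetical variant names `V0`/`V1` in place of the indices 0/1:

```rust
// Emulation of `enum Foo<T>(u8, T);`; V0/V1 are hypothetical names.
pub enum Foo<T> {
    V0(u8),
    V1(T),
}

pub fn example<T>(foo: Foo<T>) -> &'static str {
    match foo {
        Foo::V1(_) => "T",  // stands for `Foo::_(_: T)`, resolved to index 1
        Foo::V0(_) => "u8", // stands for `Foo::_(_: u8)`, resolved to index 0
    }
}
```

Passing a `Foo<u8>` does not make the arms overlap: a value built as index 1 still takes the "T" arm, because the arms were resolved before any concrete type was substituted.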

Implementation details (effective desugaring)

The tuple enum enum Foo(A, B, C) has the same layout semantics as enum Foo { 0(A), 1(B), 2(C) } would, if integers were valid identifiers. Index-based matching is handled as if the tuple enum were a "struct like" enum with this definition.
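Since integers are not valid identifiers, the closest thing that compiles today is an ordinary enum with positional names. The sketch below assumes placeholder types `A`, `B`, `C` and hypothetical variant names `V0`, `V1`, `V2`.

```rust
pub struct A;
pub struct B(pub u8);
pub struct C(pub String);

// Emulation of `enum Foo(A, B, C);`: the hypothetical names V0, V1, V2 stand in
// for the integer variant names 0, 1, 2, which are not valid identifiers today.
pub enum Foo {
    V0(A),
    V1(B),
    V2(C),
}

// Roughly what a match on `Foo::0(..)`, `Foo::1(..)`, `Foo::2(..)` would mean.
pub fn index_of(foo: &Foo) -> usize {
    match foo {
        Foo::V0(_) => 0,
        Foo::V1(_) => 1,
        Foo::V2(_) => 2,
    }
}
```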

Index-based and type-ascription styles of matches may be mixed.

"Type ascription" matching relies on the locally expressed (potentially generic) type of each variant, an inferred variant index, and generalized type ascription in pattern position. If the variant's index is inferred, the resulting binding must^* have a locally evident type (either through a pattern with known type or pattern type ascription). That type is then compared against each variant type and the type of the enum itself. If exactly one^† variant type is found to match, that variant's index is used. If more than one variant's type matches, an error ("ambiguous match arm") is emitted.

^* It is also possible that a simple binding in this position could be assigned a type inference variable, such that its type could be inferred from usage rather than having to be ascribed. The author thinks that this might have more potential for confusion than it is useful, but remains open to the idea.

^† An alternative approach would say that given enum Foo(u32, u32) and the pattern Foo::_(x: u32), the pattern matches all variants with the type of u32.
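In today's Rust, that alternative reading corresponds to an or-pattern over every variant of the given type. A sketch with hypothetical variant names `V0`/`V1`:

```rust
// Emulation of `enum Foo(u32, u32);`; V0/V1 are hypothetical names.
pub enum Foo {
    V0(u32),
    V1(u32),
}

// Under the alternative reading, `Foo::_(x: u32)` would behave like this
// or-pattern over every variant whose type is `u32`.
pub fn payload(foo: Foo) -> u32 {
    match foo {
        Foo::V0(x) | Foo::V1(x) => x,
    }
}
```

This is well defined, but it discards which variant the value came from, which is the information the two separate variants presumably exist to carry.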

Interactions with `#[non_exhaustive]`

A #[non_exhaustive] tuple enum is treated the same way as a "struct like" enum: a fallback arm must be provided when matching over it (outside of the implementing crate) and variants may be added without breaking source compatibility. However, as an extra concern for tuple enums, any new variants must be added after the previous ones (as otherwise it would shift the index of existing variants) and no new variant may have the same type as an existing variant, to avoid breaking source compatibility.

For generic tuple enums, this unfortunately means that no variant of any previously existing type can be added without breaking source compatibility, because a consumer may already have instantiated a type parameter with that type. For example:

#[non_exhaustive]
pub enum Example<T>(T, u32);

// consumer
match value: Example::<i32> {
    Example::_(value: i32) => /* ... */,
    Example::_(value: u32) => /* ... */,
    _ => /* fallback arm required by #[non_exhaustive] */,
}

With a consumer like this in the wild, it is now impossible to add a new i32 variant to Example without breaking source compatibility.

The author of the RFC regrets this limitation but believes this is the only viable interpretation of #[non_exhaustive] for tuple enums, as the alternative is to disallow type-refinement syntax entirely and require the use of indexed syntax. Nongeneric #[non_exhaustive] tuple enums continue to work as expected.

Drawbacks

Makes the language larger, to add an indexed form to an existing nominal feature. (Would tuple structs be reasonable to add today if they weren't already in the language? The RFC author is unsure, but tentatively believes so.)
Any new enum extension is going to have to deal with the sum type / union type distinction. Current enum cleverly avoids this problem (accidentally?) due to explicitly introducing a new "type" layer (in the theoretical sense, currently), making sure that all variants are disjoint and there is no difference between a set sum or union.
Sum types of existing types are typically nontrivial to use when they go beyond just being a disjoint union.
Any remaining unknown unknowns.

Rationale

This adds the ability to express a new kind of type in the 2×2×2 matrix of

nominal vs. structural
product vs. sum
named vs. indexed members

of which three product types are currently expressible, and one sum type.

The rationale for tuple enums is therefore the same as it would be to add tuple structs to the language: adding more options for modeling data in types.

Specifically, the author believes that "types as enum variants" serves a very similar but distinct niche in data modeling to "enum variants are types," and that both are useful to data modeling of strongly typed trees such as syntax trees or other strongly typed tree-like data.

Alternatives

Don't extend enum at all, and just stick to the existing nominal/named member enums. They can do everything already, so any additions are just extra niceties on top.
While not strictly an alternative, the "variant types" proposal is an alternative proposal that addresses much the same use cases as tuple enums. The author believes both proposals can live alongside each other in the language, but accepting one also makes the other harder to justify purely on expressivity arguments. (Briefly, why both are useful: fine performance/behavior tuning, due to the differences in how variant types are sized/laid out.)
Anonymous enums are technically an alternative to named tuple enums (as they are theoretically the same thing, just structural rather than nominal) but are also a much more complicated feature, and much more likely to be confused with having union (deduplicating) semantics. Again, it's the tuple struct distinction; would we add tuple structs today, or just use type aliases of regular tuples?

Prior art

None that the author knows of specifically for sum types of existing types, as most languages with "ADT enums" always name the enum variants. TypeScript offers union types (e.g. string | number) but these have union semantics and effectively work via duck typing and downcasting. In fact, every language that supports downcasting (including Rust with dyn Any) supports union types; the general type upcast is a union of all potential downcastable subtypes.

If you know any languages that offer proper sum types of preexisting types, please share!

C++ std::variant (when using std::holds_alternative/std::get rather than std::visit/decltype/std::is_same_v access)?

Unresolved questions

Can and should type refinement syntax support matching further down beyond the tuple enum? How about when the tuple enum contains a tuple enum?
Should inferred variant indexes use a type variable and allow type inference or require unambiguous ascription? How unambiguous must the ascription be? (E.g. is Vec<_> enough or must it be Vec<T>?)
Simple syntax substitutions, for example:
- Instead of the pattern $path::_($pattern), use $path($pattern)
A nice construction syntax should be provided for the simple disjoint union case.
- Option: coerce from values of the variant type to values of the tuple enum type.
- Option: provide a magic constructor function (similar to tuple structs).
How exactly does a type refinement/ascription pattern interact with default binding modes?
Unknown unknowns.
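The construction question is left open here. As a point of comparison only, the disjoint-union case is often approximated in today's Rust with one `From` impl per variant type; the sketch below uses placeholder types and hypothetical variant names `V0`/`V1`, and is not a proposed design.

```rust
pub struct ExprCast;
pub struct ExprIf;

// Emulation of `enum Expr(ExprCast, ExprIf);`; V0/V1 are hypothetical names.
pub enum Expr {
    V0(ExprCast),
    V1(ExprIf),
}

// One `From` impl per variant type approximates "construct from a value of the
// variant type". This only works while the variant types are pairwise distinct.
impl From<ExprCast> for Expr {
    fn from(inner: ExprCast) -> Self {
        Expr::V0(inner)
    }
}

impl From<ExprIf> for Expr {
    fn from(inner: ExprIf) -> Self {
        Expr::V1(inner)
    }
}

pub fn index_of(expr: &Expr) -> usize {
    match expr {
        Expr::V0(_) => 0,
        Expr::V1(_) => 1,
    }
}
```

With two variants of the same type, the two `From` impls would conflict, which mirrors the ambiguity that type-refinement matching runs into.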

Future possibilities

This RFC proposed nominal sum types with indexed variants. After this RFC, the remaining kinds of types not expressible are

structural product types with named fields (Structural records)
structural sum types with named variants (unproposed)
structural sum types with unnamed variants (Ideas around anonymous enum types)

CC

cc @Centril, author of the structural records RFC and type theorist active here
cc @robinm @Jon-Davis @lordan, active in Ideas around anonymous enum types

Changelog

2020-05-08
- Changed sugared match syntax from expr: ExprArray to Expr::_(expr: ExprArray).
- Changed reference-level section's examples to be clearer.

steffahn · August 5, 2020, 3:06am

I don’t fully understand your syntax yet. If I have

struct A;
struct B;
struct C;
enum AB(A, B);
enum CA(C, A);
enum Foo(AB, CA);

How do I write a pattern that checks if my Foo contains an AB containing an A?

Edit: Or do I have to use these numeric indices for almost anything but the most basic single-level match? Also for constructing values of AB or Foo?

mjbshaw · August 5, 2020, 4:24am

I don't think it's necessarily ambiguous. It would be natural, I think, to allow the following:

enum Foo(u32, u32);
fn example(foo: Foo) {
    match foo {
        value: u32 => println!("{}", value),
    }
}

Both variants would match the value: u32 match clause.

tkaitchuck · August 5, 2020, 5:28am

It seems like this could be used to create some odd control flow shenanigans:

enum Size(u8, u16, u32, u64);

fn example<T>(foo: Size) {
    match foo {
        value: T => println!("Match {}", value),
        _ => println!("non-match"),
    }
}

CAD97 · August 5, 2020, 6:08am

I don't think type-refinement syntax can support pattern matching through multiple layers at a time (at least not when there are multiple possible options of the same type, anyway). I don't think you need to be able to use type-refinement through multiple layers for this to have utility, though.
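For the nested example asked about above, a fully indexed pattern remains expressible. Emulated in today's Rust with hypothetical variant names `V0`/`V1` in place of the indices, it might look like this:

```rust
pub struct A;
pub struct B;
pub struct C;

// Emulations of `enum AB(A, B);`, `enum CA(C, A);` and `enum Foo(AB, CA);`.
pub enum AB {
    V0(A),
    V1(B),
}

pub enum CA {
    V0(C),
    V1(A),
}

pub enum Foo {
    V0(AB),
    V1(CA),
}

// "Foo contains an AB containing an A", spelled with explicit indices at each layer.
pub fn is_ab_with_a(foo: &Foo) -> bool {
    matches!(foo, Foo::V0(AB::V0(A)))
}
```

A `Foo` holding a `CA` that holds an `A` does not match, which is why the layer has to be spelled out rather than inferred from the innermost type alone.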

(Tbh, I would expect tuple enums to be lightly used for situations like syn's syntax tree and normal enums to still be preferred for most cases.)

D'oh, I forgot to mention (or even consider) how values are constructed. In the unambiguous case, the three main obvious choices (beyond indexed, which should obviously work) are Into, coercion, and/or pseudo-overloaded (for generic hygiene) constructor functions (gasp).

I'm adding construction to Unresolved right now (even though it is of big importance) so I can look at it again tomorrow when I'm more awake.

mjbshaw:

I don't think it's necessarily ambiguous. It would be natural, I think, to allow the following:
enum Foo(u32, u32);
fn example(foo: Foo) {
    match foo {
        value: u32 => println!("{}", value),
    }
}

It's possible, but I'm not sure if it's desirable. At the very least, it definitely needs to be an error if the type is both the enum itself and a variant type (possible via indirection), and it probably should be a lint, as these are different variants of the enum presumably for a reason.

Barring a strong argument supporting it, though, I think it's probably better to disallow type refinement syntax when two variants are the same type.

...what? Size is a type, not a trait, what are you even trying to do here?

H2CO3 · August 5, 2020, 6:11am

That sounds like a massive footgun. It comes with all the problems of unnamed fields, and those are then further compounded by the fact that these "indexed types" are alternatives to each other, so type checking can't possibly catch the mistake if one screws up the indexing.

wiogit · August 5, 2020, 10:16am

Technical Section

CAD97:

// impl<Left, Right>
let place: Either;
if random() {
    place = make_left();
} else {
    place = make_right();
}
match place {
    place: Left  => println!("left"),
    place: Right => println!("right"),
}
the intuitive arm is always taken when that side of the Either is populated, even when Left and Right are the same type. This is accomplished simply by respecting sum type semantics rather than implementing union type semantics.

I'm not sure what you mean by the intuitive arm. When Left = Right = u32 and make_right() returns u32, does the compiler know to put the value into the Either::1/Right arm? As I understand it, the compiler would only know that you're assigning a u32 value to a (u32, u32) tuple enum, so it would have no way to resolve the match arm. Wouldn't there need to be a compiler error in such a scenario?

Philosophical Section

Type Theory Completionism As a Rationale

I don't think type theory is a good rationale for adding a feature. It is not a problem that the Rust type system doesn't check every box in the type theory matrix of possible types. To its credit, I think type theory is an important consideration for the design of a new feature, but not as a motivation for adding a feature.

Tuple Struct Rationale

Let's consider tuple structs for a bit. Are tuple structs an important feature? I see them as a very minor convenience with a very low implementation effort and little downside. From what I've seen tuple structs allow you to trade the clarity of field names for the brevity of positional fields. That brevity mainly comes into play during type definition, value constructing, and pattern matching. I would argue that field access isn't an improvement since self.0 is no briefer than self.a. If there is more to tuple structs than the clarity vs brevity trade off, let me know.

Tuple Struct vs Tuple Enum

Now let's consider tuple enums that have a duplicated type. The brevity benefits only apply to the type definition, and even then it is a weak benefit because we're talking about MyEnum(u32, i32) vs MyEnum { A(u32), B(i32) }. Pattern matching on MyEnum::0(x) is no briefer than MyEnum::A(x). Because you have to use meaningless indices, the tuple enum's reduced clarity downside is worse than any brevity benefits.

Type Union vs Tuple Enum

The type refinement feature and value construction brevity only apply when the tuple enum doesn't have a duplicated type. It is somewhat ironic because the compelling features of this RFC only apply to tuple enums with the properties of a type union. This suggests the merits of these features are merits of type unions rather than tuple enums.

dhm · August 5, 2020, 10:54am

CAD97:

Matching on an Expr is now done by type:

let expr: Expr = /* ... */;
match expr {
    expr: ExprCast => /* ... */,
    expr: ExprIf => /* ... */,
    expr: ExprMethodCall => /* ... */,
    /* ... */
}

This type-refinement syntax desugars to an equivalent syntax based on indexes:

let expr: Expr = /* ... */;
match expr {
    // NB: indexes are examples
    Expr::1(expr) => /* ... */,
    Expr::2(expr) => /* ... */,
    Expr::3(expr) => /* ... */,
    /* ... */
}

I like the idea, but the current non-indexed syntax seems a bit too magical for me.

What about:

let expr: Expr = /* ... */;
match expr {
    Expr::_(expr: ExprCast) => /* ... */,
    // or
    Expr::_(expr) => { let _: ExprIf = expr; /* ... */ },
    /* ... */
}

The idea would be to put even more emphasis on the indexed nature of these Tuple Enums, but rely on a pattern-elision-lookalike mechanic to elide the actual indexes thanks to type checking (I think that in practice there is no way for the compiler to use type-checking to affect pattern matching, so the technical aspect may be complicated).

Then, type ascription within a pattern would be a way to help this type checking, but the advantage is that it would no longer be a required feature of the language for it to support Tuple Enums.

CAD97:

and processing code of

// impl<Left, Right>
let place: Either;
if random() {
    place = make_left();
} else {
    place = make_right();
}
match place {
    place: Left  => println!("left"),
    place: Right => println!("right"),
}

What about:

trait Is { type EqTo : ?Sized; }
impl<T : ?Sized> Is for T { type EqTo = Self; }

fn ...<Left, Right : Is> ()
where
    Right : Is<EqTo = Left>, // imagine toggling this line on and off
{
    let place: Either<Left, Right>;
    if random() {
        place = Either::0(make_left()); // use indices otherwise it's ambiguous
    } else {
        place = Either::1(make_right());
    }
    match place {
        // technically not ambiguous by "syntactic identity 🌊👋"
        place: Right => println!("right"),
        // Neither "syntactically identical" to Right nor Left, semantically equal to both, error?
        place: <Right as Is>::EqTo  => println!("left"),
    }
}

We could say that the moment the Right : Is<EqTo = Left> constraint appears, the trait solver is able to resolve that to Right = Left and thus error on an ambiguous pattern match.

What worries me is that the trait solver is currently not always able to perform this reduction. Depending on the syntactic path used to refer to a type, a current version of the Rust compiler could fail to see the two types as equal and thus allow the match, only for a later version of the compiler, with a more capable trait system, to resolve the type identity and error on this match.

mjbshaw · August 5, 2020, 12:59pm

I don't have a strong opinion on whether or not enum Foo(u32, u32) should be match-able. But I think it would be nice for a future RFC draft to mention this and argue its stance, since it's not really ambiguous in the technical sense.

lordan · August 5, 2020, 1:26pm

Just some quick impressions and questions

It wasn't clear to me at first, but am I understanding the proposal right that this is essentially a new take on the "structs as entity variants" approach (see, e.g., this old thread)?

Especially for cases with duplicated types, this proposal seems safer than anonymous enums: i.e., in an enum Foo(Thing, Thing) I know the intended order is Foo-like, whereas one source of an anonymous enum(Thing, Thing) may or may not produce the same index order as another source.

Can index-based TupleEnum::0 be used as a constructor function like with regular enum tags?

Can you use explicit discriminant values with this proposal? If yes, there might be confusion between the positional index and the actual discriminant value.

Looking at the linked syn code makes me want to have ECS (entity component system) support in the language...

CAD97 · August 5, 2020, 3:24pm

Here, the intent was that make_left and make_right both return Either so there wouldn't be an issue (though I did fail to make this clear).

No. Rust generics are not templates. The semantics of the function are resolved while it's still generic, and monomorphizations cannot impact that. This is in direct contrast to e.g. C++ templates, where name and semantics resolution are done after concrete types have been substituted in, resulting in a loss of what I call Generic Hygiene here.

To build a full example (assuming construction by coercion, which is not guaranteed nor settled):

fn example<Left, Right>(
    make_left: impl Fn() -> Left,
    make_right: impl Fn() -> Right
) {
    let place: Either::<Left, Right>;
    if random() {
        place = make_left();
    } else {
        place = make_right();
    }
    match place {
        place: Left => println!("left"),
        place: Right => println!("right"),
    }
}

this is resolved while still generic to

fn example<Left, Right>(
    make_left: impl Fn() -> Left,
    make_right: impl Fn() -> Right
) {
    let place: Either::<Left, Right>;
    if random() {
        place = Either::0(make_left());
    } else {
        place = Either::1(make_right());
    }
    match place {
        Either::0(place) => println!("left"),
        Either::1(place) => println!("right"),
    }
}

using local type information. There is no resolution of semantics that depends on monomorphization.

I don't disagree; I just think that the theory helps extend the rationale of the utility of the feature that I believe is there. (Also, as a selfish side benefit: if nominal sum types beyond the existing enum are too complicated to add, then structural sum types are definitely too complicated to justify and I can finally stop trying to clarify and guide discussion around them.)

I think this is a fair assessment of the indexed/named member tradeoff, but would add that indexed fields are also ideal for when the fields have no meaningful name. Most of the time this also means that the struct itself has no meaningful name (thus should be a tuple).

The exact same argument applies to tuple structs: the brevity benefit is just MyStruct(u32, i32) vs MyStruct { a: u32, b: i32 }, because indexing with indexes is no more brief than short names, making the brevity of tuple structs as much more brief than full structs as tuple enums are than full enums.

The brevity doesn't come from the actual number of tokens, imho, it comes from not needing to bother naming the members.

If it were possible to forbid tuple enums that aren't just "disjoint unions," I probably would; the reason that I don't is that it's not possible, due to black-box generics / generic hygiene.

All of the "ability to use" around being an ADT sum type and not an ADT union type is about making it possible to use in these edge cases. The target niche for tuple enums is small nongeneric enums with disjoint types, where the difference between sum and union doesn't matter.

dhm:

I like the idea, but the current non-indexed syntax seems a bit too magical for me.

What about:
let expr: Expr = /* ... */;
match expr {
    Expr::_(expr: ExprCast) => /* ... */,
    // or
    Expr::_(expr) => { let _: ExprIf = expr; /* ... */ },
    /* ... */
}
The idea would be to put even more emphasis on the indexed nature of these Tuple Enums, but rely on a pattern-elision-lookalike mechanic to elide the actual indexes thanks to type checking (I think that in practice there is no way for the compiler to use type-checking to affect pattern matching, so the technical aspect may be complicated).

I like this, and it nicely sidesteps the nesting issues with refinement syntax. I don't know the viability of having the variant type as an inference variable, but it definitely would be convenient if it were to work. Refinement syntax could also always be added later if desirable. It's also less different, so makes the proposal easier to digest; the specific syntax is not what I consider important here.

I think this is definitely a tricky edge case; my intuition would be that this is the Left case (because the visible trait impl is Is<EqTo = Left>). The other possible solution is just to disallow use of type projection (is that the right term?) on type variables for the purpose of tuple enum pattern type ascription.

Understood. I'm updating the OP with that and the other things I've addressed in this post.

Yes, this is at least in the same design space. That post falls somewhere halfway between what I refer to as "enum variants are types" and "types as enum variants". I think that just that proposal of sticking external types in an enum alongside variant types is not reasonably possible, though, due to the difference in layout of the type externally to the enum and internally.

Yes. enum TupleEnum(A, B); should be useable in all the ways enum TupleEnum { 0(A), 1(B) } would be if those were valid variant names for a regular enum.

I would say no, due to the exact point you've stated. The discriminant should be the index.

Not for this thread, but I'd be interested to hear how you think ECS could be applied to strongly typed trees. It's my personal opinion that syntax trees which represent string input should generally have weaker typed trees, like rust-analyzer's Rowan, C#'s Roslyn, or Swift's libSyntax, but strongly typed trees still exist (such as in-memory IR or syn, even, as its primary use case is proc-macro manipulation of valid source code) and I'm definitely interested in ways of representing them beyond "sea of newtype variant / tuple enums" and Rowan's "typed view of homogeneous^† tree."

^† except for the fact that Rowan does have a different representation for internal nodes and for leaf nodes, I suppose.

tkaitchuck · August 5, 2020, 4:26pm

Oops. Edited original post.

CAD97 · August 5, 2020, 4:50pm

This wouldn't be allowed, as T is not a member variant of Size.

Aloso · August 5, 2020, 5:58pm

A better example would be

enum Foo<T>(u8, T);

fn example<T>(foo: Foo<T>) {
    match foo {
        _: T  => println!("first match arm"),
        _: u8 => println!("second match arm"),
    }
}

I guess this isn't possible under the current proposal, since T and u8 might be the same type, and the type refinement syntax can't be used if the variant types overlap. Making code like this illegal makes sense, because it can easily cause bugs. It's as if Some(None) was treated the same as None.

One problem I have with the proposal is that this enum:

pub enum Expr (
    ExprArray,
    ExprCast,
    ExprIf,
    ExprMethodCall,
    ExprTuple,
    // many variants omitted
);

Looks very similar to this, which has a different meaning:

pub enum Expr {
    ExprArray,
    ExprCast,
    ExprIf,
    ExprMethodCall,
    ExprTuple,
    // many variants omitted
}

The other problem is that the enum index matching syntax is not very readable. I would be more comfortable if it were desugared like so:

enum TupleEnum(u32, String);
// is desugared to
enum TupleEnum {
    u32(u32),
    String(String),
}

Unfortunately this doesn't work for tuples and other structural types, or when an enum contains both Vec<String> and Vec<u32>, so I guess this is not an option.

scottmcm · August 5, 2020, 6:12pm

CAD97:

Matching on an Expr is now done by type:

let expr: Expr = /* ... */;
match expr {
    expr: ExprCast => /* ... */,
    expr: ExprIf => /* ... */,
    expr: ExprMethodCall => /* ... */,
    /* ... */
}

I agree that this has to be the syntax. I wish we could change IpAddr to be this as well -- that's my usual example from std of this.
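For reference, this is roughly what the IpAddr case looks like in today's Rust, where the variant names V4 and V6 only repeat what the payload types already say. The function name `describe_addr` is made up for this sketch.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

// Today's idiom: match on the named variant, then rebind the payload
// to work with the refined type (`Ipv4Addr` or `Ipv6Addr`).
fn describe_addr(addr: IpAddr) -> String {
    match addr {
        IpAddr::V4(v4) => format!("v4 with first octet {}", v4.octets()[0]),
        IpAddr::V6(v6) => format!("v6 with first segment {:x}", v6.segments()[0]),
    }
}
```

Under the type-matching syntax quoted above, the two arms would presumably be written against `Ipv4Addr` and `Ipv6Addr` directly, with no separate variant names.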

Note that, for tuple structs, the desugaring to indexes is literal enough that you can do struct Foo(u32); and let x = Foo { 0: 3 };. Would you expect the manually-desugared form to work here too?
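A small sketch of the tuple struct behaviour being referred to, which compiles in current Rust. The function name `build_both_ways` is only for illustration.

```rust
struct Foo(u32);

// Builds the same tuple struct with the usual call-like syntax and with
// the literal index-as-field-name form, then reads both back.
fn build_both_ways() -> (u32, u32) {
    let sugared = Foo(3);
    let desugared = Foo { 0: 3 };

    // The index also works as a field name in patterns.
    let Foo { 0: inner } = desugared;

    (sugared.0, inner)
}
```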

nit: please be explicit about which arm is meant here, since the reference section shouldn't rely on intuition.

Is this the user-visible syntax for this? If so, I'm not a fan because I feel like I shouldn't have to specify the index on construction, just like how I don't have to specify it on destruction.

CAD97 · August 5, 2020, 6:22pm

OP is updated with feedback so far! OP has a changelog and the edit view is fairly readable.

Aloso:

A better example would be
enum Foo<T>(u8, T);

fn example<T>(foo: Foo<T>) {
    match foo {
        _: T  => println!("first match arm"),
        _: u8 => println!("second match arm"),
    }
}
I guess this isn't possible under the current proposal, since T and u8 might be the same type,

No, this is allowed, as T and u8 are distinct local types. I've integrated this example into the section on generic hygiene in the OP; it's a good example.

The important takeaway is that semantics are determined while the type variables are still placeholders, not a concrete type.
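As a rough illustration in today's Rust (not the proposed syntax), a regular enum with hypothetical variant names `V0` and `V1` already behaves this way: the arm is chosen by which variant was constructed, even when `T` is later instantiated as `u8`. The name `which_arm` is invented for this sketch.

```rust
// Stand-in for the proposed `enum Foo<T>(u8, T);`.
enum Foo<T> {
    V0(u8),
    V1(T),
}

// The arms are resolved while `T` is still a placeholder, so this
// function means the same thing for every instantiation of `T`.
fn which_arm<T>(foo: &Foo<T>) -> &'static str {
    match foo {
        Foo::V0(_) => "u8 arm",
        Foo::V1(_) => "T arm",
    }
}
```

Calling `which_arm(&Foo::<u8>::V1(1))` selects the `T` arm even though the payload is a `u8`, which is the `Some(None)` versus `None` distinction being preserved.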

Aloso:

One problem I have with the proposal is that this enum:
pub enum Expr (
    ExprArray,
    ExprCast,
    ExprIf,
    ExprMethodCall,
    ExprTuple,
    // many variants omitted
);
Looks very similar to this, which has a different meaning:
pub enum Expr {
    ExprArray,
    ExprCast,
    ExprIf,
    ExprMethodCall,
    ExprTuple,
    // many variants omitted
}

This definitely is unfortunate. Tuple structs don't have that problem, as struct Foo { A, B } isn't valid syntax. I don't have a proposed resolution right now, unfortunately.

@scottmcm we raced here, the OP has been updated addressing most of those points.

scottmcm:

CAD97:
Matching on an Expr is now done by type:
let expr: Expr = /* ... */;
match expr {
    expr: ExprCast => /* ... */,
    expr: ExprIf => /* ... */,
    expr: ExprMethodCall => /* ... */,
    /* ... */
}
I agree that this has to be the syntax.

The main proposal currently uses a "worse", more explicit syntax for the wrapping (the syntax isn't a core part of the proposal, and refinement matching can always be added back later).

Yes, I'd expect the same property to hold for tuple enums. Having just the basic syntax is where I started, and the nicer syntax is to make the feature more usable.

I rewrote that section racing with your post to make it a lot clearer.

It's an available syntax (the basic, desugared form) but I'm still undecided on what the "main, nice" syntax for the simple disjoint union case should be. I'm not excited to propose coercing values of variant types into values of the tuple enum type. The main potential option is, in analogue to tuple structs, to provide an overloaded construction function:

enum Foo(A, B);
// multiple implementations of `Fn`
fn Foo(a: A) -> Foo { Foo::0(a) }
fn Foo(b: B) -> Foo { Foo::1(b) }

but this still has drawbacks, so a nicer construction syntax remains an unresolved question.

Also, I don't think that syntax could be a proper function and still respect generic hygiene the way that I would expect it to need to in order to behave properly in edge cases, so it'd need to be magic rather than just multiple Fn implementations.
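For comparison, the closest thing available in today's Rust is probably a pair of `From` impls. This is a different technique from the overloaded `fn Foo` sketched above, shown only as an approximation; `V0`, `V1`, the payload types and `variant_index` are hypothetical names for this example.

```rust
struct A(u32);
struct B(&'static str);

// Stand-in for the proposed `enum Foo(A, B);`.
enum Foo {
    V0(A),
    V1(B),
}

// One conversion per variant type plays the role of the overloaded
// constructor: the argument type selects the variant.
impl From<A> for Foo {
    fn from(a: A) -> Self {
        Foo::V0(a)
    }
}

impl From<B> for Foo {
    fn from(b: B) -> Self {
        Foo::V1(b)
    }
}

// Helper for observing which variant a conversion produced.
fn variant_index(foo: &Foo) -> usize {
    match foo {
        Foo::V0(_) => 0,
        Foo::V1(_) => 1,
    }
}
```

With this, `Foo::from(A(1))` and `B("x").into()` pick the variant from the argument type. It only works while the variant types are known to be distinct: for something like `Foo<T>(u8, T)`, the two impls would overlap at `T = u8` and be rejected by the coherence rules, which is the same kind of edge case as the generic hygiene concern.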

scottmcm · August 5, 2020, 6:43pm

Hmm, that's odd to me. Because the thing I liked here was that it was mostly just syntax -- having a desugar, to me, says that this is avoiding semantic changes and just providing a less-verbose way to do something you already could.

If the consumption syntax is changing to Expr::_(expr: ExprCast) => /* ... */, then consider maybe removing the whole "tuple enum syntax" entirely. If allowing inference for the variant type becomes the core here, it could work with IpAddr. And then if it worked only in more-restricted situations, that'd be ok because there'd be no expectation that it'd work because it came from a specific definition style.

lordan · August 5, 2020, 6:52pm

Shouldn't that example rather read:

enum Foo(A, B);
// multiple implementations of `Fn`
fn Foo(a: A) -> Foo { Foo::0(a) }
fn Foo(b: B) -> Foo { Foo::1(b) }

?

CAD97 · August 5, 2020, 6:52pm

whoops yes

lordan · August 5, 2020, 7:04pm

Regarding the similarity between tuple enum syntax and regular enum syntax: given that this doesn't actually construct a (product type) tuple, and instead only ever holds (and is constructed from) a single value, perhaps it's not necessary to follow the tuple struct syntax analogue, which would avoid the pitfalls. Bikeshedding:

pub enum ExprFoo<T> for ExprArray, ExprCast, ExprIf, T, u64, u64, (u64, T);
