Thoughts about enums

porky11 · January 25, 2021, 9:59pm

I just thought about enums. I might want to make an RFC, if some of these ideas sound useful.

What if we interpreted structs as single field enums?

A simple struct in rust looks like this:

struct Struct [...]

[...] can either be a tuple like (val1, val2, ...), a named parameter list like { name1: val1, name2: val2, ... }, or nothing.

When interpreting it as an enum, it could look like this:

enum Struct {
    <default-variant-name> [...]
}

One mechanism for marking a variant as the default variant could be its name. I would suggest giving it the same name as the struct:

enum Struct {
    Struct [...]
}

Everything that works for a struct would then have to work for enums of this kind.

It might also work for other enums, such as single-variant enums with a different variant name, or enums where only one of the variants is called <default-variant-name>. An even simpler rule could always treat the first variant as the default variant.

So the default enum variant can be constructed in these two ways:

Struct [...]

Struct::<default-variant-name> [...]

Derives

Custom derives that work for structs would then also have to work for enums. For example, deriving Default does currently not work for enums. With this change, it would just derive Default for the default variant.
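For comparison, here is a minimal sketch of what newer compilers already allow: since Rust 1.62, `#[derive(Default)]` accepts an enum when one unit variant is marked `#[default]`. The `Mode` enum and its variants are made up for illustration.

```rust
// Requires Rust 1.62 or newer for `#[default]` on enum variants.
#[allow(dead_code)]
#[derive(Debug, Default, PartialEq)]
enum Mode {
    // The attribute is only accepted on unit variants.
    #[default]
    Idle,
    Running(u32),
}

fn main() {
    // The derived impl returns the variant marked `#[default]`.
    assert_eq!(Mode::default(), Mode::Idle);
}
```

Data-carrying variants cannot be marked this way, so a "default variant" in the sense of this post would still go further than the existing derive.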

Also writing your own custom derives will get simpler (assuming you can easily convert struct definitions to enum definitions for parsing). You don't have to write special cases for enums and structs. The version for enums will also work for structs. And if you don't want them to work with real enums, you could just restrict them to enums with a single variant.

Joining enums

Sometimes you might want to have similar enum variants in multiple enums. These enums might be subsets of other enums.

enum RoundShape {
    Circle(f32),
    Ellipse {
        w: f32,
        h: f32,
    },
    RoundedRectangle {
        w: f32,
        h: f32,
        rad: f32,
    },
}

enum RectangularShape {
    Square(f32),
    Rectangle {
        w: f32,
        h: f32,
    },
    RoundedRectangle {
        w: f32,
        h: f32,
        rad: f32,
    },
}

enum Shape = RoundShape | RectangularShape;

The last line would become something like this:

enum Shape {
    Circle(f32),
    Ellipse {
        w: f32,
        h: f32,
    },
    Square(f32),
    Rectangle {
        w: f32,
        h: f32,
    },
    RoundedRectangle {
        w: f32,
        h: f32,
        rad: f32,
    },
}

A variant that appears in both enums is merged into a single variant. If the fields of the shared variant differ, it's an error.

Besides it could define some converters like this:

impl From<RoundShape> for Shape {
    ...
}

impl TryFrom<Shape> for RoundShape {
    ...
}

impl From<RectangularShape> for Shape {
    ...
}

impl TryFrom<Shape> for RectangularShape {
    ...
}

This simplifies sharing enums.
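As a rough sketch of what such generated converters could look like when written by hand in today's Rust (the enums are cut down to keep it short, and using the original `Shape` as the error type is an assumption, not part of the proposal):

```rust
use std::convert::TryFrom;

#[derive(Debug, Clone, PartialEq)]
enum RoundShape {
    Circle(f32),
    Ellipse { w: f32, h: f32 },
}

#[derive(Debug, Clone, PartialEq)]
enum Shape {
    Circle(f32),
    Ellipse { w: f32, h: f32 },
    Square(f32),
}

// Widening always succeeds: every RoundShape variant exists in Shape.
impl From<RoundShape> for Shape {
    fn from(s: RoundShape) -> Self {
        match s {
            RoundShape::Circle(r) => Shape::Circle(r),
            RoundShape::Ellipse { w, h } => Shape::Ellipse { w, h },
        }
    }
}

// Narrowing can fail, so it is a TryFrom impl.
impl TryFrom<Shape> for RoundShape {
    // Assumption: hand the unconverted value back on failure.
    type Error = Shape;

    fn try_from(s: Shape) -> Result<Self, Self::Error> {
        match s {
            Shape::Circle(r) => Ok(RoundShape::Circle(r)),
            Shape::Ellipse { w, h } => Ok(RoundShape::Ellipse { w, h }),
            other => Err(other),
        }
    }
}

fn main() {
    let shape: Shape = RoundShape::Circle(1.0).into();
    assert_eq!(shape, Shape::Circle(1.0));
    assert_eq!(
        RoundShape::try_from(Shape::Square(2.0)),
        Err(Shape::Square(2.0))
    );
}
```

Implementing `TryFrom` on the narrower type also provides `TryInto` on `Shape` through the standard library's blanket impl.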

Using both features together

Example

Often you would like to be able to use enum variants as structs. For example in an enum like this:

enum Shape {
    Circle(f32),
    Ellipse {
        w: f32,
        h: f32,
    },
    Square(f32),
    Rectangle {
        w: f32,
        h: f32,
    },
    RoundedRectangle {
        w: f32,
        h: f32,
        rad: f32,
    },
}

Since this is not possible in current rust, you would end up using something like this:

struct Circle(f32);
struct Ellipse {
    w: f32,
    h: f32,
}
struct Square(f32);
struct Rectangle {
    w: f32,
    h: f32,
}
struct RoundedRectangle {
    w: f32,
    h: f32,
    rad: f32,
}

enum Shape {
    Circle(Circle),
    Ellipse(Ellipse),
    Square(Square),
    Rectangle(Rectangle),
    RoundedRectangle(RoundedRectangle),
}

That's probably an extreme example. In most cases you would just replace a few variants this way.
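A small sketch of how this wrapper pattern is typically used: each match arm binds the whole struct, so methods defined on the struct are available. The `area` methods and the free `area` function are made up for illustration.

```rust
struct Circle(f32);
struct Rectangle {
    w: f32,
    h: f32,
}

impl Circle {
    fn area(&self) -> f32 {
        std::f32::consts::PI * self.0 * self.0
    }
}

impl Rectangle {
    fn area(&self) -> f32 {
        self.w * self.h
    }
}

// Every variant wraps the struct of the same name.
enum Shape {
    Circle(Circle),
    Rectangle(Rectangle),
}

fn area(shape: &Shape) -> f32 {
    match shape {
        // `circle` is a `&Circle`, `rect` is a `&Rectangle`.
        Shape::Circle(circle) => circle.area(),
        Shape::Rectangle(rect) => rect.area(),
    }
}

fn main() {
    let rect = Shape::Rectangle(Rectangle { w: 2.0, h: 3.0 });
    assert_eq!(area(&rect), 6.0);
    let circle = Shape::Circle(Circle(1.0));
    assert!((area(&circle) - std::f32::consts::PI).abs() < 1e-6);
}
```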

Most features of enums now go unused. Every variant holds a single value, and the type of that value has the same name as the variant.

Alternatively you could define your enum like before and maybe add some converters from and to every single type.

But when you want to add a field to one struct, you would have to add the same field to all of your enums (you might want to have more than one of these enums.)

Solution

So when structs are interpreted as single field enums, you can do something like this:

struct Circle(f32);
struct Ellipse {
    w: f32,
    h: f32,
}
struct Square(f32);
struct Rectangle {
    w: f32,
    h: f32,
}
struct RoundedRectangle {
    w: f32,
    h: f32,
    rad: f32,
}

enum RoundShape = Circle | Ellipse | RoundedRectangle;
enum RectangularShape = Square | Rectangle | RoundedRectangle;
enum Shape = Circle | Ellipse | Square | Rectangle | RoundedRectangle;

This might be a good alternative to enum variant types.

More ideas

Restricted

A restricted version could allow enums defined this way to accept only real structs as arguments. It would not require interpreting structs as enums.

So the solution example would still be possible and useful.

When thinking about it, this might even be a better solution.

Tuple matching

Splitting struct name and fields to tuples or named tuples when matching could make this even more elegant. You just don't specify the argument:

enum Enum {
    Unit,
    Tuple(...),
    Named {...},
}

use Enum::*;

let some_enum: Enum = ...;
match some_enum {
    Unit unit => process_unit(unit),
    Tuple tuple => process_tuple(tuple),
    Named named => process_named(named),
}

This match would be the same as this:

match some_enum {
    Unit => process_unit(()),
    Tuple(a, b, ...) => process_tuple((a, b, ...)),
    Named { a, b, ... } => process_named( { a: a, b: b, ... } ),
}

The benefit is that it might be possible to change what kind of struct a struct is without having to change anything in the match.

This kind of matching would also apply to structs, not only enum variants.

The problem is, we don't support named tuples.

Struct matching

If an enum is defined using the restricted method of defining new enums, a similar matching method would work:

struct Unit;
struct Tuple(...);
struct Named {...}

enum Enum = Unit | Tuple | Named;

let some_enum: Enum = ...;
match some_enum {
    unit: Unit => unit.process(),
    tuple: Tuple => tuple.process(),
    named: Named => named.process(),
}

Specifying the struct type of the variant is only possible if the enum is defined from multiple structs, and only necessary if you want to get the whole referenced struct.

You could still use the old way of pattern matching.

When I see this, I wonder why Rust hasn't done it that way from the beginning. Probably because it would be too difficult to implement C-like enums that way.

Conclusion

I think none of these ideas would be a breaking change to Rust. But my favorite addition is a rather small change. Enums are still defined like they currently are, but you can use structs to import new enum variants named after the structs. In addition, you can match on the struct types of the imported variants.

struct StructUnit;
struct StructTuple(...);
struct StructNamed {...}

enum Enum {
    use StructUnit;
    use StructTuple(...);
    use StructNamed {...};

    EnumUnit,
    EnumTuple(...),
    EnumNamed {...},
}

let some_enum: Enum = ...;
match some_enum {
    unit: StructUnit => unit.process(),
    tuple: StructTuple => tuple.process(),
    named: StructNamed => named.process(),
    Enum::EnumUnit => process_enum_unit(),
    Enum::EnumTuple(a, b, ...) => process_enum_tuple(a, b, ...),
    Enum::EnumNamed { a, b, ... } => process_enum_named(a, b, ...),
}

This might even be compatible with RFC #2593

scottmcm · January 25, 2021, 10:25pm

Is there anything coming to mind that doesn't work like this? Structs and single-variant enums already work pretty much identically...
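One place where the two already behave the same is destructuring: a pattern for the only variant of an enum is irrefutable, so it can be used in a plain `let`, just like a struct pattern. A minimal sketch, with `Meters` and `Length` as made-up names:

```rust
struct Meters(f64);

enum Length {
    Meters(f64),
}

fn struct_value(m: Meters) -> f64 {
    // Struct patterns are always irrefutable.
    let Meters(v) = m;
    v
}

fn enum_value(l: Length) -> f64 {
    // Also irrefutable: the enum has exactly one variant,
    // so no `match` or `if let` is needed.
    let Length::Meters(v) = l;
    v
}

fn main() {
    assert_eq!(struct_value(Meters(1.5)), 1.5);
    assert_eq!(enum_value(Length::Meters(1.5)), 1.5);
}
```

The visible difference is construction, where the enum needs the `Length::Meters(...)` path unless the variant is imported.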

porky11 · January 25, 2021, 10:36pm

See the next section, for example this:

bascule · January 25, 2021, 10:37pm

As it were, I believe that rustc internally models all structs as an enum with a single variant

H2CO3 · January 25, 2021, 11:07pm

That could be made to work without changing the surface language semantics at all. Derive macros know how many variants their enum argument has, so it should be trivial to add that kind of check and then emit OnlyVariant(Default::default()). (Of course, modulo potential breakage that this causes.)
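A hand-written sketch of the impl such a derive might emit for a single-variant enum. `Wrapper` and `Settings` are hypothetical names, and this is not what the built-in derive generates today:

```rust
#[derive(Debug, Default, PartialEq)]
struct Settings {
    retries: u32,
    verbose: bool,
}

#[derive(Debug, PartialEq)]
enum Wrapper {
    OnlyVariant(Settings),
}

// What a derive could expand to once it has checked
// that the enum has exactly one variant.
impl Default for Wrapper {
    fn default() -> Self {
        Wrapper::OnlyVariant(Default::default())
    }
}

fn main() {
    let expected = Wrapper::OnlyVariant(Settings {
        retries: 0,
        verbose: false,
    });
    assert_eq!(Wrapper::default(), expected);
}
```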

porky11 · January 25, 2021, 11:19pm

If that's the case, the implementation itself should not be the problem.

porky11 · January 25, 2021, 11:29pm

Right, but since we have structs, single variant enums are not useful.

But if you know that MyEnum(...) is always the same as MyEnum::FirstField(...), MyEnum::Default(...) or MyEnum::MyEnum(...), it would be clear that this variant is the default variant, which is the one that something like Default would derive. Just using the name Default also doesn't sound bad.

Aloso · January 26, 2021, 5:49am

Because it's more limited; some things would be difficult to implement this way. For example, Result:

enum Result<T, E> = T | E;

That looks quite elegant. But how do you match on a Result<String, String>? Are the two variants merged together in this case? Or take Option as an example:

enum Option<T> = T | ();

Now an Option<()> has two variants with the same type. An even trickier example is Option<Option<T>>. And the more nested the types get, the more complicated and error-prone it becomes. The current syntax avoids these problems, in exchange for a more verbose syntax. I've thought about this a lot in the past, and it seems that there is no perfect solution, but there's probably room for improvement.
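To illustrate the nesting point with today's named variants: the two levels of `Option<Option<T>>` stay distinguishable, which a plain type union would have to flatten or disambiguate in some other way. The `describe` function is made up for this sketch.

```rust
fn describe(value: Option<Option<u8>>) -> &'static str {
    match value {
        // The outer layer is missing.
        None => "outer None",
        // The outer layer is present, the inner one is missing.
        Some(None) => "inner None",
        Some(Some(_)) => "value",
    }
}

fn main() {
    assert_eq!(describe(None), "outer None");
    assert_eq!(describe(Some(None)), "inner None");
    assert_eq!(describe(Some(Some(7))), "value");
}
```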

I'm currently writing a parser for a programming language. For the abstract syntax tree, I have enums wrapping a bunch of structs everywhere. This is quite tedious, and your proposal would help a lot.

However, I'm not sure if adding another syntax to write enums is the right direction, since it adds cognitive overhead. RFCs need a strong motivation in order to be accepted, and I'm not sure the motivation for this idea is strong enough.

Maybe there are simpler ways to improve ergonomics. For example, a #[derive(From)] macro that generates trivial From implementations for enums would help.
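A rough sketch of the kind of trivial From impls meant here, generated with a declarative macro instead of a derive. The macro `impl_from_variants` and the AST types are hypothetical, and the macro assumes every variant wraps a struct of the same name:

```rust
struct Literal(i64);
struct Ident(String);

enum Expr {
    Literal(Literal),
    Ident(Ident),
}

// Hypothetical helper, not taken from any crate: for each listed
// name it emits `impl From<Name> for Enum` that wraps the value
// in the variant of the same name.
macro_rules! impl_from_variants {
    ($enum_name:ident { $($variant:ident),+ }) => {
        $(
            impl From<$variant> for $enum_name {
                fn from(value: $variant) -> Self {
                    $enum_name::$variant(value)
                }
            }
        )+
    };
}

impl_from_variants!(Expr { Literal, Ident });

fn main() {
    let expr: Expr = Literal(42).into();
    assert!(matches!(expr, Expr::Literal(Literal(42))));

    let expr: Expr = Ident(String::from("x")).into();
    assert!(matches!(expr, Expr::Ident(_)));
}
```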

porky11 · January 26, 2021, 7:00pm

struct Ok<T>(T);
struct Err<E>(E);
enum Result<T, E> = Ok<T> | Err<E>;

Everything that's possible now would still be possible. You just have to create the structs in advance, which is not that convenient in some simple cases like this.

But it's good that you mention it. There needs to be a requirement that all variants are distinct. The easiest way would be to allow only things that already work: all variants have to be distinct structs.

(I also had this problem, when writing an AST)

I don't think deriving From is such a good improvement. You can't properly match this way, and it probably has a runtime overhead.

atagunov · January 26, 2021, 10:49pm

There was a lengthy discussion: Ideas around anonymous enum types and an rfc which suggested syntax like

let x = (i32 | &str)::0(1_i32);

match x {
    (_ | _)::0(val) => ...,
    (_ | _)::1(_) => unreachable!("...")
};

and I think it was suggested there that a generic parameter would, in obvious cases, be translated into this ::0 or ::1 as appropriate. So this problem is solvable.

We were considering a &dyn (A|B), a fat pointer consisting of a discriminant and a reference to the actual data. &dyn (A|B) was cheaply coercible to &dyn (A|B|C): the coercion just entailed a change of the discriminant (part of the fat pointer), not of the data.

But it isn't really possible to coerce &mut dyn (A|B) to &mut dyn (A|B|C), as C may require more space than either A or B. Also, if I remember correctly, it wasn't obvious how to represent &dyn u8|Trait. This "fat" pointer would go super-fat: it would gain both a discriminant and a pointer to the type's vtable.

The main motivation for these exercises was to return errors nicely: say a child function can return Err1|Err2 but the parent wants to return Err1|Err2|Err3, and we want to avoid the expensive operation of copying memory around. That has never been solved. It would have required some kind of &out buffer, passed down from the very top, to store the data, and it was hard to know what size that buffer would need to be.
This prompted me to suggest a very elaborate scheme that would allow functions to place values into the parent/grandparent/grand^10-parent's frame, but this suggestion clearly failed to gain interest.
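For reference, a sketch of how this error widening is usually written in today's Rust, with named enums and a From impl that the `?` operator picks up. All names here are made up. Note that this converts the error by value, so it does not address the copying concern described above.

```rust
#[derive(Debug, PartialEq)]
enum ChildError {
    NotFound,
    Denied,
}

#[derive(Debug, PartialEq)]
enum ParentError {
    NotFound,
    Denied,
    Timeout,
}

// Widening: every child error has a counterpart in the parent error.
impl From<ChildError> for ParentError {
    fn from(e: ChildError) -> Self {
        match e {
            ChildError::NotFound => ParentError::NotFound,
            ChildError::Denied => ParentError::Denied,
        }
    }
}

fn child(id: u32) -> Result<u32, ChildError> {
    match id {
        0 => Err(ChildError::NotFound),
        1 => Err(ChildError::Denied),
        n => Ok(n * 2),
    }
}

fn parent(id: u32, budget_ms: u32) -> Result<u32, ParentError> {
    if budget_ms == 0 {
        return Err(ParentError::Timeout);
    }
    // `?` converts ChildError into ParentError through the From impl.
    let value = child(id)?;
    Ok(value + 1)
}

fn main() {
    assert_eq!(parent(5, 10), Ok(11));
    assert_eq!(parent(0, 10), Err(ParentError::NotFound));
    assert_eq!(parent(5, 0), Err(ParentError::Timeout));
}
```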

bascule · January 26, 2021, 11:08pm

Related resources to this idea:

Types for enum variants a.k.a. variant types
Anonymous enums
Type-level sets

porky11 · January 30, 2021, 9:57pm

Thanks.

I don't really like enum variant types, and that's the only thing I knew about enum variations before.

Anonymous enums are nice and don't seem like a lot of work, but I don't see a real benefit. I probably would switch back to the current system, when one of the anonymous enums gets more than two entries.

I really like the type-level sets. I'm not sure if they might cause problems with backwards compatibility with enums. I can kind of agree with the solution to the "problem" of multiple variant types being the same. It might also be a good idea to restrict such a feature, but I still have to think about it.

system · April 30, 2021, 9:57pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
