Pre-RFC: Anonymous variant types

I really hope I’m not the only person whose eyes are bleeding a little bit from that…

4 Likes

I would not recommend doing this, but my eyes are not bleeding. If I had to do this, I would probably write it like this:

fn test(i: u8) -> impl Debug {
    type ReturnType = (i64 | () | i64 | (i64, f64));
    match i {
        0 => ReturnType::0(10),
        2 => ReturnType::2(20),
        3 => ReturnType::3(30, 0.0),
        _ => ReturnType::1(()),
    }
}
1 Like

exactly… and at that point, making a regular enum is not much more effort.
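For comparison, here is a minimal sketch of the regular-enum version in current Rust. The enum name `TestResult` and its variant names are invented for illustration; nothing in the thread names them.

```rust
use std::fmt::Debug;

// Hypothetical nominal enum standing in for `(i64 | () | i64 | (i64, f64))`.
// The names are made up; the thread only ever refers to variants by index.
#[derive(Debug)]
enum TestResult {
    Small(i64),
    Nothing,
    Medium(i64),
    Pair(i64, f64),
}

fn test(i: u8) -> impl Debug {
    match i {
        0 => TestResult::Small(10),
        2 => TestResult::Medium(20),
        3 => TestResult::Pair(30, 0.0),
        // The catch-all arm goes last so the arms above stay reachable.
        _ => TestResult::Nothing,
    }
}
```

The extra cost over the anonymous form is the five-line definition and having to pick names, which is roughly the trade-off being argued about here.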

So I assume you never use tuples?

That’s a little harsh, but that’s the angle I see it from. Just as some types don’t deserve the rigmarole of a full struct definition, some types don’t quite deserve the rigmarole of a full enum definition, and an anonymous one fits the bill better.

For API surfaces, names are great, and anonymous types don’t communicate intent. For internal quick details and as intermediate refactoring steps, anonymous types are huge.

1 Like

I do. However, in my experience there's an asymmetry here. I really missed enum-like sum types in languages without real sum types, and I somewhat missed tuples in languages without tuples. I literally never (as far as I can remember) thought "hey, this would be neat as an anonymous sum type". They just don't come up as often as tuples.

1 Like

I feel like the positional approach to structural coproducts encourages you to write code whose semantic intent is unclear (because the number is devoid of human-understandable meaning). Meanwhile, if you name the variants, the meaning is clearer and you can also encode return types such as -> FlagA | FlagB | FlagC. Furthermore, named variants are commutative, i.e. Foo | Bar == Bar | Foo.

Perhaps... Though I think that structural coproducts can be useful for rapid prototyping, i.e. you quickly encode a sum type in the return type and match where you need it; then later, you come back, make the code more robust, and refactor into a nominal enum. If the variants are named, this should be fairly easy as most of the information (except for the type name) is already present; in fact, you could facilitate it with IDE-provided refactoring actions. All of this allows for a "gradual" approach to development.

6 Likes

A proper open/dynamic "Variant" type with some reflection capabilities and language support might better support this goal. As is, you lose a bit of that rapidity by having to specify all of the variant types, which makes it a hard-to-justify[1] compromise between the flexibility of an open-ended type and the reliability of a closed one.

[1] "Hard to justify" in the sense of explaining why you chose to use it in a code review situation. I don't know where it should be considered idiomatic to use an anonymous enum where a regular enum would work.

1 Like

Like... row polymorphism but for structural coproducts? Maybe you could elaborate with examples?

I think it's mostly a question of "how many times are you using the type"; if it is once or twice then a structural variant could be better.

1 Like

I'm not that familiar with row polymorphism, so I can't really say if that's a sensible way to think about it. I was thinking of something like an enum whose discriminant is instead a pointer to some metadata which allows the contained value to be properly recovered. Like impl Any, but dynamic. This would probably need a lot of language support to be usable (to do error handling and matching in a reasonable way, etc.).

Anyway, I only bring it up as a potential alternative which may satisfy some of the need for rapid prototyping tools. I have no idea if it is workable with Rust's semantics or borrowing rules.
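The closest thing available in current Rust is probably `Box<dyn Any>` with downcasting. The sketch below is a stand-in for that idea, not the proposed "Variant" type; the helper names `test` and `describe` are hypothetical.

```rust
use std::any::Any;

// Hypothetical producer: returns values of unrelated types behind one
// dynamically typed box. The type id stored in the vtable plays the role
// of the "pointer to some metadata".
fn test(i: u8) -> Box<dyn Any> {
    match i {
        0 => Box::new(10_i64),
        3 => Box::new((30_i64, 0.0_f64)),
        _ => Box::new(()),
    }
}

// Recover the value by trying each candidate type in turn.
// Unlike a `match` on an enum, nothing checks that this list is exhaustive.
fn describe(value: &dyn Any) -> String {
    if let Some(n) = value.downcast_ref::<i64>() {
        format!("i64: {n}")
    } else if let Some((a, b)) = value.downcast_ref::<(i64, f64)>() {
        format!("pair: {a}, {b:?}")
    } else if value.is::<()>() {
        String::from("unit")
    } else {
        String::from("unknown")
    }
}
```

Calling `describe(&*test(0))` yields `"i64: 10"`. Note the `&*`: passing `&test(0)` directly would coerce the `Box` itself to `&dyn Any`, and every downcast would then fail. The fallthrough `"unknown"` branch is the price of the open-ended type, and this is where the language support mentioned above would be needed.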

Fourth revision. Even more explaining, along with a tl;dr section on the guide-level explanation and a codegen example to show the relative simplicity of the proposed syntax.

1 Like

I would say

fn test(i: u8) -> impl Debug {
    type ReturnType = (_ | _ | _ | _);
    match i {
        0 => ReturnType::0(10),
        2 => ReturnType::2(20),
        3 => ReturnType::3(30, 0.0),
        _ => ReturnType::1(()),
    }
}

Would it be possible to have the compiler auto-build the enum? For example,

fn test(i: u8) -> impl Debug {
    match i {
        0 => _::0(10),
        2 => _::2(20),
        3 => _::3(30, 0.0),
        _ => _::1(()),
    }
}

would be turned into

fn test(i: u8) -> impl Debug {
    match i {
        0 => (i32 | () | i32 | (i32, f32))::0(10),
        2 => (i32 | () | i32 | (i32, f32))::2(20),
        3 => (i32 | () | i32 | (i32, f32))::3(30, 0.0),
        _ => (i32 | () | i32 | (i32, f32))::1(()),
    }
}

This way it would be easy to rapidly build up anonymous enums, and most of the noise would be eliminated. If there are any missing numbers (for example, if we forgot 1 and had 3 variants), then the compiler would error, saying that it could not infer the enum type.

Of course, this would only be done locally within one function, not across function boundaries.

1 Like

Probably; if each of those statements types as a partial sum, the compiler should be able to unify them.

Or something like the following?

fn test(i: u8) -> impl Debug {
    type R = (i32 | () | i32 | (i32, f32));

    match i {
        0 => R::0(10),
        2 => R::2(20),
        3 => R::3(30,0.0),
        _ => R::1(()),
    }
}
2 Likes

Fifth revision, and the first candidate for release into a pull request as a proposed RFC. The minimal nature of the RFC’s proposed type is more prominently stated.

If there does not appear to be anything wrong with the proposed RFC, then I’ll send it up to be pull requested.

3 Likes

I would support the idea of anonymous enums, but I have a hard time getting behind the idea of anonymous variants.

I agree with the idea of keeping the proposal minimal, and not baking in extra features such as subsampling and other forms of automatic conversions between anonymous enums. This is indeed an excellent way to proceed.

And therefore I’d argue that a more minimal proposal is anonymous enums with named variants; and that when the time comes to introduce anonymous variants, a second proposal should be made encompassing both named and anonymous enums.

Should we hash out whether to start with anonymous or named variants in the anonymous enum before a full-fledged RFC is submitted?

5 Likes

Yeah, about that. I'm pretty sure that's wrong. Named variants would actually take more work: the compiler would need groundwork for mapping names to types as read off from the anonymous type (along with the associated syntax bikeshedding), and for looking variants up again whenever a variant name is used. Compare with the codegen example, which performs all of its work in a perfectly straight line and can just work with numbers and indices, only outputting the associated numerals at the end when everything is figured out.
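The RFC's codegen example is not reproduced in this thread, so the following is only a guess at the general shape of an index-based lowering, written in current Rust. `Anon4` and its `V0`..`V3` variants are hypothetical names; the point is that nothing but the arity and the position is needed to generate them.

```rust
// Hypothetical lowering target: one generic enum per arity, with variants
// named purely by their index. No name-to-type lookup table is required.
#[derive(Debug)]
enum Anon4<A, B, C, D> {
    V0(A),
    V1(B),
    V2(C),
    V3(D),
}

// Stand-in for `type R = (i64 | () | i64 | (i64, f64));`
type R = Anon4<i64, (), i64, (i64, f64)>;

fn test(i: u8) -> R {
    match i {
        0 => R::V0(10),
        2 => R::V2(20),
        // The payload is a single tuple value, hence the double parentheses.
        3 => R::V3((30, 0.0)),
        _ => R::V1(()),
    }
}
```

A named-variant design would need an extra mapping from each label to its slot before it could reach the same representation, which is the additional work being described.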

1 Like

While I do want a lower-developer-friction story for error handling (who doesn’t?), I’m also very risk-averse, and being able to match based on index, combined with distinct variants with the same type signature, seems like a recipe for logic bugs.

For example, I like to keep my lists of things alphabetically sorted if they don’t have some underlying numerical value with more significance to their ordering. That’s such a minor thing that I don’t even think of it as refactoring.
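A minimal sketch of that failure mode in current Rust, using a positional stand-in for a two-variant anonymous enum. `Anon2`, `open_v1`, `open_v2`, and `explain` are all hypothetical names made up for this illustration.

```rust
// Positional stand-in for `(String | String)`: both variants carry the
// same payload type, so only the index distinguishes them.
enum Anon2<A, B> {
    V0(A),
    V1(B),
}

// Before the edit: index 0 means "file missing", index 1 "permission denied".
fn open_v1(path: &str) -> Anon2<String, String> {
    Anon2::V0(path.to_string()) // intended meaning: file missing
}

// After a seemingly trivial reordering of the variant list, the producer
// now reports the same condition under index 1.
fn open_v2(path: &str) -> Anon2<String, String> {
    Anon2::V1(path.to_string()) // intended meaning: still file missing
}

// This consumer was not updated. It still type-checks against both
// producers because the payload types are identical.
fn explain(e: Anon2<String, String>) -> String {
    match e {
        Anon2::V0(p) => format!("{p}: file missing"),
        Anon2::V1(p) => format!("{p}: permission denied"),
    }
}
```

With `open_v1` the consumer reports the right condition; with `open_v2` it reports "permission denied" for a missing file, and the compiler has no way to notice. With named variants the same reordering would be harmless.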

I did have another concern, but I forgot to make point-form notes (I just woke up) and it slipped my mind while thinking through the point about trivial refactoring.

3 Likes

I don’t like the proposed approach that much; matching in particular looks quite bad:

match x {
    (_ | _)::0(val) => assert_eq!(val, 1_i32),
    _::1(_) => unreachable!("Value was set to the first variant")
};
match bar {
    (i64 | () | i64 | (i64, f64))::3((a, b)) => a == -3_i64 && b == 0.0_f64,
    (_ | _ | _ | _)::2(_) => false,
    _::1(_) => false,
    _ => false
}

It looks like a pile of emojis to me, and Rust is already perceived as a language a bit too heavy on sigils.

Bad humour inside

I wonder if, following the “turbofish”, the match combination (_|_):: will be named the “turbo-butt”.

I think a better approach will be to introduce refined enums, which can look like this:

enum MyError {
    A,
    B(f32),
    C(String),
    D,
}

// `MyError[A, B]` (the exact syntax is up to bikeshedding) is essentially
// an alias for `MyError`, but compiler knows that only variants A and B
// can be used with it
fn foo() -> Result<(), MyError[A, B]> { .. }

// we can use aliases to make code more readable
type BarError = MyError[A, D];
fn bar() -> Result<(), BarError> { .. }

fn baz() -> Result<(), MyError> {
    match foo() {
        Ok(()) => (),
        Err(MyError::A) => { .. },
        Err(MyError::B(val)) => { .. },
        // we can omit variants C and D; the compiler will not complain and,
        // on desugaring, will add `_ => unsafe { std::intrinsics::unreachable() }`
    }
    match bar() {
        Ok(()) => (),
        // refined alias can be implicitly coerced to the parent type
        // (after all it's just an alias)
        Err(err) => return Err(err),
    }
}

// compile error with message indicating that variant D can not be used with
// the given enum refinement
fn foo1() -> MyError[A] { MyError::D }

// compiler error: foo can return Err(MyError::B), while only variants A and D
// were expected
fn foo2() -> Result<(), BarError> {
    foo()
}

// enum refinement can be coerced not only to the parent type, but also to other
// refinement if it's wider than the coerced one
fn foo3() -> Result<(), MyError[A, B, C]> {
    foo()
}

// we could warn on unnecessary matches
fn foo4() {
    match foo() {
        Err(MyError::D) => (), // warning: unreachable match arm
        _ => (),
    }
}

This approach also automatically solves the automatic trait implementation concerns, as the programmer will be in full control over which traits are implemented for the parent enum.
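For reference, something similar can be approximated in current Rust with a separate, narrower enum and a `From` conversion into the parent. This is only a sketch of a workaround, not the refinement feature itself: `FooError` is a hypothetical stand-in for `MyError[A, B]`, and the `fail` parameter exists only so the example can exercise both paths.

```rust
// The parent enum from the post. C and D are never constructed in this
// sketch, so the dead-code lint is silenced.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum MyError {
    A,
    B(f32),
    C(String),
    D,
}

// Hypothetical stand-in for the refinement `MyError[A, B]`.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum FooError {
    A,
    B(f32),
}

// Widening from the narrow enum to the parent has to be written by hand.
impl From<FooError> for MyError {
    fn from(e: FooError) -> Self {
        match e {
            FooError::A => MyError::A,
            FooError::B(v) => MyError::B(v),
        }
    }
}

fn foo(fail: bool) -> Result<(), FooError> {
    if fail { Err(FooError::B(1.5)) } else { Ok(()) }
}

fn baz(fail: bool) -> Result<(), MyError> {
    // `?` applies the `From` impl, so the narrow error widens implicitly.
    foo(fail)?;
    Ok(())
}
```

Matching on `foo()`'s error is exhaustive over `A` and `B` alone, which is the main benefit the refinement syntax is after. The hand-written `From` impl and the duplicated variant list are the boilerplate that `MyError[A, B]` would remove.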

4 Likes

Refined enums are an entirely orthogonal approach. So if you want them, please create a separate RFC. This RFC doesn’t want to name the enum at all.