Pre-RFC: Using existing structs and tuple-structs as enum variants

Imagine this table:

               | product type | coproduct type
  named fields | struct       | enum
unnamed fields | tuple struct | sum enum
     anonymous | tuple        | sum

The intent here is to make enums truly dual to structs. Two of these spaces are missing, which could be filled in by anonymized versions of enums. (I could imagine that a more obnoxious name for “sum” is “cotuple”.)

I disagree with not having to care whether it’s Bar | Baz or Baz | Bar. After all, (Bar, Baz) and (Baz, Bar) are different types! I’d also like to be able to say Bar |, since (Bar,) is a valid tuple, though I question your desire to use it. I also especially dislike that A | B implicitly has where A != B, which is both unnecessarily restrictive and unpronounceable, so we can’t use A | B as an unbiased version of Result.

After all, the equivalent objects in math, coproducts A |_| B, are not commutative in any mainstream setting, and being able to write A |_| A turns out to be a useful edge case.
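
For a concrete case of a same-typed coproduct in today's Rust: the standard library's slice::binary_search returns Result<usize, usize>, where both sides carry a usize and only the variant says whether it is "found at this index" or "insert at this index". A minimal sketch (the helper name insert_sorted is made up for illustration):

```rust
/// Inserts `value` into an already sorted vector, skipping duplicates.
/// Returns true if the value was inserted.
/// (`insert_sorted` is a hypothetical helper, not a std function.)
fn insert_sorted(v: &mut Vec<i32>, value: i32) -> bool {
    // `binary_search` returns Result<usize, usize>: the payload type is
    // the same on both sides, so only the variant carries the meaning.
    match v.binary_search(&value) {
        // Already present at this index: nothing to do.
        Ok(_found_at) => false,
        // Absent: the index says where to insert to keep the order.
        Err(insert_at) => {
            v.insert(insert_at, value);
            true
        }
    }
}
```

If usize | usize collapsed to usize, the two cases could no longer be told apart.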

I think the more important part is that (Bar,) and (Bar, Bar,) and (Bar, Bar, Bar,) are different types.

I would be quite happy for enum(A | B | A) to be the same type as enum(A | B), and even with type AB = enum(A|B); type BC=enum(B|C); type ABC = enum(AB|BC); for ABC be the same type as enum(C|B|A).

Yeah, I think we just fundamentally disagree about this sort of collapse behavior. I see that sort of coproduct collapse as a nice idea on paper but one that will ultimately produce more problems than it will solve, such as the fact that the following function can’t be written:

fn first<T: Default, U>(x: T | U) -> T {
    match x {
        // using your syntax
        x: T => x,
        _: U => T::default(),
    }
}

If T::default() has side-effects, the behavior of first::<T, T>(t) is now undefined! So in your proposal, either such generic functions are unacceptable (which means the compiler needs to check for them…) or we need to add where T != U. Worse, without this where clause you can’t use generic sums in structs! Hell, using T | () as an ad-hoc Option (why the hell would you, I know, but you get my point) is no longer allowed!

My syntax side-steps this problem, since you need to call first as first::<T, T>(0(t)). However, in a non-generic context I think it would be fine to be able to write foo(t) if the function were fn foo(x: i32 | &str).
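
To make the positional version concrete in today's Rust, here is a rough sketch in which a hand-written Either enum stands in for an ordered T | U. Either, Left and Right are illustrative names playing the role of the tags 0(..) and 1(..); none of this is proposed syntax.

```rust
/// Stand-in for an ordered two-variant anonymous sum (hypothetical name).
/// `Left` corresponds to the positional tag `0(..)`, `Right` to `1(..)`.
enum Either<L, R> {
    Left(L),
    Right(R),
}

/// Returns the first alternative if present, otherwise `T::default()`.
/// Because the match is on position rather than on type, this stays
/// well-defined even when `T` and `U` are the same type.
fn first<T: Default, U>(x: Either<T, U>) -> T {
    match x {
        Either::Left(t) => t,
        Either::Right(_) => T::default(),
    }
}
```

With this encoding first::<i32, i32>(Either::Left(7)) and first::<i32, i32>(Either::Right(7)) take different arms, and no where T != U bound is needed.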

@scottmcm @mcy Interesting suggestions. Personally, I’d like to see the two absent anonymised versions of enums (as @mcy called them) implemented, but in addition to my original proposal. I see them as complementary rather than alternatives. If we get enum-variants-as-types too, then all the better!

As for the debate on commutativity and idempotency of enum(...) as a type constructor, I’m tempted by the mathematical reasoning, but equally, how would one match on a type like enum(A | A) or enum(A | B | A)?

With type ascription in patterns appearing plausibly-going-to-happen, I more than ever think that this is the right way for this to be consumed:

match x {
    y: Ipv4Addr => ...,
    z: Ipv6Addr => ...,
}

I don’t know the rest of the design, though :slightly_smiling_face:

Perhaps the answer is that if the types are the same, it runs the first matching arm – like it would if you translated it into a sequence of downcasts off an Any. And yes, that means you can’t use it in place of Result, but I think that’s fine in the same way you can’t tell the difference between (r, g, b) and (x, y, z) in tuples the way you can between { r, g, b } and { x, y, z } in structs.
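
The "sequence of downcasts" reading can be sketched with std::any::Any as it exists today. This is only an analogy for the first-matching-arm behaviour, not a claim about how such a match would be compiled; the function name describe is made up.

```rust
use std::any::Any;
use std::net::{Ipv4Addr, Ipv6Addr};

/// Tries each "arm" in order, the way a type-ascribed match might be
/// desugared into successive downcasts (hypothetical helper).
fn describe(x: &dyn Any) -> &'static str {
    if x.downcast_ref::<Ipv4Addr>().is_some() {
        "v4"
    } else if x.downcast_ref::<Ipv6Addr>().is_some() {
        "v6"
    } else if x.downcast_ref::<Ipv4Addr>().is_some() {
        // Same type as the first test, so this branch can never run:
        // with duplicated types, the first matching arm always wins.
        "v4 again"
    } else {
        "other"
    }
}
```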

If we go with the ordered-variants approach, probably pattern matching just based on order would work best, i.e.:

match x {
    var1 => ...,
    var2 => ...,
    var3 => ...,
}

I feel this parallels the philosophy of tuples and tuple structs the best.

In the specific example from Syn each variant shares a field named attrs, so it might be simpler to allow the user to derive a method named get_mut_attrs that returns a mutable reference to the shared field. (Note: I haven’t tested if mem::replace actually works with a reference here)

match self.get_mut_attrs() {
    // attrs is already a &mut Vec<_>, which is what mem::replace expects
    Some(attrs) => mem::replace(attrs, new),
    None => Vec::new(),
}
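
For reference, a self-contained sketch of what such a method could look like if written by hand. The Item enum, its variants, and the Attribute alias are simplified stand-ins invented for illustration and do not mirror Syn's real definitions; the derive itself is hypothetical.

```rust
use std::mem;

// Placeholder for a real attribute type (assumption for this sketch).
type Attribute = String;

// Simplified stand-in for an AST enum whose variants share `attrs`.
enum Item {
    Fn { attrs: Vec<Attribute> },
    Struct { attrs: Vec<Attribute> },
    // A variant without the shared field.
    Verbatim,
}

impl Item {
    /// What a derived accessor might expand to: a mutable reference to
    /// the shared field, or None for variants that lack it.
    fn get_mut_attrs(&mut self) -> Option<&mut Vec<Attribute>> {
        match self {
            Item::Fn { attrs } | Item::Struct { attrs } => Some(attrs),
            Item::Verbatim => None,
        }
    }

    /// Swaps in `new` and returns the previous attributes. For variants
    /// without attrs, `new` is dropped and an empty Vec is returned.
    fn replace_attrs(&mut self, new: Vec<Attribute>) -> Vec<Attribute> {
        match self.get_mut_attrs() {
            Some(attrs) => mem::replace(attrs, new),
            None => Vec::new(),
        }
    }
}
```

mem::replace takes a &mut T and a T, so passing the reference directly works; no dereference is needed.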

I feel like this would be extremely surprising if you didn't know that x was such a type, as it looks identical to obviously-dead code. And it gets really strange if you only want the third one -- does that need this?

match x {
    _ => {}
    _ => {}
    var3 => ...,
}

Compare the type ascription one, where something like

if let addr: Ipv6Addr = x {

feels completely natural. (Can you even do an order-based thing in if let?)

My 2 cents: | shouldn’t be used unless there’s either collapsing behavior (T | T = T), or enforced disjointness (in general, T | U is disallowed, but something like Option<T> = Some<T> | None<T> is allowed).

I personally don’t like anonymous sum types. I think that if you really need to be able to do different things for each of the possible cases, you should name the constructors.

That sort of goes hand in hand with what I do want T | U for: automatically generated trait impls.
Even if you can’t use pattern-matching in the general case where the types are unknown, that’s not that significant of a limitation if instead you have access to trait methods, for traits that both T and U implement.

With -> impl Trait, the compiler could be generating T | U types when the types it sees differ, and then the caller of that function will still be able to use Trait, without anyone writing a match on T | U.
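
As a rough picture of what the compiler would be generating, here is the hand-written version people use today: a two-variant enum that forwards a trait to whichever side it holds. Either and port_or_name are illustrative names, and Display is just one example trait.

```rust
use std::fmt;

// Hand-written stand-in for an auto-generated `T | U` (hypothetical).
enum Either<L, R> {
    Left(L),
    Right(R),
}

// The trait impl simply delegates to whichever side is present; this
// is the boilerplate an automatic `T | U` impl would remove.
impl<L: fmt::Display, R: fmt::Display> fmt::Display for Either<L, R> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Either::Left(l) => l.fmt(f),
            Either::Right(r) => r.fmt(f),
        }
    }
}

// The two branches produce different concrete types, so today they
// have to be wrapped by hand to fit a single `impl Display` return.
fn port_or_name(numeric: bool) -> impl fmt::Display {
    if numeric {
        Either::Left(8080u16)
    } else {
        Either::Right("http")
    }
}
```

The caller only ever sees impl Display and never writes a match on the sum.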

Also, collapsing behavior is great for this, if the types end up the same, you don’t waste space!

I can't parse this.

Yep, I've always been in favour of this! This was discussed before in a long issue (RFC PR even?), but I can't remember where. A few people claimed it was too much compiler magic, but I'm not bothered. The convenience outweighs that.

I agree it has that benefit along with some other conveniences, but the mathematical/type theoretic argument (along with consistency with enum) is still a strong one.

Okay, fair point. That syntax is fine with me, but I still think we need to support either

match x {
    y: Ipv4Addr => ...,
    z: Ipv6Addr => ...,
}

or

match x {
    Ipv4Addr @ y => ...,
    Ipv6Addr @ z => ...,
}

as the primary pattern-matching syntax for unnamed enums. (The second is a little more in line with existing syntax I feel, but either could work.)

I absolutely agree that

match x {
    y @ Ipv4Addr { .. } => ...,
    z @ Ipv6Addr { .. } => ...,
}

should work, since each of those arms individually already works today.
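
To illustrate the "already works today" part: a struct pattern with .. is accepted even when the struct's fields are private, so each arm is a valid, irrefutable pattern when the scrutinee has that one type. A small sketch (the function names are made up):

```rust
use std::net::{Ipv4Addr, Ipv6Addr};

// A single-arm match is exhaustive here because `Ipv4Addr { .. }`
// matches every value of type Ipv4Addr.
fn is_loopback_v4(x: Ipv4Addr) -> bool {
    match x {
        y @ Ipv4Addr { .. } => y.is_loopback(),
    }
}

// The same shape works for the other arm taken on its own.
fn is_loopback_v6(x: Ipv6Addr) -> bool {
    match x {
        z @ Ipv6Addr { .. } => z.is_loopback(),
    }
}
```

What does not exist today is a scrutinee type for which both arms are valid at once.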

And with unit structs – such as error ones – I’d expect just this to work too:

match x {
    OverflowError => ...,
    NotADigitError => ...,
}

the same way that let OverflowError = x; works for unit structs.
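
For comparison, this is roughly how the same thing is spelled today with a named wrapper enum. The unit-struct patterns are valid already, but every arm has to repeat the wrapper variant. ParseError and the function names are invented for this sketch.

```rust
struct OverflowError;
struct NotADigitError;

// Today's newtype-style wrapper enum around existing unit structs.
enum ParseError {
    Overflow(OverflowError),
    NotADigit(NotADigitError),
}

fn describe(e: ParseError) -> &'static str {
    match e {
        // The inner `OverflowError` is a unit-struct pattern, not a binding.
        ParseError::Overflow(OverflowError) => "overflow",
        ParseError::NotADigit(NotADigitError) => "not a digit",
    }
}

fn consume(x: OverflowError) -> &'static str {
    // Irrefutable unit-struct pattern in a `let`, as mentioned above.
    let OverflowError = x;
    "overflow"
}
```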

Yeah, I’m with you there. Do you think both syntaxes should be supported though (@ and type ascription, in my above post), or just the former?

Both should be supported, as should all patterns that sufficiently constrain the type.

For example, this seems like it ought to be fine too, to match an Option<i32> | String:

match x {
    Some(y) => ...,
    None => ...,
    z: String => ...,
}

I think there are a lot of cases where the enum variant constructor names overlap 100% with type constructor names, because the variant constructors just wrap the types in a newtype-like pattern; this is especially common in error handling and AST parsing. Don't you think that clutters code?

I also think there is another motivation in "unifying" struct-like types and variants, but it kind of goes the other way than using existing structs as variants: using variants as types. The benefit is that the variant-type is then binary-compatible with the enum it's from (having the same layout; even the discriminant). That means that we get a very natural way of doing simple refinements.

Sounds fair, I just don’t like there being two equivalent syntaxes (equally powerful). Maybe the @ one could be linted/deprecated going forwards?

Well, they’re not equally powerful. x @ i32 doesn’t work, and neither does x: None.

Though given type ascription in patterns, I definitely agree that a clippy lint to replace x @ String { .. } => with x: String => would be a good idea.

I was specifically talking about the case where the types aren’t necessarily known to be disjoint, so T | U in a function generic over both T and U. Then I would prefer it to not support pattern-matching.

Whereas something like Some<T> | None<T> is “clearly disjoint”, effectively isomorphic to Some<T> + None<T> (aka Option<T>) and pattern-matching on it would be no more difficult than on an enum (we can even define Rust enums as the | of all of their variants).

You’re effectively “naming your constructors” by using | between several different nominal types (no matter their parameters), so I have nothing against that (in fact, I want it for several things, but not necessarily in Rust itself, maybe a DSL on top, we’ll see).

Sounds fair, yep. A lint would work.

That sounds like it could meld with my original proposal easily too (assuming we implement the enum-variants-as-types RFC). In an enum definition typical variants would then be types within the scope of their parent enum, whereas variants that reuse existing types would be just that.