UPD: See the next version of this proposal: Pre-RFC: type-level sets
This proposal is a draft of an alternative to RFC 2587. It is less fleshed out than RFC 2587 and provides a general direction which I believe is better to pursue.
About terminology
The proposed construct will be called sum-enum to distinguish it from the existing enums. Some prefer to call such a construct a union type, but unfortunately Rust already has a different kind of union. Just consider it a temporary name (better naming proposals are welcome, of course).
Summary
- Introduce a new syntax: enum(A, B, C, ..) and enum Foo(A, B, C, ..)
- Utilize type ascription in patterns to perform matching on sum-enums
Motivation
There are two main use cases for this proposal:
- Removing a need for short-lived “throw-away” enums:
// enum(..) implements `From` for each "summed" type,
// so `?` works as expected
fn foo(val: Weak<str>) -> Result<i64, enum(NoneError, ParseIntError)> {
    let strref = Weak::upgrade(&val)?;
    let num = i64::from_str_radix(&strref, 10_u32)?;
    Ok(num)
}

match foo(val) {
    Ok(n) => { .. },
    // equivalent to `Err(NoneError: NoneError)` and `Err(_: NoneError)`
    Err(NoneError) => { .. },
    // note that for now you can't write `Err(err): Err(ParseIntError)`
    Err(err: ParseIntError) => { .. }
}
- Allow returning several types using impl Trait by automatically creating a sum-enum:
// output of this function is a sum over 2 anonymous `impl Trait` types,
// variants are automatically collected by compiler
fn foo<'a>(data: &'a [u32], f: bool) -> impl Iterator<Item=u32> + 'a {
    if f {
        // `into` is required for conversion into implicitly constructed
        // sum-enum
        data.iter().map(|x| 2*x).into()
    } else {
        data.iter().map(|x| x + 2).into()
    }
}
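For comparison, here is a minimal sketch of the "throw-away" enum that the first use case has to be written with today. The names `MissingValue`, `FooError` and `parse_opt` are hypothetical and only serve the illustration; `MissingValue` stands in for `NoneError`, and an `Option<&str>` replaces the `Weak<str>` argument to keep the sketch short.

```rust
use std::num::ParseIntError;

// Hypothetical stand-in for a "value is missing" error.
#[derive(Debug, PartialEq)]
struct MissingValue;

// The short-lived enum that `enum(MissingValue, ParseIntError)` would replace.
#[derive(Debug, PartialEq)]
enum FooError {
    Missing(MissingValue),
    Parse(ParseIntError),
}

// One `From` impl per summed type is what makes `?` work.
impl From<MissingValue> for FooError {
    fn from(e: MissingValue) -> Self {
        FooError::Missing(e)
    }
}

impl From<ParseIntError> for FooError {
    fn from(e: ParseIntError) -> Self {
        FooError::Parse(e)
    }
}

fn parse_opt(val: Option<&str>) -> Result<i64, FooError> {
    let s = val.ok_or(MissingValue)?;
    let num = i64::from_str_radix(s, 10)?;
    Ok(num)
}
```

All of the boilerplate between the two error types and `parse_opt` is what a sum-enum would generate implicitly.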
Explanation
Existing enums can be seen as a sum type which implicitly creates a wrapper type for each variant (this is not quite how it works today, of course), so we can generalize this behavior and introduce a tuple-like syntax based on the existing enum keyword:
enum Foo(u32, u64)
// or alternatively
type Foo = enum(u32, u64);
type Bar<T> = enum(u32, T);
fn foo() -> enum(u32, u64) {
    if condition() {
        1u32.into()
    } else {
        1u64.into()
    }
}
fn foo<T>() -> enum(u32, T) { .. }
Given the sum-type nature of sum-enums, they have the following properties:
- enum(A, B) is equivalent to enum(B, A)
- enum(enum(A, B), enum(C, D)) is equivalent to enum(A, B, C, D)
- enum(A, B, A) is equivalent to enum(A, B)
- enum(A) is equivalent to A
- enum(A, B, !) is equivalent to enum(A, B)
The main way to create sum-enums will be through Into trait implementations.
In cases where an ordering of the type arguments is required, it can be based on the TypeId of each type.
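The ordering, deduplication and flattening properties can be modelled today by treating the list of summed types as a sorted set of `TypeId`s. This is only a sketch of the equivalence rules, not of how a compiler would implement them; `canonical` and `flatten` are hypothetical helper names, and the `enum(A)` and `!` rules are not modelled. Note that the order of `TypeId` values is not guaranteed to be stable between compilations.

```rust
use std::any::TypeId;
use std::collections::BTreeSet;

/// Canonical form of a list of summed types:
/// a sorted, duplicate-free list of `TypeId`s.
fn canonical(types: &[TypeId]) -> Vec<TypeId> {
    let set: BTreeSet<TypeId> = types.iter().copied().collect();
    set.into_iter().collect()
}

/// Models enum(enum(..), enum(..)): concatenate the inner lists,
/// then bring the result into canonical form.
fn flatten(groups: &[Vec<TypeId>]) -> Vec<TypeId> {
    let all: Vec<TypeId> = groups.iter().flatten().copied().collect();
    canonical(&all)
}
```

With this model, `canonical(&[a, b])` and `canonical(&[b, a])` compare equal, which mirrors the first property in the list.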
Internally enum(u32, u64) can be represented as:
union SumUnion_u32_u64 {
    f1: u32,
    f2: u64,
}
struct SumEnum_u32_u64 {
    // tag is TypeId based, and can be u8/u16/... depending
    // on a number of variants
    tag: u8,
    union: SumUnion_u32_u64,
}
Alternatively, we could use TypeId directly as the tag. This would simplify conversions between sum-enums, but would result in an 8-byte overhead, while a single byte will usually be enough.
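The tagged-union layout can be written out by hand in current Rust. The sketch below assumes a simple auto-incremented `u8` tag rather than a `TypeId`-derived one; `Payload`, `SumU32U64`, `as_u32` and `as_u64` are hypothetical names chosen for the illustration.

```rust
// Untagged storage shared by both variants.
#[derive(Clone, Copy)]
union Payload {
    small: u32,
    large: u64,
}

// Hand-written counterpart of `enum(u32, u64)`.
#[derive(Clone, Copy)]
struct SumU32U64 {
    // 0 => u32, 1 => u64 (auto-incremented, not derived from TypeId)
    tag: u8,
    payload: Payload,
}

impl From<u32> for SumU32U64 {
    fn from(v: u32) -> Self {
        SumU32U64 { tag: 0, payload: Payload { small: v } }
    }
}

impl From<u64> for SumU32U64 {
    fn from(v: u64) -> Self {
        SumU32U64 { tag: 1, payload: Payload { large: v } }
    }
}

impl SumU32U64 {
    fn as_u32(&self) -> Option<u32> {
        // SAFETY: tag 0 is only ever written together with `small`.
        (self.tag == 0).then(|| unsafe { self.payload.small })
    }

    fn as_u64(&self) -> Option<u64> {
        // SAFETY: tag 1 is only ever written together with `large`.
        (self.tag == 1).then(|| unsafe { self.payload.large })
    }
}
```

The two `From` impls are what make `1u32.into()` and `1u64.into()` work; the tag is the only thing that keeps reads from the union sound.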
Matching on sum-enums will be the same as for usual enums:
match val: enum(u32, u64, ()) {
    v: u32 => { .. },
    _: u64 => { .. },
    // type ascription can be omitted
    () => { .. }
}
// see motivation example
match foo(val) {
    Ok(n) => { .. },
    Err(NoneError) => { .. },
    Err(err: ParseIntError) => { .. }
}
match val: enum(u32, u64, ()) {
    v: enum(u32, u64) => { .. },
    () => { .. },
}
match val: enum(u32, u64, ()) {
    v: u64 => { .. },
    // `v` will have type `enum(u32, u64, ())`
    v => { .. },
}
In the last example we do not convert the type of v to enum(u32, ()) for several reasons:
- Simplicity of implementation
- Potentially incompatible memory layouts
- Being coherent with the existing match behavior
A sum-enum will implement the minimal common set of traits of its variant types. In other words, a trait is implemented for a sum-enum only when all of its variant types implement it.
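The "implemented only when all variants implement it" rule is what a hand-written delegating impl expresses today. The sketch below uses a hypothetical two-variant `Either` type and a hypothetical `transform` function to mirror the `impl Iterator` example from the motivation; it is an illustration in current Rust, not the proposed mechanism.

```rust
// Hypothetical two-variant sum written by hand.
enum Either<L, R> {
    Left(L),
    Right(R),
}

// `Iterator` is available only because both variant types implement it
// (with the same `Item`).
impl<L, R> Iterator for Either<L, R>
where
    L: Iterator,
    R: Iterator<Item = L::Item>,
{
    type Item = L::Item;

    fn next(&mut self) -> Option<Self::Item> {
        // Delegate to whichever variant is currently stored.
        match self {
            Either::Left(it) => it.next(),
            Either::Right(it) => it.next(),
        }
    }
}

// Two different closure types are returned through one `impl Iterator`.
fn transform<'a>(data: &'a [u32], double: bool) -> impl Iterator<Item = u32> + 'a {
    if double {
        Either::Left(data.iter().map(|x| 2 * x))
    } else {
        Either::Right(data.iter().map(|x| x + 2))
    }
}
```

With sum-enums the `Either` type and its `Iterator` impl would be derived by the compiler, and the explicit `Left`/`Right` wrapping would become `.into()`.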
Generic code
One of the issues often mentioned in discussions of sum types is how they behave in generic code, for example:
// what will happen if U == V?
fn foo<U, V>(val: enum(U, V)) {
    match val {
        v: U => { .. },
        v: V => { .. },
    }
}
Arguably, once monomorphization is taken into account, this code is very similar to the following:
// `get_id` is a static method of `MyTrait`
fn foo<U: MyTrait, V: MyTrait>(input: u32) {
    match input {
        n if U::get_id() == n => { .. },
        n if V::get_id() == n => { .. },
        _ => { .. },
    }
}
In other words, we can “solve” this problem by (re)specifying that match arms are evaluated and executed in order, so if U and V are the same type (e.g. u32), foo will get monomorphized into the following function:
fn foo2(val: u32) {
    match val {
        v: u32 => { .. },
        v: u32 => { .. },
    }
}
Clearly only the first arm will be executed, and the second one will always be ignored (and removed by the optimizer). To prevent potential bugs, the compiler can issue unreachable_patterns warnings.
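The "first matching arm wins" behaviour can be emulated today with runtime type checks. This is only an analogy built on `std::any::Any`, not the proposed mechanism; `which_arm` is a hypothetical helper that returns the name of the arm that would run.

```rust
use std::any::Any;

/// Emulates `match val { v: U => .., v: V => .. }` with arms tested in order.
fn which_arm<U: Any, V: Any>(val: &dyn Any) -> &'static str {
    if val.is::<U>() {
        "first"
    } else if val.is::<V>() {
        // Never reached for a value of type U when U == V.
        "second"
    } else {
        "none"
    }
}
```

Calling `which_arm::<u32, u32>(&1u32)` picks the first arm, just like the monomorphized foo2, while `which_arm::<u32, u64>` distinguishes the two types as usual.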
If one of the generic types is itself a sum-enum, the case is handled by the ability to use sum-enums in match arms (see the third match example). If U = enum(A, B) and V = enum(B, C), then we’ll get the following code:
// enum(enum(A, B), enum(B, C)) == enum(A, B, C)
fn foo(val: enum(enum(A, B), enum(B, C))) {
    match val {
        v: enum(A, B) => { .. },
        v: enum(B, C) => { .. },
    }
}
If the variant has type B, it will always go to the first arm, so the second arm will only ever receive a variant of type C. The code therefore works without any problems, and the result is predictable.
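The same dispatch can be approximated with an ordinary enum and or-patterns. In this sketch `A`, `B` and `C` are placeholder unit structs, `Abc` is a hypothetical hand-written form of enum(A, B, C), and the two or-patterns stand in for the enum(A, B) and enum(B, C) arms.

```rust
// Placeholder types for the illustration.
struct A;
struct B;
struct C;

// Hand-written form of enum(A, B, C).
enum Abc {
    A(A),
    B(B),
    C(C),
}

// Returns the number of the arm that handles `val`.
// The `Abc::B(_)` alternative in the second arm is unreachable,
// which is exactly the situation described in the text.
#[allow(unreachable_patterns)]
fn arm(val: &Abc) -> u8 {
    match val {
        Abc::A(_) | Abc::B(_) => 1,
        Abc::B(_) | Abc::C(_) => 2,
    }
}
```

Without the `allow` attribute, current rustc reports the redundant `Abc::B(_)` alternative through the unreachable_patterns lint.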
Possible quality-of-life features
In addition to the basic functionality, we can introduce a trait which will allow us to generalize over sum-enums and to do various conversions.
trait SumEnum {
    /// return TypeId of the current variant
    fn get_type_id(&self) -> TypeId;
    /// get an Iterator over the `TypeId`s of all possible variants
    fn get_type_ids() -> impl Iterator<Item=TypeId>;
    /// create a new sum-enum from provided value if possible
    fn store<T: Sized>(val: T) -> Result<Self, Error>;
    /// convert to another sum-enum if possible
    fn convert_into<T: SumEnum>(self) -> Result<T, Error>;
    /// tries to extract variant with type T
    fn extract<T: Sized>(self) -> Result<T, Error>;
    fn can_store(tid: TypeId) -> bool {
        // the iterator yields `TypeId` by value
        Self::get_type_ids().any(|t| t == tid)
    }
}
Additionally, we could automatically implement the TryInto/TryFrom traits for variant types. Ideally it should be possible to check at compile time whether a conversion can be done, but unfortunately Rust currently does not provide tools for that; maybe it will in the future as const fns advance.
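To make the intended behaviour more concrete, here is a rough sketch of what these helpers could look like when written by hand for one concrete type. `Num` is a hypothetical stand-in for enum(u32, u64), the method names only loosely follow the trait above, and `store` returns an `Option` instead of a `Result` to avoid inventing an error type.

```rust
use std::any::{Any, TypeId};
use std::convert::TryFrom;

// Hypothetical hand-written stand-in for enum(u32, u64).
#[derive(Debug, PartialEq)]
enum Num {
    Small(u32),
    Large(u64),
}

impl Num {
    /// TypeId of the currently stored variant.
    fn variant_type_id(&self) -> TypeId {
        match self {
            Num::Small(_) => TypeId::of::<u32>(),
            Num::Large(_) => TypeId::of::<u64>(),
        }
    }

    /// TypeIds of all variants this sum can hold.
    fn type_ids() -> impl Iterator<Item = TypeId> {
        vec![TypeId::of::<u32>(), TypeId::of::<u64>()].into_iter()
    }

    fn can_store(tid: TypeId) -> bool {
        Self::type_ids().any(|t| t == tid)
    }

    /// Runtime-checked construction from an arbitrary `'static` value.
    fn store<T: Any>(val: T) -> Option<Num> {
        let any: &dyn Any = &val;
        if let Some(v) = any.downcast_ref::<u32>() {
            Some(Num::Small(*v))
        } else {
            any.downcast_ref::<u64>().map(|v| Num::Large(*v))
        }
    }
}

// Extraction of one variant type; the original value is handed back on failure.
impl TryFrom<Num> for u32 {
    type Error = Num;

    fn try_from(n: Num) -> Result<u32, Num> {
        match n {
            Num::Small(v) => Ok(v),
            other => Err(other),
        }
    }
}
```

Every check here happens at run time, which is the limitation the paragraph above refers to.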
Sum-enum as a generalized enum
As was mentioned, sum-enums can be viewed as a generalization of enum:
// this enum
enum A {
    Foo,
    Bar(u32),
    Baz{f: u64},
}
// can be (theoretically) desugared as
struct Foo;
struct Bar(u32);
struct Baz{f: u64}
enum A(Foo, Bar, Baz)
This could have automatically solved the problem of flattening nested enums (which is currently not always handled optimally); in other words, Option<Option<u8>> would have been sugar for enum(Some(Some(u8)), Some(None), None). Matching would also have been unified for ordinary enums and sum-enums.
Unfortunately this change is backwards incompatible (e.g. A::Bar currently has type Fn(u32) -> A), but nevertheless I think it’s an interesting idea to consider.
Unresolved questions
- Naming: sum type, sum-union, type union, etc.
- Keyword: enum(..) vs union(..) vs something else
- Delimiter: enum(A, B), enum(A | B), enum(A + B)
- Tag construction: auto-incremental approach, truncating TypeId and dealing with collisions
- Infallible generic conversion of “subset” sum-enums to a wider sum-enum (e.g. enum(u8, u16) to enum(u8, u16, u32))
- Interaction with lifetimes. To start things off we could restrict sum-enum usage to types which satisfy 'static (the same restriction that currently applies to TypeId).
- impl Trait variations: some have proposed to use enum impl Trait, or to use the enum keyword for converting values to sum-enums
- Details on how unification and handling of sum-enums should be done internally.
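As an illustration of the “subset” conversion question, this is roughly what the widening from enum(u8, u16) to enum(u8, u16, u32) looks like when spelled out by hand today. `Narrow` and `Wide` are hypothetical names; the open question is whether such an impl could be provided generically and infallibly.

```rust
// Hypothetical hand-written stand-in for enum(u8, u16).
#[derive(Debug, PartialEq)]
enum Narrow {
    U8(u8),
    U16(u16),
}

// Hypothetical hand-written stand-in for enum(u8, u16, u32).
#[derive(Debug, PartialEq)]
enum Wide {
    U8(u8),
    U16(u16),
    U32(u32),
}

// Widening never fails: every variant of `Narrow` has a counterpart in `Wide`.
impl From<Narrow> for Wide {
    fn from(n: Narrow) -> Self {
        match n {
            Narrow::U8(v) => Wide::U8(v),
            Narrow::U16(v) => Wide::U16(v),
        }
    }
}
```

The reverse direction, from `Wide` to `Narrow`, can fail and would fit the TryFrom-style conversions mentioned earlier.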