UPD: See the next version of this proposal: Pre-RFC: type-level sets
This proposal is a draft alternative to RFC 2587. It is less fleshed out than RFC 2587 and gives only a general direction, which I believe is the better one to pursue.
About terminology
The proposed construct will be called sum-enum to distinguish it from the existing enums. Some prefer to call such a construct a union type, but unfortunately Rust already has a different kind of union. Consider it a temporary name. (Better naming proposals are welcome, of course.)
Summary
- Introduce a new syntax: enum(A, B, C, ..) and enum Foo(A, B, C, ..)
- Utilize type ascription in patterns to perform matching on sum-enums
Motivation
There are two main use cases for this proposal:
- Removing the need for short-lived "throw-away" enums:
// enum(..) implements `From` for each "summed" type,
// so `?` works as expected
fn foo(val: Weak<str>) -> Result<i64, enum(NoneError, ParseIntError)> {
let strref = Weak::upgrade(&val)?;
let num = i64::from_str_radix(&strref, 10_u32)?;
Ok(num)
}
match foo(val) {
Ok(n) => { .. },
// equivalent to `Err(NoneError: NoneError)` and `Err(_: NoneError)`
Err(NoneError) => { .. },
// note that for now you can't write `Err(err): Err(ParseIntError)`
Err(err: ParseIntError) => { .. }
}
- Allowing several types to be returned through impl Trait by automatically creating a sum-enum:
// output of this function is a sum over 2 anonymous `impl Trait` types,
// variants are automatically collected by compiler
fn foo<'a>(data: &'a [u32], f: bool) -> impl Iterator<Item=u32> + 'a {
if f {
// `into` is required for conversion into implicitly constructed
// sum-enum
data.iter().map(|x| 2*x).into()
} else {
data.iter().map(|x| x + 2).into()
}
}
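For comparison, here is a minimal sketch of what has to be written by hand today to get the same effect as the second use case. `EitherIter` is a hypothetical name used only for illustration; it stands in for the sum-enum the compiler would construct implicitly.

```rust
// Hypothetical hand-written stand-in for the implicitly constructed sum-enum.
enum EitherIter<A, B> {
    Left(A),
    Right(B),
}

// The enum is an iterator whenever both variants iterate over the same item type.
impl<A, B> Iterator for EitherIter<A, B>
where
    A: Iterator,
    B: Iterator<Item = A::Item>,
{
    type Item = A::Item;

    fn next(&mut self) -> Option<Self::Item> {
        // Delegate to whichever variant is currently stored.
        match self {
            EitherIter::Left(a) => a.next(),
            EitherIter::Right(b) => b.next(),
        }
    }
}

fn foo<'a>(data: &'a [u32], f: bool) -> impl Iterator<Item = u32> + 'a {
    if f {
        // Explicit wrapping replaces the proposed `.into()`.
        EitherIter::Left(data.iter().map(|x| 2 * x))
    } else {
        EitherIter::Right(data.iter().map(|x| x + 2))
    }
}
```

With the proposal, both the wrapper type and the delegating impl would be generated by the compiler.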
Explanation
Existing enums can be seen as a sum type which implicitly creates a wrapper type for each variant (it's not quite how it works today, of course), so we can generalize this behavior and introduce a tuple-like syntax based on the existing enum keyword:
enum Foo(u32, u64)
// or alternatively
type Foo = enum(u32, u64);
type Bar<T> = enum(u32, T);
fn foo() -> enum(u32, u64) {
if condition() {
1u32.into()
} else {
1u64.into()
}
}
fn foo<T>() -> enum(u32, T) { .. }
Given the sum-type nature of sum-enums, they have the following properties:
- enum(A, B) is equivalent to enum(B, A)
- enum(enum(A, B), enum(C, D)) is equivalent to enum(A, B, C, D)
- enum(A, B, A) is equivalent to enum(A, B)
- enum(A) is equivalent to A
- enum(A, B, !) is equivalent to enum(A, B)
The main way to create sum-enums will be through Into trait implementations.
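As a rough illustration of the conversions a sum-enum would provide automatically, the sketch below writes them by hand for today's Rust. `ParseError` and `parse` are hypothetical names, and `Utf8Error` is used in place of `NoneError` only so that the example compiles on stable.

```rust
use std::num::ParseIntError;
use std::str::Utf8Error;

// Hypothetical hand-written equivalent of `enum(Utf8Error, ParseIntError)`.
#[derive(Debug)]
enum ParseError {
    Utf8(Utf8Error),
    Int(ParseIntError),
}

// One `From` impl per "summed" type; a sum-enum would get these for free.
impl From<Utf8Error> for ParseError {
    fn from(e: Utf8Error) -> Self {
        ParseError::Utf8(e)
    }
}

impl From<ParseIntError> for ParseError {
    fn from(e: ParseIntError) -> Self {
        ParseError::Int(e)
    }
}

fn parse(bytes: &[u8]) -> Result<i64, ParseError> {
    // `?` converts each error type through the `From` impls above.
    let s = std::str::from_utf8(bytes)?;
    let n = i64::from_str_radix(s, 10)?;
    Ok(n)
}
```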
Where an ordering of the type arguments is required, it can be based on the TypeId of each type.
Internally enum(u32, u64) can be represented as:
union SumUnion_u32_u64 {
f1: u32,
f2: u64,
}
struct SumEnum_u32_u64 {
// tag is TypeId based, and can be u8/u16/... depending
// on a number of variants
tag: u8,
union: SumUnion_u32_u64,
}
Alternatively, we could use TypeId directly as the tag. That would simplify conversions between sum-enums but would add an 8-byte overhead, while a 1-byte tag will usually be enough.
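The tagged-union layout above can be tried out in today's Rust. The following is only a sketch under assumptions: the tag values `TAG_U32` and `TAG_U64`, the field name `payload`, and the accessor names are invented for illustration, and the tag here is a plain counter rather than anything derived from TypeId.

```rust
// Untagged storage shared by both variants.
#[repr(C)]
#[derive(Clone, Copy)]
union SumUnion {
    f1: u32,
    f2: u64,
}

// Tag plus storage; the tag records which union field was written.
#[repr(C)]
#[derive(Clone, Copy)]
struct SumEnum {
    tag: u8,
    payload: SumUnion,
}

// Hypothetical tag assignment.
const TAG_U32: u8 = 0;
const TAG_U64: u8 = 1;

impl From<u32> for SumEnum {
    fn from(v: u32) -> Self {
        SumEnum { tag: TAG_U32, payload: SumUnion { f1: v } }
    }
}

impl From<u64> for SumEnum {
    fn from(v: u64) -> Self {
        SumEnum { tag: TAG_U64, payload: SumUnion { f2: v } }
    }
}

impl SumEnum {
    fn as_u32(&self) -> Option<u32> {
        if self.tag == TAG_U32 {
            // SAFETY: the tag says `f1` is the field that was written.
            Some(unsafe { self.payload.f1 })
        } else {
            None
        }
    }

    fn as_u64(&self) -> Option<u64> {
        if self.tag == TAG_U64 {
            // SAFETY: the tag says `f2` is the field that was written.
            Some(unsafe { self.payload.f2 })
        } else {
            None
        }
    }
}
```

Because of alignment padding, the struct is larger than the union by more than the single tag byte on common targets, which is worth keeping in mind when comparing the two tag choices.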
Matching on sum-enums will be the same as for usual enums:
match val: enum(u32, u64, ()) {
v: u32 => { .. },
_: u64 => { .. },
// type ascription can be omitted
() => { .. }
}
// see motivation example
match foo(val) {
Ok(n) => { .. },
Err(NoneError) => { .. },
Err(err: ParseIntError) => { .. }
}
match val: enum(u32, u64, ()) {
v: enum(u32, u64) => { .. },
() => { .. },
}
match val: enum(u32, u64, ()) {
v: u64 => { .. },
// `v` will have type `enum(u32, u64, ())`
v => { .. },
}
In the last example we do not convert the type of v to enum(u32, ()) for several reasons:
- Simplicity of implementation
- Potentially incompatible memory layouts
- Consistency with the existing match behavior
A sum-enum will implement the minimal common set of traits of its included types. In other words, a trait is implemented for a sum-enum only when all of its variant types implement it.
Generic code
One issue often raised in discussions of sum types is how they behave in generic code, for example:
// what will happen if U == V?
fn foo<U, V>(val: enum(U, V)) {
match val {
v: U => { .. },
v: V => { .. },
}
}
Arguably, once monomorphization is taken into account, this code is very similar to the following:
// `get_id` is a static method of `MyTrait`
fn foo<U: MyTrait, V: MyTrait>(input: u32) {
match input {
n if U::get_id() == n => { .. },
n if V::get_id() == n => { .. },
_ => { .. },
}
}
In other words, we can "solve" this problem by (re)specifying that match arms are evaluated and executed in order, so if U and V are the same type (e.g. u32), foo will get monomorphized into the following function:
fn foo2(val: u32) {
match val {
v: u32 => { .. },
v: u32 => { .. },
}
}
It's obvious that only the first arm will be executed and the second one will always be ignored (and removed by the optimizer). To prevent potential bugs, the compiler can issue unreachable_patterns warnings.
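The "first matching arm wins" behavior can be emulated in today's Rust with `std::any::Any`. This is only a sketch of the semantics, not of the proposed implementation; `pick_arm` is a hypothetical helper and the type tests are done at run time rather than resolved during monomorphization.

```rust
use std::any::Any;

// Emulates `match val { v: U => .., v: V => .. }` by testing types in arm order.
fn pick_arm<U: Any, V: Any>(val: &dyn Any) -> &'static str {
    if val.is::<U>() {
        "first arm"
    } else if val.is::<V>() {
        // Never reached when U and V are the same type.
        "second arm"
    } else {
        "no arm"
    }
}
```

Calling `pick_arm::<u32, u32>(&1u32)` takes the first branch, while `pick_arm::<u32, u64>(&1u64)` takes the second, mirroring the ordering rule described above.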
If one of the generic types is itself a sum-enum, the case is handled by the ability to use sum-enums in match arms (see the third match example). If U = enum(A, B) and V = enum(B, C), then we'll get the following code:
// enum(enum(A, B), enum(B, C)) == enum(A, B, C)
fn foo(val: enum(enum(A, B), enum(B, C))) {
match val {
v: enum(A, B) => { .. },
v: enum(B, C) => { .. },
}
}
It's obvious that a variant of type B will always go to the first arm, and the second arm will only ever receive a variant of type C. So the code will work without any problems, and the result will be predictable.
Possible quality-of-life features
In addition to the basic functionality, we can introduce a trait which will allow us to generalize over sum-enums and to perform various conversions.
trait SumEnum {
/// return TypeId of the current variant
fn get_type_id(&self) -> TypeId;
/// get an Iterator over the `TypeId`s of all possible variants
fn get_type_ids() -> impl Iterator<Item=TypeId>;
/// create a new sum-enum from provided value if possible
fn store<T: Sized>(val: T) -> Result<Self, Error>;
/// convert to another sum-enum if possible
fn convert_into<T: SumEnum>(self) -> Result<T, Error>;
/// tries to extract variant with type T
fn extract<T: Sized>(self) -> Result<T, Error>;
fn can_store(tid: TypeId) -> bool {
Self::get_type_ids().any(|t| t == tid)
}
}
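To get a feel for such a trait, here is a cut-down sketch that compiles today, implemented by hand for an ordinary enum. It is an assumption-laden simplification: `get_type_ids` returns a `Vec` instead of `impl Iterator`, `store` returns `Option` instead of `Result`, `T` is bounded by `Any`, and `U32OrU64` is a hypothetical stand-in for enum(u32, u64).

```rust
use std::any::{Any, TypeId};

// Simplified, hypothetical version of the proposed trait.
trait SumEnum: Sized {
    /// TypeId of the currently stored variant.
    fn get_type_id(&self) -> TypeId;
    /// TypeIds of all possible variants.
    fn get_type_ids() -> Vec<TypeId>;
    /// Build the sum-enum from `val` if its type is one of the variants.
    fn store<T: Any>(val: T) -> Option<Self>;

    fn can_store(tid: TypeId) -> bool {
        Self::get_type_ids().into_iter().any(|t| t == tid)
    }
}

// Hand-written stand-in for `enum(u32, u64)`.
enum U32OrU64 {
    A(u32),
    B(u64),
}

impl SumEnum for U32OrU64 {
    fn get_type_id(&self) -> TypeId {
        match self {
            U32OrU64::A(_) => TypeId::of::<u32>(),
            U32OrU64::B(_) => TypeId::of::<u64>(),
        }
    }

    fn get_type_ids() -> Vec<TypeId> {
        vec![TypeId::of::<u32>(), TypeId::of::<u64>()]
    }

    fn store<T: Any>(val: T) -> Option<Self> {
        // Run-time type check; the proposal hopes to do this at compile time.
        let any: &dyn Any = &val;
        if let Some(v) = any.downcast_ref::<u32>() {
            Some(U32OrU64::A(*v))
        } else if let Some(v) = any.downcast_ref::<u64>() {
            Some(U32OrU64::B(*v))
        } else {
            None
        }
    }
}
```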
Additionally, we could automatically implement the TryInto/TryFrom traits for variant types. Ideally it should be possible to check at compile time whether a conversion can succeed, but unfortunately Rust currently does not provide tools for that; maybe it will in the future with the advancement of const fns.
Sum-enum as a generalized enum
As was mentioned, sum-enums can be viewed as a generalization of enum:
// this enum
enum A {
Foo,
Bar(u32),
Baz{f: u64},
}
// can be (theoretically) desugared as
struct Foo;
struct Bar(u32);
struct Baz{f: u64}
enum A(Foo, Bar, Baz)
This could've automatically solved the problem of joining nested enums (which currently is not always handled optimally); in other words, Option<Option<u8>> would've been sugar for enum(Some(Some(u8)), Some(None), None). Matching would've also been unified for usual enums and sum-enums.
Unfortunately this change is backwards incompatible (e.g. A::Bar currently has type Fn(u32) -> A), but nevertheless I think it's an interesting idea to consider.
Unresolved questions
- Naming: sum type, sum-union, type union, etc.
- Keyword: enum(..) vs union(..) vs something else
- Delimiter: enum(A, B), enum(A | B), enum(A + B)
- Tag construction: auto-incremental approach, truncating TypeId and dealing with collisions
- Infallible generic conversion of "subset" sum-enums to a wider sum-enum (e.g. enum(u8, u16) to enum(u8, u16, u32))
- Interaction with lifetimes. To start things off we could restrict sum-enum usage to types which are 'static (the same restriction as currently applies to TypeId).
- impl Trait variations: some have proposed using enum impl Trait, or using the enum keyword to convert values to sum-enums
- Details on how unification and handling of sum-enums should be done internally.
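As a concrete picture of the "subset to wider" conversion listed above, this is roughly what has to be written by hand today for one specific pair of types. `Small` and `Wide` are hypothetical stand-ins for enum(u8, u16) and enum(u8, u16, u32); the open question is how to provide such a conversion generically.

```rust
// Hand-written stand-in for `enum(u8, u16)`.
#[derive(Debug, PartialEq)]
enum Small {
    U8(u8),
    U16(u16),
}

// Hand-written stand-in for `enum(u8, u16, u32)`.
#[derive(Debug, PartialEq)]
enum Wide {
    U8(u8),
    U16(u16),
    U32(u32),
}

// Widening never fails: every variant of `Small` has a counterpart in `Wide`.
impl From<Small> for Wide {
    fn from(s: Small) -> Self {
        match s {
            Small::U8(v) => Wide::U8(v),
            Small::U16(v) => Wide::U16(v),
        }
    }
}
```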