This proposal is a continuation of the sum-enums Pre-RFC, which in turn is an alternative to the (postponed) RFC 2587.
About terminology
You can consider “type-level sets” (TLS) a temporary name. (I know it’s an unfortunate abbreviation, but let’s continue with it for now) Alternative names are:
- type sets (probably will be too confusing considering “typeset”?)
- true unions
- sum-enums
- auto-enum
- amalgamated sum
- (anonymous) conflating enums
- normalized variant
- fibered sum
Summary
- Introduce a new syntax A | B | C which will create an anonymous TLS type.
- Utilize type ascription in patterns to perform matching on TLS.
- Use TLS to represent static impl Trait with several possible variants.
Motivation
There are two main use cases for this proposal:
- Removing a need for short-lived “throw-away” enums:
// `?` desugaring will include `become` keyword (see further), so `?`
// will work as expected
fn foo(val: Weak<str>) -> Result<i64, NoneError | ParseIntError> {
    let strref = Weak::upgrade(&val)?;
    let num = i64::from_str_radix(&strref, 10_u32)?;
    Ok(num)
}

match foo(val) {
    Ok(n) => { .. },
    // equiv. to `Err(NoneError: NoneError)` and `Err(_: NoneError)`
    Err(NoneError) => { .. },
    // for now we can't write `Err(err): Err(ParseIntError)`
    Err(err: ParseIntError) => { .. }
}
- Allowing a function to return several types through impl Trait by automatically creating a TLS:
// This function returns a union of 2 anonymous `impl Trait` types,
// variants are determined via usage of `become` keyword
fn foo<'a>(data: &'a [u32], f: bool) -> impl Iterator<Item=u32> + 'a {
    if f {
        become data.iter().map(|x| 2*x);
    } else {
        become data.iter().map(|x| x + 2);
    }
}
In this proposal we re-use the reserved become keyword, which was initially planned for tail call optimization. We can either make TLS and TCO compatible, or use an alternative keyword for TLS.
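For comparison, here is a rough sketch of what has to be written by hand on stable Rust today to get the same effect as the second use case. The `EitherIter` name and its variants are hypothetical and are not part of the proposal; the closures destructure `&u32` explicitly to keep type inference simple.

```rust
// Hand-written stand-in for the anonymous type that `become` would generate.
// `EitherIter` is a hypothetical name used only for this illustration.
enum EitherIter<A, B> {
    First(A),
    Second(B),
}

impl<A, B> Iterator for EitherIter<A, B>
where
    A: Iterator,
    B: Iterator<Item = A::Item>,
{
    type Item = A::Item;

    fn next(&mut self) -> Option<Self::Item> {
        // Forward to whichever variant is currently stored.
        match self {
            EitherIter::First(a) => a.next(),
            EitherIter::Second(b) => b.next(),
        }
    }
}

fn foo<'a>(data: &'a [u32], f: bool) -> impl Iterator<Item = u32> + 'a {
    if f {
        EitherIter::First(data.iter().map(|&x| 2 * x))
    } else {
        EitherIter::Second(data.iter().map(|&x| x + 2))
    }
}
```

The wrapper enum and the forwarding impl are exactly the boilerplate that an automatically created TLS is meant to remove.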
Explanation
Existing enums can be seen as a TLS which implicitly creates a wrapper type for each variant (it’s not quite how it works today of course), so we can generalize this behavior in the following way:
type Foo = u32 | u64;
// or alternatively
type Foo = u64 | u32;
type Bar<T> = u32 | T;
fn foo() -> u32 | u64 {
if condition() {
return become 1u32;
} else {
become 1u64
}
}
fn foo<T>() -> u32 | T { .. }
Considering the set nature of TLS, they have the following properties:
- A | B is equivalent to B | A
- ((A | B) | (C | D)) is equivalent to A | B | C | D
- A | B | A is equivalent to A | B
- A | A is equivalent to A
- A | B | ! is equivalent to A | B
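These properties are the usual laws of set union. As a small model (not a proposed implementation), a TLS can be pictured as a set of `TypeId`s. The `TypeSet`, `single` and `union_of` names below are hypothetical helpers, and treating `!` as the empty set is an assumption of this sketch.

```rust
use std::any::TypeId;
use std::collections::HashSet;

// Model of a TLS as the set of its variant types (illustration only).
type TypeSet = HashSet<TypeId>;

// A TLS with a single variant `T`.
fn single<T: 'static>() -> TypeSet {
    let mut s = TypeSet::new();
    s.insert(TypeId::of::<T>());
    s
}

// `A | B` in the model: plain set union, so order and duplicates do not matter.
fn union_of(a: &TypeSet, b: &TypeSet) -> TypeSet {
    a.union(b).copied().collect()
}

// The never type contributes no variants, modelled here as the empty set.
fn never() -> TypeSet {
    TypeSet::new()
}
```

With this model, commutativity, flattening, deduplication and the `!` rule all follow from `HashSet` equality.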
The main way to create a TLS is the become keyword. It can also be used to safely convert sub-sets into larger sets:
fn foo() -> u8 | u32 | u64 {
let val: u8 | u32 = become 1u32;
if cond() {
become 2u64
} else {
become val
}
}
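As a point of reference, the same sub-set to super-set conversion can be approximated on stable Rust with two hand-written enums and a `From` impl. The enum names are hypothetical, and `cond` is passed as a parameter (an assumption; the example above calls a free function).

```rust
// Stand-in for `u8 | u32`.
#[derive(Debug, PartialEq)]
enum U8OrU32 {
    U8(u8),
    U32(u32),
}

// Stand-in for `u8 | u32 | u64`.
#[derive(Debug, PartialEq)]
enum U8OrU32OrU64 {
    U8(u8),
    U32(u32),
    U64(u64),
}

// Widening is always safe: every variant of the smaller set exists in the larger one.
impl From<U8OrU32> for U8OrU32OrU64 {
    fn from(v: U8OrU32) -> Self {
        match v {
            U8OrU32::U8(x) => U8OrU32OrU64::U8(x),
            U8OrU32::U32(x) => U8OrU32OrU64::U32(x),
        }
    }
}

fn foo(cond: bool) -> U8OrU32OrU64 {
    let val = U8OrU32::U32(1);
    if cond {
        U8OrU32OrU64::U64(2)
    } else {
        // Plays the role of `become val` in the proposed syntax.
        val.into()
    }
}
```

Under the proposal, both enums and the `From` impl would be implied by the types `u8 | u32` and `u8 | u32 | u64`.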
Internally u32 | u64 can be represented as:
union TlsUnion_u32_u64 {
f1: u32,
f2: u64,
}
struct Tls_u32_u64 {
// tag is TypeId based, and can be u8/u16/... depending
// on a number of variants
tag: u8,
union: TlsUnion_u32_u64,
}
Using TypeId-based tags will make it more efficient to convert from A | B to A | B | C when there is no tag collision.
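To illustrate why stable per-type tags help, here is a minimal sketch of widening `u8 | u32` to `u8 | u32 | u64`. It assumes each type has one fixed tag regardless of which TLS it appears in; the constants stand in for TypeId-derived values, and all names are hypothetical.

```rust
// Per-type tags, shared by every TLS that contains the type (assumption).
const TAG_U8: u8 = 0;
const TAG_U32: u8 = 1;
const TAG_U64: u8 = 2;

union NarrowPayload {
    v_u8: u8,
    v_u32: u32,
}

union WidePayload {
    v_u8: u8,
    v_u32: u32,
    v_u64: u64,
}

struct Narrow {
    tag: u8,
    payload: NarrowPayload,
}

struct Wide {
    tag: u8,
    payload: WidePayload,
}

// Widening copies the tag unchanged; only the payload is moved into the
// larger union, so no tag translation table is needed.
fn widen(n: Narrow) -> Wide {
    let payload = match n.tag {
        // Safety: the tag records which field was last written.
        TAG_U8 => WidePayload { v_u8: unsafe { n.payload.v_u8 } },
        TAG_U32 => WidePayload { v_u32: unsafe { n.payload.v_u32 } },
        _ => unreachable!("Narrow only ever holds u8 or u32"),
    };
    Wide { tag: n.tag, payload }
}

fn as_u32(w: &Wide) -> Option<u32> {
    if w.tag == TAG_U32 {
        Some(unsafe { w.payload.v_u32 })
    } else {
        None
    }
}

fn as_u64(w: &Wide) -> Option<u64> {
    if w.tag == TAG_U64 {
        Some(unsafe { w.payload.v_u64 })
    } else {
        None
    }
}
```

If tags were instead assigned by position within each TLS, `widen` would have to remap them whenever the variant order differed between the two types.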
Matching on TLS will be the same as for usual enums:
match val: u32 | u64 | () {
v: u32 => { .. },
_: u64 => { .. },
() => { .. }, // type ascription can be omitted
}
// see motivation example
match foo(val) {
Ok(n) => { .. },
Err(NoneError) => { .. },
Err(err: ParseIntError) => { .. }
}
match val: u32 | u64 | () {
v: u32 | u64 => { .. },
() => { .. },
}
match val: u32 | u64 | () {
v: u64 => { .. },
// without an explicit type ascription `v` will have type
// `u32 | u64 | ()`
v => { .. },
}
In the last example we do not change the type of v to u32 | () for several reasons:
- Simplicity of implementation
- Potentially incompatible memory layouts
- Being coherent with the existing match behavior
- Reducing the amount of code breakage on match arm changes.
TLS will implement a minimal set of object-safe traits of the included types. In other words, a trait is implemented for a TLS only when all of its variant types implement it and the trait is object-safe.
Object-safety is a strong restriction, but it will make it easier to start. In the future this restriction can be relaxed. (e.g. nothing forbids using consuming methods on TLS, but associated types and constants probably should be forbidden)
Generic code
One of the issues often mentioned in discussions for similar proposals is problems with generic code, for example:
// what will happen if U == V?
fn foo<U, V>(val: U | V) {
match val {
v: U => { .. },
v: V => { .. },
}
}
Arguably, considering monomorphization, this code is very similar to the following:
// `get_id` is a static method of `MyTrait`
fn foo<U: MyTrait, V: MyTrait>(input: u32) {
match input {
n if U::get_id() == n => { .. },
n if V::get_id() == n => { .. },
_ => { .. },
}
}
In other words we can “solve” this problem by utilizing the fact that match arms are evaluated and executed in order, so if U and V are the same type (e.g. u32), foo will get monomorphized into the following function:
fn foo2(val: u32) {
match val {
v: u32 => { .. },
v: u32 => { .. },
}
}
It’s obvious that only the first arm will be executed and the second one will always be ignored (and removed by the optimizer). In practice it’s highly unlikely that elimination of the second arm will cause any significant problems.
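The ordering argument can be tried out on stable Rust by emulating type-ascription arms with `dyn Any` checks. This is only an emulation under that assumption, and `which_arm` is a hypothetical name.

```rust
use std::any::Any;

// Emulates `match val { v: U => .., v: V => .. }`: the checks run in order,
// so when U and V are the same type the second branch is never reached.
fn which_arm<U: 'static, V: 'static>(val: &dyn Any) -> &'static str {
    if val.is::<U>() {
        "first arm"
    } else if val.is::<V>() {
        "second arm"
    } else {
        "no arm"
    }
}
```

Calling `which_arm::<u32, u32>(&1u32)` selects the first branch, while `which_arm::<u32, u64>(&1u64)` selects the second.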
If one of the generic types is itself a TLS, the case is handled by the ability to include TLSes in match arms (see the third match example). If U = A | B and V = B | C, then we’ll get the following code:
// (A | B) | (B | C) == (A | B | C)
fn foo(val: (A | B) | (B | C)) {
match val {
v: A | B => { .. },
v: B | C => { .. },
}
}
It’s obvious that a variant of type B will always go to the first arm, and the second arm will only ever receive a variant of type C. So the code will work without any problems, and the result will be predictable.
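The overlapping case can be emulated the same way on stable Rust. The unit structs `A`, `B`, `C` and the `dispatch` function are hypothetical stand-ins, with `dyn Any` checks playing the role of the ascribed arms.

```rust
use std::any::Any;

struct A;
struct B;
struct C;

// Emulates `match val { v: A | B => .., v: B | C => .. }`.
fn dispatch(val: &dyn Any) -> &'static str {
    if val.is::<A>() || val.is::<B>() {
        "A | B arm"
    } else if val.is::<B>() || val.is::<C>() {
        // The `B` check here can never succeed: `B` was already taken above.
        "B | C arm"
    } else {
        "unmatched"
    }
}
```

A value of type `B` always lands in the first branch, so the second branch effectively handles only `C`.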
TLS as a generalized enum
As was mentioned, TLS can be viewed as a generalization of enum:
// this enum
enum A {
Foo,
Bar(u32),
Baz{f: u64},
}
// can be (theoretically) desugared as
struct Foo;
struct Bar(u32);
struct Baz{f: u64}
type A = Foo | Bar | Baz;
This will naturally solve problem of enum variant types which is targeted by RFC 2593.
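For a sense of what "variant types" buy, here is a sketch of the closest stable-Rust approximation: one struct per variant plus a hand-written wrapper enum. The wrapper layer, `double_bar` and `describe` are hypothetical and exist only in this illustration.

```rust
struct Foo;
struct Bar(u32);
struct Baz {
    f: u64,
}

// Hand-written wrapper; under the proposal this would be `type A = Foo | Bar | Baz;`.
enum A {
    Foo(Foo),
    Bar(Bar),
    Baz(Baz),
}

// Because `Bar` is a real type, a function can demand exactly that variant.
fn double_bar(b: &Bar) -> u32 {
    b.0 * 2
}

fn describe(a: &A) -> String {
    match a {
        A::Foo(_) => "Foo".to_string(),
        A::Bar(b) => format!("Bar({})", double_bar(b)),
        A::Baz(z) => format!("Baz({})", z.f),
    }
}
```

With an ordinary `enum A { Foo, Bar(u32), Baz { f: u64 } }` there is no type that `double_bar` could name for "only the `Bar` variant".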
Type inference
Let’s take a look at the following code:
fn call_a_function<A>(value: A | (), f: impl Fn(A | ()) -> A) { .. }
fn main() {
let x: i32 | () = become 1i32;
call_a_function(x, |x| Clone::clone(&x));
}
Currently the type checker will not be able to handle such cases (see @ExpHP comment here), so in the beginning such non-deterministic cases can result in a compilation error. Later the type checker can become smarter in a backwards-compatible way and deduce the minimal type for cases like this.
become vs enum impl Trait vs implicit conversion
Some proposals introduce enum impl Trait syntax, which will be used like this:
fn foo() -> enum impl Display {
if cond() {
1u32
} else {
2u64
}
}
I believe this approach unnecessarily clutters the function signature. The inner workings of an impl Trait value should be an internal detail, which should not matter to users.
As for implicit conversion proposals, I believe they are too magic for Rust. Additionally they can result in bugs for traits which are implemented for (). For example, the following code with an easy-to-miss mistake will compile without any issues under such an approach:
fn foo() -> enum impl Debug {
if cond() {
1u32
} else {
2u64
};
}
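The same class of mistake can already be reproduced with plain impl Trait on stable Rust when both branches share a type. In this sketch `cond` is a parameter rather than a free function (an assumption), and both branches use `u32` because today's Rust has no implicit conversion between branch types.

```rust
use std::fmt::Debug;

fn foo(cond: bool) -> impl Debug {
    if cond {
        1u32
    } else {
        2u32
    }; // <- the stray semicolon discards the value, so the body evaluates to `()`
}
```

Since `()` implements `Debug`, this compiles, and the function silently returns `()` instead of a number.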
Thus we need an explicit conversion inside the code, and a prefix keyword is the most straightforward option. Alternatively, postfix variants can be used as well, like expr@convert, expr.convert!(), etc. (see discussions in this issue)
Extension: const variants
One possible extension of this feature is adding const variants:
type Foo = u8 | const 1u32 | const 2u32;
// note that 1u8 != 1u32; if N = 1, this TLS gets unified into `1u8 | 1u32`
type Bar<const N: u32> = 1u8 | 1u32 | const N;
match val: Foo {
v: u8 => { .. },
1: u32 => { .. },
2: u32 => { .. },
}
match v: Bar {
1u8 => { .. },
1u32 => { .. },
N: u32 => { .. },
}
Unresolved questions
- Name bikeshedding.
- Should we automatically implement Into and From traits for TLS variants? (i.e. From trait will be baked into language) What about sub-sets? (i.e. From<A | B> for A | B | C)
- Interaction with lifetimes.