Pre-RFC: sum-enums

ExpHP · November 10, 2018, 2:05am

fn call_a_function<A>(
    value: enum(A),
    function: fn(enum(A)) -> A,
)

fn main() {
    let x: enum(i32) = 1i32.into();
    call_a_function(
        x,
        |x| Clone::clone(&x),
    );
}

When type inference encounters call_a_function, it introduces a type inference variable.

// an approximation of what the compiler sees
// (after skipping some boring steps)
let x: enum(i32) = 1i32.into();
call_a_function::<?0>(
    x,
    |x: enum(?0)| Clone::clone(&x),
)

When it sees that x is an argument, it unifies the type of enum(A) with enum(i32). It can follow one of three strategies here:

It can proceed the way it always has for any other type constructor and assume ?0 = i32. Type checking will fail when the function produces enum(i32) instead of i32.
It can do something it has never done before, and say that the type of ?0 is one of i32 or enum(i32). Type checking will succeed because the function produces enum(i32). However, I suspect this strategy will lead to exponential blowup.
It can refuse to type-check this example.

Which path shall it follow?

newpavlov · November 10, 2018, 11:59am

You are missing the fact that enum(A) == A (see properties in the OP), so call_a_function will look for compiler like that:

fn call_a_function<A>(
    value: A,
    function: fn(A) -> A,
)

OF course A can be a sum-enum, but it does not change anything.

ExpHP · November 10, 2018, 1:09pm

You are right, I did miss that. But enum<A, ()> is still idempotent, so replace my example with

fn call_a_function<A>(
    value: enum(A, ()),
    function: fn(enum(A, ())) -> A,
)

fn main() {
    let x: enum(i32, ()) = 1i32.into();
    call_a_function(
        x,
        |x| Clone::clone(&x),
    );
}

Nemo157 · November 10, 2018, 1:13pm

@ExpHP Do you mean enum(A, !) is idempotent? enum(A, ()) is an alternative representation for Option<A> where A != (). Your example should compile with enum(A, !) as well as it is special-cased to be equivalent to enum(A) (which is then equivalent to A) in the OP as well.

ExpHP · November 10, 2018, 1:16pm

No, I am referring to the fact that enum(enum(A, ()), ()) = enum(A, ()).

enum(A, !) satisfies this property too, but the fact that it is equal to A gives it the same problem as the original enum(A) example.

Nemo157 · November 10, 2018, 1:18pm

Ah, so sort of like enum(A, ()) where A: enum(()) = enum(A) (if enums were sub-typable), given an enum containing the () type adding another () to it is idempotent.

ExpHP · November 10, 2018, 1:21pm

I think the confusion might have been over the meaning of the word “idempotence.” I understand this word to describe a function f such that

f(f(x)) = f(x)

I.e. it is a function that can be meaningfully applied only zero or one times.

Nemo157 · November 10, 2018, 1:24pm

Yes, I think I jumped to identity function rather than idempotence when first reading your post. Re-reading it now idempotence is definitely correct, sorry for the side-track.

newpavlov · November 10, 2018, 2:41pm

So compiler takes ?0 = enum(i32, ()), now for function we have fn(enum(enum(i32, ()), ())) -> enum(i32, ()), which after sum-enum flattening becomes fn(enum(i32, ())) -> enum(i32, ()), so type checking should pass without problems. Yes, it will require an additional “flattening” step after monomorphization, but I believe it should be doable?

UPD: I think I am starting to get the issue, you mean it will be difficult to process cases like enum(?0, ()) = enum(i32, ()) && ?0 = enum(i32, ()), as system will have to be able to deduce ?0 value. And code could produce significantly more complex systems?

ExpHP · November 10, 2018, 2:50pm

But how does it know to do this just by unifying the types enum(?0, ()) and enum(i32, ())?

Why would it not choose ?0 = i32?

Kind of, although I picture it happening in the other direction due to the in-order nature of the type checker. i.e. at the point that it unifies enum(?0, ()) with enum(i32, ()), it needs to produce a "nondeterministic" result that ?0 is somehow either enum(i32, ()) OR i32.

newpavlov · November 10, 2018, 3:07pm

Well, one solution will be to require explicitly providing exact types via turbofhish if on unification type checker hits nondetermenistic case. In future if there will be a need, it will be possible to make typechecker smarter in a backwards compatible way.

skysch · November 12, 2018, 7:01pm

Could this be treated the same way impl Trait is? That is, instead of thinking of enum(A, B) as a concrete type, would it be useful to think of it as an opaque object which provides the desired semantics?

lordan · November 13, 2018, 10:56am

One addition I’d like to see is an ability to optionally specify the discriminator, as an integral type, or - more interestingly - another enum. Using an alternative syntax:

type TwoTypes = enum(u8) Foo | Bar;   // or "enum(Foo | Bar; u8)", or ...

In the above, the discriminator is of size u8, values assigned automatically. Going further:

type FourTypes = enum(TwoTypes) Foo | Bar | Baz | Quz;
type ThreeTypes = enum(FourTypes) Bar | Baz | Quz;

Here, FourTypes expands on TwoTypes discriminator and uses the same discriminator values for the same types TwoTypes does. Hence, TwoTypes can seamlessly be assigned or referred-to (non-mutably) wherever FourTypes is expected.

Conversely, ThreeTypes is a subset of FourTypes and can also be used wherever the latter is expected.

However:

type MixedTypes = enum(ThreeTypes) Bar | Baz | Other;

This re-uses the discriminator size and values for Bar and Baz, but both omits Quz and adds Other. IOW, while this would compile fine, MixedTypes cannot be used in the place of ThreeTypes, nor vice versa.

lordan · November 13, 2018, 11:50am

I’ve previously wanted to make a proposal very similar to this. Some pitfalls I encountered (and please correct me where I’m wrong!):

struct Point3D<T>(T, T, T);

Here, Point3D<enum(u32, u64)> is not the same as enum(Point3D<u32>, Point3D<u64>)! The latter is probably what you’d want.

// lifetimes?
type SameOrNot = enum(&'a i32, &'b i32);    # Same?

// references
fn foo(a: &enum(u16, u32, u64)) { ... }
let b: enum(u16, u64) = ...
foo(&b)     // possible? Feels like it should be.
foo(&5u32)  // possible? Feels like it should be.

Regarding the last point:

&mut reference: not possible, as the code may want to change the “enum”
read-only reference could be solved via “fat references”, i.e., a pair (&discriminator, &data):
- compatible enum: refer to discriminator and data normally
- compatible types but incompatible discriminator: create temporary discriminator to point to
- single value, as in the example: use a 'static discriminator

Regarding “Sum-enums as a generalized enums”:

~~Option<Option<T>> has layout [0..1][0..1]<u8>, i.e., be of size 3, while enum(Some(Some(u8)), Some(None), None) would have a single discriminator: [0..2]<u8>, i.e., have size 2.~~

mcy · November 13, 2018, 10:09pm

False. Thanks to @eddyb's work, the compiler will collapse the discriminant of the first option into the second. Thus, mem::size_of::<Option<Option<u8>>>() == 2.

lordan · November 14, 2018, 10:48am

Cool! Good to know!

mcy · November 14, 2018, 3:47pm

If you’re interested, I suggest digging up the PR for niche detection. It’s much fancier than just discriminant coallesion!

scottmcm · November 14, 2018, 9:25pm

Also, if you’re interested in the layout optimizations, you might like

newpavlov · November 14, 2018, 9:38pm

A bit off-topic: unfortunately cases like this do not get optimized yet.

oooutlk · November 21, 2018, 3:26pm

I wrote a project named enumx, implementing anonymous enum in Rust, FYI:announcement

Topic		Replies	Views
Pre-RFC: type-level sets	18	2085	May 1, 2019
pre-RFC: anonymous enums language design	13	5499	March 25, 2019
Pre-RFC: Anonymous variant types language design	93	6147	March 25, 2019
[Pre-RFC] sum/union types language design	9	1138	May 13, 2023
Better enums language design	8	2413	October 9, 2022

Pre-RFC: sum-enums

Related topics