Pre-RFC: sum-enums


#62

The parser is not aware of such semantic information at that moment. To resolve the ambiguity you simply pick a parse; you assume that it means A : (B | C) and let the user disambiguate if they want the other interpretation.


#63
fn call_a_function<A>(
    value: enum(A),
    function: fn(enum(A)) -> A,
)

fn main() {
    let x: enum(i32) = 1i32.into();
    call_a_function(
        x,
        |x| Clone::clone(&x),
    );
}

When type inference encounters call_a_function, it introduces a type inference variable.

// an approximation of what the compiler sees
// (after skipping some boring steps)
let x: enum(i32) = 1i32.into();
call_a_function::<?0>(
    x,
    |x: enum(?0)| Clone::clone(&x),
)

When it sees that x is an argument, it unifies the type of enum(A) with enum(i32). It can follow one of three strategies here:

  • It can proceed the way it always has for any other type constructor and assume ?0 = i32. Type checking will fail when the function produces enum(i32) instead of i32.
  • It can do something it has never done before, and say that the type of ?0 is one of i32 or enum(i32). Type checking will succeed because the function produces enum(i32). However, I suspect this strategy will lead to exponential blowup.
  • It can refuse to type-check this example.

Which path shall it follow?


#64

You are missing the fact that enum(A) == A (see properties in the OP), so call_a_function will look for compiler like that:

fn call_a_function<A>(
    value: A,
    function: fn(A) -> A,
)

OF course A can be a sum-enum, but it does not change anything.


#66

You are right, I did miss that. But enum<A, ()> is still idempotent, so replace my example with

fn call_a_function<A>(
    value: enum(A, ()),
    function: fn(enum(A, ())) -> A,
)

fn main() {
    let x: enum(i32, ()) = 1i32.into();
    call_a_function(
        x,
        |x| Clone::clone(&x),
    );
}

#67

@ExpHP Do you mean enum(A, !) is idempotent? enum(A, ()) is an alternative representation for Option<A> where A != (). Your example should compile with enum(A, !) as well as it is special-cased to be equivalent to enum(A) (which is then equivalent to A) in the OP as well.


#68

No, I am referring to the fact that enum(enum(A, ()), ()) = enum(A, ()).

enum(A, !) satisfies this property too, but the fact that it is equal to A gives it the same problem as the original enum(A) example.


#69

Ah, so sort of like enum(A, ()) where A: enum(()) = enum(A) (if enums were sub-typable), given an enum containing the () type adding another () to it is idempotent.


#70

I think the confusion might have been over the meaning of the word “idempotence.” I understand this word to describe a function f such that

f(f(x)) = f(x)

I.e. it is a function that can be meaningfully applied only zero or one times.


#71

Yes, I think I jumped to identity function rather than idempotence when first reading your post. Re-reading it now idempotence is definitely correct, sorry for the side-track.


#72

So compiler takes ?0 = enum(i32, ()), now for function we have fn(enum(enum(i32, ()), ())) -> enum(i32, ()), which after sum-enum flattening becomes fn(enum(i32, ())) -> enum(i32, ()), so type checking should pass without problems. Yes, it will require an additional “flattening” step after monomorphization, but I believe it should be doable?

UPD: I think I am starting to get the issue, you mean it will be difficult to process cases like enum(?0, ()) = enum(i32, ()) && ?0 = enum(i32, ()), as system will have to be able to deduce ?0 value. And code could produce significantly more complex systems?


#73

But how does it know to do this just by unifying the types enum(?0, ()) and enum(i32, ())?

Why would it not choose ?0 = i32?

Kind of, although I picture it happening in the other direction due to the in-order nature of the type checker. i.e. at the point that it unifies enum(?0, ()) with enum(i32, ()), it needs to produce a “nondeterministic” result that ?0 is somehow either enum(i32, ()) OR i32.


#74

Well, one solution will be to require explicitly providing exact types via turbofhish if on unification type checker hits nondetermenistic case. In future if there will be a need, it will be possible to make typechecker smarter in a backwards compatible way.


#75

Could this be treated the same way impl Trait is? That is, instead of thinking of enum(A, B) as a concrete type, would it be useful to think of it as an opaque object which provides the desired semantics?


#76

One addition I’d like to see is an ability to optionally specify the discriminator, as an integral type, or - more interestingly - another enum. Using an alternative syntax:

type TwoTypes = enum(u8) Foo | Bar;   // or "enum(Foo | Bar; u8)", or ...

In the above, the discriminator is of size u8, values assigned automatically. Going further:

type FourTypes = enum(TwoTypes) Foo | Bar | Baz | Quz;
type ThreeTypes = enum(FourTypes) Bar | Baz | Quz;

Here, FourTypes expands on TwoTypes discriminator and uses the same discriminator values for the same types TwoTypes does. Hence, TwoTypes can seamlessly be assigned or referred-to (non-mutably) wherever FourTypes is expected.

Conversely, ThreeTypes is a subset of FourTypes and can also be used wherever the latter is expected.

However:

type MixedTypes = enum(ThreeTypes) Bar | Baz | Other;

This re-uses the discriminator size and values for Bar and Baz, but both omits Quz and adds Other. IOW, while this would compile fine, MixedTypes cannot be used in the place of ThreeTypes, nor vice versa.


#77

I’ve previously wanted to make a proposal very similar to this. Some pitfalls I encountered (and please correct me where I’m wrong!):

struct Point3D<T>(T, T, T);   

Here, Point3D<enum(u32, u64)> is not the same as enum(Point3D<u32>, Point3D<u64>)! The latter is probably what you’d want.

// lifetimes?
type SameOrNot = enum(&'a i32, &'b i32);    # Same?

// references
fn foo(a: &enum(u16, u32, u64)) { ... }
let b: enum(u16, u64) = ...
foo(&b)     // possible? Feels like it should be.
foo(&5u32)  // possible? Feels like it should be.

Regarding the last point:

  • &mut reference: not possible, as the code may want to change the “enum”
  • read-only reference could be solved via “fat references”, i.e., a pair (&discriminator, &data):
    • compatible enum: refer to discriminator and data normally
    • compatible types but incompatible discriminator: create temporary discriminator to point to
    • single value, as in the example: use a 'static discriminator

Regarding “Sum-enums as a generalized enums”:

Option<Option<T>> has layout [0..1][0..1]<u8>, i.e., be of size 3, while enum(Some(Some(u8)), Some(None), None) would have a single discriminator: [0..2]<u8>, i.e., have size 2.


#78

False. Thanks to @eddyb’s work, the compiler will collapse the discriminant of the first option into the second. Thus, mem::size_of::<Option<Option<u8>>>() == 2.


#79

Cool! Good to know! :slight_smile:


#80

If you’re interested, I suggest digging up the PR for niche detection. It’s much fancier than just discriminant coallesion!


#81

Also, if you’re interested in the layout optimizations, you might like


#82

A bit off-topic: unfortunately cases like this do not get optimized yet.