Haskell style data constructors (GADTs) for rust enums

The Problem

In haskell we can write:

data Maybe a = Just a | Nothing

And similarly in rust we can write:

enum Maybe<T> {
    Just(T),
    Nothing
}

But, in haskell the above example is just syntactic sugar for:

data Maybe a where
    Just :: a -> Maybe a
    None :: Maybe a

Which isn't the case in rust.

This generalization of data constructors lets us do things like creating a vector with a type-level length:

data Zero
data Succ a
data Vector l a where
    Nil :: Vector Zero a
    Cons :: a -> Vector l a -> Vector (Succ l) a

-- length is included as part of the type!
x :: Vector (Suc (Suc (Suc Zero))) Int
x = 1 `Cons` 2 `Cons` 3 `Cons` Nil

This is somewhat possible in rust using private traits:

// Makes it possible to use the `Vector` trait outside of the
// module where its defined without being able to implement it.
trait PrivateVector<T, L> {}
pub trait Vector<T, L>: PrivateVector<T, L> {}
impl<T, L, V: PrivateVector<T, L>> Vector<T, L> for V {}

pub struct Zero;
pub struct Succ<T>(std::marker::PhantomData<T>);

pub struct Nil {}
impl<T> PrivateVector<T, Zero> for Nil {}

pub struct Cons<T, L, V: Vector<T, L>>(T, std::marker::PhantomData<L>, V);
impl<T, L, V: Vector<T, L>> PrivateVector<T, Succ<L>> for Cons<T, L, V> {}

But, besides it being much more verbose, this method is much more limiting because it doesn't allow us to do pattern matching:

fn some_fn<T, L, V: Vector<T, L>>(vector: V) {
    match vector {
        // Rust thinks this is a variable named `Nil`
        Nil => (),
        // mismatched types: expected type parameter `V`
        // found struct `Cons<_, _, _>`
        Cons(_, _, _) => (),
    }
}

NOTE: you can kind of achieve pattern matching with traits but it would essentially require you to include multiple constraints for every function including all the traits you need to pattern match that specific one.

The Suggestion

There are two possibilities here in my eyes, one is to allow pattern matching on generic values:

fn some_fn<T>(value: T) {
    match value {
        Option::Some(value) => (),
        Result::Err(value) => (),
        // default case is required because `value` could literlly be anything
        _ => ()
    }
}

And then we could use that for the specific case of GADTs:

fn some_fn<T, L, V: Vector<T, L>>(vector: V) {
    match vector {
        Nil => (),
        Cons(_, _, _) => (),
        // the compiler recognizes `value` can be a finite amount of things
        // so a default case isn't required.
    }
}

Which is unlikely given rust's low level nature.

Or, alternatively, we can extend the enum syntax to make haskell-style data constructors possible:

struct Zero;
struct Succ<T>(std::mem::PhantomData<T>);

// proposed syntax
enum Vector<T, L> {
    // notice generics have to be specified for each varient seperately
    Nil<T> -> Vector<T, Zero>,
    Cons<T,L>(T, Box<Vector<T,L>>) -> Vector<T, Succ<L>>
}

Pattern matching is then done as usual:

// no need to deal with dynamic traits, which is a bonus too
fn some_fn<T, L>(vector: Vector<T,L>) {
    match vector {
        // Here it should be inferred that L=Zero
        Nil => (),
        // Here it should be inferred that L=Succ<_>
        Cons(_,_) => ()
    }
}

This syntax can also finally enable a safe implementation of a type equality type:

enum TypeEq<T,K> {
    Refl<T> -> TypeEq<T,T>
}
impl<T,K> TypeEq<T,K> {
    fn cast(&self, value: T) -> K {
        match &self {
            // it is inferred that this case can only be reached when T=K
            Refl => value
        }
    }
}

Conclusion

GADTs (which is what haskell's data constructors are) are a really powerful tool for expressing types that more thoroughly check our values out of the box, without the need for the parse pattern or complex traits that make dealing with the actual values that use them a pain.

What do you think? Is this feature too complicated for a compiled language? Should this be an entirely new data type, separately from enum? Is there an already a way to do this using the existing type system that I missed?

Subtyping via lifetimes really complicated this issue. Something Haskell doesn't deal with. How do you think this would interact with subtyping?

1 Like

If I understand what you're asking correctly, If think requiring lifetimes in a constructor to also appear as part of the whole type could solve this issue:

// causes type error
enum NullableReference<T> {
    Exists<T, 'a>(&'a T) -> NullableReference<T>,
    Null<T> -> NullableReference<T>
}
// passes type checker
enum NullableReference<T, 'a> {
    Exists<T,'a>(&'a T) -> NullableReference<T, 'a>,
    Null<T,'a> -> NullableReference<T>
}

Although, in case this I misunderstand you, could you provide an example of a situation where subtyping could be problematic?

pretty sure you can do this via the typestate pattern and encapsulating methods, but you lose the ability to pattern match against the original type.

in general, it's much less common for rust crates to expose their ADTs the same way Haskell packages do, so features like this wouldn't be as useful (and they already aren't used much in haskell)

I felt motivated by this post to write a compiling version of the code right below this statement:

the code I wrote builds on a TypeEq witness to implement pattern matching over whether the list head is a Cons or Nil: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=fe943e493bdada5bd25d91a0ee2bcd34

pub fn sum_all<T: Vector<Elem = u64>>(vector: T) -> u64 {
    match T::VEC_WIT {
        VectorWit::Nil(_te) => 0,
        VectorWit::Cons(te) => {
            // vector: Cons<u64, T::Next, T::L>
            let vector = te.coerce(vector);
            vector.elem + sum_all(vector.next)
        }
    }
}

While this was fun to write, I don't see being able to do linked lists with a Length generic argument as being a compelling usecase for adding GADTs to Rust.

The only usecase I managed to find for type witnesses (a subset of GADTs) is limited emulation of trait method polymorphism in stable generic const fns. This will be unnecessary once const traits get stabilized.

Update: I continued to find this fun to make to the point that I made the nlist crate (work in progress).

2 Likes