Blog post series: Alternative type constructors and HKT

There was some discussion on similar themes back on ye olde default type parameters RFC.

I would be fine with this. Rust’s type system is already Turing complete. The only case where I would have a problem is if local type inference became impossible.

1 Like

Is there a current example that demonstrates this? The only ones I'm aware of stopped working before 1.0.

Syntactic issue: for<..> is deeply unwelcoming syntax that makes bounds very frightening to most users.

Intuitively, it’d be nice to allow users to just introduce variables:

fn foo<T>(iter: T) where T: Iterable, T::Iter<'a>: Iterator<&'a str> { ... }
// equivalent to
fn foo<T>(iter: T) where T: Iterable, for<'a> T::Iter<'a>: Iterator<&'a str> { ... }

However, this presents a very serious problem. If the name for your variable is already in scope, we cannot interpret this as ‘shadowing’ and must instead interpret it as applying that variable. Accidentally reusing a name that’s already in scope makes your code mean something very different. For example:

fn foo<'a, T>(iter: &'a T)  where T: Iterable, T::Iter<'a>: Iterator<&'a str> { ... }
// very different from
fn foo<'a, T>(iter: &'a T) where T: Iterable, T::Iter<'b>: Iterator<&'b str> { ... }

Lifetime ellisions make this even more acute.

Introducing type parameters with let

We could avoid this by allowing you to introduce new type variables with let. If you shadow with a let, we will treat it as an error (like we already do). For example:

fn foo<T>(iter: T) where T: Iterable, T::Iter<let 'a>: Iterator<&'a str>

This is only a few less characters than the for bound, but it is considerably less frightening in my opinion.

Agreed. It's worth thinking about if we can find a good replacement.

There’s already

impl<'a> Iterator<&'a str> for Iter<'a> { ... }

The idea of having to pre-introduce the variable isn’t really new. What’s new is the particular keyword used to introduce it in this context. I suppose you could do

fn foo<T>(iter: T) where<'a> T: Iterable, T::Iter<'a>: Iterator<&'a str> { ... }

This is somewhat contextually sensitive though, because you can also have

fn foo<T: Trait>(iter: T) where <T as Trait>::Type: OtherTrait { ... }
2 Likes

Why? The binder on the where-clause can just be merged with the binders on each individual bound. This would avoid the scary for syntax while maintaining the same semantics.

However, it might cause some confusion of the "do I put my lifetime on the function or where-clause" sort.

I believe @steven099 meant there could be parsing ambiguities, because the type immediately following the where clause can begin with an open angle bracket.

Putting the bounds on the where is interesting and feels consistent with the current syntax. I’m not sure how much it helps though, need to think abot it more.


Thinking more about why for is a problem: I don’t think the semantics of HRTB are hard to understand; I think most users can sort of intuit that they want this parameter to be “any lifetime” or “any Display type” and so on. There are several problems with the syntax, but my biggest objection is that it draws attention to the fact that an HRTB is different from other trait bounds.

The problem with this drawing attention is that it exposes users to the concept of “higher ranked types,” which is likely expose them to either a highly technical or a poorly explicated description of this concept from a type theoretic / PL perspective. I don’t think you need to understand the theoretic concept of higher rank types in type class constraints to be to effectively use an HRTB.

Associated types are an example of a place in which we successfully avoided this problem. Associated types are an implementation of a fairly advanced type system feature (and I know the implementation has had to deal with a lot of the issues that arise from that). But to most users, they look like a type alias inside of a trait. This simplistic appearance masks a powerful capability for polymorphism that users are able to employ intuitively without thinking at all about what it “means” from a type theoretic perspective.

3 Likes

Some minor corrections. In the first post:

    self.cell.next = Some(Box::new(cell));

I think it should be:

    self.cell = Some(Box::new(cell));

In the second post:

fn floatify_family<F>(ints: &F::Collection<i32>) -> F::Collection<f32>
    where F: CollectionFamily
{
    let mut floats = F::Coll::empty();

I think it should be:

fn floatify_family<F>(ints: &F::Member<i32>) -> F::Member<f32>
    where F: CollectionFamily
{
    let mut floats = F::Member::empty();

And similarly, F::Member not F::Collection in the paragraph beneath that snippet.

2 Likes

A nugget which just occurred to me and which I haven’t seen mentioned: there is some prior art for the “ATCs only, but no HKTs” approach, namely the various MLs, which support abstracting over types-with-parameters in signatures, but do not allow type variables to range over types-with-parameters (a.k.a. HKTs) directly. It might be a good place to look to learn about the tradeoffs. (Probably has to be filtered through the modules vs. type classes prism as well though.)

Niko, in your 3rd blog entry, you describe HKT versus ATCs as having their own tradeoffs and expressiveness potential. Specifically you describe the bounds requirements of hashsets and the non-genericity of bitsets. This reminds me of a similar problem that’s already been solved before in Rust: having a hierarchy of traits to model increasing amounts of “power”. The closure hierarchy of FnOnce, FnMut, and Fn is elegant and has proven to be very useful in Rust because people choose how much flexibility they need based on the problem at hand. For bitsets and Collections vs. HKTCollections, we find the same kind of choice. Let me proceed with some arbitrary definitions: the largest group of types we want to have methods for are things that are “Collections” over some (single) type. These can return iterators, add to their collection, create new empty collections, etc. A subset of these collection types are “Containers”, collections that are capable of holding many different kinds of types and are not bound to any particular one. These have the ability to map over them functor style, implement your floatify function in your second blog post, and most importantly do not require excessive type annotations to constrain the polymorphism/inference.

So if we think of HKTs as being a small improvement to ACTs, we can come up with this hierarchy:

// Collection trait
trait Collection<T> {
    // create an empty collection of this type:
    fn empty() -> Self;
    
    // add `value` to this collection in some way:
    fn add(&mut self, value: T);

    // iterate over this collection:
    fn iterate(&'a self) -> Self::Iter<'a>;
    
    // the type of an iterator for this collection (e.g., `ListIter`)
    type Iter<'a>: Iterator<Item=T>;
}

// "A collection that can contain many different types"
// I take some liberty here with the syntax
trait Container where Self<_>, for<T> Self<T> : Collection<T> {
 ...
}

impl Collection<usize> for BitSet { ... }

impl<T> Collection<T> for Vec<T> { ... }
impl Container for Vec { ... }

Etc.

Orthogonal consideration: you mentioned hashsets. Two people also mentioned constraint kinds / “associated bounds” as a solution. This seems like a pretty big leap in complexity but is certainly as intuitive as associated types are. In fact, because of Rust’s handy “default values for associated types”, this complexity is totally transparent to users until they need or want the power. Again, I take liberties with the syntax here:

trait Container<trait C = Id> where 
    Self<_>, for<T: C> Self<T> : Collection<T> {
 ...
}

impl<T: Hash> Collection<T> for HashSet<T> { ... }
impl Container<Hash> for HashSet { ... }

Where Id is a new built in trait auto-implemented for everything and means nothing (similar to the Id wrapper type as commonly used in Haskell). In essence: if you don’t specify a trait, there is no trait requirement (I would expect this pattern a lot). That particular pattern is also backwards compatible.

I think this is very elegant!

3 Likes

Worth noting Collection<T> and Container is rather similar to Collection<T> and CollectionFamily; the difference I see is that this uses a higher kinded self to automate the injective association between them.

2 Likes

Interesting observation. But I might add this also avoids dummy structs and “backlinking”; the only link here is the data type in question. I suppose this is what you mean by automation, though (at first I thought you just meant the inference).

It also encompasses more cases - you can have a Collection<T> which isn’t constructed by a Container, but you can’t have a Collection<T> which doesn’t have an associated CollectionFamily because of the way the backlink works.

1 Like

The infamous Scala SI-2712 might also be very relevant here. My hazy impression is that the situation in Scala is something along the lines of: they allow arbitrary type-level lambdas (similar to associated type constructors without any ordering/number-of-uses restriction on variables?), but will only infer the cases where you in fact have a type constructor partially applied in-order to some number of type arguments. But I’m not really familiar with it.

I deeply share @gasche’s sentiment that we should get in touch with the people who actually designed and implemented this stuff for Haskell and Scala (and perhaps other languages, if applicable). Increased cooperation across language communities can only be to the benefit of everyone involved. What have we got to gain by going it alone?

7 Likes

I’ve just started reading the first post of the blog series and struggled already to fully understand the background info on traits. So I ran the example and found some typos. Here are my changes to make it compile in case anyone else wants to play with it:

11c11
<     cell: Option<Box<ListCell<T>>
---
>     cell: Option<Box<ListCell<T>>>
32,33c32,33
<         let cell = ListCell { value: value, next: old_head };
<         self.cell.next = Some(Box::new(cell));
---
>         let cell = ListCell { value: value, next: List { cell: old_head } };
>         self.cell = Some(Box::new(cell));
42c42
< pub struct ListIter<'iter, T> {
---
> pub struct ListIter<'iter, T: 'iter> {

@nikomatsakis Can you make these corrections? Your posts were well written, but things like this make it hard to follow posts about complicated concepts; I waste time wondering if I’m wrong and missing something, or whether there’s an actual mistake.

3 Likes

@glaebhoerl @gasche I’d be happy to talk to folks who have worked on these issues in other languages, but I don’t know the best way to reach them. Recommendations?

1 Like

@withoutboats I would just send them an email. You need to decide who is “them” (one good option is to just write to SPJ, which acts as somewhat of a steward on these questions, and trust him to include whoever is relevant to the conversation), but it would also help to have a reasonably compact description of the question and design space you considered – @nikomatsakis’ blog post are thorough and excellent in terms of explaining the design process to the Rust community, but maybe they could be condensed in something that fits in a couple pages to not scare new readers away. On the other hand, you need to keep code examples in, because they are important and useful to follow what you are talking about – especially across communities with different vocabularies and backgrounds.

If you decided to submit a ML workshop talk on higher-kinded types in Rust (the next edition of the ML workshop is in September 2017 in Oxford, UK), you would have to write a two-page extended abstract in the typical dreaded two-column ACM format. That might be a reasonable format to constrain oneself to concisely explain the problem – although you should feel free to sprawl more pages in one-column format, two-column is the worst for code examples.

@glaebhoerl rightly pointed out that the Scala community struggled with these questions as well. To my knowledge the most up-to-date public document on this discussion is the summary by Martin Odersky, Guillaume Martres and Dmitry Petrashko of the several approaches that have been taken to represent higher-kinded types in Dotty, the clean-slate Scala compiler effort: Implementing Higher-Kinded Types in Dotty. You should feel free to contact both groups.

My (fairly uninformed) gut feeling, however, is that the Scala discussion was more concerned with representation within the new Dependent Object Types (DOT) core language/formalization of Scala’s type system, and therefore more likely to draw conclusion specific to Scala’s design and implementation approaches, while the Haskell community tends to have a fairly general approach to these design questions that will transfer more easily to other languages.

2 Likes

Another option might be to send an email to the haskell-cafe@haskell.org mailing list (and/or maybe glasgow-haskell-users) to see who is interested. There are quite a lot of knowledgeable people lurking in the Haskell community (type theorists, compiler implementors, original inventors of things) so casting a wide net might have benefits.

I’m much less familiar with the Scala community, though it seems like a greater proportion of them are active on Twitter.

I was grepping the PDF linked by @gasche to see if it references SI-2712 and found the link to the actual pull request implementing it which further links to an in-depth explanation which are both probably more useful than the link to SI-2712 itself which I had posted earlier.

1 Like