[Analysis / Pre-RFC] Variadic generics in Rust

Too much type theory jargon- just another way to describe those "map and fold on h-lists and sums." :slight_smile:

MLTT - Martin-Löf Type Theory - is a dependent type theory along the same lines as the Simply Typed Lambda Calculus or other members of the Lambda Cube.

Instead of letting you define new types and then write recursive functions over them, MLTT bakes a few types into the theory along with "eliminators" for them- premade operators like map/fold/etc. that a) are general enough to let you write functions over those data types, while also b) still guaranteeing termination.

2 Likes

I think the current lang team thoughts are still that we know there's interest, and it'll probably happen, but that it's likely not a near-term thing:

Maybe there's inspiration to be had from The pi type trilogy · Issue #1930 · rust-lang/rfcs · GitHub -- none of those three were actually merged, but the scoping and splitting were interesting regardless, IIRC.

I think this is one of those features that's so big that it can't really make progress without ignoring syntax for a while. What problems is this trying to solve? Which things should it not support ever? Those are way more important than whether it's ...in Ts or in [...Ts] or whatever.

A few random thoughts:

  • Can I make a variadic Either?
  • Are there value variadics? Or lifetime variadics? or only type variadics?
  • Is this the way to let people index their matricies as m[i, j]? Can we do that without breaking people?
  • What does it mean to put all the variadic types into a struct? Can I only do that as a tuple, or can I make fields for them?
  • Are there any strange interactions between variadics and the usual rust problem-causers (such as DSTs)?
  • Is there some point where it's good to force people over to macros instead of variadics?

(Some of these might be answered in the markdown file; I have to admit I haven't read more that snippits of it.)

4 Likes

Okay. Let's imagine that I have written an analysis which directly addresses these questions and then some (which I have).

What's the next step?

1 Like

IMHO yes, because they are a new concept to be learned and maintained in the compiler, yet don't add anything beyond what tuples already give you.

1 Like

I added the following sections to the document:

First-class types and reflection

While they're not strictly speaking variadics, first-class types are an idea that comes up often enough.

The general idea would be to treat types as values (no different than integers or string) that can be passed to and returned from const functions, eg:

const fn do_something_with(arg: type) -> type;

type my_type = i32;
type my_other_type = do_something_with(my_type);

This is extremely powerful beyond the scope of variadics; first-class types make it possible to create languages inside the language, and to generate entirely new type semantics inside const functions, while using the familiar syntax of Rust code.

In principle, we can even use them to create variadic tuples and variadic generics without any dedicated syntax, extending existing features:

// Hardcoded by the compiler
impl const Iterator<Type> for Tuple {
    const LEN: usize;
    // ...
}

let my_tuple = ...;
for member in my_tuple.iter() {
    member.do_stuff();
}

struct MyTypes<const Types: Vec<Type>> {
    things: (Types),
    optional_things: (Types.map(|Type| Option<Type>)),
}

In general, these proposals are suggested as a replacement for variadic generics. The idea being that we don't need variadics if we can use existing syntax instead, applied to types instead of values.

Avoid first-class types

First-class types are sometimes brought up as a counter-argument against implementing variadics. The reasoning goes that we can implement all of the desired use-cases without adding any new syntax; if we do add a special ...Ts syntax, that syntax will become a special-case, technical debt that will need to be maintained in the compiler and taught to newcomers, without pulling its weight.

Using first-class types, we can be much more flexible about how we handle our types. We can have generics that use imperative rust code to define the types they handle, which means they can be defined on a much broader set of types without additional syntax.

But, for the reasons I listed in previous sections, I'm extremely skeptical of any such proposal.

I think the above reasoning is flawed. While first-class types may appear simpler at a glance, they come with a slew of corner cases, and subtle semantic changes, that would be much more difficult to implement (and maintain in the compiler, and teach to newcomers) than variadics alone. And, not to belabor a point, but post-monomorphization errors are bad and first-class types would bring millions of post-monomorphization errors.

Bluntly speaking, I don't think they're on the table.

3 Likes

In other news, I just posted a thread on r/rust about collecting use-cases for variadics:

(@weiznich @ibraheemdev)

I just saw the comment about parser combinators in the reddit thread. Would something like variadic sum types (an enum with a variant for each type parameter) be possible with this design? The analysis only covers variadic product types (tuples).

Also, I'd like to suggest [T] as a syntax, capturing the meaning "a list/slice of types".

fn foo<[T]>(tuple: (*T)) {}

The *T should work like a spread operator.

Disclaimer: yes, I would very much like to see Rust arrive at first-class types.

However, that's not the issue here. In order to do variadic generics you need to have a list of types (of arbitrary length), and be able to iterate over them (what you call "tuple for" in your proposal).

Rust already has support for lists of types in the form of tuples. My argument is that you could simple use those - in very much the same fashion as outlined in your document - without the need to introduce parameter packs as a new ("non-type") kind of list of types.

I don't think that's the argument, either, at least not mine. Of course you'd need new syntax, but you don't need new concepts.

To be fair, in order to come to that conclusion one would need to provide a rigid definition of the term "variadics" first, followed by all corners of the language they'll have an impact on. I think you'll find that will end up not quite as straightforward as you make it out to be.

3 Likes

Thanks @PoignardAzur for leading the discussion.

One thing that I want to highlight is that most of the examples given in the Reddit thread works because C++ has unconstraint generics (templates), and overloading. I think it would be possible to re-write them in Rust (with the appropriates changes to the language), but we need to be sure that we can do it using traits (constraints generics) and without relying on overloading.

2 Likes

That's not quite true; in general it only applies for sized types:

error[E0277]: the size for values of type `str` cannot be known at compilation time
 --> src/main.rs:4:11
  |
4 |     foo::<(str, str)>();
  |           ^^^^^^^^^^ doesn't have a size known at compile-time
  |
  = help: the trait `Sized` is not implemented for `str`
  = note: only the last element of a tuple may have a dynamically sized type

There are possible workarounds for that -- such as foo::<(PhantomData<str>, PhantomData<str>)>(); -- but that reminds me too much of reference_wrapper to like it.

4 Likes

That is a fair point. But ironically, if we add an unconstrained list of types - whether in the form of parameter packs or some first class [type; N] support - then you have the reverse issue: how to generate variadic tuples out of them without potentially incurring monomorphization errors. OTOH, if all types in such a list are to be Sized only then that would be overly constraining for regular tuples where the last type can be ?Sized.

By using tuples (at least for the time being) that would implicitly be taken care of "for free" and still cover all the goals listed for MVP.

There are plenty of use cases for variadics beyond immediately throwing all the elements into a tuple, and some of them don't share the requirement that all but the last item be Sized.

Consider something as simple as applying a type constructor to each element- if the constructor is & or Box then all the elements can be ?Sized. For example:

fn foo<...Ts: ?Sized>(tuple_of_refs: (...&Ts,))
1 Like

Do we also need to support variadic references in the MVP? (to have a tuple of references with independent lifetime)

fn foo<...'a, ...Ts>(tup: (&`a... Ts...))

I'm curious how you think the Either use case should be handled, eg:

fn split_eithers<...Lefts, ...Rights>(eithers: ...Either<Lefts, Rights>)
    -> (...Option<Lefts>, ...Option<Rights>);

I don't really see a way to express this with existing tuples, without new concepts; even with the zip syntax you wrote earlier.

(it isn't a niche concept: applying type constructors like that is something a lot of use cases would need)

1 Like

I was mainly trying to address the MVP goals that include:

No zipping, flattening, concatenating, or applying type constructors to tuples. (...Ts) is allowed, (...As, ...Bs) , ...Option<As> , ...(As, Bs) , etc, aren't.

with as few changes to the language as possible. It isn't meant as a be-all, end-all one stop solution. (@rpjohnst and @scottmcm already shot holes in the "tuples for everything" theory... :stuck_out_tongue:)

But first I'd ask a question in return: what does ...Either<Lefts, Rights> mean, exactly? When you pin that down, how do you address these cases:

// zip of same-length tuples/type lists
fn zip<...A, ...B>(a: (...A), b: (...B)) -> (...(A, B));   // ??? I suspect
// zip of type-lists of different length, returning (Option<A>, Option<B>) pairs
fn outer_zip<...A, ...B>(a: (...A), b: (...B)) -> ???  // length is max of A, B
// cross-product of tuples, should return all (A, B) pairs between the input lists
fn cross_product<...A, ...B>(a: (...A), b: (...B)) -> ???
// List of entities, return all possible pairs, 
// e.g., for a collision multi-dispatch framework in a game
fn collision_pairs<...Entities>(entities: (...Entities)) -> ???

Then there's more fundamental questions:

  • How would a fully qualified instantiation look like?
let fptr1 = cross_product::<A, B, C, D, E, F>;
//                          ^^^^^^^^^^^^^^^^ where does the first parameter pack end?
// Probably the same issue with your example, too
let fptr2 = split_eithers::<A, B, C, D, E, F>;
  • What is the type signature of a variadic function?
  • How does the mangled name of a variadic function look like?
  • Can you expand a parameter pack next to introducing a new one? (IOW, won't you have to end up using C++'s syntax after all)?

And outside functions:

  • How do variadic structs or traits work? (Note: the syntax seems easy, the details of what it means in terms of additions to the language seem not.)
  • How do variadic tuples as associated types work?
  • @voidc's question regarding variadic enums

Again, I find it hard to discuss this in detail, just with assurances of "it's simple!", without actually having rigid definitions of the additions to the language - which, I think, will end up following C++ parameter pack terminology rather closely.


So to answer your question I would circle back to the first response I gave in the thread: rather than introducing new concepts with arbitrary conventions (like parallel pack expansion) that are limited to variadics, my preferred approach would be to enhance Rust's meta-programming capabilities step-by-step with features that are generally useful, and not just tied to variadics.

For the posed problem my thoughts go towards type arrays or lists (to not be tied to the limitations of tuples), some add'l type reflection to get at generic type parameters, and HKTs. But this is future talk, outside of the scope of any MVP.

(Disclaimer: I actually do like variadics in C++, and think they are one of the more elegant additions to that language. I just find simply porting them over to Rust wholesale to be an "insufficiently ambitious" approach).


In any case, I think the immediate thing for any approach would be to answer @robinm's question regarding references and lifetimes (to which my answer would be: "hellidunno" :stuck_out_tongue: )

3 Likes

Suppose you could say

  • the inside of <> presented as part of a type or fn declaration
    is a pattern that matches
    the inside of <> presented at type or fn usage site

  • the only kind of patterns allowed there in the first iteration of the feature is tuple type constructor, e.g. (..)

  • fn foo<...A, ...B>(..) is invalid same as in C++
    fn foo<(...A), (...B)>(..) is however valid

  • allow limited type level functions

    // a type-level function has no body is defined via Haskell-style pattern matching
    fn f_Pair<A, B> -> (A, B);
    fn f_Either<A, B> -> Either<A, B>;
    fn f_Ref<A> -> &A;
    fn f_Option<A> -> Option<A>;
    
  • and introduce a limited list of built-in type-level higher-order functions like ...map<>, ...zip<>, ...cross<>

  • ...map<> and ...zip<> built-ins implicitly introduce the constraint that their argument type lists are of the same length

  • ... and perhaps ...zip<>, ...map<>, ...cross<> could be extended to also work on values

  • the name of a "type pack" is always written with leading ... making e.g. ...Lefts. This turns ... into a kind of a sigil

  • same for a variables list that has "type pack" as its type

yielding

fn split_eithers<(...Lefts), (...Rights)>(
    ...eithers: ...zip<(...Lefts), (...Rights), f_Either>)
    -> ((...map<...Lefts, f_Option>), (...map<...Rights, f_Option>)) {
    ((...map(...eithers, Either::left)), (...map(...eithers, Either::right))
}

// usage
let Either<u8, &str> eitherA = ..
let Either<i32, &str> eitherB = ..
let result : ((Option<u8>, Option<i32>), (Option<&str>, Option<&str>))
        = split_eithers::<(u8, i32), (&str, &str)>(eitherA, eitherB);

Handling type lists of different length might also be possible by introducing some more complex built-in higher order functions on types/values, however I'm not convinced they are worth the trouble.

I think the general counter-argument from the "incremental" approach is that this ends up adding a lot of special-casing to the language and special semantics to learn.

I personally don't agree with the incremental approach, but I'm not sure I like the idea of a ...cross<> constructor (I initially tried to make a whole chapter about type mappings and ended up scrapping it, so I'm glad we're hashing out relevant concepts).

Also, I'm not sure about the use cases. Intuitively, I'd wager that the vast majority of use-cases only require "mapping" and "zipping", which, using a syntax close to existing macro_rules, could be done without reified type constructors.

On the other hand, @lordan gave a very good example of an use-case for a cross operator (collisions in a game engine). And generally speaking, what I said about taking a giant knife and cutting the possibility space notwithstanding, we probably don't want to commit to a syntax and later realize that "cross variadics" are something people want, but the syntax we committed to makes them awkward.

Like, if it were 100% up to me and I was working on a toy language, I would absolutely take the bet that the cross "operation" wouldn't be useful in real code, and design around mapping and zipping. But as an approach that I'd recommend for writing a RFC, particularly a minimal one... Yeah, we'll probably hedge more. Pick features that don't commit too much, a syntax that's compatible with the idea of variadics-as-tuples-only, and avoid design choices that can't be cancelled with an edition change.

My immediate takeaway is that this is a lot of boilerplate for a simple use case.

1 Like

This wouldn't work because the eithers would be moved into Either::left, so ...map(...eithers, Either::right) would fail to compile. You would need a sort of unzip, or generalized match for variadic tuples

:+1: almighty Yato :slight_smile:

 fn un_either<A, B>(either : Either<A, B>) -> (Option<A>, Option<B>) {..}

 fn split_eithers<(...Lefts), (...Rights)>(
    ...eithers: ...zip<(...Lefts), (...Rights), f_Either>)
    -> ((...map<...Lefts, f_Option>), (...map<...Rights, f_Option>)) {
    ...unzip(...map(...eithers, un_either))
}

That was the idea

The cost of flexibility? I also hoped it would still be intuitively readable
Look, ma, no magic syntax!

Lifetimes could be allowed into the super-wordy syntax I suggested above:

  • allow a lifetime to be used in each place where a type variable is allowed:
// type-level function
fn f_Ref<'a, T> -> & 'a T;

// usage
fn some<(...'a), (...T)>(..) -> (...zip<...'a, ...T, f_Ref>) {...}

a special equivalent syntax can also be provided (and a similar syntax can also work on values):

fn some<(...'a), (...T)>(..) -> (...&<...'a, ...T>) {...}

This special syntax can also be allowed inside fn some<...> as a matching template:

fn some<...&<...'a, ...T>>(..)
// this can be used as
some::<&'d D, &'e E>(..)
// which would make
// ...'a be assigned 'd, 'e
// and
// ...T be assigned D, E

Not for MVP, but demonstrating lifetimes can be added