Pre-RFC: Variadic Generics

Hi, I wrote an RFC for adding variadic generics to Rust which is roughly based on rust-lang/rfcs#2775 and inspired by C++'s variadic templates. I tried to find a syntax that is not ambiguous with range expressions and fits well within the existing pattern matching syntax.

Rendered

Any feedback is very much appreciated

4 Likes

previously in variadic generics

10 Likes

Tuple Helper Functions

I don't think we need to move forward with these just yet, it would be better to get some sort of variadic generics before these extensions.

fn zip<(A @ ..), (B @ ..)>((a: A) @ .., (b: B) @ ..)
    -> ((A, B) @ ..)
{
    ((a, b) @ ..)
}

What happens if the two tuples are different lengths? Hint: it shouldn't be a post-monomorphization error, because that would run counter to how Rust handles generics.

 fn all<T @ ..>((t @ ..): (T @ ..))
    -> bool
    where (T: Into<bool> + Copy) @ ..
{
    (t.into() && ..)
}

It would be better if this took a closure as an argument, but that would require generics closures.

For generic type parameter IDENTIFIER@.. or (IDNTIFIER@..) the identifier declares a type parameter pack

How would this interact with pattern matching? i.e. let x @ .. = ..; is valid syntax right now, as is fn foo(x @ ..: ..) at the very least, this should use ... to get around this, but that will have to be bounded on an edition, because 3 dots is valid in patterns.

fn foo<A@..>(a: A@..)       // => fn foo<A1, A2, B>(a1: A1, a2: A2)
fn foo<A@..>((a: A)@..)     // => fn foo<A1, A2, B>(a1: A1, a2: A2)
fn foo<A@..>(a: (A@..))     // => fn foo<A1, A2, B>(a: (A1, A2))

Are these all missing a B in the comments? Shouldn't it be

fn foo<A@..>(a: A@..)       // => fn foo<A1, A2, B>(a1: A1, a2: A2, a3: B)
fn foo<A@..>(a: (A@..))     // => fn foo<A1, A2, B>(a: (A1, A2, B))
fn foo<A@..>((a: A)@..)     // => fn foo<A1, A2, B>(a1: A1, a2: A2)
fn foo<(A@..)>(a: (A@..))   // => fn foo<(A1, A2), B>(a: (A1, A2))

I don't think that these should be valid syntax. For the first, I don't see the value in two different ways to express the exact same thing. For the second, I don't see the use case for it.

fn foo<A@.., B@..>((a: A)@.., (b: B)@..)         // => ERROR: multiple type parameter packs
fn foo<(A@..), B@..>((a: A)@.., (b: B)@..)       // => ERROR: multiple function argument expansions

instead of making these errors, we could say, in order to specify type parameter packs, you must provide a tuple. This also removes the need for the (A@..) syntax, and makes things unambigious.

foo::<(A, B, C), (D, E, F)>
fn foo<A>((a@..): A)                     // => fn foo<(A1, A2)>((a1, a2): (A1, A2))

This should be an error, we should only allow unpacking type parameter packs, not normal generic parameters.

fn foo<A@(..)>(a: A)                     // => fn foo<(A1, A2)>(a: (A1, A2))

This seems to be closest to my desired syntax. (Which i will show at the end)

Implementation

I don't think we need to cover how to implement this, only how it is specified.

fn foo<A...>(a: A) { todo!() }
fn bar<A..., B...>(a: (...A), b: (...B)) { todo!() }

bar((0, "".to_string()), (0, Bar))
2 Likes

These are just examples. I do not propose to add such functions with this RFC. That's why they are in the Motivation chapter.

Pack expansions and fold expressions with multiple parameter packs mentioned require those to have the same length. (See the last point here)

This only talks about the type parameter lists of generic functions. AFAIK there's currently no form of pattern matching allowed there.

Correct, must have overlooked that.

Regarding the first: I don't like that there's two way to express this either. This comes from the idea to make paranthesis in front of an @.. optional. (See Unresolved Questions) Personally, I would prefer fn foo<A@..>((a: A)@..) as it is more consistent with the rest of the syntax. ((a: A)@.. unfolds to a1: A1, a2: A2, ...Ā“.

Regarding the second: i do not see a use-case for this either. If the parameter pack is used inside the function I would write fn foo<A@(As@..)>(a: A). (A matches the whole tuple, As the single elements) That this is possible just comes from the fact, that you can create a type parameter pack by matching with a tuple type similar to let (head, tail@..) = (1, 2, 3, 4).

This would be possible. Personally I think it would be more intuitive to be able to write and use a variadic function without the need for tuples as e.g. a non-variadic function add::<A, B> would naturally extend to add::<A, B, C> not add::<(A, B, C)>. Also - at least I think that - all functions in Rust that are currently generated using macros could then not be relaced by variadics without breaking the code. I do not know if that would be an issue.

I will change that.

Just for clarification: Here A would be a tuple type, not a type parameter pack. So @(..) is basically a constraint on A to be a tuple type.

I will remove this.

So first, I do not think using ... fits Rust's current syntax. Rust uses .. everywhere else. Also for bar it would be quite unclear how you would take a function pointer to it. Is A... a tuple type? If so, this is not clear from the syntax.

I also briefly discussed this design here.

You could make a type that acts like Vec<dyn Any+Sized>, or really for any dyn Trait+Sized, by decomposing the dyn Trait+Sized fat pointers and exploiting their sizes, all with one heap allocation:

pub struct DynVec { .. }

impl DynVec {
    fn len(&self) -> usize { .. }
    fn push<T: Any+Sized>(&mut self, value: T) { .. }
}

impl Index<usize> for DynVec {
    type Output = dyn Any+Sized;
    fn index(&self, index: usize) -> &Self::Output { .. }
}
impl IndexMut<usize> for DynVec { .. }

impl<'a> IntoIterator for &'a DynAdic {
    type Item = &'a dyn Any+Sized;
    ..
}

I think Index<Range<usize>> cannot work because Index::index returns a fat pointer.

We'd need allocations around varadic methods anyways normally, so maybe such approaches avoid the complexity without undo overhead.

Why would we need allocations? Variadic generics would use monomorphization just like generics already do. Also just like generics they would therefore have no runtime overhead.

I definitely look forward to having variadic generics at some point in the future.

And while I don't want to discourage design on this, I do want to set expectations that I personally* would be surprised if lang were to take action on this in the near future. I suspect that it logistically would end up sequenced behind basic const generics, at least.

* I haven't done anything to confirm this broadly, so don't take this as an official team decision.

7 Likes

This still introduces a potential for post-monomorphization errors, which this pre-RFC under-specifies.

Tweaking your example:

fn zip_and_print<(A @ ..), (B @ ..)>((a: A) @ .., (b: B) @ ..)
{
    println!((a, b) @ ..)
}

There is nothing in the function's declaration which indicates that the function will fail to compile if given mismatched tuples.

Variadics probably need to some way to express "you can't do println!((a, b) @ ..)", because there is no guarantee a and b are the same size inside zip_and_print.

The obvious way to do that is to say "((a, b) @ ..) is only allowed if ((A, B) @ ..) is somewhere in the declaration", but that's completely unprincipled.

A better solution would probably require making a survey of variadic generics in C++ and D, their patterns of use, their anti-patterns, their failure modes, etc, and decide what we want to import into Rust.

The problem with existing "variadic generics" RFCs is that they don't really have a strong story behind them.

6 Likes

It feels about right for me at least :slight_smile: -- variadic generics are fairly low in the list of priorities, as I see it (if we ever do it, it would be in the order of several years from now, imo).

1 Like

This might be a nice feature, but IMHO it should first be thought through very thoroughly. Rust is a powerful language with lots of nice features. However, the sheer amount of features are already overwhelming to beginners sometimes. Variadic generics will most certainly add to that load of things a newbie should learn.

While I didn't read the proposal in detail, similar code should be possible with macro_rules! macros, shouldn't it? It might just not be as ergonomic. Let's not go the C++ route to add 10 different ways to do everything and overload symbols for 10 different contexts. Rust will only be hurt by it. Because of its already large necessary complexity, we should really think twice about adding more of it where it isn't really clearly beneficial.

I think the syntax is unnecessarily verbose, it actually looks like another macro system. Also, ... would be a better symbol, since it's already used for C variadic functions. Here's the alternative syntax I came up with:

pub trait Fn<Args...>: FnMut<Args...> {
    fn call(&self, params: Args...) -> Self::Output {
        func(params...)
    }
}

The above only requires the following features:

  • Variadic type arguments, i.e. trait Fn<Args...>, where Args accepts multiple type arguments that are "collected" into a tuple type
  • Variadic types, i.e. FnMut<Args...>, where Args is a tuple type that is "spread out"
  • Variadic patterns, i.e. fn call(&self, params: Args...), where params accepts a series of values that are "collected" into a tuple
  • Variadic tuples and function calls, i.e. func(params...), where params is a tuple that is "spread out"

This is only a subset of the features you are proposing, but it's simple, coherent, and integrates well with the existing type system, because everything is desugared into tuples:

struct Foo<T...>(T...);

let tuple = (2, 3, '5');
assert_eq!(Foo(1, tuple...), Foo(1, 2, 3, '5'));
assert_matches!(tuple, (2, _...));

This idea could be extended later to allow trait bounds on the tuple items, but it is intentionally very simple.

Prior art: Javascript spread syntax

P.S. this would require a new edition where the ... inclusive range pattern (which currently emits a deprecation warning) is removed.

5 Likes

I wasn't aware of that. In the "Variadic Tuple" RFC I linked in my first post and the discussions on that in this forum const generics were only mentioned once. Also from the discussion I personally got the impression that variadics is a wanted feature.

If a RFC on variadic generics would currently have no chance of being merged anyways, I'm fine with abandoning it. Yet, I would also be happy discussing it if it would have a chance of success. As I'm completely new here, you would have to tell me. But yours and also @Centril's post pretty much give me the impression that it won't - at the moment at least.

Actually, (a, b) @ .. is a pack expansion and therefore requires a and b to have the same size. This in turn also requires A and B to have the same size.

Similar code in C++ would look like this:

template<class ...A, class ...B, size_t ...AIdx, size_t ...BIdx>
void zip_impl(
    std::tuple<A...> a, std::tuple<B...> b,
    std::index_sequence<AIdx...>, std::index_sequence<BIdx...>)
{
    auto zipped = std::make_tuple(std::make_pair(std::get<AIdx>(a), std::get<BIdx>(b))...);
}

template<class ...A, class ...B>
void zip(std::tuple<A...> a, std::tuple<B...> b)
{
    zip_impl(a, b, std::index_sequence_for<A...>{}, std::index_sequence_for<B...>{});
}

when you'd call zip_and_print(std::make_tuple(1, 2, 3), std::make_tuple('a', 'b')) you'd get this error (with clang 9):

test.cpp:25:87: error: pack expansion contains parameter packs 'AIdx' and 'BIdx' that have different lengths (3 vs. 2)
    auto zipped = std::make_tuple(std::make_pair(std::get<AIdx>(a), std::get<BIdx>(b))...);
                                                          ~~               ~~     ^
test.cpp:31:5: note: in instantiation of function template specialization 'zip_and_print_impl<int, int, int, char, char, 0, 1, 2, 0, 1>' requested here
    zip_and_print_impl(a, b, std::index_sequence_for<A...>{}, std::index_sequence_for<B...>{});
    ^
test.cpp:38:5: note: in instantiation of function template specialization 'zip_and_print<int, int, int, char, char>' requested here
    zip_and_print(std::make_tuple(1, 2, 3), std::make_tuple('a', 'b'));
    ^
1 error generated.

The error for your example would be something similar. In C++ you could also explicitly require this using either std::enable_if, static_assert or with C++20 a requires-clause, i.e.

template<class ...A, class ...B>
void zip_and_print(std::tuple<A...> a, std::tuple<B...> b)
    requires sizeof...(A) == sizeof...(B)

That's why I added this section.

I intentionally kept my proposal very close to how C++'s variadic templates work, because that's where I'm coming from and IMHO they work really well and are very intuitive to work with. (I initially stumbled across this because I searched for something similar to Boost.Signals2 in Rust.)

This looks pretty close to rust-lang/rfcs#376. In general, multiple of the prior RFCs on this have already proposed syntax similar to params.... The problem is that even if you could resolve ambiguity by using three dots, this syntax would still be pretty close to range expressions. If param only needs to be a tuple type to be unfolded this way, then (0,).. (which is valid today) and (0,)... would be valid syntax. That's the reason I tried to find a different syntax for this.

But:

I do think that some, or even much, of what I am proposing could be replaced with macros. Personally I think that syntax for this would probably be easier to use. (Again, I'm coming from the world of C++, I'm more used to their syntax.) In RFC#2775 in this post you (@Aloso) suggested a repeat! built-in macro. With that you could write:

fn print<..A>(a: ..A)
{
    repeat!(for v in a { println!(a); })
} 

In my proposal that would simply be (println!(a) @ ..). IMO there at least should be a simple, built-in way to unfold not only parameter packs alone, but expressions as in (a, b) @ .. => (a1, b1), (a2, b2), ....

Yes, but there's nothing in the declaration that requires this. The error would thus be post-monomorphization; in other words, it happens at template instantiation time. In C++, instantiation-time errors are the norm; these days you can stick in requires clauses (and in the past you could use enable_if, static_assert, etc.), but even then, nothing ensures that the requirements you wrote match the actual usage in the template body, so you can still get instantiation-time errors even if the requirements are satisfied. In Rust, the compiler type-checks generics when they're declared; you can only use operations that match the declared trait bounds, so instantiation-time errors are impossible, with only minor exceptions.

10 Likes

I find that a bit confusing, because it's not immediately clear which parts of the expression are unfolded. For example, if you have the following code, and a is a parameter pack, but b is not:

(println!("{}{}", a, b)  @ ..)

Then a would be unfolded, but b would just be repeated. Although this is similar to how macro_rules! works, I'd prefer to be explicit about which values are unfolded.

Also, I'd like to avoid introducing another declarative macro system (which the language team probably wouldn't approve anyway; they are usually against language features that are likely very difficult/time consuming to implement/learn, unless their benefit is remarkable).

1 Like

Thanks for the clarification. The best (and probably easiest) solution I could come up with would be using the following trait (assuming ..A for declaring a parameter pack):

trait SameArityAs<T> {}
impl SameArityAs<()> for () {}
impl<A, (..As), B, (..Bs)> SameArityAs<(A, ..As)> for (B, ..Bs)
    where (..As): SameArityAs<(..Bs)>
{}

with that you could write:

fn zip_and_print<(..A), (..B)>(a: ..A, b: ..B)
    where (..A): SameArityAs<(..B)>
{
    println!((a, b) @ ..)
}

If such unfoldings are then only allowed with this trait bound, this should be an error during monomorphization, correct?

That's the obvious solution, yes, but it's also a band-aid.

What about cases where you want to concatenate tuples? Or combine them in some other non-obvious way?

A proper RFC would need to make a survey of what will and will not be possible with variadic generics. That survey should be based around existing use-cases in other languages, and have rules designed to fail gracefully (that is, the compiler should be able to explain to users where they made a mistake).

This could be solved with a built-in trait, once const generics are stabilized:

pub trait Arity {
    const ARITY: usize;
}

Since specifying the arity is in the future possibilities sections, I guess this is not a priority. However, I have identified some other issues:

  1. I noticed that the grammar is ambiguous in some situations:

    FoldExpression: (Expression OPERATOR ..) | (.. OPERATOR Expression) | (Expression OPERATOR .. OPERATOR ConstantExpression) | (ConstantExpression OPERATOR .. OPERATOR Expression)

    For example, (a / ..) is currently parsed as a / (..) and (.. + a) is parsed as ..(+a).

  2. Parsing multiple paramter packs can be ambiguous:

    fn zip<(A @ ..), (B @ ..)>((a: A) @ .., (b: B) @ ..) -> ((A, B) @ ..)
    

    Invoking it (e.g. zip(4, 2, "", true)) can't be parsed, because we don't know which argument should belong to which paramter pack. This must therefore be forbidden. Also, nested parameter packs like A@..@.. must be forbidden (but not necessarily (A@..).., which is a parameter pack containing variadic tuples).

    This is implied by some examples, but the exact rules aren't explained.

  3. The grammar is incomplete and vague. From the nomenclature section:

    In fn foo<A@..>(a: A)@..) the syntax (a: A)@.. is called an function argument pack expansion .

    Function arguments are patterns, but the proposed grammar only accepts tuple patterns, so fn foo<A@..>((a: A)@..) would be rejected.

    The expression <Expression>@.. is called a variable pack expansion , in this context .. is called expansion operator .

    This part is also missing in the grammar. I can't find any information where variable pack expansions should be allowed. Obviously they're allowed in tuples and function calls, but what about tuple structs, tuple-like enum variants or arrays literals?

  4. The semantics aren't always clear:

    fn foo<A@(..)>(a: A)                     // => fn foo<(A1, A2)>(a: (A1, A2))
    fn foo<A@(..)>(a: (A@..))                // => ERROR: A is not a type parameter pack
    fn foo<A@(As@..)>((a@..): A, b: (As@..)) // => fn foo<(A1, A2)>((a1, a2): (A1, A2), b: (A1, A2))
    

    IIUC, the type parameter A@(..) is equivalent to A with the constraint that A must be a tuple. This isn't explained anyhere.

    Also, I have some questions about the semantics of the following:

    // what does `Bound refer to in these examples?
    fn foo<A@..: Bound>(a: A@..)
    fn foo<(A@..): Bound>(a: A@..)
    fn foo<A@(..): Bound>(a: A@..)
    
    // what does `Bar` refer to in these examples?
    trait Foo<A@.. = Bar> {}
    trait Foo<(A@..) = Bar> {}
    trait Foo<A@(..) = Bar> {}
    
    // should this be allowed?
    fn foo<(A: Bound)@..>(a: A@..)
    fn foo<(A: Bound)@(..)>(a: A@..)
    trait Foo<(A = Bar)@..> {}
    trait Foo<(A = Bar)@(..)> {}
    
  5. For consistency, it should be possible to have variadic, but non-generic types, e.g.

    fn foo(a: ((&str, &str)@..));
    
  6. Should it be possible to have variadic lifetimes? E.g.

    fn foo<('a, 'b)@..>(a: ((&'a str, &'b str)@..));
    
  7. Should it be possible to pass pack expansions to macro_rules! macros? It could look like this:

    macro_rules! foo {
        ($($e0:expr),* $(, $exp:expansion, $($e:expr),*)* $(,)?) => {...};
    }
    foo!(a, b@.., c, d);
3 Likes

I think some of your question show a misunderstanding of what I am fundamentally trying to achieve in my proposal. So maybe first in general: the way I came up with the syntax in my proposal was that in current Rust the following is valid syntax:

let tup@(a, .., b) = ...

and my initial idea was that you could allow to assign an identifier to .. like it is allowed in slices, i.e.. tup@(a, as@.., b). From there I thought it would be intuitive if

  • you would use the same syntax for type parameters and
  • that when you can "collect" a parameter pack (i.e. a comma-separated list of types) by @.. you could also "uncollect" it similarly (into a comma-separated list).

And the most important bit is this rule of thumb:

In any context, (E(a))@.. where E is a type, expression, etc. mentioning a parameter pack a, it can be read as E(a1), E(a2), ..., E(an).

What creates some confusion I think is that I made paranthesis optional on the left side of the @ and specified that @ would match the expression or type as far to the left as possible. This leads to the rather unintuitive situation where a: A@.. is equivalent to (a: A)@... I think it would be a better choice to make the paranthesis mandatory.

Also I found quite some placed where there should be a , to make it an actual tuple expression or pattern. I'll need to fix this.

Applying the "rule of thumb" to your questions:

  • A@..: Bound would theoretically expand to A to A1, A2, ..., An: Bound. This would be disallowed, cause it's unintuitive. To put a bound on every element of the tuple you would write (A: Bound)@..
  • (A@..): Bound expands A to (A1, A2, .., An): Bound, so it's a bound on the tuple type.
  • A@(..) is basically equivalent to the above one, you just did not assign an identifier to .. but to the full tuple. (like you could also do in my general explanation for variables where tup was the full tuple ans as was the parameter pack)
  • (A: Bound)@(..) would be disallowed. The correct syntax here would be A@(..): Bound.
  • For = Bar it's basically the same explanation as above for : Bound.

That's why in my proposal:

fn foo<(A@..), B@..>((a: A)@.., (b: B)@..)       // => ERROR: multiple function argument expansions

This is actually in the rules. Here the outer expansion would have no parameter pack within it's scope, which is forbidden. (You could also formulate this rule the other way around: each parameter pack can only be expanded once.)

How does the proposed grammer only accept tuple patterns? Actually, function parameter are not exactly patterns according to the reference. They are OuterAttribute* Pattern : Type which is why I gave them another name. By the rule of thumb from above fn foo<A@..>((a: A)@..) unfolds to fn foo<A1, A2, ..., An>(a1: A1, a2: A2, ..., an: An).

I think the notion of parameter packs would naturally extend to lifetimes, yes. I haven't considered that in my proposal yet. I'll add this as a todo chapter.

As of now I would put this under future possibilities until the basic syntax is more fleshed out.

I put this under future possibilities because I didn't think of those traits at the time of writing. I will add those to the document.

That's true. I actually think that at least for fold expressions a macro would be better suited. Something like fold!(a with +). I will put this under future possibilities.

I will update my proposal with some of your feedback. But I'll need some time for that.