Variadic generics design sketch

Variadic generics are a long-desired feature in Rust, but the syntax and semantics are not easy to get right. I've sketched out a possible design in a HackMD:

I would appreciate feedback, especially on marked TODOs. Feel free to respond either here, or in comments on the HackMD.


Uses of a variable should be written as either ...vals or ...(val1, val2, val3):

let e: Foo = Foo(...d);
let f: [i32; 3] = [1, ...(2, 3)];
let g: (i32, i32, i32) = (1, ...(2, 3));
let h: Foo = Foo(...(true, false));
let i: [i32; 3] = [1, ...(2, 3)];

but d is wrong.

// wrong
let d: (bool, bool) = (...Foo(true, false),); 

// right
let Foo(...d) = Foo(true, false);

Same for arrays:

// wrong
let _: &[i32; 2] = head;
// right
let _: &[i32; 2] = [...head];

// wrong
assert_eq!(head, [1, 2]);
// right
assert_eq!([...head], [1, 2]);

let [ref mut ...head, tail] = a;

// wrong
let _: &mut [i32; 2] = head;
// right
let _: &mut [i32; 2] = [...head];

And for "static for" patterns:

let t: ((i8, u8), (i16, u16), (i32, u32)) = static for i, u in (-1, -2, -3), (1, 2, 3) {
    (i, u)
};

// wrong
let t: ((i8, u8), (i16, u16), (i32, u32)) = static for ...tup in (-1, -2, -3), (1, 2, 3) {
    // wrong
    tup
    // right
    (...tup)
};

Your second example, Foo(...d), is correct, I agree. But why do you think (...Foo(true, false),) should be rejected?

My intention with the current design was that ...ident patterns should have the same meaning as ident @ .. patterns. Why do you want that to change?

tup is already a tuple, there is no need to splat it. (I've updated the doc to specify this explicitly.)

OK, next: it would be much better to have a unified syntax, like this:

  1. fill "...Ts" with ...<T1, T2>
  2. use "for" for grouping: fill "...for<_>" with <for<'a1, T1>, for<'a2, T2>>
<...Ts> =>   <...<i32, u32, usize>>
<...Ts, ...Us>  =>  <...<i32, u32, &str>, ...<u32, usize>>
<...'as, ...'bs>  =>  <...<'static, '_>, ...<'a, 'b>>
<const ...As: i32, const ...Bs: u32>  =>  <...<-1, 0, 1>, ...<42>>

<...for<'a, T: 'a,  const N: T>>   =>   <...<for<'static, i32, -4>, for<'_, bool, false>>>

And function with grouped

// wrong
let t: (usize, i32) = static for type T in <usize, i32> {
    T::default();
};
// right
let t: (usize, i32) = (...<usize, i32>::default());

In the "Design Principles" section, there is a bullet point:

  • “Explicit is better than implicit.”
    • No implicit iteration! Ugly loops are preferable to beautiful bugs.

The explicit static for loop syntax was very much a conscious choice.
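
For comparison, the "one default value per listed type" example can be approximated in today's Rust with a declarative macro. This is only a sketch of the idea under discussion: defaults_for! is a hypothetical name, it works on a literal list of types rather than on a generic parameter pack, and it says nothing about how the proposal itself would be implemented.

```rust
// Hypothetical helper, not part of the proposal: expands to one
// `Default::default()` call per listed type and collects the results
// in a tuple.
macro_rules! defaults_for {
    ($($ty:ty),+ $(,)?) => {
        ( $( <$ty as Default>::default(), )+ )
    };
}

fn main() {
    // Roughly what `static for type T in <usize, i32, String> { T::default() }`
    // is meant to express in the sketch.
    let t: (usize, i32, String) = defaults_for!(usize, i32, String);
    println!("{:?}", t);
}
```

The repetition is spelled out explicitly at the call site, which is in the spirit of the "no implicit iteration" principle, but the macro cannot be driven by a generic parameter the way static for is intended to be.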


Because Foo is a type (technically a constructor), not a sequence.

By analogy:

// wrong
let d: (bool, bool) = (...Foo(true, false),);
// wrong, but accidentally accepted by the ...(val1, val2, val3) syntax
let d: (bool, bool) = (...(true, false),);
// wrong
let d: (bool, bool) = (...[true, false],);

I think that pattern variables should follow the same rules as ordinary variables, so we need ...ident @ .., not ident @ ..

No, tup is a "sequence".

Do you want loops? OK, then we need something like this:

let mut t: (usize, i32) = 
let t: (usize, i32) = static for x : T in (...<usize, i32>) {
    x = T::default();
};

So far I like it.

You're missing examples of how this would work with where clauses.


While we’re on the topic, I’ll mention that Swift recently added a (currently fairly restricted) form of variadic generics with syntax very different from the C++ ..., and it would at least be worth mentioning as a “considered alternative”.


Cool design. I especially like the type-level for<T in Ts> syntax. Overall it seems like a good exhaustive list of the things you might want from generic variadics.

That said, I think it's maybe trying to do too much at once? Like, it's trying to do variadic types and tuple spreading and varargs in functions. It seems a bit too big for an RFC. For a design sketch, though, it gives a decent goal to navigate towards, so idk.

Something I never see in any of these proposals that I wish people did more is to consider what error messages would look like. What would be forbidden, what would be allowed, how would the lines be drawn and what would be the corner cases. This is especially important if you're introducing compile-time tuple spreading, because that has a lot of potential for post-monomorphization errors.


Awesome to see this picked up again.

I'd suggest adding the Circle language to the list of prior work. It is an extension of C++20 with a lot of design effort put into its metaprogramming, and it leverages variadics extensively. There are a lot of good ideas there that we could learn from.

One thing that I'd like to see is types eventually becoming first-class entities of the language. This would actually greatly simplify this proposal.


Firstly, I really don't like the description of static for loops as "iteration, but guaranteed to be unrolled at compile time"; that implies that it's possible to iterate over the expression after the in at runtime. Rather, I'd prefer to see static for initially explained as syntactic sugar for writing out repetitive blocks of code; in this model, you get:

let _: (Option<u32>, Option<bool>) = static for i in (42, false) {
    if !pre_check() {
        continue None;
    }

    Some(expensive_computation(i))
};
// desugars to
let _: (Option<u32>, Option<bool>) = (
    {
        if !pre_check() {
            continue None;
        }

        Some(expensive_computation(42))
    },
    {
        if !pre_check() {
            continue None;
        }

        Some(expensive_computation(false))
    }
);

This makes it very obvious that while we're copying the forms used for iteration, it's not "iteration, but unrolled at compile time" (and thus you can't expect to use a static for over an ExactSizeIterator or a TrustedLen iterator, even though in principle, you might expect to be able to fully unroll those loops at compile time).
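
The "sugar for repetitive blocks" reading can be imitated in today's Rust with a declarative macro. The sketch below is an illustration only: static_for! and describe are hypothetical names, the macro accepts only a literal tuple of expressions, and it does not model continue or break.

```rust
use std::fmt::Debug;

// Hypothetical stand-in for `static for`: pastes `$body` once per element,
// each time with `$var` bound to that element, and collects the results
// in a tuple. No runtime iteration is involved.
macro_rules! static_for {
    ($var:ident in ($($val:expr),+ $(,)?) $body:block) => {
        ( $( { let $var = $val; $body }, )+ )
    };
}

// Works for any element type that implements `Debug`.
fn describe<T: Debug>(x: T) -> String {
    format!("{:?}", x)
}

fn main() {
    // Expands to two separate blocks, one with `i = 42` and one with
    // `i = false`; the two copies of the body are type-checked independently.
    let t: (String, String) = static_for!(i in (42, false) { describe(i) });
    println!("{:?}", t);
}
```

Because the body is duplicated textually, each copy can see a different element type, which is exactly why this is not iteration over a container.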

There is a problem here desugaring static for loops that contain a break, because you need some way to jump to the end of the loop, but it heads off my objection to iterating over a heterogeneous container (since it's not iteration, it's just a neat way to write repetitive code).

Secondly, I'd like to see more thought about the "weird" combinations. What happens when I have two variadic generics in a single function type? What happens when I try to put a non-variadic argument after a variadic one?

Thirdly, you introduce <...Is: Iterator> without explaining how I'm supposed to interpret it - there's only one obvious interpretation, but I'd like to see you call that out explicitly, in case someone else comes up with a silly interpretation.

Finally, I've spotted a few typos/thinkos, which I've commented on in the HackMD.

The proposed syntax is interesting. It reminds me of argument capture in Rust's declarative macros. I suspect it would have the same cognitive load in non-trivial type expressions.

There is one such: look for fn all_3.

It's already linked from the "Prior Art" section.

Yes, this is 3-4 RFCs minimum.

There is one doesnt_work example, I will try to add more. The intention is to only reject combinations that are ambiguous, so for example putting a variadic argument in front of a non-variadic one is allowed (though I don't feel strongly about it). There aren't intended to be any post-monomorphization errors.

Thank you for the reference! I've added it to the doc, will try to review it in more detail.

Part of me also wishes we could have something like Zig comptime in Rust, but there are difficulties. Notably, lifetime-dependent specialization must remain impossible.

Scroll up, the syntax is explained in the fn default example.


As an editorial note, I expected that blocks without explanatory text between them were all examples of the same thing (since they were earlier in the document, with static for).

Having a single sentence telling me what the next block is about would make it much easier to follow - so something like "You can put bounds on variadic generics; these are applied to every element of the variadic" would catch my eye and stop me scrolling straight past the explanation I was looking for.
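
To make the "bound applies to every element" reading concrete, here is how it is usually approximated in today's Rust: a trait implemented for tuples of several arities by a macro, with the same bound on each element type. DebugAll and impl_debug_all! are hypothetical names for this sketch, and only arities 1 to 3 are covered.

```rust
use std::fmt::Debug;

// Hypothetical trait: available on a tuple only if every element is `Debug`.
trait DebugAll {
    fn debug_all(&self) -> Vec<String>;
}

// Generates one impl per arity; the `Debug` bound is repeated for each
// element type, which is what `<...Ts: Debug>` would say in one place.
macro_rules! impl_debug_all {
    ($($name:ident),+) => {
        impl<$($name: Debug),+> DebugAll for ($($name,)+) {
            #[allow(non_snake_case)]
            fn debug_all(&self) -> Vec<String> {
                // Reuse the type parameter names as variable names.
                let ($($name,)+) = self;
                vec![$(format!("{:?}", $name)),+]
            }
        }
    };
}

impl_debug_all!(A);
impl_debug_all!(A, B);
impl_debug_all!(A, B, C);

fn main() {
    println!("{:?}", (1u8, "hi", false).debug_all());
}
```

The arity ceiling and the macro boilerplate are the usual motivation for wanting a built-in variadic form.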


Looking at doesnt_work specifically, I would love to see you flesh out the following two examples:

fn confused<...Ts, I: Iterator, ...Us>(i: I, ts: ...Ts, us: ...Us);
fn evil<...Ts, ...Us>(ts: ...Ts, i: usize, us: ...Us);

These two are vicious in different ways, since they are potentially ambiguous, and my bias is to say that they're both errors, since there are clearer ways to write them.

confused is one that is unambiguous if either all types in ...Ts or all types in ...Us do not implement Iterator. For example, confused(vec![1,2,3].into_iter()) is unambiguous, since ...Ts and ...Us are empty, as is confused(0u32, 1.0f64, vec![1,2,3].into_iter(), 2.0f32, 3i8), since none of the numeric types implement Iterator. But historically, trying to reason about negative trait implementations has led to soundness issues, and I'd prefer confused to be always an error to avoid that risk.

evil is straight up nasty. As long as neither ...Ts nor ...Us contains usize, you can determine unambiguously what evil's type is from its arguments, so in that sense it's not ambiguous. But, unlike confused, there's no way to use the turbofish syntax to explicitly label the type of a monomorphization of evil; the only way to be explicit is to write out the full function signature, including i: usize at the appropriate point.

And you can improve both of them by using grouping in the generic parameters:

fn less_confused<<...Ts>, I: Iterator, <...Us>>(i: I, ts: ...Ts, us: ...Us);
fn less_evil<<...Ts>, <...Us>>(ts: ...Ts, i: usize, us: ...Us);

These are still unpleasant, but it's at least always possible to set their types via a turbofish.

The other thing I see (but have put zero thought into) is having this interact with APIT and RPIT. Is there useful meaning to be given to the ...impl in the following two signatures:

fn apit(is: ...impl Iterator);
fn rpit(...names: <...&str>) -> ...impl std::net::ToSocketAddrs

(where I hope I've understood the syntax properly, and ...names is a variable-sized tuple of &str).


Relying on negative reasoning like that is a serious semver hazard; confused is definitely illegal, as is evil.

I will have to think about less_confused and less_evil. Arguably the idiomatic solution is to tuple-wrap the value parameters, but should that be required? My inclination is to be maximally permissive at first; restrictions can be added later based on feedback from implementation or usage.
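
As a point of reference, tuple-wrapping the value parameters is already expressible in today's Rust, and it removes the ambiguity because each pack is delimited at the call site. In this sketch Pack, impl_pack! and evil_tupled are hypothetical names, and only tuples of up to three elements are covered.

```rust
// Hypothetical marker trait for "a tuple used as a pack", exposing its length.
trait Pack {
    const LEN: usize;
}

macro_rules! impl_pack {
    ($($name:ident),*) => {
        impl<$($name),*> Pack for ($($name,)*) {
            // Count the type parameters by building a slice of their names.
            const LEN: usize = <[&str]>::len(&[$(stringify!($name)),*]);
        }
    };
}

impl_pack!();
impl_pack!(A);
impl_pack!(A, B);
impl_pack!(A, B, C);

// Tuple-wrapped analogue of `evil`: the `usize` in the middle cannot be
// confused with pack elements, even when the packs contain `usize`.
fn evil_tupled<Ts: Pack, Us: Pack>(_ts: Ts, i: usize, _us: Us) -> (usize, usize, usize) {
    (Ts::LEN, i, Us::LEN)
}

fn main() {
    // Inferred from the arguments.
    println!("{:?}", evil_tupled((1u8, 2u16), 7, ("a",)));
    // Spelled out with a turbofish, with `usize` inside both packs.
    println!("{:?}", evil_tupled::<(usize,), (usize, usize)>((1,), 7, (2, 3)));
}
```

Whether the proposal should require this wrapping, or merely allow it, is the open question above.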

Your apit and rpit examples are not correct by the design. ... operates on a "parameter pack"[1]. The only ways of producing such a pack are:

  • ...Ident generic parameter
  • Comma-separated list of types enclosed in angle brackets: <usize, &str>
  • for<_ in _> _ mapping over a pack

Clearly this needs to be explained better in the doc. One source of confusion here is that at the value level, ... is strongly associated with tuples, as any collection of values can be expressed as a tuple[2]. In contrast, type-level ... is not particularly linked to tuples[3], because it needs to support lifetime and const generics, and e.g. ('a, 'b) isn't a valid type. There may be a better design here, though.

fn apit(...is: ...impl Iterator); is interesting. Currently the proposal doesn't handle it, but arguably it should be the same as fn apit<...Is: Iterator>(...is: Is);?

For fn rpit:

  • ...&str is not valid, because type-level ... only works on packs. There's no way to express "homogeneous tuple of arbitrary length" with the current state of this proposal; arguably that's what arrays are for.

  • ...impl std::net::ToSocketAddrs is not valid either, as again ... only works on packs.

What you could do, however, is this:

fn rpit<...<'as, Names: AsRef<str>>>(...names: for<<'a, Name> in <'as, Names>> &'a Name) -> (...for<_ in Names> impl std::net::ToSocketAddrs,);

(Which reminds me, I need to consider lifetime elision...)


I've added a few TODO sections to the end of the doc, noting limitations of the design.


  1. There is a TODO about nomenclature, suggestions welcome.

  2. Except for unsized values, but even that is not set in stone.

  3. The only such link is with varargs, where the type of the vararg parameter specified in the function signature is a parameter pack, but when the binding is used inside the function body, it acts mostly as a tuple. But even that is marked with a TODO because I'm not sure about it.


APIT is never required: fn foo(f: impl Trait) can always be rewritten as fn foo<T: Trait>(f: T). So I think your suggested interpretation is sensible, but it's also completely reasonable to say that there's no APIT with variadics.
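
The non-variadic version of that rewrite, in today's Rust, looks like the sketch below (sum_apit and sum_generic are made-up names). The two functions behave the same; the visible difference is that only the named-parameter form can be called with a turbofish.

```rust
// Argument-position `impl Trait`: the caller cannot name the type parameter.
fn sum_apit(it: impl Iterator<Item = u32>) -> u32 {
    it.sum()
}

// The equivalent explicit generic: same behaviour, but `I` can be named.
fn sum_generic<I: Iterator<Item = u32>>(it: I) -> u32 {
    it.sum()
}

fn main() {
    let v = vec![1u32, 2, 3];
    let a = sum_apit(v.iter().copied());
    // The turbofish is only available on the explicit form.
    let b = sum_generic::<std::vec::IntoIter<u32>>(v.into_iter());
    println!("{} {}", a, b);
}
```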

I do think you should think about what variadic RPIT should look like - the problem with your suggestion is that (if I'm reading it right, which is never guaranteed), there have to be as many values in the return as in one of the arguments - but it would be useful to be able to express "the return type is a pack of unknown size, all elements of the pack are known to implement a given trait" somehow. For example, if fn rpit filtered the supplied names by some sort of validity criteria, and then returned impl ToSocketAddrs for each of the valid names, how would I write that return type? Similarly, what would I write to make fn rpit(names: &[&str]) -> ...impl ToSocketAddrs workable, where I return a pack based on at most N resolvable names from the input slice?

That would have to happen at compile-time, which would make its usefulness questionable. But maybe there is a use-case I am neglecting?

The only use case I can think of, and it's a stretch, is the case where I have multiple input packs of different sizes, and I want the output pack to be a different size where the compile-time size of the input packs determines the size of the output pack. The two useful cases are summing up the size of two input packs (equivalent to iteration's chain() method), and taking the shortest of two packs (equivalent to zip()).

That said, this needs more than just RPIT - it also needs some way to do the chain or zip operation on inputs at compile time, and I would expect that the syntax that lets you do that at the value level could be reused to let you build the output pack in the way you suggested for fn rpit.
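
For a feel of what the chain case involves, here is a sketch in today's Rust where the output tuple type is computed from the two input tuple types through an associated type. Chain and impl_chain! are hypothetical names, only a handful of arity combinations are generated, and a zip analogue would need its own set of impls in the same style.

```rust
// Hypothetical trait: concatenates two tuples; the result type is
// determined at compile time by the two input types.
trait Chain<Rhs> {
    type Output;
    fn chain(self, rhs: Rhs) -> Self::Output;
}

// One impl per (left arity, right arity) combination.
macro_rules! impl_chain {
    (($($l:ident),*), ($($r:ident),*)) => {
        impl<$($l,)* $($r,)*> Chain<($($r,)*)> for ($($l,)*) {
            type Output = ($($l,)* $($r,)*);

            #[allow(non_snake_case)]
            fn chain(self, rhs: ($($r,)*)) -> Self::Output {
                // Destructure both tuples, then rebuild one flat tuple.
                let ($($l,)*) = self;
                let ($($r,)*) = rhs;
                ($($l,)* $($r,)*)
            }
        }
    };
}

impl_chain!((A), (B));
impl_chain!((A, B), (C));
impl_chain!((A), (B, C));
impl_chain!((A, B), (C, D));

fn main() {
    let t: (i32, &str, bool) = (1, "a").chain((true,));
    println!("{:?}", t);
}
```

The number of impls grows with the product of the supported arities, which is the kind of blow-up a compile-time pack operation would avoid.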