[Analysis / Pre-RFC] Variadic generics in Rust

I said "a const iterator" in tacit agreement with:

IOW, conceptually this is an iterator over the type constructor itself: each Iterator::Item is of type Type, and its value is a concrete type, resulting in an unrolled loop at runtime. The loop is executed by the compiler, not by your program. We have added another level of indirection, but the same iteration protocol applies.
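For contrast, here's a rough sketch (my own illustration, not from the quoted post) of how that kind of compile-time "iteration over types" is emulated today, with trait recursion over nested (Head, Tail) pairs:

// A minimal sketch, illustrative only: the "loop" over the types runs
// entirely inside the compiler, via recursive trait resolution.
trait CountFields {
    const COUNT: usize;
}

impl CountFields for () {
    const COUNT: usize = 0;
}

impl<Head, Tail: CountFields> CountFields for (Head, Tail) {
    const COUNT: usize = 1 + Tail::COUNT;
}

// Evaluated at compile time; no iteration happens in the final program.
const N: usize = <(u8, (bool, (String, ()))) as CountFields>::COUNT; // == 3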

I fully agree with @PoignardAzur. The idea is to support a use-case: using `impl Trait` for monomorphized heterogeneous collections.

I understand that part of the bad track record is that C++ yields post-monomorphisation errors, i.e. errors that show up only after a template has been instantiated. Rust has so far avoided such errors and hopefully would avoid them with ...T as well.

BTW, post-monomorphisation errors are exactly where macros are worse than special language features like the one proposed here. Yes, macros can do a lot of similar things, but the error messages would be less clear.

I think

fn make_tuple_sing<T> (t: (T)) where T : Sing

syntax doesn't express this intent as clearly as either of

fn make_tuple_sing<(T)> (t: (T)) where T : Sing // hmm.. doesn't (T) mean something already?..
fn make_tuple_sing<...T> (t: ...T) where T : Sing
fn make_tuple_sing<...T> (...t: ...T) where T : Sing

To me this is exactly the "a separate C++-like parameter pack ... syntax". These are templates akin to macros and akin to C++ but hopefully with much better error reporting and overall nicer to use. The only thing we're bike-shedding here is what exactly this syntax might look like.

3 Likes

Oh, yeah, I forgot that part.

To answer the general comment, yeah, if you add extremely powerful meta-programming, compile-time reflection, and first-class types, then sure, taking arbitrary tuples becomes trivial.

But I think you're underestimating just how major a change would have to be made to accommodate even your minimal example. Rust right now is nowhere near capable of using types as first-class values inside const functions the way e.g. Zig does. Variadics have the advantage of being an incremental improvement.
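(To make concrete what that capability would even mean, here's a loudly hypothetical sketch; nothing like a `Type` value in const fn exists in Rust today, and `Type::tuple` is made up for illustration:)

// Hypothetical, illustration only: types as first-class values inside
// const functions. Neither `Type` nor `Type::tuple` exists in Rust.
const fn pair_of(a: Type, b: Type) -> Type {
    Type::tuple(&[a, b])
}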

Speaking more generally, I'm not really in favor of super-advanced type arithmetic, for reasons I described in the analysis. The more different parts of your type system are Turing-complete, the harder it is for the compiler to reason about types, and the harder it is for maintainers to control language semantics (think Lisp and its 2 billion incompatible extensions). Having restricted semantics has advantages, especially in terms of error messages (and avoiding post-monomorphization errors).

I mean, you're always welcome to write an RFC to add first-class types to const functions, but I predict you'll face a brick wall.

(I guess I should add a section to the analysis, specifically about first-class types)

2 Likes

I guess we differ in our conclusions here, since as I see it the type system is already Turing complete, and you cannot make it more Turing complete. On the other hand, having a more regular implementation instead of disparate use-cases means the compiler handles everything in a uniform way, instead of having to implement and maintain multiple ways of doing essentially the same calculations inside the compiler. Thus I disagree that this approach makes it harder for the compiler to reason about code. Though I'll grant you that it would take more upfront work to get to this better design.

Lastly, we can separate the two concerns: the compiler can special-case this use case internally for now, yet expose a uniform syntax at the surface. Thus users would at least be able to use tuples in a more idiomatic way, even though they wouldn't be able to implement new type constructors themselves.

True. But this also creates a path dependency. If variadics are added to the language, and people inevitably start (ab)using their capabilities to their absolute maximum, there will be pressure to expand those capabilities by piling more incremental features on top, rather than exploring a cleaner design. That's what happened with C++ variadics. (and Pin in Rust, IMO...)

3 Likes

This is technically true, but also completely irrelevant. Today, virtually no programs actually exploit the type system's Turing completeness, and what you're proposing does have major consequences that variadics alone do not.

As soon as your type system starts doing arbitrary computation like this, you have two options to choose from:

Option A: The compiler fully executes all type-level code before it can type-check the resulting program. This is how Zig works (and C++, except its type level programs repurpose templates for control flow rather than reusing the base language).

Unfortunately, this option forces the compiler to delay many checks until after that execution, leading to an explosion of post-monomorphization errors (or more generally, post-type-level-execution). This is the source of C++'s infamous error messages and is a complete non-starter for Rust.

(Specifically, if you're unfamiliar, these error messages take the form of "X failed, but X was invoked by Y, which was invoked by Z, which was invoked by W, ..."- there is no clear choice for where to pin the blame, because this layer is essentially dynamically typed.)

Option B: Alternatively, the compiler gains the ability to "normalize" functions- executing them abstractly before their arguments are (fully) provided. This is how dependently typed languages work- and barring Option A, your proposal requires this.

Rust mostly avoids this dilemma today, because so far all it has to normalize are trait bounds. Any Turing completeness lurking here is simply cut off with arbitrary limits that normal programs rarely go anywhere near.

(However, when you do hit those limits, you get ugly C++-like error messages again, with chains of "note: required because impl X, which is required because of impl Y, ..."!)
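(For reference, a small, intentionally-broken sketch of how you can hit those chains today; this is designed to fail to compile:)

// Proving `Wrap<u8>: Foo` requires `Wrap<Wrap<u8>>: Foo`, and so on until
// the recursion limit, producing E0275 with a long chain of
// "required for ... to implement `Foo`" notes.
trait Foo {}
struct Wrap<T>(T);

impl<T> Foo for Wrap<T> where Wrap<Wrap<T>>: Foo {}

fn needs_foo<T: Foo>() {}

fn main() {
    needs_foo::<Wrap<u8>>();
}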

I have one more comparison to make with dependent types: such languages are usually not Turing complete, because that would weaken the power of the type system. Instead, all their control flow is primitive recursive, and funny enough, so is type-level ...T.

So if we want a cleaner design to grow into, arguably it should have absolutely nothing to do with first class types-as-values, let alone using normal Rust control flow on them. Instead it would look more like ...T but applied to other shapes- e.g. structs and enums as well as tuples.

5 Likes

I certainly think it makes sense for there to be ways to look at tuples as sequences or trait objects, or to map over them with generic function, or similar.

But that's only a small part of the things for which people want variadic generics. For example, it doesn't help at all for something which takes a variadic list of types and makes an enum type with a variant for each of the types.
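(For concreteness, the closest thing available today is a hand-written enum per arity, which is exactly the boilerplate a variadic version would generate; the names below are just illustrative:)

// One "one of these types" enum per arity, written out by hand.
enum OneOf2<A, B> { First(A), Second(B) }
enum OneOf3<A, B, C> { First(A), Second(B), Third(C) }
// ...and so on, duplicated for every arity anyone happens to need.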

1 Like

No, those aren't the only choices. Rust already makes compromises here:

  • Post-monomorphization errors are already possible in semi-rare circumstances, such as struct size overflow, and stack overflows when checking whether a specialization applies (a sketch of the former follows this list).

  • Macros are not checked at all; they are effectively option A.

    Admittedly, macros do not participate in the type system, but they do compete with it. For example, @PoignardAzur's writeup mentions the possibility of using variadics as a cleaner alternative to custom derives.

    If Rust holds the line against post-monomorphization errors, but in doing so drives people to use even-harder-to-debug macros, that might be a Pyrrhic victory. (But it depends; macros do have some debuggability advantages, such as being able to dump the post-expansion source code.)
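A concrete sketch of that struct-size-overflow case, on a 64-bit target (the exact error text may vary):

// The layout error only appears for some instantiations, i.e. after
// monomorphization: Grid<()> is fine, Grid<u8> is not.
struct Grid<T> {
    cells: [[T; 1 << 32]; 1 << 32],
}

fn main() {
    let _fine = std::mem::size_of::<Grid<()>>(); // zero-sized, no problem
    let _boom = std::mem::size_of::<Grid<u8>>(); // errors roughly with:
    // "values of the type `[[u8; 4294967296]; 4294967296]` are too big
    // for the current architecture"
}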

It would similarly be possible to compromise by saying that most code can be proven free of post-monomorphization errors, but code that uses compile-time reflection cannot. That would be dangerous in theory but nowhere near the anarchy-in-practice of C++.

I have one more comparison to make with dependent types: such languages are usually not Turing complete, because that would weaken the power of the type system. Instead, all their control flow is primitive recursive, and funny enough, so is type-level ...T .

Dependent types are great if you're willing to spend lots of extra time proving a program is correct. No, really: I'm not against them, in general. But even if Rust had them, having to spend that time just to prove some piece of compile-time metaprogramming doesn't divide by 0 somewhere would be a waste. People would just resort to hacks to make the compiler shut up, e.g. "if b != 0 { a / b } else { 0 }" for "safe" division.

2 Likes

Some (extremely unstructured) thoughts on the whole "Turing completeness" discussions:

I think a lot of the nuance in these discussions often gets lost, because, in the big picture, Turing-complete and non-Turing-complete aren't really the right labels.

For instance, @comex referred to Turing tarpits in the discussion on github, and I think the concept is very relevant here.

Rust's type system is Turing complete; and if you actually try to leverage its "Turing-like" features (for instance, with the typenum crate), you can quickly produce code that either compiles or produces unreadable errors. That being said, we don't worry too much about Rust's Turing-complete emergent properties.
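(For reference, this is the flavor of type-level computation typenum enables; a benign sketch assuming the typenum crate is available, though it escalates into unreadable errors quickly once you go beyond this:)

// Type-level arithmetic: the sum is computed entirely by the trait system.
use typenum::{Sum, Unsigned, U3, U4};

type Seven = Sum<U3, U4>;

fn main() {
    assert_eq!(Seven::to_usize(), 7);
}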

The key notion is that these emergent properties aren't on the path of least resistance. If you really want to, you can write overly complicated generics that the compiler can't reason about or turn into human-readable diagnostics, but doing so in Rust is harder than making generics that degrade gracefully.

This is in contrast to C++ generics, or Rust macros, where making code that is too clever to be analyzed is the default, and in some cases, the only way to implement desired features.

This is where we need to consider the "path of least resistance" idea.

Like you point out elsewhere, people are already motivated to stretch the limit of the type system in ways that are inconvenient to maintain; it's just that right now the stretching takes the form of macros, and copy-pasting trait implementations 12 times. Adding less hack-ish ways to do the same things doesn't increase the number of ugly hacks; at worst, it transfers them from proc-macro wizardry to type-system wizardry.
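(For concreteness, that copy-paste/macro hack typically looks something like this; `Sing` is the example trait from earlier in the thread, and the `sing(&self)` signature is just an assumption for the sketch:)

trait Sing {
    fn sing(&self);
}

// One impl per arity, stamped out by a macro and extended by hand.
macro_rules! impl_sing_for_tuples {
    ($( ($($T:ident),+) )+) => {
        $(
            impl<$($T: Sing),+> Sing for ($($T,)+) {
                fn sing(&self) {
                    #[allow(non_snake_case)]
                    let ($($T,)+) = self;
                    $( $T.sing(); )+
                }
            }
        )+
    };
}

impl_sing_for_tuples! {
    (A)
    (A, B)
    (A, B, C)
    // ...continued by hand up to whatever arity the crate author is willing to paste
}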

And at best, if variadics are the most convenient way to perform the most common metaprogramming use-cases, then people will be less motivated to create brittle hacks at all, the same way people coming from C++ eventually learn to stop relying on Cell everywhere and start using simpler data structures.

I didn't know the programming theory concept, but yeah, this is exactly what I was thinking about when I wrote the "Restrict tuple arithmetic" section.

Another compromise could be to say "post-monomorphization errors are like panics/unwraps: in a mature library they should never happen, but it's not the end of the world if they do".

If the language goes that route, then we would have to start treating generics like functions in a dynamically typed language: unit tests to check that standard use-cases work, instrumentation to check that every branch is tested, smart fuzzing to test non-standard combinations, etc. The language could help by indicating which branches need to be tested, and which are guaranteed to compile every time; we could even add optional hints for an SMT solver that would refine the analysis further.


Regarding errors in general, the ideal is always: if the developer gets something wrong, the compiler can tell her where the error is.

I think the way to maintain that with Compile-Time Reflection would be to do Parse, Don't Validate, except with first-class types.

(ironically, I think it's terrible advice for a compiler, but it's great advice here)

That is, when designing compiler features, always try to frontload detecting any potential errors as "close" to the offending code as possible. When adding reflection features that let users introduce their own holes, encourage them to do the same.

And if the user really needs to make a macro/template/complicated-reflection-thing that can fail deep in the code, encourage them to pass enough information that the buried error can still be turned into a helpful diagnostic.

(This is an area where I think "linear" variadic generics are pretty good; if your bound is "all the types you pass must implement MyTrait" and you pass a tuple (T1, T2, T3, T4) where T3 doesn't implement MyTrait, the compiler can trivially tell you "whoops, T3 doesn't implement MyTrait, try again!")

And I think all of that has to be done deliberately.

Instead of adding reflection the way Zig or D do it, by saying "how can we expose as much information from the type system as we can?", we should ask "What are the common reflection use-cases, and how can we provide tools to implement them, such that the path-of-least-resistance for the tools is one that degrades gracefully, composes well with code you're not aware of, and provides helpful diagnostics when things go wrong?"

(so basically the pit of success except for templates)

3 Likes

Ah, yes, the old "add painkillers to your code" approach.

2 Likes

Uh, sure, but that's not what anyone is proposing? ...T-style variadics without yigal100's first-class types approach are specifically designed to replace a big use case for macros without stepping into dependent types territory OR giving up on type checking before monomorphization.

My comment about primitive recursion was not an attempt to propose dependent types either- rather a suggestion that all these extra capabilities you're worried people will push for should also be achievable without dependent types or post-monomorphization errors.

1 Like

How would that work? To recap: The problem with computing directly in the type system is that it's a Turing tarpit. The solution is to use normal Rust code instead, which we're already getting to some extent in the form of const generics (even if they can't currently compute with types). But Rust code can panic or infinitely loop. So are you suggesting some sort of termination checker for const expressions that enforces primitive recursion? Or are you suggesting that the type system would somehow be extended to not be a Turing tarpit, without using normal Rust code?

If it's the latter, I don't see how primitive recursion has much to do with it.

If it's the former, a checker for const expressions… well, before worrying about termination, let's start with panics. Right now, with const generics, you can't even add two numbers together in a generic context because of the fear of post-monomorphization errors should the addition overflow.

One suggestion has been to add a nopanic function attribute; the compiler would enforce that such functions don't panic and only call other nopanic functions. Then you could use nopanic functions in const expressions without adding bounds. One example of a nopanic function is wrapping_add, which could be used instead of normal addition.
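(A hypothetical sketch of that suggestion; neither `#[nopanic]` nor this exact rule exists in Rust today:)

// Hypothetical: the compiler would verify this function can never panic,
// so it could be used in const expressions without extra bounds.
#[nopanic]
const fn total_len(a: usize, b: usize) -> usize {
    // wrapping_add cannot panic, so it would pass the check -- but it also
    // silently swallows the overflow that a plain `a + b` would have caught.
    a.wrapping_add(b)
}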

...Except that makes no sense! In the unlikely event the addition overflows, it's almost certainly a bug – and the const evaluator knows it. Using wrapping_add would be asking the compiler not to tell you about the bug, just to avoid the spectre of a post-monomorphization error.

Surely, here, the cure is worse than the disease.

What are the alternatives?

One: don't use addition, or (lest someone propose arbitrary-precision integers as a solution to overflow) division, or array indexing, or unwrap(), or anything else that can fail. That's how you get a Turing tarpit.

Two: have const expressions come with compiler-checked proofs that, as long as the inputs are correct, they will never overflow any of their additions or panic any other way. Sounds cool. However it would require dependent types.

Three: add a "this expression is valid" bound, and just require any generic thing that wants to use const expressions to add those exact expressions to its list of bounds. That seems to be the current plan. It's not as obviously flawed as the other options. But it does mean that you'll often be in a context where you can't use some obvious const generic computation without sticking in semantically meaningless bounds, and as those bounds propagate up the stack, you'll end up with failures that are just as mysterious as "true" post-monomorphization errors.
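(A sketch of what option three looks like on nightly today, using the unstable `generic_const_exprs` feature; the bound carries no meaning beyond "this arithmetic is valid":)

#![allow(incomplete_features)]
#![feature(generic_const_exprs)]

// The generic expression has to be repeated as a bound, and every caller
// inherits that bound transitively.
fn pad<const N: usize>(xs: [u8; N]) -> [u8; N + 1]
where
    [u8; N + 1]:, // semantically meaningless "this expression is valid" bound
{
    let mut out = [0u8; N + 1];
    out[..N].copy_from_slice(&xs);
    out
}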

I know I've gotten a bit off topic. After all, I'm talking about a problem that already exists in const generics, not one that's specific to types-as-values. And I suspect you may not have been proposing types-as-values in the first place.

But that's the thing. I think we're already headed for a reckoning on post-monomorphization errors, once const generics are fully implemented. I think we will end up deciding that post-monomorphization errors in some scenarios are worth the cost. And then the question of whether to hold the line against them here will be moot.

It will, as @PoignardAzur suggested, become less about absolute guarantees, more about ensuring that the 'path of least resistance' checks as many things as possible, as early as possible, and produces as high-quality diagnostics as possible.

Sidenote: Just because something is pre-monomorphization does not mean it has pretty diagnostics. One benefit of post-monomorphization errors via panics in const expressions would be allowing developers to generate their own error messages; I think those might often be easier to understand than the pre-monomorphization equivalents.

4 Likes

Is it, though? Looking at this line in the "Minimum viable product" section:

No variadic functions. Eg fn foobar<...Ts>(tuple: (...Ts)) is allowed, fn foobar<...Ts>(args: ...Ts) isn't.

This would seem to contradict that assumption. The only way I can read this (and other parts of that document) is that ...T is a list of types (i.e., a parameter pack), which can be expanded into a tuple type by surrounding it with parentheses: (...T).


My point is that we should think hard about adding a new language item, given that Rust already has tuples and arrays that - in theory at least - can be extended to support those use cases just as well.

While C++-like parameter packs seem easy at first (though, as the above argument highlights, they may actually not be), they will need to be maintained and supported, even as const generics continue to evolve and may well come to exceed their capabilities.

To bring the discussion back to the "MVP" problem at hand, the question for me is: can the "MVP" goal from the "analysis" document be achieved using just the already existing tuples and arrays (for brevity I'll stick with tuples here), and in a way that still allows future extensions, hopefully seamlessly?

My approach to this end would be pretty much the same as @yigal100's, which got dragged into a syntax bike-shedding that IMHO missed the point. To try again, my first steps would be:

A compiler-generated trait for tuples

trait Tuple {
    const LEN: usize;
}

Such that code can easily check whether a type is a tuple, and its length if necessary.
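(For example, assuming that compiler-generated trait existed, generic code could do something like this; hypothetical, since `Tuple` is not part of Rust today:)

// Relies on the proposed `Tuple` trait above.
fn describe<T: Tuple>(_value: T) {
    println!("this is a {}-element tuple", T::LEN);
}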

Some nice syntax for generic tuples:

In order to avoid cumbersome impl Tuple<LEN=xyz> everywhere, and to allow further trait constraints on the member types. For the purposes of the MVP my suggestion here would be:

let a: (...);            // tuple of arbitrary size
let b: (...; 7);         // tuple with 7 members
let c: (...; _);         // tuple of arbitrary size, just like 'a' above
let d: (...: Sing);      // tuple of arbitrary size, each member has to implement 'Sing'
let e: (...: Hash; 3);   // tuple of size 3, each member has to implement 'Hash'

With this, the "go-to example" function signatures would look like

fn make_tuple_sing(t: (...: Sing))   
// or, if one needs to refer to the tuple type by name:
fn make_tuple_sing<T: (...: Sing)>(t: T)  

fn hash_tuple<Ts: (...: Hash), S: Hasher>(tuple: Ts, state: &mut S)

fn zip<const N: usize, Ls: (...; N), Rs: (...; N)>(ls: Ls, rs: Rs) -> (...: (...; 2); N)

The above intentionally look very similar to the "variadic" examples, except strictly just with tuples. In addition, they also:

  • have the implicit ?Sized bound for the last element of any tuple in hash_tuple, which AFAICT is missing/unclear in the "variadic" approach
  • the requirement that both tuple arguments have to be the same size is immediately obvious from the signature of zip, as well as the fact that the return type is a tuple of the same length. Downside: the signature doesn't make it obvious that the member types from the parameters will also be in the returned tuple. If desired that'd be an exercise for more involved meta-programming down the road.

What's left as an exercise for the reader is how to implement "tuple for" in the same way as outlined in the original document, which - I strongly suspect - will be far more difficult than the "micro feature" it's described as... :wink:


Going forward, we could extend the capabilities step by step:

// Avoiding the unnecessary explicit `const N: usize` parameter, but harder to read:
fn zip<Ls: Tuple, Rs: Tuple>(ls: Ls, rs: Rs) -> impl Tuple<LEN = <Ls as Tuple>::LEN>
where <Ls as Tuple>::LEN == <Rs as Tuple>::LEN
// Perhaps allow inline-declaration of consts for matching:
fn zip<Ls: (...; const N), Rs: (...; N)>(ls: Ls, rs: Rs) -> (...: (...; 2); N)

// require specific types at the start of a tuple:
type HashableDataWithHeader = ([u8; 4], ...: Hash);    

// Once const generics allow for it, can do away with the "same length" requirement. 
// Also throwing out some other ideas and thoughts I had for tuple meta-programming
fn outer_zip<Ls: (...; const LN), Rs: (...; const RN)>(ls: Ls, rs: Rs) -> (...; max(LN, RN)) {
    let M = min(LN, RN);

    // tuple comprehension - this is very much spitballing, 
    // syntax VERY much open for bikeshedding
    (
        ... for i in 0..M => (Ls[i], Rs[i]),   // indexing tuple members by variable
        ... for i in M..LN => (Ls[i], ()),
        ... for i in M..RN => ((), Rs[i]),
    )
}
3 Likes

I honestly don't understand the point you're making.

You're implying that the syntax you suggest is a natural extension of existing tuple syntax. I'm not sure how exactly you're defining "natural", and it's probably a little subjective. But if you mean "incremental" or "uses what's already there"... then I don't really see it. Sure, you're adding a tuple trait, and that uses the trait system that already exists.

But if your MVP syntax includes:

// this
fn make_tuple_sing<T: (...: Sing)>(t: T)
// instead of this
fn make_tuple_sing<...Ts: Sing>(t: (...Ts))

I mean, the difference seems minimal.

To be clear, I'm not opposed to your syntax (as long as we're not adding first-class types manipulated by const functions because that would be "simpler" than variadics). But in terms of work to add it to the compiler, and of how these syntaxes would impact future changes and maybe have to be reconsidered, I don't really see how the syntax you propose is better.


To be honest, I'm not a fan of "this could make future compiler improvements harder" arguments.

They can be made about any syntax or language change. But at some point, you need to take your giant metaphorical knife, slice the solution space, and say "this is the part we're going towards; we're cutting away the rest". There's no point in saying "we should avoid implementing this feature because it could stop us from having a cleaner version of this feature later on", because this will also be true later on. There's a point where hesitating becomes procrastinating.

I'm not saying we should move fast and break things. There are situations where we know we'll have more information later on, and we should wait until then to take a radical decision. The min_const_generics approach is a good example of delaying; it implements parts of the feature which are uncontroversial, so that people will start using them, which will give us more information to handle the more controversial parts.

But this is a deliberate choice.

Implementing min_X, a small subset of X, only makes sense if you roughly know what X will be like, how min_X can progress towards X, and how the experience of min_X will help you figure out what you don't know about X.

If you only have a plan for min_X, and no idea what X is, and your min_X is designed to not block any possible future X that would exceed its initial scope... Well, that's not planning. That's the temporal equivalent of design-by-committee.

In my humble opinion, that's not the direction Rust should go in.

(although this higher level design stuff is where we'd want the lang team to set an outline, because forum discussions only go so far; @scottmcm - any thoughts?)

I think what rpjohnst meant was:

"Using "simple" variadic generics, we can achieve most of what people want with reflection (though not actual, general-purpose compile-time reflection), and since variadics would have rules similar to primitive recursion, they would be post-monomorphization-error-free without the need for dependent types."

(That still leaves the const generics debate, but variadics don't have to be a part of that debate)

(I'm starting to think we need a shorthand for "post-monomorphization-error"; PME? That's probably too obscure)

This really needs to be said more often.

1 Like

For me, it boils down to: are we strictly talking about variadic tuples? If so, then we are (mostly) on the same page, and it's just a matter of syntax.

However, your document, as highlighted by this particular sentence:

No variadic functions. Eg fn foobar<...Ts>(tuple: (...Ts)) is allowed, fn foobar<...Ts>(args: ...Ts) isn't.

strongly implies to me that you propose adding C++-like parameter packs in addition to the already existing tuples, which is an approach I'm much less fond of.

1 Like

If tuple meta-programming is okay, and it's okay to write down the type of Ls[i] somehow, is there really a problem with parameter packs?

A thought that just occurred to me, and might be worth exploring, is that we might want to draw a distinction between "general-purpose" const functions, that are just regular code that happens to have the "const" label, and "intended-for-generics" const functions.

The big difference would be that, while general-purpose const functions could panic like any regular function could, intended-for-generics functions would have to be guaranteed (who enforces it though?) to only panic with a compiler-friendly error message.

This could be like with unsafe code: generic-friendly code could call non-generic-friendly code with a may_raise_arbitrary_panics escape hatch; people reviewing the code would know that that hatch comes with obligation that the developer checks/proves that the inner code can never panic given the inputs of the outer code.
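(One possible reading of that escape hatch, sketched by analogy with `unsafe` blocks; everything here is hypothetical and the names are made up for illustration:)

// Hypothetical: `#[generic_friendly]` and `may_raise_arbitrary_panics`
// do not exist; this only illustrates the division of obligations.
#[generic_friendly]
const fn double_size<T>() -> usize {
    // Reviewer obligation: argue that this multiplication cannot overflow
    // for any `T` the outer code can be instantiated with.
    may_raise_arbitrary_panics {
        std::mem::size_of::<T>() * 2
    }
}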

The way I was thinking, the MVP would add variadic parameter packs as a syntax, while still only allowing them to be used in a way completely equivalent to variadic tuples.

(keep in mind that as long as a construct is syntax-only and not part of the type system, it can always be rolled back with an edition change)

As people start using them and corner cases are figured out, further RFCs would add more uses that aren't exclusively tuple-based. Tuple metaprogramming and stuff like (...my_tuple, ...my_other_tuple) would still not be added.

I don't really see any downside this has compared to a "variadic tuples plus tuple metaprogramming" approach (except in terms of flexibility, but I've already said why I don't think that's a plus).

@comex Yes, Rust already makes some compromises here; yes, I agree with PoignardAzur's take that this is more about the path of least resistance than a hard line; even my initial "Option A"/"Option B" was meant in that sense, where we sometimes pick A and other times B. I'm not trying to argue any of that, despite your best efforts :slight_smile:

(Incidentally, I also agree that "overflow is almost certainly a bug [in the panicking function rather than the caller]" suggests that we should perhaps just allow that kind of error, similarly to how we encapsulate unsafe code which promises not to cause UB. But that's another discussion.)

But also, @PoignardAzur captured my main point:

Const generics has to make all these hard decisions about bounds and panics and errors. But that's precisely because they're inherently about lifting normal Rust to a new context, and normal Rust already diverges/panics/etc.

Variadic generics, on the other hand, do not inherently need to be defined in terms of normal Rust. The things we need to do here, even in a hypothetical future where mere variadic tuples are insufficient to sate our need for type shenanigans, tend to correspond to simple MLTT-style eliminators (thus my ill-fated mention of primitive recursion).

That is, all we could ever want here is map and fold on h-lists and sums, right? Surely, at the type level, direct syntax for those is easier to read, write, and implement than doing it by hand in normal Rust! Would you really rather write the Fn traits using const generics, than with some kind of built-in tuple splatting?
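(As a loudly hypothetical sketch of what I mean, with illustrative syntax only, not a concrete proposal:)

// A hypothetical variadic spelling of the Fn traits: the argument types
// are a pack, and the call takes them as a tuple. None of this is real syntax.
trait FnOnce<...Args> {
    type Output;
    fn call_once(self, args: (...Args)) -> Self::Output;
}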

At the risk of making a strawman argument, look at custom derive. 99-100% of those proc macros are just maps across fields and their attributes. But even with parsing and quasiquoting libraries they still tend to be huge just because there's so much detail that they have to wade through. Even if we do add first class types to the language, I would still prefer direct variadic generics for a lot of use cases.

1 Like

Simple what now?