A madman's guide to variadic generics

here's maybe the least coherent thing i've ever put together.

it's about the feasibility of variadic generics but i'm not sure how much of it makes sense (i.e. i would like to know what people think about it). think of it as a pre-pre-pre-RFC of sorts.

3 Likes

See also https://github.com/rust-lang/lang-team/blob/master/src/design_notes/variadic_generics.md for lots of previous thoughts in the space.

2 Likes

thanks for linking that here, i had lost the link and will put it at the top of my document.

i had read through each of the proposals and found them to be some combination of

  • A. incomplete
  • B. unRusty or too functional-focused
  • C. an amalgam of ideas with no cohesion

none of those are bad things really, i used some parts of each as inspiration, but i think a more holistic, considered design was required and i hope that i have filled at least part of that niche. i have done a lot of thinking about this particular feature, so i felt it fitting to do some writing too.

Mom, can we have what Zig has?

That is to say, ISTM that the issue with type-introspection proposals is that they don't go far enough. If we could also output types and items (i.e. full type-metaprogramming) then variadics would be one problem of many that can be solved by slapping some const functions/blocks in the right places. E.g. delegation, const-generic-access (aka Provider API), specialization, variadics, many things currently done at the syntax level by macros, ...

Adding new syntax (..), lang-items (traits describing tuples) and compiler-builtins solves some of the variadic problems. Metaprogramming could be one tool solving many problems.

Granted, we'd need additional epicycles to integrate it with inference and trait-solving. There'd probably have to be a way to generate bounds (obligations?) that can't be expressed in the surface language.

3 Likes

The biggest challenge that I see to having that kind of thing in Rust is lifetimes. You have to restrict the metaprogramming system so that it can't be used for unsound lifetime-based specialization.

Hrrm... type system stuff is far from my turf, but lifetime-specialization might not be quite the right angle? Isn't 'static the "real" problem? It's the only nameable, concrete lifetime and the only one that impacts traits and subtyping?

I'm the author of the previous attempt at this. Here are my thoughts:

i think there’s no reason not to limit the scope of variadic generics to tuples:

[...]

Variadic<(A, B, C)> // is valid syntax
Variadic<((A, B), C)> // is also valid
Variadic<A, B, C> // too many params 😡

There's one big reason: backward-compatibly making iter::Zip and co. variadic.
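To illustrate (the struct is simplified from std, and the comments sketch the problem in this thread's hypothetical syntax):

// Today's std type takes exactly two iterator parameters:
pub struct Zip<A, B> { a: A, b: B }

// A tuples-only variadic Zip would have to be spelled `Zip<(A, B, C)>`,
// breaking every existing `Zip<A, B>` use site; only a multi-parameter
// form like `Zip<A, B, ..Rest>` could be adopted without changing the
// two-parameter calls that already exist.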

if you allow traits to be generic parameters

Sounds like a huge change, and if the Tuple trait needs to be magic anyway, I don't think it is justified.

however, there’s still one blind spot in this design:

fn variadic<T: (..), U: (..)>(params: (..T, ..U)) -> (T, U);

let (left, right) = variadic((0, 1, 2));
// what does this print?
println!("{left:?} vs. {right:?}");

i don’t think it would be too difficult to enforce that when inference converts the type (i32, i32, i32) into (..T, ..U) it greedily generates T, so U is reduced to (). this seems like the sanest approach to me, though we could also try to make this a compiler error (i don't think it's possible without violating semver somehow).
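Spelled out (a sketch using the document's proposed syntax), the ambiguity inference has to resolve:

// `variadic((0, 1, 2))` must split `(i32, i32, i32)` across `(..T, ..U)`.
// There are four possible splits:
//   T = (i32, i32, i32), U = ()
//   T = (i32, i32),      U = (i32,)
//   T = (i32,),          U = (i32, i32)
//   T = (),              U = (i32, i32, i32)
// Greedy inference picks the first, so `left = (0, 1, 2)` and `right = ()`.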

I don't see the semver issue?

trait solving

It's good that you put effort into actually specifying an algorithm, I didn't bother with that at all. So kudos to you!

trait Heapify {
   type Boxed: (..);
}

impl Heapify for () {
   type Boxed = ();
}

// there's no overlap with the above impl
// because this one has a length range of `1..`
// and the other has an exact length of `0`.
impl<T, U: (..)> Heapify for (T, ..U) {
   type Boxed = (Box<T>, ..(<U as Heapify>::Boxed));
}

Smart, I like it. I do think people will want an explicit iteration syntax eventually; using recursion for everything would get annoying. But MVPs are allowed to be annoying, as long as they work and don't get held up in endless bikeshed.

trait MyNumsSummed {
	type Output: (..);
}

impl MyNumsSummed for () {
	type Output = ();
}

impl<T: Add, U: (..Add)> MyNumsSummed for (T, ..U) {
	type Output = (<T as Add>::Output, ..(<U as MyNumsSummed>::Output));
	// `forall<U> { U: MyNumsSummed }` is provable given `U: (..Add)`,
	// using the rules defined in "trait solving#candidate selection".
}

fn sum_my_nums<T: (..Add)>(xyz: T, abc: T) -> <T as MyNumsSummed>::Output {
	match (xyz, abc) {
		// both `xyz` and `abc` are empty; we have no work left to do.
		((), ()) => (),
		// both `xyz` and `abc` have at least one element.
		// we add what we can and recurse using the remaining elements.
		(
			(x, y_etc @ ..),
			(a, b_etc @ ..),
		) => {
			let x: impl Add = x;
			let y_etc: impl (..Add) = y_etc;

			compose!(x + a, ..sum_my_nums(y_etc, b_etc))
		},
		// the tuples aren't the same length.
		// perhaps the compiler will be able to detect that `xyz` and `abc`
		// reuse `T` and so are guaranteed to be the same length
		// but this isn't necessary for an MVP.
		_ => panic!("you've been a naughty boy!"),
	}
}

My proposal has a note about wanting to have some sort of match syntax, but I hadn't thought of using associated types on the type-level side; that's smart. One question is, how would this work with type inference? If you don't explicitly annotate the output type of the match, is the compiler going to automatically generate the needed trait + impls? What about IDE type hints?

Also, as with my previous comment, people will want an explicit iteration syntax eventually, even if it's not in the MVP.

lifetime packs

the most recent popular variadic generics proposal included the concept of "lifetime packs" (i.e. variadic lifetimes) and similar ideas for const generics. i'm going to be honest when i say that i just don't really understand what the point is or how it works, but even despite that, i think the functionality can be mirrored with what i call the "smuggler pattern":

I don't love lifetime packs either; they were the least sucky way I could find to fill a hole. Basically, there are already tuples of values, and tuples of types, but no way of representing a list of lifetimes. So I added one, because I needed it to fit lifetimes into the rest of the proposal. Maybe it can be emulated through trait wizardry? That's never fun, but if lifetime variadics are sufficiently rare, it could be Good Enough?

i think whether or not [a variadic number of parameters for a function] should be valid creeps far out of the scope of variadic generics as specified in this document. personally, i think it's pretty unRusty as it easily allows overloading

You can also hack overloading into Rust with custom impls of the (unstable, sure) Fn traits. I don't think it will be a problem, as long as it's not deliberately made easy, and there is a community standard of not doing it.
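For the record, a minimal sketch of that hack as it works on today's nightly (the feature gates are real; the `Print` type is invented for illustration):

#![feature(unboxed_closures, fn_traits)]

struct Print;

// One `FnOnce` impl per accepted argument type = ad-hoc overloading.
impl FnOnce<(i32,)> for Print {
    type Output = ();
    extern "rust-call" fn call_once(self, args: (i32,)) {
        println!("int: {}", args.0);
    }
}

impl<'a> FnOnce<(&'a str,)> for Print {
    type Output = ();
    extern "rust-call" fn call_once(self, args: (&'a str,)) {
        println!("str: {}", args.0);
    }
}

fn main() {
    Print(1);    // prints "int: 1"
    Print("hi"); // prints "str: hi"
}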

5 Likes

:heart:

I've posted this before: Rust should adopt what Zig or Circle-lang have in terms of meta-programming. It would subsume many complex features and would simplify the language by a large factor. This IMO is a must (for long-term viability and evolution) given how much Rust has already over-borrowed from the complexity budget.

Ideally, we should have a compile-time metaobject protocol. That's the holy grail in this space.

I agree though that lifetimes (being a novel feature of Rust) would require some additional design work to make this sound.

2 Likes

Every time this is brought up, it's never mentioned that this is essentially compile-time duck typing, which is the opposite of what Rust has always strived for.

I wonder how that would play with the trait solver, given that adding new trait implementations could change previous results. Bonus if you allow querying for implemented traits in the protocol: in that case new trait implementations could even cause previous trait implementations to no longer exist...

6 Likes

To make this more specific, "the Zig approach" works, because the following is valid Zig code:

fn f(T: type, value: T) void {
    value.g();
}

You don't have to declare that T has a .g method. Rather, at compile time during monomorphisation, the function could either work or fail depending on the concrete type T of a particular instance.

The primary benefit of this approach is that it is both simpler & more flexible than what Rust does.

The primary drawback is that the interface that generic functions require from their generic arguments is unstated, which makes it impossible to reason about semver mechanically and, more generally, makes it harder to reason about a function and its callers separately (with repercussions such as worse error messages or less helpful code completions).

In contrast, Rust requires you to write

trait G { 
  fn g(&self);
}

fn f<T: G>(value: T) {
    value.g();
}

using the language of traits to bound generic interfaces in the signature. The function's body is checked against the abstract signature, not against specific substituted concrete types.

"Rust, but with Zig-style comptime instead of traits" I think would be a coherent language, but it won't be an evolution of Rust as it is today.

Rust traits create problems for metaprogramming. Consider a reflection-based JSON converter:

fn to_json<T>(value: &T) -> String {
    // reflect on value here
}

If a bare <T> is all that's stated in the signature, then this function can compile or fail to compile depending on the specific T we pass in. This could work! But this'll mean that you can do

fn f(T: type, value: T) void {
    value.g();
}

in Rust, if you use the ugly meta syntax. To make meta-programming compatible with Rust's golden rule, we need rather something like

trait ToJson {
   fn to_json(&self) -> String;
}

default impl<T> ToJson for T 
  where
    // Say something to the effect of 
    //    every T's field is : ToJson
    FieldsTuple<T>: (..ToJson)
{}

which needs both variadics and specialization.
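For contrast, here's how this is commonly approximated today, without either feature: a macro_rules! macro stamping out one impl per tuple arity (the JSON encoding below is just illustrative):

trait ToJson {
    fn to_json(&self) -> String;
}

// One impl per arity, generated mechanically up to some fixed maximum.
macro_rules! impl_tuple_to_json {
    ($($name:ident),+) => {
        impl<$($name: ToJson),+> ToJson for ($($name,)+) {
            fn to_json(&self) -> String {
                #[allow(non_snake_case)]
                let ($($name,)+) = self;
                let fields: Vec<String> = vec![$($name.to_json()),+];
                format!("[{}]", fields.join(","))
            }
        }
    };
}

impl_tuple_to_json!(A);
impl_tuple_to_json!(A, B);
impl_tuple_to_json!(A, B, C);
// ...and so on, up to whatever arity patience allows.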

8 Likes

Bonus puzzler:

trait C      { /* ??? */ }

fn f<T: C>() { /* ??? */}

struct A;
impl C for A { /* ??? */}

struct B;
impl C for B { /* ??? */}

fn main() {
    f::<A>(); // Works
    f::<B>(); // Compile time (!) error!
}
Solution
trait C {
    const Value: usize;
}

struct IsZero<T>(T);

impl<T: C> IsZero<T> {
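    // Indexing a one-element array with `T::Value` const-evaluates only when
    // the const is used, and fails ("index out of bounds") whenever `Value != 0`.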
    const Ok: () = [()][T::Value];
}

fn f<T: C>() {
    let _ = IsZero::<T>::Ok;
}

struct A;
impl C for A {
    const Value: usize = 0;
}

struct B;
impl C for B {
    const Value: usize = 1;
}


fn main() {
    f::<A>();
    f::<B>();
}
3 Likes

Yeah, I didn't mean instead of traits. Rather the reflection and type-constructor parts of comptime. E.g. in this example where they take a type as input and compute another one as output.

So reflection/trait-system integration would look similar, but the where bounds would be replaced with a const fn, and the impl body would be emitted for each T, potentially a different one for each; something that can be tedious to do declaratively.

Excuse the keyword salad:


impl<T> ToJson for T 
  where
    T: JsonableTuple
{
   fn to_json(&self) -> String = gen_fun<T>(typeof(T));
}

// private helper trait
#[marker]
trait JsonableTuple {}

impl<T> JsonableTuple for T where jsonable_tuple<T>(typeof(T))


static const fn jsonable_tuple<T>(t: TypeInfo<T>) -> bool {
   // the compiler passes T, we check if
   // T is a tuple and enumerate its members via reflection.
   // Since this is a marker trait and the local crate owns it there
   // should be no coherence issues... I hope
}

static const fn gen_fun<T>(ti: TypeInfo<T>) -> fn(&T) -> String
where T: JsonableTuple
{
   // this function could obviously do way more fancy things such as assembling new
   // concrete types, switching between different closures based on type information
   // do all sorts of pre-computation at compiletime...

   let first_field_offset = ti.fields[0].offset;
   // promote reflection type
   type ONE: ToJson = ti.fields[0].ty;

   // `let`s become consts; constructed types are available as named types.
   // so this closure doesn't actually *capture* anything, it's a function pointer.
   const move |t: &T| {
        // field accesses should be more ergonomic than this of course
        let first_field: ONE =
             unsafe { t.as_ptr().byte_add(first_field_offset).cast().read() };
        // lazy, skip the other fields....
        return first_field.to_json()
   }
}

Granted, this is a very dumb approach that is equivalent to manually implementing the helper trait for every instantiation of a tuple (if the compiler asks).

There could be some continuum towards expressing more general bounds (though that'd need different syntax), but occasionally we do have the issue of deciding to implement a trait in a way where the rules for what implements that trait can't be conveyed to the compiler and we resort to macros or long lists of nearly-identical impls. Maybe a sort of reflection-query-language AST could be returned to the compiler and it's up to the compiler to either use it as blackbox for yes/no checks on concrete types or for constraint-proving about more generic bounds.

1 Like

While Zig's implementation currently uses duck typing, it doesn't have to. It's merely an implementation detail. Using a metaclass protocol you could pass the meta-type (i.e. the trait bound) as the parameter.

Moreover, it should've been obvious that we need a Rust-specific protocol. Consider that we already have derive macros, and they do generate new trait impls in Rust today. That already works with the trait solver, doesn't it? :wink:

As was said above, the goal isn't to replace traits (entirely), but traits aren't a free lunch either. There are many cases in Rust today where we have way too many complex bounds that are viral all over the code. These cases would be better served by a Zig-like approach.
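For a small, real example of that virality (names invented):

use std::fmt::Display;

// The bound has to be restated at every level of the call chain,
// even in functions that only forward it.
fn leaf<T: Display>(x: &T) {
    println!("{x}");
}

fn middle<T: Display>(x: &T) {
    leaf(x); // `middle` needs `T: Display` only to call `leaf`
}

fn top<T: Display>(x: &T) {
    middle(x); // ...and so does `top`
}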

See also a related recent post here: https://reddit.com/r/rust/s/bLphNERgpw

1 Like

Is there something that would force you to do so, so that if it was wrong the library would get an error and not just the caller?

Derive macros don't have such problems because they run earlier, so by the time the trait solver actually runs it has all the information it needs. But if you allow both inspecting the result of the trait solver and generating new code that can potentially change those results, then you've got a problem.

1 Like

That doesn't mean it's unsolvable? E.g. queries within reflection could return Result<bool, Pending>, which could be bubbled back up to the compiler to rerun the whole thing. It could result in more cyclical query problems, but it might be ok to declare that a programmer error, like recursive types, if the potential triggers for it don't become too spooky.

I think the same-crate marker trait approach I used above does solve coherence issues without the compiler knowing exactly how broad that pseudo-blanket is. I'm not sure though. It might lead to delayed errors.

What effect will this (Zig inspired comptime) have on compile time though? Rust compile time is already quite slow, I wouldn't want it to get worse once everyone starts relying on this new feature.

It pretty much is. It will create a circular logical dependency with arbitrary logic in the way. And you can't prove everything by just "rerunning the whole thing again and again". (Unless you just reject any code that has a trait bound, of course.)

Wouldn't expanding the "no negative reasoning" rule to dynamic bounds provide monotonicity, meaning that rerunning the whole thing again and again would eventually reach a fix point?

How would you prevent seemingly arbitrary code from doing negative reasoning? Also, monotonicity alone doesn't guarantee convergence to a fix point in a finite number of steps.

How would you prevent seemingly arbitrary code from doing negative reasoning?

API requirement + checking that the result sets never shrink? Error otherwise

Also, monotonicity alone doesn't guarantee convergence to a fix point in a finite number of steps.

Cycle/Overflow error? As long as it converges most of the time that can be good enough?

User code can error anyway...
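To make the shape of that loop concrete, a toy sketch (plain Rust; everything here is invented for illustration):

// Rerun `step` until the state stops changing. If each pass can only add
// facts (no negative reasoning) and the universe is finite, this reaches a
// fix point; otherwise we hit the cap and surface a cycle/overflow error.
fn fixpoint<T: Eq>(mut state: T, step: impl Fn(&T) -> T, max_iters: usize) -> Option<T> {
    for _ in 0..max_iters {
        let next = step(&state);
        if next == state {
            return Some(state); // fix point reached
        }
        state = next;
    }
    None // did not converge: report a cycle/overflow error
}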