Phantom const generics?


#1

For many purposes, expecially in low-level programming, fixed-size arrays are useful. This is a simple function that zips two arrays, it enforces at compile-time the equality of the length of the two arrays:

fn zip_array1<const N: usize, 'a, T, U>(a: &'a [T; N], b: &'a [U; N]) -> impl Iterator<Item=(&'a T, &'a U)> {
    // Zip implementation
    a.iter().zip(b)
}

A disadvantage is the code bloat, because every different N gets monomorphized, to avoid most of that you can use an auxiliary function that performs erasure on the array lengths information, turning the lengths in run-time values in the slices:

fn zip_array2<const N: usize, 'a, T, U>(a: &'a [T; N], b: &'a [U; N]) -> impl Iterator<Item=(&'a T, &'a U)> {
    fn helper<'a, T, U>(a: &'a [T], b: &'a [U]) -> impl Iterator<Item=(&'a T, &'a U)> {
        // Zip implementation
        a.iter().zip(b)
    }
    helper(a, b)
}

If not inlined, LLVM should turn zip_array2 into a tiny function.

Perhaps we could invent some way to tag as compile-time only some of the constants. So with code like this the compiler enforces the two arrays to be of equal lenght, but instantiates only one zip_array3 for all possible N:

fn zip_array3<Phantom<const N: usize>, 'a, T, U>
             (a: &'a [T; Phantom<N>], b: &'a [U; Phantom<N>])
             -> impl Iterator<Item=(&'a T, &'a U)> {
    // Zip implementation
    a.iter().zip(b)
}

This requires compiler support because [T; Phantom<N>] currently is invalid syntax. This idea could be generalized to other situations where const generics are used, to keep compile-time correctness and avoid code bloat.


#2

My naive expectation in this design space was that:

  1. rustc/LLVM should at least try to tell when a generic parameter has no effect on the generated code and “elide redundant monomorphizations” as an optimization

  2. If that turns out to not be good enough, we add an attribute on the generic parameter to force the compiler to either not generate different monomorphizations for different values/types or fail with an error if that’s not actually possible.

fn zip_array1<#[do_not_monomorphize] const N: usize, 'a, T, U>
             (a: &'a [T; N], b: &'a [U; N])
             -> impl Iterator<Item=(&'a T, &'a U)> {
    a.iter().zip(b)
}

(https://github.com/rust-lang/rfcs/blob/master/text/1327-dropck-param-eyepatch.md#attributes-on-lifetime-or-type-parameters only introduced attributes on lifetime and type parameters; I’m guessing this is what they’d look like on const parameters)

I don’t know of any advantage to getting the type system involved in this optimization with a Phantom<T>.


#3

My two cents:

  1. The approach of automatically eliding monomorphization if it does not matter for the generated code is difficult, because in many scenarios (including this one) it both technically generates slightly different code and that difference can sometimes make a big difference for subsequent optimizations. For example, a summation function can be greatly improved for small array lengths, in other cases bounds checks can be removed for array lengths above a threshold, etc. – it is difficult for the compiler to automatically make the right call here (of course, “always monomorphize” is often not the right call, but at least it’s predictable).

  2. Annotating parameters to not monomorphize is conceivable, but any proposal in this space has to do the legwork of nailing down exactly what subset of the language can be used with such a parameter (in other words, which restrictions this attribute adds), and ensuring we can feasibly implement the different codegen path for this subset of Rust (see also Idea: polymorphic baseline codegen).


#4

This sort of “you should use an inner function” is already very common with the “Into trick” or thinks taking AsRef; I don’t think there’s anything unique here. Indeed, if you want to write this kind of function today, you can do so by taking impl AsRef<[T]>


#5

An ATC T -> AsRef<[T]> you mean. Also impl Trait does not suffice here.