I’ve thought about this proposal a bit more, and I am afraid it’s quite incomplete.
Syntax is debated to no end in every language, so being able to settle on the syntax is already something; however, there are potentially many other issues to resolve.
For example: the examples only show constants being passed around, but what about expressions? C++ non-type template parameters mattered because you could use expressions from the get-go (even if support was quite barebones), so at the very least I would expect a plan for how to deal with expressions, even if they are not introduced immediately. Backward compatibility is for life…
Another unresolved question is how to deal with usize and isize. There are two possibilities:
Use a “virtual machine” based on the target for which the code is emitted
Use symbolic reasoning (so usize::MAX - 1 is non-reducible and distinct from 4294967294 even on 32-bit platforms)
It’s possible to limit the RFC to non-target-dependent code for now; it will require being careful with floating-point values (we’ll probably want strict IEEE compliance).
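To make the divergence concrete, here is a small sketch (mine, not from the post) of the number the two evaluation strategies disagree about:

```rust
fn main() {
    // Under target-based ("virtual machine") evaluation, `usize::MAX - 1`
    // reduces to a concrete number that differs per target: 4294967294 on
    // a 32-bit target, 18446744073709551614 on a 64-bit one.  Symbolic
    // reasoning would instead leave the expression unreduced.
    assert_eq!(u32::MAX as u64 - 1, 4_294_967_294); // the 32-bit value
    println!("{}", usize::MAX as u64 - 1); // whatever the host target gives
}
```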
A strategy for dealing with underflow/overflow in compile-time evaluation needs to be defined. If the value is required at compile time then it should probably be a hard error, though that choice may constrain potential evolutions later.
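As a sketch of the hard-error option, the Rust that eventually shipped (well after this thread) treats overflow reached during const evaluation as a compile-time error; a checked_add-based const fn (my example) makes the failure mode explicit:

```rust
// Hypothetical helper: overflow during compile-time evaluation becomes an
// explicit panic, which the compiler reports as a hard error if it is
// reached while computing a constant.
const fn add_or_fail(a: u8, b: u8) -> u8 {
    match a.checked_add(b) {
        Some(v) => v,
        None => panic!("overflow in const evaluation"),
    }
}

const OK: u8 = add_or_fail(200, 55); // fine: 255
// const BAD: u8 = add_or_fail(200, 56); // would be a compile-time error

fn main() {
    assert_eq!(OK, 255);
    println!("{}", OK);
}
```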
That’s all I could think of in the last day or so, but I suspect there are other issues I did not think of…
I was wondering how a function that is generic over a constant might be compiled.
Specializing the function for each constant it’s applied to, as we do with type parameters, seems like overkill. It’s ungainly to have 33 implementations of Clone in the source as we do now, but it’d also be ungainly to have a separate copy of the machine code for every possible size. Invisible ungainliness is asking for trouble.
Instead, we could produce LLVM code for the function once, having it take the generic constant parameter as an invisible run-time parameter. Each call site must know its value for the generic argument, so it can pass it as needed. Then, LLVM can decide when it’s best to inline the function and propagate the constant, and when it’s best to actually pass it at run time—just as it does now for any other function call.
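A hand-written version of that lowering (names are mine, not from the proposal): compile one body that takes the size as an ordinary run-time argument, and rely on inlining plus constant propagation to recover specialized code when the call site uses a literal:

```rust
// One compiled body for all sizes: `n` is the "invisible" parameter made
// visible.  Callers that pass a literal give the optimizer the chance to
// inline and constant-fold; other callers just pass it at run time.
#[inline]
fn count_up(n: usize, out: &mut Vec<usize>) {
    out.clear();
    for j in 0..n {
        out.push(j);
    }
}

fn main() {
    let mut v = Vec::new();
    count_up(4, &mut v); // the call site knows the value and passes it along
    assert_eq!(v, vec![0, 1, 2, 3]);
    println!("{:?}", v);
}
```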
When a generic function is defined it isn’t monomorphized to every possible type parameter but rather when it is actually referred to. If I only clone an array with a size of 273 then only that impl gets monomorphized and the rest of them don’t exist. Having a single generic impl that can be monomorphized as needed is significantly more efficient than having the 33 impls for just the smallest sizes.
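With the const generics Rust eventually gained (long after this thread), the single-generic-impl picture looks like this sketch: one definition covers every array length, and only the lengths a program actually uses get monomorphized.

```rust
// One generic function covers every array length; only the lengths that
// are actually referred to produce machine code.
fn first<T: Copy, const N: usize>(a: [T; N]) -> Option<T> {
    a.first().copied() // None only when N == 0
}

fn main() {
    let a = [7u32; 273];
    // Only first::<u32, 273> and first::<u8, 0> are instantiated here;
    // no code exists for any other size.
    assert_eq!(first(a), Some(7));
    assert_eq!(first::<u8, 0>([]), None);
    println!("ok");
}
```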
Having integral numbers as template arguments, implemented as in C++, will offer many useful opportunities.
Not having a “const” tag before the value (if this is possible) is a good idea, but having a type annotation is also useful (as in the D language):
impl<n: usize, m: usize> std::ops::Add<Matrix<n, m>>
for Matrix<n, m> {
type Output = Matrix<n, m>;
}
But in some cases you want to use such compile-time template arguments (like these matrix sizes) only to perform compile-time correctness validation; to avoid template bloat, you don’t want to actually instantiate new code for each set of integral values. In that case the integral arguments are ghosts: they exist only for type checking, and if necessary they are passed as run-time values to the function(s) in some way.
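A sketch of the “ghost” idea using the const generics Rust later gained (my names, not the post’s): the dimensions live only in the types and the checking, while the element loop is shape-agnostic:

```rust
// R and C are "ghosts": they gate which additions type-check, but the data
// is stored dynamically, so the loop body is identical for every shape and
// per-size code bloat stays minimal.
struct Matrix<const R: usize, const C: usize> {
    data: Vec<f64>, // R * C entries
}

fn add<const R: usize, const C: usize>(a: &Matrix<R, C>, b: &Matrix<R, C>) -> Matrix<R, C> {
    Matrix {
        data: a.data.iter().zip(&b.data).map(|(x, y)| x + y).collect(),
    }
}

fn main() {
    let a = Matrix::<2, 2> { data: vec![1.0; 4] };
    let b = Matrix::<2, 2> { data: vec![2.0; 4] };
    let c = add(&a, &b);
    assert_eq!(c.data, vec![3.0; 4]);
    // add(&a, &Matrix::<2, 3> { data: vec![0.0; 6] }); // would not type-check
    println!("{:?}", c.data);
}
```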
Please correct me if I’m wrong. As I understand it, the real discussion around the const prefix is about the usage site, not the declaration site. And the main question is whether the parser will be able to handle such cases. Right?
When a generic function is defined it isn't monomorphized to every possible type parameter but rather when it is actually referred to.
Yes; I shouldn't have written "every possible size". What I was thinking was "every used size".
But @notriddle's objection is much more serious. Although, if C11 can handle dynamically-sized arrays, I assume Rust can as well. The following is valid C:
int f(int n, int i)
{
    int a[n];
    int j;
    for (j = 0; j < n; j++)
        a[j] = j;
    return a[i];
}
In the past I’ve fought vehemently against allowing any sort of alloca in Rust. I used to have a list of really good reasons, but I’ve forgotten about them by now.
And, in any case, @notriddle’s example involves returning the array, which isn’t legal in C. I don’t even think it’d be possible to implement reasonably: the data needs to live in the stack frame of the callee, so the caller would have to somehow know where the data lives and where it can safely move the stack pointer after the return.
It seems to me that @notriddle’s example could work without being statically specialized, because the length is an input parameter. Its binder is inside the angle brackets instead of the parentheses, and the compiler might be able to do term inference for it at the call site instead of needing it specified explicitly; either way, it’s supplied by the caller to the callee, which means the caller can also supply enough space for the return value.
Another thing, which I realized while trying to write that out with hypothetical dependent types (and then decided that just made it more confusing, not less), is that this is weaker than alloca: even if the compiler doesn’t emit specialized code, it could (I think?) still do enough monomorphization to statically reason about the set of possible values of each type-level integer. For example, it should be able to determine the maximum possible size of each non-constant alloca it emits.
That might help somewhat with the reason I’ve seen for avoiding alloca and variable-length arrays in C, which is that they’re unsafe and a known security hazard if the length is untrusted (assuming the implementation doesn’t check for stack overflow and arithmetic / address space overflow).
All that said, I wonder if there are cases where the programmer would specifically want specialization — e.g., to allow loop unrolling, or avoid expensive operations like division by a non-immediate.
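For instance (my example, not from the thread), a divisor that is a const parameter can be strength-reduced into shifts and multiplies, while a run-time divisor must compile to an actual divide:

```rust
// When N is a compile-time constant, the optimizer can replace `x / N`
// with cheaper shift/multiply sequences in the specialized copy.
fn bucket_const<const N: u64>(x: u64) -> u64 {
    x / N
}

// With a run-time divisor there is nothing to specialize on.
fn bucket_dyn(x: u64, n: u64) -> u64 {
    x / n
}

fn main() {
    assert_eq!(bucket_const::<8>(20), 2);
    assert_eq!(bucket_dyn(20, 8), 2);
    println!("{}", bucket_const::<8>(20));
}
```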
In C, when a function returns an aggregate like a struct, the caller passes an invisible parameter pointing to the space in its stack frame where the callee should store the return value. Wouldn't this work for arrays just as well? In @notriddle's example, a caller would be responsible for (implicitly) passing an adequately-sized buffer, the value for len, and finally the real parameter input.
If the caller were itself generic in the size, that buffer would be obtained via alloca-like behavior. If the caller were not generic, then it would be passing a constant for len and a pointer to a fixed-size buffer to hold the return value.
The usual objections to alloca revolve around bad sizes. The sizes involved here would be fixed at compile-time.
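A Rust-flavoured sketch of that convention (my names, assuming nothing beyond the post): the callee writes through storage the caller provides instead of returning the array by value:

```rust
// The "invisible parameter" made explicit: the caller allocates the return
// buffer (a fixed-size array here, since this caller knows the size at
// compile time) and passes it to the callee to fill.
fn fill(out: &mut [i32]) {
    for (j, slot) in out.iter_mut().enumerate() {
        *slot = j as i32;
    }
}

fn main() {
    let mut buf = [0i32; 5]; // non-generic caller: fixed-size buffer
    fill(&mut buf);
    assert_eq!(buf, [0, 1, 2, 3, 4]);
    println!("{:?}", buf);
}
```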
So you're saying that, just as Rust lets us choose whether to pass dynamically-dispatched trait objects, or use type parameters to request compile-time specialization, by the same token it might be valuable to make specialization the presumed implementation for integer generic parameters as well? That makes sense.