On Thursday at the language team meeting we discussed const generics, both in terms of RFC 1931 and the work that eddyb has done and plans to do on the const system.
First, the tl;dr version:
- We believe that we can have an RFC accepted and const generics available on nightly by the end of 2017.
- We discussed the major outstanding issues, which have to do with the theory of equality we use in type checking. Iāll summarize the discussio below.
- One thing we want to throw to the community right now is the question of the exact syntax for const generics. We hope that this thread can be a good space for have that conversation.
Once we have the syntax figured out, weāre going to try to draft an RFC which includes that syntax and addresses the equality issues discussed here. This may be a new RFC or it may be an amendment to 1931 (the main reason to make it a new RFC is that weāve had trouble trying to edit significant additions into existing RFCs in the past).
Equality of consts, unification & overlap
A major issue (indeed, probably the most major issue), is determining the equality of two constant expressions during typechecking. This is needed in two different ways:
- Unification: Unification is the process of determining that two types are equivalent. In order to unify two types which have a const expression, we need to know: are the values of those consts equal?
- Overlap: The overlapping impls checks needs to do essentially the inverse of unification. In order to guarantee that no two impls overlap, we have to ensure that the types of the input types to those trait impls cannot be unified. If those types have const expressions, we need to know: are the values of those expressions never equal?
These issues have been touched on in the comment thread for 1931, so if youāve been following that closely this section probably doesnāt have too much new information for you.
Abstract expressions
The first big issue has to do with constant expressions in which one member is an abstract constant parameter. Consider these pairs of types: Do they unify? Do they overlap?
for<N: usize> [i32; {N + 1}];
for<N: usize> [i32; {N + 2}];
for<N: usize> [i32; {N + 1}];
for<N: usize> [i32; {1 + N}];
for<N: usize> [i32; {N + 1}];
for<N: usize> [i32; {N * 2}];
With these examples, you can probably determine the answer by analyzing the expressions, but whatever solution we arrive at needs to be general.
As a first pass, we propose a solution which treats any expression involving a const parameter other than the identity expression as an opaque function. This has a convenient analogy in associated types:
// Do these unify? Do they overlap?
for<T: IntoIterator + Deref> Vec<T::Item>
for<T: IntoIterator + Deref> Vec<T::Target>
The answer here is that these donāt unify and they do overlap, because we need to be conservative. Its possible (though unlikely) that T::Target == T::Item
, and its possible (very likely) that T::Target != T::Item
.
At least at first, we would treat all const expressions containing parameters the same way, except for the identity expression (e.g. N
). Syntactically we may distinguish these expressions with braces, as I have done in my example.
This would mean, for example, that [T; {N + 1}]
and [T; {1 + N}]
would not unify, but would overlap. Its plausible that at a later stage we will develop more sophisticated rules that are aware of the commutatitivity of integer addition and similar well-understand properties, so that simple cases like this will ājust workā how you expect them to.
Disallowing unconstrained const parameters
Consider this impl:
impl<const N: usize> Foo for [i32; {N * 2}] { }
A consequence of treating {N * 2}
as an opaque function is that given eg. <[i32; 4] as Foo>
we cannot resolve the correct impl, because we cannot determine the inputs of the function from its outputs. For this reason we will have a similar rule to the āunconstrained type parameterā rule: every parameter must appear in the impl in an identity expression.
Structural equality, floats, etc
The other issue unification brings up is the exact definition of equality. In particular, it is necessary that this equality be reflexive; otherwise we cannot guarantee that T == T
. Importantly, float equality is not reflexive, because NAN != NAN
. This would mean, for example, that Foo<NAN> != Foo<NAN>
.
For now, it seems the most conservative solution is to limit const params to types which have the structural_match attribute (as stated in RFC 1931). This means, for example, that you canāt have a floating point const param. Someday we may extend what types can be used (for example, by implementing equality of floats as bitwise equality), but that is a future extension.
This also means that a const generic cannot be of a generic type, because we have no way to constrain that type to implement structural equality (since it is an attribute and not a trait).
Syntax
Now, to the syntax. This has several parts:
- How should a declaration of a const parameter look? (eg
impl<$CONSTANT>
) This needs to contain both the name and type of the constant. - How should a use of a const parameter look? (e.g.
impl... for Foo<$CONSTANT>
) This need only contain the name, not the type. - How should a const expression look? (Should it be enclosed in curly braces or not?)
- Should we separate the const params from the other params with any special character (eg
;
) or just a comma? - Stylistically, should const params be snake_case or SHOUTING_SNAKE_CASE, and should we lint about it?
To start the conversation, Iāll give my own opinion (not discussed at the lang team meeting):
@withoutboatās const generics syntax
Declaring a const param:
impl<T; const N: usize>
When declaring a const param, the const
keyword is used, followed by the type. The use of const is to sharply visually distinguish the consts from the type parameters. A semicolon also separates the type parameters from the consts to be consistent with use sites.
Using a const param:
Foo<T; N>
When using a const parameter, just the name is used, no const
keyword or type. A semicolon separates const params from type params, so that you can determine from the use site that N
is a const, not a type.
The semicolon is also used to make this syntactically consistent with the construction of arrays: [T; N]
.
Const expressions:
Const expressions containing constant params are contained within braces, e.g. [T; {N * foobar(N, 2)}]
. The only exception is the identity expression: N
. This has several advantages:
- This visually distinguishes those constant expressions which have to be dealt with specially by the compiler from those which do not.
- This makes it easier to parse arbitrary expressions (e.g. comparisons:
Foo<{N > 7}>
).
Expressions containing no params can also be wrapped in braces, but it is not necessary except for parsing reasons (e.g. Foo<6 > 7>
).
Casing:
Like all constants, constant parameters should by in SHOUTING_SNAKE_CASE, and any other case will be linted against just the same as other constants.