[lang-team-minutes] Const generics

Yes, this is not useful for ‘writing functions over consts’ or anything like that; that is what const fn is for. You technically can, but this is what it looks like:

trait UsizeToUsize<const N: usize> {
    const OUT: usize;
}

struct Increment;

impl<const N: usize> UsizeToUsize<N> for Increment {
    const OUT: usize = N + 1;
}

// vs
const fn increment(n: usize) -> usize { n + 1 }
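
For comparison, here is a minimal sketch of how the two forms might be used at a call site. It restates the definitions so the block stands alone and assumes the const generics syntax as it later shipped in Rust; the names `VIA_TRAIT`, `VIA_FN` and `buffers` are made up for illustration.

```rust
trait UsizeToUsize<const N: usize> {
    const OUT: usize;
}

struct Increment;

impl<const N: usize> UsizeToUsize<N> for Increment {
    const OUT: usize = N + 1;
}

const fn increment(n: usize) -> usize {
    n + 1
}

// Hypothetical constants: the same computation, once through the trait
// and once through the const fn.
const VIA_TRAIT: usize = <Increment as UsizeToUsize<3>>::OUT;
const VIA_FN: usize = increment(3);

// Both results can be used where a constant is required, e.g. array lengths.
fn buffers() -> ([u8; VIA_TRAIT], [u8; VIA_FN]) {
    ([0; VIA_TRAIT], [0; VIA_FN])
}
```

The trait version needs a fully qualified path at the use site, which is most of the reason the const fn reads better.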

Regarding the requirement to introduce them with const, as @notriddle explained, we need to distinguish kinds prior to typechecking. It seems like the keyword const is the best choice, since const already means “compile-time value” in Rust. The main alternative was to always require a ;, as in struct Foo<; N: usize>, which seems less appealing.

Oh, good point, but…

Doesn't that also rule out (in a semicolon-less syntax) allowing single identifiers to be specified as const parameters without braces, i.e. Foo<N>, as @eddyb suggested?

After all, N could be defined as both a type and a const.

The difference there is that if you know where the definition of Foo is (which is what path resolution does), you also know which parameters are marked const.
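
To make the “N could be both a type and a const” case concrete, here is a small sketch. It assumes the const generics syntax as it later shipped in Rust, and every name in it (`N`, `Foo`, `Bar`, `foo_len`, `bar_size`) is hypothetical.

```rust
// `N` is declared twice, once in each namespace.
type N = u8; // type namespace
const N: usize = 3; // value namespace

// The definition of `Foo` says its single parameter is a const.
struct Foo<const LEN: usize>;

impl<const LEN: usize> Foo<LEN> {
    const fn len() -> usize {
        LEN
    }
}

// The definition of `Bar` says its single parameter is a type.
struct Bar<T>(T);

// Braces mark the argument as a const expression, so `N` is looked up as a value.
fn foo_len() -> usize {
    Foo::<{ N }>::len()
}

// In a type-parameter position the same identifier names the type alias.
fn bar_size() -> usize {
    std::mem::size_of::<Bar<N>>()
}
```

The braces are used here only to keep the example unambiguous; the point of the post above is that the declaration of `Foo` already carries the information needed to pick a namespace.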

We use this property, albeit after path resolution, but still before the “bringup of the typesystem”, to perform lifetime elision nowadays - we can’t reason about how the lifetime parameters are used, but we at least know how many for each definition that a regular path (but not a method call or T::Assoc, those require type-directed resolution) refers to.

To summarize the responses to my post: It seems like we need const or ; to simplify the internal implementation within the compiler.

It disappoints me that we need to compromise language ergonomics for implementation simplicity, but yes, I completely understand that the priority for implementation simplicity can outweigh the priority for ergonomics.

It’s not just a matter of “simplifying the compiler.” The Lexer Hack makes parsing unfinished code less reliable (like in an IDE), and it can make it harder for the human reader (having to go to the definition before you can even tell types and values apart).

You could say that you start parsing type syntax instead of bounds and the existing syntax today more or less ends up being a trait object, and only then do you actually treat it as a list of bounds.

But struct Foo<'a, T: 'a + Copy>(&'a T); works without 'a + Copy being a valid type.

If you can come up with a sane grammar that allows both the existing T: (lifetime bound|trait_bound)+* and N: type, and have a rule to tell between them that’s based only on path resolution results (i.e. you can tell between type A = ...; or trait B {...} but you can’t know what type A resolves to or what types implement B), then we can discuss it as a real alternative.

Layering is important for language comprehensibility, not only compiler simplicity. Our language becomes an inextricable web of interacting features if we don’t do clean separations like this. I also just don’t agree that having to write the const keyword when introducing a constant parameter is any significant burden or speedbump.

FWIW, you guys know the internals of the compiler, I do not. As such, I've already been convinced.

If you can come up with a sane grammar that allows both the existing T: (lifetime bound|trait_bound)+* and N: type, and have a rule to tell between them that's based only on path resolution results

This seems fun though, and helps me understand the compiler internals better, so I'll give it a try.

Disclaimer, I am not a type theorist, and I don't understand the internals of the compiler, so forgive me if I use the wrong terms, I'm just going off intuition.

In the grammar why not just treat them all as Kinds? T: (Kind)+*; Where Kind: (lifetime|trait_bound|type)

Then some time after (or during) type resolution, but before monomorphization, we check for these generic parameters' "downkinds".

Where "downkind" is defined as:

trait_bound "downkinds to" const_immutable_type;
/// Where the type impls the trait bound.
/// By definition all types are const and immutable, but I'm calling it out explicitly.

type "downkinds to" const_immutable_value;
/// Where the value is of type.

// This gets weird I don't fully know how to describe it
lifetime(eg. 'a) "downkinds to" const_immutable_lifetime(eg. 'b);
/// Where 'b is >= 'a.

We could additionally, consider both Values and in the future HKTs as Kinds.

value "downkinds to" nothing.
/// As such any `struct Foo<'a, T: 1>` would throw a compile time error

HKT "downkinds to" (const_immutable_type|const_immutable_hkt)

This Kind + "downkinding" rule is how my brain parses and groks how to use generic function definitions. Please let me know where my intuition is flawed. Thanks for putting up with me :slight_smile:

You can’t wait that long, because you need to know whether a parameter is const during path resolution, so you use the correct namespace. You have to be able to tell between kinds before understanding types or querying for trait impls.

New problem:

trait MyTrait<const X: bool> {
    fn function() { }
}

impl MyTrait<true> for () { }
impl MyTrait<false> for () { }

fn foo<const X: bool>() {
    <() as MyTrait<X>>::function()
}

Is this code well-typed? Can we determine that MyTrait is implemented exhaustively for all booleans for ()? When can we determine this?

Seems connected to our ability/inability to determine match exhaustiveness on literals of certain types (yes for bools & enums, no for integers).

New problem: [...] Is this code well-typed?

I think we should hold off on functionality like this until after the RFC has been implemented, and gate it as a separate feature, e.g. together with allowing where clauses, since it appears to me the problems to solve are quite similar. Until then we should forbid any const-dependent impl from being used in a context more generic than an impl it fits. Otherwise it might be hard to keep the 2017 announcement, and stabilisation may not arrive next year either, leaving the feature stuck in "unstable limbo" like so many great upcoming features.

I talked to @eddyb on IRC and he agrees that this should be a second iteration. For now it seems the compiler will never conclude that a set of impls is exhaustive for a const value, and will instead treat all consts as open, just like types are (what would be interesting about extending this exhaustiveness logic is that it would essentially enable ‘sealed’ traits).
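
As a rough sketch of what “treating consts as open” means for the example above: the generic function has to state the bound itself rather than rely on the two impls being recognised as exhaustive. This assumes the const generics syntax as it later shipped in Rust, and the `&'static str` return values are added purely so the behaviour is observable.

```rust
trait MyTrait<const X: bool> {
    fn function() -> &'static str;
}

impl MyTrait<true> for () {
    fn function() -> &'static str {
        "true impl"
    }
}

impl MyTrait<false> for () {
    fn function() -> &'static str {
        "false impl"
    }
}

// The bound is written out explicitly; nothing is inferred from the fact
// that both `true` and `false` happen to have impls.
fn foo<const X: bool>() -> &'static str
where
    (): MyTrait<X>,
{
    <() as MyTrait<X>>::function()
}
```

Callers pick a concrete value, for example `foo::<true>()`, and the bound is checked against the matching impl at that point.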

I quite like this. Especially if it's encouraged (via rustfmt, or something) for places not in the usual lifetimes-types-consts order, like SmallVec<T, N, type A>.

I don’t think that the assertion that “constants are SHOUTY_SNAKE_CASE” is right.

I think that the right statement at the moment is “globals are SHOUTY_SNAKE_CASE”, since upper case is used for global constants but also for global statics that are not constant.

It makes sense not to use upper case for const generics, since they are not globals.

Consts and statics need not be global though, you can declare one inside a function. The shoutiness seems to come from the fact that they’re values that are known at compile time.
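
A tiny illustration of that point, with made-up names (`area`, `WIDTH`, `HEIGHT`): both items are scoped to the function body and are not visible outside it.

```rust
fn area() -> u32 {
    // Neither item is global; both are only nameable inside `area`.
    const WIDTH: u32 = 4;
    static HEIGHT: u32 = 5;
    WIDTH * HEIGHT
}
```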

Interestingly though we don’t use it for const fns.

I didn’t know that consts and statics could be declared inside functions. But “shoutiness for values known at compile time” is not consistent anyway, since statics are not known at compile time.

If you mean that static muts can mutate at runtime, that’s true, but they exist in the binary; they’re not created at runtime.

Yes, they are inside the binary, but that’s not what I call being a value known at compile time. You can say that you know their memory location at compile time.

I really think that we should not have a bad syntax for a really important feature just to fit a convention that is not clearly related to the feature and doesn’t bring any benefit. Formatting conventions exist to make code more readable. Shouty-case constants in generics would make the code more difficult to read.

Well, as you probably know, the actual origin of the shoutiness is C, where constants are often defined as macros, and macros are usually uppercased because they’re macros. (C enums are often but not always uppercased to match; uppercasing macros, however, is near universal.) Rust statics aren’t always constant, but they usually are, and non-constant statics are basically unidiomatic to start with; so the analogy fits.

In Rust, if there’s a benefit to shoutiness, I think it comes from identifying globals in positions where you’d usually see a local variable - saving readers of the code from “where is this defined?” confusion. It makes sense not to apply this to functions, whether const or not, because most references to functions are direct calls, and most calls are to global functions, not local variables (the opposite of the usual rule), avoiding confusion. As you say, consts and statics don’t have to be global, but again they usually are; using them in functions is unidiomatic in almost all cases. So it doesn’t really matter one way or the other, whether ones in functions look nice or whether shoutiness is warranted for them.

Const parameters can be either local or semi-global (if scoped to a big impl block), but even the latter must be introduced explicitly in the source file rather than, say, coming in through a glob import (or even appearing in a long list of explicit imports; not like anyone reads those upfront). I think I’d prefer lowercase, or even differing case depending on whether they’re local or not. But that’s just my opinion.

You make a good point that we should be centering the practical impact of the style choice over a platonic kind of consistency. I do think there’s a benefit to making them shouty, in that it makes it easier to explain the style than if they are not (consts and statics are shouty, always, is an easy thing to explain).

So it seems like syntactically we have pretty broad consensus on most things. That is, we agree that we will solve the declaration parsing problem with const $ident: $ty syntax. What remains to be determined is application, and specifically how (or if) we will distinguish const applications from type applications.

  1. We don’t make any effort to distinguish them; consts and types are separated by a comma, they are most often single uppercase letters, you just can’t tell them apart.
  2. We separate them with a ;, making them distinguishable from one another if (and only if) there are both types and consts in the use site.
  3. We declare them using non-shouty snake case, making them distinguishable through style.
  4. We have a preference for full names, tending to make them distinguishable through style.
  5. We don’t allow identity expression const params to be outside of braces, so you can always distinguish {N} from T.

I think I’ve come around to the non-; position, and I seem to have been the primary holder of that view.

Three of the remaining options are just ‘style’ preferences, rather than syntax:

  1. Consts should be shouty snake case, with a preference for 1 character names.
  2. Consts should be shouty snake case, with a preference for whole word names.
  3. Consts should be non-shouty snake case.

Turned another way, this is one of those “three valuable things, choose two” sort of situations:

  1. Const params should be differentiable from type params.
  2. Const params should have as short a name as possible.
  3. Const params should be consistent with non-param consts.

I personally tend to value 1 and 3, leading to name-length shouty snake case, but YMMV.
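
To see how the naming options read side by side, here is a sketch using the const generics syntax as it later shipped in Rust. The types `ShortName` and `LongName` and the function `braced` are hypothetical and exist only to contrast the styles.

```rust
// Shouty, one-character name: at the use site `N` looks just like a type parameter.
struct ShortName<T, const N: usize> {
    items: [T; N],
}

// Shouty, whole-word name: `LEN` is distinguishable from `T` by style alone.
struct LongName<T, const LEN: usize> {
    items: [T; LEN],
}

impl<T, const N: usize> ShortName<T, N> {
    fn capacity(&self) -> usize {
        self.items.len()
    }
}

impl<T, const LEN: usize> LongName<T, LEN> {
    fn capacity(&self) -> usize {
        self.items.len()
    }
}

// The braces-only option from the list above: `{ N }` can never be read as a type.
fn braced<T, const N: usize>(buffer: &ShortName<T, { N }>) -> usize {
    buffer.capacity()
}
```

With `ShortName<T, N>` the reader has to look at the declaration to tell the two parameters apart, while `LongName<T, LEN>` and `ShortName<T, { N }>` make the difference visible at the use site.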
