[Idea] Exhaustiveness checks for const generic impls

Changelog
  • Mention control flow in item/type position alternative approach
  • Mention the specialization-based alternative approach
  • Reorganized to put more in the “future extensions” section
  • Add @comex’s design for implementing tuples for traits
  • Mention that current design is restricted to generic parameters only, discuss headroom for extension to type parameters
  • Formatted into multiple sections
  • Added more examples to the motivation section
  • Added “exhaustive impl” proposal from @binomial0

Motivation

Consider the following code:

struct Bool<const B: bool>;
trait Trait {}
impl Trait for Bool<{true }> {}
impl Trait for Bool<{false}> {}

In current Rust, Trait is not implemented for Bool, i.e. using a Bool where T: Trait is expected will fail. I believe that this behavior is surprising, and that there should be a way to implement a trait for a const-generic type by implementing it for all values of its generic parameter.

Of course, it can be argued that this language feature is rarely necessary and can often be emulated by using other existing language constructs. For example, the above trait implementation can be trivially rewritten as:

impl<const B: bool> Trait for Bool<{B}> {}

However, this is not always the case. For example, the original motivation for this proposal is that the following trait implementation cannot be rewritten as a single impl block:

// Pick a type among two choices according to a const bool value
trait Select<T, F> { type Out; }
impl<T, F> Select<T, F> for Bool<{true }> { type Out = T; }
impl<T, F> Select<T, F> for Bool<{false}> { type Out = F; }

Further, if the trait implementation strongly diverges from one value of the const generic parameter to another, packing all implementations into a single impl block can lead to less readable code.

For example, it can be argued that this kind of branchy impl block…

trait ComplexTrait {
    const FOO: u32;
    fn bar();
}

impl<const B: bool> Trait for Bool<{B}> {
    const FOO: u32 = if B { some_const_fn() } else { some_other_const_fn() };
    fn bar() { if B { some_code() } else { some_other_code() } }
}

…is easier to read as a pair of disjoint impl blocks…

impl Trait for Bool<{true}> {
    const FOO: u32 = some_const_fn();
    fn bar() { some_code() }
}

impl Trait for Bool<{false}> {
    const FOO: u32 = some_other_const_fn();
    fn bar() { some_other_code() }
}

…and this readability issue will only become more acute as the complexity of the const generic parameter grows, for example if people try to use an enum with many variants as a const generic parameter.

Proposed design space

The simplest way to allow the above code would be to perform exhaustiveness checking on the const generic impls and declare the trait to be implemented for the type if it is implemented for all values of its const generic parameters (and other type parameters).

However, if this is felt to be too implicit, a more explicit “impl match” construct could be introduced instead. Here’s a mockup for your bikeshedding pleasure:

impl<const B: bool> Trait for Bool<{B}> match B {
    true => { /* impl of Trait for Bool<{true}> goes here */ },
    false => { /* impl of Trait for Bool<{false}> goes here */ },
}

Personally, I prefer more “implicit” options, because “explicit” syntax (as done above at least) is quite noisy, induces some rightward drift, may complicate parsing, and does not resolve the ergonomic problem of the compiler behaving in a surprising way on the code at the start of this post.

But I know that some of us care a lot about being explicit, so I’m open to discussing such options as well. After all, there may be nicer explicit syntaxes that I haven’t thought about, or more complex use cases that can only be handled using an explicit syntax (think match guards).

As an interesting middle ground between the fully-implicit form and the fully-explicit form, @binomial0 made the case that users should explicitly assert that they believe the trait was implemented for all variants of the const generic type:

exhaustive impl Trait for Bool;

This would allow the compiler to report an error in cases where not all values of the const parameter were actually covered, much like match is able to tell us when not all values of an enum are covered today.

The compiler could also trivially lint about lack of an “exhaustive impl” statement when it’s clear that the code’s author has simply forgotten to add one.

Future extensions

Exhaustive impls for type parameters

As currently proposed, this feature is only usable for exhaustive enumeration of const parameters of a generic type. It would not be possible to use it in order to exhaustively implement a trait for all possible type parameters of a generic type.

To understand why, consider that the reason why this proposal makes sense at all is that some const generic parameters, by construction, only accept a finite set of values, which can therefore be enumerated in a finite set of impl blocks. This is trivially true of exhaustive enumerations (bool and any enum not marked #[non_exhaustive]), and somewhat less trivially true of other types such as machine integers.

The same cannot be said of type parameters, however. Unbounded type parameters, such as Vec<T>, trivially accept an infinite number of types and are therefore not suitable for the trait implementation strategy presented here. Less obviously, trait bounds cannot be used to restrict the set of types which a generic type accepts as parameters to a finite set, because it is always possible to roll new implementations of a trait.

One solution would be sealed traits, which can be partially emulated today using pub(crate) traits. This would allow only the current crate to break an exhaustive impl by adding new trait implementations, which could be considered acceptable much like the possibility of breakage induced by adding a new enum variant is generally considered acceptable as long as only the crate providing the enum can do it (and takes care to either mark the enum #[non_exhaustive] or tag a new major semver version when it changed it).

I believe, however, that if sealed traits were to become so important as to affect how trait impl blocks can be written, they should probably cease to be library-based constructs and receive first-class syntaxic support from the language.

Exhaustive impls for associated const bounds

@comex also proposed using exhaustive disjoint impls based on associated consts (rather than const parameters as discussed above) as part of a strategy for implementing traits for tuples using a recursive head/tail + empty case design:

enum TupleKind {
    Empty,
    NonEmpty
}
trait Tuple {
    const KIND: TupleKind;
}
impl<T> Foo for T where T: Tuple<KIND=TupleKind::Empty> { ... }
impl<T> Foo for T where T: Tuple<KIND=TupleKind::NonEmpty> { ... }

Any language feature aiming to resolve the above problem should probably also allow for this, given that the difference between const parameter values and associated const bounds is irrelevant to the problem at hand.

However, it should be noted that enabling this particular use case will require further language changes, as Trait<CONST=value> associated const bounds are not valid in Rust today.

So unless we want to resolve that problem at the same time (which yields a bigger RFC, and thus reduces the odds of acceptance), we should only consider this as a future extension that any proposed syntax must support, and not as an issue that we want to directly address today.

Alternatives

Const generics specialization

As @gnzlbg points out, the proposed functionality can be approximated using const generics specialization:

struct Bool<const B: bool>;
trait Trait {}
default impl<const B: bool> Trait for Bool<B> {}
impl Trait for Bool<{true }> {}
impl Trait for Bool<{false}> {}

One remarkable property of this approach, which can be either an advantage or disadvantage depending on the use case, is that Rust will continue to consider the trait impl to be correct if the type used as a const generic parameter is extended (for example, if a new variant is added to an enum type).

This can improve forward compatibility, or it can cause the codebase to silently break without a compiler error, depending on whether the use case allows a sensible default impl or not.

A net drawback of this approach is that a default implementation of the trait must be written even when there is no real need for it. For example, in the above case, a bool can not and will never be anything else than true or false, so the default impl is unnecessary boilerplate that will need to be filled up with dummy values, types, and unreachable!() statements for no good reason.

Control flow in impl or type position

A completely different way to approach this issue would be to allow some degree of control flow based on const values in type or item position. For example, the above Select functionality could be implemented like this…

trait Select<T, F> { type Out; }
impl<T, F, const B: bool> Select<T, F> for Bool<{B}> {
    type Out = if B { T } else { F };
}

…or even, given permissive enough syntax, by avoiding traits entirely like this:

type Select<T, F, const B: bool> = if B { T } else { F };

…not to mention that we could also handle completely unrelated use cases like this:

match SOME_ENUM_CONST => {
    Enum::VariantA => fn foo() { /* does something */ },
    Enum::VariantB => fn foo() { /* does something else */ }
}

Such “meta-control flow” would be an extremely powerful metaprogramming tool. It would have the potential to completely supersede existing functionality of perfectible ergonomics like #[cfg()]. But it would also be incredibly dangerous.

If this functionality is not appropriately restricted, we could end up in scenarios where a library changing the value of a public const could massively affect (and possibly break) other libraries. It would also potentially have major implications in the design of rustc and the place that const evaluation takes in the compilation process.

This is a much more complex feature to design, with many more trade-offs and hidden implications and I don’t think the Rust community is ready to face this proposal in this “fallow year”. And speaking personally, I certainly am not ready to design and propose that language feature.

Your turn

Do you think that it would make sense to turn this idea into an RFC? Did you see oversights in the above post, or do you have further thoughts on the topic to add? Feel free to comment!

2 Likes

This seems similar to

trait Foo {}
trait Bar {}

impl<T: Foo> Bar for T {}

Then Rust can unfer that if you have the bound Foo, you can access Bar. So I would expect that this works like the implicit case.

3 Likes

I like this idea, and I hope there won’t be any weird edge cases or problems during implementation. But writing a RFC sounds like a good idea.

I also really believe that some kind of explicit annotation should be required. That’s because a really great feature of Rust’s enums is that if you match on them but forget a variant (or a variant is later added to the upstream definition), you get a compiler error right away. I think we also want this behavior for the proposed feature.

If you meant to implement Trait for Bool by implementing it for all variants, you should be able to write down that intention, and the compiler should emit an error if you missed a variant.

I don’t really have an opinion about the form of this explicit annotation. Your explicit mockup looks good, although a bit noisy (as you already said).
Another variant would be to write down all the special cases like in your implicit proposal and then add something like

exhaustive impl Trait for Bool;

This would

  • assert that Trait is indeed implemented for all variants of Bool
  • make it so that a Bool that isn’t specified any further actually implements Trait.
2 Likes

Yes, something like this seems useful, as it could replace the pattern we see in for example byteorder:

trait ByteOrder { ... }

enum LittleEndian {}
impl ByteOrder for LittleEndian { ... }

enum BigEndian {}
impl ByteOrder for BigEndian { ... } 

LittleEndian and BigEndian are empty enums/void/never types and are only meant as values at the type level. Like for example in a serializer, where you might have a different implementation based on if the target endianness matches the host endianness (simple copy vs swapping bytes):

trait SerializeBytes { ... }
struct Serializer<B: ByteOrder> { ... }

impl SerializeBytes for Serializer<LittleEndian> { ... }
impl SerializeBytes for Serializer<BigEndian> { ... }

In the current system this, correctly, does not mean impl SerializeBytes for Serializer, as there is no way to express in the type system that ByteOrder is a closed set except with hacks like a private sealing trait. impl<B: ByteOrder> SerializeBytes for Serializer<B> { ... } is the only option, but then you have to use workarounds to be able to match on B inside the actual implementations (for example an associated const).

Using const generics ByteOrder could just be an enum const parameter and it would be guaranteed that LittleEndian and BigEndian are the only inhabitants and SerializeBytes is implemented for all possible values. This exhaustiveness check would thus make sure SerializeBytes is implemented for Serializer.

However, while thinking more about this while writing this, it might not strictly be necessary, as with const generics we have already elevated these markers to actual values which we can compute with and match on. Bool might also be implemented as:

impl<const B: bool> Trait for Bool<{B}> {
    fn invert(self) -> Self {
        match B {
            true => Bool<{false}>,
            false => Bool<{true}>
        }
    }
}

With just normal matches in all places where you want differing behavior. This would be less ergonomic but has the exact same effect. That being said I still believe an RFC for this would be a good idea.

1 Like

Just to clarify this point a bit, the Bool example in the OP is a purposely minimized example, but my original use case cannot work with an implementation that uses impl<const B: bool> Trait for Bool<{B}>:

// Machinery for type level if-then-else
struct Bool<const B: bool>;
trait Select<T, F> { type Out; }
impl<T, F> Select<T, F> for Bool<{true }> { type Out = T; }
impl<T, F> Select<T, F> for Bool<{false}> { type Out = F; }

// if B then T, else F.
type If<T, F, const B: bool> = <Bool<{B}> as Select<T, F>>::Out;

In other cases, a solution based on ifs or matches on the const parameter could work, but will be less ergonomic than fully disjoint impl blocks because every method and const of the trait will potentially be full of conditional logic.

I should probably edit the OP a bit to further describe those sorts of motivations (EDIT: Done!).

This is the kind of thing we intentionally left for future extensions. We need to explore the full implications of what this will enable, and there’s a possibility (that is, I genuinely don’t know if its a problem or not) of creating forward compatibility issues similar to the ones caused by !Trait bounds.

For example, there are obvious problems with treating user defined types (which may change in the future) as exhaustively covered. These problems already come up in practice with match of course. But what are the implications of importing those problems into the trait system, which has been designed to have an “open minded” view of the world in which its results are always compatible with discovering additional, coherent implementations?

5 Likes

Thanks for this very interesting track of thought! Can you give me an example of a compatibility hazard that would exist if this exhaustive impl feature were accepted, but would not already exist with only impl<const T: SomeType> Trait for OtherType<{T}> blocks?

It seems to me that as soon as const generic impl blocks are allowed to match on their const generic parameters, the compatibility hazards which you are describing already exist, and that this proposal does not create new ones. But I might be missing new avenues for incompatibility!


EDIT: It should perhaps be noted that this proposal only targets const parameters and should not be generalized to type parameters until the language introduces first-class support for defining a closed set of types (basically by providing first-class support for the sealed trait pattern).

The reason is that while adding more variants to an exhaustive enum is already (begrudgingly) accepted to be a compatibility-breaking library change, adding a new implementation of a non-sealed trait is not, and should not become, a compatibility-breaking change.

Maybe this is what you had in mind?

For another use case, I’m going to quote my own comment from a recent thread about tuples:

However, to be able to recurse on the Args parameter in a generic context would require some way to express disjunction, as I mentioned before. You need to be able to write separate impls of your trait for non-empty tuples and for (), and then somehow convince the compiler that your trait is necessarily impl’ed by Args given that Args: Tuple.

One potential approach could be based on associated constants. Something like:

enum TupleKind {
    Empty,
    NonEmpty
}
trait Tuple {
    const KIND: TupleKind;
}
impl<T> Foo for T where T: Tuple<KIND=TupleKind::Empty> { ... }
impl<T> Foo for T where T: Tuple<KIND=TupleKind::NonEmpty> { ... }

The compiler would have to add support for referencing associated constants with Trait<Foo=Bar> syntax, not just associated types, and additionally be able to tell that the impls together cover all possible variants of TupleKind.

3 Likes

@withoutboats @comex I tried to add some early discussions of the points that you raised in a “future extensions” section of the opening post, please cross-check it and comment further on the parts of your argument which this does not yet address.

An alternative could be to solve this using specialization. You’d need to write a generic impl, and then in this particular case, specialize it for the two values of bool:

struct Bool<const B: bool>;
trait Trait {}
default impl<const B: bool> Trait for Bool<B> {}
impl Trait for Bool<{true }> {}
impl Trait for Bool<{false}> {}
3 Likes

@gnzlbg Ah yes, I saw this proposal before but forgot to add it to the OP. Added an alternatives section that discusses it.

Compared to exhaustive disjoint impls, this specialization-based approach…

  • Does not break when the type of the const generic is extended (e.g. by adding an enum variant), which can be either a forward compatibility improvement or an avenue for silent codebase breakage without compiler errors. It depends on if a sensible default impl exists for the target use case.
  • Forces implementation of a default case even when it is unnecessary, which is a strict drawback.
1 Like

Note that this is a “feature”/“issue” of specialization in general, not something specific to const generics.

2 Likes

It also occurs to me that this problem would be more elegantly resolved if control flow like if or match based on const values were allowed in item or type position.

Something like this:

match SOME_ENUM_CONST => {
    Enum::VariantA => fn foo() { /* does something */ },
    Enum::VariantB => fn foo() { /* does something else */ }
}

fn bar() -> if INTEGER_CONST == 42 { bool } else { i32 } {
    /* I do not want to imagine the horror lying in there */
}

This would have many implications in other areas, such as reducing the need for generics-based hacks when doing type-level reasoning, and eliminating many use cases for the unpleasant #[cfg()] syntax. But I think there are also very strong arguments against it:

  • It seems highly incompatible with the way the compiler currently works (where AFAIK const evaluation happens quite late, and informations like fn signatures are needed much earlier).
  • It opens the floodgates for very, very messy codebases, where merely changing the value of a const can have arbitrary effects. As a matter of fact, a library can break its clients just by changing the value of a public const in this design.

Generally speaking, this seems like a much larger language feature with many edge cases and complex design tradeoffs, and even if I would see merit in such an RFC, I don’t think this is the right year to propose it and I wouldn’t want to be the one championing it.

Bool<{true}> ≠ Bool<{false}>, so this does not typecheck, does it?

I’d imagine something more like this:

impl<const B: bool> Negate for Bool<{B}> {
    type Negation = Bool<{// !B
        match B {
            | true => false,
            | false => true,
        }
    }>;
}
2 Likes

Somehow I’m reluctant to think that something like this could ever become possible.

What I mean is, I could picture exhaustiveness-checking being added, so that the compiler recognizes that Bool<_> implements the trait. But I would think you’d only be able to use associated methods and consts, not associated types.

It’s just that, whenever I see reasoning about monomorphized types being performed at the level of type-signatures, alarms go off in my head. I can’t quite put my finger on a specific example, but it feels like there are certain patterns this would enable—patterns that I once missed dearly from C++‌—that I later came to understand are fundamentally incompatible with Rust due to certain properties of the language we value, like errors at the definition site, and parametricity.

For example, consider how you can’t name associated types that can be overridden by specialization.

Okay, let’s try this:

trait ItsClone {
    type Assoc: Clone;
}

impl<T: Clone, F, const B: bool> ItsClone for Bool<{B}>
{
    type Assoc = <Self as Select<T, F>>::Out;
}

This impl is incorrect, because when B = false, Assoc = F does not impl Clone.

But when should the compiler check this? Surely we should we get this error upfront at the definition as we always do… but is it really a good idea to be having the compiler exhaustively check that Assoc fulfills the trait bound for every possible substitution? I think there could easily be trait impls that have millions of cases to check even though only one or two of them will ever be used.

1 Like

It should only have to check it for every relevant impl,instead of every possible value of the const-generic parameter.If a trait is specialized such that most values are in the default impl,it will have to check very few impls.

As someone already pointed out somewhere, this is already possible on nightly using specialization:

pub trait Select<T, F> { type Out; }
impl<T, F, const B: bool> Select<T, F> for Bool<{B}> { default type Out = T; }
impl<T, F> Select<T, F> for Bool<{false}> { type Out = F; }

In this respect, this proposal only improves upon the specialization-based solution by allowing exhaustiveness checking and eliminating the need for nonsensical defaults. And probably being easier to stabilize than specialization too.

In your example…

trait ItsClone {
    type Assoc: Clone;
}

impl<T: Clone, F, const B: bool> ItsClone for Bool<{B}>
{
    type Assoc = <Self as Select<T, F>>::Out;
}

…if we take the definition of Select above, then the compiler should refuse this definition of ItsClone because Select<T, F>::Out doesn’t have a Clone bound. If such a bound were added to Select, then the compiler would reject use of Select<T, F> in the impl block instead because F does not have a Clone bound. This seems reasonably straightforward to me.

(Actually, the current compiler will die earlier than that because it only accepts certain usages of generic parameters in impl blocks and this isn’t one. But as far as I know, that is a limitation of the current implementation, not a fundamental language design issue.)

4 Likes

Hmm. Interesting. Last time I played around with specialization it was impossible to mention associated types that could refer to a default impl. Playing around with them again I see that it is now possible to do so, so long as code using the associated type only makes use of the trait bounds provided on the associated type in the trait definition:

(that is to say, Assoc: std::fmt::Debug is required in the following code, because when the compiler type-checks main, it will treat <i32 as Trait>::Assoc and <u32 as Trait>::Assoc as completely opaque types. Thanks to this limitation, it would appear that parametricity is preserved, and there continues to be no reasoning about the monomorphized type until codegen)

#![feature(specialization)]

pub trait Trait {
    type Assoc: Default + std::fmt::Debug;
    
    fn make() -> Self::Assoc;
}

impl<A: Default> Trait for A {
    default type Assoc = A;
    
    default fn make() -> <A as Trait>::Assoc { Default::default() }
}

impl Trait for u32 {
    type Assoc = Vec<u32>;
    
    fn make() -> Vec<u32> { Default::default() }
}

fn makeit<T: Trait>(_: T) -> T::Assoc { T::make() }

fn main() {
    println!("{:?}", makeit(4i32));
    println!("{:?}", makeit(4u32));
}

So it would make sense for the compiler to allow <Bool{B} as Select<T, F>>::Out to be named, so long as it is treated as a completely opaque type.

In particular, ItsClone could not exist unless a Clone bound was added to type Out in the Select trait. Edit: Ah, indeed you acknowledged this.

2 Likes

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.