[Pre-RFC] Scoped `impl Trait for Type`

Tamschi · November 26, 2023, 4:59am

Feature Name: scoped_impl_trait_for_type

[…]

Summary

This proposal adds scoped impl Trait for Type items into the core language, as a coherent but orphan-rule-free alternative to implementing traits globally. It also extends the syntax of use-declarations to allow importing these scoped implementations into other item scopes (including other crates), and differentiates type identity of most generics by which scoped trait implementations are available to each discretised generic type parameter (also adding syntax to specify differences to these captured implementation environments directly on generic type arguments).

This (along with some details specified below) enables any crate to

- locally, in item scopes, implement nearly any trait for any expressible type,
- publish these trivially composable implementations to other crates,
- import and use such implementations safely and seamlessly and
- completely ignore this feature when it's not needed*.

* aside from one hopefully very obscure TypeId edge case that's easy to accurately lint for.

This document uses "scoped implementation" and "scoped impl Trait for Type" interchangeably. As such, the former should always be interpreted to mean the latter below.

Motivation

While orphan rules regarding trait implementations are necessary to allow crates to add features freely without fear of breaking dependent crates, they limit the composability of third party types and traits, especially in the context of derive macros.

For example, while many crates support serde::{Deserialize, Serialize} directly, implementations of the similarly-derived bevy_reflect::{FromReflect, Reflect} traits are less common. Sometimes, a Debug, Clone or (maybe only contextually sensible) Default implementation for a field is missing to derive those traits. While crates like Serde often do provide ways to supply custom implementations for fields, this usually has to be restated on each such field. Additionally, the syntax for doing so tends to differ between derive macro crates.

Wrapper types, commonly used as workaround, add clutter to call sites or field types, and introduce mental overhead for developers as they have to manage distinct types without associated state transitions to work around the issues laid out in this section. They also require a distinct implementation for each combination of traits and lack discoverability through tools like rust-analyzer.
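
As a minimal sketch of the newtype workaround described above, in today's Rust: std::time::Instant has no Default implementation, so deriving Default on a struct that contains one needs a wrapper. The names StartedAt, Job and elapsed_secs are made up for this illustration and do not come from the RFC.

```rust
use std::time::Instant;

/// Hypothetical newtype around the foreign `Instant`, which has no `Default`.
pub struct StartedAt(pub Instant);

impl Default for StartedAt {
    fn default() -> Self {
        // A contextually sensible default: "started right now".
        StartedAt(Instant::now())
    }
}

/// Deriving `Default` only works because the field uses the wrapper type.
#[derive(Default)]
pub struct Job {
    pub name: String,
    pub started_at: StartedAt,
}

/// Every use site has to unwrap the newtype with `.0` again.
pub fn elapsed_secs(job: &Job) -> u64 {
    job.started_at.0.elapsed().as_secs()
}
```

The wrapper appears in the field type and at each access, which is the kind of clutter this paragraph refers to.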

Another pain point is the occasional lack of Into<>-conversions when propagating errors with ?, even though one external residual (payload) type may (sometimes contextually) be cleanly convertible into another. As-is, this usually requires a custom intermediary type, or explicit conversion using .map_err(|e| …) (or an equivalent function/extension trait). If an appropriate From<>-conversion can be provided in scope, then just ? can be used.
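
A small sketch of that situation in today's Rust, using only std types: both std::num::ParseIntError and std::io::Error are foreign, there is no From conversion between them, and the orphan rule forbids adding one locally, so the conversion has to be written out at the call site. The function name parse_port is an assumption made for this example.

```rust
use std::io;

/// Parses a port number, reporting failures as `io::Error`.
pub fn parse_port(text: &str) -> io::Result<u16> {
    // A bare `?` would not compile here: `io::Error` does not implement
    // `From<ParseIntError>`, so the error is converted explicitly.
    let port = text
        .trim()
        .parse::<u16>()
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
    Ok(port)
}
```

With a From<ParseIntError> implementation for io::Error available in scope, the .map_err(…) call could be dropped and a bare ? would be enough.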

This RFC aims to address these pain points by creating a new path of least resistance that is easy to use and very easy to teach, intuitive to existing Rust-developers, readable without prior specific knowledge, discoverable as needed, has opportunity for rich tooling support in e.g. rust-analyzer and helpful error messages, is quasi-perfectly composable including decent re-use of composition, improves maintainability and (slightly) robustness to major-version dependency changes compared to newtype wrappers, and does not restrict crate API evolution, compromise existing coherence rules or interfere with future developments like specialisation. Additionally, it allows the implementation of more expressive (but no less explicit) extension APIs using syntax traits like in the PartialEq<>-example below, without complications should these traits be later implemented in the type-defining crate.

For realistic examples of the difference this makes, please check the rationale-and-alternatives section.

[…]

Rendered on GitHub: 3634-scoped-impl-trait-for-type.md
Use the rightmost icon in the file header to show the table of contents. The current section highlight doesn't sync automatically for me there though, unfortunately.

Change History: Commits · Tamschi/rust-rfcs · GitHub

Apologies in advance that it's a bit long. I had the initial idea earlier this year but only saw that the 2024 lang roadmap asks for ideas on coherence about two weeks ago and decided to give it a go. Then it turned out more complicated than I thought and kind of escalated.

~~There are a few rough parts where I think I need help from someone more familiar with how implementations are currently bound to call sites and on generic type parameters.~~ If something doesn't add up regarding the feature implementation then that's likely because I never really looked at the compiler code so far, so please read those bits in terms of 'behaves as if' (and please tell me about those issues so I can fix them).

In any case, I hope there's nothing blatantly obvious that I missed, and thanks for your consideration.

Shout-out

to teliosdev for a piece of very helpful early syntax criticism and
to cofinite for pointing out how this can be used for sugary API extensions with syntax traits.
to tfpdev, and to @SkiFire13 below, for suggestions on how to make the draft more approachable and easier to understand.

Tamschi · November 26, 2023, 5:23am

I'd forgotten to specify layout-compatibility, so I just added a small section in that regard.

The forum also helpfully pointed out two earlier threads, Implementing traits and types for multiple foreign crates (pre-RFC) and Pre-RFC: pub(crate) impl Trait for Type, which might be good to link here for reference. I'm pretty sure my draft avoids the concerns with the former, and it does support the latter's use case as the scoped implementations follow item privacy rules.

SkiFire13 · November 26, 2023, 9:38am

I find the rules for when a scoped impl is used or not, and for when a type splits identity, pretty confusing. Consider for example this situation:

crate_a:

pub struct A(pub u32);

impl A {
    pub fn new_set() -> std::collections::HashSet<Self> {
        std::collections::HashSet::new()
    }
}

crate_b:

use impl Hash for crate_a::A {
    // ...
}

fn b() {
    let mut set = crate_a::A::new_set();
    set.insert(crate_a::A(42)); // Is this allowed?
}

Is crate_b allowed to insert in the HashSet by using a scoped impl of Hash for crate_a::A? Now, this may seem silly with a HashSet (why even provide a new_set?) but I can imagine some library where HashSet is replaced with a wrapper that actually does something even when crate_a::A doesn't implement the corresponding trait to Hash.

I can imagine two outcomes:

insert doesn't compile
- but if you have to assume any value returned by a third party crate may use its own use impl internally then use impl becomes equivalent to a new-type, losing all the other benefits.
A::new_set() actually becomes <A: Hash in crate_b>::new_set() and the insert compiles
- but this has the following problem. Consider this update to crate_a:

use std::collections::HashSet;
use std::hash::Hash;

pub struct A(pub u32);

// New in A's update!
// Is this now considered a breaking change?
impl Hash for A {
    // ...
}

impl A {
    pub fn new_set() -> HashSet<Self> {
        let mut set = HashSet::new();
        // Which impl does this use?
        set.insert(A(0));
        set
    }
}

Now crate_a provides its own implementation of Hash for A. This is not considered a breaking change, so it shouldn't break crate_b, right? But it also calls set.insert inside A::new_set(), meaning new_set now needs an implementation of Hash for A. Which one does it use? It may rely on the implementation details of Hash (remember, it could be some other trait), so it can't use the use impl in crate_b. Hence it must use its own Hash implementation in crate_a, meaning new_set now returns a HashSet<A: Hash in crate_a> and crate_b's insert either silently changes meaning (to use crate_a's Hash implementation) or it stops compiling.

To me any outcome seems undesirable. Or maybe I misinterpreted something?

Tamschi · November 26, 2023, 1:37pm

This is addressed by making implementations part of type identity of generics like in Genus, though they are captured implicitly here.

It's briefly mentioned in the summary and covered by the sections Scoped implementations and generics, Resolution at type instantiation site and (especially) Type identity. It is the main distinction from 𝒢. 'Collection use cases' (Ctrl+F) are mentioned in Rationale and alternatives, and it's also brought up as an example use case in Explicit binding. A HashSet-like example also comes up in Trait binding site.

It's true that this is fairly spread out and could be much more visible in how it relates to containers specifically, though. I'll go over your questions one by one:

Re "Is this allowed?": No, at least if there's no Hash implementation captured in crate_a at all. The available implementations for the HashSet<>'s type parameters are captured only where the type is discretised (here in the return type of A::new_set), so the example you give does not compile.

The relevant error is

error[E0599]: no method named `insert` found for struct `std::collections::HashSet<crate_a::A>` in the current scope

Re "Is this now considered a breaking change?": No, adding a global implementation is not a breaking change (but would make the example compile and produce an 'unused' warning on the use impl Hash for crate_a::A in crate_b instead).

However, if crate_a switched to its own scoped implementation, then that would change the type identity of the set instance from HashSet<A> to HashSet<A: Hash in crate_a> which are not mutually assignable. That would be a breaking change.

Note that it could also be considered a breaking change to switch from its own scoped implementation to a re-exported external one, as that would create type-identity with other HashSets that use that scoped implementation, which, while it wouldn't break code statically, would change TypeId comparisons.
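
As a rough analogy in today's Rust (not the proposed feature itself), a newtype can stand in for "A with crate_a's scoped Hash". The names AHashedByCrateA and same_type are hypothetical. The sketch only shows that generic instantiations over different parameter types already have distinct TypeIds, which is the kind of distinction the proposal would extend to captured implementations.

```rust
use std::any::TypeId;
use std::collections::HashSet;

/// Stand-in for `crate_a::A` with only global implementations available.
pub struct A(pub u32);

/// Hypothetical newtype playing the role of `A: Hash in crate_a`.
pub struct AHashedByCrateA(pub A);

/// Returns whether `T` and `U` are the same type as far as `TypeId` can tell.
pub fn same_type<T: 'static, U: 'static>() -> bool {
    TypeId::of::<T>() == TypeId::of::<U>()
}

/// The two set types are distinct, much like `HashSet<A>` and
/// `HashSet<A: Hash in crate_a>` would be under the proposal.
pub fn sets_are_distinct() -> bool {
    !same_type::<HashSet<A>, HashSet<AHashedByCrateA>>()
}
```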

Re "Which impl does this use?": This would infer the type parameter to be A according to the return type of the function, since the type parameters are not specified in HashSet::new(), and as such would use the global implementation crate_a provides.

If a scoped implementation was in scope for the return type definition instead, then the type parameter would be inferred to be A: Hash in crate_a and that implementation would be used there and in crate_b. This applies even if the scoped implementation is not pub directly.

All that said, this is a way for crate_a to make breaking changes accidentally, for example if it also uses a scoped implementation of e.g. bevy_reflect::Reflect for A in that scope to derive Reflect for another type where it appears as field.

It would be a good idea to have a warning if a captured scoped implementation (local or imported) 'leaks' through a discretised generic, i.e. when the implementation is not made at least equally visible by crate_a itself.

Edit: I just set a bookmark on this post for tomorrow as a reminder to add the warning to the draft. (Today I'm taking a break after publishing this.)

Tamschi · November 26, 2023, 2:26pm

Regarding this specifically: The type identity distinction only applies to generics since only they may expect this kind of consistency on their type parameters.

Top-level bindings like Hash on an A value are entirely transient, so any discrete A is assignable to and from any other A regardless of the scoped implementations available. I think this is also how Genus handles it, but it seems to not be made explicit there. (Genus does not appear to have syntax for specifying a model to use outside of generic type parameters, but I could have missed that.)

mathstuf · November 26, 2023, 2:45pm

HashSet::new doesn't require any trait bounds, so how is the error meant to be surfaced? Or is it going to look at every possible impl that might exist based on A's missing impl Trait bits and error? If A is not Eq, does an error get raised about HashSet::reserve or HashSet::difference being inaccessible even if no one cares?

Tamschi · November 26, 2023, 2:53pm

It's surfaced in crate_b where the implementation is actually required. crate_a compiles regardless of whether any Hash implementation is captured into the discrete HashSet<A> type. crate_b cannot provide an implementation for the externally-discretised type parameter.
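
The same split already exists in today's Rust and can be sketched with std alone: constructing and returning a HashSet<A> needs no bounds on A, while insert is only available once the Hash and Eq bounds are satisfied. The type A here is a local stand-in for crate_a::A.

```rust
use std::collections::HashSet;

/// Stand-in for `crate_a::A`: deliberately implements neither `Hash` nor `Eq`.
pub struct A(pub u32);

/// Compiles without any trait implementations on `A`, because
/// `HashSet::new` places no bounds on the element type.
pub fn new_set() -> HashSet<A> {
    HashSet::new()
}

// By contrast, uncommenting the following function is rejected by the
// compiler, because `insert` requires `A: Eq + Hash`:
//
// pub fn fill(set: &mut HashSet<A>) {
//     set.insert(A(42));
// }
```

The error therefore appears at the first call that actually needs the missing implementation, not where the set type is written down.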

The type identity of the set instance is based on the set of all scoped implementations that exist on A where the discrete HashSet<A> type comes from, regardless of whether each is actually used or required by bounds. Here, that's in the return type declaration of A::new_set where the type parameter is written out explicitly, but it could be fully inferred in some cases (that I don't think can have any effect across crate boundaries, though, aside from the existing auto-trait leakage through return position impl Trait).

Edit: This likely seems a bit overly strict at first glance, but it's the only way (I think?) to ensure both locality of the type identities of generics and consistency of the implementations each of their instances uses. Relaxing a generic type parameter bound is not a breaking change due to coherence rules for blanket implementations, so bounds shouldn't influence the type identity here either (in case they are present on the type itself) unless we accept that types may accidentally become identical at any point.
Edit2: That could then lead to conflicts of scoped implementations in dependent crates though, aside from TypeId-related behaviour changes, so I don't think that would be a good idea.

Tamschi · November 26, 2023, 3:47pm

I just had an idea for how to use-declare the global implementation that I'd like feedback on.

Since global implementations exist regardless of crate boundaries, I think they could be imported from the root namespace:

use ::{impl Trait for Type};

This can be useful to 'reset'/shadow scoped implementations in a nested scope and is already supported by the grammar in the draft. Note that the global implementation must actually exist for this to compile.

With explicit binding, this would be

Generic<Type: Trait in ::>

, which would have to be special-cased since :: alone isn't a valid SimplePath.

Tamschi · November 26, 2023, 8:17pm

I did come up with a realistic API where there would be unexpected or outright faulty behaviour:

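use std::any::TypeId;
use std::collections::hash_map::RandomState;
use std::collections::HashMap;
use std::hash::{BuildHasher, Hash};

// `Storage<S>` is not defined in this sketch.
pub struct ErasedHashSet<S: BuildHasher = RandomState> {
    storage: HashMap<TypeId, Storage<S>, S>,
}

impl<S: BuildHasher> ErasedHashSet<S> {
    pub fn insert<T: Hash + Eq + 'static>(&mut self, value: T) -> Result<(), T> {
        // Look up the per-type storage by the `TypeId` observed for `T`.
        let storage = self.storage.get_mut(&TypeId::of::<T>());
        todo!("Manipulate `storage`.")
    }

    // ...
}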

I didn't specify this properly so far, so there are two useful options here:

0. (not useful) Generic type parameters of functions capture the full set of available scoped implementations. This would make TypeId::of::<_> behave unexpectedly for discrete types, so I'm not considering it here.
1. Generic type parameters of functions ignore the available scoped implementations at the call site completely. This would cause this API ErasedHashSet to have no way at all to distinguish Hash implementations properly (except heuristically by function pointer, but that's not a solution as there are no equality/inequality guarantees on those), so if RandomState is used, the set can misbehave randomly.
2. Generic type parameters of functions have identity that's distinguished by available scoped implementations for their bounds only. This distinction carries over when the generic type parameter is used elsewhere, like when calling TypeId::of::<T> with the generic type parameter here.

I think the last option is by far the best one here because, while the behaviour may be unexpected in some cases, the unexpected behaviour would at least be deterministic (separating distinct Hash implementations cleanly).

(This wouldn't limit reflexive implementations since on those, the type parameter is on the implementation rather than the function.)

Option 2 is more complex to explain, but I think that's worth it for getting okay-ish behaviour here.
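
To make the dependence on TypeId concrete, here is a simplified, runnable variant of such a container in today's Rust. It is only a sketch under assumptions and differs from the example above: it has no BuildHasher parameter, uses Box<dyn Any> holding a HashSet<T> in place of Storage<S>, and returns bool from insert. The names ErasedSet and contains are made up.

```rust
use std::any::{Any, TypeId};
use std::collections::{HashMap, HashSet};
use std::hash::Hash;

/// Simplified type-erased set: one `HashSet<T>` per element type,
/// selected purely by the `TypeId` observed for `T`.
#[derive(Default)]
pub struct ErasedSet {
    storage: HashMap<TypeId, Box<dyn Any>>,
}

impl ErasedSet {
    /// Inserts `value`, returning `true` if it was not present before.
    pub fn insert<T: Hash + Eq + 'static>(&mut self, value: T) -> bool {
        self.storage
            .entry(TypeId::of::<T>())
            .or_insert_with(|| Box::new(HashSet::<T>::new()) as Box<dyn Any>)
            .downcast_mut::<HashSet<T>>()
            .expect("entry is keyed by TypeId::of::<T>()")
            .insert(value)
    }

    /// Returns whether an equal value of type `T` has been inserted.
    pub fn contains<T: Hash + Eq + 'static>(&self, value: &T) -> bool {
        self.storage
            .get(&TypeId::of::<T>())
            .and_then(|set| set.downcast_ref::<HashSet<T>>())
            .map_or(false, |set| set.contains(value))
    }
}
```

Because the inner set is found by TypeId alone, two callers that saw the same TypeId for T but different Hash implementations would share one HashSet<T> and hash its contents inconsistently. That is the failure mode the options above try to rule out.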

Edit: Closures and function pointers must use the same rules as generic structs to avoid misbehaviour, I think, so this decision here is only about the generic functions' actual types that I think currently cannot be referred to in at least stable Rust.

Late edit: I found a better way to define the behaviour of 2. that doesn't require special-casing functions and should be less confusing. I'll write that out and update the draft hopefully later today or tomorrow.

Minor edit: There should be an Eq bound on that function, too.
~~Minor edit: And that second S should be on Storage. Sorry, I'm not great at this without the compiler helping me.~~
2 days later edit: Clearly I really shouldn't program freehand. S still goes onto the HashMap too. I have a proper version of this with more comments in my revised draft, which should be ready to publish soon-ish depending on how much I can work on it.

scottmcm · November 29, 2023, 3:31am

One vague thing I was wondering here: What if we gave names to the non-global impls?

When I see things like

use other_module::{
    impl Trait for Type,
    impl UnsafeTrait for Type,
};

I feel like they wanted names, and with names you could even define two impls -- say a case-sensitive and case-insensitive Eq -- in the same module.

I'm also inspired here by how much better the NLL errors got when they started doing "let's call this lifetime '3" to be able to talk about things more easily. If we're going to introduce errors about "well yes, they're both HashSet<Foo>, but they're different types because they're using different impls of some trait", I think it would be nice to give normal paths to named things in those errors, rather than "the impl from this module" or what have you.

(Sorry if this is already addressed, I haven't dug deep into the details here yet. Thanks for putting a bunch of work into it! The noise in doing this today is a huge complaint and I'm excited to see all this progress.)

Nadrieril · November 29, 2023, 3:43am

That'd be very neat, and would solve the verbosity of importing a generic impl like

use module::{
    impl<T: Trait1> Trait2<T> for Box<[T]>,
};

I'm looking forward to the syntax bikeshedding on how to name an impl :D.

let CaseInsensitiveStr = impl PartialEq<&str> for &str { ... }
impl CaseInsensitiveStr: PartialEq<&str> for &str { ... }
impl (CaseInsensitiveStr) PartialEq<&str> for &str { ... }
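
For comparison, this is roughly how the case-insensitive comparison has to be expressed in today's Rust, where the alternative behaviour gets its name through a newtype instead of a named impl. CaseInsensitive and same_header are hypothetical names chosen for this sketch.

```rust
/// Hypothetical newtype that compares string slices ASCII-case-insensitively.
#[derive(Debug, Clone, Copy)]
pub struct CaseInsensitive<'a>(pub &'a str);

impl PartialEq for CaseInsensitive<'_> {
    fn eq(&self, other: &Self) -> bool {
        self.0.eq_ignore_ascii_case(other.0)
    }
}

impl Eq for CaseInsensitive<'_> {}

/// Call sites have to wrap both operands to opt into the alternative `Eq`.
pub fn same_header(a: &str, b: &str) -> bool {
    CaseInsensitive(a) == CaseInsensitive(b)
}
```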

scottmcm · November 29, 2023, 3:51am

Actually, one other request: could you add an overview somewhere for "here's a sketch of how this addresses the big problems that coherence exists to avoid"? (If there already is one and I missed it, sorry.)

My immediate questions here are things like SkiFire brought up: What's the basic way I should think about how this knows that the PartialEq+Eq+Hash in effect are important for HashSet<T>'s insert and similar methods, but that the T: Debug that's in effect doesn't matter, so I can debug-print a &HashSet<Foo> from someone else even if I have a different Foo: Debug from them?

I assume the proposed rules work together in concert to address things like that, but I'd like the "here's a rough overview" version to have in my head as I go through the details.

Tamschi · November 29, 2023, 5:52am

I admit that I did it on a hunch initially, but making them anonymous ended up having two very useful properties we would lose by naming them.

First, as scoped implementations are resolved implicitly (to avoid the noise-level newtypes suffer from), they must not overlap in the same scope (e.g. exporting module).

If they are anonymous like this, it's easy and intuitive to ensure this with the existing coherence rules for global implementations, just applied to one scope at a time. It also makes it clear 'what' is applied 'to what' every time they are in scope, and the uniqueness means that the module can be used as name in error messages (which would have to mention the trait and type separately anyway, in my opinion).

Second, the explicit subsetting in the imports is required for coherence. If you import an implementation by a name instead of subsetting it, then it is a breaking change for the implementing crate to broaden that implementation. I would like to avoid this since broadening an implementation is non-breaking under existing coherence rules (because the rules for blanket implementations prevent those that may potentially overlap in the future that way).

That said, I'm very open to syntax suggestions. Acquaintances have told me that the use module::{impl Trait for Type}; syntax is intuitive, but that the use impl Trait for Type { /*...*/ } in definitions feels odd. use is just the closest existing keyword that felt somewhat right to me there, though it does very subjectively rhyme nicely with the imports.

I didn't want to dig up override since that feels… very… dynamic dispatch-y, but that exists too.

I actually missed something in that regard completely in my first draft (see this earlier post), so some of that is upcoming. That said, two HashSet<T>s that were created with distinct T: Debug implementations are incompatible (but nicely mixable in the same scope!) types under this proposal. I don't think that there's any way around that really, without having the compiler do thorough data flow analysis or introducing a large number of coherence footguns.

That said, the compiler should be able to warn you if you're about to run into this issue to a large extent, since it can immediately tell when a non-public implementation is captured in a public alias or signature. Using scoped implementations within functions is also less tricky, since you can be sure it won't show up in public items.

As for type-erased value-keyed collections, those will be able to make that distinction based on the bounds of their type parameters to show expected behaviour, but this additionally requires permission to transmute specific newly-distinct types as long as this isn't observed by implementation bounds. (I have a proper writeup about how this works in my draft and should have that published soon. There's a tradeoff of this that needs an impact analysis.)

Overall I also added a few more examples and rewrote much of the section(s) on type identity to be clearer.
I still want to revise one more section tomorrow (and then need to go through everything one more time to make sure it's consistent and actually compiles/would compile), but in my eyes it's already a good incremental improvement and I should be able to publish the update fairly soon.

Oh also, and this is unrelated to your posts, but if a specific section or paragraph is hard to understand in this then please tell me! I'm not a native English speaker and being very precise in this language can be tricky for me at times. Someone pointed out elsewhere that I'm using jargon to that end which may not be all that accessible.

Tamschi · December 3, 2023, 1:24am

And updated, finally. Changes:

Defined rules for the observed TypeId of generic type parameters' opaque types, rewrote sections on type identity and layout-compatibility
Added section on sealed traits
Corrected the use of 'blanket implementation' vs. 'generic implementation' vs. 'implementation on a generic type'
Sketched two more warnings and one more error
Added a section Behaviour change/Warning: TypeId of generic discretised using generic type parameters
Removed the Trait binding site section but kept Coercion to trait objects and slightly expanded it
Added section Unexpected behaviour of TypeId::of::<Self>() in implementations on generics in the consumer-side presence of scoped implementations and transmute
Near-completely rewrote the Rationale and alternatives section with subheadings and before/after-style examples, added more positive effects of the feature
Rewrote Alternatives section
Removed some Unresolved questions that are now tentatively resolved
Added top-level syntax and a field example to Explicit binding, elaborated a bit more
Added Future possibilities:
- Conversions where a generic only cares about specific bounds' consistency
- Scoped bounds as contextual alternative to sealed traits
- Glue crate suggestions
Various small fixes, adjustments and clarifications

The draft is unfortunately too long for HackMD now, but overall it should be much easier to read through since it is much more structured, especially under the Rationale and alternatives heading. I also added several additional cross-links to make navigation to related subsections easier.

I rewrote these sections completely to make them easier to find and hopefully more clear. You can now find this information in one go starting at Generic type parameters capture scoped implementations.

The examples in the (renamed) section [Type identity of generic types] come with some additional explanation in this regard now, though I'm not sure it's clear enough as the example is still abstract.

These warnings are now described back-to-back in Scoped implementation is less visible than item/field it is captured in and Imported implementation is less visible than item/field it is captured in.

This is now integrated into the main text.

The section TypeId of generic type parameters' opaque types contains the full example with comments and explanation. This is resolved now in a way that should have the least unexpected behaviour possible in as many situations as possible while maintaining future compatibility of existing code as much as possible.

(The problematic cases can be surveyed automatically and should not result in unsoundness except for (hopefully) very unusual implementations.)

There finally is one now

You can find consumer-side before/after comparisons in the Avoid newtypes' pain points section. The initial glue code doesn't change too much with this proposal – essentially you can just skip the newtype struct definition itself and a few conversions there if this goes through.

Sorry for how long this took. I ended up rewriting a bunch of stuff in addition to the changes needed to work with the TypeId behaviour wrt. transient generics that I hadn't considered in the first version of the draft. Found a bunch more positive side-effects of solving these issues (vaguely) this way.

Random thought: Should glue code crates in Rust be called "grease crates"?
They wouldn't be all that rigid and 'sticky' anymore if something resembling this draft ends up accepted.

dlight · December 3, 2023, 4:13am

How does this solve the hashtable problem?

That is, we have a generic type A<T: Something> and we want to instantiate A<i32> but i32 doesn't impl Something. We use scoped impl Trait to impl Something for i32 two times in two different modules. Are we allowed to send an A<i32> created in one such module to the other module?

(usually A here is something like HashSet and i32 is the key of the hashset; and Something is the Hash trait; and we must guarantee that for a given concrete hash table, the key is always hashed the same; that is, the same impl will always be selected)

My gut feeling is that no, A<i32> from one module should be a different type from A<i32> from another module, because the type parameter from A requires a Something bound and the two places that instantiate A<i32> have different impls.

This is effectively what happens when two crates depend on the same third crate, but with different versions (and as such they try to share the "same" type, but they are in fact different types, and you get confusing type errors as a result).

However, doing this may be impossible, because the A<i32> type may appear as part of a generic instantiation done in a third crate.

Tamschi · December 3, 2023, 4:38am

You're correct in that they are different types, but you can send and use these instances across modules with this draft.

As long as you don't need to name the type explicitly, there is no friction at all with this proposal. It'll just work according to the behaviour of where the type was defined (so you can actually create an A according to the other module's specification if the generic type parameter is inferred).

If you do need to name the type in the other crate, you'll want to export a discretised type alias like pub type ThisA = A<i32>;. That takes a 'snapshot' of the available implementations on the type parameter, so you can then refer to that configuration by name elsewhere. (You don't necessarily need to fully discretise it to capture anything, but I'm assuming your Something implementation is on i32 specifically here.)

The section that explains this is Type identity of generic types, which has an example that goes over all the scenarios you mention, including which types are identical and which are convertible with this RFC alone. (Conversions where a generic only cares about specific bounds' consistency points out a possible future mechanism for containers to define the conversion with broader, more accurate bounds.)

Note that these distinct collections can still feed into each other directly through their iterator APIs (as long as the type captured into the type parameters is identical) even generically, since the items are convertible through the reflexive blanket conversions (because blanket implementations disregard outer captured implementations when they bind on their target/receiver).
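
A loose analogy in today's Rust, assuming the hasher parameter of HashSet stands in for a captured implementation environment: two sets with different BuildHasher types are distinct, non-assignable types, yet one can still feed into the other through its iterator. The aliases RandomSet and FixedSet and the function transfer are made up for this sketch.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::BuildHasherDefault;

/// Set using the default `RandomState` hasher.
pub type RandomSet = HashSet<i32>;

/// Set using a different hashing strategy, which makes it a different type.
pub type FixedSet = HashSet<i32, BuildHasherDefault<DefaultHasher>>;

/// `source` cannot be returned as a `FixedSet` directly, but its items can be
/// moved over one by one, being re-hashed by the target set on insertion.
pub fn transfer(source: RandomSet) -> FixedSet {
    source.into_iter().collect()
}
```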

Edit: Someone pointed out that it would help to name this concept, so: Generic type parameters capture the implementation environment of where they are specified. This is carried across module boundaries using type inference and explicit type aliases.

I'll edit this into the draft later, to act as breadcrumbs/ctrl+f anchor for related ideas. I'm a bit too exhausted to do it today though, so I'll bookmark this post and update it when the edit is public.

SkiFire13 · December 3, 2023, 10:03am

This is a bit more descriptive, but IMO it's still pretty hard to understand without context. The example after that uses only type aliases, which is something but I believe most of the time people will deal with calling functions/methods and it's unclear how that example translates. What about an example with:

module a:
- a type A without a Hash implementation
- a type alias HashSetA to HashSet<A>
- a function taking HashSetA
- a function taking HashSet<A>
- a function taking HashSet<T> for a generic T
module b:
- a type B with a scoped Hash implementation
- the same type alias/functions as in a, except for type B
module c:
- a type C with a non-scoped Hash implementation
- the same type alias/functions as in a, except for type C
module d:
- define a scoped implementation of Hash for A, B and C
- try to call the functions in modules a, b and c and describe what happens.

dlight · December 4, 2023, 6:06pm

But what if I instantiate the same generic type two times in different implementation environments? Then they will not be compatible, right?

Tamschi · December 4, 2023, 10:00pm

Yes, iff those environments differ for the specified type parameters.

Implementations on unrelated types and on the outer generic itself don't matter.

Tamschi · December 5, 2023, 12:36am

I just pushed an update to the draft that now includes this example here. Overall changes:

Added a list of bullet points to the Summary and revised it slightly
Coined the term implementation environment to refer to the set of all implementations applicable (to a given type) in a given place
Near-completely rewrote the Logical consistency subsection to add subheadings and examples
Small fixes and adjustments

For convenience, here's that section in full as of right now (with a tiny formatting fix that's already in my draft here):

Logical consistency

Binding external top-level implementations to types is equivalent to using their public API in different ways, so no instance-associated consistency is expected here. Rather, values that are used in the same scope behave consistently with regard to that scope's visible implementations.

of generic collections

Generics are trickier, as their instances often do expect trait implementations on generic type parameters that are consistent between uses but not necessarily declared as bounded on the struct definition itself.

This problem is solved by making the impls available to each type parameter part of the type identity of the discretised host generic, including a difference in TypeId there as with existing monomorphisation.

(See type-parameters-capture-their-implementation-environment and type-identity-of-generic-types in the reference-level-explanation above for more detailed information.)
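As a rough analogy in current Rust (not the proposed feature itself), `std::collections::HashSet` already makes its hashing strategy part of its type identity through the `S` type parameter. The sketch below uses only existing standard library items; the aliases `RandomSet` and `FixedSet` and the two helper functions are names made up for illustration. Under the proposal, differing captured `Hash` implementations on the item type would presumably separate discretised types in a comparable way.

```rust
use std::any::TypeId;
use std::collections::hash_map::{DefaultHasher, RandomState};
use std::collections::HashSet;
use std::hash::BuildHasherDefault;

// Two discretised forms of the same generic that differ only in how items are hashed.
type RandomSet = HashSet<u32, RandomState>;
type FixedSet = HashSet<u32, BuildHasherDefault<DefaultHasher>>;

/// Returns true if both sets are observed as the same type at runtime.
fn same_type_id() -> bool {
    TypeId::of::<RandomSet>() == TypeId::of::<FixedSet>()
}

/// Only accepts the fixed-hasher form; passing a `RandomSet` is a compile error.
fn takes_fixed(set: &FixedSet) -> usize {
    set.len()
}

fn main() {
    let mut fixed = FixedSet::default();
    fixed.insert(1);
    assert_eq!(takes_fixed(&fixed), 1);

    // The differing hasher parameter also shows up as a differing `TypeId`.
    assert!(!same_type_id());

    // let random = RandomSet::new();
    // takes_fixed(&random); // rejected statically: mismatched types
}
```

The difference from the proposal is that here the distinguishing parameter has to be written out (or defaulted) on the collection, while a captured implementation environment would travel with the item type parameter itself.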

Here is an example of how captured implementation environments safely flow across module boundaries, often seamlessly due to type inference:

pub mod a {
    // ⓐ == ◯

    use std::collections::HashSet;

    #[derive(PartialEq, Eq)]
    pub struct A;

    pub type HashSetA = HashSet<A>;
    pub fn aliased(_: HashSetA) {}
    pub fn discrete(_: HashSet<A>) {}
    pub fn generic<T>(_: HashSet<T>) {}
}

pub mod b {
    // ⓑ

    use std::{
        collections::HashSet,
        hash::{Hash, Hasher},
    };

    #[derive(PartialEq, Eq)]
    pub struct B;
    use impl Hash for B {
        fn hash<H: Hasher>(&self, _state: &mut H) {}
    }

    pub type HashSetB = HashSet<B>; // ⚠
    pub fn aliased(_: HashSetB) {}
    pub fn discrete(_: HashSet<B>) {} // ⚠
    pub fn generic<T>(_: HashSet<T>) {}
}

pub mod c {
    // ⓒ == ◯

    use std::collections::HashSet;

    #[derive(PartialEq, Eq, Hash)]
    pub struct C;

    pub type HashSetC = HashSet<C>;
    pub fn aliased(_: HashSetC) {}
    pub fn discrete(_: HashSet<C>) {}
    pub fn generic<T>(_: HashSet<T>) {}
}

pub mod d {
    // ⓓ

    use std::{
        collections::HashSet,
        hash::{Hash, Hasher},
        iter::once,
    };

    use super::{
        a::{self, A},
        b::{self, B},
        c::{self, C},
    };

    use impl Hash for A {
        fn hash<H: Hasher>(&self, _state: &mut H) {}
    }
    use impl Hash for B {
        fn hash<H: Hasher>(&self, _state: &mut H) {}
    }
    use impl Hash for C {
        fn hash<H: Hasher>(&self, _state: &mut H) {}
    }

    fn call_functions() {
        a::aliased(HashSet::new()); // ⓐ == ◯
        a::discrete(HashSet::new()); // ⓐ == ◯
        a::generic(HashSet::from_iter(once(A))); // ⊙ == ⓓ

        b::aliased(HashSet::from_iter(once(B))); // ⓑ
        b::discrete(HashSet::from_iter(once(B))); // ⓑ
        b::generic(HashSet::from_iter(once(B))); // ⊙ == ⓓ

        c::aliased(HashSet::from_iter(once(C))); // ⓒ == ◯
        c::discrete(HashSet::from_iter(once(C))); // ⓒ == ◯
        c::generic(HashSet::from_iter(once(C))); // ⊙ == ⓓ
    }
}

Note that the lines annotated with // ⚠ produce a warning due to the lower visibility of the scoped implementation in b.

Circles denote implementation environments:


◯	indistinct from global
ⓐ, ⓑ, ⓒ, ⓓ	respectively as in module `a`, `b`, `c`, `d`
⊙	caller-side

The calls infer discrete HashSets with different Hash implementations as follows:

call in `call_functions`	`impl Hash` in	captured in/at	notes
`a::aliased`	-	`type` alias	The implementation cannot be 'inserted' into an already-specified type parameter, even if it is missing.
`a::discrete`	-	`fn` signature	See `a::aliased`.
`a::generic`	`d`	`once<T>` call
`b::aliased`	`b`	`type` alias
`b::discrete`	`b`	`fn` signature
`b::generic`	`d`	`once<T>` call	`b`'s narrow implementation cannot bind to the opaque `T`.
`c::aliased`	`::`	`type` alias	Since the global implementation is visible in `c`.
`c::discrete`	`::`	`fn` signature	See `c::aliased`.
`c::generic`	`d`	`once<T>` call	The narrow global implementation cannot bind to the opaque `T`.
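The "captured in/at" column can be loosely emulated in current Rust by spelling the environment out as an explicit marker type parameter. This is only a sketch under that assumption: `EnvB`, `EnvD`, `Set` and `from_item` are hypothetical names, and the proposal would infer the environment implicitly rather than require it in the type arguments.

```rust
use std::any::TypeId;
use std::marker::PhantomData;

// Hypothetical markers standing in for implementation environments.
pub struct EnvB; // plays the role of ⓑ
pub struct EnvD; // plays the role of ⓓ

pub struct B;

/// Hypothetical set that names the environment captured for `T` explicitly.
pub struct Set<T, Env> {
    items: Vec<T>,
    env: PhantomData<Env>,
}

impl<T, Env> Set<T, Env> {
    pub fn from_item(item: T) -> Self {
        Set { items: vec![item], env: PhantomData }
    }
}

/// Like `b::discrete`: the signature fixes the environment,
/// so inference at the call site has to agree with it.
pub fn discrete(set: Set<B, EnvB>) -> usize {
    set.items.len()
}

/// Like `b::generic`: the environment arrives with the caller's type.
pub fn generic<T, Env: 'static>(_: Set<T, Env>) -> TypeId {
    TypeId::of::<Env>()
}

fn main() {
    // `Env` is inferred as `EnvB` from the function signature.
    assert_eq!(discrete(Set::from_item(B)), 1);

    // The caller decides the environment for the generic function.
    let seen = generic(Set::<B, EnvD>::from_item(B));
    assert_eq!(seen, TypeId::of::<EnvD>());

    // discrete(Set::<B, EnvD>::from_item(B)); // rejected statically: mismatched types
}
```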

of type-erased collections

Type-erased collections such as the ErasedHashSet shown in typeid-of-generic-type-parameters-opaque-types require slightly looser behaviour, as they are expected to mix instances between environments where only irrelevant implementations differ. Unlike std::collections::HashSet, they cannot prevent this mixing statically, because their generic type parameters appear only transiently on their methods.

It is for this reason that the TypeId of generic type parameters disregards bounds-irrelevant implementations.
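To make the role of `TypeId` here more concrete, the following is a minimal sketch in current Rust of a collection whose item type appears only on its methods. `ErasedSet` is a hypothetical stand-in, not the `ErasedHashSet` from the referenced section, and it is deliberately simplified: it stores only a `(TypeId, hash)` fingerprint per item, so a hash collision would make two distinct values look equal.

```rust
use std::any::TypeId;
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical type-erased set: `T` is transient on the methods only.
#[derive(Default)]
pub struct ErasedSet {
    fingerprints: HashSet<(TypeId, u64)>,
}

impl ErasedSet {
    // The `Hash` bound is the only implementation this collection relies on.
    fn fingerprint<T: Hash + 'static>(value: &T) -> (TypeId, u64) {
        let mut hasher = DefaultHasher::new();
        value.hash(&mut hasher);
        (TypeId::of::<T>(), hasher.finish())
    }

    /// Returns true if the value was not present before.
    pub fn insert<T: Hash + 'static>(&mut self, value: &T) -> bool {
        self.fingerprints.insert(Self::fingerprint(value))
    }

    pub fn contains<T: Hash + 'static>(&self, value: &T) -> bool {
        self.fingerprints.contains(&Self::fingerprint(value))
    }
}

fn main() {
    let mut set = ErasedSet::default();
    assert!(set.insert(&1u32));
    assert!(!set.insert(&1u32)); // same type, same value
    assert!(set.insert(&1u64)); // different `TypeId`, kept apart
    assert!(set.contains(&1u32));
}
```

In this sketch, entries are kept apart purely by `TypeId::of::<T>()` observed under a `Hash` bound. If that `TypeId` varied with implementations unrelated to the bound, callers in environments that differ only in such implementations could no longer share entries, which is the mixing the paragraph above expects to keep working.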

The example is similar to the previous one, but aliased has been removed since it continues to behave the same as discrete. A new set of functions, bounded, is added:

#![allow(unused_must_use)] // For the `TypeId::…` lines.

trait Trait {}

pub mod a {
    // ⓐ == ◯

    use std::{any::TypeId, collections::HashSet, hash::Hash};

    #[derive(PartialEq, Eq)]
    pub struct A;

    pub fn discrete(_: HashSet<A>) {
        TypeId::of::<HashSet<A>>(); // ❶
        TypeId::of::<A>(); // ❷
    }
    pub fn generic<T: 'static>(_: HashSet<T>) {
        TypeId::of::<HashSet<T>>(); // ❶
        TypeId::of::<T>(); // ❷
    }
    pub fn bounded<T: Hash + 'static>(_: HashSet<T>) {
        TypeId::of::<HashSet<T>>(); // ❶
        TypeId::of::<T>(); // ❷
    }
}

pub mod b {
    // ⓑ

    use std::{
        any::TypeId,
        collections::HashSet,
        hash::{Hash, Hasher},
    };

    use super::Trait;

    #[derive(PartialEq, Eq)]
    pub struct B;
    use impl Hash for B {
        fn hash<H: Hasher>(&self, _state: &mut H) {}
    }
    use impl Trait for B {}

    pub fn discrete(_: HashSet<B>) { // ⚠⚠
        TypeId::of::<HashSet<B>>(); // ❶
        TypeId::of::<B>(); // ❷
    }
    pub fn generic<T: 'static>(_: HashSet<T>) {
        TypeId::of::<HashSet<T>>(); // ❶
        TypeId::of::<T>(); // ❷
    }
    pub fn bounded<T: Hash + 'static>(_: HashSet<T>) {
        TypeId::of::<HashSet<T>>(); // ❶
        TypeId::of::<T>(); // ❷
    }
}

pub mod c {
    // ⓒ == ◯

    use std::{any::TypeId, collections::HashSet, hash::Hash};

    use super::Trait;

    #[derive(PartialEq, Eq, Hash)]
    pub struct C;
    impl Trait for C {}

    pub fn discrete(_: HashSet<C>) {
        TypeId::of::<HashSet<C>>(); // ❶
        TypeId::of::<C>(); // ❷
    }
    pub fn generic<T: 'static>(_: HashSet<T>) {
        TypeId::of::<HashSet<T>>(); // ❶
        TypeId::of::<T>(); // ❷
    }
    pub fn bounded<T: Hash + 'static>(_: HashSet<T>) {
        TypeId::of::<HashSet<T>>(); // ❶
        TypeId::of::<T>(); // ❷
    }
}

pub mod d {
    // ⓓ

    use std::{
        collections::HashSet,
        hash::{Hash, Hasher},
        iter::once,
    };

    use super::{
        a::{self, A},
        b::{self, B},
        c::{self, C},
        Trait,
    };

    use impl Hash for A {
        fn hash<H: Hasher>(&self, _state: &mut H) {}
    }
    use impl Hash for B {
        fn hash<H: Hasher>(&self, _state: &mut H) {}
    }
    use impl Hash for C {
        fn hash<H: Hasher>(&self, _state: &mut H) {}
    }

    use impl Trait for A {}
    use impl Trait for B {}
    use impl Trait for C {}

    fn call_functions() {
        a::discrete(HashSet::new()); // ⓐ == ◯
        a::generic(HashSet::from_iter(once(A))); // ⊙ == ⓓ
        a::bounded(HashSet::from_iter(once(A))); // ⊙ == ⓓ

        b::discrete(HashSet::from_iter(once(B))); // ⓑ
        b::generic(HashSet::from_iter(once(B))); // ⊙ == ⓓ
        b::bounded(HashSet::from_iter(once(B))); // ⊙ == ⓓ

        c::discrete(HashSet::from_iter(once(C))); // ⓒ == ◯
        c::generic(HashSet::from_iter(once(C))); // ⊙ == ⓓ
        c::bounded(HashSet::from_iter(once(C))); // ⊙ == ⓓ
    }
}

// ⚠ and non-digit circles have the same meanings as above.

The following table describes how the types are observed at runtime in the lines marked with ❶ and ❷. It borrows some syntax from explicit-binding to express this clearly, but denotes types as if seen from the global implementation environment.

within function (called by `call_functions`)	❶ (collection)	❷ (item)
`a::discrete`	`HashSet<A>`	`A`
`a::generic`	`HashSet<A: Hash in d + Trait in d>`	`A`
`a::bounded`	`HashSet<A: Hash in d + Trait in d>`	`A` ∘ `Hash in d`
`b::discrete`	`HashSet<B: Hash in b + Trait in b>`	`B`
`b::generic`	`HashSet<B: Hash in d + Trait in d>`	`B`
`b::bounded`	`HashSet<B: Hash in d + Trait in d>`	`B` ∘ `Hash in d`
`c::discrete`	`HashSet<C>`	`C`
`c::generic`	`HashSet<C: Hash in d + Trait in d>`	`C`
`c::bounded`	`HashSet<C: Hash in d + Trait in d>`	`C` ∘ `Hash in d`

The combination ∘ is not directly expressible in TypeId::of::<> calls (as even a direct top-level annotation would be ignored without bounds). Rather, it represents an observation like this:

{
    use std::{any::TypeId, hash::Hash};

    use a::A;
    use d::{impl Hash for A};

    fn observe<T: Hash + 'static>() {
        TypeId::of::<T>(); // '`A` ∘ `Hash in d`'
    }

    observe::<A>();
}
