Since more than a week has passed without any responses, I assume that this mapping between a 128-bit TypeId and the stable identifiers for crABI is logically sound.
I understand that we’re heading into the Easter break, so I’m leaving this here for us to reflect on. I look forward to hearing the team’s perspective once the break is over. If the next step is to address the 64-bit collision risks and the implementation involves drafting a formal Pre-RFC, please let me know.
The v0 scheme resolves symbol mangling for the linker, but it remains a compiler implementation detail rather than something the language guarantees. It can change between versions of rustc without being considered a breaking change.
What I propose is different: elevating that information to an explicit identifier guaranteed by the language, one that does not change between compiler versions. Not something that depends on the compiler, but on the language.
The problem with the 64-bit hash is a consequence of this; if the identifier isn’t stable across compilers, it doesn’t matter whether you use 64 or 128 bits. First you need stability, then collision resistance.
v0 is an excellent basis for deriving those identifiers, but it doesn’t address the need for a permanent, guaranteed TypeId.
Ok, so it's more an issue of the Rust project being willing to commit to a precise stability guarantee, rather than a need for a specific identifier format.
The opaque hash is a problem, but it's deeper than the hash itself: crate names are not unique identifiers, even within a single compilation unit. Rust/Cargo supports multiple major versions of the same crate and allows different registries and path dependencies to have the same crate name. To disambiguate that you'd also need to add something like Cargo's pkgid.
But what I don't understand is why you're proposing nominal typing for the ABI. Wouldn't it be better to identify types by their structure, so that non-ABI-breaking changes don't change the identifier?
The main use case I have in mind is plugins and dynamic Rust-to-Rust linking. In that case, nominal typing is safer; if you load a plugin that exports a type, you want to know exactly what that type is, not just any struct with the same fields. Semantic identity matters.
With structural typing, you could load a structurally compatible but semantically incorrect type without detecting it until runtime.
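To make the distinction concrete, here is a minimal sketch. The types `Meters` and `Seconds` and both helper functions are hypothetical, and comparing size and alignment is only a crude stand-in for a real structural check:

```rust
use std::any::TypeId;
use std::mem::{align_of, size_of};

// Two hypothetical types with identical fields but different meanings.
#[repr(C)]
pub struct Meters {
    pub value: f64,
}

#[repr(C)]
pub struct Seconds {
    pub value: f64,
}

/// Nominal check: true only when A and B are the very same type.
pub fn same_nominal<A: 'static, B: 'static>() -> bool {
    TypeId::of::<A>() == TypeId::of::<B>()
}

/// Crude stand-in for a structural check: compares only size and alignment.
pub fn same_layout<A, B>() -> bool {
    size_of::<A>() == size_of::<B>() && align_of::<A>() == align_of::<B>()
}
```

Here `same_layout::<Meters, Seconds>()` holds while `same_nominal::<Meters, Seconds>()` does not, which is the kind of confusion a purely structural identifier would let through.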
That said, I understand there are cases where structural typing would be useful; moving types between modules without breaking the ABI, for example. For those cases, an explicit annotation like #[abi_structural] could be the solution, similar to how #[repr(C)] already works today; nominal by default, opt-in for special behavior.
Regarding the pkgid issue you mentioned, you’re right; crate names aren’t sufficient, and we’d need to include something equivalent to Cargo’s pkgid to resolve ambiguity.
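As a rough sketch of what that could look like, assuming the identifier carries package source, name, and version alongside the type path. The struct, its field names, and the rendered format are all hypothetical and are not Cargo's actual pkgid syntax:

```rust
/// Hypothetical identifier that qualifies a type path with package identity.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct QualifiedTypeKey {
    /// Where the package came from (registry, git, path); illustrative only.
    pub source: String,
    pub package: String,
    pub version: String,
    /// The path of the type inside the package.
    pub type_path: String,
}

impl QualifiedTypeKey {
    pub fn new(source: &str, package: &str, version: &str, type_path: &str) -> Self {
        Self {
            source: source.to_string(),
            package: package.to_string(),
            version: version.to_string(),
            type_path: type_path.to_string(),
        }
    }

    /// Renders the key in a made-up textual form for display or hashing.
    pub fn render(&self) -> String {
        format!(
            "{}#{}@{}::{}",
            self.source, self.package, self.version, self.type_path
        )
    }
}
```

With this shape, `shapes::Circle` from `shapes` 1.0.0 and from `shapes` 2.0.0 produce different keys even though the crate name and type path are identical.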
I don't follow the safety part. With plain nominal typing you could still load a plugin that expects an older or newer, incompatible version of a type, and passing the wrong type would then be instant UB.
There's also an argument to be made of which issue is more common. I find it hard to believe that you often have structurally equal types with different names and you confuse one with the other. It is however very common for types to change in non-backward/forward compatible ways, which then break when used with nominal typing.
Moreover:
While it does not give an explicit definition of stable identifiers, it does talk about name mangling and type information. The stated goal is to make it not-UB to call an exported function after changing a function signature (including any type mentioned in it). There are also some guidelines on what such a mangling scheme should take into account, and for types with public fields it is much more similar to a structural type system than a nominal one. Your proposal doesn't address any of these points, and instead goes against some of them.
You're right. After reading #3435 in detail, I see that it covers much more than I initially thought, including the move toward structural typing for types with public fields. I should have read it more carefully before starting this discussion. Thanks for the clarification.
After reading #3435 more closely, I realize that my proposal overlaps significantly with it. However, the question I asked in post #14 remains open for me: is there any existing work or discussion on stable type identification that isn’t directly tied to ABI compatibility?
I already acknowledged this in post #14; my proposal addresses a different problem than the one #3435 needs. The question I’ve been asking ever since (posts #14 and #28) is whether there is any prior work on stable type identification that isn’t directly tied to ABI compatibility. That question remains unanswered.
How do you reach that goal without tying such identifiers to the ABI? Your proposal as is doesn't reach the goal because it doesn't protect against UB from a mismatched ABI, so it must also be unsafe.
The idea has evolved over the course of this thread. The initial post mentioned dynamic linkage as the motivation, but in post #14 I explicitly distinguished between two separate issues: identifiers linked to ABI compatibility, which is what #3435 addresses, and stable type identification independent of the ABI. My question since post #14 specifically concerns the second case. Is there any prior work on this?
What I don't understand is what you need stable type identification independent of the ABI for. What problem are you trying to solve? What code, program, or system are you trying to write that you can't because you're missing this feature? And how would this feature solve it?
There is already a specific use case: Bevy issue #32, opened in 2020. The problem is that TypeId is not stable across binaries, making it unsuitable for serialization, networking, or plugin systems. The workaround has been type_name or TypePath, which the author himself described as sufficient for scenes and networking, but which is implemented manually at the library level, not provided by the language. That is exactly the drawback I’m describing: stable type identification independent of the ABI, with a real-world use case that a major Rust project has been patching for four years without a proper solution.
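A minimal sketch of the difference, assuming a hand-written trait: `StablePath`, `Transform`, and the path string are hypothetical and only loosely in the spirit of a library-level type path; this is not Bevy's actual API.

```rust
use std::any::{type_name, TypeId};

/// Hypothetical library-level trait: the author chooses the path by hand.
pub trait StablePath {
    const PATH: &'static str;
}

pub struct Transform;

impl StablePath for Transform {
    // Stays the same until someone edits this string.
    const PATH: &'static str = "my_game::components::Transform";
}

/// Returns the three kinds of identity side by side.
pub fn describe<T: StablePath + 'static>() -> (TypeId, &'static str, &'static str) {
    (
        // Opaque; only meaningful inside the current program.
        TypeId::of::<T>(),
        // Diagnostic string; std documents its exact contents as unspecified.
        type_name::<T>(),
        // Manually maintained, so stable by convention only.
        T::PATH,
    )
}
```

Only the third component is something a serialized scene could safely store, and it is stable only because a person keeps it that way.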
Networking is really using serialization under the hood, and plugin systems generally require ABI considerations to be safe. However I can see your point for serialization of open sets of types.
By the way, that issue was eventually closed by its creator with a comment stating "I personally haven't felt the need for a "stable type id" in awhile".
The fact that the network layer uses serialization actually expands the use case you already acknowledged; it doesn't restrict it. Regarding plugins, not all of them require an ABI; those based on WASM, IPC, or explicit serialization do not need one. And the Bevy issue wasn’t closed because the problem went away: cart closed it because they settled for TypePath as a workaround, explicitly acknowledging that the language doesn’t provide the right tool for this.
To clarify the design: two types are considered the same if they share the same logical identifier, derived from a canonical, user-visible type path (for example, the public path used to reference the type). This system does not attempt to capture structural or semantic compatibility beyond that. Renames are explicit and detectable changes, not silent failures.
Your explanation of paths is overly simplistic and doesn't actually work for all types. I recommend looking at DefPath in the compiler for the complexity of paths.
For example, types can be defined inside functions, functions can be inside impl blocks, and crates can have the same name.
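A small sketch of the first case, with hypothetical function names: two functions each define their own `Local` struct, so a path made of module plus type name alone cannot tell them apart.

```rust
use std::any::{type_name, TypeId};

pub fn first() -> (TypeId, &'static str) {
    // Local to `first`; it has no path a user could write from outside.
    struct Local;
    (TypeId::of::<Local>(), type_name::<Local>())
}

pub fn second() -> (TypeId, &'static str) {
    // Same name, same enclosing module, but a distinct type.
    struct Local;
    (TypeId::of::<Local>(), type_name::<Local>())
}
```

The two `TypeId` values differ, so any path-based identifier has to encode the enclosing function (and, for methods, the enclosing impl block) to stay unambiguous. The `type_name` strings are returned only for inspection, since their format is not guaranteed.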
Also I'm not sure this proposal itself is solving any problem on its own but is instead part of something greater. That's fine, but just understand that this proposal won't be "accepted" for anything on its own because of that, so writing an RFC for it isn't particularly useful. It would be evaluated as part of a bigger scheme, which of course doesn't mean that it can't be discussed here.
You've asserted that stable type names (or some form of identifier) are desirable, but as far as I can tell, you've failed to identify why. What exactly are the use cases, why do they only care about type identity and not structure/ABI, and why aren't they served by existing solutions?
People have identified potential flaws in the initial idea, but without semi-concrete use cases that we can evaluate the solution against, it's difficult for anyone to identify whether those are blocking or a non-issue (or even a desirable limitation).
The closest to a concrete application I was able to determine[1] was for (de)serialization, especially in a bevy ECS like serialized List<dyn Any> scenario.
For typical serde-like serialization schemes, the type name alone is enough (or even more than enough), as type context informs what the possibilities are. It doesn't matter if Left is ambiguous, because in the context of deserializing the specific container type you already know it must be Either::<T, U>::Left. This is what allows highly contextual and non-self-describing binary formats to work at all.
In the case of a heterogeneous collection, things get more complicated, which is why serde can't support deserializing dyn Trait out of the box. What can be done is collecting a list of every impl Trait that you potentially care about[2], and then essentially deserializing a fake enum Everything with variants for every implementor.
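A hand-rolled sketch of that registration strategy follows. It uses neither serde nor typetag, and the `Shape` trait, the tag strings, and the `tag=payload` record format are all invented for illustration:

```rust
use std::collections::HashMap;

pub trait Shape {
    fn area(&self) -> f64;
}

pub struct Circle {
    radius: f64,
}

pub struct Square {
    side: f64,
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius * self.radius
    }
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

/// One decoder per registered implementor.
type Decoder = fn(&str) -> Option<Box<dyn Shape>>;

fn decode_circle(payload: &str) -> Option<Box<dyn Shape>> {
    let radius = payload.trim().parse::<f64>().ok()?;
    Some(Box::new(Circle { radius }))
}

fn decode_square(payload: &str) -> Option<Box<dyn Shape>> {
    let side = payload.trim().parse::<f64>().ok()?;
    Some(Box::new(Square { side }))
}

/// Explicit registration: the tag strings play the role of type identifiers.
pub fn registry() -> HashMap<&'static str, Decoder> {
    let mut map: HashMap<&'static str, Decoder> = HashMap::new();
    map.insert("shapes::Circle", decode_circle);
    map.insert("shapes::Square", decode_square);
    map
}

/// Decodes a made-up `tag=payload` record by dispatching on the tag.
pub fn decode(registry: &HashMap<&'static str, Decoder>, record: &str) -> Option<Box<dyn Shape>> {
    let (tag, payload) = record.split_once('=')?;
    let decoder = registry.get(tag)?;
    decoder(payload)
}
```

Decoding `"shapes::Square=3"` yields a boxed `Square`, while an unregistered tag returns `None`. The map is effectively the fake enum: its keys are the variants, which is why the tag has to stay stable across versions.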
Bevy very specifically wants legible names for their textual scene format, so 128-bit or 256-bit hashes are not a great option for them. For formats that aren't meant to be human readable, though, an opaque but stable type identifier would mean this strategy won't run into name collisions! ...Except you still need to deal with renames, potentially even cross-crate renames[3], and you need to annotate/register your types somewhere, so you might as well assign a #[track_caller] Location[4] based identity by default and allow specifying a user-defined id for cases that need stability in the face of Location-changing refactors.
[1] "For plugins" isn't very specific, and any usages I can come up with don't need name stability between versions without also needing ABI compatibility info or just simplifying to the (de)serialization case.
[2] Either via explicit registration like bevy-reflect, or implicitly like typetag.
[3] E.g. extracting a type into a lib-core crate so it's still the same type between lib@1 and lib@2.
[4] It's not currently possible to construct Location except via the panic machinery. I'd be weakly in support of extending that capability to user code alongside appropriate warnings about stability and appropriate usage, like those that exist on any::type_name and other for-debug-usage APIs.
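For what a Location-based default with an explicit override might look like, here is a rough sketch. It only reads `std::panic::Location::caller()` from a `#[track_caller]` function, which is available today, rather than constructing a `Location`. The `TypeKey` enum and `key_at_call_site` are hypothetical names:

```rust
use std::panic::Location;

/// Hypothetical identity: derived from the registration site by default,
/// or pinned by hand when it must survive refactors.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum TypeKey {
    Site {
        file: &'static str,
        line: u32,
        column: u32,
    },
    Explicit(&'static str),
}

/// Returns a key naming the source position of the caller.
#[track_caller]
pub fn key_at_call_site() -> TypeKey {
    let loc = Location::caller();
    TypeKey::Site {
        file: loc.file(),
        line: loc.line(),
        column: loc.column(),
    }
}
```

Two registrations at different source positions get different `Site` keys, but moving a registration changes its key, which is exactly the case where a `TypeKey::Explicit` id would be needed.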