Policy idea: Change TypeId's implementation details every other release

ZerothLaw · November 20, 2018, 9:01pm

Hah, so not a hole, just anymap not using this impl correctly. That should be easy to fix.

Soni · November 20, 2018, 9:03pm

Eh, I mean, tbh, I do believe AnyMap should just use a “dummy hasher” that just returns the last 8 bytes pushed into the thing.
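
A minimal sketch of the "dummy hasher" idea from the post above. The names `LastEightHasher`, `TypeIdMap` and `example_map` are hypothetical, and this is not AnyMap's actual code. The hasher keeps only the last eight bytes written to it, so it makes no assumption about how many bytes TypeId feeds in.

```rust
use std::any::TypeId;
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Hypothetical pass-through hasher: remembers the last eight bytes fed to it.
#[derive(Default)]
struct LastEightHasher {
    value: u64,
}

impl Hasher for LastEightHasher {
    fn write(&mut self, bytes: &[u8]) {
        // Shift older bytes out as newer ones arrive, so only the final
        // eight bytes survive, however many are written.
        for &byte in bytes {
            self.value = (self.value << 8) | u64::from(byte);
        }
    }

    fn finish(&self) -> u64 {
        self.value
    }
}

/// A map keyed by TypeId that skips a general-purpose hash function.
type TypeIdMap<V> = HashMap<TypeId, V, BuildHasherDefault<LastEightHasher>>;

/// Builds a small map to show the hasher plugged into HashMap.
fn example_map() -> TypeIdMap<&'static str> {
    let mut map: TypeIdMap<&'static str> = HashMap::default();
    map.insert(TypeId::of::<u32>(), "u32");
    map.insert(TypeId::of::<String>(), "String");
    map
}
```

Lookups stay correct even if two keys happened to produce the same hash, because HashMap still compares keys with Eq. Only performance depends on the quality of the bytes TypeId provides.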

Centril · November 20, 2018, 9:09pm

Yup, that's (sort of) correct. As you can see in the implementation, the crate assumes that bytes.len() == 8. The usage of debug_assert! is a bit suspect though: if compiled in release mode and the representation of TypeId is changed, then the implementation may become unsound. If it were assert!(...) instead, it'd be fully safe (but still error-prone). Even if we change TypeId to u128 underneath, it should still be OK, because it will just read the first 64 bits and ignore the rest. The only soundness hole would arise if we suddenly made bytes.len() < 8, I think (which I can't imagine us doing...).
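
To make the length reasoning above concrete, here is a rough sketch of a read that is checked in every build profile. The helper `first_eight_as_u64` is hypothetical and not taken from any crate. Input longer than eight bytes is truncated to its first 64 bits, and input shorter than eight bytes is rejected rather than read out of bounds.

```rust
/// Reads the first eight bytes as a u64, or returns None if fewer are given.
/// Hypothetical helper, written only to illustrate the length cases.
fn first_eight_as_u64(bytes: &[u8]) -> Option<u64> {
    // `get(..8)` is bounds-checked in debug and release builds alike,
    // unlike a `debug_assert!` guard, which is compiled out when debug
    // assertions are disabled (the default for release builds).
    let head = bytes.get(..8)?;
    let mut buf = [0u8; 8];
    buf.copy_from_slice(head);
    Some(u64::from_ne_bytes(buf))
}
```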

Soni · November 20, 2018, 9:10pm

Can you imagine us making bytes.len() == size_of::<usize>()? (4 on windows, 8 everywhere else)

Centril · November 20, 2018, 9:11pm

Would you be willing to elaborate on the rationale for this particular change?

Soni · November 20, 2018, 9:11pm

As for making it fast again: I have an idea, but I need to figure out if Windows can at least merge the same global together across DLLs, like it can with errno...

Soni · November 21, 2018, 12:12am

Because the core of it need not be fast! And the goal is to have 0 collisions.

comex · November 21, 2018, 12:49am

errno is part of the C runtime which is usually statically linked per DLL anyway.

Windows can’t merge globals by name across DLLs, but if you declare a non-generic static in a crate, you can rely on that having the same address in every dependent crate, since all users will reference it rather than instantiating their own copy. If you’re planning to make some sort of global lookup table to merge duplicate type IDs, that should be doable.
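
A rough sketch of what such a lookup table could look like, assuming string keys and a hypothetical `intern` function (neither comes from the thread or from std). Because `REGISTRY` is a single non-generic static, every caller goes through the same table.

```rust
use std::sync::Mutex;

/// One non-generic static: every caller of `intern` shares this table.
static REGISTRY: Mutex<Vec<String>> = Mutex::new(Vec::new());

/// Returns a small integer for `key`, handing out the same integer
/// whenever the same key is seen again. Hypothetical illustration only.
fn intern(key: &str) -> usize {
    let mut table = REGISTRY.lock().unwrap();
    if let Some(index) = table.iter().position(|known| known.as_str() == key) {
        return index;
    }
    table.push(key.to_owned());
    table.len() - 1
}
```

A linear scan keeps the sketch short. A real table would likely want a map and a cheaper key than a string.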

Soni · November 21, 2018, 12:56am

Yep, FastAny would be part of std, and it would have two impls:

On Windows, it uses a non-generic static that is shared, allowing FastTypeId to work properly.
On everything else, we don’t use that.

FastAny would involve a few language features that don’t exist yet, but that shouldn’t stop us from fixing Any first.

Let’s not focus on FastAny for now, please?

retep998 · November 21, 2018, 6:00pm

If you have two DLLs which both statically link the CRT, they will both have their own independent versions of errno. Similarly if you have two DLLs which both dynamically link to the CRT but link to different versions of the CRT, they will still both have their own independent versions of errno. There's no merging of globals done by Windows.

Soni · November 21, 2018, 8:31pm

So if you depend on a DLL that makes use of errno, you don’t have access to errno?

retep998 · November 21, 2018, 11:58pm

errno is provided by the CRT, so if you depend on a DLL which makes use of errno and you want to be able to access the same errno as that DLL, you need to make sure both you and the DLL dynamically link to the same CRT.

Soni · November 22, 2018, 12:16am

Hmm. So just ship a triple-R (Rust Runtime Redistributable) on Windows? This can be per-Rust-version, since we already guarantee TypeId incompatibility across Rust versions.

retep998 · November 22, 2018, 2:35am

Rust guarantees that inside a given Rust dependency tree there will be precisely one copy of any given Rust crate (two different versions of a crate are considered different crates), so Rust does not currently have any issues with ensuring there is only one copy of a given static. As long as it is known where a given static exists, all compilation units can ensure they link against the static from that known location, and Rust does that just fine. There's no need to get some sort of Rust runtime redistributable.

Soni · November 22, 2018, 9:15am

Okay, so a slow TypeId is not gonna be an issue.

Can we get a slow TypeId?

Centril · November 22, 2018, 10:32am

This is not a forum for "Can we get X?". If you want something to happen, do the work, come up with a design, and spend the time writing a detailed RFC, blog post or comment in the following issue: Collisions in type_id · Issue #10389 · rust-lang/rust · GitHub.

chris-morgan · November 27, 2018, 1:53am

On the use of debug_assert!: https://github.com/chris-morgan/anymap/pull/32 resolved that a week before this discussion, removing all unsafe code there, though not the underlying assumptions; I just hadn’t merged it.

For the rest: I stand by the principle of the TypeIdHasher. This is the dummy hasher, it just happens to be assuming that there are exactly eight bytes rather than merely taking the last eight bytes.

Performance is the name of the game here; I wrote it the way I did under the belief that no substantial change to TypeId’s internals was likely. If it turns out to be likely to change, well then, we’ll change it in anymap and any similar code.

But so far I’m with Centril; until we get a compelling rationale for change, I’m happy leaving the assumption in.

(Incidentally, it could be more robust if it did transmute to u64: then if the size changed it’d fail to compile, instead of merely panicking the first time you use it. But as it is, changing the internal representation won’t cause problems if it still hashes to eight bytes, which seems most likely.)
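
A sketch of the compile-time check mentioned in the parenthetical above. TypeId's size is an implementation detail, so the example uses a hypothetical stand-in type, `OpaqueId`, instead of TypeId itself. If the stand-in ever stopped being eight bytes, the build would fail rather than panic at run time.

```rust
use std::mem::{size_of, transmute};

/// Hypothetical stand-in for an opaque, eight-byte identifier.
#[repr(transparent)]
#[derive(Clone, Copy)]
struct OpaqueId(u64);

// Compile-time guard: the build fails here if the sizes ever diverge.
const _: () = assert!(size_of::<OpaqueId>() == size_of::<u64>());

/// Reinterprets the identifier as its raw 64 bits.
fn id_bits(id: OpaqueId) -> u64 {
    // `transmute` also refuses to compile when the source and destination
    // types differ in size, so a layout change surfaces at build time.
    unsafe { transmute::<OpaqueId, u64>(id) }
}
```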

Soni · November 27, 2018, 2:06am

My proposal involves adding as much metadata as possible, as strings, into the TypeId.

Right now my priority is to fix accidental collisions. We can make it fast later.
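
For illustration only, a string-backed identifier might be shaped roughly like this. `SlowTypeId` is a hypothetical name, and `std::any::type_name` is used as a stand-in for the metadata: its output is documented as best-effort diagnostics and is not guaranteed to be unique, so an actual proposal would need compiler-provided metadata instead.

```rust
use std::any::type_name;

/// Hypothetical identifier that compares types by a descriptive string
/// rather than by a fixed-width hash.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct SlowTypeId {
    name: &'static str,
}

impl SlowTypeId {
    /// Builds the identifier from the type's name. This is an assumption
    /// made for the sketch, not a uniqueness guarantee.
    fn of<T: ?Sized>() -> Self {
        SlowTypeId { name: type_name::<T>() }
    }
}
```

Equality here is a string comparison, which is the speed cost the post accepts in exchange for avoiding hash collisions.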

dcarosone · November 27, 2018, 10:25am

Just in general: if there’s some situation where the internal implementation of a thing should not be depended on by consumers, but for some technical reason it can’t be completely hidden and so might be depended on in an undesirable way that could cause surprising breakage when the internal implementation changes, then changing that implementation and seeing what breaks is a perfectly reasonable thing to do - in a test case, not a release.

Bloop · November 27, 2018, 9:48pm

std::mem::size_of::<usize>() evaluates to the size of a pointer, which is always 8 on 64-bit and always 4 on 32-bit.

If it was actually determined by operating system like that, it would be a huge, obvious, ecosystem-wide problem.
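
A small sketch of the point above: the width of usize follows the target's pointer width, not the operating system. The function names are made up for the example.

```rust
use std::mem::size_of;

/// Size of usize in bytes on the current target.
fn usize_bytes() -> usize {
    size_of::<usize>()
}

/// The size implied by the target's pointer width alone.
fn expected_usize_bytes() -> usize {
    if cfg!(target_pointer_width = "64") {
        8
    } else if cfg!(target_pointer_width = "32") {
        4
    } else {
        2 // 16-bit targets
    }
}
```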
