Surprising interaction between zero sized structs and Hash

pedrocr · June 26, 2017, 11:27am

I was working on hashing a set of structs that all implement the same trait and some are zero-sized and some not. I have a sequence of them and am using the hash to represent the cumulative set of them up to that point in the sequence (for caching type uses mostly). I was then very surprised when I found that that hashing a ZST is actually a no-op. Here’s a minimal example:

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

#[derive(Hash)]
struct Empty {
}

fn main() {
    let mut s = DefaultHasher::new();
    1.hash(&mut s);
    println!("Hash is {}", s.finish());
    let e = Empty{};
    e.hash(&mut s);
    println!("Hash is {}", s.finish());
}

It returns the surprising result:

Hash is 1742378985846435984
Hash is 1742378985846435984

I’d say this breaks the assumption of what .hash() does in that you’ve passed a new value into it and the hash hasn’t changed. It’s definitely a corner case though and having it be a no-op should definitely help performance and be fine in almost all cases.

For my specific use case this was quite easy to work around. My trait includes a .name() that returns a string so I hashed that first before hashing the struct. But I thought it was worth it to discuss first

kennytm · June 26, 2017, 1:40pm

I don't think there is such an assumption. The only requirement of Hash is a == b ⇒ hash(a) == hash(b). Since a concrete ZST only has one value, making its hash() do nothing won't break this requirement.

Perhaps you should not use Hash if you need anything stronger.

pedrocr · June 26, 2017, 1:56pm

I at least assumed that of .hash(), that it's never a nop. It seems reasonable to think this but maybe not.

Here's a list of things that look strange to me with the current Hash:

Empty{} and OtherEmpty{} hash to the same value
Foo("somestring") and Bar("somestring") hash to the same value
Hashing any number of Empty{} structs in sequence returns the same value

In rust most times these won't bite you because you can't create a hash table of different structs and so the fact that different types hash the same is less important. But it still seems inconsistent given the signature of .hash(&mut Hasher) that allows you to easily hash a sequence of different types. It's at least surprising that in the ZST case the result of the hash will not change no matter how many different types you use and no matter how many different calls you make.

kennytm · June 26, 2017, 2:15pm

(123_u8, 45_u8) and 11643_u16 both hash to the same value.

You can't compare the hash if they are different types.

arielb1 · June 26, 2017, 4:21pm

If hash of a ZST was not a nop, performance of hashing a struct with zero-sized markers would decrease with the # of markers, which would suck.

Ixrec · June 26, 2017, 5:45pm

It sounds like you were expecting Hash to hash a value's type in addition to the actual value. This is what I'd expect to happen in a dynamically typed language where a value "carries around its type" at runtime, but in Rust I would've expected and am not surprised to learn that "types aren't hashed", or as kenny put it:

That said, "type hashing" seems like a reasonable thing to do as long as it's kept separate from regular hashing. Maybe someone could write a crate that does impl<T: Hash + Any> TypeHash for T using TypeId::of::<T>().

pedrocr · June 26, 2017, 6:08pm

Actually I only need something weaker, which is that when .hash() is called on a ZST the hash value changes. Just hashing a fixed value for all ZSTs would be fine by me, but I see that for most use cases this doesn't matter and the performance implications are potentially relevant. The workaround was extremely simple so I only opened the discussion because the result was strange to me at first.

Ixrec · June 26, 2017, 6:12pm

Changes from what to what? I'm afraid I can't tell how this is different from "hashing the type" (or perhaps "hashing the ZST-ness of a type").

scottmcm · June 27, 2017, 12:30am

Note that this is always possible, even for non-ZSTs, because of the limited size of the hash. I'm sure there's some value that just happens to result in the hash not changing.

pedrocr · June 27, 2017, 7:33am

Sure, but good luck finding that one...

pedrocr · June 27, 2017, 7:35am

It's right there in the next sentence:

Just turning the nop into 0.hash(&mut hasher); would work for me.

system · March 25, 2019, 8:28am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Low-latency hashing libs	11	1316	December 18, 2024
Stability of hash values documentation	8	5577	March 25, 2019
Pre-RFC: bless the `type FastHashMap` pattern	92	3977	March 25, 2019
Help wanted: fast hash maps in std	29	6094	March 25, 2019
Generalize Hasher to different outputs libs	8	953	December 30, 2021

Surprising interaction between zero sized structs and Hash

Related topics