f32/f64 should implement Hash

Attempt at a comprehensive summary

This is an attempt to collect, in a single place that can be referenced later, what was discussed about hashing floats.

Base Facts

There are some facts about floats, and about how Rust is currently set up, that are important to clarify:

  1. Equality in floats is more complex than in most types because, for math reasons, you want things like NAN != NAN and +0 == -0 to be true. This is why floats in Rust implement PartialEq but not Eq: the reflexive property (a == a) is broken by the NAN case.
  2. There is a different equality concept that’s generally useful: if a == b, then no function can produce myfunc(a) != myfunc(b). This is useful for caching, and float equality in Rust doesn’t guarantee it because +0 == -0 and yet .is_sign_negative() and friends allow you to differentiate between them. Note that this property isn’t necessarily exclusive to floats. If someone ever ports Rust to an architecture that represents integers as sign and magnitude, the same +0 == -0 case would exist.
  3. The output of Hash cannot be relied upon to be stable. The same version of Rust can return different values on different architectures. This is not a property of the Hasher you’re using but of the way Hash happens to be implemented for the type you’re using (e.g., the current implementation of Hash for slices of integers returns different values on big-endian and little-endian architectures).
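
Facts #1 and #2 are easy to check directly. The following is a minimal sketch using only the standard library; the function name float_equality_facts is made up for illustration.

```rust
/// Returns (nan_equals_itself, zeros_compare_equal, zeros_distinguishable).
/// Hypothetical helper, only here to make facts #1 and #2 concrete.
fn float_equality_facts() -> (bool, bool, bool) {
    let nan = f32::NAN;
    let pos_zero = 0.0_f32;
    let neg_zero = -0.0_f32;

    // Fact #1: NaN is not equal to itself, so `==` is not reflexive.
    let nan_equals_itself = nan == nan;

    // Fact #1: positive and negative zero compare equal.
    let zeros_compare_equal = pos_zero == neg_zero;

    // Fact #2: a function can still tell the two zeros apart,
    // both through the sign and through the raw bit pattern.
    let zeros_distinguishable = pos_zero.is_sign_negative() != neg_zero.is_sign_negative()
        && pos_zero.to_bits() != neg_zero.to_bits();

    (nan_equals_itself, zeros_compare_equal, zeros_distinguishable)
}
```

Calling float_equality_facts() should give (false, true, true): NaN fails reflexivity, the zeros compare equal, and yet they remain distinguishable.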

If you want to use floats as HashMap keys

In general, floats make a poor choice of map key because of the way equality is handled (see facts #1 and #2), so the best answer is: just don’t do it. If you have a really good reason for it (e.g., caching the output of a really expensive func(f32)), here’s what you’d probably need to do:

  • Implement a separate equality operation that treats +0 != -0 and NAN == NAN. The easiest way to do that is probably to reinterpret f32 as u32 and f64 as u64 (for example with to_bits()) and compare those, but you may want to do something fancier, like making all possible NAN values equal.
  • You can then implement Hash by hashing the same underlying bits, applying the same NAN normalization if you did that for equality.
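
The two steps above can be sketched as a small wrapper type. This is only an illustration under stated assumptions: HashableF32, key_bits and cached_sqrt are hypothetical names, not part of the standard library, and the choice to collapse every NAN payload into one canonical value is just one of the options mentioned above.

```rust
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical wrapper that gives f32 a bitwise notion of equality.
#[derive(Debug, Clone, Copy)]
struct HashableF32(f32);

impl HashableF32 {
    /// Bit pattern used for both equality and hashing.
    /// Every NaN collapses to one canonical NaN; +0 and -0 stay distinct.
    fn key_bits(self) -> u32 {
        if self.0.is_nan() {
            f32::NAN.to_bits()
        } else {
            self.0.to_bits()
        }
    }
}

impl PartialEq for HashableF32 {
    fn eq(&self, other: &Self) -> bool {
        self.key_bits() == other.key_bits()
    }
}

// Sound here because key_bits() comparison is reflexive, unlike float `==`.
impl Eq for HashableF32 {}

impl Hash for HashableF32 {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash exactly what equality compares, so equal keys hash equally.
        self.key_bits().hash(state);
    }
}

/// Hypothetical cache around an "expensive" function (sqrt as a stand-in).
fn cached_sqrt(cache: &mut HashMap<HashableF32, f32>, x: f32) -> f32 {
    *cache.entry(HashableF32(x)).or_insert_with(|| x.sqrt())
}
```

With this wrapper, 0.0 and -0.0 occupy separate cache entries, while any two NAN inputs share one. The important design point is that Hash and PartialEq are derived from the same key_bits() value, which keeps them consistent with each other.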

To implement this in the Rust standard library, a new equality trait would need to be created so that for math NAN != NAN and for hashing NAN == NAN. Since this is a corner case, the easiest solution is to use a wrapper type around floats, like the ordered-float crate does, for the few situations where you want it.

If you want to have a content hash of a struct that includes floats

In some situations, hashing is used to get a unique identifier for some content, calculated from the content itself. For this, Hash is not workable as it doesn’t guarantee that the results are stable between architectures (see fact #3). Here your best solution is to use serde’s Serialize on your struct and then implement a Serializer that just passes the content on to a cryptographic hash. This can almost surely be done generically, so there will probably be a turn-key solution in the near future (see the discussion here). There are also fancier solutions for this in the works, like the objecthash crate.
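
To make the idea concrete without pulling in external crates, here is a rough standard-library-only sketch of the same principle: encode the content into a byte sequence that is fixed across architectures, then hash those bytes. It is not the serde-based approach described above. The Sample struct and the function names are hypothetical, the encoding is hand-written rather than generated by a Serializer, and FNV-1a is used purely as a stand-in: it is NOT a cryptographic hash, and a real content hash would use one.

```rust
/// Hypothetical struct whose content we want to identify.
struct Sample {
    id: u32,
    weight: f64,
}

/// Encodes the fields in a fixed order and always in little-endian,
/// so the bytes are the same on every architecture.
fn canonical_bytes(s: &Sample) -> Vec<u8> {
    let mut out = Vec::with_capacity(12);
    out.extend_from_slice(&s.id.to_le_bytes());
    // The float goes in as its raw bit pattern; NaN normalization
    // could be applied here if all NaNs should hash alike.
    out.extend_from_slice(&s.weight.to_bits().to_le_bytes());
    out
}

/// 64-bit FNV-1a. Stand-in only: NOT cryptographic.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Content hash: depends only on the canonical bytes, not on the platform.
fn content_hash(s: &Sample) -> u64 {
    fnv1a_64(&canonical_bytes(s))
}
```

Because the input to the hash is an explicit byte sequence rather than whatever Hash happens to feed the Hasher, the result does not depend on the endianness of the machine. Note that +0 and -0 produce different identifiers here, which is usually what you want for a content hash.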
