New here, so forgive me if this is the wrong place to post proposals, or if I'm doing it wrong.
Use case
When benchmarking, one wants to get rid of sources of noise. Randomized hash functions are one such source of noise. The solution is to set a fixed seed—somehow.
Solutions
Custom logic (not good enough)
One alternative is to have each application or library implement its own custom logic for this. However, dependencies may also create HashMaps, so it would be unworkable to have everyone roll their own.
Also, as far as I can tell, Rust's default hasher doesn't even have an option to set a custom seed?
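For illustration, here is a minimal sketch of what the per-application approach might look like using only std. It assumes that DefaultHasher values created through Default all start from the same fixed state, which is how I read the std docs. The names FixedState and fixed_map are made up for this example. Note that there is still no seed parameter, and the map's type changes, so HashMaps created inside dependencies are not affected.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{BuildHasher, BuildHasherDefault};

// Hypothetical alias: builds DefaultHasher via Default instead of RandomState,
// so no per-process random keys are involved.
type FixedState = BuildHasherDefault<DefaultHasher>;

// Hypothetical helper: every map that wants deterministic hashing has to opt in
// like this, which is exactly what dependencies will not do for you.
fn fixed_map<K, V>() -> HashMap<K, V, FixedState> {
    HashMap::with_hasher(FixedState::default())
}

fn main() {
    let mut m = fixed_map();
    m.insert("x", 1);
    println!("{:?}", m.get("x"));

    // Two independently built hashers agree on the hash of the same key.
    let a = FixedState::default().hash_one("key");
    let b = FixedState::default().hash_one("key");
    println!("deterministic: {}", a == b);
}
```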
Environment variable (proposed solution)
If Rust's default hasher, and hopefully third-party hash libraries, checked for a well-known environment variable as their seed, it would be possible to set a single environment variable and have all libraries and dependencies use a fixed seed. Python has PYTHONHASHSEED, for example.
Rust could use RUST_HASHSEED, whose value would be a u64 written as a string, e.g. export RUST_HASHSEED=12345. Or, if the resulting security impact is a concern, RUST_INSECURE_HASHSEED, just to emphasize what you're getting into.
Different hashers have different seed requirements, of course. E.g. ahash takes four u64 values. But that's fine: the goal isn't to enable a production source of randomness, just to be able to set a fixed seed for benchmarking. Even if that only covers a subset of the seed space, it's not an issue.
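To make the proposal concrete, here is a rough sketch of how a hasher could honor such a variable. Everything in it is an assumption for illustration: nothing in std or any crate I know of reads RUST_HASHSEED today, the names EnvSeededState, SeededHasher, parse_seed, and fallback_seed are invented, and the hasher is a toy FNV-1a variant rather than std's actual algorithm. When the variable is missing or unparsable, the sketch falls back to a random seed borrowed from RandomState.

```rust
use std::collections::hash_map::RandomState;
use std::collections::HashMap;
use std::hash::{BuildHasher, Hasher};

// Hypothetical variable name from the proposal; nothing reads it today.
const SEED_VAR: &str = "RUST_HASHSEED";

// Parse the variable's value as a u64; None if absent or not a valid u64.
fn parse_seed(raw: Option<&str>) -> Option<u64> {
    raw.and_then(|s| s.trim().parse::<u64>().ok())
}

// Random fallback: hash nothing with std's randomly keyed RandomState.
fn fallback_seed() -> u64 {
    RandomState::new().build_hasher().finish()
}

// Toy FNV-1a style hasher with the seed mixed into the initial state.
// Illustration only; not suitable for untrusted input.
struct SeededHasher {
    state: u64,
}

impl Hasher for SeededHasher {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.state ^= u64::from(b);
            self.state = self.state.wrapping_mul(0x0000_0100_0000_01b3);
        }
    }

    fn finish(&self) -> u64 {
        self.state
    }
}

#[derive(Clone, Copy)]
struct EnvSeededState {
    seed: u64,
}

impl EnvSeededState {
    fn with_seed(seed: u64) -> Self {
        Self { seed }
    }

    // Fixed seed if the variable is set to a valid u64, random otherwise.
    fn from_env() -> Self {
        let raw = std::env::var(SEED_VAR).ok();
        Self::with_seed(parse_seed(raw.as_deref()).unwrap_or_else(fallback_seed))
    }
}

impl BuildHasher for EnvSeededState {
    type Hasher = SeededHasher;

    fn build_hasher(&self) -> SeededHasher {
        // FNV-1a 64-bit offset basis, XORed with the seed.
        SeededHasher { state: 0xcbf2_9ce4_8422_2325 ^ self.seed }
    }
}

fn main() {
    let mut map = HashMap::with_hasher(EnvSeededState::from_env());
    map.insert("answer", 42u32);
    println!("{:?}", map.get("answer"));

    let a = EnvSeededState::with_seed(12345).hash_one("key");
    let b = EnvSeededState::with_seed(12345).hash_one("key");
    println!("same seed, same hash: {}", a == b);
}
```

A hasher with a wider seed, like ahash with its four u64 values, could derive the extra values from the single u64 in whatever way it likes, since covering the whole seed space isn't the goal here.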
Other solutions?
Got to be more than just two, I assume; more ideas are welcome.