There have been times when I wanted HashMap/HashSet in some code which only has alloc to work with, but the default BuildHasher implementation being std-only prevents that from working.
What stops us from just moving the type definition to alloc and removing the default type parameter there, and providing the default again via a type alias in std? I think it's just methods/trait implementations which explicitly call out the RandomState argument (which only exists in std). These are:
new/with_capacity: I see no reason why we can't allow that for any S: Default instead, and this would allow new to be const once const traits happen (this also works for the with-allocator equivalents).
I think this shouldn't break existing code type inference since the existing (soon to be re-export) std name for it still has the default argument.
From<[(K, V); N]>: This can again be made to use Default like above, but the possibility of .into() into a place which might be generic and relying on it for type inference could cause type inference failures. Definitely at least worth a crater run before committing anything.
So I think it might be possible? I'm not an expert on how type inference works so there might be wider fallout than I'm thinking of.
But hold on, how is the compiler inferring S in this snippet?
let mut my_map = HashMap::new();
my_map.insert("hello", "there");
println!("{:?}", my_map);
What possibly determines it? Well, we can find out the hard way by using HashMap::default instead:
let mut my_map = HashMap::default();
my_map.insert("hello", "there");
println!("{:?}", my_map);
error[E0282]: type annotations needed for `HashMap<&str, &str, S>`
--> src/main.rs:4:22
|
4 | let mut my_map = HashMap::default();
| ---------- ^^^^^^^ cannot infer type for type parameter `S` declared on the struct >`HashMap`
| |
| consider giving `my_map` the explicit type `HashMap<_, _, S>`, where the type parameter `S` is specified
YUP! HashMap::new hardcoding S = RandomState is load bearing. Any code that creates a short-lived HashMap as part of some algorithm would run into this horrible error about how S is unknown!
An alias is too transparent to help in the expression case on its own.
Well then, RIP my aspirations of this being a quick and easy fix (I think this sort of inference breakage is technically allowed under semver as a minor version bump, but I expect it'd break too much existing code to be worth it).
Is this an area where people are trying to improve the language such that I can wait and this eventually becomes possible? Or is inference never going to be able to handle this change?
Having stuff defined only when std or alloc is available already has some problematic behavior. Having stuff defined differently depending on whether you have access to std or not sounds like it would probably only cause more problems.
To be fair, the fact that every no_std crate using maps via alloc needs to use hashbrown is kinda weird. Something that fundamental should be in alloc.
We do not even have core::io or alloc::io yet, despite there being massive numbers of crates that depend upon external projects that supply io::{Read,Write} outside std.
Anything that touches randomness sounds vastly more complex. lol
We could've some uninhabited type like
pub enum RandomState { };
or alternatively
pub type DefaultHasher = !;
impl BuildHasher for RandomState {
type Hasher = !;
fn build_hasher(&self) -> ! {
panic!("Instantiated HashMap outside std!")
}
}
Yet, these merely force you into declaring your randomness source, so the code would be less compatible than simply doing pub type HashMap = getrandom_hashmap::HashMap or whatever. Not great
If I read correctly, getrandom has 14 different target_os options and another 12 features, including many options only accessible through RUSTFLAGs to avoid cargo feature additivity. I doubt std ever supports all those options.
Also, we could even migrate the ecosystem away from std::collections::HashMap and towards some semi-blessed probabilistic data structures project that always uses getrandom everywhere, so morally shrink std but not really.
That situation already describes alloc as-is. If you depend on it there must be a GlobalAlloc instance somewhere but alloc itself will not (strongly) provide that symbol on a per-target basis. Instead, there's a declared GlobalAllocator singleton-type with the interface that's relaying calls to some user-defined symbol annotated with a macro, which must appear somewhere in the final crate graph.
There's a compatibility problem, we can't outright add the requirement to alloc without a fallback but fallbacks are the problem, but the concept itself is established. If that were a normal crate, I'd say to put the data structures under a new feature flag..
I don't know why you're focusing so much on randomness when nothing in my post requires any random input.
In std's current HashMap, the RandomState default implementation requires randomness, but I can provide my own argument that does whatever I want, which may or may not involve randomness.
My proposal was that code with only alloc would be able to use HashMaps but would have to provide an explicit BuildHasher. This could pull in randomness from some source, or depending on the code it might not need randomness at all if the keys are trusted input data.
And as a bonus, exposing HashTable would also open up some usecases that people often complain about, like querying a HashMap<(String, String), T> using a (&str, &str), or using a type as key that requires outside context for hashing/equality.