Pre-RFC: bless the `type FastHashMap` pattern


#41

AttackableHashMap is a short non-abbreviation that clearly captures the problem.

I think VulnerableHashMap and InsecueHashMap sound fine too. And the abbreviations NonCRHashMap works too.

WeakerHashMap works but communicates less. Fast is wrong word because collision resistance keeps the map fastish under DoS. Unsafe is wrong word because unsafe refers to memory unsafety.


#42

IMHO, NonCRHashMap is very nice. It is short enough and still makes you wonder (go to the docs) what does that ‘Non’ prefix means in terms of missing functionality, and why would you choose something that lacks such property?


#43

It’d be great to bless the use of FxHashMap. I’d suggest deprecating fn new(), replacing with fn new_secure() & fn new_fast(). Along with type aliases SecureHashMap & FastHashMap.

This way you state a preference and the constructor names clearly state the intended benefit. It answers the question “why would I not use ‘fast’?” - because it isn’t as secure, “why would I not use ‘secure’?” - because it isn’t as fast.

Or just use HashMap::default() if you don’t care.


#44

We are absolutely not deprecating new.


#45

Well the idea still works even keeping new. Basically having secure alongside fast in autocomplete, each suggests the reason for the other.

I’d still mark new for removal in the next edition, but it isn’t the crux of the argument.


#46

Anytime you write HashMap<K,V> you actually get HashMap<K,V,RandomState>, which then assures collision resistance in the type system. We’d need some new AttackableState: BuildHasher, which works exactly like RandomState but uses a faster hasher, like:

pub struct AttackableState(RandomState);
impl BuildHasher for AttackableState {
    type Hasher = AttackableHasher;
    fn build_hasher(&self) -> AttackableHasher {
        AttackableHasher(FxHasher::new_with_keys(self.0.k0, self.0.k1))
    }
}

type AttackableHashMap<K,V> = HashMap<K,V,AttackableState>;
type AttackableHashSet<K> = HashSet<K,AttackableState>;

or whatever color gets selected.

If I understand, you’re proposing to distinguish between these types with inherent methods, like:

impl HashMap<K,V,RandomState> {
    pub new_secure() -> HashMap<K,V,AttackableState> { HashMap::new() }
}
impl HashMap<K,V,AttackableState> {
    pub new_attackable() -> HashMap<K,V,AttackableState> { HashMap::new() }
}

#47

I like WeakHashMap, it is short enough, has clear negative connotation, can be explained as “weak hash” + “map”, and re-uses Java name, so it will be immediately familiar to some. I don’t think that similarity to WeakMap from JS is a big issue, and it certainly does not warrant long unwieldy names.


#48

WeakHashMap in Java the weakness refers to the garbage collectability of the keys. In rust “weak” is used the same way in Rc. So it isn’t ideal.


#49

I’m strongly of the opinion that any name involving “weak” is a non-starter here, because “weak” already refers to weak ownership in C++, Java, C#, Haskell, Kotlin, JavaScript, and Rust itself (and apparently D doesn’t have this by any name). Javascript was merely following existing usage when they picked that name.


#50

Yes, collision with rc::Weak will be unfortunate, but I think “weak” is a too common word for reserving it just for this use-case. Other alternative could be VulnHashMap, it has the same length, and reading “vuln” as “vulnarable” shouldn’t be an issue. We could use more explicit VulnarableHashMap, but it’s a bit too unwieldy for my taste.


#51

As another example, in dotnet you almost always want SemaphoreSlim, not Semaphore, as the latter is actually an inter-process OS-level thing. I would absolutely expect that FastHashMap is what one should use by default, and the HashMap some legacy thing (perhaps because of API differences).

This is especially true as the edition release is likely to re-emphasize that we have a migration story for language features, but not for the library, so new things getting a name that implies you should use them over the other is absolutely something I’d expect for standard library evolution.


#52

LinearHashMap?

It’s obviously faster if you only have 3 elements. The Hasher should just be return 0.


#53

Maybe something like SimpleHashMap? Implying that “simple” code is probably faster, but may lead folks to investigate the drawbacks (collision resistance).


#54

Nope, simpler is usually better for usability and that alone makes it preferred.


#55

Other synonyms along that line are possible too, like NaiveHashMap.


#56

That sounds like it’d be slower.


#57

Well, if there are many collisions, it will be slower.


#58

That choice also raises the problem of the diaeresis that should be over the i of Naïve to indicate that the ai is not a diphthong.


#59

Use Naiive instead?


#60

Plain “naive” is the more common English spelling, but if you think that’s troublesome, we can keep looking…