Everything is a trade-off.
Indeed. I’ve not tried the alternatives, so it’s all theory on my part. (I feel this is sidetracking things a bit, so maybe we should move this to another thread.)
Lookup hash probing is very small
Because you only test lookups at extremely low occupancies: 52% and 66%. The occupancy gets as high as ~90%. If you test with sizes 1.9m and 7.6m, you’ll see that the worst case gets much higher. The average is still only ~6, but it’s a long-tail distribution.
Histogram hist_h: Min: Ok(1) Avg: Ok(6) Max: Ok(52) StdDev: Some(6) Median: Ok(4) 95pct: Ok(16) 99pct: Ok(25)
Lookups are the strongest point of Robin Hood, though, and a probe length of 12 being within one standard deviation isn’t particularly worrying.
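For concreteness, the measurement looks roughly like the sketch below: a stripped-down Robin Hood table filled to ~90% occupancy, recording the probe length of every lookup. The names (`RhTable`, `probe_len`, …) are made up for illustration; this is not the actual code under discussion, just the shape of it.

```rust
// Minimal Robin Hood table, instrumented for probe-length measurements.
// Illustrative sketch only; not the real implementation.
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hash, Hasher};

#[derive(Clone)]
struct Slot { hash: u64, key: u64, occupied: bool }

struct RhTable { slots: Vec<Slot>, mask: usize, hasher: RandomState }

impl RhTable {
    fn with_capacity_pow2(cap: usize) -> Self {
        assert!(cap.is_power_of_two());
        RhTable {
            slots: vec![Slot { hash: 0, key: 0, occupied: false }; cap],
            mask: cap - 1,
            hasher: RandomState::new(),
        }
    }

    fn hash(&self, key: u64) -> u64 {
        let mut h = self.hasher.build_hasher();
        key.hash(&mut h);
        h.finish()
    }

    // Distance of the entry sitting in `slot` from its home bucket.
    fn displacement(&self, hash: u64, slot: usize) -> usize {
        slot.wrapping_sub(hash as usize) & self.mask
    }

    // Robin Hood insert; returns the number of slots visited, as a rough
    // proxy for the insert-displacement metric discussed in this thread.
    fn insert(&mut self, key: u64) -> usize {
        let (mut hash, mut key) = (self.hash(key), key);
        let mut pos = hash as usize & self.mask;
        let (mut dist, mut visited) = (0usize, 0usize);
        loop {
            visited += 1;
            if !self.slots[pos].occupied {
                self.slots[pos] = Slot { hash, key, occupied: true };
                return visited;
            }
            let existing = self.displacement(self.slots[pos].hash, pos);
            if existing < dist {
                // Evict the closer-to-home entry and carry it forward.
                std::mem::swap(&mut hash, &mut self.slots[pos].hash);
                std::mem::swap(&mut key, &mut self.slots[pos].key);
                dist = existing;
            }
            pos = (pos + 1) & self.mask;
            dist += 1;
        }
    }

    // Lookup, returning how many slots were probed.
    fn probe_len(&self, key: u64) -> usize {
        let hash = self.hash(key);
        let mut pos = hash as usize & self.mask;
        let mut probes = 0usize;
        loop {
            let slot = &self.slots[pos];
            probes += 1;
            // Robin Hood invariant: once we pass an entry that is closer to
            // its home than we are to ours, the key cannot be further along.
            if !slot.occupied || self.displacement(slot.hash, pos) + 1 < probes {
                return probes;
            }
            if slot.hash == hash && slot.key == key {
                return probes;
            }
            pos = (pos + 1) & self.mask;
        }
    }
}

fn main() {
    // 1.9m keys in a 2^21-slot table ≈ 90% occupancy.
    let mut table = RhTable::with_capacity_pow2(1 << 21);
    for k in 0..1_900_000u64 { table.insert(k); }
    let lens: Vec<usize> = (0..1_900_000u64).map(|k| table.probe_len(k)).collect();
    let avg = lens.iter().sum::<usize>() as f64 / lens.len() as f64;
    println!("avg probe length {:.1}, max {}", avg, lens.iter().max().unwrap());
}
```

The exact numbers depend on the hash function and key set, but the long-tail shape of the histogram above shows up at this load factor.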
The insert displacements had big outliers (max 735) but were very controlled (5±15) for the 10M.
The average is heavily dragged down by the fact that your benchmark measures inserts over the full range of occupancies. Measuring only over the higher occupancies, I get numbers more like:
1.5m → 1.9m
Histogram hist_d: Min: Ok(1) Avg: Ok(11) Max: Ok(767) StdDev: Some(25) Median: Ok(2) 95pct: Ok(48) 99pct: Ok(118)
7.0m → 7.6m
Histogram hist_d: Min: Ok(1) Avg: Ok(19) Max: Ok(1073) StdDev: Some(36) Median: Ok(5) 95pct: Ok(81) 99pct: Ok(169)
19±36 no longer seems very controlled.
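Concretely, that window measurement is just the hypothetical `RhTable` sketch from above again, keeping only the inserts between 1.5m and 1.9m (≈72% → ≈90% full for a 2^21-slot table):

```rust
// Reuses the illustrative RhTable sketch above; records only the probe
// counts from the 1.5m → 1.9m insert window.
fn main() {
    let mut table = RhTable::with_capacity_pow2(1 << 21);
    for k in 0..1_500_000u64 { table.insert(k); }   // warm up to ~72% occupancy
    let window: Vec<usize> =
        (1_500_000u64..1_900_000).map(|k| table.insert(k)).collect();
    let avg = window.iter().sum::<usize>() as f64 / window.len() as f64;
    println!("high-occupancy inserts: avg {:.1}, max {}",
             avg, window.iter().max().unwrap());
}
```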
key probing didn’t go above 1
Indeed; key probing only happens on a full 64-bit collision, which is vanishingly unlikely.
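A quick birthday-bound estimate (assuming well-distributed 64-bit hashes) shows just how unlikely:

```rust
fn main() {
    // Birthday bound: expected number of full 64-bit hash collisions among
    // n distinct keys is roughly n*(n-1) / 2^65.
    let n: f64 = 10_000_000.0;
    let expected = n * (n - 1.0) / 2f64.powi(65);
    // ≈ 2.7e-6, i.e. far less than one expected collision among 10M keys.
    println!("expected 64-bit collisions among 10M keys: {:.2e}", expected);
}
```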
very low variance on these metrics is the raison d’être for the whole algorithm
The original paper actually uses double hashing, FWIW.
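For contrast with the linear probing used here: in double hashing the probe step is derived from a second hash, so keys that collide on the first hash still fan out along different sequences. A toy sketch (names and constants are illustrative, not from the paper):

```rust
// Double-hashing probe sequence over a power-of-two table.
fn probe_sequence(h1: u64, h2: u64, capacity: usize, max_probes: usize) -> Vec<usize> {
    // Force the step to be odd so it is coprime with the power-of-two
    // capacity, guaranteeing the sequence eventually visits every slot.
    let step = (h2 as usize) | 1;
    (0..max_probes)
        .map(|i| (h1 as usize).wrapping_add(i.wrapping_mul(step)) & (capacity - 1))
        .collect()
}

fn main() {
    // Two keys with the same h1 but different h2 diverge after the first slot.
    println!("{:?}", probe_sequence(42, 7, 1 << 4, 5));
    println!("{:?}", probe_sequence(42, 9, 1 << 4, 5));
}
```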