The Entry API provides a way for a user of a map to hook into the search/insertion/removal logic and avoid unneccessary work. Specifically, it enables you to perform conditional logic based on whether the key is already present in the map or not. Generally, this is done by capturing the state of the insertion algorithm after it identifies whether the given key is present or not.
However there are two competing initialization scenarios for a consumer of this API:
- I have a by-value key ready, and I don’t need it anymore. I would like to feed it directly into the Entry to avoid any further clones.
- I have a by-ref key ready, and I don’t want to clone it unless an insertion is actually necessary.
Respectively, these APIs want:
- by-val:
fn entry(key: K) -> Entry<K, V>
- by-ref:
fn entry<Q: ?Sized>(query: &Q) -> Entry<Q, K, V> where K: Borrow<Q>
And each is suboptimal for the other’s usecase.
- If I have a by-val key and the by-ref API, I’m forced to pass in the key by-ref, causing the
API to needlessly
clone
the key if an insertion occurs. - If I have a by-ref key and the by-val API, I’m forced to pass in the key by-val, causing me to
needlessly
clone
the key if an insertion doesn’t occur.
Initially, the API was by-val simply because it pre-existed std::borrow
. After std::borrow
came to be, it was decided that the by-val-must-clone-on-insert API was the “least bad” one from a
performance stand-point, on the premise that anyone using the Entry API expects lots of
duplicate keys. However it was discovered that certain Map instances couldn’t work with
the by-ref API, and the change was reverted.
This was not-great but generally hand-waved as ok because there was nothing preventing a hypothetical unification of the interfaces with the right traits and blanket impls.
Entry API 4.0 is supposed to be that API:
The basic idea is simple: accept IntoCow
, which generalizes over owned and borrowed data, with
the ability to upgrade borrowed data to Owned. Unfortunately, this doesn’t work, because IntoCow
lacks the necessary blanket impls (and it’s not clear that this would be possible with
coherence).
Instead, I considered adding a new method, entry_lazy
which takes borrowed data. We then unify
these two APIs through Cow, providing a uniform API and implementation with two trivial
entry-points. The actual changes to the implementation are completely trivial: just add call borrow
in set up
and into_owned
in VacantEntry::insert
.
Unfortunately this has a negative side-effect: the value in the entry
method needs to be feed-able to a Cow.
This means we need to be able to declare a type for the value to Borrow to. Naively this is fairly easy: K is Borrow<K>
for all K. However Cow also requires that the Borrow is ToOwned<Owned=K>
, which only
is true when K: Clone
. The end result is that we have to specify that K is Clone, even though this is
specifically the entry method which will never call Clone.
In a fever dream I produced a glorious hack to solve this. Just add this to the std::borrow:
// A type that cannot be realized, for making Cows that are
// statically guaranteed to be Owned
pub struct FakeBorrow<T>(PhantomData<T>);
impl<T> Borrow<FakeBorrow<T>> for T {
fn borrow(&self) -> &FakeBorrow<T> { panic!("Noooooo") }
}
impl<T> ToOwned for FakeBorrow<T> {
type Owned=T;
fn to_owned(self) -> T { panic!("Nooooooo") }
}
Now entry
just makes a Cow<'static, FakeBorrow<K>>
, and everyone’s happy. Right???
Unfortunately this is totally busted as soon as the code tries to call moo.borrow()
to search the structure.
The exact Entry API as written has this as a non-problem, though! Simply directly match on the Cow enum
an instead of borrow
ing in the Owned case, just take a reference to it! This creates some nasty logic and monomorphization duplication as the Borrow case works with a different value from the Owned case, but
in principle it should work (haven’t tested this design out fully).
It may have a bad interaction with future changes to the Entry API that expose the key
argument for richer
delayed computation of the value, though. Also obviously FakeBorrow is a big huge hack.
Alternatively we can take entry_lazy to its logical extreme and just make a duplicate parallel Entry API with all the same interface but different types.
I dunno, I’ve been bashing my head against this too much.