Extend HashSet with `remove_by_hash` & `find_by_hash`

Given the following struct:

pub struct WebsocketConnection {
   session_id: usize,
   tx: UnboundedSender<Message>,
   user: Option<User>,
   // More fields
}

impl Hash for WebsocketConnection {
    fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
        self.session_id.hash(state);
    }
}

Since the one field being used for creating the hash is the session_id it would be nice if I can retrieve the WebsocketConnection from a HashSet only by the session_id. The current workaround if you only have the session_id is to create a temporary WebsocketConnection and then pass that to remove/get/contains which seems kinda annoying. Or instead of using a HashSet one could use HashMap<session_id, WebsocketConnection>

C++ has a similar feature: std::unordered_set<Key,Hash,KeyEqual,Allocator>::find - cppreference.com

I am wondering if HashSet should be extended with function(s) that allow finding/removing values from a HashSet by its hash or maybe instead allow to find/remove by anything that implements the trait Hash. That way one could write a custom wrapper type for session_id and only pass that to remove/find an element.

Best regards, Dario

You can do this today! That’s what the Borrow trait is about in the signature of all those HashSet functions.

Oh I must have misread the the type signature then, let me retake a look. Thanks for pointing that out! :slight_smile:

Demo:

use std::hash::Hash;

#[derive(Eq, Hash, PartialEq)]
struct SessionId(pub usize);

pub struct WebsocketConnection {
   session_id: SessionId,
   other_stuff: String,
}

impl PartialEq for WebsocketConnection {
    fn eq(&self, other: &Self) -> bool {
        self.session_id == other.session_id
    }
}
impl Eq for WebsocketConnection {}
impl Hash for WebsocketConnection {
    fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
        self.session_id.hash(state);
    }
}

impl std::borrow::Borrow<SessionId> for WebsocketConnection {
    fn borrow(&self) -> &SessionId {
        &self.session_id
    }
}

#[test]
fn borrowing() {
    let set = std::collections::HashSet::from([
        WebsocketConnection {
            session_id: SessionId(10),
            other_stuff: String::from("whatever"),
        }
    ]);
    
    assert!(set.get(&SessionId(10)).is_some());
}

The SessionId type is just for a little extra type-safety: it would work the same with Borrow<usize>.

By the way, I would personally choose the HashMap instead, unless I made sure it was impossible to construct an unrelated WebsocketConnection with duplicate session ID (e.g. via private static atomic counter for assigning IDs). It can be extremely confusing to have values which Eq + Hash the same but are functionally different.

1 Like

Ah perfect, now I get it. Thanks for providing an example!

By the way, I would personally choose the HashMap instead, unless I made sure it was impossible to construct an unrelated WebsocketConnection with duplicate session ID (e.g. via private static atomic counter for assigning IDs). It can be extremely confusing to have values which Eq + Hash the same but are functionally different.

I agree. It feels like an abuse of Eq and Hash to implement them like this.

That said, the situation OP is in is one I've seen come up a lot: You have a table of values indexed by some key but the key is stored in the value. You can just clone the key and use a HashMap, but that wastes space and makes it possible to represent invalid states where the two keys don't match. It would be better if there was a separate type for this. You can implement that type using a HashSet of a private type which has those questionable Eq + Hash impls:

pub struct OwnedKeyHashMap<K, V>
where
    V: Borrow<K>,
{
    set: HashSet<Field<K, V>>,
}

struct Field<K, V>
where
    V: Borrow<K>,
{
    _ph: PhantomData<K>,
    value: V,
}

impl<K, V> Hash for Field<K, V>
where
    V: Borrow<K>,
    K: Hash,
{
    fn hash<H: Hasher>(&self, hasher: &mut H) {
        self.value.borrow().hash(hasher)
    }
}

impl<K, V> PartialEq for Field<K, V>
where
    V: Borrow<K>,
    K: PartialEq,
{
    pub fn eq(&self, other: &Field<K, V>) -> bool {
        self.borrow() == other.borrow()
    }
}

impl<K, V> Eq for Field<K, V>
where
    V: Borrow<K>,
    K: Eq,
{}

impl<K, V> Borrow<K> for Field<K, V>
where
    V: Borrow<K>,
{
    fn borrow(&self) -> &K {
        self.value.borrow()
    }
}

impl OwnedKeyHashMap<K, V>
where
    V: Borrow<K>,
    K: Eq + Hash,
{
    pub fn new() -> OwnedKeyHashMap<K, V> {
        OwnedKeyHashMap {
            set: HashSet::new(),
        }
    }

    pub fn get(key: &K) -> Option<&V> {
        Some(&self.set.get(key)?.value)
    }

    pub fn insert(&mut self, value: V) -> Option<V> {
        let field = Field {
            value,
            _ph: PhantomData,
        };
        let prev_field = self.set.insert(field)?;
        Some(prev_field.value)
    }
}

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.