Currently, rustc spends 10% of no-opt time creating and destroying Substs structs. More specifically, Substs structs are interned, but they each contain 2 Vec-s, which have to be allocated and then freed when an already-interned Substs is re-interned, so I would like some way to intern Substs without allocating.

This could be done relatively easily with find_equiv: create a variant of Substs whose Vec-s are allocated on the stack, and look it up. If it isn’t found, create a real Substs and intern it.

However, the core team thinks find_equiv isn’t a good API for a hash-table. I think a good solution would be to add a HashInterner. I would like to use parts of std::collections::hash for it - so it should probably be in std.

An API would be something like

impl<'cx, T: Hash> HashInterner<'cx, T> {
    fn new() -> Self { /* .. */ }
    fn intern<K, F>(&mut self, k: K, f: F) -> &'cx T
        where K: PartialEq<T> + Hash,
              F: FnOnce(K) -> &'cx T
    { /* .. */ }


It sounds like you need …

dramatic pause

… Recycler! https://github.com/frankmcsherry/recycler

At least, I wrote it for you. Maybe it will help!


Do I get this right? You want to hash an object that does not yet exist?


I don’t want dead objects, I want live ones. I basically want some version of Hash Consing.


I can already hash such an object. I just want to check whether it exists within a hash table.


The intent here is that you could re-use the dead-but-memory-backed objects rather than repeatedly dropping and re-allocating them. It is a different solution to the problem that you actually have, that churning the allocator is expensive.


The dead-but-memory-backed objects here are just vectors of a bunch of different sizes.


As I understand your post, repeatedly acquiring and releasing new “vectors of a bunch of different sizes” from the allocator is what makes your code slow at the moment. If, rather than ask the allocator to hand you two new fresh vectors for your Substs, you take two empty but allocated vectors from a Recycler, use them exactly as if you had allocated fresh vectors, and return them to the Recycler rather than let them drop, your code may go faster.


Incidentally, this sort of situation is where an inline allocator could shine. Since Rust has sized deallocation, both allocation and deallocation could in theory be optimized based on object sizes known at compile time, and they could turn into a handful of instructions each in the fast path. (Especially if you had constant offsets from %fs/%gs, but I guess that’s just fantasy in code that can be called from C.)

Though I suppose even that would find it hard to compete with stack allocation.


what is an inline allocator?


Sorry, I wasn’t clear; I just mean an allocator whose code gets inlined - the fast path, anyway. Like some modern mallocs, it would probably give each size class its own free list (singly linked?), cached in thread local storage, so there wouldn’t be a need for free slot merging, synchronization, etc in that path; it would just be popping and pushing an entry from the appropriate TLS linked list.

I don’t know if anyone has tried this in C or C++; maybe there’s some dealbreaker I haven’t thought of.


Did you ever figure out a solution to this issue? I’m running into this exact issue.