Well, the key difference is that RCU in the kernel eagerly destroys garbage, while crossbeam stashes it away for later reclamation.
The problem with delayed object destruction is two-fold:
- If we say that objects get destructed “sometime”, then we need to place bounds like T: 'static on those objects. That means data structure bounds would be littered with 'static. To avoid those bounds, we may decide that objects with destructors (those that implement Drop) go into data-structure-local garbage rather than the thread-local garbage. That way we could simply destroy all remaining garbage a data structure has produced when it itself gets dropped (no more 'static bounds!).
- The amount of produced-but-unreclaimed garbage is not controlled. Who knows how much garbage is actually waiting to be reclaimed? We may have a String there, but who knows how long it is - maybe a few bytes, maybe hundreds of megabytes. Or perhaps it’s not even a String, but a user-defined type that owns heap-allocated memory. Moreover, those destructors may execute long-running code, which in turn may delay memory reclamation for other data structures and cause unpredictable performance in terms of both memory and runtime.
Crossbeam can easily build up garbage lists of 100K objects. All those are still waiting to be reclaimed. That may be a lot of memory, a lot of Drop code to be run, and a lot of other owned heap-allocated memory, too. This can be a real problem - it’s just the nature of EBR.
So, a solution to both problems could be the decision that we’re not going to deal with object destruction at all. I like to think of crossbeam as a very primitive and robust layer - merely a supplement to jemalloc that delays calls to free(). Everything else can then be built on top - even delayed object destruction for the cases that need it.
Delayed object destruction is a tricky matter. In my opinion, it’s something that shouldn’t pervade the whole framework nor get in the way of data structures that don’t need it. There are many subtleties users would need to be aware of. It’s best offloaded to the particular data structures that need it.
I propose two solutions for delayed object destruction:
- Combine EBR with some reference counting. Like @aturon said, this has more to do with the data structure API. With skiplists it works really well. There’s a small performance overhead, but it gets dwarfed by the cache misses caused by tower traversal, so we’re OK.
What I like about this is that users can easily understand all performance intricacies: objects eagerly get destroyed as soon as possible, just as if everything was reference counted in the first place!
- Stash objects (or maybe even just their destructors?) away in data-structure-local garbage lists. Execute destructors from time to time as you perform other operations on the structure. If the structure gets dropped, execute all remaining destructors. What I like about this is that we don’t need 'static bounds and that other data structures don’t need to execute some other structure’s object destructors. Sharing the job of calling free() among everyone using crossbeam is OK because it’s cheap and predictable, but sharing destructor execution is not OK because it may be expensive.
To sum up, I’d like crossbeam to provide solid (and not overly ambitious!) fertile ground for concurrent data structures. Object destruction is (IMHO) a can of worms best left to data structures rather than to crossbeam itself. I invite everyone interested to read the code I’ve written so far here: https://github.com/stjepang/epoch. It’s a very thin, fast, and robust EBR layer with the following design choices:
- Is not concerned with object destruction, only memory reclamation.
- Incremental garbage collection (guarantees shorter pauses).
- Pinning is designed to be cheap and used liberally (very important for rayon!)
- Keeps the amount of stashed garbage under control (it always knows the exact amount, down to the byte).
- Thread-local garbage is limited to 16 KB and spills into a global queue when it fills up.
- There is a dedicated global queue for really big objects (e.g. arrays that back hash tables or deques).
- Big objects get reclaimed urgently (e.g. this matters right after a hash table resize!).
Note that design choice #1 is the important one: it makes the others possible or easier to achieve.
Crossbeam’s EBR comes with the following problems:
- Possibly long pauses when emptying large lists of garbage or resizing vectors.
- Threads may hoard really large lists of garbage instead of distributing the work by sharing them with others.
- Suboptimal thread pinning.