I would love to help with development of crossbeam.
Incidentally, because crossbeam was apparently dead, I’ve recently started working on a crossbeam-like library that implements epoch-based memory reclamation, and another library with actual concurrent data structures I intend to publish within the next few weeks or so (still messy code, but contains a queue, deque, and skiplist so far). So sorry that I can’t show more at this moment, but more is to come, I promise! 
This is not a new project - in fact, I’ve been thinking, reading papers, and experimenting with different ideas for a year now. Many ideas and lines of code were scratched and thrown in the trash can, and I’ve started from scratch many times. Only now I’ve started pouring some code with confidence that these ideas will work. 
It’s really hard to design lock-free structures with reasonable APIs in a language without garbage collection. Crossbeam still has just a queue, deque, and stack. These are simple structures, and it is no wonder that we still don’t have any more advanced structures.
First, I’d like to emphasize that crossbeam is a revolutionary library in the same way Rust is revolutionary as a language. Crossbeam collects bits and pieces of many ideas that live in the research world and brings them into a nice package for all Rust users. You can use crossbeam without being an expert at lock-free programming! And that’s amazing.
The road ahead is hard. Really hard. Just like Rust was frequently changing during it’s infancy, crossbeam will as well. A testament to this is a simple fact that no library like crossbeam exists. Libcds is the best library (written in C++) with similar goals, but it’s riddled with bugs and unfriendly to non-experts. I did a lot of digging and have seen pretty much all other similar libraries on the internet. There are some popular ones like Intel’s TBB and Facebook’s Folly, but those don’t provide lock-freedom, and even if they do, then don’t support “remove” operations.
Java has some concurrency primitives in it’s standard library: for example, real lock-free ordered maps! No other language ever achieved that - it’s been a decade and we’re still waiting… Java is the king of lock-freedom. That means literature and papers often use Java as pseudocode or assume Java’s garbage collection fairies. Can Rust steal Java’s throne? We’ll see.
Crossbeam is in a unique position where it can leverage borrowing, lifetimes, and Rust’s type system to design easy APIs that don’t compromise performance nor correctness. Rust already opened a new chapter in concurrency (no data races! rayon!) and I truly believe we’re about to venture even farther with crossbeam. So excited about that!
I’d like to briefly present where my library is headed, and would like to hear what others think. I’m sure many will disagree with some points, but this is the design I’ve arrived at so far, after a lot of experimentation… Of course, some points may change as I gain more experience.
Epoch-based memory reclamation
Crossbeam uses a textbook implementation of EBR. Pinning itself takes 27 nanoseconds, but could faster: my library takes 20 nanoseconds, and is inspired by the scheme described in one of Microsoft’s papers on Deuteronomy used in SQL Server.
Crossbeam stores garbage in three vectors that rotate as the epoch advances. My library instead keeps thread-local bags of garbage (bounded by 16 KB) and pushes them into a global garbage queue when they get filled up. All bags are tagged with the epoch when they got filled. This makes incremental garbage collection easier, so GC pauses are small and bounded!
Crossbeam’s API is a bit awkward. There’s lots of Option<Shared<T>> and Option<Owned<T>>. My library uses Ptr<T> and Box<T> instead.
An important design choice I made is that EBR must not concern itself with object destruction, but memory reclamation only. This constraint makes certain things easier, like watching out for the amount of leaking memory. After time I came to realize that this is not a severely limiting constraint.
The philosophy is that users of concurrent data structures generally shouldn’t have to worry about or even be aware of pinning. If you’re wondering “but how would you implement HashMap::get or iterators, then?”, the answer is that you can combine EBR with reference counting, if you’re willing to accept a (reasonably!) small performance cost. Dead-easy API must be the default, with optional slightly higher-performing APIs for advanced users. In other words, @ticki’s Pinned<T> falls into the “advanced users” category in my book. Anyways, I could expand on this some other time.
Stack
Stack is a pretty useless data structure (high contention!), but it can be included as the hello-world of lock-free programming, or for the few use cases where it actually comes handy. Crossbeam’s TreiberStack should be extended with method pop() that parks threads.
Queue
Crossbeam already has a good MsQueue implementation. I’d probably ditch SegQueue because it has really subtle tradeoffs. Very few people can explain them.
There are some things to polish, e.g. the queue holds the guard even when it parks the thread, which can be catastrophic and may prevent garbage collection forever. But that’s fixable and no big deal.
Deque
Crossbeam’s ChaseLev works, but is still noticeably slower than the deque crate that doesn’t use crossbeam. This is a problem for rayon and is the reason why it still didn’t switch to crossbeam. The problem can be mostly alleviated by combining fences that come from pin() with SeqCst fences in the algorithm. Also, smarter EBR scheme with smaller overhead helps quite a bit.
I’ve been experimenting with this during the past few days and am confident that the problem is solvable (yay for rayon!).
Ordered maps and sets
The real challenges begin here.
The usual solution here are skiplists. I’ve implemented a lock-free skiplist in Rust. It easily beats ConcurrentSkipListMap in Java in performance, so that’s great
It’s also easy to use, users don’t have to care about pinnings, and everything magically works! Very satisfied with them, except they’re not the fastest data structure in the world.
Some kind of B-tree would be faster. Microsoft Research invented Bw-trees in 2013. Bw-tree is the state-of-the-art ordered lock-free map. According to my benchmarks, Bw-trees absolutely destroy Java’s skip lists. It’s no walk in the park though, but it’s implementable with a lot of effort. Designing a reasonable API that can be exposed to users is really tough (unsolved problem so far) and presents further challenges in implementation. That said, I believe I have a solution and that we’ll have nice easy-to-use Bw-trees in Rust.
Why do we need both skiplists and Bw-trees? Because the former has very flexible API and the latter has a bit more restrictive API (e.g. K: Clone, V: Clone), but better performance. All this may sound scary, but you can think about those data structures as: “both are just maps, except the latter one is faster and has bounds on types”.
Hash maps and sets
This is the area I’ve mostly just read about but didn’t experiment too much. That said, there are several directions to explore. It’s possible that we’d have several kinds of hash maps with different bounds, just like with ordered maps.
- Striped hash map - shard the map into several regions each protected by a
RwLock.
- Feldman’s hash map - tree-like structure.
- Cliff Click’s hash map & junction - uses probing, very promising!
- Split-ordered lists (backed by lock-free linked lists) - a very common and proven solution.
- Cuckoo hash maps with some locking - see libcuckoo.
I’d like to explore all directions and see which ones work best - this is something that needs a lot of experimentation.
Conclusion
I hope all this was somewhat illuminating. Not sure whether you like the plan for my library nor whether it aligns with crossbeam’s goals, so let me know. When I suggested that I’d like to experiment with it until the ideas I’ve described get proven to work, @steveklabnik, @aturon, and @ticki expressed concerns about splitting the ecosystem and not working on crossbeam directly.
That’s fair. My only concern is that the community might be having slightly unrealistic expectations or undermining the difficulties that lie ahead. Crossbeam (with non-trivial data structures) is still far from stabilization and needs to try out a lot of wrong ideas until it finds the right ones. I’m not sure whether such a widely used crate as crossbeam is a good fast-moving testing ground. There might be a lot of friction, but I might also be wrong. 
In any case, I’d be excited to help in any way possible.