Multi-core reference counting


#1

A long time ago (2010), a post was made to the rust-lang mailing list on the topic of immutable data structures and reference counting. In particular, the author points out that naïve reference counting is quite expensive in a multi-core setup (due to the large number of cache invalidations), and that there are many known techniques for significantly reducing this cost:

The cost of reference counting in a multi-core scenario is a concern. Rust already limits the number of reference operations using aliases, which presumably gets rid of a lot of the cost. Is Rust wedded to the idea of having accrate refcounts at all times? As far as I can tell, modern ref-counting algorithms seem to be about on par with modern GC algorithms for performance (just barely) even in the context of multithreading, but they achieve this feat through deferred ref-counting and things like that. Should Rust not do that instead? I kind of think refcounts may be the way forward in the future, because the cost of GC’ing in a ref-count system is proportional to the amount of actual garbage, rather than being proportional to the size of the heap, but it seems like the consensus on this issue in the literature is that refcounts are only performant when they’re not being constantly kept up to date. Am I wrong?

Has any consideration been given to whether Rust should incorporate some more clever strategies for reference counting? I searched both RFCs and issues, but couldn’t find any reference to it? This is of particular important to read-mostly data structures in which readers are constantly accessing objects, but where those objects are occasionally modified or deleted. If readers are required to always modify the global refcount atomically, this kills multi-core scalability as the cache coherency traffic becomes the bottleneck.


#2

Move semantics give our reference counting some smarts: simply passing an Arc/Rc around doesn’t touch the counts at all, that only happens for an explicit clone call. Combined with lightweight/non-owning references, this means that the reference counts are usually touched quite rarely.


#3

Ah, but consider something like a hash map where you have many concurrent readers. They all need to clone the entry in the map to ensure that a writer doesn’t free the memory underneath them.


#4

The good thing with reference counting is that multiple implementations of it can interoperate. High-performance multithreading primitives are currently beyond the scope of std - we would prefer that domain experts iterate on a good library and abstractions that mesh well with Rust.

Also, Rust reference counts aren’t particularly automatic or pervasive - with probably not more operations than say the Linux kernel, which does use standard atomic RC primitives. Of course, there are probably workloads for each multithreaded data-structure - implementations are surely welcome!