Shifgrethor: garbage collection as a Rust library

Very interesting. I’ll take a dive into the API and the code before commenting further…

Do you see the utility of Shifgrethor as an arena for graph data structures, or also as a building block for programming language implementation in Rust?

I think if you’re implementing a language in Rust you probably don’t need the reference flexibility, which gives you a lot more freedom in some regards.

What a cool post! I’d never thought GC could be something “desirable” when you already have a more performant, automatic memory model like Rust’s. I want to learn more about why you’re making this - this is very exciting.

That’s actually a very interesting point. From Rust’s “zero-cost abstractions” maybe there’s a way to implement GC’d languages in GC’d Rust without much of a performance penalty. That’d be huge! You could start sharing code among different GC’d language projects, making Rust a really nice fit for them :smiley:

Very nice post!

This is the magic I was looking for: the way to ensure that a value to be pinned has been declared on the stack. I would never have thought of doing this via macros!

This is very interesting, thank you.

  • The API could be trivially adapted to support concurrent GCs, though the current implementation is not thread safe.

Wouldn’t it be interesting to implement concurrent data structures with a garbage collection library like this? How does this compare with the crossbeam-epoch crate (which does GC through epochs)?

One potential use case for a GC library in Rust is to integrate with a platform GC. For example, the WebAssembly platform will likely get native access to the GC of the associated JavaScript engine. This could be very impactful on the ability of Rust libraries to integrate with other languages running on that platform - not only JavaScript but also other GC’d languages compiled to wasm.

However, the current WebAssembly GC proposal does not allow direct references into the GC the way that shifgrethor does, because it is intended to support copying collectors that do not have pinning. This means you couldn’t use shifgrethor’s API on top of that proposal. It’s unclear to me how all of this will shake out, but shifgrethor is just trying to research the implications on GC APIs if we have this particular constraint (that we support direct references).


I know that the current implementation is not focused on performance, but do you have a feeling how this constraint will impact optimization down the road?

The main problem (and this series will probably have a whole post about it - it’s going to be a long series) is copying/moving garbage collectors. These collectors move the objects they manage around to optimize their performance. To be compatible with this API, they need to support “pinning” - some way of informing them not to move an object because there are live references to it. If you don’t support direct references at all, they don’t need to support this sort of pinning at all.

Most of the highest-quality garbage collectors are what are called generational garbage collectors. These collectors split their heap into multiple subheaps: in the simplest implementation, there is a “nursery” and a second, main heap (often called the “old” or “tenured” generation). They collect the nursery frequently, and the second heap only infrequently. When an object survives a nursery collection, it gets moved to the second heap. (This can be extended to N subheaps, instead of just two.)

The principle behind generational garbage collectors like this is that most objects are small and short-lived and don’t survive a whole collection. By frequently sweeping only a small nursery, they reduce the time it takes to collect. You can think of this as amortizing the cost of collection: you most frequently check the data that is most likely to need collection (the new data).
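The generational scheme just described can be sketched as a toy two-generation heap (purely illustrative - this is not shifgrethor’s implementation, and all the names here are made up): a minor collection scans only the nursery, promotes the survivors, and discards the rest.

```rust
// Toy two-generation heap: minor collections only touch the nursery.
struct GenHeap {
    nursery: Vec<(u32, bool)>, // (object id, is_live)
    tenured: Vec<u32>,         // objects that survived a minor collection
}

impl GenHeap {
    fn new() -> Self {
        GenHeap { nursery: Vec::new(), tenured: Vec::new() }
    }

    fn alloc(&mut self, id: u32) {
        // New objects always start in the nursery.
        self.nursery.push((id, false));
    }

    fn mark_live(&mut self, id: u32) {
        for obj in &mut self.nursery {
            if obj.0 == id {
                obj.1 = true;
            }
        }
    }

    /// Minor collection: only the nursery is scanned. Live objects are
    /// promoted to the tenured heap; dead ones are dropped. The cost is
    /// proportional to the (small) nursery, not the whole heap.
    fn minor_collect(&mut self) {
        for (id, live) in self.nursery.drain(..) {
            if live {
                self.tenured.push(id);
            }
        }
    }
}

fn main() {
    let mut heap = GenHeap::new();
    heap.alloc(1);
    heap.alloc(2);
    heap.alloc(3);
    heap.mark_live(2); // only object 2 is still reachable
    heap.minor_collect();
    assert_eq!(heap.tenured, vec![2]); // survivor promoted
    assert!(heap.nursery.is_empty()); // dead objects discarded in bulk
}
```

The amortization argument is visible in `minor_collect`: the loop runs over the nursery only, so frequent collections stay cheap no matter how large the tenured heap grows.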

I explain this because, along the same principles, you don’t want to make it too expensive to manipulate this short-lived data. If every access imposes a “pinning overhead,” that’s a cost you’d like to avoid. Instead, it would be preferable if during the tracing/collection phase you could somehow mark items as one of (a) dead, (b) alive but unpinned, or (c) alive and pinned, and then collect, move, or leave them in place according to that marker.

However, the only strategy I know of for safely tracking pinned references right now is the opposite: every time you access that type, you mark it as pinned. This imposes overhead on exactly those short-lived items that will be dead by the next collection, which is what we’re trying to avoid. I don’t know of a way to safely determine whether something should be pinned only at collection time.
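A minimal sketch of that pin-on-access strategy (the type names and the flag mechanism are illustrative, not shifgrethor’s actual API): every borrow sets a pinned flag, which the collector reads and clears at collection time. The `Cell::set` on every access is the per-access overhead being discussed.

```rust
use std::cell::Cell;

// A GC'd slot that records whether it has been accessed since the last
// collection. (Illustrative sketch only.)
struct GcSlot<T> {
    value: T,
    pinned: Cell<bool>,
}

impl<T> GcSlot<T> {
    fn new(value: T) -> Self {
        GcSlot { value, pinned: Cell::new(false) }
    }

    // Every access marks the slot pinned: this is the overhead that
    // short-lived objects pay even though they'll be dead by the next
    // collection anyway.
    fn borrow(&self) -> &T {
        self.pinned.set(true);
        &self.value
    }

    // At collection time the flag tells the collector whether the object
    // may be moved; the flag is cleared for the next cycle.
    fn take_pin_flag(&self) -> bool {
        self.pinned.replace(false)
    }
}

fn main() {
    let slot = GcSlot::new(5);
    assert!(!slot.take_pin_flag()); // untouched: free to move
    assert_eq!(*slot.borrow(), 5); // access pins it
    assert!(slot.take_pin_flag()); // collector sees the pin...
    assert!(!slot.take_pin_flag()); // ...and the flag resets
}
```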

But as Joshua Yanovski said on Twitter:


First, thanks for the thorough answer.

I have trouble understanding that bit. As I understand it, a GC’d value has to be pinned whenever rooted references to it exist, not only while it’s accessed? Sorry, I have the feeling I’m being a bit slow right now…

So to clarify one likely confusion: “pinned” in this context doesn’t mean the same thing as the Pin type. The practical difference between “rooted” and “pinned” in this context is:

  • A rooted type is alive, and so it could somehow be viewed again later. It is plausible to move it, but not to free it.
  • A pinned type is not only alive but is actively being viewed right now. It is not plausible to move or free it.

To be a bit more concrete, a type to which there exists a Gc<T> pointer is “rooted,” because you could dereference that pointer later. But it isn’t pinned, because the Gc type’s Deref impl can go through an indirection to find the object’s current location.

But what if a user has an active &i32 which points to a field inside a Gc’d object? That object needs to be pinned in place, because we have no way of knowing from the type &i32 that the object has moved.
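To make this concrete, here is an illustrative sketch (hypothetical types, not shifgrethor’s API) of why a rooted handle survives a move while a direct reference forces pinning: the handle re-resolves through a table on every read, and while a direct borrow is live the borrow checker prevents the collector’s `&mut self` move operation from running at all.

```rust
// The heap keeps a table mapping handles to each object's current slot;
// a collector that moves an object only has to update the table.
struct Heap {
    table: Vec<usize>, // handle -> index into `slots`
    slots: Vec<i32>,   // object storage; objects may move between slots
}

#[derive(Clone, Copy)]
struct Rooted(usize); // a root holds a handle, not an address

impl Heap {
    // "Rooted" access: re-resolves the handle on every read, so it still
    // works after the collector moves the object and fixes up the table.
    fn read(&self, r: Rooted) -> i32 {
        self.slots[self.table[r.0]]
    }

    // "Pinned" access: hands out a direct reference. While this borrow is
    // live, `move_object` (which needs `&mut self`) cannot be called, so
    // the object stays put for the duration of the borrow.
    fn pin(&self, r: Rooted) -> &i32 {
        &self.slots[self.table[r.0]]
    }

    // The collector moving an object: copy it and update the table.
    fn move_object(&mut self, r: Rooted, new_slot: usize) {
        let old = self.table[r.0];
        let v = self.slots[old];
        self.slots[new_slot] = v;
        self.table[r.0] = new_slot;
    }
}

fn main() {
    let mut heap = Heap { table: vec![0], slots: vec![7, 0] };
    let r = Rooted(0);
    assert_eq!(heap.read(r), 7);
    heap.move_object(r, 1); // collector moves the object...
    assert_eq!(heap.read(r), 7); // ...and the rooted handle still works
    assert_eq!(*heap.pin(r), 7); // a direct borrow; no move can happen now
}
```

A plain &i32 obtained from `pin` carries no table indirection, which is exactly why the object underneath it must not move while the borrow exists.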

There are other solutions in which, instead of pinning values in place, you rewrite all of the references to an object to point to its new location when you move it. This is even harder for Rust - it relies on a more precise form of rooting than shifgrethor implements, in which you don’t just know that the object is alive, but know exactly where all the references to it are.


Ah, thanks. Now it makes sense. Basically, the program tries to obtain a read lock and either succeeds and can dereference the value, finds a tombstone or something of that sort pointing to the new value, or has to wait because the value is currently being moved.

And the collector can only move values that are not read-locked. Yeah, I can’t imagine that being performant :stuck_out_tongue: (also, pinned values could really throw a spanner into a bump-allocating generational collector by blocking collected generations from being reused)


Very interesting! Thanks for exploring this. While this is all uncharted territory to me, it still somehow feels like you’re on the right track here.

I have a few questions though:

The whole garbage-collected heap lives and dies in a single scope, right? What I mean is, if I want to make a library that contains a graph with garbage-collected nodes as an internal data structure that needs to remain valid across calls into the library, it has to be the user of the library that calls letroot!, correct? And if I have no control over how my library gets called (I’m currently creating a WebAssembly library to be called from JavaScript), I can’t use Shifgrethor? Or is there some way to do it with self-referential generators (I have no experience with those)?

The principle of generational garbage collectors like this is that most objects are small and shortlived and don’t survive a whole collection.

I don’t think the same will necessarily be true for the objects managed by Shifgrethor. I mean, in a language where every object is garbage-collected, there will be a lot of short-lived objects, but in Rust, most of those short-lived objects will be placed on the stack or in regular boxes/Rc/etc. Only a small subset of objects would ever come into contact with Shifgrethor, and those objects may not be so overwhelmingly short-lived.

Looking at this bit of code, it seems like Shifgrethor assumes that pointers to trait objects consist of a data pointer and a vtable pointer. Is that guaranteed to be true? The reference doesn’t say it is, in fact it actually says “you should not rely on this”.

No - there’s one GC per thread, and it is static; each object in it has its own lifecycle, independent of the others.

Agreed! However, the GC implementation behind the API may not be able to assume this if we’re trying to share a GC with a hosted language.

It’s true today, and though we reserve the right to change it in the future, I think it’ll be very difficult to actually do so, because I’m certainly not the only one relying on it. shifgrethor definitely relies on several representational details that are true today but not guaranteed to be true forever.
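For what it’s worth, the two-word layout in question can at least be checked empirically (a sanity check of the current compiler’s behavior, not a guarantee):

```rust
use std::mem::size_of;

trait Trace {}

fn main() {
    // A reference to a sized type is one word.
    assert_eq!(size_of::<&u8>(), size_of::<usize>());
    // A `&dyn Trait` is currently two words: a data pointer plus a
    // vtable pointer. True on today's compilers, but as noted above
    // the reference does not formally guarantee it.
    assert_eq!(size_of::<&dyn Trace>(), 2 * size_of::<usize>());
}
```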

Thanks for the answers @withoutboats. But I’m not fully satisfied with the first answer. What I meant was: say I’m writing a library with the following API:

pub struct Library {
    // ...
}

impl Library {
    pub fn new() -> Library {
        // ...
    }

    pub fn do_something(&mut self) {
        // ...
    }
}
Is there a way to fill in those // ... so that Library can keep a graph as an internal data structure (constructed in new and read/updated in do_something), where the graph’s nodes are garbage-collected by Shifgrethor? The issue is that any roots you create with letroot! in new() die before the end of new(), so you can’t store them. Unless you create a generator that borrows the roots across yield points and store that generator - but I’m not sure if that is possible (or will be possible once self-referential generators are fully implemented).

@withoutboats, I think the performance benefits of generational collectors come from their also being copying/compacting collectors:

  1. The generational hypothesis is that most of the objects/allocations die young.
  2. Copying collectors are best suited for this case, as the work to collect a nursery is linear in the number of live objects/allocations (a small fraction, according to 1), and the dead objects can be discarded with just a pointer switch (if you don’t have exceptions such as pinning).

So for a copying collector it may be beneficial to have a 65-75% nursery size if your workload generates a lot of transient garbage. I think this is not going to be the common case in Rust, as transient allocations can be statically allocated/deallocated (compared to Scala, for example) and so are not subject to GC.

In another post I discussed some ideas I came up with.

Basically, I think Rust needs a mechanism to say “whenever you send this value somewhere else, you have to send that value as well.” This guarantees that whenever we move the “weak reference,” the referent is also moved, or is at least visible to the compiler so it can generate adjustment code.

Without this ability (I think it is fair to say that no existing language provides such checking at compile time), we will need to use unsafe code to ensure the invariant above. Then we can use the Access trait to achieve this.

(My Access trait is similar to Index/IndexMut, but different: it allows returning something that is not a reference, so you can return something like Option<&T> or Result<&T, E>. Also, its self is the index, not the container.)
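Based on that description, here is a sketch of what such an Access trait could look like (my reading of it - the names and signatures are guesses, not from any published crate): self is the index, and the output is whatever type the impl chooses, not necessarily a plain reference.

```rust
// Sketch: like Index, but `self` is the index and the output type is free.
trait Access<'a, Container> {
    type Output: 'a;
    fn access(self, container: &'a Container) -> Self::Output;
}

// Example index type: returns Option<&T> instead of panicking on
// out-of-bounds, which a plain Index impl could not express.
struct At(usize);

impl<'a, T: 'a> Access<'a, Vec<T>> for At {
    type Output = Option<&'a T>;
    fn access(self, container: &'a Vec<T>) -> Option<&'a T> {
        container.get(self.0)
    }
}

fn main() {
    let v = vec![1, 2, 3];
    assert_eq!(At(1).access(&v), Some(&2));
    assert_eq!(At(9).access(&v), None); // an Option, not a panic
}
```

The same shape would also admit `type Output = Result<&'a T, E>` for fallible lookups, which is the flexibility the comment above is pointing at.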

I’m curious if we could have pointers into a single “world-stopping” GC, i.e. some kind of sync Agc<T> pointer. Naively, I imagine each root having a mutex that locks whenever any of its Agc<T>s are being read or written, and which also locks during collection of that pointer. This hardly sounds performant, though, and unmanaged OS threads aren’t the sort of thing you can really “stop”…

I’m making a note to explore this code while it’s still got a naive implementation, so that I can understand it. Thank you for writing it.

Learning Rust has opened up the world of systems programming that has always been opaque to me, and projects like this are part of the reason why.

I started my career as a web application programmer, working mostly on the frontend. Like a salmon swimming upstream to spawn, I have been gradually migrating down the stack towards metal for 10 or 11 years now. But I’d been stuck, for a while.

The problem I’ve always had with reading the code under the hood is that most libraries and languages typically have an FFI to a core library written in arcane C or C++. I don’t use C and C++ on a daily basis, and although I know those languages at an intermediate level, the preprocessor and template shenanigans used in production libraries have always evaded me. I can read through C++ application code, but as soon as I try to read Boost source code to demystify the magic, my head starts exploding until I have to look away.

Not so with Rust! I frequently and without undue hardship can read the implementation source for major, production-quality subsystems, like tokio, std, rayon, and crossbeam.

When I started learning Rust, I was expecting to get certain benefits. Beyond the headline “memory safety without GC”, there’s the rest of the superb type system, algebraic data types + destructuring, and emphasis on cost-free abstraction. All those promised benefits completely panned out. But these cost-free abstractions synergized to provide an unexpected benefit: production-quality crates are written in the same dialect of the same language that I’m writing. The result was a demystification of the programs that I write. The demystification revealed the beautiful hidden world of systems programming.

Thanks for all of your code, and, by the way, all of your fantastic posts, which have been a supreme education for me on CS fundamentals. Whenever I see a new blog post, RFC, or even github comment from you, I know I’m in for an education. It’s difficult to overstate my gratitude in words.


I’ve posted the second post in the series, which provides some technical details about how the managed heap is represented, as background for the rest of the series.

I want to express how enormously well-received this comment is by me, and not only for the obvious reason that it is very high praise. I think it’s not widely enough known that my formal education was in the humanities, not computer science or even a related technical field. I found Rust (back in 2014) very early in my experience with computer programming, and I had exactly the experience you describe: the Rust language and the Rust community rapidly accelerated my education.

For me it was not only the systems programming problems that Rust tackles and makes accessible, but also the particular intersection of different kinds of expertise and goals - the systems problems, the type-theoretical approach, the commitment to an accessible user experience - that made Rust a fulcrum of learning for me. Even as my knowledge grows, there has always been a more informed person - often a world-class expert - available to answer my questions.

I’ve been very fortunate over these past 4 years to have had the opportunity to become more and more involved in Rust, to the point that now it is my full time employment. It makes me so glad to hear that my work has had for others the impact that the work of others before had on me.


From one humanities major to another, I salute you! I’m amazed that you’ve been at this for such a short time, as well.

Getting back to the GC, though. Would a tricolor mark-and-sweep algorithm be implementable under the constraints that shifgrethor’s API imposes? Like a naive mark-and-sweep, it is a precise, tracing collector.

In practice, it achieves excellent latency even for large heap sizes, and it never moves a pointer, so the whole pinning problem just drops out. The tradeoff of “extra work, but concurrently on another core” is an increasingly good one for modern hardware.
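The core of tricolor marking is small; here is a generic sketch of the classic algorithm over an explicit object graph (not shifgrethor code, and the graph representation is just for illustration). White means not yet seen, gray means seen but children unscanned, black means fully scanned; anything still white at the end is garbage.

```rust
use std::collections::VecDeque;

#[derive(Clone, Copy, PartialEq, Debug)]
enum Color {
    White, // not yet reached
    Gray,  // reached, children not yet scanned
    Black, // reached, children scanned
}

// Tricolor marking over a graph given as adjacency lists; objects are
// identified by index, and `roots` are the starting points.
fn mark(edges: &[Vec<usize>], roots: &[usize]) -> Vec<Color> {
    let mut color = vec![Color::White; edges.len()];
    let mut gray: VecDeque<usize> = VecDeque::new();
    for &r in roots {
        color[r] = Color::Gray;
        gray.push_back(r);
    }
    // Invariant: no black object points to a white one. Scanning an
    // object's children before blackening it preserves this.
    while let Some(obj) = gray.pop_front() {
        for &child in &edges[obj] {
            if color[child] == Color::White {
                color[child] = Color::Gray;
                gray.push_back(child);
            }
        }
        color[obj] = Color::Black;
    }
    color // anything still White is unreachable and can be swept
}

fn main() {
    // 0 -> 1 -> 2; object 3 is unreachable.
    let edges = vec![vec![1], vec![2], vec![], vec![]];
    let colors = mark(&edges, &[0]);
    assert_eq!(colors[0], Color::Black);
    assert_eq!(colors[2], Color::Black);
    assert_eq!(colors[3], Color::White); // garbage
}
```

The concurrent variant mentioned above interleaves this marking with the mutator and uses write barriers to maintain the no-black-to-white invariant, but the color discipline is the same.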