Zero-cost interior mutability
John Nagle October, 2024
Motivation
Some data structures are hard to create efficiently in Rust. Complex linking can be created using Rc and RefCell, but there is both a performance penalty and the risk of panics. This is an initial draft of a proposal to address that through compile time checks.
There are two parts to this. One is a compile time version of RefCell, and the other is a compile-time version of Rc. This note is about a compile time version of RefCell, and discusses another proposal for the Rc problem.
SafeCell - a new type
SafeCell has much the same semantics as RefCell. But the checking would be at compile time.
API
pub struct SafeCell<T> {
...
}
impl SafeCell<T> {
/// Borrow -- checked at compile time, no panics or errors.
fn borrow(&self) -> &T {
}
/// Mutable borrow -- checked at compile time, no panics or errors.
fn borrow_mut(&mut self) -> &mut T {
}
/// Take -- checked at compile time, no panics or errors.
fn take(&mut SafeCell<T>) -> T {
}
}
Checking rules
The ideal rules:
-
Scopes of SafeCell mutable borrows of the same object must be disjoint.
-
No scope of an immutable borrow can include a mutable borrow of the same object.
-
No scope of a mutable borrow can include an immutable borrow of the same object.
Because determining if two objects are the same may be difficult, we probably need a more restrictive set of rules that's easier to check.
Simple but overly restrictive rules:
-
Scopes of SafeCell mutable borrows of the same type must be disjoint.
-
No scope of an immutable borrow can include a mutable borrow of the same type.
-
No scope of a mutable borrow can include an immutable borrow of the same type.
This is easy to check by transitive closure, but is probably too restrictive. It prohibits, for example, swapping the contents of two SafeCell structs.
Slightly less restrictive rules:
Add rule:
- Within a single function, it is permitted to have two mutable borrows of the same type provided that the structs being borrowed from are disjoint.
This allows
fn swap_contents(&mut a: SafeCell<Foo>, &mut b: SafeCell<Foo>) {
mem::swap(a.borrow_mut(), b.borrow_mut())
}
This handles a common case. In general, with SafeCell being a zero-cost abstraction, programs would pass a SafeCell, rather than a borrowed value from a SafeCell, into a function call. Borrowing should be very local. So single function analysis ought to be sufficient.
Notes on compiler implementation
For each function, the set of SafeCell types which are borrowed or mutably borrowed within that function would be constructed at compile time. Then transitive closure must be performed on those sets across calls. This would be done after generics have been expanded.
With that information available, each .borrow() and .borrow_mut() would be checked against the set of SafeCell types obtained for each function after transitive closure.
Error messages might read like:
Mutable borrow of SafeCell conflicts with a non-mutable borrow of the same type inside function foo. Function foo calls function bar which performs a borrow of SafeCell at line ...
Related work
This is a static replacement for RefCell, but does not address reference counting. The Static Rc proposal could possibly be used with SafeCell.
What's generally needed to replace Rc is a type that does what Rc, Rc::Weak, and .upgrade() do, but with compile time checking. The concept is to allow one owning reference, and N weak references. Weak references can be upgraded within local scopes only. The compile time goal is to show that when the owning reference is deleted, control is not inside a scope which upgrades a weak reference. If that can be shown, reference counts need not be maintained and .upgrade() cannot fail.
The idea is to stay with the user semantics Rust programmers already know from Rc and RefCell, but move the checking to compile time. We know those semantics are sufficient for most use cases, because people use them for that now.
If both of these goals can be accomplished, we have safe zero-cost back-references in Rust. This allows doing most of the things people want to do with references safely, avoiding escapes to unsafe code or raw pointers.
Comments?