Coming from a C++ background, I find Rust's ban on multiple mut references, or a mix of mut and non-mut references, very restrictive within a single thread.
Rust provides a workaround, the so-called interior mutability, which is a bad version of C++'s mutable and const_cast.
To those who will immediately object that holding a non-mut reference guarantees that the pointee's value does not change for as long as you hold it: this is false and unrealistic, even in Rust! Please read about interior mutability.
Interior mutability is very bad! In C++ the keyword mutable allows a field to be mutated from a const method; it is usually used to read a value under a mutex lock or to implement a caching mechanism. The rest of the interface does not try to hide mutability or pretend to be const while mutating the class state. const_cast is used exclusively in getters that do not affect the state of the class: to avoid duplicating code, the const getter casts constness away, calls the mutable getter, then adds constness back to the result.

Interior mutability in Rust tries to overcome the limitation of forbidding multiple mutable borrows in places where they should not be a problem. For example, I was recently implementing a property-like type. A property may hold a constant value or be bound to another property, whose value it reads through a const getter method. But once a property has a dependent property, it can no longer be assigned a new value for its dependents to observe. This forced me to make the whole API const and use interior mutability under the hood. To know whether a method mutates the property, the user must consult the docs and cannot rely on the signature.
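To make the property situation concrete, here is a minimal sketch of what such an API tends to look like. The names `Property` and `Doubled` are made up for illustration and are not the actual code I wrote; the only real API used is `std::cell::Cell`.

```rust
use std::cell::Cell;

/// Hypothetical property type: every method takes `&self`,
/// so the signature no longer says which methods mutate.
struct Property {
    value: Cell<i32>,
}

impl Property {
    fn new(value: i32) -> Self {
        Property { value: Cell::new(value) }
    }

    fn get(&self) -> i32 {
        self.value.get()
    }

    /// Mutates the property even though it only takes `&self`.
    fn set(&self, value: i32) {
        self.value.set(value);
    }
}

/// Hypothetical dependent property: it keeps a shared borrow of its source.
struct Doubled<'a> {
    source: &'a Property,
}

impl<'a> Doubled<'a> {
    fn get(&self) -> i32 {
        self.source.get() * 2
    }
}

fn main() {
    let p = Property::new(1);
    let d = Doubled { source: &p };
    assert_eq!(d.get(), 2);

    // Allowed while `d` still borrows `p`, but only because `set` takes `&self`.
    p.set(5);
    assert_eq!(d.get(), 10);
}
```

If `set` took `&mut self` instead, the call to `p.set(5)` would be rejected for as long as `d` is alive, which is exactly what pushed the whole API towards `&self`.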
While searching on this topic I found the blog post "The Problem With Single-threaded Shared Mutability" (In Pursuit of Laziness). It was very explanatory and helpful, and it boils the issue down to two problems (in fact they are one):
- It causes memory unsafety
- Iterator invalidation

After digging around I found that they are the same issue: mut references may be used to invalidate pointers to inner data. This seems to be the reason Rust was designed not to allow multiple aliases when one of them is mut. Note: unlike what the blog states, this issue is not specific to Rust; it is very common in C++ programs. Three cases are worth distinguishing:
- Mutable references can't invalidate the referenced value. Because Rust does not allow moving out of a reference, a mut reference can't make the memory it points to invalid on its own. For example, the following code is rejected by the compiler, yet it would not cause memory unsafety:
let mut v: Vec<i32> = Vec::new();
let r1 = &mut v;
let r2 = &mut v;
r1.push(1); // does not invalidate r2
r2.push(2); // does not invalidate r1
r1.clear(); // both references are still valid
v = Vec::new(); // both references are still valid
*r1 = Vec::new(); // both references are still valid
- A mutable reference can invalidate a reference to inner memory owned by the object. The pattern looks like this: an object stores some data on the heap and provides a getter that returns a reference to that inner data. It also provides other methods (taking a mut reference to self) that alter or release this memory, thus invalidating all references to it. For example, the Rust compiler rejects the following buggy code:
let mut val = Box::new(0); // val owns a heap allocated memory
let r = &mut *val; // r points to val's heap memory
val = Box::new(1); // the old Box is dropped when val is assigned a new one, so its memory is deallocated
// r now points to deallocated memory
println!("r: {}", *r); // use after free !
Iterator invalidation (common in C++):
let mut v : Vec<i32> = Vec::new();
v.push(0); // v stores its contents on the heap
let r = &mut v[0]; // r points to a value on the heap
v.push(1); // may reallocate if the capacity does not fit !
v.clear(); // invalidates its heap content (even if no deallocation happens !)
// r points to invalid memory (probably) !
println!("r: {}", *r); // use after free !
The problem is not limited to references that point to the heap: an object on the stack can also have its memory invalidated! But doesn't the Rust compiler already manage stack space, so that invalidating its contents requires unsafe pointers? Yes, but there is an exception: enums are special types whose memory contents may change to a completely different variant, which invalidates all references into that memory because they now point to a value of another type. This is also common in C and C++.
let mut x = Some(0); // x content: tag | i32
if let Some(ref mut r) = x { // r points to the i32 part in x
x = None; // x new content: tag | None
// r now points to invalid memory
println!("r: {}", *r); // read of invalidated memory !
}
- Mutable references to inner data can't invalidate references to their owner. If you have a reference to an element of a vector, you can't use it to invalidate the vector itself, and you can't use a reference to something inside an enum such as Option to invalidate the Option itself.

So the problem resides only in the second case.
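As far as I can tell, `std::cell::Cell` in today's Rust already follows this reasoning: it allows mutation through any number of shared aliases precisely because it never hands out a reference to its interior (for `Copy` types, `get` returns a copy). A small sketch, with a helper name of my own choosing:

```rust
use std::cell::Cell;

/// Overwrites the whole option through one alias, then through another.
/// No reference into the option's payload exists, so nothing can dangle.
fn overwrite_through_aliases(x: &Cell<Option<i32>>) -> Option<i32> {
    let a = x;
    let b = x;
    a.set(None);    // changes the variant through alias `a`
    b.set(Some(5)); // changes it again through alias `b`
    a.get()         // returns a copy, not a reference
}

fn main() {
    let x = Cell::new(Some(0));
    assert_eq!(overwrite_through_aliases(&x), Some(5));
    assert_eq!(x.get(), Some(5));
}
```

The price is that you can never borrow the `i32` inside the option, which is the "inner reference" of the second case.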
Rust's solution: at any time you can have either any number of active non-mut references or exactly one active mut reference, and the pointed-to value can't be moved out of and left invalid.
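For reference, "active" here means "still used later". A minimal sketch of what the current rules accept (the function name is only for illustration):

```rust
/// Reads two elements through shared borrows, then mutates the vector.
/// Accepted today because both shared borrows are dead before `push`.
fn read_then_push(v: &mut Vec<i32>) -> i32 {
    let first = &v[0];
    let last = &v[v.len() - 1];
    let sum = *first + *last; // last use of `first` and `last`
    v.push(sum);              // mutable use is fine from here on
    sum
}

fn main() {
    let mut v = vec![1, 2, 3];
    assert_eq!(read_then_push(&mut v), 4);
    assert_eq!(v, vec![1, 2, 3, 4]);
}
```

Moving the `push` above the line that computes `sum` would be rejected, because `first` and `last` would still be active at that point.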
Proposed solution: keep one rule from the previous solution, namely that a pointed-to value can't be moved out of and left invalid, so that references stay valid. Then introduce a new type of reference; let's call it an "inner reference". Let's imagine that an inner reference is written with two single quotes instead of the single one a regular reference uses:
fn get_ith_ref<'a>(&'a mut self, i: usize) -> &''a Elem
fn get_ith_ref(&mut self, i: usize) -> &'' Elem
The rules would be:
- The source of inner references is mostly the heap and unsafe code, plus enums and similar types.
- An object is allowed to have multiple mutable aliases to it or to one of its fields.
- Once an inner reference is derived from the object, it freezes all of the object's mut references.
- Frozen mut references can't be used or passed to functions for as long as the inner reference is in use (as with NLL lifetimes).

Example:
struct Test {
i: i32,
b: Box<i32>,
}
impl Test {
fn inc_i(&mut self) {
self.i += 1;
}
fn inc_b(&mut self) {
*self.b += 1; // dereference the Box to reach the i32
}
fn b_mut_inner(&mut self) -> &'' mut i32 { // mutable, since the caller writes through it
&mut *self.b
}
}
let mut t = Test{ i: 0, b: Box::new(0) };
let r1 = &mut t;
let r2 = &mut t;
let f1 = &mut t.i;
let f2 = &mut t.b;
r1.inc_i(); // all references are still valid
r2.inc_b(); // all references are still valid
*f1 += 1; // all references are still valid
**f2 += 1; // all references are still valid
*r1 = Test{ i: 1, b: Box::new(1) }; // all references are still valid
t.b = Box::new(2); // all references are still valid
let inner_r = t.b_mut_inner();
r1.inc_i(); // Error: may invalidate inner_r
r2.inc_b(); // Error: may invalidate inner_r
*f1 += 1; // Ok: another field
**f2 += 1; // Error: the same field is inner borrowed
*r1 = Test{ i: 1, b: Box::new(1) }; // Error: invalidates inner_r
t.b = Box::new(2); // Error: invalidates inner_r
*inner_r += 1; // inner_r is last used here
Get or insert pattern:
fn get_or_insert (
map: &'_ mut HashMap<u32, String>,
) -> &''_ String // returns inner reference
{
// get returns inner reference
if let Some(v) = map.get(&22) { // v freezes other mut borrows
return v; // return inner reference
}
// v is not used any more so it is fine to mutate map again
map.insert(22, String::from("hi"));
&map[&22] // return inner reference
}
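For comparison, as far as I know the early-return version above is rejected by the current borrow checker, and the usual workaround in today's Rust is the entry API. A minimal sketch using the same key and value:

```rust
use std::collections::HashMap;

/// Get-or-insert with the entry API: a single lookup and a single
/// mutable borrow of the map, so there is nothing to freeze.
fn get_or_insert(map: &mut HashMap<u32, String>) -> &String {
    map.entry(22).or_insert_with(|| String::from("hi"))
}

fn main() {
    let mut map = HashMap::new();
    assert_eq!(get_or_insert(&mut map).as_str(), "hi");

    // An existing value is returned unchanged.
    map.insert(22, String::from("bye"));
    assert_eq!(get_or_insert(&mut map).as_str(), "bye");
}
```

This works, but only because HashMap ships a dedicated API for the pattern; a user-defined container has to provide its own equivalent.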
To link inner references to their sources, the compiler must know which value each reference refers to. That is easy for local variables, but if the references are function parameters, how can the compiler know that each one refers to a different object? Let's examine this:
// r1 and r2 can't already have inner references derived from them, otherwise they would be frozen
fn test_fn(r1: &mut Test, r2: &mut Test, r3: &Test) {
let inner1 = r1.b_mut_inner();
r2.b = Box::new(2); // if r1 points to the same as r2 then inner1 will be invalid !
}
To solve this issue the compiler must check at the call site and make sure that no function receives multiple references to the same object while one of them is mut, the same as the current rules!
test_fn(r1, r1, &t); // Error: r1 and r2 both mutably alias the same variable t
Even if the references are wrapped in wrapper structs, the compiler knows that those structs hold references to the same object and will disallow passing them to functions together. In a nutshell, the current Rust rules apply when calling functions or initializing a struct, since a struct may hold multiple borrows of the same object and have a method that takes self and thus has access to those borrows.
What about threads? The multi-threaded case differs from the single-threaded one and deserves special treatment: a new type of reference called a "unique reference" is introduced, or rather reused, since it is the current mut reference! Yes, the current mut reference does not mean that you are the only one who can mutate, but that you are the only one who has an alias to the object, and this rule was generalized from the multi-threaded case to the single-threaded case!
Another improvement: types that return inner references may annotate that certain methods taking a mut self reference do not invalidate those inner references. This makes it possible to mutably borrow multiple elements of a slice, Vec, HashMap, and so on, even if the same element ends up borrowed multiple times.
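Something close to this already exists for slices in today's Rust, though only through `Cell`: `Cell::from_mut` followed by `as_slice_of_cells` turns a `&mut [T]` into a `&[Cell<T>]`, whose elements can be aliased and mutated freely because the slice's length cannot change through it. A minimal sketch (the function name `bump_two` is my own):

```rust
use std::cell::Cell;

/// Adds `delta` to elements `i` and `j` of the slice.
/// Works even when `i == j`, i.e. when the same element is aliased twice.
fn bump_two(values: &mut [i32], i: usize, j: usize, delta: i32) {
    // View the exclusively borrowed slice as a shared slice of cells.
    let cells: &[Cell<i32>] = Cell::from_mut(values).as_slice_of_cells();
    let a = &cells[i];
    let b = &cells[j];
    a.set(a.get() + delta);
    b.set(b.get() + delta);
}

fn main() {
    let mut v = vec![1, 2, 3];
    bump_two(&mut v, 0, 2, 10);
    assert_eq!(v, vec![11, 2, 13]);

    // Same index twice: both aliases point at element 1.
    bump_two(&mut v, 1, 1, 1);
    assert_eq!(v, vec![11, 4, 13]);
}
```

The annotation proposed above would give the same freedom with plain references instead of requiring every element access to go through `get` and `set`.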