Allow multiple mutable aliases

Coming from a C++ background, I found the rule against multiple mutable references, or a mix of mut and non-mut references, very restrictive within a single thread. Rust provides a workaround, so-called interior mutability, which is a bad version of C++'s mutable and const_cast.

For those who will immediately object that holding a non-mut reference guarantees that the pointee's value does not change for as long as you hold it: this is false! It is not realistic, even in Rust! Please read about interior mutability.

Interior mutability is very bad! C++ has the keyword mutable, which allows a field to be mutated from a const method; it is usually used to read a value under a mutex lock or to implement a caching mechanism. The rest of the interface does not try to hide mutability or pretend to be const while mutating the class state. const_cast is used exclusively in getters that don't affect the state of the class: to avoid duplicating code, the const getter casts constness away, calls the mutable getter, and then adds constness back to the result.

Interior mutability in Rust tries to overcome the limitation imposed by forbidding multiple mutable borrows in places where they shouldn't be a problem. For example, I was recently implementing a property-like type. A property may hold a constant value or be bound to another property, from which it reads its value through a const getter method. But once a property has a dependent property, it can no longer be assigned a new value for its dependents to observe. This forced me to make the whole API const and use interior mutability under the hood. To know whether a method mutates the property, the user must consult the docs and can't rely on the signature.
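For readers who have not used the caching pattern mentioned above, here is a minimal sketch of how it usually looks in Rust. The Circle type and its method names are hypothetical and chosen only for illustration; the point is that the cache is filled through &self, while the visible state change still requires &mut self.

```rust
use std::cell::Cell;

/// Hypothetical type for illustration: caches an expensive computation.
struct Circle {
    radius: f64,
    // Interior-mutable slot, playing the role of a C++ `mutable` member.
    cached_area: Cell<Option<f64>>,
}

impl Circle {
    fn new(radius: f64) -> Self {
        Circle { radius, cached_area: Cell::new(None) }
    }

    /// Takes `&self`, yet fills the cache on first use.
    fn area(&self) -> f64 {
        if let Some(area) = self.cached_area.get() {
            return area;
        }
        let area = std::f64::consts::PI * self.radius * self.radius;
        self.cached_area.set(Some(area));
        area
    }

    /// Takes `&mut self`: the signature advertises the visible mutation.
    fn set_radius(&mut self, radius: f64) {
        self.radius = radius;
        self.cached_area.set(None); // invalidate the cache
    }
}
```

Whether the signature of area is misleading is exactly the point under debate here; the sketch only shows the mechanics.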

While searching on this topic I found this blog post: The Problem With Single-threaded Shared Mutability - In Pursuit of Laziness. It was very explanatory and helpful, and it boils the problem down to two problems (in fact they are one):

  1. It causes memory unsafety
  2. Iterator invalidation

After digging around, I found that these are the same issue: mut references may be used to invalidate pointers to inner data. This seems to be the reason Rust was designed not to allow multiple aliases when one of them is mut. Note: unlike what the blog states, this issue is not specific to Rust; it is very common in C++ programs.

  • Mutable references can't invalidate the referenced value. Because Rust does not allow moving out of a reference, a mut reference can't make the memory it points to invalid on its own. For example, this code, which does not compile, would not result in memory unsafety:
let mut v: Vec<i32> = Vec::new();
let r1 = &mut v;
let r2 = &mut v;
r1.push(1); // does not invalidate r2
r2.push(2); // does not invalidate r1
r1.clear(); // both references are still valid
v = Vec::new(); // both references are still valid
*r1 = Vec::new(); // both references are still valid
  • A mutable reference can invalidate a reference to inner memory owned by the object. The usage pattern looks like this: an object stores some other object on the heap and provides a getter that returns a reference to this inner object. It also provides other methods (taking a mut reference to self) that alter or release this memory, thus invalidating all references to it. For example, the Rust compiler rejects the following buggy code:
let mut val = Box::new(0); // val owns a heap allocated memory
let r = &mut *val; // r points to val's heap memory
val = Box::new(1); // val is dropped then assigned to a new Box so its memory is deallocated
// r now points to deallocated memory
println!("r: {}", *r); // use after free !

Iterator invalidation (common in C++):

let mut v : Vec<i32> = Vec::new();
v.push(0); // v stores its contents on the heap
let r = &mut v[0]; // r points to a value on the heap
v.push(1); // may reallocate if the capacity is exceeded!
v.clear(); // invalidates its heap content (even if no deallocation happens!)
// r points to invalid memory (probably)!
println!("r: {}", *r); // use after free !

The problem is not limited to references that point to the heap: an object on the stack can also have its memory invalidated! But doesn't the Rust compiler already manage stack space, so that invalidating its contents requires unsafe pointers? Yes, but there is an exception: enums are special types whose memory contents may change to a completely different type, making all references into that memory invalid because they now point to another type. This is also common in C and C++.

let mut x = Some(0); // x content: tag | i32
if let Some(ref mut r) = x { // r points to the i32 part in x
   x = None; // x new content: tag | None
   // r now points to invalid memory
   println!("r: {}", *r); // use after free !
}
  • Inner mutable references can't invalidate references to their owner. If you have a reference to an element in a vector, you can't use it to invalidate the vector itself, and you can't use a reference to something inside an enum like Option to invalidate the option itself. So the problem resides only in the second case.

Rust's solution: at any time you can have either any number of active non-mut references or exactly one active mut reference, and the pointed-to value can't be moved from and left invalid.

Proposed solution: keep, from the previous solution, the rule that a pointed-to value can't be moved from and left invalid, so that references stay valid. Then introduce a new type of reference; let's call it an "inner reference". Let's imagine that an inner reference is written with two single quotes instead of the one used by a regular reference:

fn get_ith_ref<'a>(&'a mut self, i: usize) -> &''a Elem
fn get_ith_ref(&mut self, i: usize) -> &'' Elem

The source of inner references is mostly the heap and unsafe code, apart from enums and similar types. An object is allowed to have multiple mutable aliases to it or to one of its fields. Once an inner reference is derived from the object, it freezes all of the object's mut references. Frozen mut references can't be used or passed to functions for as long as the inner reference is in use (NLL lifetimes).

example:

struct Test {
    i: i32,
    b: Box<i32>,
}

impl Test {
    fn inc_i(&mut self) {
        self.i += 1;
    }

    fn inc_b(&mut self) {
        *self.b += 1; // dereference the Box to reach the i32
    }

    fn b_mut_inner(&mut self) -> &'' i32 {
        &*self.b
    }
}
	
let mut t = Test { i: 0, b: Box::new(0) };
let r1 = &mut t;
let r2 = &mut t;
let f1 = &mut t.i;
let f2 = &mut t.b;

r1.inc_i(); // all references are still valid
r2.inc_b(); // all references are still valid
*f1 += 1; // all references are still valid
**f2 += 1; // all references are still valid
*r1 = Test { i: 1, b: Box::new(1) }; // all references are still valid
t.b = Box::new(2); // all references are still valid

let inner_r = t.b_mut_inner();
r1.inc_i(); // Error: may invalidate inner_r
r2.inc_b(); // Error: may invalidate inner_r
*f1 += 1; // Ok: another field
**f2 += 1; // Error: the same field is inner borrowed
*r1 = Test { i: 1, b: Box::new(1) }; // Error: invalidates inner_r
t.b = Box::new(2); // Error: invalidates inner_r
*inner_r += 1; // inner_r is last used here

Get or insert pattern:

fn get_or_insert(
    map: &'_ mut HashMap<u32, String>,
) -> &''_ String // returns inner reference
{
    // get returns inner reference
    if let Some(v) = map.get(&22) { // v freezes other mut borrows
        return v; // return inner reference
    }
    // v is not used any more so it is fine to mutate map again
    map.insert(22, String::from("hi"));
    &map[&22] // return inner reference
}

In order to link inner references to their sources, the compiler must know which value each reference refers to. That is easy for local variables, but if the references are function parameters, how can the compiler know that each one refers to a different object? Let's examine this:

// r1, r2 can't have already inner references otherwise they will be frozen
fn test_fn(r1: &mut Test, r2: &mut Test, r3: &Test) {
	let inner1 = r1.b_mut_inner();
	r2.b = Box::new(2); // if r1 points to the same as r2 then inner1 will be invalid !
}

In order to solve this issue, the compiler must check at the call site and make sure that no function receives multiple references to the same object while one of them is mut. The same as the current rules!

test_fn(r1, r1, &t); // Error: r1 and r2 both mutably alias the same variable t

Even if the references are wrapped in wrapper structs, the compiler knows that those structs hold references to the same object and will disallow passing them to functions. In a nutshell, the current Rust rules apply when calling functions or initializing a struct, since a struct may hold multiple borrows of the same object and have a method that takes self and thus has access to those borrows.

What about threads? The multi-threaded case differs from the single-threaded case and so deserves special treatment: a new type of reference called a "unique reference" is introduced, or rather reused, since it is the current mut reference! Yes, the current mut reference doesn't mean that you are the only one who can mutate, but that you are the only one who has an alias to the object, and it was generalized from the multi-threaded case to the single-threaded case!

Another improvement: types that return inner references may annotate that some methods taking a mut self reference do not invalidate the inner references. This would make it possible to mutably borrow multiple elements of a slice, Vec, HashMap, and so on, even if this results in the same element being borrowed multiple times.

1 Like

For the record, Cell::from_mut() already allows some patterns like this, without requiring interior mutability on the original type. For example (Rust Playground):

use std::cell::Cell;

let slice: &mut [i32] = &mut [1, 2, 3, 4, 5];
{
    let slice: &[Cell<i32>] = Cell::from_mut(slice).as_slice_of_cells();
    let x0a: &Cell<i32> = &slice[0];
    let x0b: &Cell<i32> = &slice[0];
    let x1: &Cell<i32> = &slice[1];
    let x2: &Cell<i32> = &slice[2];
    println!("{}, {}, {}, {}", x0a.get(), x0b.get(), x1.get(), x2.get());
    x0a.set(2);
    x0b.set(3);
    x1.set(2);
    x2.set(1);
}
println!("{slice:?}");

Of course, this doesn't work if we need to project references to the inner elements, nor if the elements come from a HashMap or similar. For that, the best method I can think of is to extract Rc<RefCell<&mut T>>s from an iter_mut():

use std::{cell::RefCell, collections::HashMap, rc::Rc};

fn get_mut_split<'a, K, V, Q>(
    map: &'a mut HashMap<K, V>,
    keys: &[Q],
) -> Vec<Option<Rc<RefCell<&'a mut V>>>>
where
    K: PartialEq<Q>,
{
    let mut result = vec![None; keys.len()];
    for (key, value) in map.iter_mut() {
        let rc = Rc::new(RefCell::new(value));
        for (i, cmp) in keys.iter().enumerate() {
            if key == cmp {
                result[i] = Some(Rc::clone(&rc));
            }
        }
    }
    result
}

(Of course, this is rather unwieldy for what it does.)

1 Like

As a non-native English speaker, I find it difficult to understand your long and convoluted sentences.

That said, I want to clarify a few things:

While r1 and r2 don't get invalidated, the memory they point to does get invalidated. This would be unsound if allowed, because push can re-allocate the Vec:

let mut v: Vec<i32> = vec![4];
let r1 = &mut v;
let r2 = &mut v[..];
r1.push(1); // invalidates r2
r2[0] = 5; // undefined behavior

This doesn't work, borrowing a value as mutable requires that all previous borrows have ended. I don't see how this could be allowed even with your proposal, since there are no "inner references" here.

To be clear, the semantics of &mut T can not be changed. Rust has strong backwards compatibility guarantees.

This diagnostic doesn't make sense, since r1 and r2 just being alive at the same time would already be a borrow-check error.

If you want to allow shared mutable references, you have to somehow make sure that mutating one doesn't invalidate the others. This is only possible for certain types. For example, it isn't possible for enums or unions, and it isn't possible for heap-allocated types that can grow or shrink.

3 Likes

As I understand it, OP's idea is that mutable references &mut T should be changed to no longer require uniqueness. They assert that requiring inner references &'' T for possibly-invalidated pointers is sufficient to prevent unsoundness. Obviously, this is a non-starter due to backward compatibility, but it's still interesting to think about. For now, I'll call the four proposed reference types &T (immutable), &mut T (mutable), &inner T (inner), and &uniq T (unique).

The idea is that the &mut v[..] would not be allowed, since the underlying memory v[..] could be invalidated by other &mut v references. Only &inner v[..] would be allowed. Then, if we look at the modified snippet, it correctly results in an error:

let mut v: Vec<i32> = vec![4];
let r1 = &mut v;
let r2 = &inner v[..];
r1.push(1); // borrow checker ends lifetime of r2
r2[0] = 5; // borrow checker error!

To be clear, OP is proposing a change to the meaning of &mut. In this proposal, a &mut reference would allow modifying concrete values, but would be unable to project into heap allocations or enum variants that could be invalidated. This proposal's &mut T seems very similar to the current &Cell<T>. There's even a cell-project crate that allows projecting &Cell<T> into fields, not allowing possibly-invalidated projections.

Overall, I don't think the proposal as stated would work very well, due to this "&mut origin" issue. One of Rust's strengths is that the current & and &mut have the exact same meaning in every context. In this proposal, &mut T can be used for shared mutability, right up until it can't: certain patterns work within a single function but can't be factored out into a separate function.

Also, instead of adding two new reference types with tricky semantics, it seems much easier to stop worrying about interior mutability, and to accept &/&mut as the general-purpose shared/exclusive references that they really are. Plenty of existing libraries use interior-mutable types to great effect, without their usage becoming very confusing.

10 Likes

Rust has intentionally made mut references unique. It intentionally and consciously rejected C++'s approach. I don't think there's any chance of going back. Rust pushes you to change your programming style. It wants you to abandon your C++ expectations. It wants you to face this difficulty head-on, and program with it, not around it.

These rules are quite restrictive, but they're also relatively straightforward and uniform. There's a benefit to having such blanket rules, instead of requiring context-specific and implementation-specific knowledge to know when aliasing is safe in practice and when it's not.

Interior mutability is not a workaround. It's a first-class way to mutate shared values. The only mistake Rust made here is in calling &mut this way, rather than &unique or &exclusive that better explains the guarantee it gives (and then it'd be clearer that having &exclusive that is not guaranteed to be exclusive defeats its intended purpose).

29 Likes

I want to point out that interior mutability has a runtime cost, while mutable borrows do not, or at least no more than a regular C pointer. This alone makes the latter preferable to the former when possible.

Rust manages mutability through bindings and borrows, not by making data structures themselves immutable. That is something that could be done (and has been done) on top, but as always, immutable data structures exact a runtime memory and CPU cycle cost on top of what their regularly-mutable versions would. That makes immutable data structures impractical to place at the core of a systems programming language, where performance is a top-tier concern.

1 Like

That’s not always true in this generality, in particular in the single-threaded case which OP seems to care about a lot: While RefCell does involve some overhead (by working with, essentially, a dynamic lock flag), using Cell can be truly zero-overhead, especially with small Copy types, e.g. Cell<u8> or Cell<i32>.

And some non-standard interior mutability primitives are zero-overhead, too, e.g. GhostCell.


Nonetheless, both Cell and GhostCell are severely less flexible than something like RefCell/Mutex/RwLock, and those are less flexible than &mut T or even &T in some ways (e.g. no true &'a self -> &'a Inner projection). So your main point, that interior mutability has disadvantages and overhead compared to &mut, stands: either in terms of run-time performance (overhead) plus a slightly less flexible API (like RefCell), or in terms of a severely less flexible API (like Cell) or, in the case of GhostCell, a whole different access model altogether, so there’s “overhead” in ease of use in the latter case.
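One rough way to see the difference in overhead between Cell and RefCell is to compare sizes: Cell<i32> is laid out exactly like an i32, whereas RefCell<i32> has to carry its borrow flag. The sketch below uses helper names (cell_sizes, bump_twice) that are made up for illustration, and size is only a proxy for the run-time cost, not a benchmark.

```rust
use std::cell::{Cell, RefCell};
use std::mem::size_of;

/// Returns (size of i32, size of Cell<i32>, size of RefCell<i32>) in bytes.
fn cell_sizes() -> (usize, usize, usize) {
    (size_of::<i32>(), size_of::<Cell<i32>>(), size_of::<RefCell<i32>>())
}

/// Bumps a counter through two shared handles to the same `Cell`.
/// No flag is checked at run time: `get` and `set` are plain copies.
fn bump_twice(counter: &Cell<i32>) -> i32 {
    let a = counter;
    let b = counter;
    a.set(a.get() + 1);
    b.set(b.get() + 1);
    counter.get()
}
```

The flexibility cost shows up in the same sketch: bump_twice can only copy values in and out, it can never hand out a reference to the i32 inside.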

9 Likes

Please don't frame it in such an antagonistic way. Language designs have trade-offs, and there are negative and positive sides to most design decisions. If you don't appreciate the upsides, then these choices may seem wrong to you, but that doesn't mean they're somehow objectively wrong. Rust is not a bluff to make a language with a "wrong" design and not "admit" it.

The exclusivity of references in Rust has its uses, even in single-threaded code. For example, it prevents iterator invalidation problem. Rust could have been less strict for single-threaded programs, but the language doesn't have a concept of such program (but at least it does have Cell and Rc that don't cross threads). It does push programmers to embrace multi-threading. It is a design choice, which is IMHO valuable given that all CPUs are multi-core these days. I've enjoyed being able to use 3rd party Rust libraries without worrying about their thread-unsafety.

Rust doesn't have truly immutable data (which has been called "frozen" memory around here). Rust's immutability is only an aspect of data access. It can be temporary. It's a pragmatic solution for preventing data races, rather than a feature for having truly immutable memory. In this sense interior mutability is fine, because it's not a hole in immutability, but rather another (run-time) way of preventing data races. If you treat & access as "shared" rather than "immutable", it makes more sense. You can have an object that is shared (and even appears immutable to its users), but still uses interior mutability for an internal cache or memoization.

The restrictions imposed by &mut and lifetimes are also used intentionally as part of API design. For example Mutex::get_mut can give access to protected data without a lock, because &mut's restrictions make it behave like a compile-time mutex.
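To make the Mutex::get_mut point concrete, here is a small sketch. The two helper functions are hypothetical names for illustration; Mutex::lock, Mutex::get_mut and Mutex::into_inner are the standard library methods.

```rust
use std::sync::Mutex;

/// Adds `n` through the lock: works with a shared `&Mutex`.
fn add_locked(m: &Mutex<i32>, n: i32) {
    let mut guard = m.lock().unwrap();
    *guard += n;
}

/// Adds `n` without locking: holding `&mut Mutex` already proves
/// that no other reference to the mutex exists right now.
fn add_exclusive(m: &mut Mutex<i32>, n: i32) {
    *m.get_mut().unwrap() += n;
}
```

Both functions reach the same data; only the second one can skip the lock, because the borrow checker has already established exclusivity at compile time.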

28 Likes

For example, it prevents iterator invalidation problem.

As I understand, nsbj's entire argument is that iterator invalidation can be prevented without exclusive references, by introducing &''a in addition to &'a. I think it is an interesting research question whether this works or not. Whether it is worth complexity or how to engineer this to be backward compatible, I think, is a separate question.

I think it is worth investigating (although not necessarily here), since exclusive references are in fact inconvenient.

8 Likes

I agree that this can be a point where Rust might reasonably be able to improve in the future. I wouldn’t limit the focus to interior mutability, since there’s lots of other side-effects that will have to be documented and can’t be seen in the type signatures, such as e.g.

  • file access, network access, etc…
  • mutating global statics
  • printing to and/or reading from stdin/stdout

These effects are super similar and related: E.g. you could implement a (somewhat inefficient) “interior mutability” primitive by having some memory-allocator-esque datastructure in a global static, and storing the mutable data in that. Also global statics are typically made mutable (safely) by using interior mutability.
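As a small illustration of a side effect that no signature reveals, consider a global static made safely mutable through an atomic (one form of interior mutability). The names CALLS, double and calls_so_far are invented for this sketch.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical global call counter; the name is illustrative.
static CALLS: AtomicUsize = AtomicUsize::new(0);

/// Looks pure from its signature, yet mutates global state on every call.
fn double(x: i32) -> i32 {
    CALLS.fetch_add(1, Ordering::Relaxed);
    x * 2
}

/// Reads the number of calls made to `double` so far.
fn calls_so_far() -> usize {
    CALLS.load(Ordering::Relaxed)
}
```

A caller of double has no way to learn about the counter from the types alone, which is the same documentation burden described for interior mutability.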


I believe that the most breaking change you’re proposing is that something like &mut Box<T> -> &mut T should no longer be possible, but should involve returning some kind of “inner reference” instead (whose precise behavior I fail to understand from your descriptions so far).

Whether allowing &mut to be no-longer unique in and by itself is a breaking change is a much more subtle question. I would assume that “probably yes” since API involving unsafe code written today can assume that &mut references to – say – some zero-sized-type handles is actually always unique, but at least it’s definitely “only” more subtle breakage.

5 Likes

There is this old adage that any sufficiently complex single-threaded program is indistinguishable from a multi-threaded one*. And I've learned that lesson the painful way more than once while doing GUI development.

Saying that ownership checking within a single thread "prevents the iterator invalidation problem" doesn't do it proper justice. It prevents an entire class of bugs and gotchas in the language. It is a tool that empowers you to not care about data being accidentally modified under your ass, it frees you from being careful so that you can focus on actually getting things done.

* 10 internet points if somebody finds a source with the original wording. Update: the points go to @afetisov, see below.

6 Likes

Something I haven't seen mentioned here is borrow splitting. It's possible to turn a mutable reference to a value into multiple mutable references to interior fields of that value, so long as the references to the interior fields are themselves unique.

Example, loosely inspired by the above:

#[derive(Debug)]
struct Test {
	i: i32,
	b: Box<i32>,
}

let mut t = Test{ i: 0, b: Box::new(0) };
let r1 = &mut t.i;
let r2 = &mut *t.b;

*r1 = 1;
*r2 = 2;

dbg!(t);

Prints:

t = Test {
    i: 1,
    b: 2,
}
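Borrow splitting also works for slices, through the standard split_at_mut method. The following sketch (swap_ends is a made-up helper) holds two mutable references into the same slice at once, which is allowed because the two halves cannot overlap.

```rust
/// Swaps the first and last elements via two simultaneous `&mut` borrows.
fn swap_ends(values: &mut [i32]) {
    if values.len() < 2 {
        return; // nothing to swap
    }
    let mid = values.len() / 2;
    // Two disjoint mutable sub-slices of the same underlying slice.
    let (left, right) = values.split_at_mut(mid);
    let first = &mut left[0];
    let last = right.last_mut().unwrap();
    std::mem::swap(first, last);
}
```

Calling it on [1, 2, 3, 4, 5] leaves [5, 2, 3, 4, 1].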

Beyond that, though, forbidding mutable aliasing isn't just an implementation detail of the borrow checker that can be "fixed" by a sufficiently clever tweak. Forbidding mutable aliasing is foundational to Rust's memory model, and without it, you're sacrificing memory safety.

To the extent that C++ allows for it, it's because C++ has an unsafe memory model.

Beyond that, rustc now uses mutable-noalias optimizations in LLVM by default.

3 Likes

The earliest specific reference that I know is this post by @Manishearth , which cites this reddit comment.

Aliasing with mutability in a sufficiently complex, single-threaded program is effectively the same thing as accessing data shared across multiple threads without a lock. (The above is my paraphrasing of someone else's quote, but I can't find the original or remember who made it.)

Edit (Jan 2017): I found the original, it’s a comment by kmc:

My intuition is that code far away from my code might as well be in another thread, for all I can reason about what it will do to shared mutable state.

5 Likes

Again, Rust lies about constness and mutability, since any variable can be wrapped in a RefCell, after which you can do whatever you want and forget about mutability. Multiple mutable aliases are essential for systems programming, and most advanced Rust libraries must use interior mutability to achieve this shared mutability on demand.

One field where it is essential is gui programming because:

  • you can't mutate a widget while it is in a tree
  • Model View Delegate (Controller) isn't possible because you can't have mut reference while holding a non mut one
  • you can't mutate a widget in a callback or signal
  • you can't mutate a property where a dependent one holds a non mut reference to it
  • etc.

This is why Rust does not have a good GUI library and the developers are trying odd architectures: the proven ones are not yet easy to implement in Rust!

Take a look at the GTK wrapper for Rust and how they wrapped everything in Rc<RefCell<Widget>>! Think that is easy? No: using this in GUI apps causes a huge number of cyclic references and memory leaks, which is tough for GUI apps. They even created some horrible macros to ease capturing the widgets in lambdas.

My thinking is that the Rust designers didn't go this route for the sake of banning multiple mut references; rather, allowing them could cause memory unsafety in some situations, it was hard to get right, and so it was decided to cut it. I understand this, and the ideas above are not complete and may not work at all in this form. I think more research is required to tackle this issue.

The way to get a shared mutable reference in Rust is to use &Cell<T>. What's missing is convenient ways of manipulating &Cell<T>; things like field projection.

It also might be useful to have some sort of &frozen T which disables the shared, aliased mutability of UnsafeCell. However, it still wouldn't make the referent actually immutable, as you could still do mutations behind pointers. Given the only reason it'd be useful is if people use structs of cells instead of cells of structs, I don't think it's worth the complexity it'd add.

1 Like

For the past eight years or so, mainstream GUI thinking has mostly been going towards totally immutable data structures, exactly because mutable spaghetti becomes unmaintainable at some point. I would think Rust is in good company here.

8 Likes

Again, this has already been said before in this thread, but: if you replace "const" by "shared" and "mutable" by "exclusive", then all those "lies" go away. Yes the naming is unfortunate and maybe Rust may be able to correct it some time in the future (I don't think so), therefore this remains as a lesson for future languages to do better.

RefCell does not give you shared mutability. At each point in time, only one piece of code may hold an exclusive reference to its content (&mut), and it will panic if that invariant is broken. RefCell has exactly the same guarantees as the rest of the Rust ownership model, the only difference is that the check is performed at run time instead of compile time.
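The run-time check can be observed without panicking by using try_borrow_mut, which reports the conflict as an error instead. This is only a sketch; the two helper functions are hypothetical names.

```rust
use std::cell::RefCell;

/// Returns true if a second exclusive borrow is refused while the first is live.
fn second_borrow_refused(cell: &RefCell<Vec<i32>>) -> bool {
    let first = cell.borrow_mut();
    // `borrow_mut` would panic here; `try_borrow_mut` returns Err instead.
    let refused = cell.try_borrow_mut().is_err();
    drop(first);
    refused
}

/// Sequential exclusive borrows are fine: each ends before the next begins.
fn push_twice(cell: &RefCell<Vec<i32>>) {
    cell.borrow_mut().push(1);
    cell.borrow_mut().push(2);
}
```

So the cell is shared, but at any instant at most one exclusive borrow of its contents exists, which is the same rule the compiler enforces statically for &mut.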

This is funny because everything you listed there is a fundamental property of glib's object model which is at the heart of GTK, and not Rust:

  • GObject is reference counted and has shared mutability, with little to no opt-out. In C, this is just pointers and having to call ref and unref methods at the correct point in time, but the equivalent type system representation in Rust is indeed Rc<RefCell<T>>. So Rust does nothing differently here, other than making some assumptions explicit through the type system.
  • If you look at any GTK C code (or Gstreamer FWIW), you'll see a lot of horrible macros too. As it turns out, if you tack on a runtime object oriented type system to a programming language without inheritance, things are going to get ugly. No secrets here.
  • Since all GTK objects are already reference counted regardless of your programming language of choice, the problems about memory leaks through cyclic references are universal. The difference is that in C, you still have to do manual memory management which is an additional source of memory leaks at best, or may even cause segmentation faults
    • FWIW, even garbage collected languages are not immune against accidental reference holding. Have a look at the code of Java AWT/Swing or JavaFX. You're guaranteed to see some careful usage of weak references too (yes, weak references are a thing in Java/GC languages).
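The same weak-reference technique is available for Rc. Below is a minimal sketch of a widget tree that avoids strong cycles; the Widget type and the helper functions are hypothetical and are not taken from gtk-rs or any other library.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// Hypothetical widget node: strong links go down, weak links go up.
struct Widget {
    name: String,
    parent: RefCell<Weak<Widget>>,
    children: RefCell<Vec<Rc<Widget>>>,
}

/// Creates a detached widget with no parent and no children.
fn widget(name: &str) -> Rc<Widget> {
    Rc::new(Widget {
        name: name.to_string(),
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    })
}

/// Attaches `child` under `parent` without creating a strong cycle.
fn attach(parent: &Rc<Widget>, child: &Rc<Widget>) {
    *child.parent.borrow_mut() = Rc::downgrade(parent);
    parent.children.borrow_mut().push(Rc::clone(child));
}

/// Name of the parent, if the parent is still alive.
fn parent_name(child: &Widget) -> Option<String> {
    child.parent.borrow().upgrade().map(|p| p.name.clone())
}
```

Because the child only holds a Weak pointer to its parent, dropping the last strong handle to the parent frees it, and the child then simply sees no parent.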
10 Likes

There is not yet any clear architecture for implementing a retained-mode GUI in Rust, only immediate mode. The problem is that in a retained UI the widget is part of a tree and at the same time you own it and are able to change its state. All of this must happen in one thread to function well. The only pure-Rust UI toolkit that I have used and been satisfied with is Slint. They realized that coding a UI in Rust would be terrible, so they used a markup language and a code generator to take on the burden of writing the Rust code. For more on this topic see this:

1 Like
