Stacked Borrows: An Aliasing Model For Rust


#1

In this post, I am proposing “Stacked Borrows”: A set of rules defining which kinds of aliasing are allowed in Rust. This is intended to answer the question which pointer may be used when to perform which kinds of memory accesses.

This is a long-standing open question of many unsafe code authors, and also by compiler authors who want to add more optimizations. The model I am proposing here is by far not the first attempt at giving a definition: The model is heavily based on ideas by @arielb1 and @ubsan, and of course taking into account the lessons I learned last year when I took my first stab at defining such a model, dubbed “Types as Contracts”.

https://www.ralfj.de/blog/2018/08/07/stacked-borrows.html


Reordering of writes via differently-typed pointers
#2

At a first glance I like this a lot! In particular, the relatively seamless integration of raw pointers – I was really surprised to read this part:

Using a raw pointer directly is desugared to creating a shared reference (when reading) or a mutable reference (when writing), and using that.

The retagging and function barriers mechanism also seems extremely natural to me, besides the optimizations they enable. They bring back some compositional reasoning around function boundaries.

There are two things that came to mind while reading that I’d like clarified:

Q1: For something like &(Cell<u8>, u8) we’d want to optimize accesses to the second element normally while being conservative around the first element. How is this handled? Do we create a different borrow for each byte of memory covered by a reference (i.e. in this case, one Raw and one Frz(t)) depending on whether the offset is inside an UnsafeCell?

Q2: Retagging occurs when a reference is “loaded from memory”, does this “memory” include Rust-level locals? For example, does (source code, not pseudo-MIR) let x = &mut foo; let y = {x} /* move, not reborrow */; retag? What about MIR temporaries, e.g. does the source-level expression (&mut x,).0 lead to retagging when reading the reference back out of the temporary tuple?


#3

This is a really cool model! Thanks for your work @RalfJung :smiley:


#4

I am a little surprised that you found this surprising. :wink: Did you expect a difference in behavior between *raw and *(&*raw)?

Yes. The in_unsafe_cell changes per location.

I tried to say this explicitly, but it seems I was not clear enough.

These are great questions, and I do not know the answer. :wink: However, I think in these cases retagging makes no observable difference, does it?

I am very glad that you like it. :slight_smile: One thing I am wondering is if there is an equivalent model that is expressed in the more axiomatic style you have been using (i.e., stating properties that shall always holds true instead of defining an “instrumented machine”). It seems to me there should be, though the barriers could make this interesting. However, I have a hard time working with axiomatic models or thinking in those terms.


#5

The idea is to put a “barrier” into the stack of all function arguments when demo5 gets called, and to make it UB to pop that barrier from the stack before demo5 returns. This way, all the borrows further down in the stack (below Uniq(x) ) are temporarily disabled and cannot be activated while demo5 runs. This means that even if y happens to be the memory address x points to, it is UB to cast y to a reference because the Raw item cannot be activated.

I really like this - I think it’s a really good way of dealing with that issue of this model. It also nicely fixes the global problem, I think :slight_smile:


#6

I’ve gotten used to fearing that accesses through raw pointers would assert somewhat less than regular references, especially around mutation. But maybe I’m imagining things. Or maybe this model has the same differences between raw pointers and references, but moved them into other components than the actual access rules.

Oh, I think I just didn’t register that paragraph. It’s a bit late here :sweat_smile:

I’m really not sure, I haven’t spent much thought on it. The time when a unique borrow is created is only used for the purpose of identifying it, right? Then indeed retagging very frequently shouldn’t cause any difference in behavior.


#7

Hm, actually… if I’m reading the rules right, retagging activates the old borrow and initiates the new borrow. So retagging while memory is frozen would be UB, wouldn’t it? Then it seems like something like the following would be UB if moves of &mut locals retag:

let m1 = &mut foo; // initiate Uniq(m1)
let r = &foo; // freeze foo, and reset borrow stack
let m2 = {m1}; // activate Uniq(m1), initiate Uniq(m2) -- UB

Is this correct?


#8

This is a very interesting direction to go, and it seems quite promising to me!

One thing which strikes me about this is that by providing a clear definition of the tagging behaviour in an instrumented machine, this opens the door to a sanitizer, and that would be wonderful. :slight_smile:

One thing which isn’t clear to me: how do multiple Raw tags in the same stack work? I have several thoughts on how this might work but I am not at all confident that they are correct. :wink:


#9

This looks like a really nice model. I definitely like the general approach, especially the way that this takes the intptrcast model’s broadcast operation and makes it significantly more local and bounded through the creation of references. And making the broadcast be on casting from reference to raw pointers neatly sidesteps all the annoying questions about comparing pointers, since that requires casts to raw pointers.

For this purpose, we tag shared references with some kind of “timestamp” indicating when it was created.

This is definitely out of scope for now, but it seems like when generalized to multiple threads, this will want to be a vector or tree clock that interacts with the happens-before relationships generated by acquire-release pairs. Which seems like it would actually fit super well and possibly make a number of data races detectable, which is super nice.

So, when a &mut is cast to *mut like above, we instead push Raw onto the stack, indicating that any raw pointer may be used to access this location.

Is this also true when a &mut gets transmuted to *mut? I can see arguments for both sides. It would be nice to have a way to cast a &mut into *mut explicitly requesting that you not open the memory location up to modification through other raw pointers, for use in ways that will never be dereferences, such as in comparisons, hash consing, and memoization.

Shared references with interior mutability do not really have any restrictions in terms of what can happen to memory, so we treat them basically like raw pointers.

I’m a little bit confused as to what you mean by “treat them basically like raw pointers”. You’ve addressed in other comments that only the specific bytes that are within UnsafeCell are affected by this, but does that mean that Raw gets pushed when you create a reference to that location? Or just that the frozen flag is ignored?

I also think it might be worth having another marker specifically for this circumstance, i.e. when you create a shared reference to a location with interior mutability, you push a special Shared item onto the stack for the location. This would prevent other random raw pointers from the modifying or viewing area, but doesn’t impose any constraints on which reference is used to access the memory. This would allow transforming

fn foo(x: *mut i32, y: &Cell<i32>) {
    *x = 15;
    y.set(3);
    *x
}

into

fn foo(x: *mut i32, y: &Cell<i32>) {
    y.set(3);
    *x = 15;
    15
}

This might be totally useless in practice, though, since UnsafeCell is primarily used via the get method, which creates a raw pointer. I think to actually get a benefit we’d need to add an intrinsic for “reclaiming” a piece of memory, popping off the Raw and pushing a Shared.

“Time” here is a rather abstract notion, we really just need some counter that we bump up every time a new reference is created. This gives us a unique ID for each mutable reference – and, as we have seen, for shared references we actually exploit the fact that IDs are handed out in increasing order (so that we can test if a reference was created before or after a location was frozen).

I’m a bit concerned that making tags just timestamps makes it hard to reason about moving around the creation of references. There may be no way to actually compare the timestamps of references, but that doesn’t seem at all clear to me. Because of that, I don’t see that it would be valid to rewrite

fn bar<'a>(baz1: &'a mut u32, baz2: &'a mut u32) -> (&'a mut u32, &'a mut u32) {
    let p1 = &mut *baz1;
    let p2 = &mut *baz2;
    (p1, p2)
}

into

fn bar<'a>(baz1: &'a mut u32, baz2: &'a mut u32) -> (&'a mut u32, &'a mut u32) {
    let p2 = &mut *baz2;
    let p1 = &mut *baz1;
    (p1, p2)
}

which in my mind should absolutely be valid - I don’t feel like the relative time of creation of two pointers should matter.

Instead, would it work to tag pointers with a pair of a timestamp and just a fresh unique id? The timestamps remain usable for dealing with shared references, but when considering unique elements on the stack you just look at the id. These ids can be compared for equality, but aren’t ordered. Then, the only moments when the time gets stepped forward when memory gets frozen, not when any reference gets created. In the example above, p1 and p2 would have the same timestamp, and reordering them would be perfectly valid.

In fact, I think in that case it might be possible to record time on a per-memory-location basis, which would mean that most timestamps would be highly unordered, making various properties about aliasing stronger. Possibly.

To be clear, these are just random ideas I’m having that might fix what I’m seeing as somewhat complicated reasoning. I’m not sure my idea would actually simplify things; it just seems like it might.

We change the tags of the mutable references to the current clock value, bumping up the clock after every tag we assign, and then we push those new tags on top of the borrow stack.

So essentially there is an implicit reborrow of every pointer passed into a function? That actually feels very natural, and might actually remove special cases since Rust already allows you to reborrow the input to a function without any extra syntax.

(this can be a function argument, but also loading it from memory or getting it as the return value of some other function)

What happens when a pointer is loaded in many individual bytes and only reconstituted later? Does the retagging happen at reconstitution time?

The idea is to put a “barrier” into the stack of all function arguments when demo5 gets called, and to make it UB to pop that barrier from the stack before demo5 returns.

Does this, in analogy to retagging, also happen when a pointer is first moved into a function through other means (reading memory, etc.)?

A nice side-effect of barriers in combination with renumbering is that even if demo4 from the previous subsection would not use its arguments at all, it would still be UB to call it with two aliasing references:

Initially, I was definitely against this being a “nice” side-effect. However, if we want it to be possible to introduce unnecessary writes to some &mut pointers (which I do for the sake of lifting writes out of loops that might have 0 iterations) and reorder accesses to &mut pointers (which I really do), it has to be UB to just pass two &mut references to an empty function:

fn baz(ref1: &mut u32, ref2: &mut u32) { }

turns into

fn baz(ref1: &mut u32, ref2: &mut u32) {
    let val1 = *ref1;
    *ref1 = 15;
    *ref1 = val1;
    let val2 = *ref2;
    *ref2 = val2;
}

turns into

fn baz(ref1: &mut u32, ref2: &mut u32) {
    let val1 = *ref1;
    *ref1 = 15;
    let val2 = *ref2;
    *ref1 = val1;
    *ref2 = val2;
}

which breaks on aliasing inputs. So I’m tentatively convinced although I was very skeptical of the validity-based model before. And of course being able to safely mark things as noalias is very important.


TL;DR I like this model a lot, at least on first and second read-through. It may want tweaking, but it has a lot of properties that I like.


On an irrelevant editorial note,

…while “Stacked Borrows” is (to some extend) “access”-based.

That should be “to some extent”, I believe.


#10

Thanks for the work!

Some questions:

  • It looks like threading could be worked into this by replacing simple timestamps with a partial order based on happens-before relationships for times between different threads; is something like that in the cards?

  • Does this imply that e.g. Rc will be less efficient than &T, since Rc contains a pointer rather than a ref as an implementation detail, and therefore the compiler must assume that someone could have mutated the object the Rc points to, even though it models an immutable reference? But I guess Rc could be changed to use a real ref instead of a pointer, so maybe that’s not a critical problem. Oh, but you can actually mutate the referent via get_mut(), so never mind.

  • How does this interact with transmutes, or other treatment of memory-as-bytes?

    E.g. right now rust guarantees that references and pointers have the same representation in memory, meaning you can build a data structure containing pointers and then transmute it to one containing references. (Or share a mmap-ed data structure containing references with another process, etc.) This is a really useful property; Rust’s general usefulness for working with memory-as-bytes is one of its key advantages right now, since standard C and C++ have given up this ability.

    A similar example: I have some code that deserializes an object graph, possibly containing cycles, into objects owned by an arena. The result is going to be read-only and shared between threads, so I want to use raw references (or something with equivalent performance) for inter-object references.

    If there’s a cycle, it’s inevitable that at least one reference in the graph must be created before its referent is finished deserialization. But under these rules, I’m not sure how that’s possible. Even if I try to launder the result via ref-pointer-ref conversion of a root pointer or whatever, the model still remembers when all the references in memory were originally created, and makes this UB.

    Hopefully the final model will enable cyclic reference graphs to be legally creatable somehow, even if it requires jumping through crazy hoops.

    Maybe this model can be tweaked to allow it? I think the problem is inherent in associating timestamps with references sitting in memory. What would happen if memory were associated with time-of-read or time-of-write, but references took their ‘createtime’ from the reference that was used to load them from memory?

Edit: reread the article and somehow I missed the details in 5.2 and misunderstood the proposal. It’s much more conservative than I thought at first, so I think my above cases are actually allowed. Now I have the opposite concern - when loading references from memory, it seems like some questionable things become legal:

let mut i = 5;
let p = &mut i as *mut i32;
let rr = &&i;
let y = **x;
unsafe {*p = 17};
let r = *rr;
assert!(*r == 17)

This seems like code that should be UB. The compiler should be free to assume that any chain of immutable references is just as frozen as a simple reference. But as I read the proposal, the timestamp of r is its time of creation (after the write to p), so this is actually legal:

Any time a reference is passed to us from “the outside world” (as function argument, loaded from memory, or returned from a callee; also below structs or enums but not below unions or pointer indirectons), we retag.

Or is this what the ‘pointer indirections’ exception refers to, so in that case the loaded reference copies the timestamp of its ancestor?


#11

However, in my interpretation, the reason it complains is that if it accepted the program, we would have UB in safe code! […] One important property, then, is that if the program has UB and does not use any unsafe code, the borrow checker must detect this.

I think this observation is particularly interesting because it provides a way to answer questions that have been coming up about what the borrow checker should actually allow. For example,

There are some things I am concerned about with NLL that I’d like to resolve before we make any final decision regarding stabilization. These are cases where NLL perhaps accepts “too much”. One such item has to do with the use of liveness, and in particular the implications of disjoint liveness ranges.
~ https://github.com/rust-lang/rust/issues/43234#issuecomment-410720455

Though of course it doesn’t help with “is this something we’re willing to promise forevermore to understand as being in the safe subset”.

I wonder if there’s a way to phrase the borrow checker as something that tries to constant-fold all the borrow stack operations…


#12

Looks good.

I’m somewhat concerned about how barriers interact with cases where lifetime disjointness is enforced by dynamic checks. Let’s take your demo code, which you explained would be UB if x and y point to the same location:

fn demo5(x: &mut i32, y: usize) -> i32 {
  *x = 42;
  foo(y);
}

fn foo(y: usize) {
  let y = unsafe { &mut *(y as *mut i32) };
  *y = 7;
}

But then change &mut i32 to RefMut<i32>, and usize to &&RefCell<i32>:

use std::cell::{RefCell, RefMut};
use std::mem::drop;

fn demo5x(mut x: RefMut<i32>, y: &&RefCell<i32>) {
  *x = 42;
  drop(x);
  foo(y);
}

fn foo(y: &&RefCell<i32>) {
  *y.borrow_mut() = 7;
}

fn main() {
    let x = RefCell::new(42);
    demo5x(x.borrow_mut(), &&x);
}

This is now entirely safe code. But from your borrow model’s perspective, it ought to be equivalent:

  • RefMut<T> contains &mut T as a field.
  • &RefCell<T> is a shared reference with interior mutability, which in your model behaves like a raw pointer (using a Raw item). The double reference (&&RefCell<T>) is so that retagging doesn’t take effect, i.e. neither demo5x nor foo will push another Raw on entry.
  • I had to explicitly drop the RefMut, but that shouldn’t affect anything; it only writes to the RefCell's state field, not the value itself.

So this should also be UB. Am I mistaken?


#13

What do you mean by “global problem”?

There is still one key difference but it falls out of the desugaring: If you use the same mutable reference multiple times, that asserts that there has been no other access to these locations in the mean time. However, using a raw pointer multiple times creates a new reference each time, and hence does not assert such exclusive access.

No that should not be UB – activating Uniq(m1) in the last line will unfreeze the location.

That is, in fact, the very next step in my roadmap. :slight_smile:

Several Raw tags right next to each other are redundant.

However, when there are Uniq tags between them, they do make a difference. Imagine we have a stack Raw, Uniq(x), Raw, Uniq(y). If we now use a raw pointer on this location, that’s fine. Next, we use x, which will pop the first Raw off the stack. If we now use a raw pointer again, that will pop Uniq(x)! So from now on, using x is not fine any more.

This situation can happen when you start with y, create a raw pointer from that, from that raw ptr create x, and then from that create another raw pointer. Using x again invalidates the “later” raw pointer; then using a raw pointer again invalidates x because this means you must be using the raw pointer that was created from y. Of course, raw pointers are not tagged so we cannot know where you got it from; we will always assume you are using the “latest” raw pointer that is still in the stack. (This allows most code.)


#14

Oh, right. But then, the unfreezing is an observable effect of the move? Extending the above example:

let m1 = &mut foo;
let r = &foo;
let m2 = {m1};
*r; // UB if previous line unfreezes foo

#15

Ah, that’s something I had not even though about, but you are right. :slight_smile: Of course, we still have to deal with LLVM’s model here, so pointer comparison is still a very subtle topic.

Hm, interesting. My expectation was to use a single global clock. If you have an operational model of concurrency, there still is a total order of all operations, it is just not observable by the program. So I would have used that order.

I would have to think more about whether there is any difference between these approaches. But, as you said, this is definitely not a priority at this point. :slight_smile:

Oh, transmutes. Good question. We cannot really do anything on transmutes because it is not, in general, possible to even detect when they happen. So nothing happens on the stack for a transmute.

My first inclination is to say that such a transmute is UB: The transmute would create a raw pointer with a tag, which is not a valid raw pointer (the tag usually gets stripped when casting &mut to *mut). That would sidestep the question. However, I am sure I will be told that people expect such transmutes to work. In the model as defined currently, the transmute would give you a raw pointer that may not be used, as there is no Raw on the stack. You have to explicitly say as *mut _ to inform the compiler about what is happening. I am not sure if that is what people expect.

Yes, Raw gets pushed if the location is not already frozen. (That can happen, e.g. when a &RefCell<T> and an &T alias.) The Borrow for a location covered by UnsafeCell below a shared reference is Mut(Raw).

I think your transformation is illegal, and my model deliberately conflates raw pointers and shared references with interior mutability. Mostly this makes the model simpler so I have to consider fewer combinations of things. :wink:

For example, we might want to allow FFI code which takes a &Cell<i32> and passes it to C as int*. The C accesses would all be raw pointer accesses.

I totally did not think about this, but that is a good point. I have two answers: First of all, we are defining something more like a surface language semantics here. So I am less concerned about making everything reorderable. Actually doing so is also really hard. LLVM can still do these reorderings later because it does not have the stack or timestamps.

Secondly, for Uniq(_), the tag is only ever compared for equality. They happen to be ordered but it really does not matter. So actually I think it is not hard to argue that your two programs are equivalent. Things get slightly more complicated with freezing.

Yes. But not just pointers passed to functions, also pointers returned to us from other functions and pointers we load from memory.

Oh my, you have a nasty mind. :wink: When it is loaded as individual bytes, it is loaded as a bunch of u8. Then you might be calling transmute, which in this case is a function returning an &mut – so retagging would happen because you got a reference returned from a function.

No. That wouldn’t work, unfortunately. The barrier reflects the fact that the lifetime of a reference you get as function input will necessarily outlive the current function call. References you receive during the function execution could be living shorter, so adding a barrier would not be correct.

Lifting writes out is not usually possible (you cannot just insert a write, that would change what the rest of the code can observe about this location!). But we certainly want to be able to lift reads out of loops.

I also think that reordering accesses does not need barriers; retagging is enough.

That is the biggest argument for me, TBH. I am not even sure what exactly the rules are with noalias, but my hope is that with the barriers we are on the safe side.


This is a very interesting question, and it comes right back to what I just said above about transmutes. I did expect that someone would tell me these transmutes are needed, I did not expect that that had already happened. :wink: Could you maybe write a small Rust-only test case demonstrating what you think should be possible?

An alternative to transmutes, at least in Rust, might be to use uninitialized memory. But that may not work: We have to create one shared ref first, which should freeze whatever it points at, but that “whatever” will then later get mutated to close the cycle. I do not think it is possible to create a cycle of & this way (if there is no UnsafeCell).

I assume you meant let y = **rr?

In that case, this is not legal. When let r = *rr; happens, we are loading an &i32. That is covered by the clause “whenever a reference is loaded”, so we proceed with retagging. We first activate the reference we loaded. That reference was created further up when &&i happened, and it kept its timestamp while sitting in memory. Activating it, however, will cause UB because the location got unfrozen when you wrote to it using *p.

No, timestamps are never copied. Only freshly assigned during retagging.


Btw, I am extremely happy about all the feedback, questions and concerns I am getting here. If I had to describe a best-case scenario for how a discussion about this blog post would look like, this htread would be it. :heart:


#16

Wow! Thanks for this example. I love it. :slight_smile: (And nice hack with the &&RefCell to trick retagging.)

However, I think what this actually shows is that RefMut is wrong. This is something I have been wondering about for quite some time.

The reason RefMut is wrong is that it contains an &mut with a lifetime which, in this case, outlives the function call – but the exclusive access of this reference does not always outlive the function call, as you showed.

Hm, yes. You are “touching” m1 and hence invalidating all references that were created after m1.


#17

do you propose this should compile? https://play.rust-lang.org/?gist=e1fcf5b59f631a9a1da00beff522997b&version=stable&mode=debug&edition=2015

I don’t want that to compile (unless an “implicit block” is inserted by the compiler between let b and let d, but I think that’d be surprising).


#18

@Soni this already compiles with NLL: https://play.rust-lang.org/?gist=cdea8cd3ec8d1f33e66a16580bda52cb&version=nightly&mode=debug&edition=2015. And that is fully intentional. No longer tying lifetimes to variable scopes is the exact point of NLL. This has nothing directly to do with my proposal, which is about unsafe code.


#19

okay, so NLLs are implicit blocks. (e.g. add let e = b.do_thing(); after a.not_mut())

this is the context: https://cybre.tech/SoniEx2/rust.hexchat.hexchat-plugin/src/branch/master/src/lib.rs#L20-L58 (trying to implement hexchat_list_*), which is about unsafe code.


#20

There will certainly be such transmutes in the wild, but I don’t think they are justified. If you know you have a &mut and want a *mut, casting it is superior to transmuting in every conceivable way.

A slightly more interesting case is if you have a memory region that stores pointers, and is accessed by code that needs raw pointers (e.g., it’s a data structure shared with C code), but some other code wants to view the memory as containing references. However, in that scenario it seems acceptable to me to require that all code agrees that the memory contains raw pointers, and let the code that wants to deal with references explicitly cast between raw pointers and references when loading and storing (and place any lifetime parameters in PhantomData).