Stacked Borrows 2

Recently, I have significantly updated Stacked Borrows in order to fix some issues with the handling of shared references that were uncovered in the previous version. In this post, I will describe what the new version looks like and how it differs from Stacked Borrows 1. I assume some familiarity with the prior version and will not explain everything from scratch.

https://www.ralfj.de/blog/2019/04/30/stacked-borrows-2.html

17 Likes

I like the greater orthogonality of this version.

Regarding the potential issue with LLVM... I could be misunderstanding, but I don't think it's actually a problem. Quoting your example for reference:

fn main() { unsafe {
  let x = &mut 0;
  let raw1 = x as *mut _;
  // Stack: [ (x: Unique), (raw1: SharedReadWrite) ]

  let tmp = &mut *raw1;
  let raw2 = tmp as *mut _;
  // Stack: [ (x: Unique), (raw1: SharedReadWrite), (tmp: Unique), (raw2: SharedReadWrite) ]

  *raw1 = 1; // This will invalidate tmp, but not raw2.
  // Stack: [ (x: Unique), (raw1: SharedReadWrite), (raw2: SharedReadWrite) ]

  let _val = *raw2;
} }

What differentiates it from a simpler program such as this?

let x: &mut i32 = &mut 0;
let y: &mut i32 = x; // unique pointer
*y = 2;
*x; // access not derived from y!

I think your answer would be that raw2 is 'based on' a unique pointer which itself was not supposed to alias with raw1, therefore LLVM might assume that raw2 doesn't alias with raw1.

In reality, though, being 'based on' a noalias pointer doesn't confer restrictions; rather it confers permission to access data when that's otherwise forbidden within a given scope. That's how C restrict is defined in the spec, where the scope is the block containing the declaration marked restrict:

During each execution of B, let L be any lvalue that has &L based on P. If L is used to access the value of the object X that it designates, and X is also modified (by any means), then the following requirements apply: T shall not be const-qualified. Every other lvalue used to access the value of X shall also have its address based on P.

- C11 6.7.3.1 p4

Once the execution of B is over, there are no more requirements.

It's also how LLVM defines the function-parameter noalias attribute, which is all that rustc currently uses, where the scope is the function execution:

This indicates that objects accessed via pointer values based on the argument or return value are not also accessed, during the execution of the function, via pointer values not based on the argument or return value.

- LLVM lang ref

LLVM also has noalias/alias.scope metadata, but that doesn't use the notion of "based on" at all, instead explicitly listing the applicable scopes. There's also the hypothetical third version of noalias that might be added someday, but I'd expect that to either follow C's definition or be strictly more flexible.

For Rust function arguments, scoping noalias to the entire function execution is justified in Stacked Borrows by the function protector. If rustc were to add some sort of noalias declaration for borrows within a function, the scope would have to end after the last access through the corresponding borrow or children thereof (really the earliest possible access that could be the last access, depending on control flow), since that's the point where Stacked Borrows would allow writes through other references. In your example, that would be *raw1 = 1;, so the scope would be over by the time of let _val = *raw2;, and the compiler couldn't assume *raw2 didn't alias anything.

2 Likes

I think that's really the point I was making -- the scope for the noalias cannot be "until the last time this thing or anything derived from it is used", which in my example would be the use of raw2 at the end. I thought that's what the scope should be. I am not sure how easy it is to approximate either this scope or the one you are suggesting statically.

I see. Itā€™s sort of like the dual of a lifetime: we want the last time a derived reference must be used rather than the last time it may be used. And correspondingly, imprecision in the analysis means unnecessarily shortening the scope rather than unnecessarily lengthening a lifetime. Like with nonlexical lifetimes, a ā€œscopeā€ here can be a list of instructions rather than anything related to Rust lexical scopes, since LLVM noalias scopes are just tags you can add to load/store instructions.

As with lifetimes, raw pointer casts are ignored, but unlike them, transmutes arenā€™tā€¦

It sounds theoretically feasible for rustc to calculate such scopes, but I have no idea how expensive it would be or how much benefit it would bring.

Er, thatā€™s not quite right. Raw pointers arenā€™t ignored; rather, a cast to a raw pointer has to ā€˜pauseā€™ the scope until the next time a derived reference that didnā€™t go through the raw pointer must be used, at which point the Untagged entry must be gone from the stack and so the scope can be resumed. Or as a conservative approximation, it could just end the scope rather than pausing it. But noncontiguous scopes should be fine from LLVMā€™s perspective, if the compiler is able to calculate them.

Excellent writing. :slight_smile: This stuff seems technical enough that it could easily become impenetrable, but somehow I [more or less] understand it.

One thing I don't understand is the motivating example, and I don't understand it both times it's discussed.

Basically, once a location is frozen we do not keep track of which pointer is allowed to read; we just allow all pointers to read.

This is in contrast to Stacked Borrows 1, where instead of 2: SharedReadOnly we set a special ā€œfrozenā€ marker on this location that would, as a side-effect, also grant untagged pointers read-only access.

But the example has:

unsafe { *y = 2; } // Hence, this legal in Stacked Borrows.

Which is a write, not a read!

One other thing I was curious about:

I think this is a nice improvement from Stacked Borrows 1, where shared references had a different kind of tag. In particular, transmutes between mutable references and shared pointers no longer need any kind of special treatment.

This indeed sounds like a nice improvement. What actually happens if we were to transmute an &Vec<&mut T> to &Vec<&T>, and then read through one of the shared references inside the Vec? Presumably at that point, the borrow stack contains a permission for Unique access, but we would be accessing it as SharedReadOnly.

1 Like

Thanks :slight_smile:

Uh... you are right. I have to turn this into a read. oops Thanks for pointing this out!

(EDIT: The update is up.)

There is no such thing as "access as SharedReadOnly". Accesses are reads or writes. So, you'd be doing a read access with a Unique permission, which is fine. If you create a reborrow, you'd be adding a SharedReadOnly permission for the new pointer on top of the existing Unique permission. But that's fine, future reads through the Unique will keep the SharedReadOnly on the stack. So, I think this all works. But there might be weird edge-cases.

2 Likes

I made some small tweaks to Stacked Borrows 2, which I describe in my latest blog post.

5 Likes