Recently, I have significantly updated Stacked Borrows in order to fix some issues with the handling of shared references that were uncovered in the previous version. In this post, I will describe what the new version looks like and how it differs from Stacked Borrows 1. I assume some familiarity with the prior version and will not explain everything from scratch.
I like the greater orthogonality of this version.
Regarding the potential issue with LLVM... I could be misunderstanding, but I don't think it's actually a problem. Quoting your example for reference:
fn main() { unsafe {
let x = &mut 0;
let raw1 = x as *mut _;
// Stack: [ (x: Unique), (raw1: SharedReadWrite) ]
let tmp = &mut *raw1;
let raw2 = tmp as *mut _;
// Stack: [ (x: Unique), (raw1: SharedReadWrite), (tmp: Unique), (raw2: SharedReadWrite) ]
*raw1 = 1; // This will invalidate tmp, but not raw2.
// Stack: [ (x: Unique), (raw1: SharedReadWrite), (raw2: SharedReadWrite) ]
let _val = *raw2;
} }
What differentiates it from a simpler program such as this?
let x: &mut i32 = &mut 0;
let y: &mut i32 = x; // unique pointer
*y = 2;
*x; // access not derived from y!
I think your answer would be that raw2
is 'based on' a unique pointer which itself was not supposed to alias with raw1
, therefore LLVM might assume that raw2
doesn't alias with raw1
.
In reality, though, being 'based on' a noalias pointer doesn't confer restrictions; rather it confers permission to access data when that's otherwise forbidden within a given scope. That's how C restrict
is defined in the spec, where the scope is the block containing the declaration marked restrict
:
During each execution of
B
, letL
be any lvalue that has&L
based onP
. IfL
is used to access the value of the objectX
that it designates, andX
is also modified (by any means), then the following requirements apply:T
shall not be const-qualified. Every other lvalue used to access the value ofX
shall also have its address based onP
.
Once the execution of B
is over, there are no more requirements.
It's also how LLVM defines the function-parameter noalias
attribute, which is all that rustc currently uses, where the scope is the function execution:
This indicates that objects accessed via pointer values based on the argument or return value are not also accessed, during the execution of the function, via pointer values not based on the argument or return value.
LLVM also has noalias
/alias.scope
metadata, but that doesn't use the notion of "based on" at all, instead explicitly listing the applicable scopes. There's also the hypothetical third version of noalias that might be added someday, but I'd expect that to either follow C's definition or be strictly more flexible.
For Rust function arguments, scoping noalias to the entire function execution is justified in Stacked Borrows by the function protector. If rustc were to add some sort of noalias declaration for borrows within a function, the scope would have to end after the last access through the corresponding borrow or children thereof (really the earliest possible access that could be the last access, depending on control flow), since that's the point where Stacked Borrows would allow writes through other references. In your example, that would be *raw1 = 1;
, so the scope would be over by the time of let _val = *raw2;
, and the compiler couldn't assume *raw2
didn't alias anything.
I think that's really the point I was making -- the scope for the noalias
cannot be "until the last time this thing or anything derived from it is used", which in my example would be the use of raw2
at the end. I thought that's what the scope should be. I am not sure how easy it is to approximate either this scope or the one you are suggesting statically.
I see. Itās sort of like the dual of a lifetime: we want the last time a derived reference must be used rather than the last time it may be used. And correspondingly, imprecision in the analysis means unnecessarily shortening the scope rather than unnecessarily lengthening a lifetime. Like with nonlexical lifetimes, a āscopeā here can be a list of instructions rather than anything related to Rust lexical scopes, since LLVM noalias scopes are just tags you can add to load/store instructions.
As with lifetimes, raw pointer casts are ignored, but unlike them, transmutes arenātā¦
It sounds theoretically feasible for rustc to calculate such scopes, but I have no idea how expensive it would be or how much benefit it would bring.
Er, thatās not quite right. Raw pointers arenāt ignored; rather, a cast to a raw pointer has to āpauseā the scope until the next time a derived reference that didnāt go through the raw pointer must be used, at which point the Untagged
entry must be gone from the stack and so the scope can be resumed. Or as a conservative approximation, it could just end the scope rather than pausing it. But noncontiguous scopes should be fine from LLVMās perspective, if the compiler is able to calculate them.
Excellent writing. This stuff seems technical enough that it could easily become impenetrable, but somehow I [more or less] understand it.
One thing I don't understand is the motivating example, and I don't understand it both times it's discussed.
Basically, once a location is frozen we do not keep track of which pointer is allowed to read; we just allow all pointers to read.
This is in contrast to Stacked Borrows 1, where instead of
2: SharedReadOnly
we set a special āfrozenā marker on this location that would, as a side-effect, also grant untagged pointers read-only access.
But the example has:
unsafe { *y = 2; } // Hence, this legal in Stacked Borrows.
Which is a write, not a read!
One other thing I was curious about:
I think this is a nice improvement from Stacked Borrows 1, where shared references had a different kind of tag. In particular, transmutes between mutable references and shared pointers no longer need any kind of special treatment.
This indeed sounds like a nice improvement. What actually happens if we were to transmute an &Vec<&mut T>
to &Vec<&T>
, and then read through one of the shared references inside the Vec
? Presumably at that point, the borrow stack contains a permission for Unique
access, but we would be accessing it as SharedReadOnly
.
Thanks
Uh... you are right. I have to turn this into a read. oops Thanks for pointing this out!
(EDIT: The update is up.)
There is no such thing as "access as SharedReadOnly"
. Accesses are reads or writes. So, you'd be doing a read access with a Unique
permission, which is fine. If you create a reborrow, you'd be adding a SharedReadOnly
permission for the new pointer on top of the existing Unique
permission. But that's fine, future reads through the Unique
will keep the SharedReadOnly
on the stack.
So, I think this all works. But there might be weird edge-cases.
I made some small tweaks to Stacked Borrows 2, which I describe in my latest blog post.