Improve upon Stacked Borrows by introducing a tree

I made a Zulip thread for this

Interesting, how are you then facilitating interior mutability?

Right that makes sense, but the optimization is not currently used, why is that (maybe lacking a consistent memory model?)?

In XIR, are you allowed to simultaneously hold a pointer with dereference_write and a pointer with readshallow to the same location (and interleave writes and reads of both pointers)? If yes, then that is what my model tries to express.

  • SRO allows only reads, but may point to memory referenced by other pointers and references
  • SRW allows r/w, but may point to memory referenced by other pointers and references
  • SIR allows reads and guarantees that for the time that this tag exists no writes to the location occur
  • Unique allows r/w and guarantees that for the time that this tag exists all writes to the location occur through this pointer or any descendants

I maybe should add this at the very beginning of my document...

When you create an owned UnsafeCell<i32>, this produces a new object, and all of it's subobjects, including a mutable field. A &UnsafeCell<i32> is *readonly dereferenceable(4) aligned(4) ref const UnsafeCell::<int(32)>, and UnsafeCell::get returns a *int(32) to the mutable subobject inside the UnsafeCell<i32> because readonly has an inwards barrier at a mutable subobject, the limitation on writing posed by holding a readonly pointer doesn't extent to this subobject. When &mut i32->&UnsafeCell<i32> is used, you get a *readonly dereferenceable(4) aligned(4) ref const UnsafeCell::<int(32)> as well, but this time to an UnsafeCell::<int(32)> that was materialized in the storage pointed to by the &mut i32. The same thing is allowed for &i32, but the materialization is moot - because there's a readonly pointer to the actual i32/int(32), the mutable barrier of the field doesn't have any effect.

I believe its a limitation of llvm - llvm does not express noalias et. al as parts of the pointer type, but attributes on function parameters and return values (unlike SB, TB, or Xir's model, where it's part of the type).There are scoped versions of these attributes, but I believe that rustc does not currently make use of them. SB does justify the optimization, though, I believe.

Yes, provided it's not derived-from the readshallow pointer: [ptr.derive] lists the following wrt. write-required ranges and readshallow:

parent of the result means any pointer that the path from the root pointer to the result pointer in the current pointer graph passes through. readshallow thus gives llvm's readonly semantics to the pointer - it will never be used (even indirectly) to write to the object, but is permitted to observe writes to the object (unlike XIR's stronger readonly).

If I expressed it correctly, TB should also have exactly this behavior (from page 5 section 1.4.4):

There are only three legal ways to turn *const T into *mut T: pointer casting *const T to *mut T, if the original raw pointer that was created was *mut T (so round tripping *mut T*const T*mut T is allowed).

At the moment, I have actually not yet mentioned *const T <-> *mut T conversions, so going to add that as well.

Going to wait until this has been cleared up, as it seems weird and inconsistent.

There is aliasing between multiple SIR places:

let x = 0;
foo(&x, &x);

For pointer-to-int casts, as stated above, the current formulation is the generally accepted one with angelic-non-determinism, where iff there exists a provenance that can allow the program to not cause UB.

The actual, still somewhat theoretical algorithm I have for checking this relies heavily on the current structure of Stacked Borrows, so I'll have to figure out how it works with TB, but I don't think these casts will be much, if any of a blocker.

This is really cool, how are you handling unions and enums with UnsafeCell inside?

Yes these optimizations should be part of the SB paper, but I only saw them applied to references, not raw pointers, can anyone confirm that these should also apply to raw pointers?

Consider this example:

Poiner graph (ptr3 is about to be added):

┌──────────────────────────────┐
│    ptr1: readshallow, ...    │
└──────────────────────────────┘
  │
  │
  ▼
┌──────────────────────────────┐
│ root: dereference_write, ... │ ◀┐
└──────────────────────────────┘  │
┌⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯┐  │
⋮ ptr3: dereference_write, ... ⋮  │
└⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯⋯┘  │
  :                               │
  :                               │
  ▼                               │
┌──────────────────────────────┐  │
│    ptr2: readshallow, ...    │ ─┘
└──────────────────────────────┘

Does ptr2 get invalidatet? If yes, shouldnt this also mean that ptr3 is now invalid, or does the root inherit ptr3? ptr1 should remain unchanged.

Same way - the subobject relationship just becomes more complex.

Yes.

Also yes. Invalidating a pointer also invalidates all pointers derived from it (recursively). This causes undefined behaviour if it invalidates the resultant pointer of a derive expression and the resultant pointer has one of the valid-range attributes (otherwise, it's a valid operation and just yields an invalid pointer). [ptr.derive]#6, [ptr.derive]#7, [ptr.derive]#8.

In this case, the behaviour is undefined because ptr2 is invalidated by the derive operation that produces ptr3, which causes ptr3 itself to be invalidated, which is UB under [ptr.derive]#7/8 because ptr3 has the dereference_write attribute.

Agreed, although this solution generally gives me some chills, I am in favor of adding some more APIs to reduce the need to rely on this. I am not too familiar with the frequent patterns of exposing addresses (found xorlists to be the most extreme example yet, as it is combining the provenance of the two pointers). So more examples are welcome (then I can maybe find some more problems in TB).

I have no idea how your algorithm works, but I would imagine that each ptr->int cast exposes the address, so adds it to some global list (maybe storing capabilities and details of the allocated objetct) and when an int->ptr cast is encountered, the list is checked for any matches (considering the length of the allocated object). I have no idea what one would do when there are multiple matches, but i guess you have some magic that handles this.

I guess you know the type of the accessed union field at access time and can just conjure the right kind of pointer.

Ok this makes perfect sense, interesting that it also causes immediate UB.

Yep. XIR unions work a heck of a lot like C++ unions, except every variant is active simultaneously.

TB does not consider this as UB, but miri does:

fn main() { unsafe {
    let mut x_location = 0;
    let x = &mut x;
    let ptr = &mut *x as *mut i32;
    // PoI 1
    *x += 42;
    // PoI 2
    println!("{:?}", *ptr);
    // ^^^^ miri errors here: trying to reborrow <untagged> for SharedReadOnly
    // permission at alloc1283[0x0], but that tag does not exist in the borrow stack for this location
}}

Factoring out parts of this into separate functions, this would also be well-defined?

static mut STASH: *mut i32 = 0 as *mut i32;

fn func1(x: &mut i32) { unsafe {
    STASH = &mut *x as *mut i32;
}}
fn func2() { unsafe {
    println!("{:?}", *STASH);
}}

fn main() {
    let mut x_location = 0;
    let x = &mut x_location;
    func1(x);
    *x += 42;
    func2();
}

However, this implies that, if the optimizer sees main without seeing func1 or func2, it cannot optimize *x += 42; under the assumption that it has exclusive access to x_location, e.g. by moving the write across the call to func2, or by removing it altogether since the variable is never read after that point. IIRC, this type of optimization is one of the things Stacked Borrows originally wanted to allow.

Yes this is exactly what the SB paper was writing.

I am currently looking at the following code:

fn main() {
    unsafe {
        let mut x = 0;
        let ptr1 = addr_of_mut!(x);
        let ptr2 = addr_of_mut!(x);
        // PoI 1
        let y = &mut *ptr2;
        // PoI 2
        println!("{}", *y);
        println!("{}", *ptr1);
        println!("{}", *y);
        //             ^^ miri error
    }
}

This code fails under miri/SB, because the read from ptr1 invalidates y. I find this a bit surprising, considering that the same code, but with &*ptr2 as the value for y runs perfectly fine. In TB the tree would look like this:

Borrow Tree at PoI 1
┌────────────┐
│ ptr1 (SRW) │
└────────────┘
  │
  │
  ▼
┌────────────┐
│   x (U)    │
└────────────┘
  ▲
  │
  │
┌────────────┐
│ ptr2 (SRW) │
└────────────┘
Borrow Tree at PoI 2
┌────────────┐
│ ptr1 (SRO) │  ◀─ Changed from SRW
└────────────┘
  │
  │
  ▼
┌────────────┐
│   x (U)    │ ◀┐
└────────────┘  │
┌────────────┐  │
│   y (U)    │  │
└────────────┘  │
  │             │
  │             │
  ▼             │
┌────────────┐  │
│ ptr2 (SRW) │ ─┘
└────────────┘

After PoI 2 the Borrow Tree stays the same. But to have miri behavior, the read of ptr1 would need to invalidate y, but not ptr2.

Is there an optimization that could benefit from this? I was not able to come up with one, but there probably is a reason for this.

I don't have a properly formal argument, but you no longer have that &mut is unique. Consider

let mut place = 0;
let p1 = &raw mut place;
let p2 = &raw mut place;
// point 1
let r = &mut *p2;
// point 2
*r = 1;
// point 3
*p1;
// point 4
*r = 2;
// point 5

With Stacked Borrows, the execution looks roughly like

// point 1
place: ShrRW(p1, p2)
// point 2
place: ShrRW(p1, p2), Uniq(r)
// point 3, write at r
place: ShrRW(p1, p2), Uniq(r)
// point 4, read at p1
place: ShrRW(p1, p2)
// point 5, write at r, UB tag not on stack

If I understand what you're describing of Treed Borrows, the execution looks roughly like

// point 1
place: ShrRW(p1)
     \ ShrRW(p2)
// point 2
place: ShrR(p1)
     \ ShrRW(p2), Uniq(r)

// There are three possibilities here
// I don't know which one you want

// point 3, write at r
place: ShrRW(p2), Uniq(r)
//point 4, read at p1, UB tag not in tree

// Or

// point 3, write at r
place: ShrR(p1)
     \ ShrRW(p2), Uniq(r)
// point 4, read at p1
place: ShrR(p1)
     \ ShrRW(p2), Uniq(r)
// point 5, write at r
place: ShrR(p1)
     \ ShrRW(p2), Uniq(r)
// writes to r cannot be reordered/coalesced

// point 3, write at r
place: ShrR(p1)
     \ ShrRW(p2), Uniq(r)
// point 4, read at p1
place: ShrR(p1)
     \ ShrRW(p2)
// point 5, write at r, UB tag not in tree

The third execution option preserves that &mut is unique by removing all active Uniq when performing a non-Uniq read, but making p1 suddenly read-only doesn't make sense to me. p1 and p2 should behave the same because they're part of the same shared raw "pool" of access.

The second execution option means that writes to &mut cannot be reordered over calls to unknown code because they may have a pointer from a different branch of the borrow tree that they can use to observe writes. I almost didn't include that as a possible option because of said defect.

What is the diffference with SharedReadOnly then? That is the variant which does allow aliasing unlike SharedImmutRead according to the first table.

I am going to modify the code you used, because I am currently only looking at reads (when writing, I already understood, why invalidating all pointers is necessary to allow the optimizations):

let mut place = 0;
let p1 = &raw mut place;
let p2 = &raw mut place;
// point 1
let r = &mut *p2; // variation: &*p2
// point 2
let _ = &*r;
// point 3
let _ = &*p1;
// point 4
let _ = &*r;
// point 5

The variation with &*p2 produces no errors, I thought that &T and &mut T would behave the same when reading (&mut T only would also ensure, that no other &T/&mut T exist).

When considering the modified code, TB would look like this:

// point 1
place: ShrRW(p1)
     \ ShrRW(p2)
// point 2
place: ShrRO(p1)
     \ ShrRW(p2), Uniq(r)

// point 3, read at r
place: ShrRO(p1)
     \ ShrRW(p2), Uniq(r)
// point 4, read at p1
place: ShrRO(p1)
     \ ShrRW(p2), Uniq(r)
// point 5, read at r
place: ShrR(p1)
     \ ShrRW(p2), Uniq(r)

When writing to r then p1 would of course be invalidated, but when only reading, I do not see any reason to invalidate r when reading from p1.

Yes I guess the table at the beginning does not really reflect the true nature of the tags. At the moment, SHIR is designed to be used for &T and SHRO is designed for &UnsafeCell<T>.

So when you hold a SHIR, then you can be sure that the data behind the pointer will not be written to and that it is not aliased by any write-capable pointer. SHRO has no such guarantees.

I see, thanks!

This I think just isn't a realistic model to remove write permission from p1. Imho, there shouldn't be a difference between any of the following w.r.t. provenance:

p1 = &raw mut place;
p2 = &raw mut place;
r = &*p1;
p1 = &raw mut place;
p2 = &raw mut place;
r = &*p2;
p1 = &raw mut place;
p2 = p1;
r = &*p1;
p1 = &raw mut place;
p2 = p1;
r = &*p2;

The big difference between your model and SB is that your model doesn't allow writing through p1 after a Uniq tag is derived from p2.

This seems completely untenable. It needs to be possible to derive a Uniq reference from a ShrRW pointer (if you know access is in fact unique) without invalidating any access permissions of the other pointers (because access is unique). Also, it's very important that writing to that reference also doesn't invalidate other pointers below it in the stack/tree, only ones derived from it.

Stacked Borrows very deliberately treats all of the raw pointers at a given level uniformly as one blob. Treating the pointers differently is highly unexpected, especially invalidating (some kinds of access for) some of them.

Right that is wrong behavior and can be fixed, by 1. not revoking write access and 2. invalidating all Unq when a SRW is used to write a place.

It should definitely invalidate Unq in other branches:

let mut place = 0;
let ptr = &raw mut place;
let r1 = &mut *ptr;
let r2 = &mut *ptr;
*r1 = 3;
*r2 = 4; // error not in borrow stack/tree

I find it confusing to talk about below/above in the tree, so I would prefer ancestor/parent, descendant/child and orthogonal/other branch to express it unambiguously (in the previous quote, i was not sure what you meant by below in the tree).

Applying the change from above would then allow trees like this to be valid:

root (U)
\-- a (U)
    \-- b (U)
        |-- c (SRW)
        |   \-- d (SRW)
        |   \-- e (SRW)
        \-- f (SRW)
            \-- g (U)
                \-- h (U)

(before, on creation of g, {c, d, e} would be changed to SRO)

Thanks for catching this one (if trees like above are indeed desired, which i think they are)!