Tree Borrows explained

We note that we would need UnsafeCellMut to be covariant in T.

I wrote a blog post comparing Tree Borrows with Stacked Borrows in more detail. I hope it will be useful. :slight_smile:

14 Likes

The model Detect UB at runtime. Does that mean the code should actually be executed?

Which code? The model is a mathematical definition. Miri comes with an executable implementation of the model which can detect UB. If you use standard rustc, nothing like this executes because running the model there would cost too much performance.

The code is under scrutiny. For example, in the OP's blog

Tree Borrows is characterized by the fact that it executes at runtime and is access-based, as opposed to compile-time and scope-based. The code is executed, and the model tracks updates to the state of borrows after each access. Code that is never executed cannot produce UB.

Tree Borrows is in this sense more fine-grained than the Borrow Checker, which rejects some examples where it is very obvious that no aliasing actually occurs thanks to runtime conditional guards.

Example: faulty code is unreachable at runtime

//+ TB: NOT UB (aliasing code never executes)
//- Does not compile. error[E0499]: cannot borrow `*u` as mutable more than once at a time
fn unreachable_faulty(u: &mut u8) {
    if false {
        let x = &mut *u;
        let y = &mut *u;
        *x += 1;
        *y += 1;
    }
}

The code is unreachable in the runtime, thus it is not UB. It looks to me that the Tree Borrow needs to execute the code.

My question is "Is it like Google address/leak sanitizer and if the execution does not touch some code due to various conditions not met for a specific execution instance?"

Note in other execution instances, if those conditions are met, the bug (use-after-free, etc) is detected by the Google address sanitizer. To detect most of the bugs, it is required to run the code with GAS with a lot of tests. Will this apply to the Tree Borrow?

Tree Borrows isn't a program; it's a proposed set of rules that valid programs must follow. If adopted, the compiler is allowed to make optimization decisions on the assumption that the rules are obeyed without first verifying that fact.

This is basically the way all modern programming languages work, and is what allows optimizing compilers to exist: There are some side effects, like instruction timing, that programs are not allowed to rely on. That gives the compiler some room to reorganize the program to be more efficient without affecting the "important" properties, such as the output that the program produces for a given input.


This has nothing to do with tree borrows per se, as it doesn't involve any unsafe code. The compile error you see here is issued by the borrow checker is the compiler pass that's responsible for rejecting safe code that might violate the semantics rules. Because it's impossible to verify the real rules precisely, the rules that the borrow checker are must be stricter than either tree borrows or stacked borrows.

Rust's unsafe operations exist to let you write code that falls into this gap between the two rulesets, that are correct but can't be automatically checked by the compiler. But the compiler will still optimize on the assumption that all of the tree/stacked borrow rules were followed— If a particular execution path ends up violating these rules, it can be miscompiled either now or in the future and behave erratically.

1 Like

I understand the confusion, so I added a note in the intro of the post to clarify.

It's in part my fault, in the explanations I overloaded the meaning of "Tree Borrows" to mean either

  1. the Tree Borrows model, which defines the aliasing requirements (input: a (posibly infinite) set of execution paths, output: the subset of those paths that contain UB);
  2. the current implementation in Miri of a program that detects violations of the Tree Borrows rules (input: one execution path, output: a boolean for whether it contains UB).

(1) doesn't execute anything, it tells you the properties that you can assume of anything that executes, while (2) does execute code, one path at a time. Miri uses (2) and thus executes code, while the compiler would use (1) and not execute anything.

1 Like

Context: I've been implementing a tree borrows model on top of creusot.

Everything prior to protected is of the form "you can do [thing] to reference if [condition depending on reference state] and it modifies other references like [this]". Protected adds new conditions of the form "and other references aren't in [these protected states]".

This isn't necessarily a problem, but seems a bit strange (and for me inconvenient). I'm curious if a variation of the rules where protecting a pointer directly modifies the state associated with other pointers, instead of forbidding operations at a distance, was considered? Or if there is a known equivalent phrasing of the rules that does that?

Everything prior to protected is of the form "you can do [thing] to reference if [condition depending on reference state] and it modifies other references like [this]". Protected adds new conditions of the form "and other references aren't in [these protected states]".

First off, note that [condition depending on reference state] is actually [condition depending on the state of the reference and all its ancestors]. This is important especially when interior mutability breaks the guarantee that permissions closer to the leaves are bigger (in the sense Reserved < Active < Frozen < Disabled)

Otherwise that's a great point, which until now was expressed in my thoughts as "protectors are the only source of foreign-access-triggered UB". Meanwhile insufficient permissions are always child-access-triggered, and these observations happen to be hardcoded in miri's error messages.

Side note: a similar phenomenon was already present in Stacked Borrows, with most UB occuring when you can't find the tag, with the exception of protectors that create UB when you pop the tag.

What I think could work is instead of having the protector on the tag, store it instead on every foreign tag. That is, add to each state two fields: each tag has on each location (perm: Permission, foreign_const: nat, foreign_unique: nat).

Disclaimer: I haven't spend very long thinking about this reformulation of the model yet, there may be glaring flaws that I haven't noticed. If from a quick look you feel that this approach has a chance of working out better for your purposes, I can spend more time writing down the exact transition system.

You can interpret foreign_const as "how many tags currently exist that are protected and would be invalidated by a write access", and foreign_unique as "how many tags currently exist that are protected and would be invalidated by a read access. And we're going to use them exactly as you can expect: "nonzero foreign_const" means "a write access would invalidate at least one protected tag" and will be UB, and these two counters will be incremented and decremented on function entry/exit.

  • On the first read access to some location of a protected tag (which occurs at the moment of reborrow = function entry in most cases, and may occur later if we have extern types or do accesses out of bounds), increment by 1 the foreign_const on that location for every non-parent tag.
  • On the first write access to some location of a protected tag, increment by 1 the foreign_unique on that location for every non-parent tag.
  • On function exit, for each protected argument
    • on every location that it was read from, decrement by 1 the foreign_const of every non-parent tag,
    • on every location that it was written to, decrement by 1 the foreign_unique of every non-parent tag.

Apart from that, that's all that changes for handling of foreign accesses. The Active -[foreign read]-> Frozen and other stuff stays the same. Beware of this subtlety:

  • transitions that cause UB based on the presence of a protector no longer do so, and there is no longer a distinction between [protected]Active -[foreign read]-> UB and Active -[foreign read]-> Frozen;
  • however transitions that were modified by the presence of a protector but in a way that does not cause immediate UB are not reverted: we still need the distinction [protected]Reserved -{foreign read}-> [protected]Frozen.

It is already the case that foreign access UB does not exist except coming from protectors, and now truly without exceptions foreign accesses never cause UB.

So in exchange there will be more UB on accesses through the tag itself:

  • it is UB to write directly through any tag that has nonzero foreign_const
  • it is UB to read directly through any tag that has nonzero foreign_unique

Huge emphasis on "directly": this does not apply if the access is a strict child access.

On a reborrow, you can (almost) inherit the direct parent's count: the number of non-parent protected tags differs by at most one (occurs when the tag itself is protected).

This reformulation of the rules has at least a few nasty side-effects:

  • the state machine is now infinite (though it's a rather tame infinity, it's only nat and it's quite uniform),
  • it's harder to track the error back to its source (I guess it could help to store foreign_const as a Set<Tag> rather than a nat),
  • more big tree traversals that cannot be trimmed at all

but if these are not a problem for creusot then I guess the uniformity is a plus.

TL;DR: if you can't have UB triggered by the permissions of other tags, you can make "whether there exist any other tags that would trigger UB" part of the permission of the tag itself.

That is fully intentional. Except for protectors, it is always okay to invalidate a reference that is never used again (and whose children are not used again, either). The entire point of protectors is to have references where it is UB to invalidate them even if they are not used again.

1 Like

Thanks for the detailed reply :slight_smile:

I haven't got to interior mutability yet (that's next on the list after this) and was taking the shortcut of knowing that parents have strictly more permissions, so I'll probably be running into the issue that leaves don't always have more permission shortly. Thanks for the headsup - probably won't be too hard to fix.

I think I can get the details of the transition rules you're suggesting right without creating more work for you. I'll report back here once I have everything implemented with documentation of precisely how I modeled things.

I should say that at this point it's really just my not-very-well-developed intuition suggesting that creusot will have more luck actually finding proofs if I avoid the foreign-access-triggered UB. Meanwhile I don't think "Infinite" state machines and big tree traversals should pose much of a problem at all.