Blog post: An alias-based formulation of the borrow checker

New post:

http://smallcultfollowing.com/babysteps/blog/2018/04/27/an-alias-based-formulation-of-the-borrow-checker/

Ever since the Rust All Hands, I’ve been experimenting with an alternative formulation of the Rust borrow checker. The goal is to find a formulation that overcomes some shortcomings of the current proposal while hopefully also being faster to compute. I have implemented a prototype for this analysis. It passes the full NLL test suite and also handles a few cases – such as #47680 – that the current NLL analysis cannot handle. However, the performance has a long way to go (it is currently slower than the existing analysis). That said, I haven’t even begun to optimize yet, and I know I am doing some naive and inefficient things that can definitely be done better; so I am still optimistic we’ll be able to make big strides there.

I'm creating this thread for conversation.

30 Likes

Neat post.


We have traditionally called this an outlives relationship, but I am going to call it a subset relationship instead, as befits the new meaning of regions.

:tada: I think it is really helpful to do away with the outlives terminology. I've always found it confusing to use terminology that suggests 'a <: 'a does not apply.

1 Like

I love this new formulation. The idea that a borrow ends its scope precisely at the point where you do something the borrow would not allow you to do serves not only as a remarkably simple formulation, but also as one that is easy to reason about and explain.

6 Likes

Nice post! I’m trying to work through it.

It looks like the main idea is that, by keeping track of relationships between regions instead of the exact lifetime of each region, we don’t have to keep track of dead regions, which would otherwise conflate different “initializations”.

I disagree with describing this proposal as "avoiding SSA". From my reading of it, it keeps track of a different region graph for every CFG point, which is basically a full CPS transformation - and SSA is supposed to be an optimized version of CPS.

Forgotten lifetimes - region inference as quantifier elimination

I should really finish writing this, but I want to get this post out and I don’t have an infinite amount of time.

If we allow ourselves to take a full “CFG -> CPS transformation”, and do region inference on that, then our code becomes a bunch of lambdas wrapped around a fixed-point combinator:

    start/1
        x = use() ; -> start/2
    start/2
        y = use() ; -> start/3
    start/3
        list = &'b1 mut x ; -> start/4 'list
    start/4 'list
        v = &'b2 mut (*list).value ; -> start/5 'list 'v
    start/5 'list 'v
        list = &'b3 mut y ; -> start/6 'list 'v
    start/6 'list 'v
        use(v) ; -> start/7 'list 
    start/7 'list
        use(list);

In order to borrow-check the program, each CPS lambda needs to know the relationship between the regions that enter it and the loans that it must not modify, and region inference is basically propagating these constraints across the CFG.

Now, the regions that enter each CPS lambda have no direct relationship to the regions that exit it - the only relationship is through the loans and the stack trace. Therefore, if we don’t want, at each CFG point, to keep track of every region from every CFG point - and this is an infinite set of regions if the CFG contains a loop - we need to “forget” lifetimes that are going to be dead.

This “forgetting” of region variables is basically quantifier elimination - the caller knows that the conditions on the “external” regions hold for some set of “internal” regions, and it wants to eliminate the internal regions.

I’ll note that closures need the same thing when they want to “forget” about a region. In that case, we don’t have a complete solution. This confused me, at least - I stopped looking for a solution along these lines for non-closure inference, because I thought it would have to be incomplete.

However, the difference between these cases is:

  1. In the closure case, we want, for our given φ representing the conditions on the closure, a ψ such that ψ('lts) ≡ ∀'x φ('x, 'lts), which is not trivial to do in a complete way.
  2. In the non-closure case, we want, for our given φ representing the conditions on the “abandonment point”, a ψ such that ψ('lts) ≡ ∃'x φ('x, 'lts). This is equivalent to just “cutting out” 'x from the region graph, and definitely doable.
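To make case 2 concrete (a toy instance of my own, not from the post): if the internal region 'x appears only in the constraints 'a ⊆ 'x and 'x ⊆ 'b, then

    ∃'x. ('a ⊆ 'x ∧ 'x ⊆ 'b) ≡ 'a ⊆ 'b

so eliminating 'x amounts to splicing it out of the subset graph, connecting each of its predecessors directly to each of its successors.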

Now, to see what I mean, let’s see some code:

Loops, play-by-play

So, to give names to the lifetimes in the loop example:

fn main() {
    let mut base = &'base mut Thing;

    loop {
        // no relation between 'base & 't
        let t = &'t mut *base; // 'base: 't
        let c: Option<&'t mut Thing> = maybe_next({t});
        match c {
            Some(v) => {
                temp = v;
                // 't: 'base ('t `includes` 'base)
            }
            None => {
                // 't ends here
            }
        }
    }
}

Here, crucially, this code is sound because a 't region from this iteration of the loop can’t be alive at the None arm - and therefore, the borrow of base can’t go through that arm.

However, a 't region from a previous iteration of this loop can certainly be alive here (e.g. if we go through the Some arm and then the None arm), so some 't region can be alive here. The current NLL code treats all the regions with the same origin as though they were the same region, and therefore ends up with an error.

https://github.com/nikomatsakis/borrowck/issues/18, Play-by-play

I think this works:

struct List<T> { // `T` is covariant (written `List<+>` in my shorthand)
  value: T,
}

let x: List<()>;
let y: List<()>;
let list: &'list mut List<()>;
let v: &'v mut ();

block START {
    // ()
    x = use();
    y = use();
    //
    list = &'b1 mut x;
    // 'v: 'b1, 'b1: 'list
    v = &'b2 mut (*list).value;
    // 'v: 'b1: 'list: 'b2: 'v
    list = &'b3 mut y;
    // We forget the old 'list in the assignment
    // 'v: 'b1: 'b2: 'v; 'b3: 'list
    use(v);
    // nothing here keeps 'v alive.
    use(list);
}
2 Likes

Random thought from someone who doesn’t quite get the theory part of it yet:

Traditionally with lexical lifetime scopes and the outlives relationship, it was never possible to unify lifetimes (or at least I remember seeing that said). Now, where 'a: 'b means 'a <: 'b, or that 'a's requirements are a (maybe improper) subset of 'b's, we trivially have that 'a <: 'a. Does this mean that it would be possible to assert two distinct lifetimes have the same set of requirements with 'a <: 'b, 'b <: 'a? Is that even a useful thing to assert?
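For concreteness, this is the kind of assertion I mean (a contrived sketch of my own; it compiles only because each lifetime subsumes the other):

fn same<'a, 'b>(x: &'a u32, y: &'b u32) -> (&'b u32, &'a u32)
where
    'a: 'b, // 'a's requirements are a subset of 'b's...
    'b: 'a, // ...and vice versa, so the two sets are equal
{
    (x, y) // each reference coerces to the other lifetime
}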

Personally, when working with lifetime parameters, thinking about them as “death notes” for data gets me past borrowck most of the time. So, when I’m given some data with lifetime 'data, any further derived information tied to that lifetime dies when 'data becomes invalidated. I like that this model for parametric lifetimes apparently works locally as well under this model.

I believe we can get away with a simpler renaming. My plan was to say that r @ P and r @ Q are the same variable iff r is live on entry to Q. This naturally "coalesces" variables across the control-flow graph, but it does so more aggressively than SSA would, since e.g. here there is only one variable:

let p: &i32;
if foo {
    p = &x;
} else {
    p = &y;
}
read(p);

We can afford to be less precise than an SSA renaming because we are tracking the subset and requires relations per control-flow graph node. However, doing some amount of renaming does allow us to avoid the transitive closure and region liveness considerations, which should speed up the implementation considerably.
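Concretely, I'm picturing per-node state along these lines (a sketch only -- hypothetical names, not the prototype's actual representation):

use std::collections::HashSet;

type Region = u32; // region variable index
type Loan = u32;   // loan (borrow) index

struct NodeState {
    // subset('a, 'b): the loans of 'a are a subset of those of 'b here.
    subset: HashSet<(Region, Region)>,
    // requires('a, L): region 'a contains loan L at this node.
    requires: HashSet<(Region, Loan)>,
}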

You might think that using SSA would allow us to not track the other relations per node, but that would lose precision. The usual example (hat tip to @zwarich) is vec-push-ref:

let vec: Vec<&'0 i32> = Vec::new();
let p: &'1 i32 = &x;
if foo {
    vec.push(p);
} else {
    x += 1;
}
read(vec);

Here, neither vec nor p is re-assigned, so SSA would create only one set of region variables, and yet we wish to know that p is not live in the else branch (even though vec is!).

However, the performance has a long way to go (it is currently slower than the existing analysis). That said, I haven’t even begun to optimize yet, and I know I am doing some naive and inefficient things that can definitely be done better; so I am still optimistic we’ll be able to make big strides there.

If in the end there's no way to make the compiler fast enough, then it's better to abandon the idea of NLL and keep the compiler less powerful but faster. Having a fast enough compiler is essential, and more important than having a slightly more flexible/smart borrow checker.

I don’t agree. An elegant language is better than fast compilation. Computers will get faster, algorithms can be parallelised. An historical example is the single-pass compilation model that was baked into the semantics of the C programming language, requiring forward declarations and such just because multi-pass compilation was deemed too expensive in terms of time/memory at the time.

9 Likes

This doesn't feel quite clear enough. After all, this should compile:

let p: &i32;
let a = &x;
let b = &y;
if foo {
    p = a;
    y += 1;
} else {
    p = b;
    x += 1;
}
read(p);

While this should not:

let p: &i32;
let a = &x;
let b = &y;
if foo {
    p = a;
    x += 1; //~ ERROR
} else {
    p = b;
    y += 1; //~ ERROR
}
read(p);

@arielb1 Any chance you can clarify your “loops, play-by-play” example? As written, the variable temp isn’t used anywhere (or defined, for that matter), so there’s no way for assigning to it to cause unsoundness. The original example had temp as the self parameter to maybe_next, but you put something else there, and I’m not really sure what it’s supposed to look like. I’d like to grasp the point you’re trying to make, and I think I’m halfway there or so…

Also… if I understand correctly, if you take the CPS form, translate back to Rust, and simplify, you just get this recursive function:

// aka: fn recursive<'a>(temp: &'a mut Thing)
fn recursive(temp: &mut Thing) { 
    match temp.maybe_next() {
        Some(v) => recursive(v),
        None => recursive(temp),
    }
}

This doesn’t compile without NLL, but it does compile with existing-NLL. I think it might demonstrate what you’re saying about eliminating quantifiers, since the loop is turned into a function call where the function has a forall-lifetime quantifier in its type… but I’m only guessing, because I don’t actually know what ψ is or what 'lts refers to, and I don’t know where the quantifier is supposed to come into play. So, I’d appreciate if you could explain some more :slight_smile:

I believe that will work fine, for the same reason as the vec-push-ref example. Basically because we are still tracking the set of outlives relations on a point-by-point basis:

let p: &'p i32;
let a: &'a i32 = &x; // Loan Lx
let b: &'b i32 = &y; // Loan Ly
    // requires('a, Lx)
    // requires('b, Ly)
if foo {
    p = a;
        // 'a : 'p
    y += 1;
        // OK -- while `'p` is live,
        // it cannot reach loan Ly here
} else {
    p = b;
        // 'b : 'p
    x += 1;
        // OK -- while `'p` is live,
        // it cannot reach loan Lx here
}
// Here, both outlives relations are in scope, so
//     'a: 'p
//     'b: 'p
// and hence both `x` and `y` are borrowed here.
read(p);

I'm not sure yet what the most efficient way to track the set of subset edges is, but it seems like some kind of persistent graph (implemented e.g. with a persistent map, persistent vector, or even persistent adjacency matrix using a bitset) would be ideal, since that can be inherited between points using a very cheap "clone" operation (and there are a lot of clones).
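Something along these lines, using the im crate as a stand-in (a sketch; any persistent set would do):

use im::HashSet; // persistent set: `clone` is O(1) and shares structure

type Region = u32;

#[derive(Clone, Default)]
struct SubsetGraph {
    // subset('a, 'b) edges in force at this CFG point.
    edges: HashSet<(Region, Region)>,
}

impl SubsetGraph {
    // Flowing into a successor point starts from a cheap clone...
    fn inherit(&self) -> SubsetGraph {
        self.clone()
    }

    // ...and relations added there don't disturb the original point.
    fn add_edge(&mut self, sub: Region, sup: Region) {
        self.edges.insert((sub, sup));
    }
}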

I am still toying with whether differential dataflow makes sense in this case. Perhaps @frankmcsherry will want to weigh in.

That said, it does feel like there should be some renaming strategy that can capture the flow dependencies here. I haven’t quite figured out what that is yet. =) Clearly it’s not ordinary SSA. I think the unification I was proposing isn’t really that helpful either – not as helpful as I initially thought, anyway.

There are a few scenarios we need to capture, I think. Some examples:

If you have a transitive link A: B: C and then B goes dead, we still want to remember that C may contain any loans that A contained. Similarly, if A: B and A contains L0, then A goes dead, B still contains L0.

I’m not sure what’s the best data structure for representing this, but it “feels” representable. My current implementation addresses this by finding the full transitive closure of the subset + requires relations at each point, but that’s horrifically expensive, at least implemented in a naive fashion.
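The naive strategy is essentially Warshall's algorithm re-run at every point; roughly (a sketch, not the prototype's actual code):

// Close `subset` under transitivity: O(n^3) in the number of regions,
// and the naive strategy pays this cost at each CFG point.
fn transitive_closure(subset: &mut [Vec<bool>]) {
    let n = subset.len();
    for k in 0..n {
        for i in 0..n {
            if subset[i][k] {
                for j in 0..n {
                    if subset[k][j] {
                        subset[i][j] = true;
                    }
                }
            }
        }
    }
}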

I could imagine a few ideas. I was pursuing the renaming (as I mentioned earlier), but that’s not as useful as I hoped. We still want to ensure that if you have a transitive link A: B, and B goes dead, and A later gains an incoming loan L0, then B does not contain L0.

I could also imagine that when removing a node from the graph (e.g., a region that went dead), we propagate its incoming edges to all of its outgoing targets, and similarly incoming loans. I guess this would basically work. (We’d want to remove sets of nodes, actually, so as to avoid a lot of inefficiency.)
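That removal might look roughly like this (sketch only; the representation and names are hypothetical):

use std::collections::{HashMap, HashSet};

type Region = u32;
type Loan = u32;

struct RegionGraph {
    successors: HashMap<Region, HashSet<Region>>,   // subset edges, forward
    predecessors: HashMap<Region, HashSet<Region>>, // subset edges, reversed
    loans: HashMap<Region, HashSet<Loan>>,          // loans each region contains
}

impl RegionGraph {
    // Remove a dead region: splice each predecessor to each successor,
    // and push the dead region's loans to its successors, so that the
    // transitive facts survive the removal.
    fn remove_region(&mut self, r: Region) {
        let preds = self.predecessors.remove(&r).unwrap_or_default();
        let succs = self.successors.remove(&r).unwrap_or_default();
        let loans = self.loans.remove(&r).unwrap_or_default();
        for &p in &preds {
            if let Some(out) = self.successors.get_mut(&p) {
                out.remove(&r);
                out.extend(succs.iter().copied());
            }
        }
        for &s in &succs {
            if let Some(inc) = self.predecessors.get_mut(&s) {
                inc.remove(&r);
                inc.extend(preds.iter().copied());
            }
            self.loans.entry(s).or_default().extend(loans.iter().copied());
        }
    }
}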

Anyway…I should do a bit of digging around I guess. It feels like there should be some examples of good data structures and approaches for “maintaining” a transitive closure in the face of edits?

The other thing that may be profitable is to be lazy: record the delta for a graph but avoid making the actual changes unless it matters (e.g., we are searching for a given loan and it may be affected by the change). One could also imagine a “magic set” sort of approach. Basically observe the set of loans that are potentially invalidated at any point in the graph, and try to narrow down the data we track to only those. I’m not sure how best to do either of those yet but there seems to be room there.

1 Like

Your blog announces its feed as http://smallcultfollowing.com/babysteps/feed.xml but it is actually http://smallcultfollowing.com/babysteps/atom.xml as far as I can see.

Thanks for the post. You mention that you found some inspiration in recent work on alias analysis for large C programs etc.

What alias analysis techniques do you think are most pertinent to your post? Do you expect the computational complexity arguments to carry over to your approach?

Many thanks for your time!

New post:

http://smallcultfollowing.com/babysteps/blog/2019/01/17/polonius-and-region-errors/

Now that NLL has been shipped, I've been doing some work revisiting the Polonius project. Polonius is the project that implements the "alias-based formulation" described in my older blog post. Polonius has come a long way since that post; it's now quite fast and also experimentally integrated into rustc, where it passes the full test suite.

However, polonius as described is not complete. It covers the core "borrow check" analysis, but there are a number of other checks that the current implementation performs which polonius ignores:

  • Polonius does not account for moves and initialization.
  • Polonius does not check for relations between named lifetimes.

This blog post is focused on the second of those bullet points. It covers the simple cases; hopefully I will soon post a follow-up that targets some of the more complex cases that can arise (specifically, dealing with higher-ranked things).

6 Likes

Thanks @nikomatsakis!

You mention that polonius is “quite fast and integrated into the compiler”. I’m curious: how fast is quite fast? Is it competitive with the lexical borrow checker? Also, do you expect polonius to land on stable Rust any time this year?

2 Likes

Heh, a good question. We've not directly compared. The answer overall is I think a bit complicated.

When I said "quite fast" I was basically meaning "relative to older versions of polonius". We've mostly measured against the "clap" benchmark, which comes from an older version of clap. It was a big pain point for NLL initially, and took ~20 seconds for the core region resolution to complete -- by removing the location sensitivity we got it faster. The full Polonius algorithm initially took minutes on this test case; it now takes ~4seconds, if I recall, and we still have a few tricks up our sleeves to do better. I'm not sure what NLL on master takes nor what the lexical borrow check takes (that's a bit harder to compare).

However, we also have another tool: we have an approximated, "location insensitive" variant of Polonius. If this analysis passes, then there are guaranteed to be no errors -- if it fails, we have to run the full analysis. I think it's sort of roughly equivalent to the NLL shipping in Rust 2018 today, though perhaps a bit less precise. In any case, it executes in 0.1 seconds or something like that.

But to really compare we have to look also at the "fact generation" time from the compiler and I've not really measured it. I also think we should try to migrate more of the NLL logic out of rust code and into polonius -- which would probably change the set of facts that we generate, to make them a bit more minimal (some of the "input facts" polonius gets can in fact be computed from a more core set of facts, e.g. liveness can be computed from uses).
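For example, computing liveness from uses is the classic backward dataflow; a generic sketch (not polonius's actual fact generation):

use std::collections::HashSet;

type Var = u32;

// Hypothetical per-point input facts.
struct Point {
    uses: HashSet<Var>,  // variables read at this point
    defs: HashSet<Var>,  // variables overwritten at this point
    succs: Vec<usize>,   // CFG successors
}

// Iterate live_in(p) = uses(p) ∪ (live_out(p) − defs(p)) to a fixed point.
fn liveness(points: &[Point]) -> Vec<HashSet<Var>> {
    let mut live_in: Vec<HashSet<Var>> = vec![HashSet::new(); points.len()];
    let mut changed = true;
    while changed {
        changed = false;
        for (i, p) in points.iter().enumerate() {
            let mut live_out = HashSet::new();
            for &s in &p.succs {
                live_out.extend(live_in[s].iter().copied());
            }
            let mut new_in: HashSet<Var> = p.uses.clone();
            new_in.extend(live_out.difference(&p.defs).copied());
            if new_in != live_in[i] {
                live_in[i] = new_in;
                changed = true;
            }
        }
    }
    live_in
}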

Seems plausible, but I don't know for sure yet.

2 Likes

New blog post posted:

Polonius and the case of the hereditary harrop predicate

In my previous post about Polonius and subregion obligations, I mentioned that there needs to be a follow-up to deal with higher-ranked subregions. This post digs a bit more into what the problem is in the first place and sketches out the general solution I have in mind, but doesn’t give any concrete algorithms for it.

3 Likes

@nikomatsakis, have you had a look at Stephen Dolan’s thesis Algebraic Subtyping?

In that thesis, there is a section on deciding subsumption of polymorphic function types, which contain forall quantifiers at the top level. This can be easily generalized to also support exists (just do the same thing but in the other direction). It seems that would solve the problems highlighted in the blog post.

More generally, the beauty of Dolan’s approach is that it gives you a complete algorithm for inferring and checking (first-rank) polymorphic types in the presence of subtyping, by using type unions and intersections, and a generalization of Hindley-Milner’s traditional unification called biunification. It seems to me that Rust could use the same approach to great effect. If Rust inferred regions made of unions and intersections of region variables, the inference could probably be simplified, and perhaps the errors would even be more understandable.

I’ve read it, but to be honest I feel like I didn’t fully understand it. It’s been on my list to re-read and really deeply understand. I’ll allocate more time.