Non-lexical lifetimes (NLL) working group tracking thread

Hello all,

So NLL has been available on Nightly for some time. Still, if we want to push it over the finish line and get it enabled by default, there’s some work left to do! This thread is going to be a standing thread to track the highest priority issues, ask for help, and announce progress.

Weekly triage meeting

Starting Tuesday, April 3rd, we are going to be having a weekly triage meeting at 15:30 ET (20:30 UTC) – it is scheduled for 1h, but I hope that most weeks we can shorten it to 30min. The meeting will take place in https://appear.in/rustaceans. If you are interested in helping out, please feel free to attend. If you are not able to attend, don’t worry, we’ll be posting logs here in this thread. (If you want an iCal invite to the meeting, send me a private message here on internals with your e-mail address.)

Notes from the meeting are kept in the “NLL Triage” Dropbox Paper document.

During the meeting we will:

  • Review the progress that we made over the last week
  • Identify the most important bugs to focus on
    • Links to those will be posted here
    • The goal is to pick a few more bugs than we have people, to leave places for newcomers to jump in

Gitter and other coordination

The central place for tracking the status of non-lexical lifetimes is the tracking issue on GitHub. That issue includes links to a number of queries that can help you find issues.

If you’d like to ask questions or chat, please join us in the gitter room.

9 Likes

My code has compiled with NLL for a long time now. The compilation time with --emit=metadata is about 4X with NLL (it used to be about 7X), so it’s still a bit too high to have it active by default (unless we add an option to disable it?).

EDIT: Recently NLL got a little faster again; now I am seeing just 3.8X slower with --emit=metadata.

Hmm, interesting. This is the same crate you discussed in Significantly slower --emit=metadata · Issue #47267 · rust-lang/rust · GitHub? Is there somewhere I can download it to try it out locally?

I guess I can start with this comment; I just don’t know how representative it really is.

I don't know if that small example is enough to uncover all the performance problems of NLL compilation. In cases like this you usually fix a problem, look at performance again, and if you spot more performance problems, you report and attack those too.

The first meeting is tomorrow. I’ve created the following Dropbox Paper document for tracking status. If you’ve been hacking on something, or there is something you would like to talk about, please feel free to add some notes in there!

1 Like

So we had the meeting today! You can check the paper doc for some notes, but I wanted to highlight that if you are interested in hacking on NLL, the new NLL-priority tag highlights various issues that are important.

Right now there are a few things on there; more to come.

I’ve been working on a “reimagination” of how NLL works and I think it’s looking pretty promising! My goal with this is to solve two problems:

  • The current NLL implementation runs very slowly.
  • There are some cases where the analysis rejects programs I think we ought to accept (notably rust-lang/rust#47680, but also rust-lang-nursery/borrowck#18, which are both symptoms of the same underlying problem).

TL;DR

The high-level summary of what I’ve done is this:

Using this rustc branch, I have dumped out the data for one particular function in the clap-rs benchmark. Using our current analysis, this function takes ~20s to analyze on my machine. The prototype is able to analyze it in ~1s, a 20x improvement. Moreover, the analysis is smarter (it solves those bugs I mentioned above) and far more cleanly specified (hooray, Datalog!). The actual code using differential_dataflow is about 100 lines.

That said, I want to put two big caveats on that 1s number:

  • First, I don’t yet know if there are bugs that are letting the analysis run too quickly
    • e.g., rustc could be dumping the wrong base facts, making the analysis run more quickly than it should
    • I did validate on other files and it seems to be right
  • Second, there is one part of the analysis we have yet to model that could take longer
    • but I’m not too worried about this.

Still, I’m feeling pretty optimistic. And it’s not like I’ve even begun optimizing the new rules.

The new analysis

Let me explain how the new analysis works.

The input to the analysis

We start from roughly the same input as the existing analysis. That is, we take the MIR for some function and replace every region that appears anywhere in it with a fresh, distinct region variable. From this, we run a type check and compute two things:

  • which regions are live at each point in the MIR
  • the outlives relationships that arise from subtyping

To see what I mean, let’s consider a variant of the example from rust-lang-nursery/borrowck#18. Here I’ve written all the lifetimes at play as their own variables, named things like 'x and 'b1. I’ve added comments to the side showing (a) which regions are live on entry to each statement and (b) which outlives requirements arise from type-checking each statement.

                               // Live  | Outlives
                               // ----  | --------
let mut foo = 22;              //       |
let mut bar = 44;              //       |
let mut x: &'x i32 = &'b1 foo; //       | 'b1: 'x
let mut y: &'y i32 = x;        // 'x    | 'x: 'y
x = &'b2 bar;                  //    'y | 'b2: 'x
drop(x);                       // 'x 'y |
bar += 1;                      //    'y |
drop(y);                       //    'y |

Liveness. Looking at the liveness column, one interesting thing you can see is that the region 'x – which appears in the type of the variable x – has two disjoint live ranges. This is because x is assigned twice: its first value is live initially, and its second value is live later on.

Outlives. Looking at the outlives relations, we see that they effectively track the flow of data. When you have one reference assigned to another (e.g., y = x), then you get a 'x: 'y relation, indicating that data with region 'x is flowing into a place (y) whose type has region 'y. We will build on this intuition.

Borrow regions. One thing I want to talk about is the borrow expressions in the code above; you see that each is annotated with a region, like &'b1 foo. That is not legal syntax in normal Rust – but it’s how the compiler thinks about things internally. We say that each borrow has an associated region, called the borrow region. The resulting type of &'b1 foo would then be &'b1 i32, which in turn gets assigned to x, yielding the outlives relation 'b1: 'x. It doesn’t come up in this example, but there can be other outlives relations too, when reborrowing from an existing reference (e.g., if we had something like &'b1 *foo); see the sketch just below. This works just as described in the NLL RFC.
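To make the reborrowing case concrete, here is a small runnable sketch (the function and variable names are hypothetical; the internal region annotations appear only in comments, since they are not legal surface syntax):

// Reborrowing through an existing reference: besides the usual relation
// from the assignment, the type check emits an extra outlives relation
// saying the reborrow cannot outlast the reference it came from.
fn reborrow<'f>(foo: &'f mut i32) {
    // Internally this is roughly `&'b1 mut *foo`, yielding:
    //   'f: 'b1   (the reborrow cannot outlive foo's region)
    //   'b1: 'x   (the result flows into x, as in the example above)
    let x: &mut i32 = &mut *foo;
    *x += 1;
}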

Expressing the input in datalog

The new analysis, as I wrote, is specified in Datalog. This is a subset of Prolog designed for efficient execution. We can express the inputs above as datalog relations, roughly like so:

  • borrowRegion(r:Region, b:Borrow, p:Point)
    • Declares that the given region r is the “borrow region” for the borrow b, which appears at the point p in the control-flow graph (Point is just a MIR Location).
      • I separated out the conceptual “borrow” from the point at which it appears, but when rustc dumps out its data, we just identify each borrow by the point in the MIR at which the borrow appears (i.e., a MIR Location).
  • nextStatement(Point, Point)
    • just encodes the CFG – these are edges within a basic block
  • goto(Point, Point)
    • just encodes the CFG – these are edges across basic blocks
  • regionLiveOnEntryToStatement(r:Region, p:Point)
    • indicates that region r is, well, live on entry to the statement at p. This is the liveness information I talked about above. (In the rules below, this relation appears as regionLiveAt.)
  • outlives(p:Point, a:Region, b:Region, q:Point)
    • this is basically the outlives relation 'a: 'b @ Q from the RFC, but you see it has an extra point P. I’ll discuss this below.
  • killed(b:Borrow, p:Point)
    • this doesn’t come up in the current example, but it indicates when the path from the given borrow b was reassigned at p
      • e.g., if you have &mut *cursor and then later cursor = temp.next.
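To make these relations concrete, here is a minimal sketch of how the dumped base facts might be represented on the Rust side (all of the type and field names here are hypothetical, chosen only to mirror the declarations above):

// Index types standing in for the compiler's internal indices; a Point
// is a MIR Location flattened to a single index.
type Region = u32;
type Borrow = u32;
type Point = u32;

// One vector of tuples per input relation.
#[derive(Default)]
struct BaseFacts {
    borrow_region: Vec<(Region, Borrow, Point)>,
    next_statement: Vec<(Point, Point)>, // CFG edges within a basic block
    goto: Vec<(Point, Point)>,           // CFG edges across basic blocks
    region_live_on_entry: Vec<(Region, Point)>, // regionLiveOnEntryToStatement
    outlives: Vec<(Point, Region, Region, Point)>, // (P, 'a, 'b, Q)
    killed: Vec<(Borrow, Point)>,
}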

The outlives case is sort of interesting. In the RFC, we wrote 'a: 'b @ Q, meaning that "'a must outlive 'b starting at the point Q". But in this analysis I expressed it slightly differently, using two points:

outlives(P, R1, R2, Q)

I refer to P as the “start point”; for the most part, it is the predecessor of Q. Typically, Q will be the exit from a statement – the point where the outlives relation can first be observed. P then marks the entry to that statement: the input state just before the outlives relation was created (roughly). So in our example, we had the statement y = x:

                               // Live  | Outlives
                               // ----  | --------
...                            // ...   |
let mut y: &'y i32 = x;        // 'x    | 'x: 'y
...                            // ...   |

If this statement is B0/3, then this would give rise to the following outlives relation:

outlives(B0/3, 'x, 'y, B0/4)

Outlives as a “points to” analysis

The central idea of the new analysis is to think of regions not as “sets of points in the control-flow graph” but rather as “sets of borrows”. In compiler alias analysis, this is often called a “points-to” analysis, where borrows play the role of “abstract heap objects”. We want the analysis to be flow-sensitive, so we actually track this points-to information per region, per point. This gives rise to a relation like this:

pointsTo(r:Region, b:Borrow, p:Point)

You can read this as "a reference with region r may contain data derived from the borrow b at the point p". One important point: we merge pointsTo with liveness – so if a region is dead, then we don’t consider it to “point to” anything. In terms of our example, we can show the pointsTo information for each region/point as follows; for each line, we show the pointsTo information on entry to the corresponding statement.

                               // 'x  | 'y | 'b1 | 'b2 |
                               // --  | -- | --- | --- |
let mut foo = 22;              //     |    |     |     |
let mut bar = 44;              //     |    |     |     |
let mut x: &'x i32 = &'b1 foo; //     |    | b1  |     |
let mut y: &'y i32 = x;        // b1  |    |     |     |
x = &'b2 bar;                  //     | b1 |     | b2  |
drop(x);                       // b2  | b1 |     |     |
bar += 1;                      //     | b1 |     |     |
drop(y);                       //     | b1 |     |     |

I’ve identified the borrows as b1 and b2. You can see that the pointsTo information for 'x changes when it gets reassigned. You can also see that the “borrow regions” 'b1 and 'b2 are just considered live within the borrow expression itself (they are never directly referenced elsewhere).

To express the pointsTo analysis in datalog, there are three rules. The first rule is the starting point, concerning borrow regions:

pointsTo(R, B, P) :-
  borrowRegion(R, B, P).

The next rule propagates pointsTo through outlives relations:

pointsTo(R1, B, Q) :-
  pointsTo(R2, B, P),
  outlives(P, R2, R1, Q).

Here we are saying:

  • if R2 points to B at some point P,
  • and R2: R1 starting at P (ending at Q),
  • then R1 points to B at Q.

The intuition here is that R2: R1 arises when data with region R2 is assigned into a place with region R1. So basically things with region R1 now point to all the things that R2 refers to.

Finally, we have a “liveness propagation” rule, which says that if a region R pointed to something at some point P, then it still points to it at the successor point Q, so long as the region is live at Q:

pointsTo(R1, B, Q) :-
  pointsTo(R1, B, P),
  cfgEdge(P, Q),
  regionLiveAt(R1, Q).

cfgEdge is just the union of nextStatement and goto:

cfgEdge(P, Q) :- nextStatement(P, Q).
cfgEdge(P, Q) :- goto(P, Q).
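For readers who prefer code to inference rules, here is a naive fixed-point evaluation of these three rules in plain Rust. This is only an illustrative sketch (the prototype actually uses differential_dataflow, and the index types here are hypothetical):

use std::collections::BTreeSet;

type Region = u32;
type Borrow = u32;
type Point = u32;

fn points_to(
    borrow_region: &[(Region, Borrow, Point)],
    outlives: &[(Point, Region, Region, Point)],
    cfg_edge: &[(Point, Point)], // union of nextStatement and goto
    region_live_at: &BTreeSet<(Region, Point)>,
) -> BTreeSet<(Region, Borrow, Point)> {
    // Rule 1: pointsTo(R, B, P) :- borrowRegion(R, B, P).
    let mut facts: BTreeSet<(Region, Borrow, Point)> =
        borrow_region.iter().copied().collect();
    loop {
        let mut new_facts = Vec::new();
        for &(r, b, p) in &facts {
            // Rule 2: pointsTo(R1, B, Q) :-
            //   pointsTo(R2, B, P), outlives(P, R2, R1, Q).
            for &(o_p, o_r, r1, q) in outlives {
                if o_p == p && o_r == r && !facts.contains(&(r1, b, q)) {
                    new_facts.push((r1, b, q));
                }
            }
            // Rule 3: pointsTo(R1, B, Q) :-
            //   pointsTo(R1, B, P), cfgEdge(P, Q), regionLiveAt(R1, Q).
            for &(e_p, q) in cfg_edge {
                if e_p == p
                    && region_live_at.contains(&(r, q))
                    && !facts.contains(&(r, b, q))
                {
                    new_facts.push((r, b, q));
                }
            }
        }
        if new_facts.is_empty() {
            return facts; // fixed point reached
        }
        facts.extend(new_facts);
    }
}

(A real implementation would index the relations rather than scanning them linearly, but the shape of the rules carries over directly.)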

Refining the “points to” analysis to account for kills

The above analysis is good and accounts for our initial example, but it doesn’t account for “kills”. That is, we want to be able to accept code like the following:

fn iterate(mut cursor: &mut [i32]) {
  let mut vec: Vec<&i32> = vec![];
  loop {
    let (head, tail) = match cursor.split_first_mut() {
      Some(v) => v,
      None => break,
    };
    cursor = tail;
    vec.push(head);
  }
  drop(vec);
}

The key point here is that the borrow for split_first_mut – in some sense – persists until the end of the function; this is because data from that borrow is pushed into vec (the head value). However, the variable cursor is also reassigned, so in another sense the borrow is over: the data accessible through head is no longer accessible through any other path. In particular, restricting cursor is pointless.

Unfortunately, I’m running short of time this morning, so I’m going to run somewhat quickly through how we handle this, so that I can post something. =) The idea is to use a refinement of pointsTo called restricts:

restricts(r:Region, b:Borrow, p:Point)

The idea is that the restrictions from the borrow b must be enforced at the point p, or else we might invalidate the (live) region r. So, e.g., this would prevent you from mutating the data that was borrowed. The rules for restricts are very similar to those for pointsTo, except that they take the killed relation into account.

There are three clauses, beginning with the borrow region:

restricts(R, B, P) :-
  borrowRegion(R, B, P).

Like pointsTo, restricts is propagated through outlives and liveness, but only if the borrow is not killed:

restricts(R1, B, Q) :-
  restricts(R2, B, P),
  !killed(B, P),
  outlives(P, R2, R1, Q).

restricts(R1, B, Q) :-
  restricts(R1, B, P),
  !killed(B, P),
  cfgEdge(P, Q),
  regionLiveAt(R1, Q).
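In the same naive Rust style as the pointsTo sketch above, the only change is a guard on the killed relation; again, this is just an illustrative sketch with hypothetical index types:

use std::collections::BTreeSet;

type Region = u32;
type Borrow = u32;
type Point = u32;

fn restricts(
    borrow_region: &[(Region, Borrow, Point)],
    outlives: &[(Point, Region, Region, Point)],
    cfg_edge: &[(Point, Point)], // union of nextStatement and goto
    region_live_at: &BTreeSet<(Region, Point)>,
    killed: &BTreeSet<(Borrow, Point)>,
) -> BTreeSet<(Region, Borrow, Point)> {
    // First clause: restricts(R, B, P) :- borrowRegion(R, B, P).
    let mut facts: BTreeSet<(Region, Borrow, Point)> =
        borrow_region.iter().copied().collect();
    loop {
        let mut new_facts = Vec::new();
        for &(r, b, p) in &facts {
            // The !killed(B, P) guard: a fact about a borrow that was
            // killed at P does not propagate past P.
            if killed.contains(&(b, p)) {
                continue;
            }
            // Propagation through outlives edges.
            for &(o_p, o_r, r1, q) in outlives {
                if o_p == p && o_r == r && !facts.contains(&(r1, b, q)) {
                    new_facts.push((r1, b, q));
                }
            }
            // Propagation along CFG edges into live regions.
            for &(e_p, q) in cfg_edge {
                if e_p == p
                    && region_live_at.contains(&(r, q))
                    && !facts.contains(&(r, b, q))
                {
                    new_facts.push((r, b, q));
                }
            }
        }
        if new_facts.is_empty() {
            return facts;
        }
        facts.extend(new_facts);
    }
}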

Final analysis

Finally, we can define one last relation that indicates whether the restrictions from a given borrow are in force at a particular program point:

borrowLiveAt(B, P) :-
  regionLiveAt(R, P),
  restricts(R, B, P).
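To round this out, here is a hypothetical sketch of how the final check might consume borrowLiveAt. The invalidates input (the set of actions that would conflict with a given borrow) is my assumption, not one of the relations defined above:

use std::collections::BTreeSet;

type Borrow = u32;
type Point = u32;

// An action at point p that would invalidate borrow b (e.g., mutating
// the borrowed path) is an error exactly when the borrow's restrictions
// are still in force at p.
fn borrowck_errors(
    invalidates: &[(Point, Borrow)],            // hypothetical extra input
    borrow_live_at: &BTreeSet<(Borrow, Point)>, // derived by the rules above
) -> Vec<(Point, Borrow)> {
    invalidates
        .iter()
        .copied()
        .filter(|&(p, b)| borrow_live_at.contains(&(b, p)))
        .collect()
}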

Conclusions and next steps

I’ve got to spend a few days fixing regressions and other things, but I hope on Friday to return to this and try to actually integrate the differential_dataflow crate into the compiler so we can try this out for real. (If anyone wants to try hacking on it over the next day or two, let me know.)

I’m pretty excited about this direction overall for a number of reasons:

  • clear specification through datalog rules
    • I also think that we can tool the compiler to export base facts, allowing other analyses to use them
  • differential_dataflow supports incremental computation
    • we’d have to extend our current approach, but we could save the results of borrowck and just update the base facts
  • I suspect we can move more and more things into “the datalog” rather than having custom code in the compiler to do the analysis
50 Likes

This sounds very interesting. Have you by any chance already thought about what interface the compiler could expose to consumers? While getting just the analysis results (borrowLiveAt) would probably be sufficient for most uses (Viper, highlighting borrows in the Rust IDE), I think that giving compiler consumers the ability to query the compiler directly via datalog (by defining their own rules) would enable many uses that we have not anticipated yet. Please let me know if you think that I could help with designing and implementing such an interface.

1 Like

I have not thought deeply about this -- still trying to prototype and get things up and going. But providing various kinds of "facts" -- or queries to derive them -- feels eminently doable.

I'm not 100% sure what this means, but I am interested :)

Sounds good. Such queries should be enough to get us going.

For example, our Viper front-end needs to perform some static analyses, and I am now wondering if we could replace them with the datalog rules.

Also, wouldn’t such a datalog-based interface be the first step towards your idea of using datalog for analysing Rust code? In particular, if we find a way to save results to disk for incremental compilation, we should be able to import these saved results from many crates into a single database for further analysis quite easily, or am I missing something?

1 Like

For your information, Viper seems to refer to an ETH Zurich project. I was deeply confused before realizing this.

Sorry for the confusion, I should have added a reference. Yours is correct: I was referring to the Viper verification infrastructure, for which we are building a Rust front-end. It is the same one that was mentioned several times in the Verification WG discussions.

So we just had our weekly sync, and I was wondering something:

Would it be better to do the meetings over gitter? I was thinking that, for non-native English speakers, that might be more accessible. @pnkfelix may also prefer that, since a late-night meeting that requires talking is perhaps more disruptive.

1 Like

I’m happy either way.

Having video (or even just audio) does give a feeling of connection to the other members of the group. I used to think, in my youth, that such intangible quantities didn’t matter compared to “real work” (where “real work” is of course measured by metrics like “lines of code” etc – so, yes, laughable). After 8 years of working on teams with remote members, I have come around to thinking that there is value in those interactions outside IRC.

But, that doesn’t mean that the video/audio meetings need to be every week!

(So, again: I’m happy either way. Just wanted to say that I personally do think there is some value in sometimes meeting outside of a text-based chat room.)

3 Likes

Yeah, this was one of the original reasons I opted for video.

Interesting thought.

Oh, and of course: I know that I often speak too fast for people to understand me, including native English speakers. That would certainly be an argument for a text-based chat meeting, where people would just have to be able to keep up with the typing speed (usually easier for most, or at least one can tune out more readily).

So, non-native English speaker here, with a pretty low level of English comprehension :). To be honest, I didn’t understand much of the meeting. I got some keywords, and I asked Niko if it’s possible to record future sessions so I can replay them later. So in that regard, a chat-based meeting would be much better for me.

At the same time, having video sessions is cool; we start to get to know each other a bit more, etc. So yeah, maybe doing video from time to time or something like that could be a good idea.

Also, even better than video calls would be getting together with most of the people in the working group at some point. So if anybody is going to RustConf, we could hang out and have much better interactions than over video :), where most people had their cameras off anyway.

4 Likes

Update on this: I got this idea working in rustc, but I found that the rules as I wrote them had some fundamental (and, in retrospect, sort of obvious) problems. I think they can be fixed, but I decided that before doing that, I would try to integrate differential dataflow in isolation -- that is, use it to reimplement the existing analysis. That would isolate the "bulk" of the integration work. Still working on that though.

9 Likes

Today is our weekly sync meeting. In advance, I wanted to leave some notes on my progress and suggest a possible course of action.

First off, as you may have seen, last week I wrote a new blog post explaining my analysis.

The new analysis is implemented on my nll-alias-analysis branch; however, it is not ready to land. To start, it’s quite slow – the impl is naive. I am confident we can improve the perf dramatically, but it’s not entirely trivial to do so.

Even if the perf were great, however, there are also some obstacles to integrating the differential-dataflow crate that I am currently using. For one thing, it depends on libgetopt, which causes problems for…reasons. We added a feature flag to sidestep this, but that needs a new release, and @frankmcsherry hasn’t gotten around to it yet (which is fine). There are other problems too: it uses procedural macros for custom derive, and we can’t incorporate those into stage1 builds yet (we might be able to sidestep this too; I’ve not yet tried).

In any case, I had an alternative thought. Maybe I should just land a -Znll-facts switch that dumps the base facts out into a set of files (in fact, I already have this switch). Then I can extract the solving code into a rust-lang-nursery repo that loads from those files. This would effectively be the “canonical” borrow checker implementation. Then we can tinker with an optimized variant there, independently of the compiler, which should allow for faster iteration. Eventually we might be able to extract a subcrate usable by the compiler. This is roughly the chalk model.
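For what it’s worth, the loading side of that repo could start out quite simple. Here is a hypothetical sketch, assuming one fact per line with whitespace-separated columns (the actual -Znll-facts output format may well differ):

use std::fs;
use std::io;

// Read a dumped facts file into rows of string columns; interning the
// columns into index types would happen in a later step.
fn load_facts(path: &str) -> io::Result<Vec<Vec<String>>> {
    let text = fs::read_to_string(path)?;
    Ok(text
        .lines()
        .filter(|line| !line.is_empty())
        .map(|line| line.split_whitespace().map(str::to_string).collect())
        .collect())
}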

Thoughts?