Non-lexical lifetimes based on liveness

I should give credit where credit is due, by the way: Lionel Parreaux did some nice work investigating a system where regions are paths instead of lifetimes; the "C++ lifetimes" work also seems to take a similar approach.

1 Like

Have you thought of creating an RFC for NLL?

I know I have bumped into confusing error messages from borrowck before that would either be legal or have better error messages with NLL.

New post offering a refinement of the model:

http://smallcultfollowing.com/babysteps/blog/2017/02/21/non-lexical-lifetimes-using-liveness-and-location/

7 Likes

Will NLL make this work?

use std::io;
use std::io::prelude::*;
fn main() {
    for line in io::stdin().lock().lines() {
        let line = line.unwrap();
        let l = line.trim_left();
        if l.is_empty() || l.starts_with("#") { continue; }
        println!("{}\n", line.trim_right());
    }
}

Right now (1.17.0-nightly) you get

error: borrowed value does not live long enough
 --> test.rs:9:5
  |
4 |     for line in io::stdin().lock().lines() {
  |                 ----------- temporary value created here
...
9 |     }
  |     ^ temporary value dropped here while still borrowed
  |
  = note: values in a scope are dropped in the opposite order they are created
  = note: consider using a `let` binding to increase its lifetime

To make it work you have to assign io::stdin to a named variable above the loop. I find this both inconvenient and confusing, and the error message doesn’t help (why is the early drop of the io::stdin() object being ascribed to the end of the loop?)

No, it has no effect on this. This is a result of our temporary lifetime rules. We made a decision long ago not to infer the lifetime of temporaries but rather to use syntactic rules to decide them – in other words, you shouldn’t have to do anything too fancy to know when your destructors run (although the syntactic rules aren’t that trivial).

However, RFC 66, which we accepted a long time ago but have never implemented, probably would fix this problem. Part of the reason we never implemented is that it is not entirely trivial to do so, but I should try to write up some mentoring instructions.

5 Likes

To clarify motivation, will NLL allow all of these added foo writes?

let mut p = &foo;
// `p` is live here: its value may be used on the next line.
if condition {
    // `p` is live here: its value will be used on the next line.
    print(*p);
    // `p` is DEAD here: its value will not be used.
    foo = 1;
    p = &bar;
    // `p` is live here: its value will be used later.
    foo = 2;
}
// `p` is live here: its value may be used on the next line.
print(*p);
// `p` is DEAD here: its value will not be used.
foo = 3;

It would.

In the “Destructors” section you talk about how the rules for dropck remain unchanged. Am I correct in thinking that this is sufficient to ensure that there aren’t any breaking changes the behavior of RAII types such as Mutex or RwLock?

2 Likes

Are there any known examples of safe, correct code that this analysis would not accept? All the standard NLL examples I can think of seem like they’d pass this.

There would be no breaking changes in behavior around RAII, correct.

The intention is certainly to accept strictly more code than before, so anything that works today would continue to work. I’m not aware of any exceptions to that.

1 Like

That said, I should spend some time writing about dropck. In particular, I had hoped that drop would no longer be a special case at all, but that's not really true, since the implicit drop is the one case where we (intentionally) allow references that are "out of scope" to be accessible (but only if there is no custom destructor to witness them). This is what lets you build up cycles in some cases.

To elaborate a bit more:

The idea of NLL is that it has no affect whatsoever on what happens at runtime. This is purely about what kinds of code the compiler will accept. But the “dynamic semantics” (i.e., what happens when you execute the program) remain unchanged.

2 Likes

I like this plan a lot. It is indeed much simpler than all the single-exit and "continuous" stuff. I do generally prefer to design the language, and then think of inference algorithm, while this describes the inference algorithm but leaves the explicit-lifetimes-MIR language implicit, but whatever. This is closer to how the compiler will work in practice.

You can go farther, and give variables a distinct type at each point in the program, as in Ericson2314’s stateful MIR for Rust. But even then you must contend with invariance or you have the same sort of problems.

I do appreciate the shout-out :). Did you mean contravariance by any chance--which I did mention wanting, or is this a limitation I missed?

Oh, and you hint that during the sprint you talked about "type per statement" which sounds like my plan. I agree that, for the current goals, that and my plan are vast overkill, but I am curious now what the sprint plan looked like.

Would this use MIR borrowck?

I am very exited about this discussion. Lifetimes based on lexical scopes were a major frustration for me and ultimately the main reason why I didn’t adopt Rust for my current project even though I really like the language. Specifically, I am looking to solve a following puzzle. I have a non-trivial data structure (in following DS) that uses custom indices to manipulate it. The index is not a machine pointer, but can be a fairly intricate combination of references to the internal state of the DS (e.g. offsets into internal tables + state flags etc.). Furthermore, indices potentially become invalid once the DS is mutated. Therefore, in order to have a safe implementation of the DS, we’d ideally want to throw compile-time errors if any of the indices are alive when the host structure is mutated.

I attempted to achieve this in Rust using lifetime parameters. When an index is created, it is bound to the lifetime of the DS, resulting in implicit immutable borrow (no actual reference is created, which is again desirable). While this ensures that I can’t mutate the DS before all the indices go out of scope, it also makes the indices fairly useless as there is no apparent way to consume them. Specifically, I’d like to do something like this:

collection.consume(index0)

or

collection.join(index0, index1)

where the signature of the methods is something like:

fn consume<'a, 'b: 'a>(&'b mut self, index0: Index<'a>)
fn join<'a, 'b, 'c: 'a+'b>(&'c mut self, index0: Index<'a>, index1: Index<'b>) // how do you even state that 
                                                                               // 'c outlives both 'a and 'b?

The intuition here is that while indices do borrow the collection, they can’t outlive the method call (as the index is moved). Furthermore within the mutation method, they get destructured into the implementation-dependent internal state before the collection is actually mutated. Therefore, I would consider this pattern “safe” — there is no way that the mutable borrow of self will interfere with the implicit immutable borrow within the index. In another words, since the collection technically owns the index (the later cannot exist outside the particular collection state), it is safe for the collection to consume it. However, I don’t see any way of implementing this in current Rust — the compiler of course complains about the mutable borrow of self while there is still an outstanding implicit immutable borrow by the index.

Now, I have to admit that I haven’t fully understood the approach to NLL suggested by Nicholas in his post. I’ll certainly spend some time looking at it, but just to satisfy my curiosity: will this approach allow one to implement patterns like the above?

I think that you can do this using interior mutability, or by using the RefCell type in the standard library.

It sounds like your indexes are somewhat analogous to slices. Looking at the source, Rust uses some unsafe code in its implementation and it has direct compiler support (as it has dedicated syntax). But in essence a slice can be built from just an array reference, a start offset and a length. For a mutable slice the array reference would be mutable, for a shared slice the array reference would be shared. Could an analogous construction work for you?

1 Like

If I see it correctly, the most significant difference between Rust slices and my indices is that the slice can be used independently to access the borrowed data, while my indices are only bundles of state that can only be interpreted by their owner object. Actually, it the indices themselves can be implemented easily — its the invariant part (no index can outlive owner object mutation) that is tricky. I could of course try changing the API so that the object is manipulated directly through indices (even though I really dislike the idea), but then I still will have to deal with the issue of multiple mutable borrows… and of course, the entire thing should be super-lightweight for performance reasons.

If you’re willing to change the api you could have a structure that contains a mutable reference and your existing index structure, that should be as lightweight as passing a reference and the index around separately. The main drawback would be that now your index also is either mutable or immutable, and you won’t be able to have multiple mutable indexes at the same time.

If you don’t want to change the api you could still resort to runtime checks or to unsafe code to do what you want, but I’m not sure if your usage would be safe. (If an index is akin to a shared reference, then when there is one reference out there there can be N as well. If you consume one using unsafe code the compiler won’t prevent the other ones from staying alive. To get around that the index would have to be unique, which can be done by incorporating a mutable reference in it. But then you couldn’t have two indexes to pass to your join method.)