More robust lifetime inference


#1

It appears to me that the purpose of the RFC on lifetime elision is to use heuristics to decide on lifetime specifications. But can’t we do better than that? Can’t we do proper lifetime inference?

For example:

fn foo(x: &int, y: &int) -> &int {
    if *x > 50 {
        x
    }
    else {
        y
    }
}

Unless I’m misunderstanding something, according to the RFC (as well as in current Rust builds) the above code requires explicit lifetime annotations to compile, like so:

fn foo<'a>(x: &'a int, y: &'a int) -> &'a int {
    ...
}

But it seems like the compiler ought to be able to infer this. In fact, in a sense it seems like it already does. If I try to compile this:

fn foo<'a, 'b>(x: &'a int, y: &'b int) -> &'a int {
    ...
}

It refuses to compile, giving an error about the lifetimes. So the compiler knows that this particular lifetime specification makes no sense with respect to the code inside the function. So why not take its analysis a step further and have it work backwards from the return value to figure out which inputs it could possibly come from, and infer a minimal correct lifetime specification from that?
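To make the comparison concrete, here is a minimal sketch of the two signatures side by side. It assumes `i32` in place of `int` so that it builds on a current toolchain, and the rejected variant (given the hypothetical name `foo2`) is left commented out:

```rust
// Both inputs share the lifetime 'a, so either one may be returned.
fn foo<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
    if *x > 50 {
        x
    } else {
        y
    }
}

// Rejected by the compiler: the `else` branch returns `y`, which is only
// known to live for 'b, while the signature promises a result valid for 'a.
//
// fn foo2<'a, 'b>(x: &'a i32, y: &'b i32) -> &'a i32 {
//     if *x > 50 { x } else { y }
// }
```

Calling `foo(&60, &7)` yields a reference to the first argument, and `foo(&10, &7)` a reference to the second, which is why the result has to be tied to both inputs.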

Of course, this is a trivial example. But aside from code that uses unsafe blocks to do weird things, it appears to me that lifetime specifications (at least for functions) are pretty mechanical and deterministic. It feels like busy work with only one correct answer that the compiler ought to be able to infer.

I could absolutely be missing something here, as I’m relatively new to Rust. So my apologies if I’m just being ignorant. But I’m concerned about locking Rust into heuristics as outlined in the RFC if more robust lifetime inference is a feasible alternative.


#2

I’m guessing that this could be possible (I don’t really know), and it would be quite useful for newcomers. However, there is a big problem with it: the (inferred/invisible) signature of the function could change slightly based on the code within, making detecting breaking changes very difficult. For example, pretend I had the following function in my library:

/// Returns one of `x` or `y`.
// Inferred signature: fn<'a>(x: &'a int, y: &int) -> &'a int
pub fn foo(x: &int, y: &int) -> &int {
    x
}

Now let’s say that some day I decide to change this function to do something else, but I kept its signature the same:

/// Returns one of `x` or `y`.
// Inferred signature: fn<'a>(x: &int, y: &'a int) -> &'a int
pub fn foo(x: &int, y: &int) -> &int {
    y
}

Users of my library will find their code suddenly failing to compile, even though I changed only the internal implementation and not the visible signature, because that change accidentally altered the inferred signature. They were relying on the invisible inferred lifetimes of my function, and when I changed those, their code broke. This is one reason we have simple, reliable heuristics for lifetime elision that depend only on the signature of the function and not on its body: users of the library can look at the signature, with all lifetimes elided, and still fill in the blanks and work out the invisible lifetimes from the visible signature alone.
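As a rough illustration of the breakage, the sketch below writes out the two inferred signatures explicitly. The names `foo_v1`, `foo_v2`, and `caller` are hypothetical, and `i32` stands in for `int` so the code builds on a current toolchain:

```rust
// Version 1 with its inferred lifetimes written out:
// the result borrows only from `x`.
pub fn foo_v1<'a, 'b>(x: &'a i32, _y: &'b i32) -> &'a i32 {
    x
}

// Version 2: same argument and return types, but the result
// now borrows from `y` instead.
pub fn foo_v2<'a, 'b>(_x: &'a i32, y: &'b i32) -> &'b i32 {
    y
}

// Downstream code that silently relies on version 1's lifetimes.
pub fn caller(long_lived: &i32) -> &i32 {
    let short_lived = 0;
    // Accepted: the result is tied to `long_lived`, not to `short_lived`.
    foo_v1(long_lived, &short_lived)
    // Replacing `foo_v1` with `foo_v2` here is rejected, because the
    // result would then point at `short_lived`, a local that is dropped
    // when `caller` returns.
}
```

With inference from the body, nothing in the library's visible signature would warn the author that `caller` is about to stop compiling.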


#3

Currently a function’s type is known purely from the declaration (even our current lifetime inference satisfies this). Changing to infer based on the contents of the function would be a major departure from this property, and small internal adjustments could cause the inferred function signature to change, as @P1start points out.
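For comparison, here is a small sketch of how elision stays signature-only, as I understand the rules from the RFC. The function `first` and the type `Counter` are made up for illustration; the expansions in the comments follow purely from the declarations, never from the bodies:

```rust
// Exactly one input reference: its lifetime is used for the output.
// Expands to: fn first<'a>(v: &'a [i32]) -> &'a i32
fn first(v: &[i32]) -> &i32 {
    &v[0]
}

struct Counter {
    count: i32,
}

impl Counter {
    // With `&self`, the output borrows from `self`, regardless of any
    // other reference arguments.
    // Expands to: fn get<'a, 'b>(&'a self, _label: &'b str) -> &'a i32
    fn get(&self, _label: &str) -> &i32 {
        &self.count
    }
}
```

A signature like `fn foo(x: &i32, y: &i32) -> &i32` matches neither rule, so it is rejected and asks for explicit annotations instead of guessing.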

Rust aims to use traits and lifetimes to guarantee that a program can be type checked by looking only at type signatures, with no need to look at internals. This reduces the scope for the huge errors one can sometimes get out of C++ compilers, and the confusing errors one can sometimes get out of Haskell compilers (where a small local change causes type inference to change, resulting in an error in a seemingly unrelated place).


#4

Ah! I had not even thought of this. I wasn’t thinking of lifetimes as being part of the interface spec, but in retrospect of course they are! Color me ignorant. I wonder if there’s an appropriate middle-ground, though…

API stability is most important for the public-facing components of libraries. So one possibility that comes to mind is that lifetimes could be inferred as I described, but there would be a lint that errors on anything without explicit lifetime annotations that is public-facing outside of the crate. That way you would get robust lifetime inference for most code, but public-facing APIs are required to be explicit.

On the other hand, that still breaks the spirit of signatures being self-contained. So maybe that’s not such a good idea.


#5

Hmm. The more I think about it, the more I see why lifetime inference isn’t a good idea.

When I started thinking about traits, it started to become clear that this breaks down pretty quickly. Traits are such a huge part of the language, and you can’t infer lifetimes from their interfaces since they have no specific implementation. Thus full inference would only end up applying to a somewhat limited amount of code anyway.
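A small sketch of the trait problem, using a hypothetical trait `Pick` with two made-up implementors and `i32` in place of `int`. The trait method has no body to infer from, and the implementations return different arguments, so the lifetimes have to be stated in the signature:

```rust
// The method has no body here, so its lifetimes can only come
// from the signature itself.
trait Pick {
    fn pick<'a>(&self, x: &'a i32, y: &'a i32) -> &'a i32;
}

struct First;
struct Second;

impl Pick for First {
    fn pick<'a>(&self, x: &'a i32, _y: &'a i32) -> &'a i32 {
        x
    }
}

impl Pick for Second {
    fn pick<'a>(&self, _x: &'a i32, y: &'a i32) -> &'a i32 {
        y
    }
}

// The caller sees only the trait, never a particular implementation.
fn choose<'a>(p: &dyn Pick, x: &'a i32, y: &'a i32) -> &'a i32 {
    p.pick(x, y)
}
```

Inside `choose`, the compiler cannot know which body will run, so any inference based on bodies would have nothing to work from.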

So I withdraw my suggestion. The heuristics in the RFC seem very minimal and reasonable. Thanks so much @P1start and @huon for taking the time to respond to me and explain things!