# Rust lifetimes semantics at mathematical level

I’m not quite sure whether this belongs in this forum, but I think it does, as it goes beyond what most users want from Rust.

3 Likes

I have absolutely no idea what I am talking about, but does that description feel similar to @RalfJung’s work (as documented by the blog https://www.ralfj.de/blog ) for anyone else?

What I want to do is in some way similar to what was done in the RustBelt project (which you are refering to), but I want to ignore unsafe code, as I’m only doing a (somewhat small) thesis and not an entire research project (to keep it all managable). So I want to take a look at the implementation myself, both because unsafe is completely intertwined with the RustBelt project, and for so called ‘educational value’

1 Like

Lifetimes form a lattice. Viewing them as lattice not only explains the order between lifetimes, but also how type inference will merge two lifetimes to produce a smaller or larger lifetime (the greates lower bound and least upper bound, though I can never keep straight which is which).

By the way, there is a bunch of research related to this besides RustBelt. There’s Patina, Metal, and a lot of research that is not directly related to, or even predating, Rust that also uses lifetimes (often called regions instead — see region-based memory management and particularly Cyclone). You should probably survey them to get a feeling for how they work with these objects formally, and also to check whether someone has already done the exact thing you want to do

Thank you, I will take a look at them

Lifetimes don’t form a lattice, but something with both less and more structure: unlike lattices they lack a join/sup (a disjunction) and a Top (a constant for True). They form an meet-https://en.wikipedia.org/wiki/Semilattice with bottom (a False). The meet is + and the bottom is 'static. But they have something extra, which logicians call a universal quantifier (`for<..>`) and that needs some rather more involved category-theoretic mumbo-jumbo to formulate algebraically.

1 Like

and

There is an opposite of `'static`, it’s just not exposed to users. And there are both join and meet operations: When unifying types containing lifetimes `'a` and `'b` respectively, the compiler computes the join or the meet (though it prefers the terms GLB and LUB) depending on whether they occur in a covariant or contravariant context.

Quantifiers are not part of lifetimes, they are part of type bounds. There is no lifetime `for<'a> ...` , there are just bounds with quantifiers. See also: Chalk

1 Like

OK, `for< >` is a universal quantifier than can only be applied in outermost position, so it’s not really a logical operator, as has been the case in say ML for a long long time.

As for join and a “True” under the hood, I didn’t know about them, but am I allowed to say that the public structure of lifetimes is a meet-semilattice with bottom?

Thanks for the clarifications.

It can be applied in nested positions too. Currently, I don't think you can do anything interesting with that, but with Chalk it will hopefully be more useful.

I don't think it's accurate to reduce it to just a meet-semilattice. Both GLB and LUB are important to lifetime inference, the results of which are definitely user-visible. And once a user starts thinking in those terms, it's not a stretch to say "there is an empty lifetime and using it is an error", even if that's not what the compiler does. Actually, there has been a proposal to effectively expose the empty lifetime to users (under a different name, with a more useful intuition). So I think the lattice property is not just an implementation detail or a mathematical trick.

So `for< >` looks a lot like a duck, quacks a lot like a duck, but is not a duck. Do people have an idea of what it is, then?

There should be a smallish formal system in which all the primitives we have mentioned fit, but not much else.

Looks like there are some theses to be written indeed.

I really don't know what you mean. It is the universal quantifier. Syntax is a bit different than classical logic notation (though not just wrt quantifiers) but that's just surface level stuff. The analogy of bounds to logical formulas goes much deeper than just lifetime bounds — again, see Chalk. So there is a nice formal view of `for<>` and much more.

For that reason I neither see the need for, nor do I know of, an analysis that restricts itself to lifetime algebra plus higher rank lifetime bounds. There's of course the option of restricting the richer formal system to just those elements, but AFAICT the result is not very interesting mathematically [1]. That's because lifetime bounds themselves are not very interesting in the absence of trait bounds and higher-ranked quantifiers over types. You just trivially get a list of atomic bounds that all have to hold, of the forms:

• `SpecificType : 'a`
• `\$TypeVar : 'a`
• `'a : 'b`

[1]: It also isn't interesting for analysing Rust code, because as soon as you start looking at bounds you're gonna want/have to look at all of them.

The quantifier `for<...> ...` does not itself "produce" a lifetime, so it's not really an operation on lifetimes.

Interestingly, lifetimes in RustBelt are a semilattice with intersection and `'static`; we don't have a union operation ("join", I guess, but I also always mix them up). Intersection turned out to be key to model, e.g., `Rc` and `RefCell`; union was not necessary. The Rust compiler works on a subset of lifetimes (those corresponding to sets of points on the CFG of a function); it is very possible that this subset has intersection while the "full" set of lifetimes (which can, e.g., span across multiple threads) does not (need to).

Actually I'd probably say that the compiler works on "symbolic lifetimes" (as in abstract interpretation). The abstraction function cuts away everything outside the current function, only remembering whether the lifetime lasts after the function returned or not. Clearly, this equates lifetimes that are not actually equal, but the analysis is of course sound for the abstraction, so that's not a problem. Symbolic lifetimes form a full lattice, concrete lifetimes may not. (Or maybe I'm splitting hair here, because really the Rust compiler doesn't care about "concrete lifetimes" at all. But we've needed them in RustBelt, to explain how unsafe code can rely on the borrow checker.)

Are you saying you want your model to closely follow the actual implementation in the compiler? That would be extremely interesting; with RustBelt we've decided to build a more abstract model to avoid things getting too messy.

2 Likes

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.