Rust lifetimes semantics at mathematical level

RustThesis · January 8, 2018, 4:03pm

I’m not quite sure whether this belongs in this forum, but I think it does, as it goes beyond what most users want from Rust.

I am a computing science student and I am writing a thesis about the memory safety aspects of Rust. That means I’m interested in the lifetime system, the idea of ownership, etc. My goal is to define formal semantics for a very small subset of Rust, which includes lifetimes. I do understand the notion of lifetimes and I read several books, tutorials and the like (like the Rust book), but none specify how lifetimes work precisely. I hope to figure out what kind of mathematical structure lifetimes most closely resemble. For example, you need to compare lifetimes to tell one is longer than the other, which says something about the mathematical structure lifetimes have to have. I tried looking at the github documentation, but I have a lot of trouble navigating through it to find the (for me) relevent parts. So basically my question is; is there anyone here that can point me in the right direction? For example the right piece of documentation in the github, where I can read more about how exactly lifetimes are implemented? My hope is that I can derive from that what kind of structure I can use for the rest of my thesis.

HadrienG · January 8, 2018, 4:08pm

I have absolutely no idea what I am talking about, but does that description feel similar to @RalfJung’s work (as documented by the blog https://www.ralfj.de/blog ) for anyone else?

RustThesis · January 8, 2018, 4:16pm

What I want to do is in some way similar to what was done in the RustBelt project (which you are refering to), but I want to ignore unsafe code, as I’m only doing a (somewhat small) thesis and not an entire research project (to keep it all managable). So I want to take a look at the implementation myself, both because unsafe is completely intertwined with the RustBelt project, and for so called ‘educational value’

hanna-kruppe · January 8, 2018, 4:35pm

Lifetimes form a lattice. Viewing them as lattice not only explains the order between lifetimes, but also how type inference will merge two lifetimes to produce a smaller or larger lifetime (the greates lower bound and least upper bound, though I can never keep straight which is which).

By the way, there is a bunch of research related to this besides RustBelt. There’s Patina, Metal, and a lot of research that is not directly related to, or even predating, Rust that also uses lifetimes (often called regions instead — see region-based memory management and particularly Cyclone). You should probably survey them to get a feeling for how they work with these objects formally, and also to check whether someone has already done the exact thing you want to do

RustThesis · January 8, 2018, 4:40pm

Thank you, I will take a look at them

quatrezoneilles · January 8, 2018, 7:58pm

Lifetimes don’t form a lattice, but something with both less and more structure: unlike lattices they lack a join/sup (a disjunction) and a Top (a constant for True). They form an meet-https://en.wikipedia.org/wiki/Semilattice with bottom (a False). The meet is + and the bottom is 'static. But they have something extra, which logicians call a universal quantifier (for<..>) and that needs some rather more involved category-theoretic mumbo-jumbo to formulate algebraically.

bjorn3 · January 8, 2018, 8:04pm

github.com

rust-lang/rust/blob/master/src/librustc/infer/README.md

# Type inference engine

The type inference is based on standard HM-type inference, but
extended in various way to accommodate subtyping, region inference,
and higher-ranked types.

## A note on terminology

We use the notation `?T` to refer to inference variables, also called
existential variables.

We use the term "region" and "lifetime" interchangeably. Both refer to
the `'a` in `&'a T`.

The term "bound region" refers to regions bound in a function
signature, such as the `'a` in `for<'a> fn(&'a u32)`. A region is
"free" if it is not bound.

## Creating an inference context

This file has been truncated. show original

and

github.com

rust-lang/rust/blob/master/src/librustc/infer/region_constraints/README.md

# Region constraint collection

## Terminology

Note that we use the terms region and lifetime interchangeably.

## Introduction

As described in the [inference README](../README.md), and unlike
normal type inference, which is similar in spirit to H-M and thus
works progressively, the region type inference works by accumulating
constraints over the course of a function.  Finally, at the end of
processing a function, we process and solve the constraints all at
once.

The constraints are always of one of three possible forms:

- `ConstrainVarSubVar(Ri, Rj)` states that region variable Ri must be
  a subregion of Rj
- `ConstrainRegSubVar(R, Ri)` states that the concrete region R (which

This file has been truncated. show original

may be interresting to read.

hanna-kruppe · January 8, 2018, 8:57pm

There is an opposite of 'static, it’s just not exposed to users. And there are both join and meet operations: When unifying types containing lifetimes 'a and 'b respectively, the compiler computes the join or the meet (though it prefers the terms GLB and LUB) depending on whether they occur in a covariant or contravariant context.

Quantifiers are not part of lifetimes, they are part of type bounds. There is no lifetime for<'a> ... , there are just bounds with quantifiers. See also: Chalk

quatrezoneilles · January 8, 2018, 9:44pm

OK, for< > is a universal quantifier than can only be applied in outermost position, so it’s not really a logical operator, as has been the case in say ML for a long long time.

As for join and a “True” under the hood, I didn’t know about them, but am I allowed to say that the public structure of lifetimes is a meet-semilattice with bottom?

Thanks for the clarifications.

hanna-kruppe · January 8, 2018, 10:11pm

It can be applied in nested positions too. Currently, I don't think you can do anything interesting with that, but with Chalk it will hopefully be more useful.

I don't think it's accurate to reduce it to just a meet-semilattice. Both GLB and LUB are important to lifetime inference, the results of which are definitely user-visible. And once a user starts thinking in those terms, it's not a stretch to say "there is an empty lifetime and using it is an error", even if that's not what the compiler does. Actually, there has been a proposal to effectively expose the empty lifetime to users (under a different name, with a more useful intuition). So I think the lattice property is not just an implementation detail or a mathematical trick.

quatrezoneilles · January 9, 2018, 10:27am

So for< > looks a lot like a duck, quacks a lot like a duck, but is not a duck. Do people have an idea of what it is, then?

There should be a smallish formal system in which all the primitives we have mentioned fit, but not much else.

Looks like there are some theses to be written indeed.

hanna-kruppe · January 9, 2018, 11:58am

I really don't know what you mean. It is the universal quantifier. Syntax is a bit different than classical logic notation (though not just wrt quantifiers) but that's just surface level stuff. The analogy of bounds to logical formulas goes much deeper than just lifetime bounds — again, see Chalk. So there is a nice formal view of for<> and much more.

For that reason I neither see the need for, nor do I know of, an analysis that restricts itself to lifetime algebra plus higher rank lifetime bounds. There's of course the option of restricting the richer formal system to just those elements, but AFAICT the result is not very interesting mathematically [1]. That's because lifetime bounds themselves are not very interesting in the absence of trait bounds and higher-ranked quantifiers over types. You just trivially get a list of atomic bounds that all have to hold, of the forms:

SpecificType : 'a
$TypeVar : 'a
'a : 'b

[1]: It also isn't interesting for analysing Rust code, because as soon as you start looking at bounds you're gonna want/have to look at all of them.

RalfJung · January 21, 2018, 1:30pm

The quantifier for<...> ... does not itself "produce" a lifetime, so it's not really an operation on lifetimes.

Interestingly, lifetimes in RustBelt are a semilattice with intersection and 'static; we don't have a union operation ("join", I guess, but I also always mix them up). Intersection turned out to be key to model, e.g., Rc and RefCell; union was not necessary. The Rust compiler works on a subset of lifetimes (those corresponding to sets of points on the CFG of a function); it is very possible that this subset has intersection while the "full" set of lifetimes (which can, e.g., span across multiple threads) does not (need to).

Actually I'd probably say that the compiler works on "symbolic lifetimes" (as in abstract interpretation). The abstraction function cuts away everything outside the current function, only remembering whether the lifetime lasts after the function returned or not. Clearly, this equates lifetimes that are not actually equal, but the analysis is of course sound for the abstraction, so that's not a problem. Symbolic lifetimes form a full lattice, concrete lifetimes may not. (Or maybe I'm splitting hair here, because really the Rust compiler doesn't care about "concrete lifetimes" at all. But we've needed them in RustBelt, to explain how unsafe code can rely on the borrow checker.)

Are you saying you want your model to closely follow the actual implementation in the compiler? That would be extremely interesting; with RustBelt we've decided to build a more abstract model to avoid things getting too messy.

system · March 25, 2019, 8:29am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Thesis about Visualization of Lifetime Errors	1	1871	July 2, 2019
Current status of formal verification	10	7662	September 29, 2019
RustBelt: Securing the Foundations of the Rust Programming Language	4	3046	March 25, 2019
Non-lexical lifetimes based on liveness compiler	45	10553	March 25, 2019
[Brainstorming] Is there a body of research on ownership equivalent to Stacked Borrows Unsafe Code Guidelines	7	1518	December 22, 2024

Rust lifetimes semantics at mathematical level

Related topics