Do lifetimes *have* to be erased at runtime?

This topic is mostly me thinking out loud, and hoping someone smarter than me can help me think it through some more.

@dtolnay came up with another proposal for non-'static lifetime TypeIds. What caught me was the following:

And all of a sudden it hit me: do lifetimes have to be erased in all cases? Can we come up with a new type (LifetimeId?) that can store the information about a given lifetime which can be consulted at runtime? Since TypeIds are opaque, we could add an extra field that references the appropriate LifetimeId, which would solve the issue above (I think? This is where smart people need to jump in).

My thought is that we'd have a marker trait that types can optionally implement. If it is implemented on a type, then that is a signal to the compiler to create a LifetimeId for the type, which can be consulted by the TypeId machinery for comparison purposes, allowing non-'static types to be compared.

As a first-pass minimum viable product, LifetimeIds can only be constructed by the compiler[1], are opaque, and generally the absolute most minimal type we can make them. Most importantly, their internal representation is not stable, and can be changed at any time by the compiler team, so you can't serialize/deserialize them. Eventually, as we get experience with them and have some ideas on how they can be stabilized, someone could figure out how to make it possible to serialize them out, but I expect that it will be a long time, if ever, before their internal representation could be stabilized.

Issues with this idea that I can think of already:

  • I'm not sure that this approach is even possible, let alone practical.
  • I suspect that there needs to be a canonical algorithm for constructing LifetimeId instances so that something like interning can happen. This is for the case where you construct two objects that have the same lifetime but were constructed separately; if they have the same TypeId, and the same lifetimes, then they should be equal[2].
  • How to make the TypeId machinery work with this, given that TypeId is stable.

  1. You can construct them via unsafe code, probably by transmuteing from a manually constructed byte array, but for fairly obvious reasons I'm not going to pay much attention to this. ↩︎

  2. Note that I'm required strict equality here, my gut feeling is that anything else would be unsound as I think it would be possible to invisibly transmute a lifetime of non-'static items otherwise. I don't have concrete proof of this though, it's just a gut feeling at this time. ↩︎

Here's a couple of observations:

  • Let's say this works (and if not why?):
let lifetimes = (0..n).map(|i| lifetime_of(&i)).collect();

Are those lifetimes equal? Each of them starts and ends before the next one, so I wouldn't consider them equal. Can you even represent n different lifetimes, with n arbitrary?

  • Consider this snippet code:
let a = String::new();
let b = String::new();
let la = lifetime_of(&a);
let lb = lifetime_of(&b);
// b is dropped here
// a is dropped here, after b

Is the lifetime of a the same as b, or is it different because it lives a little bit longer?

  • What about types with multiple lifetimes? Assuming you can even get a LifetimeId for each of them, how would that be usable elsewhere?
5 Likes

I agree with you, they are not equal.

As for representing n different lifetimes... you've got me. I suspect that someone will be able to construct a program that generates the power set of the number of types in the program, but I can't prove that this is, or isn't, possible. Maybe we punt and have the compiler have a built-in upper limit that the user can override via a -Z switch? Not perfect, but until we had data on how common this issue is in the real world, I'm not going to lose much sleep over it.

Different, because it lives longer. Note that someday we may be able to relax this constraint, but for the moment I want to be very conservative so that we can iron things out without ending up with soundness holes.

And THIS is why I posted this topic, so people smarter than me could help me think it through! :smiley:

What about sets of LifetimeIds? If the sets aren't equal, then neither are the types. Again, this is overly conservative, but until all of the issues are fully worked out it's better to be conservative than risky.

Happened-before and logical clocks

I just realized that what we're really after is a logical clock, because what we're trying to describe is a happened-before event. All of the mechanisms I know of would be relatively heavyweight, but there may be a way of making them lighter specifically for Rust's use. I don't know if this would solve the problem, but we can at least look at the approach that others have used to see if it could be used here.

Even if this is possible, I think it's likely to have the same issues as when it was brought up for specialization. A reborrow will shorten lifetimes (such as when calling a method), and I'm not sure how this proposal handles reborrows.

The rust compiler can deduplicate code with identical behavior but with lifetimes compared to without, but with such a system it has to keep track of lifetimes at runtime.

There's other things like moving a drop call up slightly have non-local impacts on the code, but also the fact that I'm not sure this is useful. From what I've seen, what is desired is some equivalent of C++ remove_ref, not the ability to actually see if two lifetimes are the same.

1 Like

Was the idea of a lifetime type brought up in specialization before??? Do you have a link to the discussion, or at least some keywords I can likely google for?

Not all lifetimes, only those that aren't 'static.

I can't answer how useful it would be in practice; I think we'd have to try to do it and see what happened.

A few times, though not ever in a centralized place. It's the same concept though, that specializing over lifetimes is unsound because lifetimes are erased. (See Niko's "Always Applicable Impls" post about specialization for the basic overview about when specialization is sound.)

So what the problem is isn't just that you can generate an infinite number of distinct lifetimes. It's that this is a runtime problem:

fn get_lifetime_id<'a, T>(_: &'a T) -> LifetimeId {
    __magic_lifetime_id!('a)
}

let mut lifetimes = vec![];
for _ in 0..HOWEVER_MANY_YOU_WANT {
    let new_place: i32 = rand();
    lifetimes.push(&new_place);
}

This is an infinite number of distinct lifetimes coming from a single source point. As such, the compiler would have to implement lifetime id assignment and lookup at runtime. Basically, for lifetimes to exist at runtime, they need to have non-zero-sized state wherever they're carried. Not having this is why they're erased at runtime.

What could occur (and is essentially part of what's being asked for) is monomorphizing a copy of lifetime-generic functions for known-'static. But at the same time, this can't always happen (thus specializing on it is not always applicable / is unsound if it's behavior changing), because of for<'a> fn(&'a _) — lifetime-generic function pointers, which are type-level observers that lifetimes don't get monomorphized, since they're a single pointer and not a (for<'a> fn(&'a _), fn(&'static _)) pair.

5 Likes

Yeah, that's the same thing I was worried about, except that I was hoping that by only allowing the compiler to generate the IDs that it could limit shenanigans. However, I expect that you're correct, and that nothing I do will fix it. Oh well, it was too good to be true.

Pretty much, yeah, I think.

Because you can turn a &'long &'long T into a &'short &'short T:

pub fn demo<'long: 'short, 'short, T>(x: &'long &'long T) -> &'short &'short T { x }

But the code generated for that you can't edit anything in the inner reference, because it'd be a data race with others trying to change it, and it can't copy it either, because then the outer reference would be to a different address.

3 Likes

You can also turn &'static for<'a> fn(&'a ()) into &'static fn(&'static ()) despite the fact that the difference between for<'a> fn(&'a ()) and fn(&'static ()) is not erased. Rust Playground

Non-erased doesn’t need to mean that there’s something left at run-time. (In fact the whole point of lifetimes is that there’s no overhead at run-time; so even with some non-erased form of lifetimes, if something like that was possible to add into Rust (which I don’t really believe), it should not mean that every usage of lifetimes includes some run-time information.)

2 Likes

I agree with you on this! That was why I was thinking it would opt-in via marker traits. If we were going to keep the information around for every instance that's created, we might as well go for a full garbage collector, etc. etc. etc. But then it wouldn't be Rust... I was hoping to find something that would be used only when it was necessary, not all the time.

That said, I'm convinced now that it wouldn't work; too many ways for it to go wrong. Oh well, maybe someone else can figure out a better idea.

1 Like

Comex brought up some pretty good reasons answering my similar question here.

4 Likes

Code generation being independent of lifetimes has been useful so far: mrustc and gcc-rust rely on being able compile Rust without having a borrow checker.