[Pre-RFC] Unsafe lifetime


#1

Motivation

In the course of writing unsafe code, the need can sometimes arise to store a value with an unknowable or inexpressible lifetime. In such cases, the value is insulated from the outside world and exposed through a safe interface. This is all well and good, but we must still hold the value internally somehow, and if it demands lifetime parameters, we must provide them.

In many such cases, it may suffice to simply use 'static as a surrogate lifetime and coerce the value as appropriate. This approach is workable, but has two major problems.

  • First, the intent is unclear and must be expressed as a comment. The claim that the value is valid for 'static is a lie, and one must keep the true lifetime in mind. Later audits of the code may overlook this detail and accidentally misuse the value.

  • Second, 'static is not flexible enough. While in many cases it will work, 'static has meaningful semantic implications of its own and cannot act as a stand-in for any possible lifetime. Case in point:

Ref<'static, i32> // This is acceptable to the compiler
Ref<'static, MyType<'a>> // This is not

In the second case, we’re forced to provide a stand-in lifetime that meets the requirements imposed by the signature of Ref, but doing so may not be possible without exposing an additional, meaningless input parameter.

A Solution

In cases such as this, what we could really use is a lifetime that’s more general than 'static, and more clearly indicates our intent. For this purpose I propose 'unsafe. This lifetime would satisfy any constraint, with the caveat that it can only be instantiated within an unsafe context. “Instantiated” is key, in that after creation, it can be freely consumed as any other lifetime, without unsafe qualifications. This is important for parametricity.

The semantics of 'unsafe would be akin to the semantics described for “unbounded lifetimes” in the nomicon; this merely allows such a lifetime to be named and used in other contexts.
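For reference, the "unbounded lifetime" the nomicon describes is what you get today when dereferencing a raw pointer: the resulting reference takes whatever lifetime the context asks for. A minimal sketch of that behavior follows; `unbounded` and `demo` are hypothetical names used only for illustration, and nothing here is the proposed 'unsafe syntax itself.

```rust
/// Hypothetical helper: turns a raw pointer into a reference whose lifetime
/// `'a` is chosen freely by the caller. This is the "unbounded lifetime".
///
/// # Safety
/// The caller must guarantee the pointee is valid and unmodified for `'a`.
unsafe fn unbounded<'a, T>(ptr: *const T) -> &'a T {
    // Nothing ties 'a to the pointee; the compiler simply trusts us.
    unsafe { &*ptr }
}

fn demo() -> i32 {
    let x = 41;
    // SAFETY: `x` outlives `r`, which is only used inside this function.
    let r: &i32 = unsafe { unbounded(&x) };
    *r + 1
}
```

Note that the unbounded lifetime can only be conjured inside an unsafe block, but `r` is then used like any other reference, which mirrors the "instantiated" distinction above.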

Alternatives

  • Do nothing. In some cases, 'static still suffices as a stand-in, and in others, the problem can be worked around by introducing a new lifetime parameter. This additional parameter, however, leaks into the externally facing API and is difficult to explain to users.

Unanswered Questions

  • How many use cases are there for this? rental is one; are there more?
  • What are the implications of implementing this? Is the notion reconcilable with borrowck?
  • Does this cause unintended “spooky action at a distance” if an 'unsafe value is allowed to mingle with external, otherwise safe code?

#2

How does it compare to pointers? *const/*mut? I think that’s what Rust currently uses for unsafe, unknowable lifetimes.


#3

Pointers only work as a substitute for references, and even then only if the lifetime you need to “erase” is the outermost lifetime attached to the reference itself. In opaque, user-defined types, there’s no escape hatch from the borrow checker, and you are required to supply a valid lifetime, even if it’s false. This proposal is geared toward making it clearer that a lifetime is determined by unsafe logic rather than scope, and toward addressing the problems with the current workarounds.


#4

Maybe this is just because I never write this kind of code, but it’s not at all obvious to me why this is useful. Could you show an example of some code that can’t be written with raw pointers/transmute/etc but would be writable with this feature? (I tried skimming the Rental source code but it’s way beyond me) And is there any way we could make Rust smarter/more flexible (e.g., adding immovable types) so that this sort of code doesn’t have to be unsafe at all?


#5

Sure, I’ll try to unravel all the abstraction and macro mess going on inside rental to make it more clear where the problem comes in.

At the most basic level, rental generates two-element structs that store an owner and a borrower side by side. One example might be this:

struct RentalStruct {
    owner: RefCell<i32>,
    borrower: Ref<'???, i32>, // What to put here?
}

In its current state, there is no lifetime that we can place here that will truly express the fact that this field is tied to the lifetime of the other field. For this reason, we’re forced to choose a stand-in, because we still need to store the field somehow.

Could we perhaps instead just use a raw memory chunk and transmute it? Theoretically perhaps, but that would require being able to statically determine the size of Ref and declaring a field with a u8 array exactly that size. As far as I know, you can’t currently do this. Even if you could however, we still have a problem, because to get the size of a type you still need to be able to name it, so we still need to be able to put a valid lifetime there.

So, proceeding with the stand-in lifetime approach, what do we put there? 'static seems perfectly reasonable, and in this case that will work fine. We’ll just need to be careful to remember that it’s not REALLY 'static.
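A minimal sketch of what the 'static stand-in looks like in practice. This is not rental's actual generated code; the heap allocation via `Box::into_raw`, the `Option` wrapper, and the `new`/`get` methods are all assumptions made to keep the example small and self-contained.

```rust
use std::cell::{Ref, RefCell};

struct RentalStruct {
    // NOT really 'static: this borrow is only valid while `owner` is alive.
    borrower: Option<Ref<'static, i32>>,
    // Heap-allocated so the RefCell's address stays put when the struct moves.
    owner: *mut RefCell<i32>,
}

impl RentalStruct {
    fn new(value: i32) -> Self {
        let owner = Box::into_raw(Box::new(RefCell::new(value)));
        // SAFETY: `owner` points to a live allocation that is freed only in
        // `Drop`, after `borrower` has been released.
        let cell: &'static RefCell<i32> = unsafe { &*owner };
        RentalStruct { borrower: Some(cell.borrow()), owner }
    }

    // Safe interface: hands out a copy, never the falsely-'static borrow.
    fn get(&self) -> i32 {
        **self.borrower.as_ref().unwrap()
    }
}

impl Drop for RentalStruct {
    fn drop(&mut self) {
        // Release the borrow before freeing the cell it points into.
        self.borrower = None;
        // SAFETY: `owner` came from `Box::into_raw` and is freed exactly once.
        unsafe { drop(Box::from_raw(self.owner)) };
    }
}
```

Everything that keeps this sound lives in comments, and nothing in the types stops a later edit from returning the `Ref<'static, i32>` directly.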

Now let’s consider another example:

struct RentalStruct<'a> {
    owner: RefCell<MyType<'a>>,
    borrower: Ref<'static, MyType<'a>>, // Problem, 'a is not 'static, this type can't exist
}

Here, even 'static fails us, since the compiler will not accept it, even as an internal field that will never be publicly exposed. Luckily, we still have a backup lifetime in the form of 'a, and that will save us once again in this case.

Now, what if we want to be generic?

struct RentalStruct<T> {
    owner: RefCell<T>,
    borrower: Ref<'a, T>, // Problem, 'a is now unknown to us, so we're stuck again
}

Here, T is not bounded to be 'static and could have any lifetime. Unfortunately, we have no idea what that lifetime actually is, so there’s nothing we can put there that will satisfy the compiler. We must resort to this:

struct RentalStruct<'redundant, T: 'redundant> {
    owner: RefCell<T>,
    borrower: Ref<'redundant, T>, // Compiler is happy again
}

Now the compiler is satisfied, but we’ve paid a price. Or rather, we passed that cost along to the consumer of our API and THEY must pay the price of supplying this redundant lifetime that has no actual meaning other than to satisfy the compiler. In some cases inference will eliminate it for us, but that won’t always work. If we want to put RentalStruct in a struct of our own, then that struct will also be infected with the redundant lifetime, and so on.
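To illustrate how the parameter spreads, here is a small sketch. `Cache`, `make_cache`, and `peek` are hypothetical downstream items, and the `Box` and `Option` wrappers are assumptions made only so the struct can be constructed in safe code.

```rust
use std::cell::{Ref, RefCell};

// The workaround from above: 'redundant exists only to satisfy the compiler.
pub struct RentalStruct<'redundant, T: 'redundant> {
    borrower: Option<Ref<'redundant, T>>,
    owner: Box<RefCell<T>>,
}

// A hypothetical user type: it cannot name RentalStruct without declaring
// the same meaningless lifetime itself.
pub struct Cache<'redundant, T: 'redundant> {
    entry: RentalStruct<'redundant, T>,
}

// The lifetime also appears in any signature that returns the type.
pub fn make_cache<'redundant, T: 'redundant>(value: T) -> Cache<'redundant, T> {
    Cache {
        entry: RentalStruct {
            borrower: None,
            owner: Box::new(RefCell::new(value)),
        },
    }
}

// Here elision hides it behind '_, but it is still part of the type.
pub fn peek<T: Clone>(cache: &Cache<'_, T>) -> T {
    cache.entry.owner.borrow().clone()
}
```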

'unsafe just tells the compiler to accept that there is no meaningful lifetime we can actually give it and that we promise to handle the implications properly ourselves. The user of the API is then unaware that this has taken place and sees only a struct that takes a single type parameter, as we wanted in the first place. It also more clearly indicates to any reader of the code that the lifetime of this value is deliberately unsafe and must be handled with extreme care, instead of a lie that we have to remember.


#6

That thoroughly explains why no existing lifetime syntax in Rust can handle this situation, but that’s not what I was confused about: Why can’t borrower be a raw pointer like *mut T?

I think you tried to answer that question in your second post with “Pointers only work as a substitute for references, and even then only if the lifetime you need to “erase” is the outermost lifetime attached to the reference itself.”, but that didn’t really make any sense to me for all sorts of reasons (what are you “substituting” if not references? Why do you want to erase only an “outer” lifetime when you’re the one creating the original self-borrow/inner lifetime?) which is why I tried asking for a concrete example that doesn’t work with raw pointers.


#7

Because borrower isn’t necessarily a ref; that was just an example. Borrower can be any opaque type, of any size, that takes lifetime parameters, such as a MutexGuard or libloading::Symbol or what have you.


#8

As a concrete example, my easy_strings library uses this pattern to wrap the String iterators and return owned iterators. Ultimately, the wrapped iterators boil down to stuff along the lines of

struct(Arc<String>, str::Lines<'unsafe>)

As Lines is opaque, I couldn’t just use a pointer here, even if I wanted to.
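A rough sketch of how such a wrapper can be written today, with 'static standing in where 'unsafe would go. This is not easy_strings' actual implementation; `OwnedLines` and its field names are hypothetical, and yielding owned `String`s is an assumption made so the false lifetime never escapes.

```rust
use std::str::Lines;
use std::sync::Arc;

struct OwnedLines {
    // Declared first so it is dropped before the owner it points into.
    // NOT really 'static: valid only while `_owner` keeps the String alive.
    lines: Lines<'static>,
    _owner: Arc<String>,
}

impl OwnedLines {
    fn new(owner: Arc<String>) -> Self {
        // SAFETY: the String's buffer lives on the heap, is kept alive by the
        // Arc stored next to the iterator, and is never mutated through it.
        let text: &'static str = unsafe { &*(owner.as_str() as *const str) };
        OwnedLines { lines: text.lines(), _owner: owner }
    }
}

impl Iterator for OwnedLines {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        // Copy each line out so no falsely-'static &str reaches the caller.
        self.lines.next().map(str::to_owned)
    }
}
```

A raw pointer cannot replace the `Lines<'static>` field here, because `Lines` keeps its own iteration state and exposes no way to rebuild it from a pointer.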


#9

Fun fact: even the stdlib does this. From core/cell.rs:871:

#[stable(feature = "rust1", since = "1.0.0")]
pub struct Ref<'b, T: ?Sized + 'b> {
    value: &'b T,
    borrow: BorrowRef<'b>,
}

Here, &'b T is technically false, since borrow being dropped will invalidate it. Just from glancing at this code, that’s not clear at all. &'unsafe T would be an immediate warning that there is subtlety at play.


#10

Took the feedback here into consideration and drafted the actual RFC.