Types with existential lifetimes

I've noticed in multiple occasions that sometimes we may be dealing with multiple lifetimes, potentially invariant, but actually need only one for some API:

  • better_any::Tid<'a> is only generic over a single lifetime 'a
  • bevy's SystemInput has a similar limitation
  • it's not that uncommon for people to want to create some kind of stacked environment struct, sort of like a singly linked list where links are mutable references (see e.g. this recent discussion on URLO)

All these cases could be solved by erasing multiple lifetimes into a single, "common denominator" lifetime, as long as we don't care about what was the specific lifetime but only that it was "big enough".

Practically speaking I think this could be done if we had a kind of type that closes over some existential lifetime. I'll refer to this kind of type with the syntax exists<'a: 'b> T<'a> (potentially to bikeshed). In short this would be a type whose values can be values of type T<'a> with any lifetime 'a that is greater than the given 'b. In this kind of types exists would be a quantifier, just like for<'a>, bounding 'a, while 'b would be free.

Of course these types would be useless if they weren't somehow usable like an instance of T<'a> for some lifetime 'a, but of course using 'b as lifetime would be unsound, as that would be basically equivalent to make every lifetime covariant. Instead I believe each instance of these types should introduce a new anonymous lifetime in the current context, known to be bigger than 'b but without other informations.

Note that currently we already a kind of type with this capability: trait objects. A dyn Trait + 'b can hold values of any type implementing Trait and having a lifetime greater than 'b. We can exploit this to build a POC showing what an hypothetical exists kind of type can do, however this is not really usable in practice for a couple of reasons:

  • it suffers from the limitation of trait objects, like dynamic dispatch and requiring boxing in certain situations (even though we want to use it only for types that will all have the same shape and implementations at runtime!)
  • the API requiring closures and the extra parameter for the implied bounds is pretty inconvenient.

What do you think?

6 Likes

What is the relationship between &'a exists<'c: 'b> Thing<'c>, and exists<'c: 'b> &'a Thing<'c>? Can you convert between the two?

This doesn't work. Suppose you have a Vec<exists<'b: 'a> Thing<'a>>. You then create an anonymous lifetime that represents the true concrete lifetime inside. This allows you to accidentally treat the multiple Things in the vec as having the same concrete lifetime.

I think that operations on this exists thing should be desugared into the callback API you gave in the PoC.

I believe in both cases you should be able to introduce an anonymous lifetime for 'c while reborrowing, and the re-abstract at the appropriate point to get the other type.

However I can see how this being possible could introduce some issues:

  • you always need an explicit annotation for where to introduce the existential binder, it cannot be inferred

  • it this was a subtyping relation we would have another instance of two types being a subtype of each other

I believe you can avoid this issue if you introduce such lifetime only once you have an instance of exists<'b: 'a> Thing<'a>, and having a Vec or other wrappers of them would not count. A specific rule might be requiring a place of type exists<'b: 'a>, although this feels like it might be too restricting.

Did you now a about higher_kinded_types - Rust? I'm using it for generic lifetimes frequently for my apis

Yes, I wanted to use it for the POC (instead of that half assed TCtor) but it isn't supported in the playground.

Still, it does something different than what I'm proposing, it just let's you write code that's generic over a family of types parameterized by a lifetime, but it does not help with the issues I described in the first post. If anything you are likely to encounter those issues when using higher_kinded_types, since most often it will be used to be parametric over a single lifetime but your types might have multiple ones.

1 Like

For sure, this is a sound design. Showcasing it with dyn and no unsafe proves its soundness from a pure API PoV.

FWIW, you do not need the implicit bounds, if you don't mind exposing a weaker form of existential in these closures.

You can then make an unsafe implementation of it all. Here is a demo supporting any type with up to two[1] generic lifetime parameters:

pub use ::core::marker::PhantomData as PD;

pub
trait ForLt2 {
    type Of<'a, 'b>;
}

pub
trait WithLifetime2<'a, 'b> {
    type Of;
}

impl<T : ?Sized + for<'a, 'b> WithLifetime2<'a, 'b>>
    ForLt2
for
    PD<T>
{
    type Of<'a, 'b> = <T as WithLifetime2<'a, 'b>>::Of;
}

#[macro_export]
macro_rules! ForLt2 {(
    <$a:lifetime, $b:lifetime $(,)?> = $T:ty $(,)?
) => (
    $crate::PD<
        dyn for<$a, $b> $crate::WithLifetime2<$a, $b, Of = $T>
    >
)}

/// `dyn 'u + ConceptualErasure` is properly conservatively:
///   - `'u`-bound/infected,
///   - but also, subtly, unimplementing every auto-trait,
///   - even more subtly, `drop_use<'u>`.
trait ConceptualErasure {}

pub
struct Exists<'usability, T : ForLt2>(
    // or even better:
 // unsafe<'a, 'b> T::Of<'a, 'b>
    T::Of<'static, 'static>,
    PD<dyn 'usability + ConceptualErasure>,
);

unsafe
impl<T : ForLt2> Send for Exists<'_, T>
where
    for<'a, 'b> T::Of<'a, 'b> : Send,
{}

unsafe
impl<T : ForLt2> Sync for Exists<'_, T>
where
    for<'a, 'b> T::Of<'a, 'b> : Sync,
{}

// Regarding the soundness of drop glue and dropck:
// what if `T::Of<'u, '…>` is a `drop_use<'u>` type,
// but upon `Exists`-erasure, it would cease to be so
// (since we're only holding a `T::Of<'static…>`!).
// Well, `dyn 'u + …` is deemed to be a `drop_use<'u>` type.
// Wrapping it in `PhantomData` preserves it.
// So `Exists<'u, …>` is itself a `drop_use<'u>`, and
// all is fine. See the compile_fail test to illustrate.

impl<'usability, T : ForLt2> Exists<'usability, T> {
    pub
    fn new<'a, 'b>(value: T::Of<'a, 'b>)
      -> Self
    where
        'a : 'usability,
        'b : 'usability,
     // T : 'usability, // not needed, since `T` infects `Self`; in practice, `T : 'static`,
    {
        Self(
            unsafe {
                // Safety:
                //   - when `T : 'usability`, the
                //     resulting instance shall not be usable beyond `'usability`,
                //     and we know the concrete instance is safe to use within it.
                //     (for arbitrary `T`, the actual usability becomes `'T ^ 'usability`,
                //     and the rest holds).
                //   - no concrete `'a`/`'b` are ever reconstructed from this point.
                //   - auto-traits are lost but for `Send/Sync` iff those impls
                //     were lifetime-agnostic;
                //   - dropck seen above.
                ::core::mem::transmute(value)
            },
            PD,
        )
    }
    
    pub
    fn with_ref<R>(
        &self,
        // note: it is very dangerous to repeat the lifetime of `&self`
        // here. Most APIs doing this in this situation have turned out to be
        // unsound (since eg. `Box` lets us have `&'unbounded self`; most notably,
        // `&'static self` when applicable, which would corner `for<'a, 'b>` to be
        // `'static` due to implied bounds, i.e., no longer merely existential).
        // Here, however, `&'static self` would require `'usability : 'static`,
        // so now I believe it would be fine. But better safe than sorry for now.
        //                 v
        scope: impl FnOnce(&T::Of<'_, '_>) -> R,
    ) -> R
    {
        let yield_ = scope;
        yield_(&self.0)
    }
    
    pub
    fn with_mut<R>(
        &mut self,
        //            same remarks
        //                 v
        scope: impl FnOnce(&mut T::Of<'_, '_>) -> R,
    ) -> R
    {
        let yield_ = scope;
        yield_(&mut self.0)
    }
    
    pub
    fn with<R>(
        self,
        scope: impl FnOnce(T::Of<'_, '_>) -> R,
    ) -> R
    {
        let yield_ = scope;
        yield_(self.0)
    }
}

fn _proof<'a, 'b>(
    cx: &'a mut ::core::task::Context<'b>,
) -> impl use<'a /* look ma no 'b */> + Sized
{
    // type alias for convenience
    type Context = ForLt2!(<'a, 'b> = &'a mut ::core::task::Context<'b>);
    Exists::<'a, Context>::new(cx)
}

#[test]
fn example_usage() {
    struct Foo;
    
    impl ::core::fmt::Display for Foo {
        fn fmt(&self, f: &mut ::core::fmt::Formatter<'_>)
          -> ::core::fmt::Result
        {
            // This instance has gotten the `<'_>` lifetime erased.
            let mut erased
                : Exists<
                    ForLt2!(<'a, 'b> = &'a mut ::core::fmt::Formatter<'b>),
                >
                = Exists::new(f)
            ;
            erased.with_mut(|f| {
                f.write_str("Foo")
            })
        }
    }

    assert_eq!(Foo.to_string(), "Foo");
}

/** Remove the compile_fail to see the error.
```rust ,compile_fail
use ::playground::*;

/// Error, cannot move out of `mutex` because it is borrowed.
fn dropck() {
    let mutex = ::std::sync::Mutex::new(());
    let guard = mutex.lock().unwrap();
    type Guard = ForLt2!(<'mutex, '_b> = ::std::sync::MutexGuard<'mutex, ()>);
    let _guard = Exists::<Guard>::new(guard);
    drop(mutex);
}
``` */
const _: () = ();

  1. and even a third lifetime parameter when this third param can be the intersection of the two params (or a sublifetime thereof). ↩︎

3 Likes

One important thing to note is that an &'a mut T<'b> can be converted to an exists<'c> &'a mut T<'c>, but it can't soundly be converted to an &'a mut exists<'c> T<'c>.

The former type allows changing the value but requires the lifetime to be the same, whereas the latter type allows changing both the value and the lifetime. The latter type is still sound as long as it was obtained by actually taking a reference to an exists<'c> T<'c> - overwriting the lifetime is fine at that point, cause nobody knows what the original lifetime was anymore (similar argument to owned values being covariant).

Traits actually do support conversions from &mut T<'b> to &mut dyn Trait + 'a, which corresponds to the latter type - it's only sound because dyn Trait is unsized and thus can't be overwritten.

2 Likes

The idea would be to disallow writing a exists<'c> T<'c>, but I can see that being somewhat problematic with e.g. generics. An exists type would have to never be used in generics, and instead always instantiated with a fresh lifetime before use, which should prevent the assignment you mentioned.

I don't think forbidding overwriting is practical - any type that contains an existential lifetime would also be unoverwritable, which splits the world into first-class types without existential lifetimes and second-class types with them.

Thankfully, I don't think it's necessary, either! It should absolutely be sound to change the concrete lifetime of an exists<'c> T<'c> as long as an exists<'c> T<'c> can only be created by value (or under an immutable reference). With that condition, nothing anywhere can rely on the concrete lifetime, so it's fine to change it arbitrarily. E.g.:

// Cell to make `'a` invariant
struct T<'a>(Cell<&'a ()>);

fn foo<'a: 'b, 'b>(bar: &mut exists<'c: 'b> T<'c>, baz: T<'a>) {
    let baz: exists<'c: 'b> T<'c> = baz;
    // This assignment is sound!
    // No caller is capable of knowing what lifetime bar had before,
    // just that it outlived 'b.
    // (They can only use the lifetime existentially as well)
    *bar = baz;
}

The simple way to express the rule is: exists<'b> T<'b> is always a supertype of T<'a>, regardless of 'a's variance in T. That would cover all the necessary cases:

  • By covariance, &T<'b> <: &exists<'c> T<'c>
  • By invariance, &mut T<'b> </: &mut exists<'c> T<'c>
  • Directly, &T<'b> <: exists<'c> &T<'c> and &mut T<'b> <: exists<'c> &mut T<'c>
1 Like

@21aslade you've hit the nail on the head. It's taken me a bit to be fully convinced, though, so here is my own slower/more detailed description of what you've observed.

The

&'r mut (impl ?Sized + 'u + …) -> &'r mut (dyn 'u + …)

(re-)[1]unsized coërcion of dyn is only sound insofar we cannot mem::swap() nor mem::replace() or whatnot unsized values, especially dyn-erased ones (there are so many reasons this would be unsound).

  • To illustrate, otherwise we could trivially do:

    // Objective (to cause unsoundness): be able to overwrite `*r` with `s`, violating
    // the mandatory-for-soundness invariance of `&'r mut &'_ str` over `'_`.
    fn exploit<'r>(r: &'r mut &'static str, s: &'r str) {
        let r: &'r mut (dyn 'r + Display) = r; // unsizing coërcion.
        if BY_VALUE { // <- kind of absurd to talk of this with `dyn`, but bear with me.
            *r = s as (dyn 'r + Display);
        } else { // <- this is the more likely `dyn`-compatible exploit.
            let mut s = Box::new(s) as Box<dyn 'r + Display>;
            mem::swap(r, &mut *s); // <- and yet it fails because `swap()` APIs & the like
                                   //    do require `Sized`-ness of the referee for soundness.
        }
    }
    

And, as you mention, the exists<'…> case (e.g., considering my Exists<'u, ForLt2!(<'a, 'b> = …)> actually spelled out design) can be quite dangerous insofar the BY_VALUE case could be written:

// pseudo-code:
*r = s as (exists<'a : 'r> &'a str);
// in real-code:
*r = Exists::<'r, ForLt2!(<'a, '_b> = &'a str)>::new(s); // OK!

So in order for this exploit() to fail, the other operation performed therein ought to fail:

let r: &'r mut (exists<'a : 'r> &'a str) = r /* : &'r mut &'static str */;

So that kind of operation must never be possible.


Other than that, these existential operations, when performed by value (or in the dyn case, exceptionally behind &mut thanks to the lack-of-unsized-mem::swap() rule), they are perfectly capable of "bypassing invariance", at least in appearance:

  • this is the initially surprising case of the re-unsizing dyn coërcion behind &mut:

    /// This compiles fine.
    fn f<'r>(
        r: &'r mut (dyn 'static + Trait),
    ) -> &'r mut (dyn 'r + Trait)
    {
        r
    }
    
  • and back to these existentials, it would be indeed stuff such as:

    fn f<'r>(
        r: &'r mut &'static str,
    ) -> exists<'a : 'r> &'r mut &'a str
    // i.e., in real code:
    ) -> Exists<'r, ForLt2!(<'a, '_b> = &'r mut &'a str)>
    {
        return r as _;
        // i.e.
        return Exists::new(r);
    }
    

So the rules you have spelled out are exactly what would be needed indeed.

Amending your quote in-place, by introducing 'r wherein 'b : 'r, and using as the symbol for <: (since otherwise we get too many angle brackets for my taste):


  1. this "unsizing" coërcion can actually be triggered even in the impl ?Sized case such as dyn 'bigger_than_u + …. At that point, there isn't really actual unsizing, since the original type was already unsized to begin with. But API/soundness-wise, it's just a shortcut for some existential downcast to the right type, followed by a new unsizing coërcion. I call this mechanism the "re-unsizing coërcion". ↩︎

2 Likes