A hack to enforce typestate for futures

I'm not sure if someone discussed the idea before, but I'm not aware of such discussions in the AsyncDrop thread.

A problem with AsyncDrop is that a signature like the following:

trait AsyncDrop {
    fn poll_drop(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()>;
}

requires the object to track, internally and at run time, whether poll_drop has been called or whether the object has already been dropped, and the implementor is obliged to "fuse" poll_drop manually.

However, there is a way to enforce typestate statically with a trick on lifetimes of mutable references! This means that code that attempts to call poll after poll_drop simply doesn't compile.

use std::marker::PhantomData;
struct DropOnce<'a> {
    _data: PhantomData<&'a mut ()>,
}
impl<'a> DropOnce<'a> {
    // Takes the unique &'a mut reference and hands it back.
    fn poll(&'a mut self) -> &'a mut Self {
        println!("poll");
        self
    }
    // Takes the unique &'a mut reference and never hands it back.
    fn start_drop(&'a mut self) -> Dropping<'a> {
        Dropping(self)
    }
}
struct Dropping<'a>(&'a mut DropOnce<'a>);
impl<'a> Dropping<'a> {
    fn poll(&'a mut self) -> &'a mut Self {
        println!("dropping");
        self
    }
}
fn main() {
    let mut x = &mut DropOnce { _data: PhantomData };
    for _ in 0..3 { x = x.poll(); }
    let mut y = &mut x.start_drop();
    for _ in 0..3 { y = y.poll(); }
    //x.poll(); // fails borrow checker
}

It works by fixing the lifetime 'a in the struct's signature. That way, at most one usable &'a mut Self can exist in the program at any time (&mut references are not Copy, so handing one over gives it away), and once we consume it, it's gone. If we wrap it in Dropping, no one else can retrieve it.

Based on this idea, I propose a new StronglyTypedFuture trait (intentionally ugly name) that encodes future states as typestates instead of runtime states, so that:

  • The state of the future (whether it is running, completed or being dropped) is a part of the future's type
  • Futures are fused statically, and attempting to poll a future that has completed is a compiler error
  • Once a future is turned into a dropping future, attempting to poll the original future is a compiler error
  • Attempting to poll a dropping future that has completed is also a compiler error

enum StronglyTypedPoll<P, T> {
    Pending(P),
    Ready(T),
}
trait StronglyTypedFuture<'a> {
    type Output;
    fn poll<'b>(self: Pin<&'a mut Self>, cx: &mut Context<'b>) -> StronglyTypedPoll<Pin<&'a mut Self>, Self::Output>;
}
// Every other method needs to ensure that the reference it gets from &'b mut self
// matches 'a: 'b, so that such a method cannot be called after a call to
// async_drop. This is currently enforced by the compiler if a field with lifetime 'a
// exists in the struct, but would no longer be the case if my proposal about expired
// references were implemented.
// (see https://internals.rust-lang.org/t/proposal-about-expired-references/11675/11)
trait StronglyTypedAsyncDrop<'a> {
    type DropFuture: StronglyTypedFuture<'a, Output = ()>;
    fn async_drop(self: Pin<&'a mut Self>) -> Self::DropFuture;
}
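To make the Pending(P) idea concrete, here is a minimal sketch of one implementor and a driver loop. Countdown, NoopWaker and run_to_completion are hypothetical names that are not part of the proposal, and I use &mut Context as the standard Future trait does. The point is only that the caller gets the pinned reference back on Pending and gets nothing to poll with on Ready.

```rust
use std::marker::PhantomData;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Wake, Waker};

enum StronglyTypedPoll<P, T> {
    Pending(P),
    Ready(T),
}

trait StronglyTypedFuture<'a> {
    type Output;
    fn poll(
        self: Pin<&'a mut Self>,
        cx: &mut Context<'_>,
    ) -> StronglyTypedPoll<Pin<&'a mut Self>, Self::Output>;
}

// Hypothetical example future: pending `remaining` times, then ready.
struct Countdown<'a> {
    remaining: u32,
    _marker: PhantomData<&'a mut ()>,
}

impl<'a> StronglyTypedFuture<'a> for Countdown<'a> {
    type Output = &'static str;
    fn poll(
        mut self: Pin<&'a mut Self>,
        _cx: &mut Context<'_>,
    ) -> StronglyTypedPoll<Pin<&'a mut Self>, Self::Output> {
        if self.remaining == 0 {
            // The reference is consumed here; nothing is left to poll with.
            StronglyTypedPoll::Ready("done")
        } else {
            self.remaining -= 1;
            // Hand the unique reference back so the caller can poll again.
            StronglyTypedPoll::Pending(self)
        }
    }
}

// A waker that does nothing, only needed to build a Context.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Returns how many times Pending was observed, plus the final output.
fn run_to_completion() -> (u32, &'static str) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Countdown { remaining: 3, _marker: PhantomData };
    let mut handle = Pin::new(&mut fut);
    let mut pending_polls = 0;
    loop {
        match handle.poll(&mut cx) {
            StronglyTypedPoll::Pending(next) => {
                pending_polls += 1;
                handle = next;
            }
            StronglyTypedPoll::Ready(out) => return (pending_polls, out),
        }
    }
}
```

After the Ready arm, handle has been moved into poll and not returned, so a further handle.poll(..) would be rejected as a use of a moved value.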

No, you don't necessarily have to keep track of whether poll_drop was called. You can just assume that it isn't called outside of poll_drop. This is fine because poll_drop must be the last thing called. This is similar to how you can assume Drop::drop is not called outside of drop.

Both comments in this thread seem to me to misunderstand why poll_drop_ready needs to have fused semantics.

The goal is to enable generated drop glue for struct types without having to store additional state in the struct. This drop glue would call poll_drop_ready on every field until all of them return Ready. Because of this, poll_drop_ready would be called after Ready is returned on the fields that return Ready first.

So Yato's comment that poll_drop_ready does not need to be fused is wrong (unless we give up generating this glue for composite types, which seems bad), and this post is not a solution to the problem: there's no way to statically track which futures have returned Ready so as not to call them again. You cannot write the drop glue for the struct so that the types of its fields change while it's being dropped.

For the record, it is already standard behavior that futures track that they have completed. Most of them, though, panic if polled again after completing. This is because they return a meaningful value on Ready and they either can't return it again (because it is not Copy) or simply don't want to (because this increases the state size and correct callers won't call poll after Ready is returned anyway).

But poll_drop_ready returns () as its output type, a ZST which can always be constructed when needed. Tracking the completed state is not a big deal: instead of panicking when called after finishing, these functions would just return Poll::Ready(()).
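A minimal sketch of what such stateless glue could look like, assuming a poll_drop_ready signature with &mut Context. Field, Pair and drive_pair are hypothetical names, and the polls counter exists only to make the repeated calls observable.

```rust
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

trait AsyncDrop {
    fn poll_drop_ready(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()>;
}

// Hypothetical field type: pending `remaining` times, then Ready forever (fused).
struct Field {
    remaining: u32,
    polls: u32, // instrumentation only
}

impl AsyncDrop for Field {
    fn poll_drop_ready(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        self.polls += 1;
        if self.remaining == 0 {
            Poll::Ready(()) // safe to return again and again
        } else {
            self.remaining -= 1;
            Poll::Pending
        }
    }
}

struct Pair {
    a: Field,
    b: Field,
}

// The glue keeps no record of which field has finished; it polls both every time.
impl AsyncDrop for Pair {
    fn poll_drop_ready(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let this = self.get_mut(); // Pair is Unpin in this sketch
        let a = Pin::new(&mut this.a).poll_drop_ready(cx);
        let b = Pin::new(&mut this.b).poll_drop_ready(cx);
        if a.is_ready() && b.is_ready() {
            Poll::Ready(())
        } else {
            Poll::Pending
        }
    }
}

struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Returns how often each field was polled before the pair reported Ready.
fn drive_pair() -> (u32, u32) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut pair = Pair {
        a: Field { remaining: 0, polls: 0 },
        b: Field { remaining: 2, polls: 0 },
    };
    while Pin::new(&mut pair).poll_drop_ready(&mut cx).is_pending() {}
    (pair.a.polls, pair.b.polls)
}
```

Field a is ready on the first poll but is still polled on every later round, which is exactly the situation that requires fused semantics.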

...or in other words, this wouldn't work because when a struct needs to asynchronously drop two fields, one doesn't know statically which one will finish async dropping first, so it becomes necessary to track this dynamically.

Thanks for the explanation. I realized that while reading the original thread again, and I'm sorry for not carefully reading that.

However, there is some logic here that I think is not quite clear.

First, in many cases it's not just drop flags, but also some state logic used specifically for dropping. To understand this, we need to recognize that AsyncDrop for regular structs differs from AsyncDrop for futures. Low-level structs that implement AsyncDrop directly contain only data; the state needed to actually drop them is far more than the drop flags, and we don't want to put it in the struct body. Adding a method fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> treats the struct like a Future, and it is hard to see how such a method could make sense on it.

struct LowLevel {
    url: String,
}
impl AsyncDrop for LowLevel {
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        // We want to create an HTTP connection and send DELETE to the URL.
        // But how are we going to implement this in a poll function on a struct
        // that is not designed to store state?
        unimplemented!()
    }
}

Thus, I would like to propose the following signature instead:

// in poll_drop case
trait AsyncDrop {
    async fn into_drop(self);
}
// in poll_drop_ready case
trait AsyncDrop {
    async fn drop_ready(self: Pin<&mut Self>);
}

EDIT: I read the post about the problems with async fn in traits and maybe that's why such traits are not currently acceptable. I'll discuss a solution about that later.
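As a sketch of the into_drop variant: async fn in traits has been accepted by stable Rust since 1.75, so the shape above can be tried directly. Here send_delete, the wire log and block_on are hypothetical stand-ins (no real HTTP client or executor is involved); the point is only that the drop state lives in the returned future and not in LowLevel.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

trait AsyncDrop {
    async fn into_drop(self);
}

struct LowLevel {
    url: String,
    // Stand-in for the network: records the requests that would be sent.
    wire: Rc<RefCell<Vec<String>>>,
}

// Hypothetical stand-in for an HTTP client call.
async fn send_delete(wire: &RefCell<Vec<String>>, url: &str) {
    wire.borrow_mut().push(format!("DELETE {url}"));
}

impl AsyncDrop for LowLevel {
    async fn into_drop(self) {
        // Any connection state would be stored in this future, not in LowLevel.
        send_delete(&self.wire, &self.url).await;
    }
}

struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Tiny busy-polling executor, enough for futures that never really block.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

fn drop_low_level() -> Vec<String> {
    let wire = Rc::new(RefCell::new(Vec::new()));
    let value = LowLevel {
        url: "http://example.invalid/item/1".to_string(),
        wire: Rc::clone(&wire),
    };
    block_on(value.into_drop());
    let sent = wire.borrow().clone();
    sent
}
```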

For mid-level structs that simply contain AsyncDrop structs, it's obvious how drop flags are implemented: they are implemented by join!, which I believe contains the drop flags originally discussed:

#[derive(AsyncDrop)]
struct MidLevel {
    data1: AsyncDroppable,
    data2: AsyncDroppable,
}
// Derived implementation in the poll_drop case
struct MidLevel {
    data1: ManuallyDrop<AsyncDroppable>,
    data2: ManuallyDrop<AsyncDroppable>,
}
impl Drop for MidLevel {
    fn drop(&mut self) {
        unsafe {
            ManuallyDrop::drop(&mut self.data1);
            ManuallyDrop::drop(&mut self.data2);
        }
    }
}
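impl AsyncDrop for MidLevel {
    async fn into_drop(mut self) {
        let (f1, f2) = unsafe { (ManuallyDrop::take(&mut self.data1), ManuallyDrop::take(&mut self.data2)) };
        // The fields have been moved out, so skip MidLevel's own Drop to avoid a double drop.
        std::mem::forget(self);
        join!(f1.into_drop(), f2.into_drop());
    }
}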
// Derived implementation in the poll_drop_ready case for Unpin structs
struct MidLevel {
    data1: AsyncDroppable,
    data2: AsyncDroppable,
}
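impl AsyncDrop for MidLevel {
    async fn drop_ready(self: Pin<&mut Self>) {
        // MidLevel is Unpin in this case, so plain field borrows are enough.
        let this = self.get_mut();
        join!(Pin::new(&mut this.data1).drop_ready(), Pin::new(&mut this.data2).drop_ready());
    }
}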
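To show where the drop flags end up when join! is used, here is a hand-written two-way join using only std. Join2 and Countdown are hypothetical names and this is not how any particular join! macro is implemented; the Option slots play the role of the flags.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Each slot is set to None once its future has completed: that is the drop flag.
struct Join2<A, B> {
    a: Option<A>,
    b: Option<B>,
}

impl<A, B> Future for Join2<A, B>
where
    A: Future<Output = ()> + Unpin,
    B: Future<Output = ()> + Unpin,
{
    type Output = ();
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let this = self.get_mut();
        if let Some(a) = this.a.as_mut() {
            if Pin::new(a).poll(cx).is_ready() {
                this.a = None;
            }
        }
        if let Some(b) = this.b.as_mut() {
            if Pin::new(b).poll(cx).is_ready() {
                this.b = None;
            }
        }
        if this.a.is_none() && this.b.is_none() {
            Poll::Ready(())
        } else {
            Poll::Pending
        }
    }
}

// Hypothetical drop future: pending `remaining` times, counting its polls.
struct Countdown {
    remaining: u32,
    polls: Rc<Cell<u32>>,
}

impl Future for Countdown {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        self.polls.set(self.polls.get() + 1);
        if self.remaining == 0 {
            Poll::Ready(())
        } else {
            self.remaining -= 1;
            Poll::Pending
        }
    }
}

struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Returns how often each inner future was polled.
fn join_poll_counts() -> (u32, u32) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let (polls_a, polls_b) = (Rc::new(Cell::new(0)), Rc::new(Cell::new(0)));
    let mut joined = Join2 {
        a: Some(Countdown { remaining: 2, polls: Rc::clone(&polls_a) }),
        b: Some(Countdown { remaining: 0, polls: Rc::clone(&polls_b) }),
    };
    while Pin::new(&mut joined).poll(&mut cx).is_pending() {}
    (polls_a.get(), polls_b.get())
}
```

The second future finishes on the first round and is never polled again, at the cost of the extra state in Join2.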

For high-level futures, think about how to drop the future produced by dropping the future. That creates an infinite loop! In my opinion, it should be the same future produced by dropping the original future. So in fact, dropping a future is just turning it into the "dropping" state: there is no way to actually drop it until the future destructs itself. Think of it as an alternate code path in await that effectively changes the return type to () and returns immediately.

struct MyFuture(PhantomPinned);
struct MyFutureDroppingWrapper<'a>(Pin<&'a mut MyFuture>); // note this is Unpin
// for the poll_drop_ready case
impl AsyncDrop for MyFuture {
    fn drop_ready<'a>(self: Pin<&'a mut Self>) -> impl Future<Output = ()> + 'a {
        // set the future to "dropping state" and return myself!
        MyFutureDroppingWrapper(self)
    }
}
impl<'a> Future for MyFutureDroppingWrapper<'a> {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        self.0.as_mut().poll_drop_ready_private(cx)
    }
}

// For the poll_drop case
impl AsyncDrop for MyFuture {
    fn into_drop(self) {
        // How did you move out of a Pin?
        // The hack is needed in this case.
    }
}

There is another aspect of the problem that I would like to point out. It makes sense to allow fields to drop in an async context and this helps especially if we have a global executor (so we can spawn futures in drop), but why do we need to enable fields to drop in parallel?

Of course, the goal is to improve performance. But I'm afraid it is easy to end up with something less efficient precisely in the cases where parallelism is expected to have a positive impact on performance.

The first problem is with large arrays of AsyncDrop. A future executor usually has a good queueing implementation and deals with lots of futures efficiently, but a naive join! implementation doesn't. Think about an array with 10000 elements; even if there are only 2~3 outstanding futures, it has to check the flag of all these elements in every poll.
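A rough way to see the cost is to count flag checks in a simulated naive join. This is a model, not a real join! implementation: Slot and the numbers are made up for illustration (10000 elements, 3 of which need a few extra polls).

```rust
// One element of the array being dropped.
struct Slot {
    remaining: u32, // how many more polls this element needs
    done: bool,     // the per-element drop flag
}

// Simulates a naive join: every poll walks the whole array and checks every flag.
// Returns the total number of flag checks until all elements are done.
fn naive_join_flag_checks(slots: &mut [Slot]) -> usize {
    let mut checks = 0;
    loop {
        let mut all_done = true;
        for slot in slots.iter_mut() {
            checks += 1;
            if slot.done {
                continue;
            }
            if slot.remaining == 0 {
                slot.done = true;
            } else {
                slot.remaining -= 1;
                all_done = false;
            }
        }
        if all_done {
            return checks;
        }
    }
}

fn checks_for_large_array() -> usize {
    let mut slots: Vec<Slot> = (0..10_000)
        .map(|i| Slot { remaining: if i < 3 { 4 } else { 0 }, done: false })
        .collect();
    naive_join_flag_checks(&mut slots)
}
```

In this model the three slow elements force five full passes, so 50000 flag checks are spent on work that concerns only three elements.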

As pointed out somewhere in the thread, when performing select!(a, b), we don't want dropping b to block the execution of a. Even if select! doesn't drop it immediately, we don't want dropping either of the futures to block the current async function from returning. A solution is to allow await to return before the future gets dropped, but that prevents reclaiming the stack space (or pinned space) used by the future.

This means that if an item needs to actually block before it can be dropped, there is already a big problem with that item. Instead, it would be more efficient to allocate dedicated heap space to manage the "dropping lifecycle" of the item and spawn it as a future if blocking is needed. And if blocking is not necessary, I would suggest dropping fields one by one instead.

I just want to be clear:

Even if the default drop glue does join!($(self.$member.drop_ready()),*).await, nothing prevents you from implementing AsyncDrop for your type to do something special (e.g. spawn destructors directly on the executor), the same way Drop exists to specialize the drop glue for a synchronous drop.

The default definitely should be the allocation-free path, however.

I do think that the idea of returning a new future box could be a good one, though it does necessarily require GAT. Sketching all of the APIs more concretely, along with drop glue:

pub trait AsyncDrop {
    fn async_drop(self: Pin<&'self mut Self>) -> impl Future<Output=()> + 'self;
}

struct Complex {
    contains: AFuture,
    and: AnotherFuture,
}

// drop glue is the equivalent of:

impl Drop for Complex {
    fn drop(&mut self) {
        self.contains.drop();
        self.and.drop();
    }
}

// This version takes more stack space for the clear split of sync prep and async work
impl AsyncDrop for Complex {
    fn async_drop(self: Pin<&'self mut Self>) -> impl Future<Output=()> + 'self {
        let contains = pin_project!(&mut self.contains).async_drop();
        let and = pin_project!(&mut self.and).async_drop();
        join!(contains, and)
    }
}

// This version is almost (but not quite) additional state free
impl AsyncDrop for Complex {
    fn async_drop(self: Pin<&'self mut Self>) -> impl Future<Output=()> + 'self {
        async {
            let contains = pin_project!(&mut self.contains).async_drop();
            let and = pin_project!(&mut self.and).async_drop();
            join!(contains, and).await
        }
    }
}

There is a non-obvious cost to this, though: the space required to asynchronously drop a type is necessarily increased by at least one usize to store the reference again. We trade off the ability to request more space to do the drop work with the fact that the drop work has to hold its own reference to our type we're dropping. We could make another future-like type whose poll method gets two references to avoid this cost, but at that point what's the actual gain anymore?
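The extra usize can be checked with a small sketch. Target and DropWork are hypothetical names; the only claim is about sizes, not about how real drop futures would be laid out.

```rust
use std::mem::size_of_val;
use std::pin::Pin;

// Hypothetical type being dropped.
struct Target {
    flushed: bool,
}

// A hand-written drop future has to keep the reference to the value it drops.
#[allow(dead_code)]
struct DropWork<'a>(Pin<&'a mut Target>);

// An async block that captures the reference stores it in its state as well.
fn drop_future_size(target: &mut Target) -> usize {
    let fut = async move {
        target.flushed = true;
    };
    size_of_val(&fut)
}
```

std::mem::size_of::<DropWork<'static>>() equals size_of::<usize>(), since Pin<&mut T> is a transparent wrapper around the reference, and drop_future_size reports at least that much for the async block.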

The "zero-cost" version that allows for extra drop space:

trait HasAsyncDrop {
    fn async_drop(self: Pin<&'self mut Self>) -> impl DoAsyncDrop<Self> + 'self;
}

trait DoAsyncDrop<This> {
    fn poll(self: Pin<&mut Self>, this: Pin<&mut This>, cx: &mut Context<'_>) -> Poll<()>;
}

The tricky part is then requiring/proving that this reaches a fixed point where the DoAsyncDrop type has zero size. And, tbh, though I say this is a zero-cost version, I'm not even convinced that you can have a fixed point at all.
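For a leaf type the zero-size case is easy to sketch. Below I use an associated type (Work) in place of the 'self / impl Trait syntax, which is an assumption of mine and not valid in the form written above; Buffer and FlushWork are hypothetical names.

```rust
use std::mem::size_of;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

trait DoAsyncDrop<This> {
    fn poll(self: Pin<&mut Self>, this: Pin<&mut This>, cx: &mut Context<'_>) -> Poll<()>;
}

trait HasAsyncDrop: Sized {
    type Work: DoAsyncDrop<Self>;
    fn async_drop(self: Pin<&mut Self>) -> Self::Work;
}

// Hypothetical leaf type that keeps all of its drop state inline.
struct Buffer {
    pending: u32,
}

// Zero-sized: the value being dropped is passed back in on every poll.
struct FlushWork;

impl DoAsyncDrop<Buffer> for FlushWork {
    fn poll(self: Pin<&mut Self>, mut this: Pin<&mut Buffer>, _cx: &mut Context<'_>) -> Poll<()> {
        if this.pending == 0 {
            Poll::Ready(())
        } else {
            this.pending -= 1;
            Poll::Pending
        }
    }
}

impl HasAsyncDrop for Buffer {
    type Work = FlushWork;
    fn async_drop(self: Pin<&mut Self>) -> FlushWork {
        FlushWork
    }
}

struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Returns the size of the drop work type and the number of polls it needed.
fn drop_buffer() -> (usize, u32) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut buf = Buffer { pending: 2 };
    let mut work = Pin::new(&mut buf).async_drop();
    let mut polls = 0;
    loop {
        polls += 1;
        if Pin::new(&mut work).poll(Pin::new(&mut buf), &mut cx).is_ready() {
            return (size_of::<FlushWork>(), polls);
        }
    }
}
```

This says nothing about composite types, which is where the fixed-point question actually arises.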

Regarding the problem with async fn in traits. While other problems (especially the problem with GAT) can be solved at the compiler level, the problem with dyn Trait not being Sized is inevitable.

struct AsyncGuard {
    url: String,
}
impl AsyncDrop for AsyncGuard {
    async fn drop_ready(&mut self) {
        // invoke HTTP DELETE
    }
}

If AsyncGuard were stored in a Box<dyn AsyncDrop>, where would the state for drop_ready be stored? If we don't want to preallocate the space for every struct implementing drop_ready, it cannot be stored in the AsyncGuard itself. It cannot be stored in the pinned future invoking drop_ready either, since the size is not known at compilation time. It has to be stored in another Box instead.

impl<T> BoxAsyncDrop for T where T: AsyncDrop {
    fn box_drop_ready<'a>(&'a mut self) -> Box<dyn Future<Output = ()> + 'a>{
        Box::new(self.drop_ready())
    }
}
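A compilable variant of the blanket impl, as a sketch: it assumes async fn in traits (stable since Rust 1.75) and returns Pin<Box<..>> instead of a bare Box so that the result can be polled. The wire log and block_on are hypothetical stand-ins for a network client and an executor.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::{pin, Pin};
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Not usable as `dyn AsyncDrop`: the returned future type differs per implementor.
trait AsyncDrop {
    async fn drop_ready(&mut self);
}

// Usable as a trait object: the future is erased behind a Box.
trait BoxAsyncDrop {
    fn box_drop_ready<'a>(&'a mut self) -> Pin<Box<dyn Future<Output = ()> + 'a>>;
}

impl<T: AsyncDrop> BoxAsyncDrop for T {
    fn box_drop_ready<'a>(&'a mut self) -> Pin<Box<dyn Future<Output = ()> + 'a>> {
        Box::pin(self.drop_ready())
    }
}

struct AsyncGuard {
    url: String,
    // Stand-in for the network: records the requests that would be sent.
    wire: Rc<RefCell<Vec<String>>>,
}

impl AsyncDrop for AsyncGuard {
    async fn drop_ready(&mut self) {
        // A real implementation would await an HTTP DELETE here.
        self.wire.borrow_mut().push(format!("DELETE {}", self.url));
    }
}

struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

fn drop_boxed_guard() -> Vec<String> {
    let wire = Rc::new(RefCell::new(Vec::new()));
    let mut guard: Box<dyn BoxAsyncDrop> = Box::new(AsyncGuard {
        url: "http://example.invalid/lock/7".to_string(),
        wire: Rc::clone(&wire),
    });
    // The drop state lives in a second heap allocation made by box_drop_ready.
    block_on((*guard).box_drop_ready());
    let sent = wire.borrow().clone();
    sent
}
```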

This ties AsyncDrop to a specific Box implementation but it gets things working. To avoid that with dyn: (I'm not sure if it works in current Rust at all)

impl<T> PtrAsyncDrop for T where T: AsyncDrop {
    const fn ptr_drop_ready_size() -> usize;
    unsafe fn ptr_drop_ready<'a>(&'a mut self, p: *mut u8) -> &'a mut (dyn Future<Output = ()> + 'a) {
        ptr::write(p, self.drop_ready());
        &mut *p
    }
}

This avoids forcing a Box implementation, and the pinned future is then able to compute ptr_drop_ready_size() for every dyn Future, sum them up, perform one single allocation and call ptr_drop_ready for every dyn Future, as if it generated a struct at runtime and computed the offset of every field like the compiler. However, splitting the function into two stages, one computing the size and another writing the object, is inevitable. This is like implementing GAT for dyn Trait, because the program needs to compute a struct definition which the compiler would have produced for concrete types.

The problem with trait objects is one reason poll_drop_ready is advantageous. Another has to do with auto traits.

If we have an async fn drop, we don't know what auto traits the resulting future implements. For example, though the type being dropped might be Send, the future created for dropping it might not be. Therefore, futures containing that type are no longer Send unless the destructor future of that type is also Send. If that type is generic, we simply have no idea if it has a destructor or if that destructor is Send. That is, an async fn foo<T: Send> would no longer be provably Send in any context where T is not known.
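The mismatch between the type and its drop future can be seen with ordinary async fns today. Guard, flush, is_send and the two drop functions are hypothetical names used only for this sketch.

```rust
use std::rc::Rc;
use std::sync::Arc;

// The value being dropped is Send.
struct Guard {
    url: String,
}

// Compiles only for Send arguments.
fn is_send<T: Send>(_value: &T) -> bool {
    true
}

// Hypothetical await point inside the destructor.
async fn flush() {}

// Holds an Arc across the await: the resulting future is Send.
async fn drop_with_arc(guard: Guard) -> usize {
    let scratch = Arc::new(guard.url);
    flush().await;
    scratch.len()
}

// Holds an Rc across the await: the resulting future is not Send,
// even though Guard itself is.
#[allow(dead_code)]
async fn drop_with_rc(guard: Guard) -> usize {
    let scratch = Rc::new(guard.url);
    flush().await;
    scratch.len()
}

fn send_check() -> bool {
    let fut = drop_with_arc(Guard { url: String::from("x") });
    // is_send(&drop_with_rc(Guard { url: String::from("x") })) is rejected by the compiler.
    is_send(&fut)
}
```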

In general, it is very advantageous for the type being dropped to contain all of the state needed to drop it. And it is actually quite logical for this to be the case for the main motivating examples (things like bufwriters and scope guards). Obviously it is less convenient for just whipping up some async code in a destructor, but the problems allowing that would present (setting aside the implementation blockers) are nontrivial.

Regarding the problem for auto traits, currently the only auto trait that is usually related to Future is Send (Sync is useless because there are no methods to deal with &Future). So there are only two kinds of Future with respect to auto traits: Send futures and non-Send futures. The problem with auto traits seems to concern how to express in the function signature of an async fn whether the Future produced by it is Send or not.

Sooner or later we will be able to make the compiler smart enough to prove that such a program is sound. The real problem is the reverse: if a program is not sound, how do we modify the program to make it sound? For the compiler, proving that a program is sound involves checking a set of boolean constraints, which can be done in linear time. However, figuring out how a set of boolean constraints could be satisfied is the SAT problem, which is NP-complete in general, and every known algorithm for it takes exponential time in the worst case.

In this particular case, only positive reasoning is involved and the problem is not NP-hard yet for the compiler. However, the complexity of "proving a program is sound" versus "fixing an unsound program" is just like the complexity of "checking a set of boolean constraints" versus SAT. In order to make a non-Send future Send, the programmer needs to look for all non-Send arguments, maybe remove any usage of RefCell in the function body and replace them with RwLock even though the programmer is sure they won't be used across threads, or wait until an upstream library makes its structs Send or a new compiler feature is implemented.
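The checking-versus-solving gap can be illustrated with a toy example that is unrelated to the actual trait solver. Clauses are lists of non-zero integers (n means variable n is true, -n means it is false); satisfies and solve are hypothetical helper names.

```rust
type Clause = Vec<i32>;

// Checking a given assignment: one pass over the clauses, linear in their size.
fn satisfies(clauses: &[Clause], assignment: &[bool]) -> bool {
    clauses.iter().all(|clause| {
        clause.iter().any(|&lit| {
            let value = assignment[(lit.unsigned_abs() - 1) as usize];
            if lit > 0 { value } else { !value }
        })
    })
}

// Finding an assignment by brute force: up to 2^vars candidates.
fn solve(clauses: &[Clause], vars: usize) -> Option<Vec<bool>> {
    (0u32..(1u32 << vars)).find_map(|mask| {
        let assignment: Vec<bool> = (0..vars).map(|i| (mask & (1 << i)) != 0).collect();
        satisfies(clauses, &assignment).then_some(assignment)
    })
}
```

For the clauses (x1 or x2), (not x1 or x2), (not x2 or x3), solve tries six failing candidates before reaching x1 = false, x2 = true, x3 = true, while checking that final answer is a single pass.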

My conclusion is that instead of dealing with a complex problem later, you have to avoid the problem in the first place. In other words, instead of accidentally pouring in non-Send arguments to produce a non-Send future, force the programmer to be explicit about whether they expect a Send future or a non-Send one. This means implementing an async fn in two versions: one for Send arguments and another for non-Send arguments. Avoiding duplicate code for these two versions is a completely different problem, which can be solved by sharing function pointers between them. This means dividing the Rust world into two worlds, a Send world and a non-Send world, and this is inevitable because if the boundary weren't clear, the non-Send world would erode the Send world, which creates more problems in the future. However, we do want to make sure Send code automatically gets its non-Send counterpart.

Regarding drop state, there is a distinction between the problems AsyncDrop solves and the BufWriter problem. When you are dropping a future, you probably don't want to flush the bufwriter since the future is... dropped anyway. Instead, it should be enforced via linting. For example, a marker trait MustBeDone could be used to complain if an object is implicitly dropped. The programmer has to drop the object by explicitly calling drop or passing the object to a function taking self, such as async fn close(self) or async fn sync(self).
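As a sketch of the explicit-close style: MustBeDone does not exist, and the #[must_use] attribute used below is only a weak approximation, since it warns about unused return values and not about a value that is implicitly dropped at the end of a scope. Writer and block_on are hypothetical names.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

#[must_use = "call `close().await` instead of letting the value drop"]
struct Writer {
    buffered: Vec<u8>,
    sink: Vec<u8>,
}

impl Writer {
    fn write(&mut self, bytes: &[u8]) {
        self.buffered.extend_from_slice(bytes);
    }

    // Consumes the writer, so it cannot be used (or closed) twice.
    async fn close(mut self) -> Vec<u8> {
        // A real implementation would await the flush here.
        let mut sink = std::mem::take(&mut self.sink);
        sink.append(&mut self.buffered);
        sink
    }
}

struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

fn write_and_close() -> Vec<u8> {
    let mut writer = Writer { buffered: Vec::new(), sink: Vec::new() };
    writer.write(b"hello");
    block_on(writer.close())
}
```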