Asynchronous Destructors

Sure, that's similar to what withoutboats' original blog post said. But what problems does this cause? The tradeoff to me looks like it's:

  • Add poll_drop to the Drop trait
  • Potentially, give Box async destructor support

versus the naive version

  • Add a separate AsyncDrop trait (using an associated type, so it can be implemented today)
  • Accept that Box will not have async destructor support (because there is no nice way to implement the async destructor of Box)

To me, this latter option seems worth considering. People can still implement various types of Box-with-async-destructor, for example using a poll_drop supertrait if they want - this decision would just be made by applications rather than being baked into Drop.

The challenge is in making the various forms of async destructor useful, not just easy to write.

First, whatever we do here has to be implemented in the compiler; in turn, that means that we expect it to work in a no_alloc environment, as not all Rust targets have an allocator. This makes AsyncDrop difficult - the drop glue has to allocate somehow, but we've just said it can't expect an allocator. The way round this is to add more magic allocations to every type that might have an async destructor, but that then means that you write pub struct Wrapped<T> { hidden: T }, and Wrapped is no longer the same size as a T at runtime, but some unpredictable amount larger to allow for drop glue.

poll_drop avoids this by saying that if you want an async fn drop_async(self: Pin<&mut Self>); in your type, you have to explicitly allow space for it - e.g. your struct might have to contain an async_drop: future::Fuse<…> for the destructor, and then poll_drop can be implemented as:

fn poll_drop(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
    // Pin projection to the `async_drop` field is elided for brevity
    self.async_drop.poll(cx)
}

So, that means that poll_drop is strictly more flexible than AsyncDrop - it can do everything that AsyncDrop can, just requires you to be explicit about storage, plus it allows you to write stateless poll_drop code (e.g. that just forwards to a subfield, for Box, Vec etc).
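To make the "explicit storage" point concrete, here is a minimal sketch. It assumes a hypothetical PollDrop trait standing in for the proposed hook (it is not a real std trait), and the names Flush, Connection and polls_until_dropped are invented for illustration. The cleanup future is an ordinary field, so the struct is exactly as large as its fields and nothing is allocated.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for the proposed `poll_drop` hook; not a real std trait.
trait PollDrop {
    fn poll_drop(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()>;
}

/// Toy cleanup future: Pending on the first poll, Ready on the second.
struct Flush {
    polled_once: bool,
}

impl Future for Flush {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.polled_once {
            Poll::Ready(())
        } else {
            self.polled_once = true;
            Poll::Pending
        }
    }
}

/// The type reserves space for its own destructor future as a plain field.
struct Connection {
    async_drop: Flush,
}

impl PollDrop for Connection {
    fn poll_drop(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        // `Connection` is `Unpin` here, so projecting with `Pin::new` is sound.
        Pin::new(&mut self.async_drop).poll(cx)
    }
}

/// Waker that does nothing; enough for a busy-polling demo.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Calls `poll_drop` until it reports Ready and returns the number of calls.
fn polls_until_dropped(conn: &mut Connection) -> u32 {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if Pin::new(&mut *conn).poll_drop(&mut cx).is_ready() {
            return polls;
        }
    }
}
```

A type whose state is not Unpin would need a real pin projection instead of Pin::new; that part is deliberately left out of the sketch.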

Second, poll_drop_ready is more useful to the implementor than poll_drop, because the guarantees are simpler to express. Consider the drop glue for both (in pseudo-Rust):

// Using `poll_drop`
// This function is called repeatedly until it returns Ready, as per a normal async function
// cx is Some if in async context, None if not
magic fn drop_glue<T>(drop_me: Pin<&mut T>, cx: Option<&mut Context>) -> Poll<()> {
    let async_drop_res = match cx {
        None => { drop_me.drop(); Poll::Ready(()) }
        Some(cx) => drop_me.poll_drop(cx),
    };
    if let Poll::Ready(()) = async_drop_res {
        recurse_drop_glue_members(drop_me, cx) // Defined as Poll::Ready(()) iff `drop_me` has no members, else runs this function on all members
    } else {
        async_drop_res
    }
}
// Using `poll_drop_ready`
// This function is called repeatedly until it returns Ready, as per a normal async function
// cx is Some if in async context, None if not
magic fn drop_glue<T>(drop_me: Pin<&mut T>, cx: Option<&mut Context>) -> Poll<()> {
    let async_drop_res = match cx {
        None => Poll::Ready(()),
        Some(cx) => drop_me.poll_drop_ready(cx),
    };
    if let Poll::Ready(()) = async_drop_res {
        drop_me.drop();
        return recurse_drop_glue_members(drop_me, cx); // Defined as Poll::Ready(()) iff `drop_me` has no members, else runs this function on all members
    }
    async_drop_res
}

Yes, the glue code is marginally more complex in the latter case, as it runs two user-provided functions in an async context, not just one - however, the user of poll_drop_ready gets a guarantee that drop will also be called, not just poll_drop_ready. This then means that, as the user, you only need implement the memory-safety relevant code once - in drop - and not twice - in poll_drop as async code, and in drop as sync code - which reduces errors. poll_drop_ready is a pure optimization, as it stops you from having to block waiting for a destructor to finish.
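The ordering guarantee can be sketched in today's Rust. This assumes a hypothetical PollDropReady trait (simplified to take &mut self rather than Pin<&mut Self>) and an ordinary async fn, async_context_glue, playing the role of the compiler-generated glue; Resource and the log strings are made up for the demo. In the async path the hook is polled to readiness and then the normal Drop runs; in the sync path only Drop runs.

```rust
use std::cell::RefCell;
use std::future::{poll_fn, Future};
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Hypothetical hook; simplified to `&mut self` for this sketch.
trait PollDropReady {
    fn poll_drop_ready(&mut self, cx: &mut Context<'_>) -> Poll<()>;
}

struct Resource {
    log: Log,
    io_done: bool,
}

impl PollDropReady for Resource {
    fn poll_drop_ready(&mut self, _cx: &mut Context<'_>) -> Poll<()> {
        if self.io_done {
            self.log.borrow_mut().push("poll_drop_ready: ready");
            Poll::Ready(())
        } else {
            // Pretend the outstanding I/O completes before the next poll.
            self.io_done = true;
            self.log.borrow_mut().push("poll_drop_ready: pending");
            Poll::Pending
        }
    }
}

impl Drop for Resource {
    fn drop(&mut self) {
        // All resource release lives here, exactly once.
        self.log.borrow_mut().push("drop");
    }
}

/// Stand-in for the async drop glue: wait for readiness, then run sync drop.
async fn async_context_glue<T: PollDropReady>(mut value: T) {
    poll_fn(|cx| value.poll_drop_ready(cx)).await;
    drop(value);
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Busy-polls a future to completion with a no-op waker.
fn poll_to_completion<F: Future<Output = ()>>(fut: F) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    while fut.as_mut().poll(&mut cx).is_pending() {}
}

fn run_async_context() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    poll_to_completion(async_context_glue(Resource { log: log.clone(), io_done: false }));
    let events = log.borrow().clone();
    events
}

fn run_sync_context() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    drop(Resource { log: log.clone(), io_done: false });
    let events = log.borrow().clone();
    events
}
```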

This, in turn, means two things to the user of async destructors:

  1. When using poll_drop, the drop method that's called varies according to what context you're in - for code that implements only one of the two drop methods (sync or async), we have to somehow ensure that the other one is called, and we need rules for when to write a poll_drop/AsyncDrop implementation, and when to just write drop. In contrast, the rules are simple for poll_drop_ready - drop is the code you write to ensure that nothing is leaked, poll_drop_ready makes sure that drop never blocks.
  2. Non-trivial types need duplicate code between drop and AsyncDrop/poll_drop, because both destructors need to cleanly release resources you own. poll_drop_ready is just an optimization, so all resources can be cleaned up in drop, and poll_drop_ready just handles pushing async resources to a "done" state.

For example, take a process handling phone calls as part of a cluster of IMS servers - when it shuts down, we want to hand off all the active calls to another server, so that we don't drop calls on a normal restart. This implies that drop already has to hand all calls over, synchronously, to another server, and then drop all in-memory structures that represent those calls. In the poll_drop/AsyncDrop case (since the distinction between the two is just in whether the compiler allocates space for the drop future, or the user does), both chunks of work have to be repeated as part of the poll_drop work; in the poll_drop_ready case, poll_drop_ready has to asynchronously hand over all calls, but does not have to handle dropping the internal state (which cannot be handled asynchronously - there's no blocking involved here) once the calls are handed over, because drop will be called anyway.

TL;DR: poll_drop and AsyncDrop are equivalent in power, modulo who allocates storage for the Future state machine. poll_drop wins on that, because it makes the storage for the state machine explicit, rather than a compiler-generated allocation (which can't be done in a no_alloc world). poll_drop_ready is simpler to explain and involves less duplication since the glue code still calls sync drop, and it's clear what belongs to drop (everything), and what belongs to poll_drop_ready (any work that has to be done so that drop is non-blocking in async terms).


The drop glue is implemented by async syntax, which would reserve space in the Future struct that it is building. It's like awaiting a certain async function at the end of your async block. The relevant "destructor state problem" as stated in the original post is "dropping trait objects" - but you don't drop a trait object directly. You might drop a Box<dyn T>, but not in a no_alloc environment! I am suggesting that in the naive AsyncDrop approach it is natural to bite the bullet and say that dropping a Box<dyn T> in an async block would not call the async destructor.

I would also point out that adding an extra virtual function call when you drop a Box<dyn T> in an async block seems significantly non-zero-cost. The original post doesn't discuss whether Box would call async destructors, so I might be attacking a strawman here - but then what is the concern with dropping trait objects?

I don't see how futures could be harder to use - it should just be like writing "my_object.async_drop().await" at the end of your async block (roughly speaking). If recursive async drop is wanted in the future, I don't see why it would be any more complicated than with polling. I imagine something like this:

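async fn async_drop_all(mut x: MyStruct) {
    // Pseudocode: run x's own async destructor first, then recurse into its fields
    async_drop(&mut x).await;
    let MyStruct { field1, field2 } = x;
    async_drop_all(field1).await;
    async_drop_all(field2).await;
}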

Again, the compiler-generated allocation would just be part of the Future being constructed by async block - not a separate heap allocation. It might be able to save space in general, for example:

async fn f() {
    let a: SmallObjectWithComplicatedAsyncDrop = ...;
    let b: LargeObjectWithNoAsyncDrop = ...;
    a.use(b);
}

Here, the Future only needs space for the biggest of:

  • a + a's async drop future, or
  • a + b

With poll_drop, the space for a's async destructor has to be stored in a (or in a box, but that's adding a heap allocation and requires an allocator). This means the future needs space for a + a's async drop future + b. This might be fine in practice, I don't know.


I want to clarify that I think poll_drop(_ready) makes sense as a first step. I just think the "destructor state problem" needs more nuance - a realistic "AsyncDrop" approach to compare to is:

  • A new generic AsyncDrop trait using an associated type (potentially an async trait fn when that is implemented).
  • std::boxed::Box<dyn T> doesn't call async destructors.
  • If you want an async-destructor-aware box for dyn T, for example to box futures, then you can choose a polling approach, using a new type impl<T: PollDropReady> PollDropBox<dyn T>. Here PollDropReady is an object-safe trait with fn poll_drop_ready.
  • No automatic recursive drop glue, at least at first.
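For comparison, the associated-type flavour can already be written on stable Rust. The sketch below is only an illustration under assumptions: the AsyncDrop trait, its Dropper associated type, and the Session/CloseSession types are all hypothetical names, and the destructor future is a hand-written named type because an async block cannot be named in an associated type today.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical trait: the destructor is a future named by an associated type.
trait AsyncDrop {
    type Dropper: Future<Output = ()>;
    fn async_drop(self) -> Self::Dropper;
}

struct Session {
    closed: Rc<Cell<bool>>,
}

/// Hand-written destructor future; it owns the value being destroyed.
struct CloseSession {
    session: Session,
    notified_peer: bool,
}

impl Future for CloseSession {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if !self.notified_peer {
            // First poll: pretend to start notifying the peer.
            self.notified_peer = true;
            return Poll::Pending;
        }
        self.session.closed.set(true);
        Poll::Ready(())
    }
}

impl AsyncDrop for Session {
    type Dropper = CloseSession;
    fn async_drop(self) -> CloseSession {
        CloseSession { session: self, notified_peer: false }
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Busy-polls a future to completion and returns the number of polls.
fn poll_to_completion<F: Future<Output = ()>>(fut: F) -> u32 {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    let mut polls = 0;
    loop {
        polls += 1;
        if fut.as_mut().poll(&mut cx).is_ready() {
            return polls;
        }
    }
}

/// Returns (number of polls, whether the session ended up closed).
fn close_demo() -> (u32, bool) {
    let closed = Rc::new(Cell::new(false));
    let polls = poll_to_completion(Session { closed: closed.clone() }.async_drop());
    (polls, closed.get())
}
```

Nothing here is called automatically; the caller has to write session.async_drop().await (or equivalent) by hand, which matches the "no automatic recursive drop glue" bullet.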

The ergonomic advantage would be using async blocks in destructors without any change to the struct, though with current Rust this would generally require boxing as in async-trait.

These changes are more or less compatible with the changes suggested in the original post, except that poll_drop(_ready) goes on a new trait rather than Drop.

That then means that if I'm not using async syntax to implement my future, I have to ensure that I manually implement drop glue, which is an extra tax on developers. We can already expect the compiler to generate the right sync drop glue for a type and everything it contains, even if I manually implement drop or don't implement it at all - why should async destruction be any different?

Put differently, in the poll_drop* cases, the drop glue that's generated for any async context correctly handles putting Futures in containers. For an example that is currently allocation free, consider:

enum PossiblyStatic<F, T> where F: Future<Output = T> {
    Fut(F),
    Static(T),
}

Both F and T are stored by value inside the enum, and thus, if I want F's async destructor to run when I drop a value of this type, I need something using this type to correctly handle drop glue. However, because it's not itself a Future (at least, not as described here - you could implement Future on this), it's more ergonomic if the compiler is able to deduce that, if I trigger drop glue for one of these in an async context and F has an async destructor and the enum's current variant is Fut(F), then it needs to run F's async destructor.
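Without compiler support, the forwarding described above has to be written by hand for every container-like type. A rough sketch of what that looks like for this enum, assuming a hypothetical PollDropReady trait (simplified to &mut self) and an invented SlowClose future used only to exercise it:

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical hook; simplified to `&mut self` for this sketch.
trait PollDropReady {
    fn poll_drop_ready(&mut self, cx: &mut Context<'_>) -> Poll<()>;
}

enum PossiblyStatic<F, T>
where
    F: Future<Output = T>,
{
    Fut(F),
    Static(T),
}

/// Hand-written forwarding that drop glue would otherwise generate.
impl<F, T> PollDropReady for PossiblyStatic<F, T>
where
    F: Future<Output = T> + PollDropReady,
{
    fn poll_drop_ready(&mut self, cx: &mut Context<'_>) -> Poll<()> {
        match self {
            // Only the `Fut` variant holds something with an async destructor.
            PossiblyStatic::Fut(fut) => fut.poll_drop_ready(cx),
            PossiblyStatic::Static(_) => Poll::Ready(()),
        }
    }
}

/// Invented future whose cleanup needs `remaining` extra polls.
struct SlowClose {
    remaining: u32,
}

impl Future for SlowClose {
    type Output = u32;
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u32> {
        Poll::Ready(0)
    }
}

impl PollDropReady for SlowClose {
    fn poll_drop_ready(&mut self, _cx: &mut Context<'_>) -> Poll<()> {
        if self.remaining == 0 {
            Poll::Ready(())
        } else {
            self.remaining -= 1;
            Poll::Pending
        }
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls the hook until Ready and returns how many polls that took.
fn polls_until_ready<D: PollDropReady>(value: &mut D) -> u32 {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if value.poll_drop_ready(&mut cx).is_ready() {
            return polls;
        }
    }
}

/// Returns the poll counts for the `Fut` and `Static` variants.
fn forwarding_demo() -> (u32, u32) {
    let mut fut: PossiblyStatic<SlowClose, u32> = PossiblyStatic::Fut(SlowClose { remaining: 2 });
    let mut stat: PossiblyStatic<SlowClose, u32> = PossiblyStatic::Static(7);
    (polls_until_ready(&mut fut), polls_until_ready(&mut stat))
}
```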

Modulo the pointer-ness, this is true also of Box, Vec, Arc and other container and pointer types. It seems wrong to me that it should be OK for does_drops below to run async destructors on its arguments, while does_sync_drop doesn't. This is the sort of refactor that I would expect to not have a significant impact on blocking, and yet it does - by moving the Futures into a Vec, I've ensured that any blocking in thing2's destructor suddenly blocks the entire execution thread it's using, instead of just the task.

use std::future::Future;

async fn does_drops<F>(thing1: F,
                       thing2: F) -> u32
where F: Future<Output=u32>
{
    thing1.await
}
async fn does_sync_drop<F>(thing1: F,
                           thing2: F) -> u32
where F: Future<Output=u32>
{
    let v = vec![thing1, thing2];
    handle_vec(v).await
}
fn choose_item<T>(mut v: Vec<T>) -> T {
    // This is a placeholder, and will be more complex in future
    // The rest of `v` is dropped synchronously when this function returns
    v.swap_remove(0)
}
async fn handle_vec(v: Vec<impl Future<Output=u32>>) -> u32 {
    choose_item(v).await
}

While I expect does_sync_drop to be marginally slower (it's handling a heap-allocated container, after all), this isn't an unreasonable refactor to do if I'm going from small N to variable N, and having unexpected I/O latencies caused by thing2's destructor is a pain. The alternative is to have async versions of all the common containers and smart pointers, and to accept that I can't use the normal versions in async context, because they introduce surprise long latencies.

This argument still applies even if we make it about futures instead of polls. If I have correctness constraints that require drop, I have to write drop regardless of whether I'm doing something async or not, and it needs to ensure that correctness is held in the case where drop is run from a sync context. The difference then becomes how much duplication needs to exist between async fn drop_ready(self: Pin<&mut Self>) and fn drop(&mut self). In the async fn drop(self: Pin<&mut Self>) case, I need a copy of all parts of drop that apply in the async world in async drop. In an async drop_ready world, I only need to consider the parts that could block.

So, using the completion-based I/O idea, you get:

fn drop(&mut self) {
    self.cancel_io();
    self.sync_wait_io_completion();
    // Now do any cleanup that's needed once the I/O is stopped
    // For example, drop reference counts on shared buffers
}

async fn drop_ready(self: Pin<&mut Self>) {
    self.cancel_io();
    self.await_io_completion().await;
    // Leave other clean-up to `drop`
}

async fn drop(self: Pin<&mut Self>) {
    self.cancel_io();
    self.await_io_completion().await;
    // Now do any cleanup that's needed once the I/O is stopped
    // For example, drop reference counts on shared buffers
}

If I refactor the object, and anything in drop changes, I have to remember to check to see if these changes need reflecting in async drop, but in async drop_ready, I only have to remember to check if I've affected a blocking call (and hopefully, the fact that I've touched a blocking call will remind me about drop_ready).

My personal guess is that drop_ready implementations will be rare and tiny, especially if there's good support for correctly threading drop glue around things like Box, Vec and other container types - you just won't need it very often. But I'm willing to be proven wrong by time.


I didn't mean to express a preference for async drop versus async drop_ready - I was just using the name AsyncDrop to discuss the "destructor state problem" in the original post.

I agree that making Box<dyn T> (and Rc, Arc) call async destructors automatically is convenient. But for an incremental approach convenience is not necessarily a priority. And there is the tradeoff of convenience vs zero-cost, where usually Rust would go zero-cost. Synchronous Drop is a bit different because it is the status quo; it's just the API that Box etc provide. If it is desirable for Box<dyn T> to call async drop futures, it could be supported, it's just a bit more complicated.

For code that doesn't use Box<dyn T> etc, for example no_alloc code, there's not much to do - the async drop code can just build up a big Future struct. As a size optimization, it could use polling wherever possible recursively. In a typical no_alloc environment, as I understand it, the statically allocated LocalFutureObj would have to either always be run to completion (i.e. not be cancelled somehow by the executor), or have space reserved statically for its async drop future type.

On the other hand, the compiler-generated async destructor of Box<dyn T> (etc) could use the following vtable entry to clean up the contents of the box. The Pending value still allows polling without an allocation. The dyn_async_drop_hook is like poll_drop(_ready) but can return a boxed future instead.

#[cfg(feature = "alloc")]
enum AsyncDropResult {
    Ready,
    Pending,
    BoxedFuture(Box<dyn Future<Output=()>>),
}

#[cfg(feature = "alloc")]
// pseudocode for a compiler-generated vtable hook, similar to sync drop glue
trait DropHook {
    fn dyn_async_drop_hook(self: Pin<&mut Self>, cx: &mut Context<'_>) -> AsyncDropResult;
}

The point of this is to demonstrate that it is possible to support futures and have the convenience of them being run by Box<dyn T>, with only a small cost for code that doesn't use futures: some extra branching, and extra Future space required when dropping a Box<dyn T>.
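One way the caller side of such a hook might look is sketched below. It is an assumption-laden illustration, not the proposed implementation: the boxed future is stored as Pin<Box<…>> rather than the plain Box in the enum above so that it can be polled directly, the glue is an ordinary async fn named async_drop_box, and the Remote type exists only to exercise the BoxedFuture path.

```rust
use std::cell::Cell;
use std::future::{poll_fn, Future};
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

enum AsyncDropResult {
    Ready,
    Pending,
    // Assumption: pinned box, so the driver can poll it without re-pinning.
    BoxedFuture(Pin<Box<dyn Future<Output = ()>>>),
}

trait DropHook {
    fn dyn_async_drop_hook(self: Pin<&mut Self>, cx: &mut Context<'_>) -> AsyncDropResult;
}

/// Stand-in for the glue that runs when a boxed trait object is dropped in async code.
async fn async_drop_box(mut boxed: Pin<Box<dyn DropHook>>) {
    let mut in_flight: Option<Pin<Box<dyn Future<Output = ()>>>> = None;
    poll_fn(|cx| {
        // Once the hook has handed over a future, only that future is polled.
        if let Some(fut) = in_flight.as_mut() {
            return fut.as_mut().poll(cx);
        }
        match boxed.as_mut().dyn_async_drop_hook(cx) {
            AsyncDropResult::Ready => Poll::Ready(()),
            AsyncDropResult::Pending => Poll::Pending,
            AsyncDropResult::BoxedFuture(mut fut) => {
                let result = fut.as_mut().poll(cx);
                if result.is_pending() {
                    in_flight = Some(fut);
                }
                result
            }
        }
    })
    .await;
    // `boxed` goes out of scope here, which runs the ordinary sync drop.
}

/// Invented type whose hook returns a boxed cleanup future needing two polls.
struct Remote {
    cleanup_polls: Rc<Cell<u32>>,
}

impl DropHook for Remote {
    fn dyn_async_drop_hook(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> AsyncDropResult {
        let counter = self.cleanup_polls.clone();
        let mut first = true;
        AsyncDropResult::BoxedFuture(Box::pin(poll_fn(move |_cx| {
            counter.set(counter.get() + 1);
            if first {
                first = false;
                Poll::Pending
            } else {
                Poll::Ready(())
            }
        })))
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Returns (polls of the glue future, polls of the boxed cleanup future).
fn boxed_drop_demo() -> (u32, u32) {
    let cleanup_polls = Rc::new(Cell::new(0));
    let boxed: Pin<Box<dyn DropHook>> = Box::pin(Remote { cleanup_polls: cleanup_polls.clone() });
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut glue = Box::pin(async_drop_box(boxed));
    let mut polls = 0;
    loop {
        polls += 1;
        if glue.as_mut().poll(&mut cx).is_ready() {
            return (polls, cleanup_polls.get());
        }
    }
}
```

The in_flight slot is the "extra Future space" mentioned above; types that return Ready or Pending never populate it.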

(Just to emphasize again: I agree with using polling at least as a first approach.)


For ergonomics it might be worthwhile to put forward ideas for ensuring async destructors get run as you'd expect - a basis for "docs and lints". Consider:

  1. It is easy to call functions that drop objects without calling the async destructor. For example calling vec.clear() or iterating over vec.into_iter().filter(...) where vec: Vec<T>.
  2. Some types won't call async destructors on their contents. Even with automatic compiler drop glue, there are existing types with custom layout such as in smallvec-1.0.0.

One solution is to add a new widening bound ?TrivialAsyncDestructor, to be used in generic fn definitions. This is similar to what has been suggested for undroppable/unforgettable types (maybe there is a better Pin-like approach?). In the presence of a widening bound T: ?TrivialAsyncDestructor, any x: T parameter cannot be passed to a function that does not specify ?TrivialAsyncDestructor. The standard function std::mem::drop would not have the widened bound, but std::mem::forget would. Any implicit drop would be treated like a call to std::mem::drop. Violating this rule produces an error, which can be disabled by code like an executor (which is ultimately responsible for "consuming" the ?TrivialAsyncDestructor bound).

The rule ought to ensure that an object with a non-trivial async destructor is never dropped without first being async dropped, except perhaps during unwinding from a panic. I think the semantics are something like this: an object can "own" another object, or can "async-own" another object. The latter is a stronger property; a type with custom layout might own but not async-own objects that have been passed to it. An object with a non-trivial async destructor should always be part of a tree of async-ownership, with the root nodes being the heap or stack.

This rule prevents problem 1 directly, and prevents problem 2 because you wouldn't be allowed to pass the object into SmallVec::push for example. There might be a lot of extra ?TrivialAsyncDestructor churn if people want to use lots of standard functions with objects with async destructors. But it could be viable to just use it on a few Future implementations like Join - code using async destructors would be limited but reliable.

With a rule like this, it could be quite easy to disallow putting a ?TrivialAsyncDestructor object into a Box<dyn T>, which would make async destructors zero-cost while still being reliable.

Whether an object was dropped or not should have been a kind of typestate that can be tracked statically rather than dynamically. It's just hard to encode that for pinned Futures.

I would like to propose the following signature, which I think is the ideal one: it works safely without drop flags on structs. It is far too fancy to implement, though, and it would still have some problems with dyn even if it were implemented:

// OwnedPin is a trait applied to types that have ownership of a pinned object.
// The only type that implements it is Pin<Box<T>>, where Item = T.
trait OwnedPin {
	type Item;
	fn deref(&self) -> Pin<&Self::Item>;
	fn deref_mut(&mut self) -> Pin<&mut Self::Item>;
	fn into_inner(self) -> Self::Item where Self::Item: Unpin;
	unsafe fn transmute<T>(self) -> impl OwnedPin<Item = T>;
}
impl<T> OwnedPin for Pin<Box<T>> {
	type Item = T;
	fn deref(&self) -> Pin<&Self::Item> {
		self.as_ref()
	}
	fn deref_mut(&mut self) -> Pin<&mut Self::Item> {
		self.as_mut()
	}
	fn into_inner(self) -> Self::Item where Self::Item: Unpin {
		*Pin::into_inner(self)
	}
	unsafe fn transmute<U>(self) -> impl OwnedPin<Item = U> {
		Pin::new_unchecked(Box::from_raw(Box::into_raw(Pin::into_inner_unchecked(self)) as *mut U))
	}
}
// Also provide a wrapper for Unpin types.
struct OwnedPinWrapper<T: Unpin>(T);
impl<T> OwnedPin for OwnedPinWrapper<T> where T: Unpin {
	type Item = T;
	fn deref(&self) -> Pin<&Self::Item> {
		Pin::new(&self.0)
	}
	fn deref_mut(&mut self) -> Pin<&mut Self::Item> {
		Pin::new(&mut self.0)
	}
	fn into_inner(self) -> Self::Item {
		self.0
	}
	unsafe fn transmute<T>(self) -> impl OwnedPin<Item = T> {
		mem::transmute::<Self, OwnedPinWrapper<T>>(self)
	}
}
// This converts the object to a drop future.
// The function takes ownership of the future to ensure that
// after the conversion, there is no way back.
trait AsyncDrop {
	fn into_drop_async(s: impl OwnedPin<Item = Self>) -> impl Future<Output = ()> {
		DropFuture(Some(s))
	}
}
struct DropFuture<T>(Option<T>);
impl<T> Unpin for DropFuture<T> {}
impl<T> Future for DropFuture<T> {
	type Output = ();
	fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
		Pin::into_inner(self).0.take();
		Poll::Ready(())
	}
}
impl<T> AsyncDrop for DropFuture<T> {
	fn into_drop_async(s: impl OwnedPin<Item = Self>) -> impl Future<Output = ()> {
		s.into_inner()
	}
}

This attempts to encode the fact that the current future is being dropped as a typestate, by creating a new trait OwnedPin so that, while dropping, it is possible to shadow ownership of the original object and prevent any operation other than dropping from happening on it.

Nevertheless, I hope that this may serve as a basis for what the signature could look like in the future, and for now we may implement AsyncDrop for Pin<Box<T>> only. I know you hate making Box<T> so special, but this avoids tracking dynamically whether poll_drop() was called, which has an unnecessarily large impact, such as Atomic<Option<Box<T>>> no longer fitting into a word. And it is much more elegant than adding a new flag to every struct that we have right now. And attempting to avoid Box being special, as shown above, is too idealistic to really implement, in my opinion.

trait AsyncDrop {
	type DropFuture;
	fn into_drop_async(self: Pin<Box<Self>>) -> Self::DropFuture;
}
// We cannot provide a default implementation for AsyncDrop because:
// 1. Default associated types have not been implemented yet.
// 2. Even then, it is not possible to couple a default implementation to a
// specific default DropFuture, which means that in our default implementation
// of `into_drop_async`, it is impossible to assume that
// Self::DropFuture == DropFuture<Self>.
struct DropFuture<T>(Option<Pin<Box<T>>>);
impl<T> Unpin for DropFuture<T> {}
impl<T> Future for DropFuture<T> {
	type Output = ();
	fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
		Pin::into_inner(self).0.take();
		Poll::Ready(())
	}
}
impl<T> AsyncDrop for DropFuture<T> {
	type DropFuture = Self;
	fn into_drop_async(self: Pin<Box<Self>>) -> Self::DropFuture {
		*Pin::into_inner(self)
	}
}

EDIT: After thinking for a while, I found it is possible to implement it for Pin<&mut Self> right now! The idea is as follows:

use std::marker::PhantomData;

struct DropOnce<'a> {
    _data: PhantomData<&'a mut DropOnce<'a>>,
}

impl<'a> DropOnce<'a> {
    fn poll(&'a mut self) -> &'a mut Self {
        println!("poll");
        self
    }   

    fn do_drop(&'a mut self) {
        println!("last_poll");
    }   
}

fn main() {
    let mut c = DropOnce { _data: PhantomData };
    let r = &mut c;
    let r = r.poll();
    let r = r.poll();
    r.do_drop();
    c.poll(); // borrow checker error
}

This requires changing the signature of Future to include a lifetime parameter in the trait definition and return Pin<&'a mut Self> from the poll method. In order to preserve backward compatibility, we can have a trait DroppableFuture and define such a poll method there.

I have 2 opinions about this matter:

(A) We should not have an async destructor

For me, drop is part of stack unwinding (either by early return or by panic!). In asynchronous code we don't have a call stack; we have a state object, which we call a task. After all, async/await is just a tool for creating a single anonymous struct for that task object, so we should not have async fn drop (or similar functionality) either. When code panics inside an async fn, from the point of view of the top-level normal (non-async) function we drop the task, and that drop is synchronous.

Here is another argument.

Let's look at it in the world of synchronous code. A destructor is meant to be simple: its job is to do some cleanup and release the resource. We all agree that cleanup code must not panic; in the same spirit, I think a destructor should not block either.

Let's say I have a wrapper type MySock for a TCP socket, and I want to send a big prime number as trailing data when it is dropped. I would rather spawn a new thread and hand ownership of the inner TCP socket to it, so that the new thread computes the big prime number and drops the socket. At that point we don't care whether sending the data has completed; that is the same as if we had not offloaded the work. So in this case it is okay for drop to return early, even before the resource is physically released (in other scenarios this may not be true). Virtually, we say that MySock is already gone, even though the cleanup is deferred.
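A minimal sketch of that offloading pattern, with several stand-ins that are assumptions rather than anything from a real library: MySock is generic over any Write + Send + 'static so it can be exercised without a network, slow_trailer replaces the prime computation, and ChannelSink is a test-only writer that forwards bytes over a channel.

```rust
use std::io::{self, Write};
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Wrapper whose destructor hands the inner writer to a background thread.
struct MySock<W: Write + Send + 'static> {
    inner: Option<W>,
}

impl<W: Write + Send + 'static> Drop for MySock<W> {
    fn drop(&mut self) {
        if let Some(mut sock) = self.inner.take() {
            // `drop` returns immediately; the thread owns the socket from here on.
            thread::spawn(move || {
                let trailer = slow_trailer();
                // Errors are ignored: nobody is left to report them to.
                let _ = sock.write_all(&trailer);
                // `sock` is dropped (closed) when the thread finishes.
            });
        }
    }
}

/// Stand-in for the expensive "big prime number" computation.
fn slow_trailer() -> Vec<u8> {
    b"trailer".to_vec()
}

/// Test-only writer that forwards every write over a channel.
struct ChannelSink(Sender<Vec<u8>>);

impl Write for ChannelSink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.0
            .send(buf.to_vec())
            .map_err(|_| io::Error::new(io::ErrorKind::BrokenPipe, "receiver gone"))?;
        Ok(buf.len())
    }
    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Drops a `MySock` and waits for whatever the background thread wrote.
fn drop_and_collect() -> Vec<u8> {
    let (tx, rx) = channel();
    drop(MySock { inner: Some(ChannelSink(tx)) });
    rx.recv().expect("background thread should send the trailer")
}
```

Note that the thread is detached, so if the process exits right after the drop the trailer may never be sent.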

And I think that is also true for asynchronous code: we should not .await in a destructor either. If we need to do something else, we can spawn another task.

But if we agree that we can block in a synchronous drop, then we should also agree to block_on the current thread in the synchronous drop function. I don't think that is possible, because we need a runtime for that.

(B) We should have an async destructor

I don't think we should combine it into the normal Drop trait. I feel it should be combined with the Future trait, or perhaps we should create a new trait FutureDrop: Future with fn poll_drop. Why? Because the async/await language constructs deal with any type that implements the Future trait, so they would also have to deal with any type that implements FutureDrop.

So from a synchronous function's point of view, poll and poll_drop are just normal functions. We don't know how to prevent calling poll_drop more than once, but that is also true for the poll method.

I don't know what the implications of this idea are, for example what should happen if an async function panics: should it call poll_drop, or just be treated like any other type (just drop the task)?

Dropping any type that implements Future in a sync function is safe without calling poll, right? So I think dropping any type that implements FutureDrop is also safe without executing fn poll_drop.

Personally, I like option (A): instead of trying to solve the problem, maybe we should ignore it. There is a universal API for spawning a thread, std::thread::spawn, but we don't have an asynchronous equivalent yet; maybe the standard library should export some interface for registering which runtime is used to spawn tasks.

What do you think? Maybe I'm wrong, because I don't understand well how Rust works internally.


Hi everybody,

Correct me if I'm wrong, but these are the cases when we might need an async drop:

  1. regular function return (with an error or not)
  2. future canceled
  3. panic

And currently, there are no ways to do some asynchronous non-blocking operations on future cancellation.

In my Python experience, I used finally and async with (those do the same job as async drop could do in Rust) multiple times.

It was, e.g.:

  1. Killing a subprocess via:
    1. SIGTERM
    2. wait with N secs. timeout
    3. SIGKILL
  2. Doing some network related logic. Like making HTTP requests to remove or stop entities/operations on a remote server.

Primarily, it is essential to have the ability to do such operations on task cancellation, since task cancellation might be pretty standard in some use cases.

The panic case makes the async drop more complicated. So the solution might be to call the regular drop during a panic unwind. panic is only intended for the invalid code situations, so it should be understandable why we aren't going to do any long operations and just call regular drop.

The idea of adding such an async drop only to a trait FutureDrop or to trait Future sounds interesting, though I'm not sure it is implementable.

If we add poll_drop_ready into trait Future, then I don't see how it should propagate into the structures that have at least one field with non-default poll_drop_ready. Automatically implementing trait Future doesn't sound like a good idea.

If we add a new trait FutureDrop, then many libraries will likely have to change their code, because they now have to deal with the new trait.

About just spawning a new thread from the regular drop. Well, it might work for extreme situations. However, it is far from being an optimal solution, since spawning a thread is a somewhat expensive operation. Plus, dealing with the lifetimes (e.g., of TcpStream) becomes much harder.

So what about:

  1. Adding poll_drop_ready into the Drop trait.
  2. Making it just return Ready by default.
  3. Adding an automatic poll_drop_wrapper function (which a user cannot change or interact with directly) that calls poll_drop_ready first and the regular drop next. With such a function, we won't have to do two virtual method calls in the async context.
  4. Adding poll_drop_wrapper into vtable.
  5. In the non-async context calling regular drop.
  6. During panic unwind calling regular drop.
  7. In the async context calling poll_drop_wrapper.
  8. Adding a compile-time warning for situations where a struct with a non-default poll_drop_wrapper is clearly dropped in a non-async context, or where such a struct escapes from async to non-async code.
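Items 1 to 3 of this list can be approximated in library code, which may help in judging the design. Everything named here is hypothetical: AsyncAwareDrop stands in for Drop with an added poll_drop_ready that defaults to Ready, and PollDropWrapper plays the role of the automatic poll_drop_wrapper as an explicit future. The vtable and warning parts (items 4 and 8) need compiler support and are not modelled.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for `Drop` extended with a defaulted readiness hook.
trait AsyncAwareDrop {
    fn poll_drop_ready(&mut self, _cx: &mut Context<'_>) -> Poll<()> {
        Poll::Ready(()) // default: nothing to wait for
    }
}

/// Stand-in for the automatic wrapper: readiness first, then the regular drop.
struct PollDropWrapper<T: AsyncAwareDrop>(Option<T>);

impl<T: AsyncAwareDrop + Unpin> Future for PollDropWrapper<T> {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if let Some(value) = self.0.as_mut() {
            if value.poll_drop_ready(cx).is_pending() {
                return Poll::Pending;
            }
        }
        // Taking the value out drops it, which runs its regular `Drop`.
        self.0.take();
        Poll::Ready(())
    }
}

/// Uses the default hook; only has a regular destructor.
struct Plain(Rc<Cell<u32>>);
impl AsyncAwareDrop for Plain {}
impl Drop for Plain {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Overrides the hook: needs one extra poll before it may be dropped.
struct Draining {
    drained: bool,
    drops: Rc<Cell<u32>>,
}
impl AsyncAwareDrop for Draining {
    fn poll_drop_ready(&mut self, _cx: &mut Context<'_>) -> Poll<()> {
        if self.drained {
            Poll::Ready(())
        } else {
            self.drained = true;
            Poll::Pending
        }
    }
}
impl Drop for Draining {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Busy-polls a future to completion and returns the number of polls.
fn poll_to_completion<F: Future<Output = ()>>(fut: F) -> u32 {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    let mut polls = 0;
    loop {
        polls += 1;
        if fut.as_mut().poll(&mut cx).is_ready() {
            return polls;
        }
    }
}

/// Returns (polls for `Plain`, polls for `Draining`, total regular drops).
fn wrapper_demo() -> (u32, u32, u32) {
    let drops = Rc::new(Cell::new(0));
    let plain = poll_to_completion(PollDropWrapper(Some(Plain(drops.clone()))));
    let draining = poll_to_completion(PollDropWrapper(Some(Draining {
        drained: false,
        drops: drops.clone(),
    })));
    (plain, draining, drops.get())
}
```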

With such a design, the guarantees that poll_drop_ready gets called are much weaker than those for the regular drop. However, that shouldn't be a significant issue because:

  1. We partially mitigate it with a warning.
  2. You still have a drop to do some operation in case we drop in the non-async context.
  3. It is somewhat similar to Python's async with, where if you forget to do async with for an object, you won't get __aexit__ called. And still async with is a powerful tool in Python.