The writeup in that thread refers to a hypothetical Future which borrows a buffer, then hands that buffer to an asynchronous I/O API provided by the operating system. What happens if the caller cancels the operation? It claims:
There exists an additional fix for it: We can perform a blocking wait after CancelIoEx in order to wait inside drop until the OS has finished processing the operation and until it does not require the buffer anymore. However since Cancel is only a hint to the OS - and since the cancellation might take a long amount of time or might even never happen - this is not practical in an async context where the OS thread should never be blocked.
Not practical, but perhaps usable as a fallback if some other API is established as the recommended way to cancel things.
Except, as someone just pointed out to me on Reddit, that doesn't actually work. Because of Pin's guarantees, Future implementations are allowed to assume drop will be called before the memory belonging to the future object itself can be reused. But no such guarantee holds for data borrowed by the future! A caller can forget the future, then immediately access the previously borrowed data without giving the future any opportunity to clean up whatsoever.
This can be verified with a quick playground. Perhaps it's obvious in retrospect, but it seems like almost nobody in the original thread realized it.* The OP certainly didn't, nor did I.
That leaves me with some questions:
Will AsyncDrop somehow get around this limitation? I don't see how it could.
Could it be useful to create some sort of "Pin for buffers" that guarantees it will run, not drop, but some other routine?
Should Pin itself have been designed differently, somehow?
It's interesting that stack pinning as provided by pin-utils effectively does guarantee that the future will be dropped before the borrow expires, but heap pinning does not. (Well, unless you do a stack pin from within an async fn which is itself leaked.)
The only fix I see is the same as for the scoped thread spawning problem, i.e. providing an async function that takes a closure and gives it an “I/O scope” that lets you create completion-based futures.
But how exactly would that work? It seems like a problem is that in an async fn, the lifetime system can't distinguish between memory on the async fn's own 'stack' (i.e. part of the future object, which thus cannot be reused until the async fn is dropped) and data which the async fn has borrowed from outside (which can be reused if the async fn is forgotten).
* @bill_myers might have realized it, judging by the edit history of their post, something I noticed because I usually read via email. :p
The data that has been borrowed from outside has a lifetime that outlives the async "scope" (because the API is designed to require that), and when the function creating the async scope returns all the futures are cleaned up, so the memory can never be freed while I/O is in progress.
It works exactly like scoped threads, with thread spawning replaced by creating futures, threads accessing memory replaced with I/O accessing memory, and joining threads replaced with calling CancelIoEx and awaiting the results.
What I realized doesn't work is the other solution of using CancelIoEx with normal future creation when it's guaranteed to be synchronous, because you can just mem::forget the future.
fn borrows_some_data(a: &i32) {
    crossbeam::scope(|scope| {
        scope.spawn(|_| {
            // Access borrowed data from outside the scope
            println!("{}", *a);
        });
    })
    .unwrap();
}
In an async variant of the same API, that would be unsound.
I guess you could work around the issue by requiring the first closure to be 'static. But that's only feasible in this API because you can call crossbeam::scope once and spawn many threads from it – which implies dynamic allocation is required.
I'm imagining a simpler API which would be something like
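(rough sketch; with_buffer and NiceBuffer are made-up names)

async fn with_buffer<const SIZE: usize, R>(
    // Hypothetical: lend the lambda a buffer living on this future's "stack".
    f: impl AsyncFnOnce(NiceBuffer<'_>) -> R,
) -> R {
    let mut buf = [0u8; SIZE];
    f(NiceBuffer(&mut buf)).await
}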
(AsyncFnOnce doesn't actually exist, but assume it's short for FnOnce(Args) -> impl Future<Output=Ret>.)
It would allocate SIZE bytes on the 'stack' and pass the buffer to the provided lambda. NiceBuffer is a type that would guarantee that, when it's passed as an argument to a function, it won't be reused before the function completes. Or something.
In reality, that's fundamentally broken. If you required f to be 'static, you could at least ensure that the buffer isn't deallocated before f completes. But that would make it impossible to have multiple nested with_buffer calls. It's also impossible to associate that guarantee with the type NiceBuffer itself; if some function takes a NiceBuffer<'a> as an argument, there's no way for it to guarantee that 'a wasn't borrowed from outside an async boundary.
It's almost like we need an equivalent of Send for async.
I am not entirely sure, but I think it can be made sound. In the case of a single-threaded executor, you can make the executor "cancel" child futures in Drop, which makes sure that they are not run after the parent is dead.
With a multithreaded executor it gets trickier, because, while the parent is in Drop, the child might actually be running on another core. However, it seems that in this case it's OK for the parent to synchronously wait until the child hits the next suspend point.
That is, we can make this work by employing synchronous cancellation in a guard object created by the outer ::scope function.
let fut = with_cancellation(|token| async {
    let mut buf = [0; 1024];
    do_something_with_async_cancellation(token, &mut buf).await
});
fut.poll();       // poll once so the operation gets submitted
mem::forget(fut); // ...then leak the future, guard object and all
The fut would be forgotten while still running.
This is still problematic even if the executor is aware of the problematic asynchronous-cancellation cases, because this exact situation can be replicated at any async fn depth by pairing do_something_with_async_cancellation with a super short timeout.
Basically, it seems that for APIs that require asynchronous cancellation, the runtime/reactor (one of the two, I still don't quite know where the separation lies) has to own the resource, so that the future itself can be synchronously dropped/cancelled and the runtime/reactor can do the actual cleanup once it is notified that the asynchronous cancellation has finished.
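A rough sketch of that shape (Reactor, submit_read, and ReadHandle are made-up names): the buffer is moved into the reactor for the whole in-flight period, so dropping or even forgetting the user-facing future can't invalidate it.

struct Reactor {
    // Buffers the kernel may still write into live here, not on any caller's stack.
    in_flight: Vec<Vec<u8>>,
}

impl Reactor {
    /// The caller hands the buffer over instead of lending it. Even if the
    /// returned handle/future is dropped or mem::forget-ten, the buffer stays
    /// owned by the reactor until the OS reports the operation (or its
    /// cancellation) as finished; only then is it freed or given back.
    fn submit_read(&mut self, buf: Vec<u8>) -> ReadHandle {
        let slot = self.in_flight.len();
        self.in_flight.push(buf);
        // ... submit the actual OS operation referencing self.in_flight[slot] ...
        ReadHandle { slot }
    }
}

/// Completes with the buffer (and the result) once the reactor releases it.
struct ReadHandle {
    slot: usize,
}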
It's true that the naive version of this API, which passes &mut [u8] to io_uring or the like, can't be made sound, since there's no guarantee that you'll be able to run code before the [u8] goes away. You can get around this, though, with something like a Pin<&mut IoUringBuf> where the underlying buffer is !Unpin and will deregister itself from the async read before deallocating memory. This is unfortunate in that you can't just use a "normal" &mut [u8] like you'd ideally want, but it's definitely workable. I've spent a good amount of time thinking about this, and I don't think we could've done this any other way based on Rust's existing set of guarantees/non-guarantees.
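A minimal sketch of what such a type could look like (IoUringBuf and its deregistration step are illustrative, not an existing API):

use std::marker::PhantomPinned;
use std::pin::Pin;

struct IoUringBuf {
    data: [u8; 4096],
    // Identifies the in-flight operation, if any.
    in_flight: Option<u64>,
    // !Unpin: once pinned, the buffer can't be moved, and per Pin's contract
    // its memory can't be reused before Drop runs.
    _pin: PhantomPinned,
}

impl IoUringBuf {
    /// Hand `data` to the kernel. Requires Pin so the address stays stable
    /// and so that Drop is guaranteed to run before the memory is reused.
    fn start_read(self: Pin<&mut Self>, op_id: u64) {
        // SAFETY: we don't move anything out of the pinned value.
        let this = unsafe { self.get_unchecked_mut() };
        this.in_flight = Some(op_id);
        // ... submit &mut this.data to the ring here ...
    }
}

impl Drop for IoUringBuf {
    fn drop(&mut self) {
        if let Some(op) = self.in_flight.take() {
            // Deregister: cancel the op and block until the kernel is done
            // with `data`. Only then is it safe for the memory to be reused.
            // cancel_and_wait(op);
            let _ = op;
        }
    }
}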
I hadn't thought about the issues that skipping drop() via std::mem::forget can bring.
Unfortunately that means all efficient Future implementations in futures-intrusive are not memory safe if one uses mem::forget on them. They rely on either the Futures running to completion or on drop() being called to avoid memory issues. This is especially problematic with stack-pinned Futures, since the memory is then definitely invalidated, and less problematic with heap-pinned Futures as long as the memory is still valid.
As far as I understand, the issue would not happen if mem::forget were marked as unsafe, which is obviously problematic since it would be a breaking change. One question is whether it could be made unsafe to call on Future types via some compiler hacks, which would be less breaking.
Even a change to run Futures always to completion (which would workaround the cancellation issues, but would also be very very breaking) could not resolve the mem::forget issues.
Interestingly, mem::forget may actually be more powerful than reference cycles, as it does lead to potential invalidation of the memory (via forgetting on the stack) whereas reference cycles can only keep memory live (on the heap).
This is what I forgot about and may actually make this a non-issue.
Once you've pinned a location, you're not allowed to touch the original location. That means that Pin<&mut _> prevents you from mem::forgetting the location. You can mem::forget e.g. a Pin<Box<_>>, but that keeps the memory live. Even if you have "async stack", if that's borrowed across an .await point, it will ultimately be pinned to the stack (where it cannot be legally mem::forgotten) or the heap (where it can be forgotten but not reclaimed without Drop).
The problem is then only if the borrowed memory is on the stack outside a future which is then used by a future that is then forgotten on the heap, as in the linked playground. (I didn't click through, :shame:.)
I don't know how that can or should be prevented, but it's subtly different than what I initially thought the problem was.
That is true. After revisiting the call chain I think there might actually not be an issue. The flow would be something along the lines of:
let mut intrusive_fut = create_intrusive_fut();
pin_mut!(intrusive_fut);
poll_once(intrusive_fut.as_mut());
std::mem::forget(intrusive_fut);
The concern here is that intrusive_fut continues to run after having been forgotten and that the background task might then invalidate it. However, that doesn't seem to be the case, because the pin_mut macro shadows the original binding: std::mem::forget(intrusive_fut) only forgets the Pin<&mut _> reference, not the Future itself, so the destructor of the Future is still called and cleans up all state.
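For reference, pin_mut! expands to roughly this shadowing pattern, which is why the forget above can only ever get at the reference:

// What pin_mut!(intrusive_fut) expands to, roughly:
let mut intrusive_fut = intrusive_fut; // move the value into a local the macro owns
let mut intrusive_fut = unsafe {
    std::pin::Pin::new_unchecked(&mut intrusive_fut) // shadow it with a Pin<&mut _>
};
// From here on, only the Pin<&mut _> is reachable; forgetting it leaks nothing
// but the reference, and the original value on the stack is still dropped when
// the enclosing scope ends.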
The only way to forget a future without the destructor being called is forgetting it directly (before it ever gets pinned or polled). Since Futures in an unpolled state have caused no side effects yet, this should always be safe.
So there doesn't seem to exist a general issue with any parts of the stack (borrowing/intrusive futures, pin_mut, mem::forget).
Reproducing the playground example so I can be sure everyone's on the same page to where the problem is:
pub fn go(cx: &mut Context) {
    let x: &mut [u8] = &mut [1, 2, 3];
    {
        // MyFuture borrows x for as long as it exists.
        let mut p = Box::pin(MyFuture(x));
        let _ = p.as_mut().poll(cx);
        forget(p);
    }
    // The borrow has ended, yet MyFuture was never dropped.
    x[0] = 4;
}
This means that MyFuture cannot give &mut x to any OS function that keeps making progress while the future is not actually being polled. The Pin guarantee is not enough.
Can this be reproduced with async/await? It does mean that any buffer the OS is going to fill, synchronously or asynchronously, needs to be owned by the reactor calling out to the OS, not by the calling code.
Here's a setup for attacking any reactor that may have this issue. No unstable features, no unsafe (the unsafe read_volatile below is only there to observe the clobbered buffer). If the operation makes progress outside of an actual poll (i.e. the async primitive hands the given pointer to the OS rather than using its own buffer or just returning Pending until the op wouldn't block), then this is unsound. In debug mode it should read whatever input, print it, and halt, but of course, no guarantees.
use std::future::Future;
use std::mem::forget;
use std::pin::Pin;
use std::ptr::read_volatile;
use std::task::{Context, Poll};
use tokio::io::{stdin, AsyncRead}; // assuming tokio 0.2, where poll_read takes &mut [u8]

fn oh_no(cx: &mut Context) {
    let mut buf = [0; 1];
    {
        let mut fut = Box::pin(stdin());
        // Kick off a read into buf, then forget the future while it may be in flight.
        let _ = fut.as_mut().poll_read(cx, &mut buf);
        forget(fut);
    }
    // If anything still writes into buf after the forget, we'll see it here.
    while unsafe { read_volatile(&buf[0]) } == 0 {}
    dbg!(String::from_utf8_lossy(&buf));
}
struct OhNo;

impl Future for OhNo {
    type Output = ();
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        oh_no(cx);
        Poll::Ready(())
    }
}

#[tokio::main]
async fn main() {
    tokio::spawn(OhNo).await.unwrap();
}
I can't get tokio to build right now (I think my nightly version has pin elision rules slightly different than what tokio is using?), so I can't check whether tokio fails the test. But I don't think it does: if I'm reading it right, Stdin at least lies about being async and just blocks.
It's effectively equivalent to scoped threads: You need 'static unless the closure guard forces a synchronous join at the end of the closure.
And is it possible to soundly implement this such that the future yields back to the futures executor if the futures it is waiting on have not finished yet?
If the operation makes progress outside of an actual poll (i.e. the async primitive hands the given pointer to the OS rather than using its own buffer or just returning Pending until the op wouldn't block), then this is unsound.
This can only happen if the Future or its source is designed to do work in the background while not being actively polled. Those implementations obviously need to watch out for this memory-safety concern and avoid it - e.g. by only using buffers and other resources that are owned by their IO source (which doesn't need to be the executor). I can only imagine this currently being an issue with IO-completion-based operations, for which the issue had been previously described.
Most of the current ecosystem (e.g. Tokio) is more readiness-based and doesn't access resources, apart from some reference-counted data structures, outside of poll(). Therefore I don't expect any issues there.
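For illustration, a readiness-style poll_read only ever touches the caller's buffer from inside the call itself (Socket and register_waker below are stand-ins, not a real API):

use std::io;
use std::task::{Context, Poll};

// Stand-in for a non-blocking source registered with the reactor.
struct Socket;
impl Socket {
    fn try_read(&self, _buf: &mut [u8]) -> io::Result<usize> {
        Err(io::ErrorKind::WouldBlock.into())
    }
    fn register_waker(&self, _waker: &std::task::Waker) {}
}

struct ReadinessRead {
    socket: Socket,
}

impl ReadinessRead {
    // The buffer is only written to synchronously, inside this call. Once we
    // return, nothing in the background holds a pointer into it, so forgetting
    // the surrounding future after a poll can't lead to a stray write.
    fn poll_read(&mut self, cx: &mut Context<'_>, buf: &mut [u8]) -> Poll<io::Result<usize>> {
        match self.socket.try_read(buf) {
            Ok(n) => Poll::Ready(Ok(n)),
            Err(e) if e.kind() == io::ErrorKind::WouldBlock => {
                self.socket.register_waker(cx.waker());
                Poll::Pending
            }
            Err(e) => Poll::Ready(Err(e)),
        }
    }
}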
I wonder whether it's possible to write a Future provider using purely safe Rust that can violate memory safety if the Future gets forgotten after having been polled for the first time. My intuition says it might not be possible, since the Future would get a lifetime, and whatever background task exists would need to erase that lifetime in order to be able to access the Future's state while not being polled. But maybe there are some creative ways.