several pieces of documentation refer to the fact that futures don't do anything unless you await them. while this is true for most futures, it is not true for all of them, such as those created by tokio::task.
the main offender i found was in libcore:
/// Futures alone are *inert*; they must be *actively* `poll`ed to make
/// progress, meaning that each time the current task is woken up, it should
/// actively re-`poll` pending futures that it still has an interest in.
i think this should also have a qualifier like "some futures may be able to make progress without being polled due to background threads, however, generic code cannot rely on this if it wishes to be compatible with all futures"
I do not see any future making progress without polling here.
The future inside a task needs to be polled, but the caller has delegated polling to the task, which polls the future without any further interaction from the caller.
The returned future (the JoinHandle) wraps the task that is proceeding without the JoinHandle future being polled, but you need to poll the JoinHandle to see the results of the task.
Yeah, ultimately the thing here is that tokio::task::spawn does something and returns a Future, the same way some other method might print to stdout and return a value. Because the result of the JoinHandle is based on executing the Future you pass in, it feels like the rule is violated, but it isn’t. The Future you pass in is polled to completion (unless you kill the runtime, I guess), and independently of that the JoinHandle will do a synchronization operation when polled, which is mostly just waiting.
As an analogy, consider spawning a subprocess. If your process stops—heck, it could go to sleep!—the subprocess keeps making progress even without an ExitCodeFuture being polled. But that’s because the ExitCodeFuture isn’t responsible for running the subprocess, only reporting its result when it finally exits.
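To make the handle/work split concrete, here is a minimal std-only sketch. It is not how tokio implements JoinHandle; `Handle`, `spawn_work`, `ThreadWaker`, `block_on` and `demo` are hypothetical names made up for illustration, and a plain OS thread stands in for the runtime. The closure starts running as soon as `spawn_work` is called, and polling the handle only reports the result.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// State shared between the worker thread and the handle.
struct Shared<T> {
    result: Option<T>,
    waker: Option<Waker>,
}

/// Hypothetical stand-in for a JoinHandle: it only reports the result.
struct Handle<T> {
    shared: Arc<Mutex<Shared<T>>>,
}

/// Starts `work` on a new thread right away and returns a handle to it.
fn spawn_work<T, F>(work: F) -> Handle<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let shared = Arc::new(Mutex::new(Shared { result: None, waker: None }));
    let worker_side = Arc::clone(&shared);
    thread::spawn(move || {
        let value = work();
        let mut state = worker_side.lock().unwrap();
        state.result = Some(value);
        // Wake whoever polled the handle last, if anyone has.
        if let Some(waker) = state.waker.take() {
            waker.wake();
        }
    });
    Handle { shared }
}

impl<T> Future for Handle<T> {
    type Output = T;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<T> {
        let mut state = self.shared.lock().unwrap();
        match state.result.take() {
            Some(value) => Poll::Ready(value),
            None => {
                // Not finished yet: remember how to wake the caller.
                state.waker = Some(cx.waker().clone());
                Poll::Pending
            }
        }
    }
}

/// Waker that unparks the thread running `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Tiny single-future executor: poll, then park until woken.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => thread::park(),
        }
    }
}

/// Returns (did the work run before the handle was ever polled, result).
fn demo() -> (bool, u32) {
    let started = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&started);
    let handle = spawn_work(move || {
        flag.store(true, Ordering::SeqCst);
        6 * 7
    });
    // Wait for the worker without touching the handle at all.
    while !started.load(Ordering::SeqCst) {
        thread::yield_now();
    }
    let ran_before_poll = started.load(Ordering::SeqCst);
    (ran_before_poll, block_on(handle))
}
```

Calling `demo()` from a `main` should show that the closure ran before `Handle::poll` was ever invoked. Whether you describe that as "the future made progress" or "the thread made progress" is exactly the wording question being argued here.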
I know you understand all this already from how you brought up the topic! Maybe there’s an alternate change to the documentation that would be less broad? The docs for JoinHandle do call it out, at least:
Note that the background task associated with this JoinHandle started running immediately when you called spawn, even if you have not yet awaited the JoinHandle.
it may be accurate to say "the future does nothing unless polled", but saying "the future doesn't make progress without being polled" is still not quite accurate. also, while most background-working futures do not actually contain the data structures required for this background progress[1], there is no reason they can't.
for example, on an embedded target, a future that represents a read from an I/O port may actually contain a buffer that is filled by an interrupt. it is possible to argue that this future is still not itself making progress, and instead that progress is done by the interrupt handler, but i think that is counterproductive for anyone who wants to use the Future abstraction as part of their mental model instead of thinking about the underlying mechanisms.
"futures are only guaranteed to make progress when polled" would be more clear for the documentation, i think. then the compiler warning could be changed to "futures generated by async functions do nothing unless polled". for other futures that have a similar #[must_use] message, that could be changed to "this future does nothing unless polled".
[1] usually they just contain a handle like an fd or task id
it doesn't matter if the current documentation is technically correct, it matters if it is clear, especially since async programming is famously tricky, and deadlocks and race conditions are not protected against by rust[1]
[1] it does protect against data races, but only to the extent required to ensure memory safety
I think it's more important to stress that async specifically is always "lazy", independent of what the functionality it calls does. I've also seen people get bitten by some runtime function which eagerly passes the work off to a background task (and thus looks like eager promise evaluation) but which no longer runs concurrently when stuck behind what looks to be an identity function.
Because of this, even though it can be size-inefficient, I think manually implemented futures should also be fully lazy and not dispatch any work until they're first awaited. Generally the preferable way to accomplish this is to provide IntoFuture instead of Future.
wouldn't that simply make tokio::task completely impossible?? or did you mean to say "until they're first polled"? the latter would make things extremely clunky, but not impossible.
i think specifying that async fn always returns a lazy future, even if it contains eager futures, would be the best way to go. maybe even add a section to the async book explaining the difference between the two. the current state of just pretending this distinction doesn't exist seems far from ideal.
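A small std-only sketch of that distinction, under some assumptions: `eager_start`, `wrapped`, `block_on` and `demo` are hypothetical names, and bumping a counter stands in for "handing work to a background task". The plain function does its work at call time, while the async fn wrapper around the very same call does nothing until it is polled.

```rust
use std::future::{ready, Future};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// "Eager": the side effect happens when the function is called,
/// before the returned future is ever polled.
fn eager_start(counter: Arc<AtomicUsize>) -> impl Future<Output = usize> {
    let n = counter.fetch_add(1, Ordering::SeqCst) + 1;
    ready(n)
}

/// Looks like an identity wrapper, but being an async fn it is lazy:
/// `eager_start` is not even called until this future is first polled.
async fn wrapped(counter: Arc<AtomicUsize>) -> usize {
    eager_start(counter).await
}

/// Waker that unparks the thread running `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Tiny single-future executor: poll, then park until woken.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => thread::park(),
        }
    }
}

/// Returns the counter after the eager call, after the lazy call,
/// and after both futures have been driven to completion.
fn demo() -> (usize, usize, usize) {
    let counter = Arc::new(AtomicUsize::new(0));

    let eager = eager_start(Arc::clone(&counter));
    let after_eager_call = counter.load(Ordering::SeqCst);

    let lazy = wrapped(Arc::clone(&counter));
    let after_lazy_call = counter.load(Ordering::SeqCst);

    block_on(eager);
    block_on(lazy);
    let after_polling = counter.load(Ordering::SeqCst);

    (after_eager_call, after_lazy_call, after_polling)
}
```

The counter moves when `eager_start` is called, stays put when `wrapped` is called, and only moves again once the wrapper is polled. That is the "stuck behind an identity function" effect: two wrapped calls awaited one after another would start their work sequentially, not concurrently.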
I did mean to say await, but used it kind of informally to include both the initial poll but also the IntoFuture::into_future call. So in this world, task::spawn(fut) would return essentially
|| {
    let mut fut = IntoFuture::into_future(fut);
    let handle = get_runtime().spawn(move |cx| unsafe {
        Pin::new_unchecked(&mut fut).poll(cx)
    });
    poll_fn(move |cx| handle.poll(cx))
}
IMHO IntoFuture is underutilized.
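For anyone who hasn't used it, here is a rough std-only sketch of the "provide IntoFuture instead of Future" idea. Everything in it is hypothetical: `LazyJob`, `block_on` and `demo` are made-up names, and flipping a flag stands in for actually dispatching work to a runtime. The point is only that constructing the value does nothing, and the dispatch happens inside `into_future`, which `.await` calls for you.

```rust
use std::future::{ready, Future, IntoFuture, Ready};
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// A description of work, not yet a future and not yet started.
struct LazyJob {
    input: u32,
    dispatched: Arc<AtomicBool>,
}

impl IntoFuture for LazyJob {
    type Output = u32;
    type IntoFuture = Ready<u32>;

    fn into_future(self) -> Self::IntoFuture {
        // Stand-in for handing the work to a runtime: this is the
        // first point at which anything observable happens.
        self.dispatched.store(true, Ordering::SeqCst);
        ready(self.input * 2)
    }
}

/// Waker that unparks the thread running `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Tiny single-future executor: poll, then park until woken.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => thread::park(),
        }
    }
}

/// Returns (dispatched before the await?, output, dispatched after?).
fn demo() -> (bool, u32, bool) {
    let dispatched = Arc::new(AtomicBool::new(false));
    let job = LazyJob { input: 21, dispatched: Arc::clone(&dispatched) };
    let before_await = dispatched.load(Ordering::SeqCst);

    // `.await` calls `IntoFuture::into_future` on the job.
    let output = block_on(async move { job.await });

    let after_await = dispatched.load(Ordering::SeqCst);
    (before_await, output, after_await)
}
```

A real version would presumably spawn onto a runtime inside `into_future` and return some handle future, along the lines of the closure sketched above; the trade-off is that callers who want the work to start early have to call `into_future()` explicitly.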
It all depends on where you define "progress." All that tokio's fs futures do is wait on a background task. The actual fs operations do "progress" independently of the foreground, but the only "progress" that the future you hold actually makes is submitting work to that background task and noticing that the task has completed. Neither of those happen without polling.
Calling a function that returns a Future is already two levels of indirection through function calls; it just doesn't come up that often. I think the main use would be if you have some data that can be used in some other way (other than awaiting it), and it requires additional setup to become a future, and the overhead of lazily performing that setup the first time it would be polled is significant.
If you define "progress" as "stuff the future does when polled" then yes, the future doesn't do anything. If you want to get even more technical, the future is just data and never does anything; all work is done by functions such as poll. I don't think either of those is a helpful definition.
The same way JoinHandle is just a handle, with the actual work being done by background threads, tokio::Runtime is also a handle. Treating handles to a resource as synonymous with the resource itself is so commonplace that many programmers don't even realize they're doing it. File descriptors are referred to as files, thread handles are referred to as threads, pointers to things are often referred to as the thing they point to, etc.[1] Documentation that can only be understood properly if you don't use this abstraction is needlessly confusing, and at the bare minimum should have a disclaimer that it is treating the names of things as separate from the things they identify.[2]
[1] this is even a thing outside of programming, in broader linguistics. if i asked what "Linus Torvalds" is, more people would say "the creator of linux" instead of the technically correct answer "the name of the creator of linux"
[2] rust itself often doesn't distinguish between references to things and the things themselves. for example, "slice" is frequently used to refer to either &[T] or [T]
I do agree that the current way it's presented isn't ideal, and that there isn't a silver bullet presentation either. But at the same time, spawning work onto a background thread isn't exotic behavior nor a behavior of futures; it's a behavior of the code which does cross-thread communication.
In fact, I think it's important for any async adapters which use background work (completion model rather than readiness model) to document that they do so. This is because background work can get lost in the event of process termination without the same visibility into partial progress as with synchronous work.
It doesn't need to be "sudden" process termination either; it could just be exiting main without flushing handles which push work into background threads. I think tokio does wait for its managed IO threads to complete during runtime shutdown, but a system which uses more implicit worker threads (e.g. smol/async-std) more acutely carries this risk factor.
The difference is, there's nothing on (eg.) Iterator that says it doesn't spawn a background thread, so if someone created an iterator that does, it wouldn't contradict the documentation in any way. Absolute terms such as "must" and "are" should be used sparingly.
yes, and i think such a requirement would be a good thing to add to the Future docs, instead of denying that any such distinction exists. the current documentation is not just confusing for users of futures, but also for anyone implementing futures.
"I made a future that does background work, and it does that background work even if I don't await it" is a real point of confusion I've seen on URLO.
I think the async book has gotten a bit better at this recently, removing some blanket statements that are not always true.