Future and its assurance of completion

I'm curious about people's opinions on Future providing an assurance of completion, perhaps via an additional Complete auto trait (for example).

As background, I've been thinking recently about async/await and its applicability to actor models. In particular, I've been thinking about whether an actor's receiver should be an async function. An actor's receiver provides the ability for it to receive messages from its mailbox (aka a channel). An example without async:

struct Greet {
    whom: String,
}

struct HelloWorld {}

impl Actor<Greet> for HelloWorld {
    fn receive(&mut self, context: &mut ActorContext<Greet>, message: &Greet) {
        println!("Hello {}!", message.whom);
    }
}

With an async receiver, the above could be adapted to invoke some async function:

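use std::future::Future;
use std::pin::Pin;

// `Complete` is the hypothetical auto trait proposed above; it does not exist today.
#[derive(Clone)]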
struct Greet {
    whom: String,
}

struct HelloWorld {}

impl HelloWorld {
    async fn hello_world(message: &Greet) {
        println!("Hello {}!", message.whom)
    }
}

impl Actor<Greet> for HelloWorld {
    fn receive(&mut self, _context: &mut ActorContext<Greet>, message: &Greet) -> Pin<Box<dyn Future<Output=()> + Complete + '_>> {
        let m = message.to_owned();
        Box::pin(async move {
            HelloWorld::hello_world(&m).await
        })
    }
}

(Rust Playground code)

A concern when considering async/await receivers is that an awaited future may block indefinitely. The consequence of this is that an actor is not able to process any subsequent message; one such message being to request termination. For actor receivers then, I've been wondering about a new trait to be used in combination with Future to assure the compiler that the async code should always complete. Perhaps signatures then become something like Future<Output = ???> + Complete (in the spirit of how + Send is used). Perhaps the compiler can then ensure that async functions must also convey that they can complete.

Perhaps completion-assured futures might benefit other use-cases. Or perhaps there's room in the existing Future API to simply state that it should always complete by way of a convention.

Thoughts?

To be clear, this is about a future that will, if polled as requested, eventually return Poll::Ready, right?

I'm pretty sure this immediately reduces to the halting problem (i.e. unsolvable), unfortunately, because any code could just loop {} (which is perfectly defined and allowed in Rust, unlike C or C++, which have forward progress guarantees). You could weaken the guarantee to "makes forward progress" and make it at least somewhat tractable to approximate, but that is a much weaker property than completion.
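To make the point concrete, here is a minimal std-only sketch. The names `completes`, `never_completes`, `ready_within` and `NoopWake` are made up for illustration. Both async functions have the same signature, so nothing in the types tells a caller which one will eventually return Poll::Ready:

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A waker that does nothing; enough for polling by hand in this sketch.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Polls `fut` at most `max_polls` times and returns its output if it became ready.
fn ready_within<F: Future>(fut: F, max_polls: usize) -> Option<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    for _ in 0..max_polls {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return Some(value);
        }
    }
    None
}

// Completes on the first poll.
async fn completes() -> u32 {
    42
}

// Same signature, but it awaits a future that is never ready.
async fn never_completes() -> u32 {
    std::future::pending::<u32>().await
}
```

A bounded number of polls can only tell you "not ready yet", never "will never be ready", which is the halting problem in miniature.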

The answer is pretty much that you have exactly the same expectation that foo().await eventually resolves as you have that foo() eventually returns: unless documented otherwise, it'd be a pretty serious bug in the providing library if it didn't.

I don't know if async-std's runtime actually does so -- there was talk a while back about making it do so, but I don't recall the exact outcome -- but a sufficiently smart async runtime can detect that a task is unexpectedly blocking a thread, and shift that thread to the blocking pool, replacing the task thread with a new one. This alleviates the issue somewhat, so long as said task is straightline rather than select!ing over multiple subfutures (as siblings would still be starved).

Unfortunately, cooperative multitasking just doesn't work if you can't trust your tasks to cooperate. If you want to add more safeguards to your async code against accidentally being uncooperative, you can a) spawn_blocking anything potentially blocking, and b) periodically .await whatever your runtime's equivalent of "let the reactor spin for a turn" future is (e.g. tokio yield_now).
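As a rough illustration of point (b), here is a hand-rolled stand-in for a runtime's "yield for a turn" future. `YieldNow`, `yield_now`, `sum_cooperatively` and `count_polls` are hypothetical names for this sketch and are not any runtime's real implementation; in real code you would use whatever your runtime provides:

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Returns Pending exactly once (after asking to be woken again), then Ready.
struct YieldNow {
    yielded: bool,
}

impl Future for YieldNow {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            // Ask the executor to poll this task again soon.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

fn yield_now() -> YieldNow {
    YieldNow { yielded: false }
}

// A loop that hands control back to the executor after every second item.
async fn sum_cooperatively(items: &[u64]) -> u64 {
    let mut total = 0;
    for (i, x) in items.iter().enumerate() {
        total += x;
        if i % 2 == 1 {
            yield_now().await;
        }
    }
    total
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Drives `fut` by hand and reports its output plus how many polls it took.
fn count_polls<F: Future>(fut: F, max_polls: usize) -> Option<(F::Output, usize)> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    for n in 1..=max_polls {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return Some((value, n));
        }
    }
    None
}
```

Each Pending return is a point where sibling tasks on the same executor get a chance to run.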


When most people talk about guaranteeing completion of futures, they mean guaranteeing that the future will not be dropped (or forgotten, or otherwise stop being polled) before it yields Poll::Ready. This is a distinct guarantee, and one that's difficult-to-impossible to provide within Rust's model. The worst case is as above, an infinite loop in another task in a select!, but plain mem::forget and ref counting cycles also work against providing this guarantee.

To be clear, this is about a future that will, if polled as requested, eventually return Poll::Ready, right?

Thanks. Yes.

I'm trying to be very careful here to avoid the word, "guarantee". As you allude to, there are no guarantees in life. :slight_smile: I also note that there's nothing guaranteed about Send as I can cast to an Any and add Send to a channel, and so my thoughts on Complete are offered in the same spirit.

I believe there's something about calling async functions where the likelihood of not completing is higher than calling sync ones, particularly when dealing with network IO. My expectations on completion are thus slightly different.

Having a runtime moving a task to a blocking thread is interesting. However, being the provider of a library, I'm not in control of what its users will do within the actor's receiver and so I'm wondering if we can help the developer determine what async calls are able to be used.

Perhaps a more concrete example of completion assurance is Tokio's timeout function. That function provides me with some assurance that it will return after a period of time and would therefore be a good candidate for returning a Complete assurance.
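For readers who haven't used it: tokio::time::timeout wraps a future and resolves to an error if the deadline passes first. The std-only sketch below imitates that shape so the idea can be seen without a runtime. `WithDeadline`, `with_timeout`, `Elapsed` and `poll_n` are names invented here, and unlike a real runtime timer this version only checks the clock when it happens to be polled:

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
struct Elapsed;

// Wraps a future and gives up once the deadline has passed.
struct WithDeadline<F> {
    inner: Pin<Box<F>>,
    deadline: Instant,
}

impl<F: Future> Future for WithDeadline<F> {
    type Output = Result<F::Output, Elapsed>;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // Give the inner future the first chance to finish.
        if let Poll::Ready(value) = self.inner.as_mut().poll(cx) {
            return Poll::Ready(Ok(value));
        }
        // Assumption: something keeps polling us. A real runtime would
        // register a timer that wakes the task at the deadline instead.
        if Instant::now() >= self.deadline {
            return Poll::Ready(Err(Elapsed));
        }
        Poll::Pending
    }
}

fn with_timeout<F: Future>(limit: Duration, fut: F) -> WithDeadline<F> {
    WithDeadline {
        inner: Box::pin(fut),
        deadline: Instant::now() + limit,
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Polls `fut` up to `max_polls` times by hand.
fn poll_n<F: Future>(fut: F, max_polls: usize) -> Option<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    for _ in 0..max_polls {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return Some(value);
        }
    }
    None
}
```

Note that the wrapper bounds how long the caller waits, not how long the inner future would have run; the inner future is simply dropped when the wrapper is.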

@huntc Unfortunately, cooperative multitasking just doesn't work if you can't trust your tasks to cooperate. There's no 100% guarantee, but you should keep on with this project. :slightly_smiling_face:

The blog post about this has been updated to say that they didn't end up going ahead with it.

There's also a section in a Tokio blog post which explains why they don't auto-detect blocking tasks.

The solution to your problem is of course to join the termination signal future with the future you want to await.
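A minimal sketch of that idea, assuming "join" here means racing the two futures and taking whichever finishes first (what runtimes usually call select). `Race`, `race`, `Either` and `poll_once` are illustrative names, not an existing API:

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

#[derive(Debug, PartialEq)]
enum Either<L, R> {
    Left(L),
    Right(R),
}

// Resolves with whichever of the two futures becomes ready first.
// `a` is polled first, so put the termination signal there to give it priority.
struct Race<A, B> {
    a: Pin<Box<A>>,
    b: Pin<Box<B>>,
}

impl<A: Future, B: Future> Future for Race<A, B> {
    type Output = Either<A::Output, B::Output>;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if let Poll::Ready(value) = self.a.as_mut().poll(cx) {
            return Poll::Ready(Either::Left(value));
        }
        if let Poll::Ready(value) = self.b.as_mut().poll(cx) {
            return Poll::Ready(Either::Right(value));
        }
        Poll::Pending
    }
}

fn race<A: Future, B: Future>(a: A, b: B) -> Race<A, B> {
    Race {
        a: Box::pin(a),
        b: Box::pin(b),
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Polls `fut` a single time by hand.
fn poll_once<F: Future>(fut: F) -> Option<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(value) => Some(value),
        Poll::Pending => None,
    }
}
```

With this shape, a receiver that is stuck waiting still lets the termination signal through, because the losing future is just dropped. It does not help if the receiver blocks the thread inside poll.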

There is obviously no way of guaranteeing a future terminates without using a much more sophisticated language with dependent types that can encode proofs and/or a totality checker.

I don't actually see a difference between the sync and the async instances here. Is it just that the async one requests the next message itself?

Synchronous functions also can block indefinitely so this wouldn't be a new issue for the async handles.

You could, if you wanted to, drop the receiver's future after some deadline (as long as the future was indeed yielding back to you).

But as others have said, in cooperative multitasking you have to be able to trust your futures.

Wait, no, this is not true, and that's important! Respecting Send/Sync is necessary to maintain soundness and freedom from data races. dyn Any implements neither Send nor Sync; if you want a trait object to be Send/Sync you have to say dyn Any + Send + Sync.

Playground example

If a channel allows you to send dyn Any between threads (not just between tasks on a single-threaded executor, but send it between threads), that channel is unsound and has a serious bug.
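For anyone following along without the playground link, a small std-only sketch of the sound version (the function name `describe_on_other_thread` is made up for this example). The channel carries Box<dyn Any + Send>, so the compiler has checked that every message may cross threads:

```rust
use std::any::Any;
use std::sync::mpsc;
use std::thread;

// Sends a type-erased message to another thread and reports what arrived.
fn describe_on_other_thread(msg: Box<dyn Any + Send>) -> String {
    // The `+ Send` bound is what makes the receiver movable into the thread.
    // With `Box<dyn Any>` instead, `thread::spawn` below would not compile.
    let (tx, rx) = mpsc::channel::<Box<dyn Any + Send>>();

    let handle = thread::spawn(move || {
        let received = rx.recv().expect("sender dropped");
        match received.downcast::<String>() {
            Ok(s) => format!("string: {s}"),
            Err(_) => "unknown message".to_string(),
        }
    });

    tx.send(msg).expect("receiver dropped");
    handle.join().expect("thread panicked")
}
```

Erasing the concrete type through Any does not erase the Send obligation; it stays in the trait object type.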

To be clear, I only cast to Any if the original type has Send and the Any is also used in combination with Send. Any is required as I've introduced the notion of a Dispatcher that interfaces with async runtimes. These dispatchers are able to work with any message (that can be sent). Here's my project for reference: GitHub - titanclass/stage: A minimal actor model library using nostd Rust and designed to run with any executor.

Thanks for the responses here. My original question posed two ideas: (i) a possible auto trait to signal that completion has been considered; and (ii) an enhancement to the Future documentation providing assurance around completion.

Considering the second option for the moment, and given what the Future doc states right now:

A future represents an asynchronous computation.

...would developers of async functions benefit from having the sentence extended per:

A future represents an asynchronous computation that must complete.

i.e., would this statement help developers of Rust async functions think more about completion?

Yes, so long as you use dyn Any + Send everything is fine, it's only dyn Any that it would be bad to share cross-thread. But if you only send dyn Any + Send cross-thread, then why did you claim that there's nothing guaranteed about Send, since that's not true? (The lack of) Send is, in fact, a strong guarantee made by the standard library, and breaking that guarantee easily leads to UB in safe code.

"typically expected to complete" is the strongest wording that I could see being adopted. "Must" is too strong, when I can already trivially write loop { rt::sleep(1).await } the same way I can write loop { os::sleep(1); } in sync code. In fact, it's not even terribly uncommon for an async application's main function to be shaped as rt::block_on(async { loop { do_work().await; } }) -- the "main" task is an infinite loop awaiting spawned tasks, the same as the main thread blocking on effectively-infinitely-running threads.

That said, it is generally expected that if someone gives you impl Future to .await, it will yield a value at some point. future.await is basically identical to callback() in completion expectation: generally, modulo bugs, functions should complete some unit of work and report that via return value (or modification of captured state).

If anything, it would help developers think less about completion. If you're guaranteed (for some level of guaranteed) that anything you .await will complete, there's no need to worry about it not completing. Futures are already supposed to return from poll "reasonably quickly", and not do any CPU-intensive or blocking work in poll.

Just as one final example, consider a future whose task is to read some e.g. TCP packet group. If it reads a header saying that three more packets are coming, and then the other side of the connection never sends the last packet (for one reason or another), the future is never going to yield a value, as it's never done waiting (for what isn't going to come). Whatever network primitive is in use shouldn't be forced to include a timeout just to conform to the API expectations of the "can be .awaited" trait. Instead, if a library wants to provide a safer interface, then the raw task should be wrapped in a layer that selects over the primitive or a timeout. (Or, well, in reality you want a combination of smart between-packet timeouts as well as a global timeout, or otherwise accepting a steady but kinda slow connection while refusing maliciously slow ones, but that's a lot more complicated.)

So I feel the real solution to the concern stated in your OP (that an actor waiting on an awaited future cannot process any subsequent message) is to just not wait to receive the next message until the first one is done polling. The whole point of async is being able to parallelize the waiting; not starting to poll the next message because you haven't finished processing the prior one defeats the purpose.

Keep in mind, you don't have to always immediately .await when you call an async fn. Again, that defeats the point of async. The point is to poll as many tasks as you can reasonably get concurrent, concurrently. And if managing polling all of them is difficult, that's the whole reason your runtime exists and has rt::spawn: to collect multiple tasks to be run concurrently. (The ecosystem sorely wants some standard runtime-agnostic way to spawn(impl Future) -> impl Future, as it's the way to go from "lazy" to "eager" futures.)
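As a toy version of what a runtime's spawn gives you, the sketch below polls several handler futures in turn, so one that never becomes ready does not stop the others from finishing. `poll_round_robin`, `boxed` and `NoopWake` are hypothetical helpers for illustration; a real executor would use wakers rather than polling everything every round:

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Convenience for building a list of differently typed futures.
fn boxed<T, F>(fut: F) -> Pin<Box<dyn Future<Output = T>>>
where
    F: Future<Output = T> + 'static,
{
    Box::pin(fut)
}

// Polls every unfinished task once per round, for at most `max_rounds` rounds.
// Returns one slot per task: Some(output) if it finished, None if it did not.
fn poll_round_robin<T>(
    mut tasks: Vec<Pin<Box<dyn Future<Output = T>>>>,
    max_rounds: usize,
) -> Vec<Option<T>> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut results: Vec<Option<T>> = tasks.iter().map(|_| None).collect();

    for _ in 0..max_rounds {
        for (task, slot) in tasks.iter_mut().zip(results.iter_mut()) {
            // Never poll a future again after it has returned Ready.
            if slot.is_none() {
                if let Poll::Ready(value) = task.as_mut().poll(&mut cx) {
                    *slot = Some(value);
                }
            }
        }
    }
    results
}
```

The stuck task still never completes, but it only costs its own slot rather than the whole mailbox.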


And again, any discussion of completion runs dangerously close to accidentally reinforcing people's (expectation? hope? desire?) notion that futures are polled through completion (unless explicitly cancelled, which would be a panic, I guess? Or otherwise an unwind like... Drop is for the async stack).

I stated:

there's nothing guaranteed about Send as I can cast to an Any and add Send to a channel

...meaning dyn Any + Send. Sorry for not being clearer.