Precise generator capturing and how it interacts with a future possibility

chrefr · August 20, 2024, 8:20am

More precise tracking for generators (including async functions) is planned (e.g. "Tracking issue for more precise coroutine captures", rust-lang/rust issue #69663, and more).

However, thanks to a conversation on Stack Overflow, I realized that this feature is at odds with another potential feature, so we need to decide which one we prefer.

An issue that has been brought up several times is that the following code doesn't compile, even though it is completely sound:

use std::rc::Rc;

async fn foo() {}

async fn bar() {
    let rc = Rc::new("hello");
    foo().await;
    // Using `rc` after the await keeps it alive across the await point.
    &rc;
}

fn require_send<T: Send>(_: T) {}

fn main() {
    require_send(bar());
}

The problem is that Rc, a non-Send type, is held across an .await point. However, Rc specifically is not problematic: it is true that it cannot be moved between threads, but that only matters if clones are left in the original thread. Async blocks will never leave a clone in the original thread, so they can safely hold an Rc across .await points.

The suggested fix is to have another auto trait, let's call it SendNoEscape, that is implemented for Rc and not implemented for, say, MutexGuard.

All of this is already known, and has probably already been discussed (whether we really want an additional auto trait, whether it is worth it, etc.). However, an important realization I had thanks to the above-mentioned Stack Overflow discussion is that the following statement is true:

While values created in an async block can safely use SendNoEscape, values captured by the async block (including async fn parameters) cannot, as they can leave copies in the caller.

This raises the question: How does the compiler differentiate between the two?

Syntactically, it's very easy. But semantically, a value created in the async block can originate from a value captured by the async block, and the compiler can't tell. So, this leads us to the following understanding:

If any value captured by the async block is !Send, the async block must be !SendNoEscape, even if this value is dropped before the first .await point.

Today, this is always true. The only way to make a value considered dropped early for generator computation is by using blocks, and captured values cannot be enclosed in blocks.
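To illustrate the block-scoping point, here is a minimal sketch based on the first snippet. The only change is that the Rc lives inside an inner block that ends before the .await, so it is not part of the future's state and the future passes the Send check. The return value is added here just to give the Rc something to do.

```rust
use std::rc::Rc;

async fn foo() {}

async fn bar() -> usize {
    // The Rc is created and dropped inside this block, before the await,
    // so it is never held across the await point.
    let len = {
        let rc = Rc::new("hello");
        let n = rc.len();
        n
    };
    foo().await;
    len
}

fn require_send<T: Send>(_: T) {}

fn main() {
    // Compiles: the future returned by `bar` is Send.
    require_send(bar());
}
```

A captured value (such as an async fn parameter) has no enclosing block inside the body that could end its scope this way, which is the asymmetry the post relies on.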

However, with the precise capturing effort, this is no longer true. So if precise capturing is implemented (and stabilized), supporting SendNoEscape will become a breaking change, which means it cannot happen.

This means we must decide, and decide now, before we stabilize precise capturing: is there any chance we will ever want to support SendNoEscape? If yes, precise capturing cannot be implemented.

There is also a middle ground: implement precise capturing, except for captured values. We could even avoid storing them in the generator, and only treat them "as if" they were held with respect to auto traits.

SkiFire13 · August 20, 2024, 9:01am

This is true only if you ignore thread locals. Otherwise the async block could both store a clone of the Rc in a thread local of the original thread and keep another instance internally to use later on.
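A minimal sketch of the thread-local escape hatch described above, written as ordinary synchronous code for brevity. The names STASH and stash_clone are made up for this illustration. The same call could be made from inside an async block before its first .await.

```rust
use std::cell::RefCell;
use std::rc::Rc;

thread_local! {
    // Hypothetical per-thread slot that can keep an Rc clone alive.
    static STASH: RefCell<Option<Rc<i32>>> = RefCell::new(None);
}

// Leaves a clone of `rc` behind in the current thread's STASH.
fn stash_clone(rc: &Rc<i32>) {
    STASH.with(|slot| *slot.borrow_mut() = Some(Rc::clone(rc)));
}

fn main() {
    let rc = Rc::new(5);
    stash_clone(&rc);
    // One count for `rc`, one for the clone stored in the thread local.
    assert_eq!(Rc::strong_count(&rc), 2);
}
```

If `rc` were then moved to another thread, the stashed clone would still be reachable from the original one, so both threads could touch the non-atomic reference count.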

The problematic situation seems to be when:

the async block captures a !Send + SendNoEscape value;
it uses such value to create another !Send + SendNoEscape value;
it drops/consumes the original captured value;
all of this happens before the first await;
the new value is then held across an await.

Is this correct? If it is then I don't see where's the breaking change. Even with precise capturing this Future will be !Send because the new value is still !Send and is held across an await point.

Your rule also seems too restrictive, as it should probably apply only when a !Send + SendNoEscape value is held across an .await. With this it's clear that it should never apply to situations where a Future is currently Send. Edit: never mind, this is wrong.

In general I would expect any SendNoEscape proposal to allow strictly more code than the equivalent with only Send bounds. As such there should be no examples that compile with Send bounds (even after precise capturing) but don't with SendNoEscape.

chrefr · August 20, 2024, 9:30am

Ouch. This does render the entire point moot. But I'm sure I saw people mentioning that, for example at https://matklad.github.io/, even though I can't find it now.

But the entire point is to use SendNoEscape, not Send, to determine if the future is safe to cross threads.

The logic works as follows: if any captured value is !Send, the future is !SendNoEscape. If any local value is !SendNoEscape, the future is !SendNoEscape. But if a local value is SendNoEscape + !Send, the future is SendNoEscape (and !Send, but this doesn't matter).
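The rule above can be written down as a toy model. This is only a sketch of the proposed logic, not anything the compiler does: SendNoEscape does not exist, and the Origin and Value types and the future_is_send_no_escape function are invented for this illustration.

```rust
// Where a value held by the future came from.
#[derive(Clone, Copy)]
enum Origin {
    Captured,
    Local,
}

// Hypothetical description of one value held by the future.
#[derive(Clone, Copy)]
struct Value {
    origin: Origin,
    send: bool,
    send_no_escape: bool,
}

// Captured values must be Send; local values only need SendNoEscape.
fn future_is_send_no_escape(values: &[Value]) -> bool {
    values.iter().all(|v| match v.origin {
        Origin::Captured => v.send,
        Origin::Local => v.send_no_escape,
    })
}

fn main() {
    // Models an Rc: !Send but SendNoEscape.
    let local_rc = Value { origin: Origin::Local, send: false, send_no_escape: true };
    let captured_rc = Value { origin: Origin::Captured, ..local_rc };
    // Models a MutexGuard: neither Send nor SendNoEscape.
    let local_guard = Value { origin: Origin::Local, send: false, send_no_escape: false };

    assert!(future_is_send_no_escape(&[local_rc]));
    assert!(!future_is_send_no_escape(&[captured_rc]));
    assert!(!future_is_send_no_escape(&[local_guard]));
}
```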

SkiFire13 · August 20, 2024, 11:16am

I don't see how this is a problem though. If you use SendNoEscape to determine if a future is Send or not then you should be allowing strictly more futures to be Send. If there's a situation where using SendNoEscape makes the future !Send but not using it makes the future Send then this is IMO an error in the design of SendNoEscape, since clearly that future should be able to be Send.

By the way, a !Send captured value should make a future !Send even with precise capturing, even ignoring SendNoEscape. This is because the captured variable is alive when the Future is created, and is also alive when it is first polled, which means it is stored in the Future and makes it !Send.

chrefr · August 20, 2024, 1:25pm

There is such an example, and this is what I'm talking about:

async fn bar(capture: Rc<i32>) {
    drop(capture);
    foo().await;
}

This example cannot soundly be SendNoEscape, because that would mean the following would be too:

async fn bar(capture: Rc<i32>) {
    let clone_local: Rc<i32> = Rc::clone(&capture);
    drop(capture);
    foo().await;
    Rc::clone(&clone_local);
}

But it can definitely be Send, because Send alone has no such risk - the second snippet will be !Send.

Ooh! This is a good point. Then I imagine this discussion is useless twice

SkiFire13 · August 20, 2024, 5:02pm

No it can't, because otherwise you could:

create the Future;
move it to another thread;
poll it the first time, which will execute the Rc::clone from that thread with the risk of a data race.
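The three steps above can be sketched with only the standard library. The point is that an async fn body does not run when the future is created, only when it is first polled, so code before the first .await runs on whichever thread does the polling. The helper names (NoopWake, poll_once, which_thread, body_ran_off_the_creating_thread) are made up for this example, and the future here captures nothing, so it is Send and the move is allowed.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, ThreadId};

// A waker that does nothing; enough to poll a future that never pends.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Polls the future exactly once on the current thread.
fn poll_once<F: Future>(fut: F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    let out = fut.as_mut().poll(&mut cx);
    out
}

// Reports the thread on which the body actually executes.
async fn which_thread() -> ThreadId {
    thread::current().id()
}

fn body_ran_off_the_creating_thread() -> bool {
    let creator = thread::current().id();
    // Step 1: create the future. The body has not run yet.
    let fut = which_thread();
    // Steps 2 and 3: move it to another thread and poll it there.
    let result = thread::spawn(move || poll_once(fut)).join().unwrap();
    match result {
        Poll::Ready(id) => id != creator,
        Poll::Pending => false,
    }
}

fn main() {
    assert!(body_ran_off_the_creating_thread());
}
```

If the body instead started with an Rc::clone of a captured Rc, that clone would happen on the spawned thread, which is why such a future cannot be Send.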

idanarye · August 21, 2024, 12:28am

What are SendNoEscape's semantics even going to be?

Cloning a value does not leave a compile-time "trace" to the original value (an example of something that does leave such a trace is taking a reference).
Clone is not a trait that gets treated specially by the compiler. At least not directly - one can argue that it does so indirectly because Copy has it as a dependency. But either way - the compiler does not "know" that a value is cloned because it does not treat Clone any differently from any other method.
Cloning does not "color" the function it gets called inside - so one could easily create a function that clones the Rc inside it, and since that function will not need to be specially marked in order to do so - how will SendNoEscape know about it?

withoutboats · August 21, 2024, 9:42am

(NOT A CONTRIBUTION)

Overall, I'm not very enthusiastic about SendNoEscape. The downside is how highly disruptive it would be, adding a new auto trait, etc.; that's well known. But I also think the upside is weaker than other people seem to think.

You really shouldn't need Rcs that don't escape a task. Between tasks, reference counting is necessary both because of the scoped task issue and because if your application doesn't use "structured concurrency" you genuinely don't know which task will outlive the others. So I see the use for ref counting between tasks, but then if you're using a work-stealing executor they need to be Arc, and if you're not then your futures don't need to be Send.

But within a task, you can just use references because that's one of the big advantages of "intra-task" concurrency. So why do you need ref counting? If you need some sort of graph-like data structure, use petgraph or something.
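As a rough sketch of the "just use references" point: the task below owns its data and lends it to sub-futures by reference, with no Rc involved, and the resulting future is Send. The function names are invented for this example, and the sub-futures are awaited one after another only because the standard library has no join combinator. A join from an async runtime could borrow the same data concurrently.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

async fn total_of(data: &[u32]) -> u32 {
    data.iter().sum()
}

async fn largest_of(data: &[u32]) -> Option<u32> {
    data.iter().copied().max()
}

// The task owns `data`; the sub-futures only borrow it.
async fn task() -> (u32, Option<u32>) {
    let data = vec![3, 1, 4, 1, 5];
    let total = total_of(&data).await;
    let largest = largest_of(&data).await;
    (total, largest)
}

fn require_send<T: Send>(t: T) -> T {
    t
}

// Minimal no-op waker so the example can poll without a runtime.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn poll_once<F: Future>(fut: F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    let out = fut.as_mut().poll(&mut cx);
    out
}

fn main() {
    // Borrowing across awaits keeps the future Send.
    let fut = require_send(task());
    assert_eq!(poll_once(fut), Poll::Ready((14, Some(5))));
}
```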

There's more of an issue with wanting to use RefCell and holding Ref across any await points. But that seems really risky to me (since you could attempt to access the ref cell in a concurrent sub-task) so I don't really think this pattern should be encouraged. If you know it won't be accessed in a conflicting way, you should be able to transform your code to eliminate interior mutability.

That said, I'm also confused about the relevance of this to precise capture. There is effectively an implicit "await point" right as an async block is created; if you hold a !Send value but drop it before the first explicit await point, the future still must be !Send, because the first time you poll the future (which could be on a different thread from construction), it will execute that code.

EDIT: I see this last part is the same as @SkiFire13's last comment.
