Motivated by the recent proposal for autoclones, I want to suggest a different approach. I strongly dislike the autoclone proposals, but that's a different topic which I may post about in the future. For now, I'll just focus on the specific pain point raised in the discussions: captures of Rc/Arc-based shared pointers in closures and async blocks. Specifically, the following pattern is imho familiar to anyone who has written async code:
    fn do_stuff(serv: Server, foo: Arc<Foo>, bar: Arc<Bar>) {
        serv.set_handler_fn_mut({
            // First pair of clones: the closure gets its own handles.
            let foo = foo.clone();
            let bar = bar.clone();
            // `move` is required so the closure owns the clones made above.
            move |req| {
                // Second pair of clones: each call hands fresh handles to the async block.
                let foo = foo.clone();
                let bar = bar.clone();
                async move {
                    stuff_1(&foo, &bar);
                    handle_1(req, foo, bar);
                }
            }
        })
        .set_handler_fn_mut({
            let foo = foo.clone();
            let bar = bar.clone();
            move |req| {
                let foo = foo.clone();
                let bar = bar.clone();
                async move {
                    stuff_2(&req);
                    handle_2(req, foo, bar);
                }
            }
        });
    }
...yeah, that's annoying, and in real life there can be many more shared pointers plus some extra local state. The worst part is that it feels like something which should be handled automatically, like with autocaptures, and it looks so insignificant, and you need to repeat the same boilerplate so many times, just to use the resulting cloned Arc once in the handler function.
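
To make the double clone concrete, here is a minimal sketch in today's Rust that compiles with only the standard library. The `Server` type, `set_handler`, and `dispatch` are hypothetical stand-ins for the API in the example above, and threads stand in for async tasks (an assumption, chosen so that no runtime is needed). It shows why each handler needs one clone when it is created and another on every call.

```rust
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for the post's `Server`; handlers spawn a thread
/// instead of returning a future.
struct Server {
    handlers: Vec<Box<dyn Fn(u32) -> JoinHandle<String>>>,
}

impl Server {
    fn new() -> Self {
        Server { handlers: Vec::new() }
    }

    /// Stores a handler. The `'static` bound forces handlers to own their state.
    fn set_handler<F>(&mut self, f: F) -> &mut Self
    where
        F: Fn(u32) -> JoinHandle<String> + 'static,
    {
        self.handlers.push(Box::new(f));
        self
    }

    /// Runs every handler on `req` and collects the results in registration order.
    fn dispatch(&self, req: u32) -> Vec<String> {
        let pending: Vec<JoinHandle<String>> =
            self.handlers.iter().map(|h| h(req)).collect();
        pending
            .into_iter()
            .map(|j| j.join().expect("handler thread panicked"))
            .collect()
    }
}

fn do_stuff(serv: &mut Server, foo: Arc<String>) {
    serv.set_handler({
        // Clone 1: the stored closure must own a handle, and `foo` is needed again below.
        let foo = Arc::clone(&foo);
        move |req: u32| {
            // Clone 2: the closure is `Fn`, so it cannot give its own handle away;
            // every call clones one for the spawned task.
            let foo = Arc::clone(&foo);
            thread::spawn(move || format!("first {foo} {req}"))
        }
    })
    .set_handler({
        let foo = Arc::clone(&foo);
        move |req: u32| {
            let foo = Arc::clone(&foo);
            thread::spawn(move || format!("second {foo} {req}"))
        }
    });
}
```

Calling `do_stuff(&mut serv, Arc::new(String::from("foo")))` and then `serv.dispatch(7)` runs both handlers, each on its own thread with its own `Arc` handle.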
Automatically inserting those clones would be one approach, but it has knock-on effects. Rust generally tries to avoid hiding execution of arbitrary code (yes, there are exceptions, like Deref and Drop, but for the most part it's true and the exceptions are themselves very predictable).
What if we had an operator (let's call it .move) that evaluates an expression outside the capturing scope and moves the result in, using inline postfix syntax? I'd want to write the above example this way:
    fn do_stuff(serv: Server, foo: Arc<Foo>, bar: Arc<Bar>) {
        serv.set_handler_fn_mut(|req| async {
            let foo = foo.clone().move;
            let bar = bar.clone().move;
            stuff_1(&foo, &bar);
            handle_1(req, foo, bar);
        })
        .set_handler_fn_mut(|req| async {
            stuff_2(&req);
            handle_2(req, foo.clone().move, bar.clone().move);
        });
    }
Much better! Now we don't have any unnecessary bindings. Bindings are introduced only where normal straight-line code would need them, i.e. when we want to use a value multiple times. The cloning also happens as close as possible to the actual use site. The second handler is as lightweight as it can be without obscuring any effectful operations (the clones). Note that we no longer need async move: foo and bar are unconditionally cloned and force-moved into the innermost scope, while req is handled by the usual autocapture rules.
With regard to the semantics, expressions with .move operators are supposed to desugar basically to the verbose variants written above. Actually, I haven't yet come up with reasonable semantics for nested capturing scopes, but at least a single closure or async block can be handled this way:
Consider a capturing expression (closure definition, async block, generator etc.) which contains an expression of the form expr.move that is not nested in another capturing expression. Let us write that expression as Capture(expr.move), where Capture is the term of the entire capturing expression, and expr.move is a single use of the move operator (syntactically identical subexpressions are treated as different in the above, i.e. async { (expr.move, expr.move) } performs two separate moves rather than trying to give some non-linear semantics to a single move). We have the following desugaring:
    Capture(expr.move) ==> { let tmp = expr; Capture(tmp) }
with an extra requirement that tmp is captured by move unconditionally, without applying the usual autocapture analysis.
So e.g. the following expressions are definitionally equivalent, bar autocapture:
    // this
    it.map(|elt| transform(elt, state.frobnify().move))
    // desugars to this
    it.map({
        let tmp = state.frobnify();
        move |elt| transform(elt, tmp)
    })
    // this
    tokio::spawn(async { foo((bar + baz).move).await })
    // desugars to this
    tokio::spawn({
        let tmp = bar + baz;
        async move { foo(tmp).await }
    })
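
The desugared form can already be written by hand, which makes its timing observable. The sketch below is plain current Rust, not the proposed syntax; `make_handler` and `observe_counts` are hypothetical names. It hoists the clone out of the closure exactly as the desugaring prescribes and uses `Arc::strong_count` to show that the clone happens when the closure is created, not when it is called.

```rust
use std::sync::Arc;

/// Hand-written desugaring of the hypothetical `|| state.clone().move.len()`.
fn make_handler(state: &Arc<String>) -> impl Fn() -> usize {
    // `state.clone().move` would be hoisted to this point ...
    let tmp = Arc::clone(state);
    // ... and `tmp` is then captured by move.
    move || tmp.len()
}

/// Returns the strong count before creating the closure, after creating it,
/// and after calling it.
fn observe_counts() -> (usize, usize, usize) {
    let state = Arc::new(String::from("shared"));
    let before = Arc::strong_count(&state);
    let handler = make_handler(&state);
    let after_creation = Arc::strong_count(&state);
    let _len = handler();
    let after_call = Arc::strong_count(&state);
    (before, after_creation, after_call)
}
```

`observe_counts()` returns `(1, 2, 2)`: the count rises when the closure is built and is unchanged by calling it.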
What if the .moved expression contains references to the inner scope? In that case the expression doesn't typecheck, with the borrow checker error pointing out that a local variable is used outside its scope.
What if the same !Copy value is moved twice? The borrow checker complains that the value was moved, as usual. The desugaring preserves the evaluation order of moved subterms, so that should cause no issue.
What if capturing scopes are nested? I don't have a good answer at the moment. The obvious answer is that the transformation is applied only to the innermost scope, but that's not ergonomic enough for the intended async-valued closure use case. Then again, perhaps we can mostly ignore async-returning closures, since those should be solved by proper async closures, which have only one capturing scope.
What if the move operator is used outside of a capturing scope? I think it should be a syntax error. Rust is already move by default, what would an explicit move even mean? We wouldn't want to make users think that sometimes a move isn't a move without the magic keyword.
Should the operator be overloadable? Or contain a conversion trait, like IntoFuture for .await? I think no. I don't see a reason to do it.
Note that type checking is not affected in any way. In particular, if expr: &T, then expr.move moves a reference. It doesn't try to convert it into an owned value.
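
As a small illustration of that last point, here is the hand-desugared form of a hypothetical `(&words).move` in current Rust (`count_with_reference` is an illustrative name, not part of the proposal). Only the reference is moved into the closure, so the original value stays usable afterwards.

```rust
/// Hand-desugared version of the hypothetical `|| (&words).move.len()`.
fn count_with_reference() -> (usize, usize) {
    let words = vec!["a", "b", "c"];
    // The moved expression has type `&Vec<&str>`; no owned copy is created.
    let tmp: &Vec<&str> = &words;
    let counter = move || tmp.len();
    // `words` was only borrowed, so it can still be used here.
    (counter(), words.len())
}
```

Both components of the result are `3`, since the closure and the caller see the same vector.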
The advantages over explicit capture lists in the closure/async definition:
- More lightweight syntax: only the bare minimum of effectful operations, plus .move to denote the scope move trick (that's just a simple memcopy, so not interesting from a code analysis perspective).
- Complicated moved expressions don't require temporary bindings. Bindings are created as usual, when you want to reuse values.
- The captured expression is right at its use site, so the code is more readable.
- Doesn't affect autocapture in any way, apart from the force-moved expression. Old code isn't affected by the addition of .move. I also expect many cases of explicit async move blocks and move closures to be unnecessary with this fine-grained moving. In fact, why would anyone want async move {} when you can just .move the specific offending variable?
- Minimal changes to the parser. The .move syntax doesn't seem to conflict with anything, and it's just another keyword postfix operator. By contrast, changing the syntax of closures and async {} would be much more complex, and would likely break many macros.
- More flexible. Arbitrarily complex expressions are naturally supported. For example, I'd want to support clones parametrized by an explicit allocator, like vec.clone_in(alloc).move or Box::new_in(value, arena).move.
The advantages over autoclone:
- Again, more flexible. Autoclone (or recently, Claim) suggestions work only for cloning, mostly of refcounted pointers. Any more complex use cases are ignored.
- No hidden operations. The expression to be captured is as explicit as in current Rust, but all boilerplate (compiler-placating trivial bindings and scopes) is removed.
- Importantly, doesn't have any ripple effects on the rest of the Rust code or the ecosystem. Solves precisely the annoying problem of captures, and nothing more (though perhaps it could be extended to deal with overeager scopes in other places, like scrutinees of if and match).
Bikeshed: syntax
I'm not particularly tied to the specific .move syntax. I think it's good: the precedent of postfix keyword operators is well established, the syntax looks evocative of the semantics, the move keyword is barely used in current Rust, and I can't see any parser conflicts. But I think any postfix or prefix operator syntax could work. We could even make it a new block, but that looks like overkill, and would likely be less readable.
Edit: Downsides
- The inverted control flow may be confusing.
      // prints "cab"
      async {
          print!("a");
          print!("b");
          print!("c").move;
      }.await
  Specific cases can be linted against, but in general there is no way to prevent it. Applying .move to expressions with side effects is the intended usage; .clone() is the most important use case.
- This can lead to some nasty bugs if arbitrary side effects are allowed:
      wopr.when_the_russians_fire_their_nukes(|| {
          // Whoops, should have been:
          // let abort_signal = our_nukes.move.fire();
          let abort_signal = our_nukes.fire().move;
          // ...
      })
- The .move operator may be applied inside branches, but the semantics execute all .move invocations unconditionally, in sequence, before the capturing block executes.
      fn unconditional(cond: bool, foo: impl FnOnce()) -> impl FnOnce() {
          // foo is executed unconditionally before the closure
          || if cond { foo().move } else {}
      }
      // Equivalent to
      fn unconditional(cond: bool, foo: impl FnOnce()) -> impl FnOnce() {
          let tmp: () = foo();
          move || if cond { tmp } else {}
      }
- Weird scoping interaction with existing postfix operators:
      fn early_return(foo: Option<()>) -> Option<impl FnOnce()> {
          // The ? returns from the containing function, not the closure.
          Some(|| foo?.move)
      }
      // Equivalent to
      fn early_return(foo: Option<()>) -> Option<impl FnOnce()> {
          let tmp: () = foo?;
          Some(move || tmp)
      }
      async fn nested_async(foo: impl Future<Output=()>) -> impl Future<Output=()> {
          // Actually, this future just returns the output of `foo`.
          // It doesn't await inside its body.
          async { foo.await.move }
      }
      // Equivalent to
      async fn nested_async(foo: impl Future<Output=()>) -> impl Future<Output=()> {
          let tmp: () = foo.await;
          async move { tmp }
      }
- Iterated application looks confusing:
      cx.spawn(|| async { foo().move.bar().move });
      // Arguably, `.move` should be evaluated from outside to inside, i.e.
      let t1 = foo();
      cx.spawn(move || {
          let t2 = t1.bar();
          async move { t2 }
      });
      // But perhaps they should evaluate left to right, resulting in error.
      cx.spawn(|| {
          let t1 = foo();
          let t2 = t1.bar();
          async move { t2 }
      });
      // Or maybe it should be a compile error?
- Doesn't help with nested capturing scopes, e.g. a closure which returns async {}.
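
The first and third downsides can be reproduced today by writing the desugared forms by hand. The following is a sketch in current Rust under a few assumptions: a closure stands in for the async block, a `RefCell<String>` log stands in for `print!`, and the function names are hypothetical.

```rust
use std::cell::{Cell, RefCell};

/// Desugared form of the "cab" example: the `.move`d side effect is hoisted
/// in front of the capturing block, so it runs before the body does.
fn inverted_order() -> String {
    let log = RefCell::new(String::new());
    let log_ref = &log;
    // Hoisted: corresponds to `print!("c").move` in the hypothetical source.
    let tmp: () = log_ref.borrow_mut().push('c');
    let body = move || {
        log_ref.borrow_mut().push('a');
        log_ref.borrow_mut().push('b');
        tmp
    };
    body();
    log.into_inner()
}

/// Desugared form of `|| if cond { foo().move } else {}`: `foo` runs exactly
/// once whether or not the branch is ever taken.
fn side_effect_count(cond: bool) -> u32 {
    let calls = Cell::new(0u32);
    let foo = || calls.set(calls.get() + 1);
    // Hoisted out of the branch: executes unconditionally.
    let tmp: () = foo();
    let guarded = move || if cond { tmp } else {};
    guarded();
    calls.get()
}
```

`inverted_order()` yields `"cab"`, and `side_effect_count` returns `1` for both `true` and `false`, matching the behaviour described in the list.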
Alternatives
- Explicit lists of captures before a capturing block. However, these can get quite verbose for real-world code with long variable names and captured fields. Consider this example:
      cx.spawn(closure!(
          move saved_context,
          move lsp_adapter_delegate,
          clone path,
          clone languages = self.languages,
          clone telemetry = self.telemetry,
          clone slash_commands = self.slash_commands,
          ref workspace,
          |this, mut cx| async_block!(
              ref mut cx,
              move this,
              move saved_context,
              move lsp_adapter_delegate,
              clone path,
              clone languages = self.languages,
              clone telemetry = self.telemetry,
              clone slash_commands = self.slash_commands,
              ref workspace,
              async move { /* body */ }
          )
      ));
  Arguably, that's worse than the existing pattern with nested scopes and variable bindings. Some of the above annotations could be omitted if explicit captures could be combined with autocapture rules, and if capture modes of the same type could be grouped, but most of the captures would still need repeating.
- Use a prefix operator (move foo). Arguably that would somewhat discourage nesting of such expressions, but it wouldn't solve the issues with conditionals, early returns, ? and .await.
- Use block syntax (move { foo }). However, that may encourage putting more statements inside, including side-effectful ones (e.g. logging and asserts). The benefit is that it may be more obvious that there is a weird interaction with scoping and control flow, in particular that the variables and early returns inside refer to the enclosing scope.
- Use parentheses or brackets (move(foo), move[foo]).
- Make it a macro: move!(foo). This may make it easier to ban potentially confusing operations inside (early returns, ?, .await, nested macros) and make it more intuitive that it's not just a simple expression. This also allows easier introduction of custom syntax, e.g. we no longer need to introduce a new keyword. Instead it can be an arbitrary macro name, e.g. capture! or eager!.
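
For comparison, the explicit-capture-list alternative can be approximated in current Rust with a small declarative macro. This is only a sketch: `with_clones!` is a hypothetical name, it supports only plain identifiers and only cloning (none of the move/ref modes from the `closure!` example above), and it is not the `.move` proposal itself.

```rust
use std::sync::Arc;

/// Hypothetical helper: clones each listed variable into a fresh scope,
/// then evaluates the closure expression inside that scope.
macro_rules! with_clones {
    ($($name:ident),+ => $body:expr) => {{
        $(let $name = ::std::clone::Clone::clone(&$name);)+
        $body
    }};
}

/// Returns the handler's result and the strong count of `foo` while the
/// handler (which owns one clone) is still alive.
fn demo_with_clones() -> (u32, usize) {
    let foo = Arc::new(1u32);
    let bar = Arc::new(2u32);
    // The clones shadow `foo` and `bar` only inside the macro's block.
    let handler = with_clones!(foo, bar => move || *foo + *bar);
    let count_while_alive = Arc::strong_count(&foo);
    (handler(), count_while_alive)
}
```

`demo_with_clones()` returns `(3, 2)`. Note that the capture list still sits in front of the closure rather than at the use site, which is the verbosity trade-off discussed in the first alternative.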