A defer discussion

vojtechkral · March 5, 2024, 12:32am

Right, but recall that one of the motivations for the defer feature is that you might want to mutably refer to the things that are also refered to mutably in the "body", cf. the some_queue example in #1. Ie. in this case you'd end up with a mut ref to the same thing in both the first and second arguments to the function.

NB, one more idea on syntax: defer could be a consuming method on async block. It would be provided by a lang-item trait impld on async block futures. The trait would have to be a lang-item trait due to the integration with borrow checking and scope cleanup.
e.g.:

let container1 = docker.run("foo/foo").start().await()?;
async { container1.stop_and_remove().await; }.defer();

Downside is borrow-checking rules are sensitive to whether there is a .defer() consuming that block or not.
Upside is that no new keyword and special syntax would be needed, this could even be done in a backcompat manner (though I guess in current edition a new trait could not be added to the std prelude).

Ok, that's it, sorry, I'll stop spamming wild ideas now...

rpjohnst · March 5, 2024, 12:49am

That's precisely the nice thing about with- the block body becomes a callee/nested scope relative to the final block in its caller, so all the mutable (re)borrowing works out naturally. Here's the queue example:

let mut some_queue = ...;

while Some(item) = some_queue.pop() {
    with reborrowed_queue <- push_item_later(&mut some_queue, item);
    if item.is_special() {
        reborrowed_queue.push(SpecialItem);
    }
}

fn push_item_later(queue: &mut Queue, item: I, body: impl FnOnce(&mut Queue)) {
    do { body(queue) } final { queue.push(item); }
}

Of course you can do this with a scope guard object too, but in either case it only really matters if you're trying to abstract the cleanup out, like in your example of deeply nested do/finals, or if you're trying to tie setup and cleanup together.

toc · March 5, 2024, 1:43am

Not to throw water on the overall discussion, but the exact syntax here seems like it can be punted until late in the RFC process. There are multiple possible syntaxes.

It seems like the issues around return (in particular, it seems like a bubble()? operation when we're already returning an Err will typically hide the error we care about), async cancellation, and operation around panicking are more salient. Although it also seems like all of those issues are already present around various multi-ref elaborate Guard types and even exist if you just manually copy/paste cleanup code everywhere necessary.

vojtechkral · March 5, 2024, 10:06am

My $.02 is that the important bits are borrow checking rules, async support, and yes - async cancelation issues. Stuff like working with result and bubbling from the defers/finals feels like advanced features on top.

matthieum · March 5, 2024, 5:52pm

While I agree that syntax should be at the service of functionality, and thus determined after functionality has been pinned down, I created this specific discussion precisely to explore the (potential) differences of functionality that defer could offer compared to do .. final.

For example, I suggested that defer could offer granular built-in dismissal in a way that do .. final didn't. It's a common feature in scope-guard libraries. It appears that in Rust it would be hard to tackle, so in that sense the discussion was fruitful (in tempering my optimism).

We're not just exploring syntax, here, we're exploring control-flow abstractions. We tack a syntax to it because it makes it easier to discuss something that has a name, and to play with various code samples to get a feel for it. But the syntax is not the crux -- I could use kickoolol, instead -- it's the underlying control-flow abstraction, and its interactions with borrowing rules, etc... which matters.

rpjohnst:

Using with sugar:

{
    let docker = Docker::connect_with_socket_defaults()?;
    with container1 <- start_stop_remove(docker.run("foo/foo")).await?;
    with container2 <- start_stop_remove(docker.run("bar/bar")).await?;
    with container3 <- start_stop_remove(docker.run("baz/baz")).await?;

    tests.run().await;
}

async fn start_stop_remove(container: Container, body: impl async FnOnce(&Container)) -> Result<(), DockerError> {
    let container = container.start().await?;
    do {
        Ok(body(&container).await)
    } final {
        container.stop_and_remove().await;
    }
}

It's really not clear how this thing is supposed to work here -- what it desugars to. This doesn't quite seem to work like Python's with, for example.

I think in the end I pieced it together by looking at the examples it was attempting to rewrite and deducing what it should be doing... but a bit of explanation wouldn't hurt.

So, here's what I think with is supposed to desugar to (container1 only):

{
    let docker = Docker::connect_with_socket_defaults()?;

    start_stop_remove(docker.run("foo/foo"), async |container1| {
        with container2 <- start_stop_remove(docker.run("bar/bar")).await?;
        with container3 <- start_stop_remove(docker.run("baz/baz")).await?;

        tests.run().await;
    }).await?;
}

In which case, body is improperly typed -- errors from the await? on container2 and container3 should be able to bubble up:

  async fn start_stop_remove<F>(container: Container, body: F) -> Result<(), DockerError>
  where
      F: async FnOnce(&Container) -> Result<(), DockerError>,
  {
      let container = container.start().await?;
      do {
          body(&container).await
      } final {
          container.stop_and_remove().await;
      }
  }

And we've got a missing Ok(()) at the end of the original function:

{
    let docker = Docker::connect_with_socket_defaults()?;
    with container1 <- start_stop_remove(docker.run("foo/foo")).await?;
    with container2 <- start_stop_remove(docker.run("bar/bar")).await?;
    with container3 <- start_stop_remove(docker.run("baz/baz")).await?;

    tests.run().await;

    Ok(())
}

Please let me know if I've misinterpreted something.

With that said, it works. But it seems we're piling a somewhat complex feature (with) on top of a non-too simple feature (arbitrary finalization) in attempt to reduce its shortcomings.

I mean, compare:

#[test]
fn using_defer() -> Result<(), Box<dyn Error>> {
    let docker = Docker::connect_with_socket_defaults()?;

    let container1 = docker.run("foo/foo").start().await?;
    defer || container1.stop_and_remove().await;

    let container2 = docker.run("bar/bar").start().await?;
    defer || container2.stop_and_remove().await;

    let container3 = docker.run("baz/baz").start().await?;
    defer || container3.stop_and_remove().await;

     tests.run().await;

     Ok(())
}

#[test]
fn using_with_on_top_of_do_final() -> Result<(), Box<dyn Error>> {
    async fn start_stop_remove<F>(container: Container, body: F) -> Result<(), DockerError>
    where
        F: async FnOnce(&Container) -> Result<(), DockerError>,
    {
        let container = container.start().await?;
        do {
            body(&container).await
        } final {
            container.stop_and_remove().await;
        }
    }

    let docker = Docker::connect_with_socket_defaults()?;
    with container1 <- start_stop_remove(docker.run("foo/foo")).await?;
    with container2 <- start_stop_remove(docker.run("bar/bar")).await?;
    with container3 <- start_stop_remove(docker.run("baz/baz")).await?;

    tests.run().await;

    Ok(())
}

The first is a linear (one pass down, one pass up) story, the second is a tad more convoluted to follow as the execution flow keeps bouncing back between the main function and the helper function(s).

And this is the ideal case for with, here:

The function defined only needs a single argument.
A single function needs to be defined.

If more arguments are needed, or different clean-ups are needed, then the size of the code would balloon-up, and it'd become even more difficult to follow the execution flow.

In any case, it seems with is completely orthogonal to do .. final vs defer: it does not offer guaranteed deferred execution on its own, as far as I can tell. In this case, I would favor NOT discussing with further, and focusing on the various possible ways to offer guaranteed deferred execution and whether some ways offer functionality that others don't, or make such functionality more easily accessible.

rpjohnst · March 5, 2024, 7:47pm

Yes, I was a bit hand-wavy with the Results.

I disagree that "bouncing between functions" is harder to follow. with offers the same control flow as Drop- a binding that introduces some cleanup, in reverse order, at the end of the block. Rust programmers already deeply understand how functions, callbacks, and Drop work, it's how we write this kind of code today.

The two reasons I brought up with in the first place were a) the "rightward drift" argument and b) the "tie setup together with cleanup" argument. The with sugar provides both these aspects of functionality in conjunction with do/final, without the downsides of defer, as an orthogonal construct.

idanarye · March 6, 2024, 9:11pm

Sorry for the bike-shed, but I think using a closure syntax when we don't use closure semantics (because we don't want to move/borrow the values the defer is supposed to clean up) can be confusing. If you want the ability to conditionally pass the result to the defer, there are two syntax styles for that that are already used in other Rust constructs (and are therefore more teachable):

Always pass the result, and just ignore it with _ when you don't actually need it:

// Without result:
defer _ { some_queue.push(item); }
// With result:
defer result { result.and_then(|| take_action()); }

dever let

// Without result:
defer { some_queue.push(item); }
// With result:
defer let result { result.and_then(|| take_action()); }

Also, with both styles (though it may be more natural for defer let) we can support refutable patterns:

// First style
defer Ok(result) { take_action(); }
// Second style
defer let Ok(result) { take_action(); }

CAD97 · March 6, 2024, 10:04pm

Counterpoint: so far, keyword (...) {...} always is/includes an expression fed to the block construct. Thusly, defer x {...} reads more like "defer using the thing named x" than "defer, capturing with the name x". defer let x doesn't have that issue, but

defer is structurally similar to a closure, with the difference being that binding/lifetime capture happens at the position(s) it's called (at scope exit) instead of the position it's defined. This is a large difference and could be reason to avoid closure syntax, but it's not too odd. In a way, it's somewhat similar to two-phase borrows in that it allows you to "capture" something but still use that thing between the capture and the capture's usage. (But two-phase borrows don't actually do reordering like defer would, but ordering is a common informal understanding as to why it works.)

The current syntax analogous form would probably be defer { result => take_action() } instead.

vojtechkral · March 7, 2024, 2:15pm

Hmm I'm not sure whether this is indeed the case or whether the defer syntax just makes it look that way. I guess I still don't understand the proposal enough; for example, what are the semantics in the following situation?

let mut resource = Resrouce::acquire();
let guard = defer || { resource.cleanup(); }

// code...

if something() {
    let foo = Foo::new();
    resource.do_something(foo);
} else {
    drop(resource);
}

// code...

It's not clear to me what should be the point where the capturing happens. It can't quite be at scope exit, since at that point resource may already have been dropped.

Edit: Or is it the case that the capturing always happens at scope exit and the above would simply be an error?

matthieum · March 7, 2024, 5:18pm

I would agree, except we do use closure semantics, in all ways but one:

The variables "captured" are captured by name at the moment the defer is defined, not later.
The effect of continue, break, or return is scoped to the closure.
I'm probably forgetting some things.

The one difference is that borrow-checking analysis would be special, though not quite completely off:

Similar to an early capture (typical closure), a bound variable cannot be moved out.
- Not even if it's later re-assigned, because if it's, even temporarily, de-initialized when a panic occurs, that's problematic.
Not similarly to an early capture (not quite like a closure), a bound variable is not considered borrowed outside of the actual execution of the defer closure.
- It can be modified, swapped, replaced, taken, etc...

So there's only one difference with a real closure, and it's the deferred "lock" borrow-wise. Otherwise, it's a closure through and through.

The above would be an error, indeed.

toc · March 7, 2024, 7:01pm

vojtechkral:

let mut resource = Resrouce::acquire();
let guard = defer || { resource.cleanup(); }

// code...

if something() {
    let foo = Foo::new();
    resource.do_something(foo);
} else {
    // guard fires
    drop(resource);
}

// code...
// guard conditionally fires

guard fires exactly when a Guard<&mut Resource>(resource) would drop, in this case a "drop flag" would be generated for whether or not resource got dropped (this is the current behavior for guard-type things though, it's not new, you can do this now with scopeguard and some printlns). In this case though:

let mut resource1 = Resource::acquire();
let mut resource2 = Resource::acquire();
defer {
    resource1.cleanup();
    resource2.cleanup();
}

// code...

if something() {
    let foo = Foo::new();
    resource1.do_something(foo);
} else {
    drop(resource1);
}

// code that still uses resource2

We would get a borrow error.

dlight · March 8, 2024, 1:29am

Defer is like a closure that is called inline, like (|| ..)() (an idiom common in other languages such as Javascript, but is also seen in Rust) but it is conceptually defined and called at the end of scope rather than immediately.

As such, defer || ..; is maybe of misleading, because it isn't merely defining something similar to a closure, but calling it at the end of scope.

But writing defer (|| ..)(); is kind of noisy and unpleasant.

So I think that ultimately the || that alludes to closures kind of don't fit here.

Maybe defer ..; or defer { .. }; would be better.

vojtechkral · March 8, 2024, 9:43am

Given that defer really is essentially a closure (thanks everyone for the clarifications), I think the easiest way forward is to use good ol' closure and pass it to a new built-in macro:

defer!(|| { some_resource.cleanup() });

borrowck could hopefully be updated to recognize the builtin and apply appropriate rules. Nicer syntax could be devised later, and I don't think this closes doors for a do-final syntax either if need be, I believe it could desugar to it.

matthieum · March 8, 2024, 5:16pm

I don't see defer as defining a closure so much as defer being passed a closure as argument.

Do note that even if the closure is called at the end of the scope, the variables it captures are defined at the moment it's defined.

What's the point of using a built-in macro instead of a (possibly contextual) keyword?

HeroicKatora · March 8, 2024, 7:15pm

Differentiating the two usual exit paths out of a block has more implications than the availability of the value. I think many interesting use cases occur if we treat defer as not binding like a closure. A closure is a value which can be moved and passed and as such it must be representable in the type system such that type/borrow/drop analysis can be local. The defer block, on the other hand, has the unique feature that it is not a value but specific to the (lexical) block it is declared in. Hence, I think we should also advantage of this fact as it allows uses that only work with local analysis; in particular an interesting feature would be to allow defer to interact with the drop-state of variables without 'capturing' them, such as not requiring them to be initialized before and after.

Regarding the value of a block, the pattern syntax as in this comment is an intringuing way to introduce, explicitly, a way to not only specify the bindings but also clarify the paths on which the defer should be ran? There are two different paths in value-return vs. unwind. Are these not also treated separately for the sake of liveness analysis?

To show one exiting possibility of moving out of and into a variable declared earlier, consider this:

fn defer_for_non_panic_intervention(file: File)
    -> Result<(), std::io::Error>
{
    let close_error: Result<(), std::io::Error>;

    // 'Value-defer', called with a result value when not unwinding.
    defer r /* : Result<(), std::io::Error> */ => {
        // Prefer existing errors over retained error of closing
        r?;
        // Use of this value requires it to be live on all value-exit paths.
        // See below why that is the case.
        close_error
    }

    { 
        // This block isn't technically needed, just here for demonstration.
        // Liveness analysis is bounded to a function, and defer-domination
        // should happen in the same way.
        let file = file;
        
        defer /* Always, even unwinding */ {
            close_error = file.flush();
        }
        
        file.write_all(b"Hello, world!")?;
    }

    // Here is dominated by the `defer` in the inner block.
    // `close_error` is initialized in it on all paths, it can be used.

    Ok(())
}

Here we always try to flush a file when unwinding, and ensure that value-delivering exits of the function properly take into account the potential error value of such a flush.

toc · March 8, 2024, 9:54pm

In the case of your "Value-defer", is the defer block receiving the value being returned? What if the scope of the defer only conditionally returns a value, something like

{
    defer |returnee| { finalize(returnee) }
    if a_bool {
        return 5;
    }
}

// ... other code ...

return 6;

Would returnee then be an Option<T>?

I expect that a lot of usages of defer will return, mostly implicitly via bubble. I would like this code to work:

fn fun() -> i32 {
    defer {
        return 7;
    }
    4
}

... and unconditionally return 7, preferably with a warning about discarding the 4. But it would be nice if adding a fallible defer block to a function didn't require fully restructuring that function to save potential return values. But perhaps this is why defer in some languages includes so many potential branches. From python:

do {
    fallible_operation()?;
} except e @ Err {
    // if there was an error
} else {
    // only if there was no error
} finally {
    // executed unconditonally
}

(which is not a proposal, but highlights some of the complexity)

HeroicKatora · March 9, 2024, 3:58am

Yes, I would intuitively regard return as a value for the function block as well. It should invoke that defer-transform to get to the actual value. However, blocks which did not have values in the actual code path are unwound without triggering the value-defers. I'm not entirely a fan of this implied part of the idea, since I can't answer how to teach this well.

fn bar(a_bool: bool) {
    if a_bool {
        // Never executed as `return` unwinds the `if` block without value.
        // It can be linted but oof. Note the type here is fixed by `if`.
        defer v => { () }
        defer { /* this would be executed */ }
        return 1;
    }

    2
}

Consequently, a value-defer inside a let-else clause never makes sense. The deferred block has semantically the type ! -> ! and must be unreachable. (A non-value-defer is of course still valid).

A labeled break should trigger the value-defer of the blocks that the label refers to in a similar fasion. This prompts the interesting question of loops, which I'd want to avoid bikeshedding. Not yet allowing defer in the loop bodies seems fine. Rather more intriguing, I think we might allow break inside the value-defer block to short-circuit a value for it. (It'de be sufficient to allow giving the block a label to be broken to. I'm sure of the readability benefit of requiring that pair of curly braces around the deferred block anyways). The semantics of return from within the deferred block should not be much more tricky outside the teachability aspect mentioned.

I don't think there's any significant complexity from the expected/else/finally emulation specifically. All of these are directly transferrable to an equivalent defer. However, as it interacts with blocks it definitely interacts with the try proposal. That is straightforward though, I hope. It decomposes semantically to standard blocks with labels anyways (sorry if I'm mistaken whether they Ok-wrap right now or not):

'block: {
    defer { /*finally, even in panic */ }

    defer v => { /* finally, except in panic */ v }

    defer v => {
        if let Err(e) = &v { /* if there was an error */ }
        else { /* only if there was no error */ }
        v
    }:

    let __v = match fallible_operation() {
        Err(e) => break 'block Err(e.into()),
        Ok(__v) =>  __v,
    }

    Ok(__v)
}

CAD97 · March 9, 2024, 8:52pm

It does seem like this is the most composable option. To specify behavior (a bit) more precisely:

defer $block runs $block on every exit edge of the containing scope (during drop glue cleanup). Place binding names are resolved at the span the block is written, but any borrow/ownership transfer happens individually on each exit edge.
- This "early bind, late borrow" behavior is actually possible today by defining macro_rules! macros -- if you shadow a name, a later invocation of the macro will still use the prior binding.
defer $pat => $block runs $block on every value-producing exit edge of the containing scope (during drop glue cleanup), binding the produced value to $pat and making the containing scope produce the value produced by $block instead.
- Lint when a value is produced to a containing block bypassing a value-impacting defer. Maybe cause an error?
- Definitely at least lint when value-defer is attached to a loop scope, since "run on last iter" behavior isn't super obvious. Likely cause an error; the behavior can be (almost^[1]) reproduced by { defer _ => {}; loop {} } instead.
- In the else of a let-else, would trigger the unreachable_code lint.

Interesting details / further questions:

Type inference for $pat. Without intervention would function like closure parameter type inference, so calling methods would be difficult ("type must be known before this point").
Should we allow fallible patterns for $pat (passing through an unmatched value unchanged)? That the block is conditionally run would suggest a "yes" answer.
Do we permit return in value-producing defer? It could mean to break from the defer or from the function.
- Along the same line, does ? target the defer or the containing function? They're equivalent for function-scoped defer, but not for inner block-scoped defer.
  - try retargets ? without retargetting return.
- break targeting an outer scope has no intrinsic reason to be prohibited.
  - Unlabeled break targeting the defer itself could be desirable and weakens need/desire for return to do so.

Can defer change the produced value's type? e.g.

if rand() { return String::new(); }
defer v => { v.to_string() }
return "";

How exactly does this work in async contexts?
- For nullary defer to run on cancellation (drop), it needs to be sync.
- For nullary defer to run on unwind (panic), either it needs to be sync or unwinds need to be async.
- Value-defer happens on normal edges, so should transparently be able to .await.
- It's potentially worth to have defer try {} / defer async {} as shortcuts for common usage.
  - defer async $block => defer v => { let () = (async $block).await; v }
  - defer try $block => defer v => { try { let v = v?; let () = $block; v } }

If somebody wants to implement this experimentally in the compiler to get some impl experience, spelling it as do defer is available syntactic space (cf. do yeet) that could then be surfaced as a defer! macro.

try blocks do already do Ok-wrapping (or rather, whatever the output variant of the inferred impl Try is, with Try::from_output), and it's FCP-finalized that they will. (Making try { x? } a semantic no-op, modulo type inference.)

If it wants to use bindings from inside the loop, placing the defer outside the loop doesn't have access to them, but inside does. ↩︎

NoamB · March 9, 2024, 11:06pm

I think that value defer can be implemented as a proc macro with the syntax defer!({block}, pat => {deferred}), by pasting deferred at every exit edge, inside a block with the original return value bound with pat.

CAD97 · March 9, 2024, 11:53pm

You should probably specify proc macro attribute if that's what you mean (reread says actually no), because if you just say "proc macro defer!" people will assume you mean for defer! to be a functionlike proc macro, which as stated earlier in the thread, cannot itself add this functionality. If you're already doing arbitrary rewriting with a proc macro, it's probably better not to put the "continuation" inside a functionlike macro, such that rustfmt continues to work reasonably.

Otherwise, yes, because of how macro_rules! hygiene capture works, it should be almost possible to define defer with macro syntax, with the caveats being

Doesn't compose with inner usage of macros (no eager macro expansion)
Defer and drop glue will almost certainly end up interleaved incorrectly compared to the "correct" order.
Only the value-transforming defer, not the unwinding defer (without the "fun" pseudo-inconsistent worldview caused by suspending the unwind... probably^[1])

You could almost do it without proc macros (with a macro introduced scope that defers target) except for that ouroboros self-referential macro-defined macros tend to break name resolution and/or cause an error along the lines of “note: ambiguous because of a conflict between a macro-expanded name and a less macro-expanded name from outer scope during import or macro resolution.”

Maybe you could get the correct unwinding state with let r = catch_unwind(|| $continuation); let _shim = OnDrop(|| $deferred); r.unwrap_or(resume_unwind)? ↩︎

Topic		Replies	Views
Pre-RFC: defer statement language design	16	3375	August 18, 2022
Pre-pre-pre-RFC: implicit code control and defer statements language design	8	1208	April 18, 2024
Blog post: Async Cancellation language design	10	5956	February 18, 2022
Blog post: A formulation for scoped tasks language design	20	2731	July 31, 2023
Can we reduce the burden of cancel-correctness for async Futures? language design	21	6843	September 7, 2019

A defer discussion

Related topics