A defer discussion

This example doesn't support your argument, because you've changed the behavior of the program. With do/final, the cleanup runs before "Use the variables", and with defer it runs after. If you had written the same program in both styles, they would also have identical scoping properties: "Use the variables" would have to move either inside the do block, or outside of the defer's containing block.

This is because defer is also tied to a block! It's just tied to its containing scope rather than introducing a new block itself. You can always convert a defer program to a do/final program without changing the scoping structure: wrap the region between the defer and the end of its block in do, and move the cleanup into final. There will, by definition, never be any code in the same scope that needs to run after the cleanup.
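
For illustration, the two forms side by side (hypothetical syntax on both sides, with made-up setup/work/cleanup functions):

setup();
defer { cleanup(); }
work();
// end of the enclosing block: cleanup runs here

becomes, with the same scoping structure:

setup();
do {
    work();
} final {
    cleanup();
}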


I do think there is an argument to be made for tying initialization and cleanup together. That's why we have Drop, after all. But this can be accomplished without the out-of-order nature of defer. The nesting pattern used by APIs like thread::scope puts the initialization and cleanup together in a higher-order function, with similar benefits to the way Drop ties cleanup to the type. (Notably defer does not quite manage this.)
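
For a concrete, compilable example of that nesting pattern, std::thread::scope ties spawning (setup) and joining (cleanup) together in one call:

use std::thread;

fn main() {
    let data = vec![1, 2, 3];
    // Every thread spawned on `s` is joined before `scope` returns,
    // even if the closure panics.
    thread::scope(|s| {
        s.spawn(|| println!("sum = {}", data.iter().sum::<i32>()));
    });
    // The thread has been joined, so `data` is freely usable again.
    println!("{data:?}");
}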

There are also ways to recover the ability to jump out of an "abstracted" do/final block like this. A crude one is to use a macro instead of a higher-order function. Alternatively, the closure could capture its control flow environment, with an implementation along the lines of Kotlin's inline fun. We could also deem this particular trifecta (initialization+cleanup together, not using Drop, and early-exit control flow) to be the realm of undroppable types, or not worth supporting at all.

(And, for those who still want to reduce rightward drift, there is syntactic sugar like Koka's with that captures the tail of a block following a binding, wraps it up in a closure with the binding as a parameter, and passes it to a higher-order function.)
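
Roughly, such sugar turns the remainder of the block into a closure argument. In Rust-flavored pseudocode (with_open and use_file are made-up names):

// hypothetical sugar:
with file <- with_open("log.txt");
use_file(&file);

// desugars to:
with_open("log.txt", |file| {
    use_file(&file);
});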

(NOT A CONTRIBUTION)

I think we should draw a distinction between tying these together with types and tying them together by putting them next to each other in code. I understand the reason for the former, but not really the reasons for the latter.

For example, when @matklad writes:

I am puzzled by this. Why would you want to put the precondition and postcondition next to one another and not where they actually apply? For example, with Hoare triples we don't write {P}{Q}C, we write {P}C{Q}.

I just don't see why this is even an advantage.

I didn't include this in my blog post, but in conversations with Eric Holk we called this the "with pattern": you write a higher-order function that wraps the closure in a do .. final block as a kind of poor man's destructor, for when the cleanup code can't be written with a destructor because it has effects. This is not always ideal, but it sort of works.

Preconditions and postconditions are part of the function signature. Both belong in the function's header for the same reason we write

fn foo() -> Bar {
    …
}

rather than

fn foo() {
    …
}: Bar

Except... the semantics differ.

let file_a = open(…).await?;
defer { file_a.flush().await?; }
let file_b = open(…).await?;
defer { file_b.flush().await?; }

This will flush file_a even if opening file_b fails, whereas the join approach -- whether with defer or do .. final -- will only flush the files if both opens succeeded.

In this case, it's probably fine -- there should be nothing to flush on open -- but in the general case of always needing to execute the defer action this won't fly.
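
For contrast, a sketch of the join approach in the same hypothetical syntax; the single cleanup is registered only once both opens have succeeded:

let file_a = open(…).await?;
let file_b = open(…).await?;
defer {
    // Never runs if either open failed.
    file_a.flush().await?;
    file_b.flush().await?;
}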

I would actually argue the other way around.

Just like try! and ? were provided as sugar on top of match, do! can be provided as sugar on top of defer, even by 3rd-party crates!
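
As a sketch, assuming a built-in defer statement (and naming the macro do_final! since do is a reserved keyword), such sugar could look like:

macro_rules! do_final {
    ({ $($body:tt)* } final { $($cleanup:tt)* }) => {{
        // Register the cleanup first; the hypothetical built-in `defer`
        // runs it when this inner block exits, i.e. after the body.
        defer { $($cleanup)* }
        $($body)*
    }};
}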

Therefore, if only one functionality should make it in rustc and have special-support, it makes more sense that it be the lower-level, more flexible one.

I didn't mean that you couldn't express the functionality provided by defer with a do .. final construct, but quite literally that you couldn't create a defer! macro on top of do .. final.

Except when it comes to Drop: defer is largely an extension of Drop, so in this sense it's nothing new under the Sun (in Rust land).

I don't understand the argument that "X can't be done with macros". Procedural macros annotating a function or similar block scope can more-or-less invent their own keywords and other syntax -- including rebracketing the whole following scope, so of course you can do the rewrite in either direction. They can even remove keywords. It's also a distracting side topic that isn't going anywhere.

re: Drop. There are two issues:

  1. Drop ordering can be confusing, especially with guards. But because it isn't explicitly scheduled near the constructor, it is uncommon to get it wrong. And even when there are problems with drop order, the solution is usually "add another block scope or explicit drop()". I don't see defer solving this issue, but it does create some new ones because:
  2. Drop is opt-out, defer is opt-in.

My criticism is that "optional destructors" is the prelude to many avoidable bugs.

Except that the analogy doesn't actually work.

The function's pre/post conditions do belong with the signature rather than its body: they ought to be part of the contract for the caller. Consider, for example, a trait method. defer statements would therefore be the wrong tool to express this.

defer statements, on the other hand, are part of the function body and are useful for wrapping inner implementation resources.

e.g.

fn foo() {
    let file = File::open(...);
    defer file.close();

    // use file here
}

Clearly, defer relates inwards rather than outwards in the common case.

My apologies, I was focused on the try! analogy, and did not consider that by macros you included procedural macros as a possibility, rather than the "usual" (for me) macros-by-example like try!.

You are technically correct that a procedural macro can introduce arbitrary syntax. In fact, one could probably write a procedural macro to enable defer today, or one to enable do .. final.

There's a big difference, though, between:

  • Defining do! as a quick macro-by-example building on top of a built-in defer.
  • Defining a procedural macro with which to annotate a function, which transforms inner uses of defer into do .. final, and may require additional transformations due to the additional block that do .. final introduces.

The latter is much more complicated, and slower to compile.

I do find it important to note that one built-in allows providing the other easily, and the other doesn't.

It makes the argument clear, for me, that defer is the better MVP, should it come to implementing one.

I understand the argument, but you're not presenting any solution for interruptible (async) or fallible (.flush) destructors.

Furthermore, while you can, today, use guard types to approximate defer, creating an instance solely for the side effect of running its destructor is a rather round-about way to go about it. Throw in the borrow-checking issues, and clearly defer would be most welcome.
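
For reference, a minimal hand-rolled guard in the style of the scopeguard crate; note that whatever the closure captures stays borrowed for the entire scope, which is exactly the borrow-checking issue mentioned above:

// Runs the stored closure when dropped, i.e. on any scope exit.
struct Guard<F: FnOnce()>(Option<F>);

impl<F: FnOnce()> Drop for Guard<F> {
    fn drop(&mut self) {
        if let Some(f) = self.0.take() {
            f();
        }
    }
}

fn main() {
    // `let _cleanup` (not `let _`) keeps the guard alive until scope exit.
    let _cleanup = Guard(Some(|| println!("cleanup at scope exit")));
    println!("body runs first");
}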

Hence, I think there's a place for deferred/final execution, separate from destructors:

  • To run async work.
  • To run fallible work.
  • To run ad-hoc work, rather than creating an instance just for that.

Because it's clearly a very difficult problem to solve, and I do not have any proposals for it. (Or I would have proposed them.) As I said before, my inclination here is to lean on the type system as in RAII. But this path usually leads to the unmovable/linear types discussion, and I have little else to add there. What's clear to me is that defer is substantially more error-prone than the type system route.

I partially agree with you.

First of all, I see defer as a more general feature. The presence of crates like scopeguard clearly shows there's a demand for the ability to "do something" upon exiting a scope, no matter how the exit happens. This kind of cleanup is both:

  • Typically ad-hoc, so writing a dedicated type for it is overkill.
  • Unfriendly to borrowing, so built-in support is necessary.

This is, for me, the primary motivation for a defer language feature, regardless of fallible/interruptible drops.

Secondly, I would note that defer is fully compatible with linear types, or even pseudo-linear1 types. That is, in the presence of defer, it's possible to express fallible & interruptible drops as functions that consume self and do their thing, then make the type linear (or pseudo-linear) and rely on the compiler (or runtime) to point out any error.
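
A sketch of that shape (Conn and close are made-up names); today nothing forces the call, but a linear (!Drop) Conn would turn a forgotten close into an error:

use std::io;

// Imagine `Conn` were linear (!Drop): the compiler would then require
// one of its consuming methods to be called on every path.
struct Conn;

impl Conn {
    // A consuming, fallible "destructor".
    fn close(self) -> io::Result<()> {
        Ok(())
    }
}

fn run() -> io::Result<()> {
    let conn = Conn;
    // ... use conn ...
    conn.close() // under linear types, skipping this would not compile
}

fn main() {
    run().unwrap();
}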

In fact, defer may be a necessary prerequisite to !Drop2 types since otherwise those types cannot be used in the presence of potential panics.

1 I call pseudo-linear types those types where the destructor panics, either conditionally or not, to remind users they really should use another method for destructing the type. Not perfect, but in the absence of linear types...

2 !Drop types are a subset of linear types where the user must choose between various ways of consuming each value, and for which a Drop implementation is thus not suitable, as it would represent a default choice that may be wrong. For example, an intermediate state in a state machine should likely be !Drop if executing the state machine to completion matters.

What does this mean?

It's a licensing/legal thing.

I find the traditional classification of try-finally blocks as control flow dubious; imo, without catch it's not really control flow anymore (the finally block simply always gets executed, and that's it). This is why the "rightward drift" feels unwarranted, and it's particularly annoying when the block is the whole function: that way you basically pay two indents for a function body for what seems like no good reason. For me this is by far the main reason I dislike the do { ... } final { ... } construct, particularly since I expect the whole-function use case to be the most frequent one. It feels like the syntax should optimize for that.

Rust has shown that you don't need an extra indent over some code just to handle errors well. It would be nice if we could do the same thing when it comes to scope cleanup.

To that end, maybe a modification could save the do { ... } final { ... } syntax: drop the do { ... } part. Just have finally { ... } and make it legal only at the end of a block. We already have syntactic rules for the last thing in a block (the last expression), so this would just be an extension of those.

Edit: Rewriting @CAD97's example from #12 using this suggestion, it would look like this:

let items = {
    let mut ptr = /* raw alloc *mut [T] */;
    let mut count = 0;

    while count < ptr.len() && let Some(item) = iter.next() {
        ptr.get_unchecked(count).write(item);
        count += 1;
    }
    if count != ptr.len() || iter.next().is_some() {
        panic!("bad ExactSizeIterator")
    }
    Box::from_raw(ptr)

    finally {
        if thread::panicking() {
            dealloc(ptr.cast(), Layout::for_val_raw(ptr));
        }
    }
};

(NOT A CONTRIBUTION)

I don't find that as objectionable as defer. There are two downsides that should be noted:

  1. You don't know a block has a final block until you look to the bottom of it; the do signals the beginning of a block with a final block.
  2. Logically it should run before destructors of the enclosing block, whereas final runs after the destructors of the do block. If you want it to run after destructors, you need another block anyway.
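
To illustrate the second point (hypothetical syntax, made-up guard value):

do {
    let guard = acquire_guard();
    // `guard` is dropped here, at the end of the do block...
} final {
    // ...so this runs after its destructor.
}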

Not sure how these should be weighed in deciding a syntax, just writing them down for future reference.

Yes, good point. Is there an advantage to having it run after dtors? I can't think of any at the moment; my intuitive preference would be to have it run before dtors.

This also highlights the difference that with the do-final syntax you can only access things outside of the do block, but panics are handled only for the inside. Basically, it's an (access XOR handle-panics) situation. The trailing finally block, on the other hand, is (access = handle-panics). Just wanted to point that out as well.

Now that I'm thinking about it a bit more, I think neither the do-final nor the trailing-finally solutions really work. Coincidentally, I'm currently writing test code that goes something like this:

{
    let docker = Docker::connect_with_socket_defaults()?;
    let container1 = docker.run("foo/foo").start().await; // FIXME: error handling
    let container2 = docker.run("bar/bar").start().await;
    let container3 = docker.run("baz/baz").start().await;

    tests.run().await;

    // FIXME: tear down and remove containers (requires async)
}

This is a fairly tricky case: these resources require async cleanup, their construction is fallible, and there are several of them.

With the do-final notation, we get something like this:

{
    // ...
    let container1 = docker.run("foo/foo").start().await?;
    do {
        let container2 = docker.run("bar/bar").start().await?;
        do {
            let container3 = docker.run("baz/baz").start().await?;
            do {
                tests.run().await;
            } final {
                container3.stop_and_remove().await;
            }    
        } final {
            container2.stop_and_remove().await;
        }    
    } final {
        container1.stop_and_remove().await;
    }
}

... if I understand the do-final proposal right, that is. It works, but the rightward drift gets pretty bad here.

With my trailing-finally suggestion we run into the uninit vars issue:

    let container1 = docker.run("foo/foo").start().await?;
    let container2 = docker.run("bar/bar").start().await?;
    let container3 = docker.run("baz/baz").start().await?;

    tests.run().await;

    finally {
        // container3.stop_and_remove().await; // may be uninitialized, what do???
        // container2.stop_and_remove().await; // ditto
        container1.stop_and_remove().await;
    }

I guess the compiler could generate an Option wrapper for these? Though that seems a bit magicky, idk if I like that at all.

With defer, we get something like this:

    let container1 = docker.run("foo/foo").start().await?;
    defer || container1.stop_and_remove().await;
    let container2 = docker.run("bar/bar").start().await?;
    defer || container2.stop_and_remove().await;
    let container3 = docker.run("baz/baz").start().await?;
    defer || container3.stop_and_remove().await;

    tests.run().await;

This seems relatively nice and clean, as long as people keep the defer statement right next to the binding it pertains to. @withoutboats is right to point out that there's nothing stopping people from sprinkling these defers all over the place, which would make them hard to trace and figure out.

Just for kicks, let's see what this would look like if defer were a .keyword on an expression, though I'm neither sure how I feel about this nor how offensive this is to others:

    let container1 = docker.run("foo/foo").start().await?.defer {
        container1.stop_and_remove().await
    };
    let container2 = docker.run("bar/bar").start().await?.defer {
        container2.stop_and_remove().await
    };
    let container3 = docker.run("baz/baz").start().await?.defer {
        container3.stop_and_remove().await
    };

I'm also not sure what the precise semantics should be wrt. variable access.

So yeah, this is pretty tricky; I can't quite say that I like one solution over the others, they all feel like tradeoffs.

I would note that a syntactically nicer solution for the do .. final case would be:

let mut containers = Vec::new();

do {
    containers.push(docker.run("foo/foo").start().await?);
    containers.push(docker.run("bar/bar").start().await?);
    containers.push(docker.run("baz/baz").start().await?);

    let (container1, container2, container3) = (&containers[0], &containers[1], &containers[2]);

    tests.run().await;
} final {
    // FIXME: use an unordered join if concurrency is desired.
    for container in containers {
        container.stop_and_remove().await;
    }
}

Still not as lightweight as defer, but much flatter and less boilerplatey than the original example. And further abstraction would help.

Refactoring again to essentially match drop flag codegen:

let (mut container1, mut container2, mut container3) = (None, None, None);
do {
    container1 = Some(docker.run("foo/foo").start().await?);
    let container1 = container1.as_ref().unwrap();
    container2 = Some(docker.run("bar/bar").start().await?);
    let container2 = container2.as_ref().unwrap();
    container3 = Some(docker.run("baz/baz").start().await?);
    let container3 = container3.as_ref().unwrap();

    tests.run().await;
} final {
    if let Some(container) = container3 {
        container.stop_and_remove().await;
    }
    if let Some(container) = container2 {
        container.stop_and_remove().await;
    }
    if let Some(container) = container1 {
        container.stop_and_remove().await;
    }
}

I would love to write let outer @ Some(ref inner) to bind both names at once, but you can't mix assignment to both existing and fresh binding names for good reason.

Actual drop glue codegen might bypass the need for drop flags, since it can emit different drop glue per exit, but IIUC rustc currently doesn't do this, instead always building IR using drop flags then potentially optimizing the drop flags back out. (This allows unified single-exit cleanup since the IR always does out place assignment.)

Using with sugar:

{
    let docker = Docker::connect_with_socket_defaults()?;
    with container1 <- start_stop_remove(docker.run("foo/foo")).await?;
    with container2 <- start_stop_remove(docker.run("bar/bar")).await?;
    with container3 <- start_stop_remove(docker.run("baz/baz")).await?;

    tests.run().await;
}

async fn start_stop_remove(container: Container, body: impl async FnOnce(&Container)) -> Result<(), DockerError> {
    let container = container.start().await?;
    do {
        Ok(body(&container).await)
    } final {
        container.stop_and_remove().await;
    }
}

Ok, but that's not quite the same, as you have to handle explicit state now. Whether that's a good thing I'm not sure. In this case I'll almost certainly end up with a holder for these resources, so maybe it's fine. Not sure in general, though.

Just for fun, if the compiler generated those Options for you (essentially exposing the drop flag), using the trailing finally syntax:

{
    let container1 = docker.run("foo/foo").start().await?;
    let container2 = docker.run("bar/bar").start().await?;
    let container3 = docker.run("baz/baz").start().await?;

    tests.run().await;
    finally {
        if let Some(container) = container3 {
            container.stop_and_remove().await;
        }
        if let Some(container) = container2 {
            container.stop_and_remove().await;
        }
        if let Some(container) = container1 {
            container.stop_and_remove().await;
        }
    }
}

(Though ironically this way there's more rightward drift of the finally block.)

Hm, I'm having a hard time figuring out what the capturing rules would be here, as the finally is propagated into the caller function...

I'm referring to the with sugar I linked in this post, from Koka. There are no new capturing rules; instead the remainder of the block containing the with is passed to start_stop_remove as a closure, and if you inline those calls you get your original do/final version.