Idea: `with` block to ensure definite resource release

Rust currently uses Drop to manage automatic resource release. However, it has been shown that this mechanism cannot protect against memory leaks or forgotten guards.
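To make the concern concrete, here is a minimal sketch of a guard whose destructor is skipped by `std::mem::forget`. The `Guard` type and both helper functions are hypothetical names used only for illustration.

```rust
use std::cell::Cell;
use std::mem;

/// Hypothetical guard that records its release in a shared flag.
struct Guard<'a> {
    released: &'a Cell<bool>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.released.set(true);
    }
}

/// Reports whether the destructor ran when the guard simply went out of scope.
fn released_at_scope_end() -> bool {
    let released = Cell::new(false);
    {
        let _guard = Guard { released: &released };
    } // `_guard` is dropped here
    released.get()
}

/// Reports whether the destructor ran after the guard was passed to `mem::forget`.
fn released_after_forget() -> bool {
    let released = Cell::new(false);
    {
        let guard = Guard { released: &released };
        mem::forget(guard); // safe code, but `drop` is never called
    }
    released.get()
}
```

Under these assumptions the first function observes the release and the second does not, even though neither uses `unsafe`.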

Many garbage-collected languages use with blocks to ensure that resources are released when the block exits.

It occurs to me that if we introduced a with block in which resources or guards can only be allocated on the stack and cannot be moved, it might ensure that they are never forgotten and are definitely released when the block exits.

Example: some APIs already use closures to ensure that resources or guards cannot be forgotten, such as std::thread::scope. It looks like:

thread::scope(|s| {
   // Doing things
});

With `with` blocks, it could be written like:

with let s = thread::scope_guard() { // `scope_guard` is a hypothetical name, and the binding syntax is hypothetical too.
  // Doing things
}

// The scope_guard only lives within the block.
// Inside the block users cannot move it, so it cannot be forgotten.
// After exiting the block, the scope_guard is released.
//
// We could also add a new resource management trait/marker to ensure that a guard-like object can only be created through a `with` block.

With `with` blocks, we could use fewer closures in these scenarios, and maybe give the compiler more information, making optimization easier.
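For comparison, here is a small runnable sketch of the closure-based API as it exists today in std. The function name `sum_halves` is made up for this example; `std::thread::scope` and `Scope::spawn` are the real std items.

```rust
use std::thread;

/// Sums a slice using two scoped threads that borrow from the caller's stack.
fn sum_halves(data: &[u32]) -> u32 {
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|s| {
        // Both threads borrow `left` / `right`; no `'static` bound is needed.
        let a = s.spawn(|| left.iter().sum::<u32>());
        let b = s.spawn(|| right.iter().sum::<u32>());
        a.join().unwrap() + b.join().unwrap()
    })
    // `thread::scope` returns only after every spawned thread has been joined.
}
```

The caller never owns a guard value here, so there is nothing that could be handed to `mem::forget`; that is the property a `with` block would have to reproduce.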

I didn't find a similar proposal on the internals forum or in the GitHub issues; sorry if I am proposing this again. This is an immature idea, but I think it could be useful.

What this is doing is basically just stack pinning, a la pin_mut!.

It's worth noting that if you have

fn example(f: impl FnOnce()) {
    setup();
    f();
    teardown();
}

example(|| do_stuff());

this can be written to the same effect as

macro example() {
    setup();
    let _defer = OnDrop(|| teardown());
}

example!();
do_stuff();

thanks to hygiene.
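A runnable approximation of that pattern, using stable `macro_rules!` instead of the `macro` item syntax: `OnDrop` is a hypothetical helper defined here, and the macro takes a log argument purely so the order of events can be observed.

```rust
use std::cell::RefCell;

/// Hypothetical helper: runs the stored closure when dropped.
struct OnDrop<F: FnMut()>(F);

impl<F: FnMut()> Drop for OnDrop<F> {
    fn drop(&mut self) {
        (self.0)();
    }
}

/// Expands to the setup step plus a hidden guard in the caller's scope.
/// Hygiene means the caller cannot name `_defer`, so it cannot move it.
macro_rules! example {
    ($log:expr) => {
        $log.borrow_mut().push("setup");
        let _defer = OnDrop(|| $log.borrow_mut().push("teardown"));
    };
}

/// Records the order in which setup, body, and teardown happen.
fn run() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        example!(log);
        log.borrow_mut().push("do_stuff");
    } // the hidden `_defer` guard is dropped here
    log.into_inner()
}
```

In a synchronous function the teardown runs when the enclosing block ends, just as it would with the closure-taking `example` function.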

In general, stack locals are guaranteed to have their destructors run before their memory is reused, so long as they aren't moved from. Introducing a with isn't meaningful since Drop is already consistently called.
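As an illustration of that guarantee, the sketch below logs destructor calls for two locals, with and without an early return. `Noisy`, `work`, and `trace` are hypothetical names.

```rust
use std::cell::RefCell;

/// Hypothetical local that logs its name when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn work(log: &RefCell<Vec<&'static str>>, fail: bool) -> Result<(), ()> {
    let _a = Noisy { name: "drop a", log };
    let _b = Noisy { name: "drop b", log };
    if fail {
        return Err(()); // both locals are still dropped on this path
    }
    log.borrow_mut().push("body");
    Ok(())
}

/// Collects the events produced by one call to `work`.
fn trace(fail: bool) -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    let _ = work(&log, fail);
    log.into_inner()
}
```

Locals that are never moved are dropped in reverse declaration order on every exit path, which is the behaviour a `with` block would otherwise have to provide.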

The value of with in a GCd language is that with a GC the drop equivalent (usually called collect) isn't called at the end of scope, it's called sometime later when the GC collects the object. with effectively introduces an RAII scope to call a cleanup function deterministically at exit.

Plus, with would only be useful for soundness if it were required to be used. I don't have a proof, but I suspect that the scoped thread API would be sound if it used Pin to guarantee that the scoped thread arena is dropped rather than a closure interface.

Pin does have its Drop guarantee so it can be used in this way sometimes. Unfortunately the guarantee is not strong enough for APIs such as thread::scope since it only guarantees that data owned by the pinned pointer won't be invalidated before the pointer is dropped. It doesn't guarantee that borrowed data will remain valid until the pointer is dropped.

In the following code, for example, it would not be sound to give other threads access to the `borrow` reference on the assumption that we could join them in the Drop impl:

use core::marker::PhantomPinned;
use core::pin::Pin;

struct Guard<'a> {
    value: u32,
    borrow: &'a u32,
    pinned: PhantomPinned,
}
impl<'a> Guard<'a> {
    pub fn new(value: u32, borrow: &'a u32) -> Self {
        Self {
            value,
            borrow,
            pinned: PhantomPinned,
        }
    }
    pub fn start_background(self: Pin<&mut Self>) {
        // Send pinned values elsewhere (such as another thread)
        // We could send a pointer to `value` but not to `borrow`.
        println!("Start background")
    }
}
impl<'a> Drop for Guard<'a> {
    fn drop(&mut self) {
        // Drop guarantee ensures this is called before `Self` is deallocated.
        // It doesn't guarantee that this is called before `borrow` becomes invalidated since it is not owned by `Self`.
        println!("might never run if `Self` is forgotten");
    }
}

fn main() {
    let value = 2;
    let mut guard = Box::pin(Guard::new(4, &value));
    guard.as_mut().start_background();
    // Drop is never called (but we don't free the guard, so we don't break the "drop guarantee")
    core::mem::forget(guard);
}

fn alt_main() {
    let value = 2;
    let scope = async {
        // This could be in a separate async function:
        let guard = Guard::new(4, &value);
        tokio::pin!(guard);
        guard.start_background();
        // Yield:
        futures::future::pending::<()>().await;
    };
    let mut scope = Box::pin(scope);
    // Run to the yield point of the scope:
    futures::FutureExt::now_or_never(scope.as_mut());
    // Ensure the drop impls after the yield point aren't run:
    core::mem::forget(scope);
}

I did propose a macro/stack-pinning approach in the scoped thread RFC (link to the comment) but as @Lej77 said it was found to be unsound due to stack pinning not guaranteeing locals to be dropped in async blocks and functions.

I think introducing a with block could provide a stronger guarantee: ownership of the acquired resources/guards could not be transferred, so they could not be passed to forget or any similar function.

Ah, and the same would apply to a macro_rules!-based approach. The extra behavior a closure gives you is that the enclosed code is no longer async and thus cannot .await, which it could if it relied on drop glue emitted by a macro (effectively, pinned by hygiene).

What is the difference between

with let s = thread::scope_guard() {
    // ...
}

and

let s = &mut thread::scope_guard();
// ...

Semantically, with let x ... can limit the operations allowed on x. For example, x cannot be moved, so you cannot transfer it to the heap or pass it to std::mem::forget, because that takes its argument by move.

Note the &mut in my second example. Because the value is immediately put behind a reference, it cannot be used by-value either.

If you want to prevent forget(mem::take(s)), well... now every API which takes &mut T cannot be used with a with let value. (And then I ask, let s = &expr;.)
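A minimal sketch of both halves of that argument. `Guard` is hypothetical, and it derives `Default` only so that `mem::take` can be applied to it; a guard type without `Default` could still be swapped out with `mem::replace` or `mem::swap` if another value of the type can be constructed.

```rust
use std::cell::Cell;
use std::mem;

/// Hypothetical guard; the flag is optional so that `Default` can be derived.
#[derive(Default)]
struct Guard<'a> {
    released: Option<&'a Cell<bool>>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        if let Some(flag) = self.released {
            flag.set(true);
        }
    }
}

/// Reports whether the original guard's release was observed.
fn take_and_forget() -> bool {
    let released = Cell::new(false);
    {
        let s = &mut Guard { released: Some(&released) };
        // mem::forget(*s); // rejected: cannot move out of `*s`, which is behind a mutable reference
        // But the value can be swapped out through the reference and then forgotten:
        mem::forget(mem::take(s));
    } // only the empty replacement guard is dropped here
    released.get()
}
```

So the immediate `&mut` blocks direct by-value use, while `&mut` access still lets the real guard escape unless every `&mut T` API is forbidden as well.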

A with block is different from that example: because with is a block, you cannot intersect the variable as in the second example.

Sorry if this hypothetical syntax may be confusing.

In python, with block is like

with GetResource() as res:
    # do stuff with res

so I used let in the hypothetical syntax to bind the variable, but it is not a variant of the let statement. It is actually more like:

with Variable_Initialization {
  // do stuff
}

How do you uphold those guarantees though? I don't see how that would be possible while allowing .await inside with blocks. The problem is that while you can not mem::forget the resource created in the with block you can mem::forget the whole Future that contains it, thus forgetting the resource in the process.
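A dependency-free sketch of that scenario, assuming a hypothetical `Guard` type and a do-nothing waker (`NoopWaker`) built on the std `Wake` trait: the future is polled once so that the guard exists inside it, and then the whole future is forgotten.

```rust
use std::cell::Cell;
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Wake, Waker};

/// Hypothetical guard that records its release in a shared flag.
struct Guard<'a>(&'a Cell<bool>);

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.0.set(true);
    }
}

/// Waker that does nothing; enough to poll a future by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Reports whether the guard inside the suspended future was ever released.
fn forget_suspended_future() -> bool {
    let released = Cell::new(false);
    let mut fut = Box::pin(async {
        let _guard = Guard(&released); // never moved by the async block itself
        std::future::pending::<()>().await; // suspend while the guard is alive
    });
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    // Run the future up to its first suspension point.
    let _ = fut.as_mut().poll(&mut cx);
    // Forget the whole future: the guard stored inside it is never dropped.
    std::mem::forget(fut);
    released.get()
}
```

The guard is never moved or named outside its own block, yet its destructor does not run, which is why restricting moves of the guard alone is not sufficient.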

Maybe forbid .await inside with? Otherwise, I think the std::thread::scope API may suffer from this problem too.

I think that's a typo; I don't know what it means to "intersect" a binding.

Then at that point, what's the advantage over introducing a closure? A closure naturally restricts access to .await, as well as any other future control flow which could potentially cause issues.

Could you not wrap any object guarded by with in ManuallyDrop, defeating the Drop guarantee introduced by with? Of course you could detect such usage and emit an error, but then you would also need to traverse newtypes.
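A short sketch of that objection using ordinary let bindings, since the with syntax is hypothetical. `Guard` and `Wrapper` are made-up types; `ManuallyDrop` is the real std wrapper.

```rust
use std::cell::Cell;
use std::mem::ManuallyDrop;

/// Hypothetical guard that records its release in a shared flag.
struct Guard<'a>(&'a Cell<bool>);

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.0.set(true);
    }
}

/// Hypothetical newtype that hides the `ManuallyDrop` one level down.
struct Wrapper<'a>(ManuallyDrop<Guard<'a>>);

/// Reports whether a guard wrapped directly in `ManuallyDrop` was released.
fn released_when_wrapped() -> bool {
    let released = Cell::new(false);
    {
        let _wrapped = ManuallyDrop::new(Guard(&released));
    } // the wrapper goes out of scope, but the inner `drop` is not called
    released.get()
}

/// Same check, with the `ManuallyDrop` hidden inside a newtype.
fn released_when_wrapped_in_newtype() -> bool {
    let released = Cell::new(false);
    {
        let _wrapped = Wrapper(ManuallyDrop::new(Guard(&released)));
    }
    released.get()
}
```

A check for this would have to look through every field of every wrapper type, which is the newtype traversal mentioned above.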

I do not really see a use-case for this. When you really need to defer some teardown, creating a zero size struct/closure wrapper with the teardown in the Drop impl is a good way to handle this. It is already used in many places and provides everything that with would also provide.

Also, when do you come across a situation where you want something to be dropped but use mem::forget anyway? I rarely use mem::forget (especially not when I am handling resources that need dropping), so I am curious about your use case.

There are currently a lot of resources and guards that do not use closures, so they may be forgotten directly, or indirectly as in the previous examples. Introducing a with block can help users mark the ones they do not want to be forgotten.

As in the previous examples, forgetting and memory leaks may happen indirectly. with can provide closure-like guarding for all resources.

However, closure wrappers can be hard to read when managing multiple resources, and hard for the compiler to analyze. For example, if you have three guards, it may look like:

guard_a(|x| {
  guard_b(|y| {
    guard_c(|z| {
      // do real stuff here
    })
  })
});

but with a with block, the guards can be initialized in one line:

with let x = guard_a, let y = guard_b, let z = guard_c {
  // do stuff here
}

That's why I think with can be better than closures.

So what is meaningfully different with using

let x = &mut guard_a();
// which is just a shorter version of
let mut x = guard_a();
let x = &mut x;

This also ensures that the value is only used by reference.

I could not find such an example. Could you maybe show some running Rust code that "indirectly" mem::forgets a value? I have never come across such a situation, so I find it really hard to imagine how it happens.

Not if you use a macro that simplifies it:

macro_rules! with {
    (let $var:ident = $guard:ident => { $($inner:tt)* }) => {
        $guard(|$var| {
            $($inner)*
        })
    };
    (let $var:ident = $guard:ident , $($tail:tt)*) => {
        $guard(|$var| {
            with!($($tail)*);
        })
    }
}
#[derive(Debug)]
struct GuardA;
#[derive(Debug)]
struct GuardB;
#[derive(Debug)]
struct GuardC;

// Each guard function creates its guard and lends it to the closure.
fn guard_a(f: impl FnOnce(&mut GuardA)) {
    f(&mut GuardA)
}
fn guard_b(f: impl FnOnce(&mut GuardB)) {
    f(&mut GuardB)
}
fn guard_c(f: impl FnOnce(&mut GuardC)) {
    f(&mut GuardC)
}

fn main() {
    with! { let x = guard_a, let y = guard_b, let z = guard_c => {
            // do stuff here
            println!("{x:?}, {y:?}, {z:?}");
        }
    }
}

In Python, the with statement means exactly the same as creating a local and dropping it at the end of the block scope in Rust, it's just (superficially) more complicated, and the reason it's necessary is that the language doesn't guarantee deterministic destruction like Rust does universally.

I fail to see how adding different syntax for the exact same purpose would help — there have been many such "this syntax would be cute/I like it better/it is familiar to Language X users" proposals in the past, but this argument doesn't nearly reach the threshold for changing the core language. If you like the Python syntax better, feel free to use Python, just do not expect Rust to be changed solely based on what other languages do.

I feel this is a bit harsh; @xjkdev has multiple times expressed that they believe that with scoping would be semantically different from a plain let binding. Whether the semantic difference is real or meaningful is a separate question, but @xjkdev is not proposing different syntax for the same purpose, they are proposing something they believe is an extension of what can be semantically expressed.

(As my posts show, I disagree with this understanding, and believe that taking an immediate &mut reference achieves the exact benefit they're after. But the point stands that they aren't just proposing sugar.)
