Idea: `with` block to ensure definite resource release

Your example cannot prevent cases like the earlier example @Lej77 mentioned. If you put this in an async block and await after it, the async block itself can be forgotten, and thus the guard is forgotten too.

I believe that, if designed well, a with block can prevent those situations and help people actually ensure a resource will be released. Otherwise, if you indirectly forget a lock guard, the result can be a deadlock that is very hard to debug.
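
The async case can be reproduced without any runtime. The following is a minimal sketch, not code from the thread: `Guard`, `YieldOnce`, `NoopWake` and `guard_drops` are hypothetical names introduced only for illustration. It polls an async block once so that it is suspended while holding a guard, then forgets the future in safe code.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical guard: counts how often its destructor runs.
struct Guard<'a>(&'a Cell<u32>);

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Stand-in for any `.await` point: pending on the first poll, ready on the second.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        Poll::Pending
    }
}

/// Waker that does nothing; enough to poll a future by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls an async block holding a guard once, then either forgets or drops it.
/// Returns how many times the guard's destructor ran.
fn guard_drops(forget_future: bool) -> u32 {
    let drops = Cell::new(0);
    let mut fut = Box::pin(async {
        let _guard = Guard(&drops);
        YieldOnce(false).await; // the guard is alive across this suspension point
    });
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    assert!(fut.as_mut().poll(&mut cx).is_pending());
    if forget_future {
        std::mem::forget(fut); // safe code: the guard inside is never dropped
    } else {
        drop(fut); // dropping the suspended future drops the guard
    }
    drops.get()
}

fn main() {
    println!("forgotten: {}", guard_drops(true));
    println!("dropped:   {}", guard_drops(false));
}
```

The body of the async block never mentions `mem::forget`; the leak comes entirely from what the caller does with the future.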

thank you:)

I want to make an intermediate conclusion here to explain the with block more clearly:

Rationale:

A normal let statement and Drop cannot actually ensure that guards or resources are released; see, for example, the UB in std::thread::JoinGuard. This is also the reason why std::mem::forget is a safe function.
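
As a minimal sketch of why forgetting is considered safe: a destructor can be skipped both by calling `std::mem::forget` and, without calling it at all, by building an `Rc` reference cycle. `Guard`, `Node` and the two functions below are hypothetical names used only for this illustration.

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

/// Hypothetical guard: counts how often its destructor runs.
struct Guard<'a>(&'a Cell<u32>);

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// A node that can point at itself, forming a reference cycle.
struct Node<'a> {
    _guard: Guard<'a>,
    next: RefCell<Option<Rc<Node<'a>>>>,
}

/// Destructor runs after an explicit `mem::forget`: none.
fn drops_after_forget() -> u32 {
    let drops = Cell::new(0);
    std::mem::forget(Guard(&drops));
    drops.get()
}

/// Destructor runs after leaking through an `Rc` cycle: none either.
fn drops_after_rc_cycle() -> u32 {
    let drops = Cell::new(0);
    {
        let node = Rc::new(Node {
            _guard: Guard(&drops),
            next: RefCell::new(None),
        });
        *node.next.borrow_mut() = Some(Rc::clone(&node));
    } // `node` goes out of scope, but the cycle keeps the strong count at 1
    drops.get()
}

fn main() {
    println!("after forget:   {}", drops_after_forget());
    println!("after Rc cycle: {}", drops_after_rc_cycle());
}
```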

What I think is that if there is a special context (i.e. a with block) that forbids certain operations on guards and resources, then the guards and resources can be guaranteed to be released when the context exits. Because of the different context and the restrictions it imposes, a with block is different from RAII.

Current Methods:

Closure

Some newer APIs now use closures to ensure the guards will be released, such as std::thread::scope.

The similarity is that the inner context of a with block functions like the context inside the closure passed to std::thread::scope.
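
For reference, here is a small sketch of the closure-based style using `std::thread::scope`. The function `sum_halves` is a made-up example; the point is that every thread spawned inside the closure is joined before `scope` returns, so borrowing local data is allowed.

```rust
use std::thread;

/// Sums a slice on two scoped threads that borrow it directly.
fn sum_halves(data: &[u32]) -> u32 {
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|s| {
        // Both closures borrow non-'static data; `scope` joins the threads
        // before it returns, so the borrows cannot dangle.
        let l = s.spawn(|| left.iter().sum::<u32>());
        let r = s.spawn(|| right.iter().sum::<u32>());
        l.join().unwrap() + r.join().unwrap()
    })
}

fn main() {
    println!("{}", sum_halves(&[1, 2, 3, 4, 5]));
}
```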

The advantages of a with block are:

  1. It looks clearer than nested closures when managing multiple resources.
  2. Many old APIs do not use closures to ensure release, and they may also cause UB like JoinGuard did. With a with block, those guards can achieve the same context as a closure to ensure release.
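
To illustrate the nesting mentioned in point 1, here is a sketch of what managing two resources with a closure-based API might look like. `Resource` and `with_resource` are hypothetical and do not correspond to any existing library; the closure only ever sees a `&Resource`, so it cannot move the resource out and forget it.

```rust
use std::cell::RefCell;

/// Hypothetical resource that logs its own release.
struct Resource<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Resource<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("release {}", self.name));
    }
}

/// Hypothetical closure-based API: the resource is owned by this function
/// and released when `body` returns (or unwinds).
fn with_resource<'a, R>(
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
    body: impl FnOnce(&Resource<'a>) -> R,
) -> R {
    log.borrow_mut().push(format!("acquire {name}"));
    let res = Resource { name, log };
    body(&res)
}

/// Two resources need two levels of closure nesting.
fn nested_order() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    with_resource("a", &log, |a| {
        with_resource("b", &log, |b| {
            log.borrow_mut().push(format!("use {} and {}", a.name, b.name));
        })
    });
    log.into_inner()
}

fn main() {
    for line in nested_order() {
        println!("{line}");
    }
}
```

Each additional resource adds another level of indentation, which is the readability cost the proposal wants to avoid.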

Drop(RAII)

Drop cannot actually ensure that guards or resources are released; see, for example, the UB in std::thread::JoinGuard. So it actually relies on the programmer not to forget them. However, in some situations the forgetting is indirect, as in the async block examples. Using with blocks can prevent these situations.

Pin

Even though Pin prevents a value from being moved, it cannot prevent situations like the earlier async block examples, and the pointer to a pinned guard can still be forgotten.
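
A minimal sketch of the last point, using a hypothetical `PinnedGuard` type: pinning fixes the address of the value, but the owning `Pin<Box<_>>` is itself an ordinary value that safe code can forget.

```rust
use std::cell::Cell;
use std::marker::PhantomPinned;

/// Hypothetical guard that opts out of `Unpin`, so it cannot be moved once pinned.
struct PinnedGuard<'a> {
    drops: &'a Cell<u32>,
    _pin: PhantomPinned,
}

impl Drop for PinnedGuard<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Pins a guard on the heap, then forgets the pinning pointer.
/// Returns how many times the guard's destructor ran.
fn drops_after_forgetting_pinned() -> u32 {
    let drops = Cell::new(0);
    let pinned = Box::pin(PinnedGuard {
        drops: &drops,
        _pin: PhantomPinned,
    });
    // The guard never moves, but its memory is simply leaked.
    std::mem::forget(pinned);
    drops.get()
}

fn main() {
    println!("{}", drops_after_forgetting_pinned());
}
```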

It's worth noting though that unless with is required to be used with an API, Drop still cannot be relied upon for soundness, as the caller could just not use with.

It's for this reason that I don't think with adds anything meaningful to the language, as forcing the use of with seems impractical and I'm unsure whether it even provides the needed guarantee in the face of e.g. mem::swap.
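
To make the mem::swap concern concrete, here is a sketch under the assumption that the body of a with block would get `&mut` access to the guard (the thread does not specify this). `Guard` and `drops_after_swap` are hypothetical names.

```rust
use std::cell::Cell;

/// Hypothetical guard: counts how often its destructor runs.
struct Guard<'a>(&'a Cell<u32>);

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns (drops of the real guard, drops of the decoy guard).
fn drops_after_swap() -> (u32, u32) {
    let real_drops = Cell::new(0);
    let decoy_drops = Cell::new(0);
    {
        let mut real = Guard(&real_drops);
        // Assumption: this is the kind of access a `with` body would be given.
        let slot: &mut Guard<'_> = &mut real;
        let mut decoy = Guard(&decoy_drops);
        std::mem::swap(slot, &mut decoy);
        // `decoy` now holds the real guard, and it is an ordinary owned value.
        std::mem::forget(decoy);
    } // the variable `real` is dropped here, but it only contains the decoy
    (real_drops.get(), decoy_drops.get())
}

fn main() {
    println!("{:?}", drops_after_swap());
}
```

A destructor does run at the end of the scope, but it is the decoy's; the guard that was supposed to be protected is leaked.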

Would a with block also disallow the uses of std::process::exit, because this would not release the resources (if yes, how do you track the execution of this function inside of another function, or worse, a closure you received)?

How do you handle panic = "abort"? What about double panics? What about ptr::write?
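
For the ptr::write question, a minimal sketch (with a hypothetical `Guard` type) of what that function does to an existing value: the old contents are overwritten without their destructor being run.

```rust
use std::cell::Cell;

/// Hypothetical guard: counts how often its destructor runs.
struct Guard<'a>(&'a Cell<u32>);

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns (drops of the overwritten guard, drops of the new guard).
fn drops_after_ptr_write() -> (u32, u32) {
    let old_drops = Cell::new(0);
    let new_drops = Cell::new(0);
    {
        let mut slot = Guard(&old_drops);
        // SAFETY: `slot` is a valid, aligned, initialized `Guard`.
        // `ptr::write` overwrites it without dropping the old value.
        unsafe { std::ptr::write(&mut slot, Guard(&new_drops)) };
    } // only the new guard is dropped here
    (old_drops.get(), new_drops.get())
}

fn main() {
    println!("{:?}", drops_after_ptr_write());
}
```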

As @CAD97 already posted, the user could always not use the with block. I think that this behavior (if it is possible to implement with some defined behavior for the situations mentioned above) should rather be implemented by a trait (e.g. SurelyDropped). That trait would require a Drop impl and the compiler would then insert drop glue at some additional control points. Disallowing moves could be put into a supertrait, as I think that could be more general (maybe we should just leverage Pin for that).

To be clear, I think that this might be a good feature, my issue is that there are too many unclear situations where this could go badly (and a new control flow structure also seems to be the wrong tool for the job).

Two points I'm wondering about: Why would it need to introduce a block? How good is the comparison to Python given that its with primarily serves the role of destructors (that get invoked on exceptions) and that its ExitStack actively allows retroactively deciding against dropping as well as dropping before scope end?

The insistence on differentiating between block and scope is due to Rust's interesting and surprisingly refreshing take where blocks primarily return values; they have not inherently been the start and end of lifetime scopes since non-lexical lifetimes were enabled. Tying semantics to a block only really makes sense, imho, if there is something inherently semantically important about the return value. That doesn't seem to be the case here and, in my experience, the restriction to one return type is the one annoying thing about the scope construction; the other being that it's non-orthogonal to control flow.

From a pure design perspective, what kind of guarantee / invariant are we promising in the first place? It seems to narrow down to: 'this guard object's Drop is called before some lifetime 'a ends'. That can actually be encoded in a type:

use core::marker::PhantomData;

// Note: correct lt-variance is non-trivial
struct IsDropped<'a, T> { inner: T, _scope: PhantomData<fn(&'a ())> }

impl<'a, T> IsDropped<'a, T> {
    /// Safety: Caller promises to drop the returned value before
    /// the lifetime ends, by any means they wish.
    pub unsafe fn new(val: T) -> Self {
        IsDropped { inner: val, _scope: PhantomData }
    }

    /// Safe, but potentially useless constructor:
    pub fn with_static(val: T) -> IsDropped<'static, T> {
        IsDropped { inner: val, _scope: PhantomData }
    }
}

// Edit: there was DerefMut which definitely is _not_ sound.
// TBD: unsafe methods to access such as &mut IsDropped -> Pin<&mut _>

Such a type would have the advantage that the invariant can be observed by other code. If, at any point, we receive a reference to such an instance then we can rely on the sequencing of the scope relative to the given Drop. In particular, a guard might allow scheduling a callback into the guard (aka. ExitStack::callback) and the caller is able to rely on it being called before end-of-scope. Or, conversely, if IsDropped is a receiver type then the type T can provide methods that do so.

The following API should be enabled by such a type:

impl std::thread::Scope {
    // Via `with_static`, this safely subsumes `std::thread::spawn`.
    pub fn spawn(self: &IsDropped<'env, Scope>, …)
}

Now, how can we actually use this? The first thing that comes to mind would be a simple macro like pin!; but I'm almost certain there's some other way that I'm overlooking on this first take.

Try it out here

Usage, tl;dr this does not compile:

let value = Scope;
is_dropped!(value);
    
{
    let test = ();
    Scope::spawn(value, &test);
}

Here's the macro magic to make it work:

is_dropped!(value);
// … expand to something like:

struct InnerScope<T>(PhantomData<fn(T)>);
impl<T> InnerScope<T> {
    fn bind(&self, _: T) {}
}
impl<T> Drop for InnerScope<T> { fn drop(&mut self) {} }

let scope = ();
// This having Drop forces it to be dropped lexically, i.e. at end of scope.
// Which requires `scope` to be alive at that point.
// Which implies scope outlives `$value`.
let scope_guard = InnerScope(core::marker::PhantomData);
scope_guard.bind(&scope);

// Hide the name as a `&mut _`.
// which makes it impossible to move and thus forget.
let ref mut $value = unsafe { IsDropped::new($value) };
fn unify_lt<'env, T>(
    _: &mut IsDropped<'env, T>,
    _: &InnerScope<&'env()>,
) {}
unify_lt($value, &scope_guard);

Your IsDropped is contravariant over 'a; shouldn't it be invariant?

Also, the macro is not sound, you can still do:

let value = Scope;
is_dropped!(value);
// SAFETY: I drop the created value (but I don't drop the old value).
std::mem::forget(std::mem::replace(value, unsafe { IsDropped::new(Scope) }));

Yes, fixed the variance.

Hm, good question. The specific requirement as written isn't perfectly sound, but this feels like cheating by language-lawyering, not a true subversion. I don't think you could violate the guarantees with a 'safe' primitive? How about requiring that the scope lifetime be generative, since the scope variable can only be named within the macro?

I hope so (I tried and failed, but didn't put much thought into that).

But then the safety proof is non-local - not saying this is a blocker, but it makes it harder to reason about.

I think OP is confusing the guarantees for variables with those for values. A value is not guaranteed to have its Drop impl called when it becomes unreachable. For a variable, however, it is guaranteed that the Drop impl of its value will be called when the variable becomes unreachable while still holding a value.

But it is still possible for code within the variable's scope to move the value out of it, which makes the variable logically empty and prevents the destructor from being called at the end of its scope. I think it's desirable to be able to declare a variable as with let, but disallow moving out of it.
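
The variable-versus-value distinction can be sketched as follows, with a hypothetical `Guard` type. If the value stays in the variable, its destructor runs at the end of the variable's scope; if it is moved out, the variable is empty at that point and the destructor runs only when the new owner is dropped.

```rust
use std::cell::Cell;

/// Hypothetical guard: counts how often its destructor runs.
struct Guard<'a>(&'a Cell<u32>);

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns (drops seen right after the inner scope, drops seen after the
/// escaped owner is dropped).
fn drops_by_scope(move_out: bool) -> (u32, u32) {
    let drops = Cell::new(0);
    let mut escaped = Vec::new();
    {
        let guard = Guard(&drops);
        if move_out {
            escaped.push(guard); // `guard` is now logically empty
        }
    } // `guard` is dropped here only if it still holds its value
    let at_scope_end = drops.get();
    drop(escaped);
    (at_scope_end, drops.get())
}

fn main() {
    println!("kept in variable: {:?}", drops_by_scope(false));
    println!("moved out:        {:?}", drops_by_scope(true));
}
```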

That is also the original idea of this proposal.

Generally, I think the compiler should not inject code before panics and non-returning functions, even to prevent forgetting. That would kind of break a principle of Rust, I think. But I think releasing guards is maybe not very important when panicking, because the process may just be terminated, although this may break some guarantees. This behavior needs further discussion.

However, for non-returning functions like exit, I think there should be a compiler warning or error.

For ptr::write, I think ptr::write does not break the guarantees that a with block provides. Plus, ptr::write already depends on the programmer to determine whether the behavior is safe and sound.

And I think that, for compatibility reasons, old guards that only use a let statement should still compile. However, with can provide a more specific context and more checks.

This behavior is also found in Python and Java. In Python, objects are managed by both reference counting and a GC. Normally, if you do not close a file before the function returns, the file is actually closed when the function returns, just like RAII. In Java, however, if you do not use try-with-resources, the resource will just be forgotten with no assurance that it is closed, which is unlike Rust.

If you do not want/need the additional guarantee, then the approach from @HeroicKatora (with some small fixes) should give you exactly what you need, I created a gist that tries to fix the problems with the approach (I might have missed some other problems though...).

How do you want to track this? I could call a function not labeled with fn() -> ! that also never returns.

To prevent the use of mem::replace the with block would already need to only give access to Pin<&mut T> (requiring one to get rid of the Pin to use ptr::write), so I think ptr::write is not an issue here.
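
As a rough sketch of why `Pin<&mut T>` helps here: for a type that is not `Unpin`, safe code cannot obtain the `&mut T` that `mem::replace` needs. The safe way to put a new value into a pinned place is `Pin::set`, which runs the old value's destructor before writing the new one. `PinnedGuard` and `replace_through_pin` are hypothetical names.

```rust
use std::cell::Cell;
use std::marker::PhantomPinned;

/// Hypothetical guard that is not `Unpin`.
struct PinnedGuard<'a> {
    drops: &'a Cell<u32>,
    _pin: PhantomPinned,
}

impl Drop for PinnedGuard<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Returns (old drops right after `set`, new drops right after `set`,
/// new drops after the pinned slot goes out of scope).
fn replace_through_pin() -> (u32, u32, u32) {
    let old = Cell::new(0);
    let new = Cell::new(0);
    let (old_after_set, new_after_set) = {
        let mut slot = Box::pin(PinnedGuard { drops: &old, _pin: PhantomPinned });
        // `Pin::set` drops the old value in place, then stores the new one.
        slot.set(PinnedGuard { drops: &new, _pin: PhantomPinned });
        (old.get(), new.get())
    };
    (old_after_set, new_after_set, new.get())
}

fn main() {
    println!("{:?}", replace_through_pin());
}
```

This only covers replacement through the pinned reference; as discussed above, the owner of the pinned pointer can still leak it.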

This is true, but a macro can do the same.

The API in Java/Python needs to trust its users to call close. In Rust the API needs to trust the user not to call mem::forget/Box::leak. This is slightly different, in that it generally prevents the casual "oh, I forgot to clean up all of my resources and now my application performs poorly". Of course some users might still use mem::forget, but the chance is high that they read the docs and realize that it might not be a good idea. Malicious users will still be able to use unsafe, even within a with block, to do all kinds of things, so the only added benefit of adding such a construct (which I believe to be equivalent to the gist API) is:

  • better communicating in the API that the value needs to be dropped.
  • adding additional hurdles for users that accidentally use mem::forget.

Otherwise I do not see any benefits (and because the library solution is as good as a native lang construct solution, I do not think that creating this as a lang construct is a good idea, especially since the syntax is so similar [although adding it to stdlib would be a good idea, if we really want this]).

You make some good points, but I still think the macro version looks a little bit magic; I would rather use a closure than the macro.

However, I still have a question, maybe a historical one.

If we can trust users, why did we remove JoinGuard in the first place? In addition, why did we only remove JoinGuard and replace it with a closure, but do nothing about other guards?

Because there's a difference between them.

Forgetting to close a resource, leaking memory, etc. is a safe logic error. The program's behavior is still defined.

Failing to scope a scoped thread, however, is Undefined Behavior. If you fail to join a scoped thread, then the thread can have dangling references and do use-after-free.

If you prevent that, it's perfectly fine to detach a thread; this is exactly what thread::spawn does, and the 'static requirement ensures that any resources that it accesses will continue to exist.

A scoped thread, however, because it allows the use of non-'static resources, must be joined before the end of the lifetime to prevent Undefined Behavior.
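
A small sketch of the 'static side of this contrast; the function name `detached_len` is made up. The thread owns its data through an `Arc`, so detaching it by dropping the `JoinHandle` is fine. Borrowing a local variable in the `thread::spawn` closure instead would be rejected at compile time because of the 'static bound.

```rust
use std::sync::{mpsc, Arc};
use std::thread;

/// Spawns a detached thread that owns a share of the data.
fn detached_len() -> usize {
    let data = Arc::new(vec![1, 2, 3]);
    let (tx, rx) = mpsc::channel();

    let worker_data = Arc::clone(&data);
    // Dropping the `JoinHandle` detaches the thread; nothing forces a join.
    drop(thread::spawn(move || {
        tx.send(worker_data.len()).unwrap();
    }));

    // Our own handle can go away first: the thread's `Arc` keeps the Vec alive.
    drop(data);
    rx.recv().unwrap()
}

fn main() {
    println!("{}", detached_len());
}
```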

The behavior that Rust considers undefined is what Rust really cares about preventing, and refuses to just "trust the programmer" to uphold in the C manner, without an explicit unsafe acknowledgement of the proof burden. For behavior not considered unsafe, though, Rust does trust the programmer to not write logic errors. We definitely want the default pit of success to not fall into subtle traps which are defined but not unsound, and do a lot of thoughtful API design to assist in such, but at the end of the day preventing Undefined Behavior is the important part, and preventing logic errors can be compromised in the name of usability without being an academic proof assistant.

Your is_dropped macro is doing stack pinning, which guarantees dropping only when used in sync functions, not in async functions. This means it is unsound when used in async functions. It's the same problem as already discussed in "Idea: `with` block to ensure definite resource release" (#3, by Lej77) and in "Scoped threads in the standard library, take 2" (rust-lang/rfcs Pull Request #3151, by bstrie).

Cross-posting the zulip discussion.

It could be fixed by adding some mechanism to avoid holding a value over an .await, but that does not seem to be ideal, as it does not account for CompletionFuture (which would allow holding such a value over .awaits, as the caller will either cancel it using the cancel function or poll it to completion, in both cases the value would be dropped at the right moment).

I do not know if a with-block construct would be a better approach, because a type-based approach would integrate much better into the current ecosystem.