Pre-RFC: Leave auto-trait for reliable destruction

Indeed. Added panic/unwind semantics to "feedback requested" section.

FWIW, crossbeam's scope blocks until all of the spawned threads have terminated - I assume for safety reasons. Such behavior would be possible with the ImplicitDrop variant of this proposal -- perhaps that's a better direction. I'd like to understand the panic recovery use cases better; pointers appreciated.
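For reference, here's roughly what that blocking behavior looks like with crossbeam's thread::scope (a minimal sketch; error handling elided):

// crossbeam::thread::scope joins every spawned thread before returning,
// which is what makes borrowing `data` from the enclosing frame sound.
fn main() {
    let data = vec![1, 2, 3];
    crossbeam::thread::scope(|s| {
        s.spawn(|_| {
            // This thread borrows `data`; the scope guarantees the borrow
            // cannot outlive the data.
            println!("{:?}", data);
        });
    })
    .unwrap(); // all scoped threads have terminated by this point
}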

I like to think about this visually, so hopefully this gives more of an intuition. Consider the control-flow graph of a function, like the following (block bodies are MIR and not important):

we can see every function call (like json::encode above) has two possible outgoing edges, one if the function returns and an unwind edge if it panics. Notably, this is also true of destructors.

In the unwind path, implicit destructors are run, and unwinding continues up the stack.

We would therefore need something like defer which "submits" explicit destructor code to run in both the graceful return and unwind paths, to ensure that the value is always explicitly destroyed.
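For comparison, a Drop-based scope guard already runs code on both edges today - a minimal sketch below. The catch is that for a !Leave value this is exactly the mechanism we can't use, which is why a first-class defer that also counts as explicit destruction would be needed.

// A Drop-based guard: its closure runs on the normal return path *and*
// on the panic-unwind path, because drop glue is emitted on both edges.
struct Guard<F: FnMut()>(F);

impl<F: FnMut()> Drop for Guard<F> {
    fn drop(&mut self) {
        (self.0)();
    }
}

fn demo() {
    let _g = Guard(|| println!("runs on return and on unwind"));
    // ... code that may return early or panic ...
}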

5 Likes

One implication being that in this proposal, passing a !Leave value to Vec::push wouldn't work: the allocator could panic, causing push to drop your value on the floor, since it doesn't know how to destructure it.

A Vec::try_push API would work, because it returns a Result that hands you back your value in the Err case. This makes it your responsibility to handle the Err case and destruct the value yourself.

Of course, the compiler must know that no functions called by try_push can panic, or we could create the same situation as before! So there must be an assertion somewhere, either an unsafe one in try_push or on the signature of the functions it calls, that they won't panic.
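A minimal sketch of the shape such an API could have (try_push is hypothetical; it is written here as a free function over today's Vec):

// Hypothetical try_push: on failure the value is handed back in Err,
// so the callee never drops it implicitly.
fn try_push<T>(v: &mut Vec<T>, value: T) -> Result<(), T> {
    if v.len() == v.capacity() {
        // A real implementation would attempt a fallible reservation here;
        // this sketch simply refuses instead of allocating.
        return Err(value);
    }
    v.push(value); // cannot reallocate here, so no allocator panic
    Ok(())
}

fn main() {
    let mut v = Vec::with_capacity(1);
    assert!(try_push(&mut v, 1).is_ok());
    assert_eq!(try_push(&mut v, 2), Err(2)); // value returned intact
}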

2 Likes

This feels like the main foundational sticking point in the approach. It really scares me if using this would mean needing to make a new "can panic" effect that everything would need to be generic over, and it would even mean things like not being able to use addition when one of these types is in scope (as that can panic in debug mode).

8 Likes

Thank you for this proposal!

About the vec example:

for e in v { // OK: All values are returned from the iterator
    e.destroy(); // OK
    if roll_dice() { break } // Error: breaking leaves elements in the iterator
}

How does the compiler know that all values are returned from the iterator? After all, Iterator::next can call arbitrary code. It is my impression that the iterator would have to be destroyed explicitly, whether or not it contains elements – unless std::slice::Iter implements Leave for must-destroy types and performs a run-time check that the iterator is empty when dropped.

The other problem is panics. It seems to me that enforcing that a section of code can't panic is very difficult.

One solution is to always use defer statements, which are run even in the case of a panic.

Another solution is to have a #[no_panic] attribute for functions that mustn't panic and are very limited in functionality (no arithmetic except saturating_*/checked_*/wrapping_* functions, no allocations, ...).
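In hypothetical syntax, that could look like this (the #[no_panic] attribute does not exist; this is only a sketch of the restriction):

// Hypothetical #[no_panic]: the compiler would reject any operation in
// the body that can panic.
#[no_panic]
fn add_clamped(a: u32, b: u32) -> u32 {
    a.saturating_add(b) // OK: saturating_add can never panic
    // `a + b` would be rejected: it panics on overflow in debug builds
}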

It seems to me that enforcing that a section of code can't panic is very difficult.

Yes, that seems more complicated than necessary. I'm not suggesting that - apologies for the noise. Let me clarify: If a panic unwind occurs, and there is a live must-destroy type on the stack, the panic handler can:

  1. Abort the process (my initial proposal)
  2. Call drop (the ImplicitDrop and Leak variations)
  3. New: Add another default destructor for must-destroy types, such as destroy(&mut self)
  4. Some form of deferred expression, although I'm not suggesting this - my defer example was intended for normal (non-unwind) control flow.

How does the compiler know that all values are returned from the iterator? After all, Iterator::next can call arbitrary code.

Great point. Future will have the same problem when destroyed after Poll::Ready(..). It must be safely destroyed using a run-time invariant check, through a destructor. Possibly, such a destructor could be the same as the one used for panics, i.e. (2) or (3) from the paragraph above.

1 Like

Thanks for taking the time to write this up!

As already discussed on Zulip, I agree that especially the async cancellation situation is tricky and might lead to unexpected errors, and I'm looking forward to improving on this. However, I'm not 100% convinced that it's worthwhile to generalize this more.

I think for the synchronous function world, this RFC might not have that strong a motivation. The main one I can find in the text is that the extra scope for crossbeam would not be necessary - but I think this is a minor cosmetic issue, and it might not justify adding more complexity to the language. If there are other, stronger motivators, I would recommend adding them. Future-proofing for something isn't the best motivator, because more often than not we find out that the things we added for future-proofing aren't the things we need later on.

Questions

  • How does the compiler know which function destroys the object? Apparently all of destroy, fancy_destroy and async_destroy do it. Would there need to be an attribute (like #[destructor]) on the method? If yes, how much is this different from directly calling drop/async_drop, and just requiring that this must be used?
  • How do you get a !Leave future out of an async fn which doesn't take self? Would it be based on any parameter?
  • What is the panic behavior? This had more or less been pointed out in other questions, but it came up for me too.

Usage concerns

I'm a bit worried this adds another marker trait to the language. The experience with my team (> 20 developers, non-Rust-experts) is that those are hard to follow for people who are not that invested in the language (or in advanced programming languages in general). Send/Sync make sense to most after a few explanations. But e.g. Unpin and UnwindSafe are something that most users really don't want to be concerned with - not even in error messages. It's currently unclear to me how much forwarding of ?Leave or !Leave bounds this change requires.

In addition to this I think negative trait impls are currently also not stable (but my knowledge might be outdated). So I think this needs to also work with an inversion like Unpin/PhantomPinned - which makes things even trickier.

I also share the concern that this might be hard to use within loops, iterators and other control flow. I fought plenty of battles with the compiler, trying to tell it that I really re-populated variables after I used them in the last loop iteration or doing other things that the borrow-checker didn't really understand.

Overall

Overall I'm feeling this is an attempt to generalize over a couple of challenges and to do more clever static analysis via traits. However I feel like it might lean a bit too much into the "generalization" area, and might therefore not be very accessible to a large set of users. And as others pointed out, this generalization might also require additional generalization work (e.g. the panic "effect" that @scottmcm pointed out), and therefore require a substantial deep-dive to understand how everything fits together. Before doing this, it might be worthwhile to figure out which user problem this solves that can't be solved another way.

For the "unexpected async cancellation" challenge, there is the proposal of introducing async fns which run to completion, which level-sets them with synchronous functions. For synchronous as well as asynchronous code which must run - even if panics occur - defer blocks might be an interesting solution. While they won't provide the strong guarantees that static analysis via trait checking can provide, they seem to be well understood by most users.

For other challenges - we should identify them :slight_smile:

3 Likes

One example where this could be useful is the guard pattern. An API designer could implement !Leave for an RAII guard, to force the user to destruct the guard manually. This can prevent bugs where a guard is accidentally dropped too late.

It doesn't need to know. A function is a destructor if it takes self by value and doesn't return it. However, there's no compiler magic involved, a destructor just takes ownership of the value and destructures it. This prevents the Drop impl from being called.
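Putting those two points together, a sketch of the guard pattern under this proposal (the !Leave impl is hypothetical syntax, shown as a comment):

// A guard that must be ended explicitly by calling `commit`.
struct Transaction {
    id: u64,
}
// impl !Leave for Transaction {} // hypothetical opt-out

impl Transaction {
    // Destructor by convention: takes self by value and destructures it,
    // so the value is consumed rather than implicitly dropped. This only
    // works because Transaction has no Drop impl.
    fn commit(self) {
        let Transaction { id } = self;
        println!("committed transaction {}", id);
    }
}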

I guess that you'd have to hold a value that implements !Leave in the async function across an .await point.

I don't think this is a problem. The standard library can simply provide a ZST that implements !Leave, which can be added to other structs to opt out of Leave.
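Analogous to PhantomPinned, that could look like this (the name and the negative impl are hypothetical):

// Hypothetical marker ZST provided by std; embedding it would make the
// containing struct !Leave via normal auto-trait propagation.
pub struct PhantomNoLeave;
// impl !Leave for PhantomNoLeave {} // hypothetical

struct Guard {
    _no_leave: PhantomNoLeave, // opts Guard out of Leave
}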

1 Like

Is this a new mechanism you are proposing? Because it is not currently possible to use destructuring to destroy a Drop type without its Drop impl being invoked.

I wasn't aware of that, my bad. That means that a !Leave type can't implement Drop.

1 Like

Thanks for the detailed feedback @Matthias247.

Details

How does the compiler know which function destroys the object?

Through destructuring the type into its constituent parts (i.e. let Self { c1, c2 } = foo). Most commonly, through a method that takes self. No annotations are needed.

How do you get a !Leave future out of an async fn which doesn't take self ? Would it be based on any parameter?

Either drive it to completion locally or return the future back to the caller. Please give an example if I misunderstood your question.

What is the panic behavior? This had more or less been pointed out in other questions, but it came up for me too.

Explained here. The discussions have shown we need to tweak Leave or break it up.

General

Overall I'm feeling this is an attempt to generalize over a couple of challenges and doing more clever static analysis via traits. However I'm feeling like it might lean a bit too much into the "generalization" area, and might therefore be not very accessible to a large set of users.

There's a big difference between just "a new auto-trait" and "how invasive is the new trait". Unpin turned out to be quite invasive, which is problematic, whereas Sized is less invasive. I am only happy if we achieve minimal mental overhead, sane defaults and actionable compiler errors.

The main one from the text I can find is that the extra scope for crossbeam would not be necessary - but I think this is a minor cosmetic issue [...]

The extra closure is needed because the desired logic cannot be safely expressed in Rust's type system (due to leaks being allowed). This is not just a cosmetic limitation but a type system limitation, which is why scoped tasks (in async) can't be implemented today. Given the trajectory of async, not having static borrowing across tasks is a very real issue, with problematic workarounds: "Arc everywhere" is not zero-cost, causes boilerplate noise and has correctness pitfalls.

I'm a bit worried this adds another marker trait to the language.

Yes - don't worry, the "is it really worth it" discussion is unavoidable, independent of which proposal gets widespread scrutiny. My goal here is making this proposal as good as possible before that. Note that nobody, not even me, wants to merge this into Rust at this point. That's why I started this thread.

I also share the concern that this might be hard to use within loops, iterators and other control flow.

Yes, we need to see how it "feels" in real world code. Fortunately, we do have some precedent - uninitialized variables are the spiritual inverse of this feature.

For the "unexpected async cancellation" challenge, there is the proposal of introducing async fn s which run to completion.

Yes, and here's the link. This proposal deserves its own review. Very briefly, my perspective is:

  • Some futures are state machines.
  • Some state machines should not be dropped prior to reaching a terminal state.
  • In non-panic cases, a state machine that is not correctly driven to its terminal state should be able to cause compiler errors (much like must_use does - but stronger), requiring the attention of the programmer.

This proposal assumes that this is a general problem which deserves a general solution. It is not clear to me that a new future type with an unsafe interface that needs a parallel ecosystem of combinators is a less invasive change, but I'm not ruling it out either. I think more prototyping is needed, especially of this proposal.

1 Like

Updated proposal

Based on your feedback, the original proposal needs to be changed. Originally, I stated that Leave is a super-trait of Drop. This meant that no destructor was available for a !Leave type, even in the panic case, which in turn made unwinding hard to do.

Additionally, I realize I was trying to solve two related, but separate, problems:

  1. Some state machine types (like certain Futures) should not be arbitrarily dropped under normal non-panic control flow. Instead, they should be driven to their terminal state.
  2. Types that cannot leak, i.e. cannot live past the lifetime that rustc expects them to have.

Both of these could be auto-traits, and considered independently. These traits could be merged into a single trait which upholds the guarantees of both, but for the sake of discussion and separation of concerns I present them separately.

ImplicitDrop auto-trait

An !ImplicitDrop instance cannot be dropped regularly by going out of scope or through std::mem::drop(..). This forces the user to destruct the instance gracefully on all code paths, or, optionally, to force-destroy the value.

Purpose: Prevent accidentally breaking invariants of state machine types.

How to destruct an !ImplicitDrop type (see the sketch after this list):

  • Graceful: Destructure the type (typically through a fn foo(self, ..) function).
  • Forceful: Add an "explicit/force drop", e.g. std::mem::destroy<T: ?ImplicitDrop>(t: T) function:
    • destroy invokes drop as usual but suppresses the compile error.
    • destroy is safe, but should be considered potentially invariant-breaking. It can be invoked by a caller that has driven the instance to its terminal state (e.g. when an iterator is consumed or a future is ready).
    • destroy is invoked on the panic-unwind path, just like a regular drop.
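A sketch of both paths, assuming the ImplicitDrop variant (Connection is an invented example; std::mem::destroy is the hypothetical addition described above):

use std::net::{Shutdown, TcpStream};

// Imagine Connection were !ImplicitDrop: it must be closed explicitly.
struct Connection {
    socket: TcpStream,
}

impl Connection {
    // Graceful: consume self and destructure it, so the value never
    // reaches end-of-scope as a whole.
    fn close(self) {
        let Connection { socket } = self;
        let _ = socket.shutdown(Shutdown::Both);
    }
}

// Forceful (hypothetical signature; ?ImplicitDrop is proposed syntax):
// fn destroy<T: ?ImplicitDrop>(t: T) { /* drop as usual, error suppressed */ }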

Use cases: Futures, iterators and custom state machines that should be driven to completion before being dropped can force the user to consider all exit points. Generic containers offer ?ImplicitDrop impls if they can uphold this contract transitively.

Alternatives: There's some wiggle room for considering a lint instead of an error here. The problem with lints is transitivity in nested types and so on.

UPDATE: I was not aware that a type that implements Drop cannot currently be destructured (thanks @mbrubeck for pointing it out). We need to find a better definition of "accidentally prematurely dropping".

Leak auto-trait

Very simple: it is unsafe to leak a !Leak type. As such, !Leak types are not accepted by Arc, Rc, and other generic containers that can leak.

Leaking is defined as outliving the input lifetime.

Immediate use case: scoped threads and async tasks.

1 Like

For a similar reason, I think trait object types must have an implicit Leave bound by default, unless explicitly overridden with dyn Trait + ?Leave.

Otherwise, a type like Box<dyn Read> could own a !Leave type, so all existing trait object types would have to be treated as !Leave, breaking existing code.
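Sketched out, with the ?Leave relaxation shown as proposed syntax:

// With an implicit Leave bound on trait objects, this keeps compiling:
// the erased type is guaranteed droppable, so the Box can be dropped.
fn consume(reader: Box<dyn std::io::Read>) {
    drop(reader);
}

// Opting out would have to be explicit (proposed syntax, not real Rust):
// fn consume_any(reader: Box<dyn std::io::Read + ?Leave>) { ... }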

1 Like

@mbrubeck Yes, that sounds completely right (although I just updated the proposal and am no longer using Leave). I thought this was inferred by the compiler for auto-traits, but maybe I'm wrong?

Could you please give, or point to, an example of code that does not compile because a certain type involved implements the new !Leak?

So far it seems as if embedding PhantomData<&'a u8> into a type would be sufficient to not allow it to bubble up above the stack frame associated with lifetime 'a. This brief definition is probably obvious to people familiar with scoped threads, but on its own it looks quite opaque.

Update: .. what is input lifetime?

Sure, for instance:

struct Foo(u64);
impl !Leak for Foo {}

let foo = Foo(42);
let _ = Rc::new(foo); // Error: Rc<T> requires T: Leak, and Foo is !Leak
mem::forget(foo);     // Error: mem::forget requires T: Leak, and Foo is !Leak

As you can see, there's nothing magical - the implementations of Rc, Arc, mem::forget etc. simply cannot uphold the contract required by !Leak, so they won't let you. Note that they rely on unsafe code in order to cause leaks in the first place - it can't happen by accident in safe Rust.

Not so for Rc and Arc. Even if you set an input lifetime to 'a (say until end of the stack frame), you can still insert it into an Rc or Arc. Combined with a cycle, you can end up with a reference that points to non-valid data, but it's no longer reachable, so it is currently considered safe. (See safe_forget example)

When the whole scoped threads debate happened in 2015 one proposal was to restrict Arc and Rc to have T: 'static which would prevent any non-static references. It was rejected because it would constitute a breaking change. I personally think this was an unfortunate decision - it's why we're in this situation today. However, this was in 2015 and it also doesn't matter what I think, so here we are and I'm hoping we can fix it :slight_smile:

Update: .. what is input lifetime?

I may be using this term wrong (defined here) but what I mean is this:

struct Foo<'a>(&'a u64);
struct Bar;

{
  let n = 12; // 'a begins
  let foo = Foo(&n);
  Rc::new(foo); // input lifetime is 'a
  // create some cycle here..
  let bar = Bar;
  Rc::new(bar); // input lifetime is 'static for types that don't have references
} // 'a ends

Note that you could create a safe version of Rc<T: ?Leak> by adding the bound T: 'static.
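A minimal sketch of that idea, with the ?Leak relaxation elided since it isn't expressible today (StaticRc is an invented name):

use std::rc::Rc;

// T: 'static means the payload borrows nothing non-'static, so even if a
// cycle leaks it forever, no reference can outlive the data it points to.
struct StaticRc<T: 'static>(Rc<T>);

impl<T: 'static> StaticRc<T> {
    fn new(value: T) -> Self {
        StaticRc(Rc::new(value))
    }
}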

Also, Leak is slightly misleading, since we're not interested in preventing memory leaks per se, but rather preventing references that outlive a certain lifetime (even when those references are no longer reachable in single-threaded code).

What worries me is that Leak and ImplicitDrop would have to be automatically added to all trait bounds (like Sized) to not break existing code.

So a lot of APIs would have to be modified to support !Leak or !ImplicitDrop types. For example, struct Box<T: ?Sized> would have to be changed to struct Box<T: ?Sized + ?Leak + ?ImplicitDrop>.

3 Likes

What worries me is that Leak and ImplicitDrop would have to be automatically added to all trait bounds (like Sized) to not break existing code.

That's the case for auto-traits in general though, like UnwindSafe, Unpin, etc. And these traits can be considered separately – Leak delivers more value in the short term, and its contract is trivial.

Btw, the async-scoped crate illustrates perfectly the situation with scoped tasks in async:

  • The scope_and_block API uses the "crossbeam trick", of blocking the calling thread (i.e. should not be used in an async context), which provides safety but consumes an entire thread for each scope.
  • The scope_and_collect API does not, but it is exposed as unsafe (see Reddit discussion).

From the docs:

However, the user should ensure that the returned future is not forgotten [i.e. leaked] before being driven to completion.

A Leak trait would instead allow the compiler to enforce this guarantee.

2 Likes

I just created a "safe-leaks crate" which demonstrates different ways that data can be leaked in safe Rust today. It's also a challenge! If you know any other techniques, feel free to send a PR :slight_smile:
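For anyone following along, the canonical entry on that list is an Rc cycle, which leaks in entirely safe Rust:

use std::cell::RefCell;
use std::rc::Rc;

// A reference cycle: both strong counts stay at >= 1 forever, so neither
// destructor ever runs and both allocations are leaked.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

fn main() {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(a.clone())) });
    *a.next.borrow_mut() = Some(b);
    // `a` and `b` go out of scope here, but the cycle keeps them alive.
}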

No easy way to test, but if you allow 'static values as well, there are a few more techniques that can be used - including passing ownership of the value to a different thread and letting that thread spin/be parked until the end of the program (at which point the thread will be killed without dropping anything).

1 Like