[Pre-RFC] Destructuring values that impl Drop

whitequark · June 22, 2019, 4:50pm

Feature Name: destruct_avoid_drop
Start Date: 2019-06-22
RFC PR: rust-lang/rfcs#0000
Rust Issue: rust-lang/rust#0000

Summary

Allow destructuring structs and enums that impl Drop through the use of an attribute.

Motivation

Drop is typically implemented on a container for some resource such that the resource is released at the end of the container lifetime. Often it is useful to extend the lifetime of the resource beyond the lifetime. For example, the methods:

Container<T>::into_inner(self) -> T, or
Container<T>::map<U>(self, f: Fn(T) -> U) -> Container<U>,

which are implemented on many (if not most) containers do so. In this case, the resource must be moved out of the container, and Container::drop() must become a no-op.

There are 4 general approaches to implement this:

If T: Clone, the resource can be copied, and the old copy destroyed. Although the semantics of this is similar to moving the resource, it is strictly worse, because it is less efficient, and T::clone() can panic, typically because creating the copy exhausts some finite reserve.
If T: Default, the resource in the container can be swapped with a new empty instance of the same kind of resource using core::mem::replace. Unless T::default() is expensive, which is unlikely, this is a good solution. However, many resources do not have a null value.
If T implements neither Clone nor Default, the resource can be wrapped in an Option<T>, which reduces this to case (2). This requires unwrapping the Option<T> each time it is used, and is particularly bad if the container implements Deref, because then each auto-deref incurs a branch, even in release mode. Even worse, this changes the representation of the type, and breaks #[repr(transparent)].
If the project specifies #![allow(unsafe)], then the container may be transmuted (using core::mem::transmute or a union) into a struct or enum with the exact same structure but which does not implement Drop. Then, the resource can be simply moved out. However, it is not possible to guard against introducing UB in the future: there is no way to statically assert that two given types have compatible representations.

To summarize, approaches (1) and (2) are not fully generic, approach (3) is not zero cost, and approach (4) uses fragile unsafe code to implement an obviously safe operation.

An unfortunate additional aspect is that the following (safe) code:

pub trait Cleanup {
    fn cleanup(&mut self);
}

pub struct Ref<'a, T: Cleanup + 'a> {
    inner:    &'a mut T,
    consumed: bool,
}

impl<'a, T: Cleanup + 'a> Ref<'a, T> {
    pub fn new(inner: &'a mut T) -> Self {
        Ref { inner, consumed: false }
    }

    pub fn into_inner(mut ref_: Self) -> &'a mut T {
        ref_.consumed = true;
        ref_.inner
    }
}

// impl<'a, T> Deref for Ref<'a, T> { ... }

impl<'a, T: Cleanup + 'a> Drop for Ref<'a, T> {
    fn drop(&mut self) {
        if !self.consumed {
            Cleanup::cleanup(self.inner);
        }
    }
}

has been incorrectly accepted by the pre-NLL rustc, and is going to be accepted in indefinite future by the NLL rustc, although it emits E0713.

Now, it would be quite nice if Ref above could be rewritten as follows:

pub struct Ref<'a, T: Cleanup + 'a> {
    inner: &'a mut T,
}

impl<'a, T: Cleanup + 'a> Ref<'a, T> {
    pub fn into_inner(ref_: Self) -> &'a mut T {
        let Self { inner } = ref_;
        inner
    }
}

impl<'a, T: Cleanup + 'a> Drop for Ref<'a, T> {
    fn drop(&mut self) {
        Cleanup::cleanup(self.inner);
    }
}

This is currently rejected with E0509: “cannot move out of type Ref<'_, T>, which implements the Drop trait”. The reason, of course, is that moving parts of ref_ out would make dropping it undefined behavior.

Since it is always safe to not invoke drop(), I propose that when a value that implements Drop is destructured, drop() is not invoked. I.e. the behavior is identical to the approach (4) above, except it is robust and requires no unsafe code.

Since it is highly undesirable to silently omit drop() calls, I propose that this only happens when an attribute #[avoid_drop] is specified on the value that is being destructured; in case of into_inner above, it will become:

    pub fn into_inner(ref_: Self) -> &'a mut T {
        let Self { inner } = #[avoid_drop] ref_;
        inner
    }

This issue comes up regularly (SO 1, 2, urlo 1, irlo 1, 2, 3, /r/rust 1, 2, blogs 1, 2).

Guide-level explanation

If a value implements Drop, its parts are not normally allowed to be moved out. However, this rule is a problem for containers that implement Drop, because it makes extending the lifetime of the contained object beyond the lifetime of the container inefficient. To address this, the container can be destructured using a let or match expression on a value marked with the #[avoid_drop] attribute:

// For a struct container:
fn into_inner(self) -> T {
  let Self { inner } = #[avoid_drop] self;
  inner
}

// For an enum container:
fn into_inner(self) -> T {
  match #[avoid_drop] self {
    Self::Full(inner) => inner,
    Self::Empty => panic!("empty container"),
  }
}

To allow the container implementation to evolve in a backwards compatible way without causing memory leaks, destructuring a value with #[avoid_drop] is only allowed in the same scope that can implement Drop for that value, i.e. in the same crate.

Reference-level explanation

An attribute #[avoid_drop] is added that is only valid directly on in one of the following positions:

On the let right hand side: let pat = #[avoid_drop] expr;,
On the match value: match #[avoid_drop] expr { pat => .. }.

When this attribute is specified, expr must have type T: Drop, where T is a struct or enum defined in the same crate where the attribute is placed, and pat (each arm in case of the match) must denote a struct or enum pattern respectively. Its effect is that, when/if the pattern matches, every field of the struct/variant is moved out, and the matched expression is forgotten. The fields that are not bound to any variable are dropped. (In other words, the compiler acts as-if T: !Drop for the duration of this let or match.)

Alternatively/optionally, since expression attributes are unstable and lack a roadmap to stabilization:

An attribute #[avoid_drop] is added that is only valid directly on in one of the following positions:

On the let statement: #[avoid_drop] let pat = expr;,
On the match statement: #[avoid_drop] match expr { pat => .. };.

(The rest is identical.)

Drawbacks

The use of #[avoid_drop] can introduce leaks, particularly if it is used together with a pattern that ignores remaining parts with .. and more fields are added later, together with the Drop code to discard them.
An attribute is added that changes semantics in major ways. (There is precedent for this: #[may_dangle]).
The language gains another special case.

Not really a drawback of this proposal, since it is orthogonal, but I’d like to emphasize the following point:

The #[avoid_drop] attribute allows moving out fields instead of calling drop(), but not in the drop call (which would be the so-called by-value Drop).

Rationale

Ignoring Drop is always safe according to the Drop contract.
Requiring unsafe or non-zero-cost code to perform an operation that is always safe is highly undesirable.
Moving out parts of values implementing Drop is common in container implementations, but nowhere else, and so it does not warrant any complex changes to Drop logic.
An attribute is a perfect fit for an ad-hoc special case in pattern matching.
The implementation is straightforward: ensure that the syntactic form is correct, suppress E0509, and mark the value as consumed instead of dropped.
The use of the attribute is restricted to the same crate to avoid leak hazards; see the section below.

Avoiding misuse

This SO question deals with the case of a hyper::client::Response object. (It’s not clear which version of hyper, so I’m using 0.9.5). In it, the Response struct is defined as follows:

pub struct Response {
    pub status: StatusCode,
    pub headers: Headers,
    pub version: HttpVersion,
    pub url: Url,
    status_raw: RawStatus,
    message: Box<HttpMessage>,
}

impl Drop for Response {
    fn drop(&mut self) {
        if /* complex condition elided */ {
            self.message.close_connection()
        }
    }
}

The SO question asks for a way to move response.headers out of Response. In principle, #[avoid_drop] could allow this, and if it did, it would be tempting to use it indeed. However, in that case, the Drop logic would be skipped, causing a leak.

Let’s consider a similar hypothetical situation with a following struct:

pub struct Response {
    pub connection: Connection,
    status: Status
}

impl Drop for Response {
    fn drop(&mut self) {
        self.connection.close()
    }
}

In this case, it looks like user code that invokes #[avoid_drop] would not cause any undesirable behavior–after all, such code probably specifically avoids closing the connection. But if a new private field is added, which must be cleaned up in drop() after all, a leak will appear.

Both of the hazards above as well as any other similar ones are avoided with a simple rule: only the code that can opt into Drop can opt out from Drop. This means restricting #[avoid_drop] to the same crate.

Alternatives

Not doing this, and suggesting people keep using Option<T> or unsafe code in all containers that need it.
Hiding the unsafe code in a library using custom derive (the drop-take crate does something similar, but does not allow implementing into_inner() above; it is probably possible to extend it to do so). This approach is inherently invasive to the user code, since it can only present a safe facade if it hijacks impl Drop and/or changes representation.
Any number of approaches (1, 2, 3, 4, 5) that radically change how Drop works. It is not clear that any of them are viable at all, but all of them are certainly very invasive to both the language and the compiler code.
Instead of an attribute, allowing destructuring of Drop types with a warn-by-default lint if they are also constructible (i.e. all fields are visible), and forbidding it if they are not.
Instead of an attribute, use a keyword, such as move.
Instead of an attribute, using something like a magic wrapper struct, with similar syntatic requirements.
Instead of an attribute or a magic wrapper struct, adjust ManuallyDrop to work like Box, that is make the compiler track the status of fields being moved out (see this example).

Prior art

See above in the Alternatives section.

Unresolved questions

Should #[avoid_drop] be an expression or statement attribute? An expression attribute reads better, but a statement attribute can be stabilized easier.
Should a lint be used instead of an attribute? Arguably, it solves every drawback, while being just as easy to implement.

Gankra · June 22, 2019, 5:14pm

Another alternative that accomplishes a similar result but in a more familiar way: just unconditionally allow destructuring Drop types, but add a warn-by-default lint for it.

Then any method which wishes to do it can #[allow(destructuring_drop)].

CAD97 · June 22, 2019, 5:25pm

As an alternate syntax, use move instead of #[avoid_drop].

I think allowing destructuring all Drop types, even with a strongly worded deny-by-default lint, isn’t desirable. With the exception of types where all members are visible. (This is a slight reformulation of “same crate”.)

I’m not sure what cross-crate cases would both have all pub items and a nontrivial Drop implementation, but they should be safe to skip the Drop of as the user necessarily knows everything about the type.

Whatever choice is made, if a .. is present it should definitely drop the individual parts, and just skip the drop for the external wrapper.

As two more options for manually implementing into_inner, someone can use ManuallyDrop or MaybeUninit as an "untagged Option" and mem::forget(self) to skip drop glue.

horrible, horrible (but actually kind of ok (not really)) idea I'm not gracing with a new post

Make ManuallyDrop work like Box in that you can do partial moves out of it but also allow doing so even when it wraps a drop type as it removes the drop glue from the type anyway

This gets you destructuring of any Drop type with the simple syntax of

let Thing {
    field,
    ..
} = *ManuallyDrop::new(thing);

whitequark · June 22, 2019, 5:26pm

I agree that this is an alternative (which I’ve added to the list), but I think it has a drawback: with your suggestion, adding Drop is not breaking enough. That is, if downstream code destructures some struct, and I implement Drop on it, I want all downstream code to loudly break instead of silently leaking. And a lint (even a deny-by-default lint!) would be capped, allowing downstream code to keep building.

Adding impl Drop for T probably should not break destructuring T downstream so that this serde use case stays viable.

whitequark · June 22, 2019, 5:43pm

Added as alternatives.

This is something I've considered, but it doesn't work as well on enums, since it's not possible to make a private variant.

Yes, this is explicitly the semantics I propose.

You can't do that in safe code because ManuallyDrop::into_inner requires moving ManuallyDrop itself, so you get the exact same problem.

You can do it in unsafe code, but there's a near-infinite amount of ways to do this in unsafe code (ptr::read and so on), and I don't want to enumerate them because this operation should not require unsafe in the first place.

RustyYato · June 22, 2019, 5:48pm

I don't think enums ever implement Drop in practice.

Gankra · June 22, 2019, 6:13pm

summary of offline discussion on downstream matchers needing to break:

“It’s pretty extraordinary for a type to suddenly gain a destructor, and for it to have public fields, and for people to be matching on it. Pretty scary for users to be able to poke at state when your type’s state is so important that it needs Drop”

“might happen for hyper Responses, but the important state would be added as private fields. Partial matches would still work in this case.”

“it would be very reasonable to deny destructuring a drop type if you don’t have permission to access all fields (aka permission to initialize it).”

so adjusted proposal:

hard error to destructure a Drop type you cannot initialize, and only a warn-by-default lint if you do have permission to initialize.

dtolnay · June 22, 2019, 6:37pm

I have a major use case for destructuring Drop types that is not met by this match #[avoid_drop] proposal. In serde_json we have:

pub enum Value {
    Null,
    Bool(bool),
    Number(Number),
    String(String),
    Array(Vec<Value>),
    Object(Map<String, Value>),
}

If someone creates a deeply nested Value, its compiler-generated Drop will overflow the stack—serde-rs/json#440.

The correct fix would be to implement Drop for Value in a way that uses a loop and heap-allocated stack to drop the whole thing without recursion. I have code for this. The problem is:

We cannot implement Drop for Value because it breaks the ability to destructure Value, which is an extremely common and important use case that needs to be supported;
We cannot implement Drop for Vec<Value> because it overlaps Vec<T>'s impl;
We cannot implement Drop for serde_json::Map<String, Value> because implementations of Drop cannot apply to a specialized choice of type parameters;
We cannot implement Drop for serde_json::Map<K, V> because that requires a recursive call to V’s drop so wouldn’t solve the problem.

I believe the stack overflows are not possible to fix today nor with this proposal. Instead of (or in addition to) the match #[avoid_drop] in this proposal I would need some attribute on Value or on its Drop impl that says the type is fine to destructure despite having a Drop impl.

whitequark · June 22, 2019, 6:43pm

After considering your proposal carefully, I realized that it is the only alternative so far that eliminates 2 out of 3 drawbacks entirely (since it would essentially be removing—partially—a special case, instead of adding one), and could eliminate the last remaining drawback by adding a second lint that warns on destructuring Drop types with a pattern that uses ...

whitequark · June 22, 2019, 6:50pm

Unless I'm mistaken, @Gankra's suggestion would make the code you want possible. Since Value can be constructed by any downstream code, it would also be destructible by any downstream code without any changes that have to be applied to every call site.

Under @Gankra's suggestion there will be a warning. Early on, this warning would be easy to turn off globally using #![allow(destructuring_drop)] (since you're tied to nightly anyway), and I imagine that before stabilizing destructuring_drop, we would add an attribute that disables the warning for one specific type. Kind of the inverse of #[must_use].

Earlier, I have said that I would like adding an impl Drop for T to specifically break downstream code destructuring T. Now I see that this is not so desirable. I think your use case is very interesting and definitely worth pursuing.

Centril · June 23, 2019, 12:25am

There's always #[non_exhaustive].

This is just #[non_exhaustive].

pnkfelix · July 1, 2019, 12:17pm

Here is another piece of potentially related work that I did not see on the (unfortunately quite tersely encoded) list in the Alternatives section:

RFC PR 1180: Propose Interior<T> data-type, to allow moves out of the dropped value during the drop hook.

(at least, to me the Interior<T> type seems like it would have the same structure as the set of struct fields in the interior of the Self { ... } pattern on the receiving end of #[avoid_drop] here)

mzabaluev · July 1, 2019, 12:56pm

I wonder if this could help take away some of the pain with implementing finalizer methods for containers of must_consume values that I have proposed: [Idea] Attributes to warn if value is dropped without being acted upon

whitequark · July 3, 2019, 10:44am

Thanks for the pointer, I wasn’t aware of that RFC.

My understanding is that the complexity of the linked RFC wasn’t justifying its benefits in light of ManuallyDrop, which is why it was closed. Does that sound right?

pnkfelix · July 5, 2019, 9:26am

Yes, that sounds 100% correct.

If I recall correctly, RFC 1180 focused mainly on the code in an impl Drop for .... But the case you are discussing here seems relevant to a much larger audience, and therefore we might be able to get more buy-in to spend time investigating a solution.

system · October 3, 2019, 9:32am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Pre-RFC: allow by-value drop ideas (deprecated)	12	3213	March 25, 2019
Destructuring Droppable structs language design	61	2584	October 13, 2024
Edition idea: Should any type that has drop glue be considered to impl Drop?	20	2368	June 9, 2021
Re-use struct fields on drop; was: Drop(&mut self) vs. drop(self) language design	33	4941	March 25, 2019
Drop bounds are useless, so let's change that language design	5	655	September 16, 2024