pre-RFC: the Destruct trait

This the second RFC regarding to the Drop trait. The first one is here.


Summary

This rfc introduce a new trait Destruct to move some of the Drop duty into itself, to allow custom pattern match to move fields out.

The trait is defined as the following:

trait Destruct {
    #[match]
    fn destruct(self);
}

Which require the #[match] attribute described in another RFC.

Motivation

Right now, moving a type that is Drop from a pattern match results in E0509. Also, as the Drop::drop method receives a borrow, it do not have full control on the object so we have to do some unsafe trick to enable partial move etc.

In theory, dropping an object can be divided into two steps:

  1. Do everything necessary to announce this object is dropped. This should only be done once, and after this the object is no longer a valid object. This requires only a mutable borrow.
  2. Depart the object into its pieces, and then drop all child objects. This requires full object ownership.

Today, Rust object that do not implement Drop can optionally go through step 2 via pattern match, and those implement Drop expected to go through step 1 via Drop::drop. However, as Drop::drop do not have full object ownership, it have to use unsafe tricks to do the step 2 jobs when necessary. This is why this proposal is needed.

Things allowed by this RFC

  1. The rule that disallow moving out from a Drop type is relaxed: moving is allowed as long as the type implements Destruct.

  2. Programmers can now write safe code to write complicated dropping logic (child object dropping order can be precisely specified).

Why new trait

This RFC intended to be 100% backward compatible. This means the existing code does not need to know about the meaning of this RFC; they simply works the same way it did.

For this reason, if we simply add new methods to the Drop type we have to specify how and when the new methods to be called, and a lot of compatibility issues have to be addressed.

Introducing a new trait, on the other hand, is easier as existing code need not know about it.

Demostration of benifit

The following code was stolen from std::mem::ManualDrop documentation (with minimal modification that require a call before the drop logic):

use std::mem::ManuallyDrop;
struct Peach;
struct Banana;
struct Melon;
struct FruitBox {
    // Immediately clear there’s something non-trivial going on with these fields.
    peach: ManuallyDrop<Peach>,
    melon: Melon, // Field that’s independent of the other two.
    banana: ManuallyDrop<Banana>,
}

impl Drop for FruitBox {
    fn drop(&mut self) {
        FruitBox::notify_owner("consumed!"); // Always called before destruct
        unsafe {
            // Explicit ordering in which field destructors are run specified in the intuitive
            // location – the destructor of the structure containing the fields.
            // Moreover, one can now reorder fields within the struct however much they want.
            ManuallyDrop::drop(&mut self.peach);
            ManuallyDrop::drop(&mut self.banana);
        }
        // After destructor for `FruitBox` runs (this function), the destructor for Melon gets
        // invoked in the usual manner, as it is not wrapped in `ManuallyDrop`.
    }
}

This can be rewitten without ManuallyDrop and unsafe code as the following:

struct Peach;
struct Banana;
struct Melon;
struct FruitBox {
    box_id: usize,
    peach: Peach,
    melon: Melon, // Field that’s independent of the other two.
    banana: Banana,
}

impl Drop for FruitBox {
    fn drop(&mut self) {
        notify_owner!("{} consumed!", self.box_id); // Always called before destruct
    }
}

impl Destruct for FruitBox {
    #[match]
    fn destruct(self) {
        match self {
              //Note: notify can be called here as well; but it is not recommended
              // because destruct can be bypassed.
              FruitBox{box_id:_, peach, melon, banana} => {
                  self.peach.destruct();
                  self.banana.destruct();
                  // calling melon.destruct() is optional
               }
        }
    }
}

Guide-level explaination

Definition

trait Destruct {
    #[match]
    fn destruct(self);
}

The attribute #[match] attribute indicates this method must in its very first step, pattern match on self that moves, and so self will be invalid in any other occurrences, as self cannot be rebinded in Rust. See this RFC for detailed restrictions.

Derive

Destruct can be derived to all user types with the usual #[derive] attribute:

#[derive(Destruct,Clone)]
struct MyStruct(...)

The default code is just a simple ignoring match, and after optimization, nothing (of cause, as usual, the drop glue on child objects need to be called). This is not very useful for types that is not Drop, but if it is Drop, this changes the expectation of the drop method: partial move is now not allowed.

Restriction

If a type implements Destruct, its Drop cannot do partial move. Attempts to do so results in undefined behavior.

Reference-level explaination

Invocation

Destruct::destruct will be called automatically whenever a value implements Destruct out of scope:

{
    let destructable = Destructable::new(...);
    // Generated or manually
   destructable.destruct();
}

However, an explicit pattern match that moves the value will bypass Destruct::destruct.

{
    let destructable = Destructable::new(...);
    match destructable {
       ...
    }
   //No code generated
}

Interaction with Drop

If a type implements both Drop and Destruct, Drop::drop will be called before any pattern match that moves the value.

{
    let destruct_drop= DestructDrop::new(...);
    // generate only: (the users  are not able to write this, as they are today)
    // destruct_drop.drop()
    match destruct_drop { // Note, E0509 is not emitted
       ...
    }
}
{
    let destruct_drop= DestructDrop::new(...);
    // generate only when destruct is not used manually;
    // if calling destruct manually, drop will be called inside destruct,
    // before the pattern match
    // destruct_drop.drop()
    // generated or manually
    destruct_drop.destruct();
}

Compatibility

If a type implements both Drop and Destruct, the drop implementation will have to ensure when it returns, the object is still valid except that the drop glue is not to be called again. This means it should not do partial move, even in unsafe code. Attempts to do so will be unconditional UB, because when it eventually being pattern matched, there will be uninitialized values being accessed.

The Drop implementation writers should also keep in mind that the drop method usually don't have to take care its child objects. Let the child objects do their things is the best. In Destruct the programmer can control how the child objects should be handled.

This will not affect existing code: if a type does not implement Destruct, its behavior do not change.

Drawbacks

Increased complexity by introducing a new trait, also increased learning difficulty as users have to learn the restrictions behind the mechanism.

Rationale and alternatives

Alt 1: Do nothing

We then have same problems we have today: not able to move out from Drop types without unsafe code. The Drop method is unintuitive and the theory behind it is not convincing enough.

Alt 2: Define destruct method without #[match], or limit the #[match] to Destruct only

We expect the method reader knows that they were constrained to do things in a specific way. An attribute will help them identify this is the case and they can understand by reading the documentation. As #[match] seems to have no use out of Destruct, we may initially do this for Destruct only. Then we need to answer questions about "why I cannot use this attribute in other context".

Alt 3: Alter the Drop to require destructure of the Drop type

Many other proposals of this exists. The main problem is backward compatibility, and interaction with pattern matching destructure.

Here is @CAD97's proposal:

trait Drop {
    fn drop(&mut self);
    fn destroy(#[no_drop_glue] self) {}
}

This can be done in a way to have self typed ManuallyDrop or similar. (To be continued)

Alt 4: Introduce a magic container for Drop types having drop glue removed

This is a proposal from @stepancheg. However in rust, we tent to have more magical traits and less magical containers. For example, Rc and Arc used to be magical containers, and now their magic were removed.

It also requires to alter the existent Drop, and so need to justify compatibility issues.

However, this is also a very attractive alternative and have its beauty. Maybe we can combine the idea to introduce a Destruct trait that accepts a magical container instead of marking the method.

Prior Art

There are a lot of discussions before...

and so on...

(TODO)

Unresolved questions

The name of the trait and the exact definition. The exact wording of the behavior and restriction.


Updated to address @CAD97 's comment.

Bikeshed color: Destroy.

The RFC should have some mention of why a new trait is (needed? desired?) for this purpose. Another possibility could put this method onto Drop itself without introducing a new trait:

trait Drop {
    fn drop(&mut self);
    fn destroy(#[no_drop_glue] self) {}
}

Some other notes:

  • What’s the point in implementing Drop when you have Destruct? (Which is why my soft proposal was to rewrite Drop instead of introducing a new trait.)
  • Motivation should probably include a reason that you’d want to move out of self during Drop.
  • What benefit does the derive offer?
  • Destruct seems to suppress the “cannot move out of Drop type” restriction; if this is intended it should be highlighted more, but I’d argue that it, especially where still calling Drop::drop would be more surprising than useful. (For this, if #[match] is accepted in its current state, a #[match] method that converts to a non-Drop type would seem more obvious in behavior.)

This proposal falls in my Alt 3. I would be happy to add your code to it.

This was highlighted as we want the Drop to be two stepped.

Sure. I will need to study previous proposals and see if I can find some convincing examples.

The derive is telling: this type uses the new Drop behavior, aka No partial moves in drop.

The point is Destruct is not guaranteed to be in use - for example, people can manually destruct by custom pattern match. However, this proposal guarantee that drop is still called in this situation. So for RAII purpose, people write drop rather than destruct unless they don't want the user do custom pattern matches (for example, keeping all fields private etc)

I beg to differ – however that's just my own opinion. A more objective question: what's a supporting argument of it being counter-intuitive, and/or what's wrong with "the theory" behind it?

5 Likes

The unintuitive part is that many people will think a drop method should receive self by value. With this proposal, although this is not changed, but people will immediately see Destruct and think about the reasons behind them.

Another thing that Drop have problem with is that it fulfill all requirements for object safety, ans so it is object safe. However, &dyn Drop does not make much sense. I think it should be disallowed, although this is out of the scope of this RFC.

And many other people don't, for example those coming from the other prominent RAII-supporting language, C++, and who consequently are intricately familiar with the interaction of value semantics with RAII. But anyway, implementing drop in any case should not be a mere question of intuition, since if you are implementing drop yourself, you are probably doing something nontrivial, to say the least. Therefore I think it's justified to require that at least the basic idea behind why drop takes &mut self be understood and learnt clearly in order to be able to implement it.

I disagree – yes, it might not "make sense" in the sense that it's not very useful. But why actively disallow it, when it's something that just falls out of the type system? By this reasoning, we could disallow many other "useless" constructs as well (e.g. empty enums since we have a built-in never type, !); however, that would not add any value – the existence of these constructs isn't harmful, and just disallowing them one-by-one would only add corner cases and exceptions to the language, complicating it with no good reason.

3 Likes

I feel the final RFC, and preferably this pre-RFC too, should explain how Destruct would interact with the Pin API and the invariants it guarantees. Pin and Drop already interact badly, but we are stuck with a Drop::drop(&mut self) instead of Drop::drop(self: Pin<&mut Self>).

3 Likes

I would like to point out that the direct comparison against C++ shouldn't be made for many reasons:

  • C++ doesn't have the concept of default-moveability. If you take something by value, you are copying it.
  • C++ doesn't have (at to date) destructive move, in the sense that if you move from A to B, A still exists and has a defined (but unspecified) state. The typical example is a string, which can be used after a move once clear is called
  • C++ has implicit implicit self/this, and every method takes a const or mutable reference to the object instance (because, as said before, no destructive move exists). Considering that C++ is extremely conservative and it doesn't want to break old codebases, the situation won't change even if the language starts supporting destructive move.

Said that, C++ destructors are quite black magic, in the sense that there are some unintuitive behaviours. For instance, a class without the declaration of a destructor is trivially destructible, but if you define an empty destructor, it is not anymore. Another example is that every method and function in C++ can throw an exception by default, except for destructors.

I am saying this only because there are good things in C++ that could be inspirational for improvements, but IMHO destructors are not.

4 Likes

I'd like to echo the sentiment that I have no strong intuition about whether drop "should" consume self or just take &mut self.

It's also still far from clear to me that there's any meaningful benefit to adding all this machinery instead of using ManuallyDrop, even though I've said that on at least one of the previous threads. Despite the length of the pre-RFC, it's difficult to even find what the alleged benefit is supposed to be, which is unfortunate since "why does anyone want this?" seems like by far the most important part of this proposal to me. There's also still no attempt to explain why anyone would ever want to move out of self during drop in the first place, which is still a situation I've never encountered myself.

I think the alleged benefit is that this proposal makes it possible to move out of self during drop() without unsafe code, but I'm not sure I even believe that because this pre-RFC also says:

which sounds like it's merely replacing unsafe code with the possibility of UB in safe code.

4 Likes

Requiring a match inside the function is odd. AFAIK Rust doesn’t have such a strong correlation between function definition and function body anywhere else. This brings unnecessary complications, e.g. all code before match (if any is allowed) is special. Match arms that capture the whole object have to be detected and forbidden separately.

1 Like

I know. I know all of that, I’ve been using C++ for 7 years before switching to Rust.

Of course, C++ destructors are not the same as Rust’s drop. It’s kind of redundant to reiterate this – this is obvious (to me anyway), they are distinct languages, after all.

In any case I’m most definitely not saying drop should work like C++ dtors either. In fact, I’m quite happy with the way it works now, and I wish C++ dtors worked like it instead.

All I’m trying to say is there are important similarities between the two languages in this regard, which means that for most former C++ users, several concepts should make sense in Rust which newcomers might find puzzling. That is, at a higher, conceptual level – not necessarily in finer details.

For instance, for some problems around desruction, it doesn’t matter whether pass by value means destructive move, nondestructive move, or implicit copy: naively passing the object by value to drop or the destructor would equally result in infinite recursion. Similarly, since in both languages, the destructing function gets a mutable reference to the object, it also doesn’t matter from this point of view whether the parameter happens to be implicit or explicit – it’s just the way it is, and it is, again, similar.

1 Like

With my lang team hat on (tho not speaking officially for the lang team), what I am looking for here is:

  • If UB is prevented, how much is prevented? In other words, how prevalent is the problem, how easy is it to get wrong and cause UB?

    From my impression, it doesn’t seem terribly common.

  • Can this be done in a procedural macro in a way that covers most use cases and prevents potential UB effectively?

    The procedural macro proposed in another thread seemed to cover most cases and was U.B. proof?

  • Can the attribute be generalized / be useful generally? or will it only revolve around Drop?

I’m quite wary of introducing attributes that are little ad-hoc knobs in the type system that A) cannot be generalized, B) are for niche use cases. This is not to say that the proposals floated thus far are non-starters, but please bear these points in mind.

Finally, a general note about attributes: While attributes are a light weight approach syntactically, they can affect semantics in non-trivial ways and be quite complex in every other way but syntactics. Attributes are thus a good way to hide complexity away; we should be as weary of using them when it affects the type system as when we are using normal keywords.

The problem with only allowing in-fn-def destructure is that it cannot handle enums. (That’s why this idea of requiring a by-move match was brought up.)

1 Like

One clarification I'd like to bring up:

Partial moves are already disallowed in Drop::drop. Because you have &mut self, you can't move from self, only modify. (That's the whole reason for the proposal).

Specifically, because child drop glue is called after Drop::drop, self needs to be in (at least) a valid state. That is, it needs to be safe to drop all members of self after the Drop::drop implementation is ran.

The ManuallyDrop pattern works because the dropped state is an unsafe but valid state for the type, and the drop glue for it is empty. This means that it is safe to (within Drop::drop, at least) drop or take its contents, leaving it in a junk state until reclaimed by the memory allocator.


Please clarify why this behavior of Drop + Destruct types allowing a structural by-move-out match outside of the Destruct implementation is desirable. I (still) think this is more surprising than useful.

Also, a clarification of what Drop::drop should be used for in a world with Destruct. I still don't see the motivation for the new trait other than to make a new magic trait rather than rejigging the magic on the most magical trait, Drop.

Avoiding magic isn't a benefit when you're just putting the same magic on a different trait. Avoiding compatibility issues isn't meaningful when it can be maintained via a simple default implementation.

1 Like

Well, it seems obviously nice to no longer need unsafe for this. Whether that's "meaningful", of course, is a more complicated discussion.

A good point. I believe Drop::drop(self: Pin<&mut Self>) is the right constraint, as it implies exactly what Drop + Destruct implies for drop. However as this is suppose to be a 100% backward compatible RFC, let us fix it afterwards.

The other benefit I want to have is to allow by-move pattern match on Drop types.

Oh this is not what I mean. I see someone already spotted out that drop is already not allowing partial move, so maybe I was wrong and I need to adjust it.

It would mean we don't need to adjust fn drop(&mut self) to fn drop(self: Pin<'_, Self>)? Because, Pin only means not moving out until drop, and this is true in both signatures.

For this part, I am correcting myself.

This is exactly what I proposed in another RFC. In sort, I want the attribute to be generally available, if people like to use. And because it is a restriction on what code can be written, it can hardly be abused. However, I didn't see practical use of this attribute except in Drop context.

Thanks for the clarification. I will update when I confidently understand this.

I would update the text later. But right now let me explain it a bit. In "things allowed" part I listed 1. relax E0509: by-move pattern match on Drop types; and 2. full control on dropping logic without unsafe code.

But this two things are somehow contradicting: custom by-move match means different logics other than the one specified by Destruct can be specified, and so in this case, there will be different ways to destruct the object! The programmer then have to let the object user to decide which way they want to destruct.

What all these mean? If you have an object that the destruction order is irrelevant, but you need to specify RAII behavior, you will write Drop. Then if you want by-move match, you can derive Destruct.

If on the other hand, you want to specify the destruction order, you would better keep all fields private and so people cannot do by-move matches. And then your Destruct will specify the exact order of destruction. Even in this case, the flexibility of by-move match still allows you to define a method to destruct the object in a different way.

Time to do the right thing this time and write a more detailed post, I guess.

The Pin API guarantees that once you have some Pin<P> where P: Deref then you cannot, without opting into unsafe-rust, get access to &mut P::Target if P::Target: !Unpin. That is to prevent moving in safe rust because Self might be in a state where it is self-referential and moving would cause it to have dangling raw pointers, which is bad. If P::Target: Unpin then you can safely obtain a &mut P::Target via Pin::get_mut.

Drop::drop(&mut self) was a thing before the Pin API was thought of, interacts badly with the safe-rust should not be able to move pinned values guarantee. But changing the Drop::drop signature is bad with regards to backwards compatibility. Not introducing the Pin API is bad because we really need it for Futures. Having both is bad because you can accidentally move Self via core::mem::swap, but at least Drop::drop doesn't take Self by value, which would make it even easier to accidentally move Self.

Which brings us to a proposed Destruct trait which takes Self by value. How does this proposal take the Pin API, which has gone through the RFC process but has not stabilised yet, into account? The following definition would be fine, I think:

trait Destruct: Unpin {
    #[match]
    fn destruct(self);
}

Or is that to restrictive for your liking?

I honestly don’t see the problem here (for today’s Drop).

Drop::drop is the last thing in a structure’s lifetime. And it knows about the structure (of course), so it’s going to not put the object into an invalid state.

I can see where a moving Drop is potentially problematic, however. Currently, Drop::drop is (guaranteed to be? I’m not sure) in place. But a moving drop would, as the name suggests, require moving.

If you’ve got a proof of concept where you can trigger UB safely by breaking Pin guarantees (safely) in Drop::drop, though, it should definitely be shared. (It only counts if you don’t use unsafe to set it up though; that unsafe contact leaks to everything that uses the type’s internal details.)


Side mention: I’m pretty sure any !Unpin types with all public members are doing something wrong, so even the move-out-outside-destruct of this proposal wouldn’t cause a problem there.

Is it a problem if Drop can move out of !Unpin types after they’ve served their propose? It necessarily has knowledge of unsafe internals if it’s doing so. I’m not certain.

The main issue with drop + pin is that it blocks having safe projections from a pinned reference to a pinned reference to one of your fields. If drop didn’t exist then it would be fine to have auto generated functions for this, because it does you have to promise (with unsafe) not to move any of your fields during drop if you want to project a pin onto them.

1 Like

To clarify, this would be equally true of any Drop/Destruct/etc trait with a method that gets automagically called on scope exit, right? (unless it was passed a non-mut &self I guess)

I’m sensing an implication in some of these posts that Destruct would interact better with Pin, but I don’t see how.