[Pre-RFC]: Specify and stabilize drop order

Summary

I propose we specify and stabilize drop order in Rust, instead of treating it as an implementation detail. The stable drop order should be based on the current implementation. This results in avoiding breakage and still allows alternative, opt-in, drop orders to be introduced in the future.

Motivation

After lots of discussion on issue 744, there seems to be consensus about the need for a stable drop order. See, for instance, this and this comment.

The current drop order seems counter-intuitive (fields are dropped in FIFO order instead of LIFO), but changing it would inevitably result in breakage. There have been cases in the recent past when code broke because of people relying on unspecified behavior (see for instance the post about struct field reorderings). It is highly probable that similar breakage would result from changes to the drop order. See for instance, the comment from @sfackler, which reflects the problems that would arise:

Real code in the wild does rely on the current drop order, including rust-openssl, and there is no upgrade path if we reverse it. Old versions of the libraries will be subtly broken when compiled with new rustc, and new versions of the libraries will be broken when compiled with old rustc.

Introducing a new drop order without breaking things would require figuring out how to:

  • Forbid an old compiler (with the old drop order) from compiling recent Rust code (which could rely on the new drop order).
  • Let the new compiler (with the new drop order) recognize old Rust code (which could rely on the old drop order). This way it could choose to either: (a) fail to compile; or (b) compile using the old drop order.

Both requirements seem quite difficult to meet and would result in additional complexity. In my opinion, it would be better to accept the current non-intuitive drop order than to introduce more complexity.

Finally, in case people really dislike the current drop order, it is still possible to introduce alternative, opt-in, drop orders in a backwards compatible way. However, that is not covered in this RFC.

Detailed design

The design is the same as currently implemented in rustc: struct fields and tuple elements are dropped in a FIFO order. This is also valid for enum variants of both kinds.

How We Teach This

When mentioning destructors in the Rust book, Reference and other documentation, we should also mention that they are run in a particular order.

Drawbacks

  • The counter-intuitive drop order is here to stay.

Alternatives

  • Figure out how to let rustc know the language version targeted by a given program. This way we could introduce a new drop order without breaking code.
  • Introduce a new drop order anyway and try to minimize breakage by running crater and giving the community enough time to update.

Unresolved questions

To which extent should we specify drop order? For instance, should drop order of closure captures be specified? Or should we limit ourselves to structs, tuples and enums?

5 Likes

I basically agree with your thesis. I think changing the current drop order will definitely break code (I know of people that are relying on it) and I don’t see any real gain it. Moreover, while it’s not a central point imo, it is interesting that there are efficiency advantages to dropping struct fields in order – or at least to dropping large arrays in order. (I seem to recall some measurements taking place on the thread.)

That said, I think that for this RFC to be accepted “for real” it ought to definitely describe the behavior of the compiler, ideally with tests (that can then be added as regression tests).

1 Like

One more alternative is to run destructors in field order after field reordering.

This way we get “fast by default drop”, memory access pattern during destruction is very HW friendly and field reordering (which is supposed to be optimization!) doesn’t pessimize it. This needs measurements though, there’s high chance non-trivial field destructors are still independent and can theoretically be reordered by compiler under “as if” rule. I someone relies on drop order (should be rare), then #[repr(do_not_reorder_fields)] (#[repr(C)] at the moment) will have to be used.

Hmm. I think the only time perf matters is basically big arrays; it is unlikely that one has a struct with enough fields to make stride relevant.

That said, if we were to attempt a change here, we’d want to be careful in how we approach it to see if we can flush out bugs first rather than diagnosing after the fact.

How detailed would you like the description of the current behavior to be? Would it be enough to provide a short description and a code example in the “Detailed Design” section for each element of the language that is related to drop (eg tuples, structs, enum variants, slices, vec)?

That seems sufficient. I think what would be best is if you drew up some tests that actually demonstrated the order, so that we can add them to the regression suite. People can then look for corner cases you’ve forgotten to cover.

But basically yeah I think just specifying the order for each of these:

  • tuples
  • arrays
  • slices
  • structs + tuple structs (and enum variants of those kinds)

(am I forgetting anything?)

What do you think of Vec and closure captures? I think it would be great to stabilize them as well, even though Vec is not a built in type.

Vec seems fine. I am not sure about closure captures; I honestly don’t know what we do today. I would guess that it is related to the definition order of the variables or else the occurrence order of the variables in the source text. =) I … am reluctant to define the source order for closures, since it seems like such a corner case, and I feel like anyone who wants to rely on it ought to use a struct – but then I guess that might be naive on my part. =)

I agree we should do this.

Does anyone have a reference for why rust-openssl relies on drop order, rather than (for instance) arranging for wrapped Rust objects to have appropriate lifetimes to constrain drop order? It seems like only unsafe code that didn’t have appropriate lifetime bounds could run into issues with this.

2 Likes

Agreed. But that is a valid use-case. I know @carllerche has some examples where he had to rely on drop order as well (or chose to, I don't know the details). I think it comes up when you want to have a struct that has "internal references" to data owned by other fields, basically. This cannot necessarily be typed using lifetimes and borrows, since that struct may be something you want to move from place to place (and borrows tie you down to one point).

@nikomatsakis I just wrote the detailed design part and would appreciate your feedback before submitting a PR.

Detailed design

The design is the same as currently implemented in rustc and is described below. This behavior will be enforced by run-pass tests.

Structs and enum variants

Struct fields are dropped in the same order as they are declared. Consider, for instance, the struct below:

struct Foo {
    bar: String,
    baz: String,
}

In this case, bar will be the first field to be destroyed, followed by baz.

Tuple structs show the same behavior, as well as enum variants of both kinds (struct and tuple variants).

Note that a panic during construction of one of previous data structures causes destruction in reverse order. Since the object has not yet been constructed, its fields are treated as local variables (which are destroyed in LIFO order). See the example below:

let x = MyStruct {
    field1: String::new(),
    field2: String::new(),
    field3: panic!()
};

In this case, field2 is destructed first and field1 second, which may seem counterintuitive at first but makes sense when you consider that the initialized fields are actually temporary variables.

Slices and Vec

Slices and vectors show the same behavior as structs and enums. This behavior can be illustrated by the code below, where the first elements are dropped first.

for x in xs { drop(x) }

Again, if there is a panic during construction of the slice or the Vec, the drop order is reversed (that is, when using [] literals or the vec![] macro).

Allowed unspecified behavior

Besides the previous constructs, there are other ones that do not need a stable drop order (at least, there is not yet evidence that it would be useful). It is the case of vec![expr; n] and closure captures.

Vectors initialized with vec![expr; n] syntax clone the value of expr in order to fill the vector. In case clone panics, the values produced so far are dropped in unspecified order. The order is closely tied to an implementation detail and the benefits of stabilizing it seem small. It is difficult to come up with a real-world scenario where the drop order of cloned objects is relevant to ensure some kind of invariant. Furthermore, we may want to modify the implementation in the future.

Closure captures are also dropped in unspecified order. At this moment, it seems like the drop order is similar to the order in which the captures are consumed within the closure (see this blog post for more details). Again, this order is closely tied to an implementation that we may want to change in the future, and the benefits of stabilizing it seem small. Enforcing drop order through closure captures seems like a terrible footgun at best (the same effect can be achieved with much less obscure methods, like passing a struct as an argument).

Note: we ignore slices initialized with [expr; n] syntax, since they may only contain Copy types, which in turn cannot implement Drop.

Reverse order that they are being constructed in, not reverse order of the struct declaration, right?

struct MyStruct {
  field1: String,
  field2: String,
  field3: String,
}

let x = MyStruct {
    field3: String::new(),
    field1: String::new(),
    field2: panic!()
};

would first destruct the temporary for field1 then the temporary for field3.

1 Like

It is probably worth mentioning what happens when a slice or other type is interrupted during construction (e.g., let x = [X, Y, panic!()], I believe will drop Y and then X). Also, what @Nemo157 said. =)

Otherwise, hmm, seems quite good to me, at least at first read!

Also, you mentioned tuple structs, but not tuples, right? Maybe worth saying “Tuples and tuple structs” or something, and changing the heading to “Tuples, structs, and enums”.

Perhaps worth mentioning that union types never drop their contents.

I have to go cross-reference the list of “kinds of types”, I guess =)

Ah, also, it’s fairly obvious, but for completeness sake I think it’d be worth talking about the overall picture for a type that implements Drop. In particular, if a struct/enum implements Drop, then when it is dropped we will first execute the user’s code and then drop all the fields (in the given order). Thus unsafe code in Drop must leave the fields in an initialized state such that they can be dropped. If you wish to exert control over drop order, then, you can make the fields into Option and have a custom drop that calls take(), or else wrap your type in a union (with a single struct/enum member) and implement Drop such that it invokes ptr::read() or what have you manually. Something like that.

In effect, that last bit of material seems better suited to docs than an RFC, but this RFC feels like it is primarily documentation anyhow (and if we accept it, we should probably move it more-or-less verbatim into the reference manual).

This has come up in the coroutines RFC, and elsewhere. I think we really ought to fix this, in which case will there still be a legitimate use-case for this? I'm not necessarily against doing something eventually---I do think C's unspecified argument eval order is bad, for example---but I am loath to stabilize any more of Drop today as I think the status quo around that is way too weird and full of dark corners.

In particular, if we had by-&move Drop, a drop instance could easily and simply choose to drop fields in any order it likes with plain old drop(self.foo);. I think this should be de-rigueur for any code that depends on drop order, whether or not we stabilize something, for the sake of future maintainers of the code.

1 Like

@Ericson2314 I agree; that seems like something that code relying on drop order could use. For that matter, you can use drop_in_place and similar today, if you don’t mind using unsafe.

If it turns out that a substantial amount of code relies on drop order and we want to make it even easier, what if we provided the equivalent of a #[repr(...)] for drop order? Then, code not specifying that would get the current unspecified drop order.

My primary concern here: I don’t want such a specification to lock in drop order requirements in the language in a way that prevents future optimizations, language enhancements, or other possibilities. Code that has precise drop-order requirements seems uncommon, but locking in drop order would affect all future code.