Pre-RFC (?): unions, drop types and `ManuallyDrop`


#1

This is part 1 of my thoughts on various union-related topics. It’s not really unsafe code guidelines related and there is not even very much to explain, so I figured I’d just post it here as some form of pre-RFC.

Summary

Define ManuallyDrop as

#[repr(transparent)]
struct ManuallyDrop<T : ?Sized>(T);

but also make it a lang-item to get the magic behavior of not calling Drop on its content. IOW, ManuallyDrop<T> never has drop glue. In every other aspect ManuallyDrop behaves like a normal struct, including the way DST work and layout optimizations.

With unions and ManuallyDrop disentangled, there is no reason to allow fields with drop glue in unions at all. So weaken the current check on stable (fields must be Copy) to test if the type does not need drop glue, and we remove the unstable feature to bypass this check. That may not always be possible when generic types are involved; we can then fall back to ensure that types are Copy because such types never have drop glue. (I think there will not be many false positives in this test: We can traverse through any types that are statically given, stop at things like &, &mut, ManuallyDrop, raw pointers and other type constructors that never have drop glue even if their type parameter does, and then only check for Copy when hitting a generic or associated type [constructor].)
Notably, ManuallyDrop never has drop glue and hence can be used to put any type into a union. For example, MaybeUninit remains definable for all sized T as

union MaybeUninit<T> {
    uninit: (),
    value: ManuallyDrop<T>, // we statically know this has no drop glue, even if T does
}

Then entirely remove support for union fields that do have drop glue. unions are no longer “magic” for dropping, the fact that they do nothing when dropped just follows from the fact that none of their fields would do anything. Writing to a union field will also never implicitly call drop and hence can always be a safe operation.
However, one can of course still manually impl Drop for MyUnion. This requires tracking whether or not a union is initialized (to decide whether we insert drop when it goes out of scope). How exactly that is done is left as an open question.

This fixes #47034, resolves #49056 (which only got closed as “postponed”) and makes most of the complexity that has been proposed in RFC #1897 unnecessary.

Some more details

There’s no that much more to add (unless I missed something, which I probably did). The core observation is that once we have ManuallyDrop, we can side-step all the complications around drop in unions by just ruling out types that have drop glue. Some very complex rules have been proposed for when to drop a union field and when not to, and working with such unions is a huge footgun because it is hard to see from the code when something is dropped.
Instead, I propose we force people to write ManuallyDrop, so drop is never called automatically and the programmer can manually take care of calling drop when necessary. The fact that the type of the field is ManuallyDrop should be a strong reminder that drop has to be manually handled, so this is much better than just silently not dropping fields in unions. (This actually happens currently to some extend: Fields are dropped on assignment but not when the union itself is dropped. So this proposal also removes some inconsistent union behavior.)

This should be fully backwards compatible as we can re-implement the API surface of ManuallyDrop for the struct, and union fields with drop glue have not been stabilized.

It is worth mentioning that @kennytm has a draft proposal for better Drop control that would actually make ManuallyDrop as described above 100% a library type:

Allowing user to redefine needs_drop also let us to implement ManuallyDrop ([RFC #1860]) with an unsized struct

Doing that later is backwards compatible with this proposal (we’d just remove the lang-item).

Open questions

I suppose a union implementing Drop should have restrictions similar to struct concerning moving out of fields and partial initialization?


What do you think? I am particularly interested in whether not allowing Drop fields for union (but allowing ManuallyDrop<T> for any T because that never drops) would actually work for everyone. Are there cases where the extra wrapper of a ManuallyDrop causes problems?

This does deliberately not answer any questions around DSTs in unions, union layout optimizations or other kind of question related to “which values are valid in a union”. I think those are a separate concern, and we can now have ManuallyDrop for DST without answering these questions.

Cc @kennytm @nikomatsakis @petrochenkov @cramertj @mikeyhew


#2

Hmm, this would make all union field assignment safe too, right? (A leaky footgun, perhaps, but safe.)

Another thought: would this let ManuallyDrop be always Copy? The double-free problems go away since dropping the contents is unsafe… Oh, no, because it would allow duplicating &mut in safe code. Never mind; terrible idea :sweat_smile:


#3

Yes.


#4

I must admit, this is tempting.

Given the availability of a transparent ManuallyDrop wrapper, I don’t see any particular need for Drop fields in union types. If you want one, you can always wrap it in ManuallyDrop. That would then completely remove the complexity around “if you assign to the whole union” versus “if you assign to a union field”.

:+1: from me.


#5

I’m using unsafe unions extensively (and to great effect) in a project right now. However I’m a little worried by this last part, but I probably misunderstood.

Basically I have:

#[repr(C)]
#[derive(Eq, PartialOrd)]
pub struct Block {
    pub(crate) hdr: Hdr,
    pub(crate) things: *mut u8,
}

#[repr(C)]
pub union Thing {
    pub(crate) x: u64,
    pub(crate) block: Block,
}

I have a custom allocator for Blocks, and i implement drop manually for Thing and Block (its safe and it works fine, although I’m pretty sure i have a memory leak, but that is orthogonal).

So I already have a proliferation of newtypes in this project, which are already unpleasant enough, and I’m hoping you’re not saying that you want me to add another newtype wrapper on pub block: ManuallyDrop<Block> ?

Will this be optional? If this change/proposal gets merged, will my currently working code with a droppable field no longer compile? I’m just confused/uncertain as to the scope of the effects of the changes being proposed here.


#6

My understanding of the proposal is that no additional newtypes are required anywhere, you just need to write ManuallyDrop in a few more places. So in your example: Block is unchanged, and Thing simply becomes:

#[repr(C)]
pub union Thing {
    pub(crate) x: u64,
    pub(crate) block: ManuallyDrop<Block>,
}

@RalfJung that’s what you meant, right?


#7

This means I have to access the real block via block.0, and so on and so forth.

Also, my real worry is it seemed like fields in a union with drop were getting kicked out? Since note from my post, I said Block (and Thing) implement Drop. I.e., is my code which is fine right now going to stop compiling if something like this lands?


#8

E.g., this makes me super nervous


#9

Well… yes, we’re talking about an unstable feature, and this is proposing a breaking change to it, so your code will stop compiling, but all you have to do is add ManuallyDrop<...> in the right places. Why does that make you more nervous than any other breaking change to an unstable feature?


#10

The right places? If I understand correctly, I now have to locate every scope in every case where my union got completely safely auto-dropped before, and now insert manual frees? How is this an improvement? This is a completely non-trivial refactor which will absolutely add more unsafety (the more I have to touch free the more pain I’m going to have), if that’s what your asking here.

Hopefully I’m missing something.


#11

Weird, I’ve never heard of unions having any sort of auto-drop behavior (for their contents, I mean; I thought “dropping” a union was basically a no-op). Even the most trivial test of this feature in the playground gives me a warning that union contains a field with possibly non-trivial drop code, drop code of union fields is ignored when dropping the union. And afaik that can’t possibly be sound anyway since the whole point of a union is there’s no way to know which variant is the active one (that the compiler understands/is allowed to rely on). Where did you get the impression today’s nightly unions do the same auto-dropping of their contents that structs and enums do?

EDIT: Yeah, from re-skimming the various issues and RFCs, the only thing I can find that comes close is the debate over whether assigning to a union field of Drop type should drop() the previous value. In fact, part of this thread’s motivation is making that question moot.


#12

So the idea here is that ManuallyDrop can’t be matched on because its field is private, so therefore the edge cases can’t be reached? I hope that no-one wants to match through a MaybeUninitialized.


#13

Sorry I think we’ve been miscommunicating, and its probably my fault. So my union implements drop, which gets “auto-dropped” similar to other rust owned values. I manually tag my unions values using essentially the same method that OCaml does (it reserves a bit in the number to mark whether its a regular number or a block).

I was worried from this proposal that union’s themselves wouldn’t be able to implement drop (which I believe they will still be able to?), but I see this only applies to union’s fields which implement drop, iiuc.

If i’m understanding correctly now, I will still be able to implement Drop for my union, but for my Block, I’ll be forced to add ManualDrop ?


#14

Ah, you’re right, that was a total miscommunication. IIUC, none of these posts are saying anything about the ability to manually impl Drop for MyUnionType, so you just have to add ManuallyDrop<...> in the type and a .0 in the Drop impl.


#15

yup, got it now, for my unions drop impl, i’d just do ::std::mem::drop(self.block.0), and update all self.block -> self.block.0 in the union whenever it accesses.

I can also avoid all of this for now if i want, I believe, by removing my drop impl of the block, which is also fine for me.

yea i’m ok with this (as long as ManuallyDrop has repr transparent, because i need to use this union through FFI boundaries)


#16

@m4b Yes, the intention was that union fields cannot have automatic drop (to side-step all the questions that immediately opens), but unions themselves can still manually have drop implemented for them like all other nominal types. I will clarify that in the original post.

I suppose a union implementing Drop should have restrictions similar to struct concerning moving out of fields and partial initialization?

Which edge cases are you referring to? TBH I did not think about match at all.


#17

I just realized that implementing Drop for unions is still unstable. Good :slight_smile: . That allows is to figure out those rules, I guess…


#18

I like the proposal overall. I would sort of like to have a better trait than Copy to represent “no-op on drop”, though. I suppose we could add something, let’s cal it NoopDrop, and then have trait Copy: Clone + NoopDrop…?

It raises some interesting questions though about whether it is “opt-in” etc. For backwards compatibility I guess it would have to not be.

Anyway, it just feels a bit weird to have this test, which crops up in various places, but is not truly reflected in the type system in a natural way.


#19

I can’t quote the exact terminology we’ve been using, but GC would benefit from a "trivial Drop" bound as well. The two ways we have to reason about trivial Drop right now is 'static lifetime or Copy data. The GC experiments so far tend to use 'static.


#20

I’m not sure – what is the connection between 'static and “trivial drop”? (I took trivial to mean “no-op”)