"Pure" destructors

Today, the following transformation is invalid:

fn foo(x: Vec<i32>) {
  if cond { bar(x) }
  unrelated();
}
// into
fn foo(x: Vec<i32>) {
  if cond { bar(x) } else { drop(x) }
  unrelated();
}

In particular, Rust has the same destructor rules as C++: temporaries are destroyed at the end of a full expression; bindings are destroyed at the end of a scope (modulo drop flags in the Rust case). This means that unsafe is permitted to, in the first example, assume that x.as_ptr() is alive during the call to unrelated() if cond did not hold.

However, it's a bit silly that any time recursively containing a Vec, Box, or Arc should require drop flags and not let MIR analysis pick the order in which things are dropped, which can help in certain situations where the drop order disagrees with the borrow checker, resulting in awful errors and explicit drop calls. (Here's an example from my own project.)

Obviously, we can't change this after that fact (though I think we probably shouldn't have made this the default...), but this seems like a nice behavior in a lot of cases. Here's how I could imagine enabling it:

  • Add a pub auto trait PureDrop {} that tells the compiler "hey, drop this as soon as possible in whatever order you want". Unsafe code cannot rely on the variable living past when its owning variable is no longer alive (exercise: can we make this rigorous)? This like smart pointers could implement this, but something like MutexGuard should not. It should be thought of as being a type which can be dropped without safe code being able to observe it.
  • Add a pub struct DropLater<T>(T) that never implements PureDrop.

Now, the ugly caveat: it is a breaking change to implement PureDrop for Box, because unsafe code already relies on Box living for longer than it needs to. I'm not sure if this change is worth it. However, I don't think PureDrop needs to be unsafe, since implementing it on random types shouldn't cause memory issues without other unsafe going and poking at pointers (I think...). Folks who have boxes in their types could manually implement PureDrop if they don't plan on exposing the raw pointer.

Thoughts? I imagine there may be other fun optimizations this could enable.

1 Like

See: MCP: Allowing the compiler to eagerly drop values · Issue #86 · rust-lang/lang-team · GitHub

(And make sure to take discussion to Zulip, not on the issue.)

Hmm, sounds like that proposal is an inversion of mine, though it does mention something like what I've described as an alternative... I should probably drop into Zulip about this at some point.

My rough TL;DR of the thread is that it would be really hard to specify what "drop as soon as possible" means, basically amounting to full borrowck, but drop calls themselves also count as uses for borrowck.

If you just want to read without logging into Zulip, here's the archive copy: https://zulip-archive.rust-lang.org/243200tlangmajorchanges/41351MCPAllowingthecompilertoeagerlydropvallangteam86.html

And Vec and Rc and basically any other container type has this problem. For starters, I would be really upset if Vec started dropping my vector items early, and this induced e.g. threading bugs related to dropping RAII join handles or channels at the wrong time.

2 Likes

Right. I'm not saying we should do this, I mean that if you want the behavior you should opt into it by implementing the auto trait.

1 Like

I assume Vec<T> would only implement EagerDrop when T: EagerDrop, so existing code and existing join handles would be unaffected.

2 Likes

I don't think that's correct. That would make this code break:

fn foo() {
  let stuff = vec![...];
  let ptr = stuff.as_ptr();
  // `stuff` is never used again, so we can drop it... Right??
  unsafe {
    // Kaboom!
    println!("{:?}", &*ptr); 
  }
}

This works today, and is guaranteed to work. Vec<T>: EagerDrop for any T breaks this, even when T: EagerDrop, since i32: EagerDrop trivially.

3 Likes

It would definitely require rewriting such code to explicitly include the drop

fn foo() {
  let stuff = vec![...];
  let ptr = stuff.as_ptr();
  unsafe {
    println!("{:?}", &*ptr); 
  }
  drop(stuff); // no kaboom
}

But I wonder about detecting such issues automatically.

I'd love some kind of use-after-free lint for raw pointers. It'd help with CString::new("…").as_ptr() footgun too.

But it's probably not enough to prevent breaking unsafe code. Theoretically you could do vec.as_ptr() as usize and cast that later in the scope.

So it will need some other opt-in, unfortunately.

How about that:

Your trait PureDrop will not automatically perform eager drops. Only when a code block (including functions etc was annotated with something like #[eager_drop] will perform this optimization when possible. Of cause, this should also be use for higher scopes like #![eager_drop] so every functions/code blocks in this scope can do the optimization.

Or,

we might not actually need a trait here, only an attribute flag will be enough to tell the compiler that it is good to perform eager drop. So if the code block contains unsafe code, the writer should aware that if the code rely on late drop he/she should not use the attribute. (and the violent of this rule should be able to be catched by MIRI, since the offending code will trigger access to invalid pointers)

No, this is wrong; you want a trait so that you can opt out MutexGuard, which is a "true" RAII object. I think that trying to solve this with attributes is the wrong direction. I think that like @kornel says we should implement a lint, and if the lint is good enough break code that does dangerous things with Box at an edition boundary. Worst case we can spend a week cratering the ecosystem running under ASan or Miri or something silly like that. The extreme view to take is "if you're doing exciting things with pointers outside of FFI maybe you should be miri'ing your tests".

There's also something like making rustc --test do the TCMalloc (I think TCMalloc does this?) of scribbling 0xabababab over uninit heap memory and 0xcdcdcdcd over just-freed memory and hope that helps.

Yet another idea is that we can say "eager drop does not apply in functions that contain the token unsafe". If you never utter "unsafe", pure destructors like Box::drop() are unobservable.

Of course, this might just not be worth doing, in which case we can just ask people do to do

struct K { things: Vec<Thing>, }
impl PureDrop for K {}

in perpetuity. Big deal.

What's funny is this "lint" is just lifetimes. I would be all for attaching lifetimes to pointers if it could be cheaply cast away.