Idea: Leak auto trait and Drop call guarantees

Currently std::mem::forget docs says that it's not unsafe because we can't guarantee that Drop is called anyway because we can safely leak objects with cyclic Rc. This forces the usage of closures instead of a drop-guard for unsafe things. An example is thread::scoped which was available but now has been removed.

I suggest to make a new marker trait:

unsafe auto trait SafeLeak { }

It should be automatically implemented for each type just like Send and Sync. All things that could potentially leak stuff in safe context like Rc, Box::Leak, mem::forget should add a constraint that T must be SafeLeak. This way you never can leak an unsafe guard in a safe code. Guard-like structs which require Drop being always called should not implement it (I'm not sure about the the syntax, may be something like "unsafe impl !SafeLeak for x"?). Any struct which contain it would then be !SafeLeak also.

An alternative would be to make a marker trait that is implemented only on unsafe guards:

unsafe auto trait UnsafeGuard { }

but we need a syntax then to disallow UnsafeGuard in generic constraints.

What do you think?

This doesn't seem useful, because there are a number of ways to leak things. For example,

fn bad_forget<T: Send + 'static>(value: T) {
    std::thread::spawn(move || { let value = value; loop { std::thread::park(); } });
}

Total leak protection is fundamentally impossible to solve, so we can't have any abstractions that require that they can't be leaked. (Rc, Arc, mem::forget, and the various leak functions/methods are just other ways to show this, but they aren't the only ways to leak memory)

edit: @steffahn thanks, forgot a bound.

2 Likes
error[E0310]: the parameter type `T` may not live long enough
 --> src/lib.rs:2:5
  |
1 | fn bad_forget<T: Send>(value: T) {
  |               -- help: consider adding an explicit lifetime bound...: `T: 'static +`
2 |     std::thread::spawn(move || { let value = value; loop { std::thread::park(); } });
  |     ^^^^^^^^^^^^^^^^^^ ...so that the type `[closure@src/lib.rs:2:24: 2:84]` will meet its required lifetime bounds
1 Like

This is a breaking change unless SafeLeak became at least an implied bound on generics that would (like ?Sized) need to be opted out of. See the ?Move conversations for more about why there's extreme resistance to adding more of those. And also see the Freeze RFC for a discussion about how auto traits have ecosystem impacts that may mean that we'll in practise never stabilize another one.

So I would suggest starting not from a solution proposal, but in making a persuasive case that leaking being safe is such a severe problem that it's worth taking drastic measures to address it.

5 Likes

I think this bound invalidates your point though. Or at least it makes the code example way less convincing. The problem with drop not being called on Guard-like objects that OP mentions has to do with the fact that you cannot be sure that your Guard<'a> type was properly destructed after the 'a lifetime has ended. As long as it’s only possible to "leak" 'static types, the Guards-like objects aren’t unsafe yet.

By the way, you also can always

use parking_lot::{Mutex, const_mutex}; // or std::Mutex with lazy_static
fn slightly_better_bad_forget<T: Send + 'static>(x: T) {
    static GRAVEYARD: Mutex<Vec<Box<dyn Send>>> = const_mutex(Vec::new());
    GRAVEYARD.lock().push(Box::new(x))
}

Sure, but my point wasn't that this piece of code was problematic, it was that arbitrary code can't be checked, so you could come up with new ways to leak things, with safe code only (potentially inadvertantly), thus making it very hard to provide actual leak protection. This also get's into what's considered leaking?

fn main() {
    let x = vec![0; 1000];
    loop {}
}

Does this code leak memory?

fn main() {
    let x = vec![0; 1000];
    if condition() { loop {} }
}

How about this code?

If the only way to enforce leak protection is through heavy restrictions on what you can do, it's not useful. If you don't have heavy restrictions on what you can do, you'll accidentally leak things. So this isn't a useful property to have for Rust.

Finally, I could have used scoped threads from rayon or crossbeam, that way we don't need the 'static, but that's harder to work into a minimal example.

5 Likes

I don't quite have an opinion on the OP. But would like to throw in my two cents about the above.

I would argue that either leak memory as both programs continue to execute and still have valid accesses to all heap allocated memory.

Maybe that isn't the strongest definition of "memory leak" but I think it is at least consistent with some people's mental model.

(EDIT: This was somewhat incorrect; see explanation in next post)

I would like to point out that, as far as I can tell, there's a fundamental difference between mem::forget() and the other methods of leaking. There are many ways to make objects that live in memory forever, but as far as I can tell, mem::forget() (and its cousin, ManuallyDrop) is the only way in safe Rust to break the condition that an object can't have its lifetime end without its destructor being run. All three examples on the Rustonomicon page about leaking are only a problem when you break that condition, and wouldn't a problem if the only permissible leaking was through Rc cycles, Box::leak, etc.

Thus, I think it was a historical mistake to make mem::forget() a safe function (while Rc cycles and Box::leak are okay). But I don't know if there's any reasonable way to fix it now.

This is a very easy trap to fall into, but isn't true. This example adapted from rio's issue tracker and is originally by @dtolnay. (rio is unsound because it relies on the property that memory is either dropped or never reclaimed. But please don't dogpile on the project, it's a known issue and probably well enough disclaimed.)

To be completely fair here, if you're saying lifetime as in the uniquely owned memory behind an indirection (i.e. Box), I don't have a counterexample, and I think you're correct in that extremely minimal understanding[1]. But if you're saying lifetime as in the captured 'a lifetime, and borrowed pointers, well, counterexample.

use std::{
    cell::Cell,
    marker::{PhantomData, Send},
    ptr,
    sync::Arc,
    thread,
};

#[derive(Debug)]
enum Buffer {
    Inline([usize; 3]),
    Outline(Vec<usize>),
}

impl Buffer {
    fn as_mut(&mut self) -> &mut [usize] {
        match self {
            Buffer::Inline(buf) => buf,
            Buffer::Outline(buf) => buf,
        }
    }
}

struct BufWriter<'a>(*mut [usize], PhantomData<&'a mut [usize]>);

impl Drop for BufWriter<'_> {
    fn drop(&mut self) {
        eprintln!("BufWriter may never be dropped");
        std::process::abort();
    }
}

struct TrustMe<T>(T);
unsafe impl<T> Send for TrustMe<T> {}

impl<'a> BufWriter<'a> {
    fn new(buf: &'a mut [usize]) -> Self {
        assert_eq!(buf.len(), 3);
        let this = BufWriter(buf, PhantomData);
        let ptr = TrustMe(this.0 as *mut [usize; 3]);
        thread::spawn(move || loop {
            // [INCORRECT] SAFETY: as this BufWriter can
            // never be dropped, it must be valid to continue
            // to write into the buffer it holds.
            unsafe { ptr::write(ptr.0, [1, 1, 1]) };
        });
        this
    }
}

struct Oops<T>(T, Cell<Option<Arc<Self>>>);

fn main() {
    let mut buf = Buffer::Inline([0, 0, 0]);
    let w = BufWriter::new(buf.as_mut());
    let oops = Arc::new(Oops(w, Default::default()));
    oops.1.set(Some(oops.clone()));
    drop(oops);
    buf = Buffer::Outline(vec![]);
    thread::sleep_ms(100);
    dbg!(buf);
}

It's obviously too late to change the fact that mem::forget/ManuallyDrop are safe. But even if I could go back in time and bring my current knowledge about Rust, I'd still argue on the side of making them safe. The benefit of ManuallyDrop for drop ordering and writing unwind-correct code is huge, and the benefit of "stack forgetting" being impossible is miniscule if present at all.

[1]: In fact, I think you could potentially argue that this weakened form of "only behind an owned pointer" "leaked == forever alive" is true even with stack forget, because the whole idea of forget is that the destructor isn't run, so the out-of-line data isn't cleaned up. The only possible benefit remaining is the "stack part" remaining alive, which isn't even potentially useful, because you can move it around freely, so there's no way to know where it is.


This is a very long-winded way to say that no, while mem::forget/ManuallyDrop are in fact the only way to not run destructors for a value on the stack (beyond aborting the process), removing the ability to do so does, in fact, give you no extra guaranteed properties about any object. Borrowck considers leaked objects to no longer have lock on any captured 'lifetimes.

3 Likes

Aha, thank you, that's a very interesting example! In the past, I've asked multiple very smart people for examples of how to implement something like mem::forget() in safe code, and none of them were able to produce an example, so I figured it wasn't possible.

As you say, this still doesn't violate the condition that an object can't be deallocated without its destructor being run, but out of the three problems on the Rustonomicon page, that only helps with the Rc problem and not the Drain or JoinGuard problems.

Assuming that I understood the question correctly, I would like to provide a counter-example:

#[derive(Clone, Debug)]
struct Noisy(i32);
impl Drop for Noisy {
    fn drop(&mut self) {
        println!("dropping!!!");
    }
}

fn main() {
    let mut v = vec![Noisy(0); 1000];
    let p = &v[42] as *const Noisy; // pointing to a Noisy(0)
    let d = v.drain(..);
    Box::leak(Box::new(d)); // could use Arc cycle as well
    drop(v); // no "dropping!!!" noise to be heard
    let mut v = vec![1337; 1000];
    
    // so let’s see if the memory behind "p" still exists:
    unsafe {
        println!("{:?}", &*p); // -> prints "Noisy(1337)"
    }
}

(playground)

Here, the Vec::drain implementation uses the pre-pooping your pants approach to leak its contents when the Drain object is leaked. Afterwards it has no problem with de-allocating the memory; it’s then re-used by the next vector (or equivalently, perhaps even more reproducibly, by re-filling the same vector with new contents).

The above also @elidupree.

By the way, @CAD97, you can make your footnotes more fancy by using <sup>[1]</sup>, like this[1] :wink:

3 Likes

Yeah, it's for reasons like this that trying to say what sticks around in the face of leaks is difficult. I think the concession I'm dancing around is in actuality the exact operation described by Box::leak: if you don't drop a box then the data in the box will remain valid.

So here I'd be referring to any memory owned by the Drain, which is none. (If you want to get language lawyer-y, you could say that the function (stack) owns the memory that the Drain's "stack portion" is resident in.)

And yeah I know the <sup> trick, and in fact did exactly that elsewhen today, but sometimes writing posts on a smartphone makes me lazy :stuck_out_tongue:

1 Like

This comes up every now and then, so I wonder what a good place would be to put this example? Here's just the forget part extracted from the previous example:

use std::cell::Cell;
use std::rc::Rc;

pub fn forget<T>(x: T) {
    struct Oops<T>(T, Cell<Option<Rc<Self>>>);
  
    let oops = Rc::new(Oops(x, Cell::new(None)));
    oops.1.set(Some(oops.clone()));
}

(I remember already posting some equivalent code somewhere this year, but forgot where.^^)

The forget docs already explicitly say

forget is not marked as unsafe , because Rust's safety guarantees do not include a guarantee that destructors will always run. For example, a program can create a reference cycle using Rc , or call process::exit to exit without running destructors. Thus, allowing mem::forget from safe code does not fundamentally change Rust's safety guarantees.

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.