Dropping with arguments ("higher RAII")

A few days ago I was reading about the Vale programming language and what the author calls "higher RAII". Naming aside, I think it is an interesting feature that would fit Rust well, but I don't think it can be emulated at the moment.

In short, the idea would be (in Rust terms) to generalize dropping to arbitrary functions with parameters, and have the compiler enforce that you call such a function. For example, suppose you have some struct representing a connection for some protocol. The connection should always end with an integer exit code: closing it sends a final message containing the code and then terminates the connection. You could have a fn close(self, exit_code: u8) method that "swallows" self, so you cannot use the connection afterwards. But nothing in the compiler forces you to call that method; if you don't, the struct will just be dropped as normal.
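To make the gap concrete, here is a minimal sketch of the status quo. The Connection type, its sent log field, and the function names are hypothetical stand-ins for illustration, not a real protocol API. Both helper functions compile without any diagnostic, even though only one of them calls close.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical connection; `sent` stands in for the wire so we can inspect it.
struct Connection {
    sent: Rc<RefCell<Vec<String>>>,
}

impl Connection {
    /// Consumes the connection and "sends" the final message with the exit code.
    fn close(self, exit_code: u8) {
        self.sent.borrow_mut().push(format!("exit {exit_code}"));
    }
}

/// Calls `close`, as intended.
fn well_behaved(sent: Rc<RefCell<Vec<String>>>) {
    let conn = Connection { sent };
    conn.close(3);
}

/// Forgets to call `close`; the compiler accepts this and drops `conn` silently.
fn forgetful(sent: Rc<RefCell<Vec<String>>>) {
    let _conn = Connection { sent };
}

fn main() {
    let sent = Rc::new(RefCell::new(Vec::new()));
    well_behaved(Rc::clone(&sent));
    forgetful(Rc::clone(&sent));
    // Only one final message was recorded, and nothing warned about the other.
    println!("{:?}", sent.borrow());
}
```

The second connection never sends its exit code, which is exactly the mistake the proposed feature would turn into a compile-time diagnostic.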

It would be nice to be able to enforce something like this. For example, somehow making drop directly accessible only from within the module where the struct is defined, so you would be forced to explicitly transfer the ownership of the struct somewhere else.

Is it possible to emulate something like this in Rust at the moment? Would it be worth considering as a language feature?

From your description, that sounds like linear typing. One of the good posts on linear typing in Rust that I remember is this:

Linear types would be excellent. Unfortunately they aren't a drop-in slam dunk, and there are a lot of sharp edges to work out. There's a lot written on the subject; here's one more recent item which links to about 10 other sources. It honestly feels like Rust is adding surface area that linear types would have to address (e.g. async) faster than any progress is being made on them, but perhaps that's okay.

Thank you @Nemo157 and @toc for the replies and links, great reads. I was sure that smarter people from the Rust community would have given this the thought it deserves, but I didn't know the formal concept for it.

Could this feature be used when I want to destroy an instance of some class using a specific allocator, but I don't want to store a reference to that allocator inside each object?

There would be nothing ensuring that the right allocator is used to deallocate again, right?

I wonder if the compiler should really enforce this? Perhaps something more lightweight akin to the #[must_use] attribute. A pair of #[must_move] and #[ignore_must_move] would require that a type is moved instead of dropped, but still allow an escape hatch to eventually drop the value.

#[must_move]
struct Connection;

fn main() {
  let conn = Connection;
  // ... Do stuff to the connection
  close(conn, 1); // We've moved the connection, so no warning!
}

fn close(conn: Connection, code: u8) {
  #[ignore_must_move]
  let _ = conn;
}

Note that drop() should probably not use #[ignore_must_move], because then any user could just drop the object explicitly (which is not terrible, but would let users accidentally bypass your intent).

How would this interact with generics? For example, drop(conn) is technically a move of conn, but internally drop neither uses #[ignore_must_move] nor moves its parameter, because it is generic over T, and generic arguments are assumed not to be #[must_move].

Tbh I don't have too strong a grasp on how generics actually work in the compiler. I was under the assumption that when drop gets monomorphized, it will see that the type is #[must_move] and then it will generate the warning. Is that not the case?

Note: let _ = conn does not move/drop conn. If you want to drop it, you need to write drop(conn). This also makes #[ignore_must_move] unnecessary, since calling drop(conn) moves conn into the function call.
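A small sketch can confirm the difference. The Noisy type and its counter are made up for this illustration; the counter simply records how many times the destructor has run by the time each function returns its reading.

```rust
use std::cell::Cell;

/// Hypothetical type whose destructor bumps a shared counter.
struct Noisy<'a>(&'a Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Number of destructor runs observed right after `let _ = conn;`.
fn drops_after_underscore() -> u32 {
    let counter = Cell::new(0);
    let conn = Noisy(&counter);
    let _ = conn; // `_` binds nothing, so `conn` is neither moved nor dropped here
    counter.get() // `conn` is still alive at this point
}

/// Number of destructor runs observed right after `drop(conn);`.
fn drops_after_explicit_drop() -> u32 {
    let counter = Cell::new(0);
    let conn = Noisy(&counter);
    drop(conn); // moves `conn` into `drop`, which runs the destructor immediately
    counter.get()
}

fn main() {
    println!("after `let _ = conn`: {}", drops_after_underscore());
    println!("after `drop(conn)`:   {}", drops_after_explicit_drop());
}
```

In the first function the destructor only runs when conn goes out of scope at the end of the function, after the counter has been read.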

Additionally, a majority (though not all) of types which would want to be #[must_use] are so because they are Write-like. But for these,

The difficulty with any sort of linear typing/hinting is that it disappears when given to a generic interface (e.g. drop again). That's not to say a warning wouldn't be useful (#[must_use] is a lint and very effective). But it would be somewhat limited by that and easy to "defeat."

I suppose there's an alternative: warn if passing a #[must_move] type to a generic parameter which isn't itself annotated with #[must_move][1] (and thus does not get unmoved_must_move warnings). But personally I'd still mark fn std::mem::drop as taking #[must_move], since an explicit drop is specifically requesting the nullary (no further arguments) final use.

While it's possible to come up with a reasonable semantic for generic functions that maintains the lint, the real problem is generic types. What about Box<Connection>, or Arc<Connection>, or Mutex<Connection>, etc.? This is a definite case where #[must_move] will end up covered unless there's some way to inherit "must_movedness." It's not sufficient to always inherit because of borrowing types (e.g. MutexGuard<'a, Connection>), and it's not singly sufficient[2] either to infer based on whether a field owns that type by value, since dropping types may involve dropping types they don't directly contain, e.g. containers based off of raw pointers.

Then there's also many-collections to consider; dropping Vec<T> drops zero-or-more T. You'd think .into_iter()ing it would be sufficient, but dropping vec::IntoIter<T> also drops zero-or-more T. If you want to avoid false positives, but still lint if the Vec could nullary-drop some #[must_move] objects, you'll need some way to cover the #[must_move] lint when the types are consumed by Iterator::for_each[3] or a for loop... except not the for loop if it potentially breaks iteration before exhausting the iterator.
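The following sketch shows the two cases side by side. Token, its closed flag, and the implicit_drops counter are hypothetical; the counter records every token that reaches its destructor without having been closed first.

```rust
use std::cell::Cell;

/// Hypothetical must-move value: it should be finished with `close`.
struct Token<'a> {
    closed: bool,
    implicit_drops: &'a Cell<u32>,
}

impl Token<'_> {
    /// The intended final use.
    fn close(mut self) {
        self.closed = true;
    }
}

impl Drop for Token<'_> {
    fn drop(&mut self) {
        // Count tokens that were dropped without going through `close`.
        if !self.closed {
            self.implicit_drops.set(self.implicit_drops.get() + 1);
        }
    }
}

fn make_tokens(counter: &Cell<u32>) -> Vec<Token<'_>> {
    (0..3)
        .map(|_| Token { closed: false, implicit_drops: counter })
        .collect()
}

/// A `for` loop that breaks early leaves the rest to `vec::IntoIter`'s destructor.
fn early_break() -> u32 {
    let counter = Cell::new(0);
    for token in make_tokens(&counter) {
        token.close();
        break;
    }
    counter.get()
}

/// `for_each` visits every element, so every token is closed explicitly.
fn exhaustive_for_each() -> u32 {
    let counter = Cell::new(0);
    make_tokens(&counter).into_iter().for_each(|token| token.close());
    counter.get()
}

fn main() {
    println!("implicit drops with early break: {}", early_break());
    println!("implicit drops with for_each:    {}", exhaustive_for_each());
}
```

A lint that wants to accept the second function but flag the first would need to reason about whether the iterator is exhausted, which is the difficulty described above.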

Ultimately, doing linear types as a lint/warning has all of the same issues as doing it as a trait/error. The only difference is that it can't be relied upon, and that imprecision is less immediate of a problem since it's "just a lint," so missing or spurious restrictions are "only" unfortunate footgun potential rather than build-breaking errors. But spurious warnings are still highly problematic, since they directly contribute to warning fatigue and just turning off the lint entirely.

But, to be completely fair, #[must_use] is a very simple lint which is "defeatable" by all of these things, and is still a useful lint. #[must_move] would still be useful even if it only works in extremely simple cases[4]. But trying to go further than that (i.e. by making generics sometimes cover #[must_move] responsibility and sometimes not) only leads to issues coming from expecting the lint to be more powerful than it is (or potentially even can be).

What's interesting is that Rust actually does have very minimal, extremely limited support for linear types: in a const context, you can take a generic type parameter by value, but you are not allowed to drop it, or anything that captures it and (potentially) has drop glue, because the drop glue is not necessarily valid in a const context. There it's important that the check is conservative and overestimates types which could potentially run nonconst drop glue, whereas the lint would prefer to be conservative in the other direction and avoid any false positives, such that the lint is actually useful when it fires.
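As a rough illustration of that const restriction (a sketch assuming current stable Rust; the function names are invented here), a generic const fn may hand its argument back but may not let it fall out of scope:

```rust
/// Accepted: `value` is moved back out to the caller, so no drop glue for `T`
/// ever has to run inside the const fn.
const fn pass_through<T>(value: T) -> T {
    value
}

// Rejected on stable Rust at the time of writing (error E0493), because `T`
// might have a destructor that cannot be evaluated at compile time:
//
// const fn discard<T>(_value: T) {}

const ANSWER: u32 = pass_through(42);

fn main() {
    println!("{ANSWER}");
}
```

Within the body of the generic const fn, T therefore behaves much like a linear type: the only thing you can do with the value is move it somewhere else.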

But it's worth noting that even the const linearity ignores drops as a result of panicking/unwinding. Every proposal for linearity in Rust (even those around async drop/cancellation) still provides that a nullary (and synchronous/blocking) drop must be possible/available in the case of an unwind. But what makes linear types (or "higher RAII") truly powerful is the ability to provide extra arguments to cleanup which are required in order to destroy the value, which is unfortunately just fundamentally incompatible with unwinding[5].

Rust tries very hard to keep all warnings/errors pre-monomorphization, based solely on the generic signature and independent of what types the generics are instantiated with. Most monomorphization isn't even done during cargo check, and only performed once required by cargo build in order to lower to the codegen backend. (The exception is when things are used by a top-level (nongeneric) const context, in which case they're monomorphized in order to evaluate them.)


  1. This could be a place for #[ignore_must_move], but can be addressed just by suppressing the lint, e.g.

    #[allow(unmoved_must_move)]
    fn drop<#[must_move] T>(_: T) {}
    

    It probably makes sense to also allow applying #[must_move] to traits in this scenario, and treat generics bound by such traits as #[must_move]; the presumption being such a trait would provide a way to finally consume the value. ↩︎

  2. Since it's just a lint, it's probably sufficient to approximate and not propagate #[must_move] unless a field has a #[must_move] type (which includes the generic type), and let pointer-based containers use PhantomData to state by-example that they own (and thus drop) the generic type by-value. Ultimately, the question of "do you drop the generic parameter" is related to #[may_dangle], but it's still a distinct property. ↩︎

  3. Perhaps Iterator::for_each can mark the self receiver as #[must_move] to adopt responsibility for consuming that argument? ↩︎

  4. Specifically, the minimal support is just those cases where #[must_use] would already fire, plus local bindings with a concrete type that is annotated with #[must_move]. The one small extension to that which I don't think would be problematic would be to also lint for pub fields (recursive) which are of a #[must_move] type and the containing struct(s) does not implement Drop (i.e. the field actually can be individually moved out of). ↩︎

  5. However, a bridge can be somewhat constructed using a kind of defer statement to create a nullary drop that has the ability to refer to other stack bindings. And this can be useful for dependency injection kinds of things, such as sharing an allocator between multiple collections (rather than each carrying a clone of the allocator handle like required currently). But when any function can potentially unwind, such a nullary drop essentially must be provided when a binding is bound, if implicit nullary drops are to be prevented. ↩︎

Would a defer statement that wraps a type be able to achieve this?

struct Connection {}
let conn = Connection{};
let conn = conn.send_message("goodbye, world!").defer;
conn.send_message("hello, world!"); // using Deref/DerefMut

If .defer returned some generated type, like an impl Defer<Output = Connection> that wrapped the value and the arguments, then couldn't we assume that send_message("goodbye, world!") would probably run, either as part of the destructor or by the user calling a conn.take() function?

It's still not exactly a linear/must use type. It's more of a "probably used" type, but would that be enough?

It would be annoying to work with impl Defer<Output = impl Future<Output = T>> types though.

The scopeguard crate provides this via a function: let conn = guard(conn, |conn| conn.send_message("goodbye, world!"));

The trick is that any library-ish solution has to deal with closure captures. A first-class defer would be able to integrate with the borrow checker in such a way that any "captured" values are still usable, since the checked usage would only occur at the actual unwind/drop edges.
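For readers unfamiliar with the pattern, here is a hand-rolled sketch of such a guard using only the standard library. It is not scopeguard's actual implementation; Guard, guard, and the logging Connection are written just for this example.

```rust
use std::cell::RefCell;
use std::ops::{Deref, DerefMut};
use std::rc::Rc;

/// Hand-rolled guard: owns a value plus a closure to run on it when dropped.
struct Guard<T, F: FnOnce(T)> {
    inner: Option<(T, F)>,
}

fn guard<T, F: FnOnce(T)>(value: T, on_drop: F) -> Guard<T, F> {
    Guard { inner: Some((value, on_drop)) }
}

impl<T, F: FnOnce(T)> Deref for Guard<T, F> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.inner.as_ref().expect("guard already ran").0
    }
}

impl<T, F: FnOnce(T)> DerefMut for Guard<T, F> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.inner.as_mut().expect("guard already ran").0
    }
}

impl<T, F: FnOnce(T)> Drop for Guard<T, F> {
    fn drop(&mut self) {
        // Hand the value to the deferred closure exactly once.
        if let Some((value, on_drop)) = self.inner.take() {
            on_drop(value);
        }
    }
}

/// Hypothetical connection that records messages instead of sending them.
struct Connection {
    log: Rc<RefCell<Vec<String>>>,
}

impl Connection {
    fn send_message(&mut self, msg: &str) {
        self.log.borrow_mut().push(msg.to_string());
    }
}

fn demo() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let mut conn = guard(Connection { log: Rc::clone(&log) }, |mut c: Connection| {
            c.send_message("goodbye, world!")
        });
        conn.send_message("hello, world!"); // reaches Connection through DerefMut
    } // guard dropped here, so the deferred message is sent
    let messages = log.borrow().clone();
    messages
}

fn main() {
    println!("{:?}", demo());
}
```

Anything the closure needs beyond the guarded value has to be moved into or borrowed by the closure for the guard's whole lifetime, which is the capture problem mentioned above.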

I acknowledged this approach in a footnote:

Niko Matsakis' blog post "must move types" discusses this idea as well. It looks like @DaveTheSheep's suggestion of calling this must-move is an independent suggestion of the same name for the same idea.

The blog post presents it as a "negative trait", which resolves questions related to the interaction with generics.

The blog post contains unanswered questions about the interaction with panic. I agree with @Jon-Davis that a form of defer statement could be relevant there. It would make functions with multiple return/exit points more pleasant to write, and it may be part of the answer on what to do on panic.

Could you try something like:

struct Foo;

impl Drop for Foo {
  fn drop(&mut self) {
    compiler_error!("must transform to Bar before dropping")
  }
}

struct Bar(u8);

impl Foo {
  fn into_bar(self, x: u8) -> Bar {
    let _: () = transmute(self); // so it's not dropped right?
    Bar(x)
  }
}

Great idea. That can actually be made to work with post-mono errors:

Playground

Never mind, that only works when the type isn't used at all.

This one works though

Currently this pattern is limited in Rust, because it can only be checked dynamically at run-time.

BTW, it's better to use ManuallyDrop than transmute.
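A minimal sketch of the run-time variant, reusing the Foo and Bar names from the earlier post and swapping transmute for ManuallyDrop. It assumes the default panic = "unwind" setting, and the panic message is invented for the example.

```rust
use std::mem::ManuallyDrop;
use std::panic;

struct Foo;

impl Drop for Foo {
    fn drop(&mut self) {
        // Run-time "drop bomb": reaching this destructor means `into_bar` was skipped.
        panic!("Foo must be converted with into_bar before it is dropped");
    }
}

struct Bar(u8);

impl Foo {
    fn into_bar(self, x: u8) -> Bar {
        // Wrapping `self` in ManuallyDrop suppresses its destructor without unsafe code.
        let _disarmed = ManuallyDrop::new(self);
        Bar(x)
    }
}

fn main() {
    // The intended path: no panic.
    let bar = Foo.into_bar(7);
    println!("converted to Bar({})", bar.0);

    // The forbidden path is only detected when the destructor actually runs.
    let result = panic::catch_unwind(|| drop(Foo));
    println!("plain drop panicked: {}", result.is_err());
}
```

One caveat of this approach: if the bomb goes off while the thread is already unwinding from another panic, the process aborts.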

Not sure if you missed my previous reply, but I've demonstrated that this can be checked at compile time. It just doesn't provide very good errors.

Here's an improved version that is IMO very easy to use.

I'd like to expand on the idea of compile-time forbidding dropping a must-move value. What if instead of transforming the data into a type that implements Drop and holds the extra data, we made higher RAII be something like this?

trait HigherDrop: !Drop {
    type Args;
    type Ret;

    fn drop(self, args: Self::Args) -> Self::Ret; 
}

I'm specifically doing it like this for a very specific reason: async drop. If we generalize higher RAII as a must-move value that must be destructed with an explicit function call, and we make the function be this generic, we automatically get async drop:

#![feature(impl_trait_in_assoc_type)]

mod bikeshed {
    /// compiler magic: forbid drop and drop impls.
    pub trait HigherDrop {
        type Args;
        type Ret;

        fn drop(self, args: Self::Args) -> Self::Ret; 
    }
}


use core::future::Future;
use bikeshed::HigherDrop;

struct Foo;

impl HigherDrop for Foo {
    type Args = (String, usize);
    type Ret = impl Future<Output = ()>;
    
    fn drop(self, (str, n): Self::Args) -> Self::Ret {
        async move { println!("{str}, {n}") }
    }
}

#[tokio::main]
async fn main() {
    Foo.drop(("0".to_string(), 0)).await;
}

Can we get someone from T-lang to take a look at this? They probably have a better idea on what the plans are for async drop and the like.

We also have to deal with generics, and that's tough. In my mind, the most obvious solution is to statically forbid passing by value a type that implements HigherDrop as a generic parameter unless there are bounds for HigherDrop, as it'd be impossible for a generic function without HigherDrop bounds on its values to higher-drop them. Explaining stuff is always easier with bikeshedding, so:

struct Foo;

impl HigherDrop for Foo {
    type Args = ();
    type Ret = ();
    
    fn drop(self, _args: Self::Args) -> Self::Ret {}
}

fn generic<T>(_x: T) {}
fn generic_with_hd_bounds<T>(x: T)
where
    T: HigherDrop<Args = (), Ret = ()>
{
    x.drop(());
}

fn main() {
    let foo = Foo;
    // would raise a compile error:
    generic(Foo);
    // would eat foo just fine:
    generic_with_hd_bounds(foo);
}

I just realised that this covers forbidding explicit core::mem::drop calls pretty well; the new compiler magic would only have to contend with the preexisting compiler magic, i.e. implicit drops and Copy. If done this way, we'd have to compile-time forbid implicit drops on HigherDrop values, and have HigherDrop be mutually exclusive with Copy and Drop (like Drop currently is with Copy).
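For experimenting on stable today, the shape of the trait can be approximated as an ordinary trait, with the important caveat that nothing forbids implicit drops, so it only documents intent. In this sketch the method is renamed higher_drop to avoid confusion with Drop::drop, and Connection, peer, and finish are hypothetical names.

```rust
/// Plain-trait approximation of the proposed `HigherDrop`; no compiler magic,
/// so values can still be dropped implicitly.
trait HigherDrop {
    type Args;
    type Ret;

    fn higher_drop(self, args: Self::Args) -> Self::Ret;
}

struct Connection {
    peer: String,
}

impl HigherDrop for Connection {
    type Args = u8; // the exit code
    type Ret = String; // a summary of what was "sent"

    fn higher_drop(self, exit_code: u8) -> String {
        format!("closed {} with code {exit_code}", self.peer)
    }
}

/// A generic function can only finish `T` because it asks for the bound
/// and for the arguments the destructor needs.
fn finish<T>(value: T, args: T::Args) -> T::Ret
where
    T: HigherDrop,
{
    value.higher_drop(args)
}

fn main() {
    let conn = Connection { peer: "example".to_string() };
    println!("{}", finish(conn, 7));
}
```

The generic function has to receive the destructor arguments from its caller, which hints at how far such bounds would spread through generic code.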

If we generalize higher RAII as a must-move value that must be destructed with an explicit function call, and we make the function be this generic, we automatically get async drop

I don't think this fulfills most of the things people want out of async drop. In particular, requiring an explicit call to the drop function means that early returns become annoying and ? is not usable at all wherever a binding of a type implementing HigherDrop is alive. For some use cases, requiring explicit drop might be warranted (especially if dropping requires extra arguments), but I think most use cases for async drop don't fit that.

One fundamental problem with your HigherDrop trait is that it doesn't add capabilities to a type, it removes them, which is something traits don't usually[1] do. A natural formulation is to instead consider "implicitly droppable" as a capability, and make that a trait which gets implicitly implemented if HigherDrop is not implemented. Then everything we currently have gets implicitly bounded by this new ImplicitDrop (like with Sized), and you can use ?ImplicitDrop + HigherDrop to allow it to be dropped with arguments. Adding a new implicit trait has always been considered a very extreme change though, so this might not be accepted. It also means that the entire ecosystem needs to be adapted for working with the lack of this capability, essentially creating viral ?ImplicitDrop + HigherDrop bounds on everything that can work with HigherDrop.
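The existing Sized bound shows how such an opt-out looks in practice. This sketch uses only real, stable syntax; ImplicitDrop itself is hypothetical and does not appear. Every generic parameter gets T: Sized implicitly, and a function has to write ?Sized to accept types that lack the capability.

```rust
use std::fmt::Debug;

/// `T: Sized` is added implicitly, so `str` and `[i32]` are not accepted here.
fn show_sized<T: Debug>(value: &T) -> String {
    format!("{value:?}")
}

/// Opting out of the implicit bound lets unsized types through as well.
fn show_maybe_unsized<T: Debug + ?Sized>(value: &T) -> String {
    format!("{value:?}")
}

fn main() {
    let array = [1, 2, 3];
    println!("{}", show_sized(&array)); // T = [i32; 3]
    println!("{}", show_maybe_unsized(&array[..])); // T = [i32]
    println!("{}", show_maybe_unsized("hi")); // T = str
    // show_sized("hi"); // would not compile: `str` is not `Sized`
}
```

Generic code that wants to support unsized types has to opt in with ?Sized one signature at a time, and a ?ImplicitDrop bound would presumably need the same treatment.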

Another effect of the "implicit trait" formulation is that everything that exists now will require this trait, even language features. An example of this is your observation that generic functions need to be restricted (which comes naturally thanks to the implicit trait not being satisfied), but it isn't the only one:

  • you have to disallow converting HigherDrop types to trait objects, since they always assume the type to be droppable

    • and since HigherDrop is not object safe you can't even have a HigherDrop trait object, which would not be entirely unreasonable
  • you have to disallow any kind of control flow that would implicitly drop those types

    • this includes panics, which is particularly bad because you have to pessimistically assume a function may panic unless proven otherwise. For example, just introduce a println in your function and now it may panic; see how this breaks @pitaj's proposal: Rust Playground

    • this also includes .await points, essentially either disallowing storing HigherDrop types in Futures, or propagating the HigherDrop requirement to the Future itself, and making it incompatible with most APIs we currently have.
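The panic case is easy to observe today. In this sketch (hypothetical Conn type, default panic = "unwind" assumed), a value that was supposed to be closed explicitly is dropped by the unwinder instead:

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical connection that records whether its destructor ran.
struct Conn {
    dropped: Arc<AtomicBool>,
}

impl Drop for Conn {
    fn drop(&mut self) {
        self.dropped.store(true, Ordering::SeqCst);
    }
}

/// Returns true if the value was dropped implicitly while unwinding.
fn dropped_during_unwind() -> bool {
    let flag = Arc::new(AtomicBool::new(false));
    let conn = Conn { dropped: Arc::clone(&flag) };
    let result = panic::catch_unwind(move || {
        let _conn = conn;
        // Any call that may panic would do; the explicit close is never reached.
        panic!("simulated failure before the connection could be closed");
    });
    result.is_err() && flag.load(Ordering::SeqCst)
}

fn main() {
    println!("dropped implicitly during unwind: {}", dropped_during_unwind());
}
```

A type whose only destructor takes arguments has nothing sensible to run at that point, which is why panics are the hardest case for any of these proposals.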


  1. Edit: to be fair though Drop already breaks this rule. ↩︎
