An approach to linear-ish types

JarredAllen · July 1, 2024, 11:30pm

I've seen a bunch of people talking about trying to make a new trait for linear types (or undroppable, or unleakable), which I'm convinced would be a lot of work. However, I thought of something which might be easier to implement and still get most of the benefits: Instead of a full trait, we could make an annotation, like #[must_use], that raises a lint.

Has this approach to "linear" types been discussed anywhere else? I find it hard to believe that I'd be the only one to think of this, but a cursory search didn't find anything.

My idea

We make a new attribute #[must_consume] (can bikeshed the name and whether it belongs under the diagnostic namespace), which can annotate a struct/enum/union definition and marks the type as one which must be consumed.

For the purposes of this attribute, "consuming" the value can be either destructuring it or calling a method (only methods, not other functions) on that value which 1) takes ownership of the value, and 2) is defined in the same crate as the value.

As with #[must_use] you can also supply a string message to be displayed in a hint if a value is not consumed. Unlike #[must_use], a simple let _ = does not suffice to silence the lint (the only workaround is an #[allow(..)] where the value is dropped).

For example, with the following definition:

#[must_consume]
struct Foo {
    pub bar: u8,
    baz: u8,
}

The following function, defined in the same module as Foo, is allowed because it destructures the Foo value:

fn destruct_foo(foo: Foo, noisy: bool) {
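    // A free function has no `Self`/`self`, so destructure the `foo` argument by name.
    let Foo { bar, baz } = foo;
    if noisy {
        // Literal braces in a format string must be doubled.
        println!("Destructing `Foo {{ bar: {bar}, baz: {baz} }}`");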
    }
}

This destruct_foo function would also be allowed outside of the module, if baz were a public field.

This method, also defined in the same module as Foo, is also legal because it's a method on Foo that takes ownership and is defined in the same crate:

impl Foo {
    fn consume(self) {
        // Literal braces in a format string must be doubled.
        println!("Consuming `Foo {{ bar: {}, .. }}`", self.bar);
    }
}

However, replacing this method with a free function or writing this method outside of the crate defining Foo (e.g. through a trait implementation) would trigger the lint.

Because Foo has a private field, any Foo values created outside of the defining crate can only be destroyed by passing the value into the destruct_foo or Foo::consume functions defined in the crate. Note that it is allowed to do so indirectly, so e.g. these functions are allowed:

use foo::Foo;

fn take_foo(foo: Foo) {
    foo.consume();
}

fn take_foo_2(foo: Foo) {
    take_foo(foo);
}

Edge cases

If a value is wrapped in another value (e.g. a struct containing this type as a field), then it should still warn if the outer value is dropped, as that drops the inner value.

I'm not sure how easy this would be to do, but it'd also be nice if this applied through generic types, so you can't accidentally ignore this by sticking the value inside a wrapper type. The analysis can't perfectly extend through Rc and other types that may or may not drop values based on runtime behavior, but ideally we'd get as many ways of dropping a value as possible.

Limitations

The main limitation is that, as a lint, this can't stop users from writing #[allow(drop_must_consume)] (or whatever we call the lint). That's why I titled the post "linear-ish types": it only incentivizes correct use, it doesn't require it. Thus, unsafe code can't rely on one of these functions being called for soundness (any code can rely on it for correctness, though the stdlib probably doesn't want to).

Also, this doesn't truly enforce linear types, even without lint allows, since you can e.g. make an Rc cycle or write to a static variable to leak the value.
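To make the Rc-cycle caveat concrete, here is a minimal sketch. Tracked and Node are hypothetical names invented for illustration, and Tracked stands in for a type that would carry the proposed attribute. The destructor records whether it ran, so the leak is observable from safe code:

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

/// Hypothetical stand-in for a `#[must_consume]` type. It records when its
/// destructor runs so that a leak can be observed.
struct Tracked {
    dropped: Rc<Cell<bool>>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

/// A node that can hold a strong reference to itself.
struct Node {
    _payload: Tracked,
    next: RefCell<Option<Rc<Node>>>,
}

/// Control case: an ordinary drop runs the destructor.
fn dropped_normally() -> bool {
    let dropped = Rc::new(Cell::new(false));
    drop(Tracked { dropped: Rc::clone(&dropped) });
    dropped.get()
}

/// Puts a `Tracked` into an `Rc` cycle and reports whether it was dropped.
fn leak_via_cycle() -> bool {
    let dropped = Rc::new(Cell::new(false));
    let node = Rc::new(Node {
        _payload: Tracked { dropped: Rc::clone(&dropped) },
        next: RefCell::new(None),
    });
    // The node now keeps itself alive.
    *node.next.borrow_mut() = Some(Rc::clone(&node));
    // The strong count goes from 2 to 1 and never reaches 0.
    drop(node);
    dropped.get()
}
```

Here dropped_normally() returns true and leak_via_cycle() returns false. No drop of Tracked is visible anywhere in leak_via_cycle, so a purely static lint would have little to flag.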

Cases where it helps

Async Drop

This would help with the async Drop problem, as you can raise a warning when people hit the drop code. For example:

#[must_consume]
struct Session { /* private members */ }

impl Session {
    async fn close(self, cx: ..) { .. }
}

This will heavily encourage library users to call Session::close to gracefully close a session when they're done with the session, instead of dropping the handle. It's not a guarantee that it will get called (it could get leaked, or the lint could be allowed), but this makes it harder to accidentally misuse the Session type.

Return from Drop

There are a lot of types that do something like this:

struct IoResource { /* private members */ }

impl IoResource {
    /// Close the resource because we're done with it.
    fn close(self) -> io::Result<()> { .. }
}

impl Drop for IoResource {
    /// Attempt to close the resource.
    ///
    /// Prefer [`Self::close`] because it can return an error if something goes wrong.
    fn drop(&mut self) { .. }
}

If the type is now annotated with #[must_consume = "Close the `IoResource` by calling `close` to allow handling an error"], then users will get a lint unless they explicitly call IoResource::close on the value (up to the caveats mentioned above). The lint message suggests calling the close method instead, which returns a Result, and since Result is #[must_use], the caller must also decide how to handle that.
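For reference, a minimal sketch of the close-or-drop pattern as it can be written today, without the proposed attribute. The internals are assumptions (the post elides them): this IoResource wraps an arbitrary writer, and Failing is a hypothetical writer used only to provoke an error.

```rust
use std::io::{self, Write};

/// Hypothetical resource wrapping any writer.
struct IoResource<W: Write> {
    // `None` once `close` has taken the writer out.
    writer: Option<W>,
}

impl<W: Write> IoResource<W> {
    fn new(writer: W) -> Self {
        Self { writer: Some(writer) }
    }

    /// Close the resource and report any flush error to the caller.
    fn close(mut self) -> io::Result<()> {
        match self.writer.take() {
            Some(mut w) => w.flush(),
            None => Ok(()),
        }
        // `self` is dropped after this, but `writer` is `None`,
        // so the `Drop` impl below has nothing left to do.
    }
}

impl<W: Write> Drop for IoResource<W> {
    /// Fallback path: flush, but the error has nowhere to go.
    fn drop(&mut self) {
        if let Some(mut w) = self.writer.take() {
            let _ = w.flush();
        }
    }
}

/// Hypothetical writer whose `flush` always fails.
struct Failing;

impl Write for Failing {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        Ok(buf.len())
    }
    fn flush(&mut self) -> io::Result<()> {
        Err(io::Error::new(io::ErrorKind::Other, "flush failed"))
    }
}
```

With this sketch, IoResource::new(Failing).close() surfaces the error, while simply dropping the value swallows it silently. That silent path is the one the proposed lint is meant to flag.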

If you want to be even more pedantic about closing the resource, you could even do:

fn close(self) -> Result<(), (io::Error, Self)> { .. }

to force the caller to retry closing the resource until it succeeds (probably not good API design most of the time, but you can do it).
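A rough sketch of what such a retry loop could look like on the caller's side. Everything here is hypothetical: failures_left simulates a close that fails a fixed number of times, and close_with_retry with its max_attempts parameter is an invented helper, not part of the proposal.

```rust
use std::io;

/// Hypothetical resource whose `close` fails a fixed number of times.
struct IoResource {
    failures_left: u32,
}

impl IoResource {
    /// On failure, hand the resource back together with the error.
    fn close(mut self) -> Result<(), (io::Error, Self)> {
        if self.failures_left == 0 {
            Ok(())
        } else {
            self.failures_left -= 1;
            let err = io::Error::new(io::ErrorKind::Interrupted, "close interrupted");
            Err((err, self))
        }
    }
}

/// Retries `close` up to `max_attempts` times and returns the number of
/// attempts on success. On giving up, the resource is returned to the caller.
fn close_with_retry(
    mut res: IoResource,
    max_attempts: u32,
) -> Result<u32, (io::Error, IoResource)> {
    let mut attempts = 0;
    loop {
        attempts += 1;
        match res.close() {
            Ok(()) => return Ok(attempts),
            Err((err, back)) if attempts >= max_attempts => return Err((err, back)),
            // Take the resource back and try again.
            Err((_err, back)) => res = back,
        }
    }
}
```

Note that when the helper gives up, the caller holds an unconsumed IoResource again, so under the proposed lint the question of what to do with it is only moved, not removed.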

pitaj · July 2, 2024, 12:53am

You can achieve something like this with post-mono errors:

mod undrop {
    #[doc(hidden)]
    pub struct Helper<const D: bool>;
    impl<const D: bool> Drop for Helper<D> {
        fn drop(&mut self) {
            const {
                if !D {
                    panic!("this type cannot be dropped");
                }
            }
        }
    }
    
    /// A marker type which cannot be dropped.
    ///
    /// Due to how struct/enum fields are automatically
    /// dropped, placing this type at any depth will
    /// cause an implicit or explicit drop of the
    /// containing type to throw a post-mono error.
    pub type PhantomUndrop = Helper<false>;
    impl PhantomUndrop {
        pub fn drop(self) {
            let droppable: Helper<true> = unsafe {
                std::mem::transmute(self)
            };
            std::mem::drop(droppable);
        }
    }
    #[allow(non_upper_case_globals)]
    pub const PhantomUndrop: PhantomUndrop = Helper;
}


idanarye · July 2, 2024, 9:08am

The main problem remains - what do you do with early returns?

fn some_function() -> Result<(), SomeErrorType> {
    let mut io_resource = IoResource::new();
    let value = io_resource.read_value();
    let processed_value = process_value(value)?;
    io_resource.write_value(processed_value);
    io_resource.close();
    Ok(())
}

Note that I purposefully ignored the errors from IoResource's methods, so that we only have to consider one early return and so that it'd be clear that this early return does not also consume the value.

Should this function trigger the lint? Linear typing rules say yes. But this will make error management a nightmare, because as soon as a linearly typed value enters the scope we can't use ? anymore and have to resort to:

fn some_function() -> Result<(), SomeErrorType> {
    let mut io_resource = IoResource::new();
    let value = io_resource.read_value();
    let processed_value = match process_value(value) {
        Ok(ok) => ok,
        Err(err) => {
            io_resource.close();
            return Err(err.into())
        }
    };
    io_resource.write_value(processed_value);
    io_resource.close();
    Ok(())
}
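One possible way to keep ? while still closing on every return path (a workaround not proposed in this thread) is to run the fallible steps inside a closure that only borrows the resource, and consume it once afterwards. The sketch below uses hypothetical stand-ins for IoResource, SomeErrorType and process_value, and passes the resource in as a parameter so the close call can be observed:

```rust
use std::cell::Cell;
use std::rc::Rc;

#[derive(Debug, PartialEq)]
struct SomeErrorType;

/// Hypothetical stand-in for the `IoResource` above; counts `close` calls.
struct IoResource {
    closed: Rc<Cell<u32>>,
    input: i32,
}

impl IoResource {
    fn read_value(&mut self) -> i32 {
        self.input
    }
    fn write_value(&mut self, _value: i32) {}
    fn close(self) {
        self.closed.set(self.closed.get() + 1);
    }
}

/// Hypothetical processing step: rejects negative values.
fn process_value(value: i32) -> Result<i32, SomeErrorType> {
    if value < 0 {
        Err(SomeErrorType)
    } else {
        Ok(value * 2)
    }
}

fn some_function(mut io_resource: IoResource) -> Result<(), SomeErrorType> {
    // The fallible work only borrows the resource, so `?` leaves the
    // closure rather than the whole function.
    let result = (|| -> Result<(), SomeErrorType> {
        let value = io_resource.read_value();
        let processed_value = process_value(value)?;
        io_resource.write_value(processed_value);
        Ok(())
    })();
    // Single consumption point, reached on both the Ok and the Err path.
    io_resource.close();
    result
}
```

This does not cover panics: if the closure unwinds, the resource is still dropped without close being called.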

SkiFire13 · July 2, 2024, 9:50am

JarredAllen:

Note that it is allowed to do so indirectly, so e.g. these functions are allowed:
use foo::Foo;

fn take_foo(foo: Foo) {
    foo.consume();
}

fn take_foo_2(foo: Foo) {
    take_foo(foo);
}

This breaks the encapsulation principle, as now whether take_foo_2 is valid or not depends on the body of take_foo and not just its signature.

NoamB · July 2, 2024, 9:54am

Wouldn't take_foo() fail to compile on its own, because it is not allowed to be the final consumer of a Foo?

In this case, I would expect take_foo_2 to compile.

SkiFire13 · July 2, 2024, 10:07am

According to OP however that code should compile.

Vorpal · July 2, 2024, 10:15am

That is a neat trick. But how do you drop it on the code path that you want? I thought the idea of linear types was that you need to call a specific method to consume it, and that it (or they) alone are allowed to drop the type. To me it looks like that PhantomUndrop can never be "disarmed"?

2e71828 · July 2, 2024, 11:48am

The inherent method PhantomUndrop::drop disarms and consumes the PhantomUndrop. So, if you have a struct S that contains a PhantomUndrop, your cleanup function takes S by value so that it can be destructured and then calls PhantomUndrop::drop explicitly on the relevant field.
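A sketch of that usage, with the undrop module copied from pitaj's post so the example is self-contained. The struct S with its new and finish methods is hypothetical. This assumes a toolchain with inline const blocks (Rust 1.79 or later).

```rust
mod undrop {
    // Copied from pitaj's post above.
    #[doc(hidden)]
    pub struct Helper<const D: bool>;
    impl<const D: bool> Drop for Helper<D> {
        fn drop(&mut self) {
            const {
                if !D {
                    panic!("this type cannot be dropped");
                }
            }
        }
    }
    pub type PhantomUndrop = Helper<false>;
    impl PhantomUndrop {
        pub fn drop(self) {
            let droppable: Helper<true> = unsafe { std::mem::transmute(self) };
            std::mem::drop(droppable);
        }
    }
    #[allow(non_upper_case_globals)]
    pub const PhantomUndrop: PhantomUndrop = Helper;
}

use undrop::PhantomUndrop;

/// Hypothetical struct guarded by the marker.
struct S {
    value: u32,
    guard: PhantomUndrop,
}

impl S {
    fn new(value: u32) -> Self {
        S { value, guard: PhantomUndrop }
    }

    /// Cleanup function: takes `S` by value, destructures it, then
    /// explicitly disarms the marker.
    fn finish(self) -> u32 {
        let S { value, guard } = self;
        // Resolves to the inherent `PhantomUndrop::drop`, not `Drop::drop`.
        guard.drop();
        value
    }
}
```

Calling S::new(7).finish() compiles and returns 7. Letting an S go out of scope anywhere, including on an unwind path while it is still live, would instantiate the drop glue for Helper<false> and should produce the post-mono error.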

Vorpal · July 2, 2024, 1:44pm

Ah, I missed the distinction between the inherent and trait methods.

system · September 30, 2024, 1:45pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
