Christmas proposal: "Don't call" annotation

Marry Christmas to you all.

There are two different motivations lead to this proposal. One is in this existing proposal:

Me, and the author of the post above wanted to have a way to say "this object cannot be dropped unless you have done the check that it is a good time to drop it".

My precise challenge is that the object may have active reference leaked so it may not in a safe status to be dropped. To write unsafe code correctly, we have to check the status to ensure the object does not have active references.

One possible solution is like the proposal suggested, introduce a "NoDrop" marker trait, but I have another idea here. This lead to my second motivation.

Sometimes the IDE can help you generate some stub code like the following:

impl MyTrait for MyType {
  fn my_method() { todo!() }
}

This is usually a good start. The problem is that the compiler don't prevent you from calling it before you implement it. Then if you have a whole set of such things (just give a real life example, you are implementing a serde serializer, and there are 20 methods to implement), and forgot one of them, you have to wait for the test to figure out you have missed something.

The proposal

I propose to have an attribute #[do_not_call] to annotate the methods that are not ready to be called, or designed not to be called. In this way, to achieve the "NoDrop" requirement we write

impl Drop for MyType {
    #[do_not_call]
    fn drop(&mut self){}
}

This should prevent the following:

  1. Any explicit call to the drop method that need to be generated, includes closures that were not called
  2. Writing any code that will need a drop clue on an object of MyType being generated
  3. Turning the object into a dyn Drop since this prevents the compiler to check if a drop call will reach the implementation
  4. Obtaining the fn pointer

Except the second point (which specific to the Drop trait), the other points applies to all other such annotated methods.

However this annotation should not:

  • prevent access via unsafe code
  • prevent the trait bound checks being successful
  • prevent other type of objects of the same trait being turned into trait object

Optionally, we can allow custom error messages

impl MyTrait for MyType {
  #[do_not_call("This is not yet implemented")]
  fn my_method() { todo!() }
}
1 Like

So is this post-monomorohization errors?

Those are a big no-go (as much as possible, const errors are maybe bending this slightly) and effectively a non-starter.

6 Likes

We don't need to go that far I think; all I think we will need is that a where T: Drop clause should pass since it is not calling drop.

Then consider this code:

struct MyType;
impl Drop for MyType {
    #[do_not_call]
    fn drop(&mut self){}
}

fn foo<T>(_: T) {}
fn bar<T>(t: T) {
    std::mem::forget(t);
}
fn main() {
    foo(MyType);
    std::mem::forget(MyType);
    bar(MyType);
}

Does this compile? If yes, you can't rely on this for safety. If not, how do you detect it?

Moreover if that doesn't compile then how do you even consume an instance of a type that has its Drop::drop marked with #[do_not_call]?

This should NOT compile.

  • To generate the function body of foo::<MyType>, it requires to generate a drop clue. And the drop clue contains a call to drop.
  • if the function foo is also marked with #[do_not_call] the funcion body of foo should compile but the call in main will not.
  • if the call to foo was also removed, it should compile, and MyType will not be dropped since there is no call to drop.

To properly consume the instance, you need to write

fn foo<T>(t: T) {
   // Do something to check that t can be dropped safely
   // and can be safely forgot
   let _ = ManuallyDrop::new(t);
   // or std::mem::forget(t)
   // However, with ManuallyDrop you can move things out after the `new` call 
   // so it is not leaked.
   //
   // In general there should be a method `manually_drop` for `MyType` that encapsulate this.
}

That happens post-monomorphization, but you said we don't need to go that far. It also doesn't work for bar, because it doesn't actually drop t, it just leaks it. Detecting std::mem::forget/ManuallyDrop/Box::leak won't work because everyone can write their own Box::leak and that's currently sound.

3 Likes

I thought it would be during monomorphization? To generate an error the compiler don't have to get all generic instances being generated.

Also, for std::mem::forget it is already a leaking function today, this would not make things worse isn't it? It does not seem to be accidental that a programmer write a call to forget. It looks intentional.

Again, the community has been decided leaking is not a memory safety issue, as long as it is explicit. so "doesn't actually drop" is not a problem.

If you are kin to drop it properly, the correct way to do so would be unsafe { ManuallyDrop::new(t).drop(); }, because as I proposed unsafe code ignores the annotation.

I don't think that happens when running cargo check.

Yes, but this also means unsafe code can't rely on this in any way (except maybe Drop::drop not being called? Not sure how that's useful though), which is in contrast with your original goal:

Here is the code that worked today.

I will define the RcCell type. It is different than Rc that it will not drop the object itself when all references are dropped. It is different than RefCell that its reference type will be free of lifetime.

pub struct RcCell<T> {
    value: T,
    refs: Weak<()>,
}

Then we define the reference type.

pub struct RcRef<T> {
    refering: Rc<()>,
    ptr: NonNull<T>,
}
impl<T> Clone for RcRef<T> {
    fn clone(&self) -> Self {
        RcRef {
            refering: self.refering.clone(),
            ptr: self.ptr,
        }
    }
}
impl<T> Deref for RcRef<T> {
    type Target = T;
    fn deref(&self) -> &T {
        unsafe { self.ptr.as_ref() }
    }
}

The idea is the object holds a Rc::Weak pointer, and the reference type holds a Rc pointer. So if we have checked that the Weak can be upgraded we know there is a reference exists. This is what we need to make sure there is no active references out there.

impl RcCell {
       pub fn get_ref(&mut self) -> RcRef<T> {
        // We first check if there is active references.
        // If we already have, we reuse the existing reference.
        // Other we need to create a new one.
        let r = match self.refs.upgrade() {
            Some(r) => r,
            None => Rc::new(()),
        };
        self.refs = Rc::downgrade(&r);
        RcRef {
            refering: r,
            ptr: unsafe { NonNull::new_unchecked(&mut self.value) },
        }
    }

}

(Note, we require a &mut reference to the cell in order to create a reference since we are modifying the refs control field. Can be improved by putting it in to a Cell.)

Now the challenge is that we cannot allow the compiler to drop the object because we have to check if there are active references.

This is why I want to disallow a drop unless it is done manually. Today the best thing I can do is define a Drop and check if there is active references, panic.

Note that the above will not be safe if people call std::mem::forget when active references exists. However this issue can be easily fixed by changing the field type of value to Box<T>. So I will leave it.

std::mem::forget is not even needed for that code to be unsound, moving the RcCell (just moving, not dropping!) is enough to invalidate the ptr field in the RcRef. Box kinda fixes this, but you're still left with aliasing problems.

Overall looks like you're just trying to make an Rc where the first/root Rc can't be dropped before the others, which doesn't seem that useful to me.

Yes this is exactly I am looking for - to model a simplified version of the Rust borrow system, at runtime. So it is important to prevent the root Rc being dropped because it is what Rust compiler checking.

Maybe this case itself is not too useful, but there are other proposals like NoDrop trait (see the top post) that will also make this code possible by adding features that only apply to Drop. And I think my proposal will solve that problem as well, plus we can also have compiler checked unimplemented functions.

I'm not sure I'm following:

  • you want to prevent the RcCell from being dropped, which is a compile time check;
  • but you want to model a simplified version of the borrow system at runtime.

Another related problem is how do you handle consuming an RcCell? Do you have a function/method that takes it by value and returns a Result<(), RcCell<T>>? In that case how is the user supposed to handle the Err case? In my experience this either complicates things a lot or forces the user to panic, which is not better than just panicking in Drop::drop.

Regarding panics, what happens if a panic happens to try to drop an RcCell? Is that a compile time error? Is the RcCell leaked? Does the program abort?

ps: I don't see any unsafe code in your previous example that depends on this feature for safety.

I'm not saying there aren't usecases for a feature like this, but many of those usecases probably need a stronger version of this feature, which in turn has bigger problems. It would be useful if you could show some usecases that work with your specific proposal, not just the general idea of preventing drop.

The second implies the first - the current RefCell does this by enforces a lifetime to prevent the drop when references exists. But this is not pure runtime check.

Any pure runtime check system will require to have a way to prevent the drop.

Yes this is what proposed but not necessary the only one. For example the following

fn try_drop<T>(t: &mut Option<T>) -> Result<()> {
...
}

The caller wrap the object into an Option, then if the implementation checked that it is good to drop the input value will be modified to None and Ok(()) is to be returned. Otherwise gives an error and the input is not dropped.

This is actually better - even the user is forced to panic, they are in a better context to provide correct behavior. Other than panic they also have a lot of options - for example they can move the object temporary to a container that will have scheduled tasks looking at, to remove those objects that are ready to be dropped (a minimum GC system) .

My proposal will be that even the generated panic handling code should not call drop if it is annotated not to be called. So this will be compile error.

Unfortunately my knowledge about panic is not enough to predict the implication of this. So devils may be lurking here.

I don't know what do you mean. Is the unsafe block in the Deref impl count? It is depending on the object not to be dropped, isn't it?

Yes, this proposal is about allowing the compiler to prevent some monomorphized code being reached. It is not specifically for Drop.

It is just happen to also can be used for the NoDrop issue. However compare to other proposals to prevent auto drop, this proposal has the simplest brain model - "The compiler will complain when it tries to generate a call to this method".

Yes to make it working for drop it might require some hard work to get things done properly, but it is also not stopping us from adopting to any other better solutions - they should not interfere with this solution. For example, an NoDrop trait can be added later with some objects already annotated do_not_call on drop.

I would also highlight the guaranteed compatibility of this proposal:

  • All code that compiles today should compile without change
  • All code that to be rejected today should be rejected even when some do_not_call annotation being added

Concret use cases other than Drop

Let's say, we are writing a proc macro to be annotated on structs/enums, to add some functionalities. In order to do that, the proc macro will generate calls to different methods that to be provided by the user as trait methods.

(One not so good example is serde, to implement a new data format you need to implement Serializer with different methods. Then the proc macro derive(Serialize) will generate calls to those methods)

Now, let's say you are defining the trait but you don't want to support some type of definition (for example, you don't want to support structs without naming the fields).

Today, you will do so by returning an error at runtime if your method is not supported for the struct.

With this proposal, the compiler can detect that the user is attempted to call an unsupported method and gives the message that you the trait implementer wanted the user to see.

But if panic is forbidden then that container is the only way. I wouldn't call that GC though, it's more similar to temporarily leaking IMO.

This means that if you have a local variable with its Drop::drop marked as #[do_not_call] then you won't be able to call any code containing slice indexing, unwrap, expect ecc ecc. And code like that is pretty widespread, even if in practice it will never panic. Even worse, with this proposal adding panicking code to a function that didn't contain it before would now become a breaking change.

That doesn't fundamentally need unsafe though. You can have RcCell and RcRef both be wrappers around Rc<T>. Marking RcCell::drop as #[do_not_call] would get you the same API, but without it it doesn't become unsound, nor it becomes slower.

This is like saying that C++ templates are better than rust's generics because they don't need complex trait bounds, if you did something wrong the compiler will just complain. But this makes accidental breakages easier, the code will be harder to read because you can no longer understand if it will compile just by giving it a glance, and the errors become potentially much harder to understand.

To me his seems more like an hack than intended behaviour.

1 Like

I was thinking again and figure out that this proposal does not actually change the language in any ways (it does not adding new features that will accept code that will be rejected today), so it would be better be a Clippy lint. After all, the true requirement of this is to allow the programmer to know when a specific method get called and where. If the compiler can provide a monomorphized call map the IDE/Clippy can just look at it and check the record to find out.

So in theory the compiler don't need to emit a hard error. If Clippy can detect that a function which body that contains only a unreachable!() macro being called, we are OK to go. (In this case we would not even distinguish unsafe code because we can always add an annotation to bypass a lint)

I would better raise another proposal about compiler/tooling rather than language design. However I need to know if the compiler need to be adjusted to generate the call map.

AFAIK currently there's no lint that runs after monomorphization, so this definitely needs changes to how the compiler works.

Possibly related issue: Some errors only show up with `cargo build` not `cargo check` · Issue #70923 · rust-lang/rust · GitHub