Pre-RFC: adding a `#[warn(drop_with_ffi)]` lint

Neutron3529 · June 3, 2024, 4:24am

Summary

This RFC proposes a new grammar for ffi parameter with Drop implememted, and two warn-by-default lint #[warn(drop_with_ffi)] and #[warn(safe_drop_with_ffi)], two allow-by-default lint #[allow(unused_drop_ffi_mark)] and #[allow(missing_ffi_drop_impl)]

Motivation

With FFI interfaces, users may accidently use the wrong type which implement Drop. Currently such interface is NOT unsafe but it might break many assumptions and easily cause a double-free.

Let us consider the following situation:

In R ffi, Rust could allocate R type (called SEXP since R is a open source version of S), since R has a garbage collector, Rust must ensure the allocated vector is not recycled, thus we must PROTECT that vector, before return, we must unprotect all the allocated vectors to tell R recycle the unused variables.

Here comes the question, rust encourage us writting a impl Drop for Protected<T> to do such logic, but there is something different: In ffi interface, Protected<SEXP> have the exactly same repr as SEXP, and user may accidently wrote code like

extern "C" fn bad(a:SEXP)->Protected<SEXP>
extern "C" fn even_worse(a:Protected<SEXP>)->SEXP // cancel the protection of `a`, thus `a` might be recycled accidently.

Since it is statisticians rather than programmers who will write the ffi interface, for the base lib which provide SEXP and Protected<SEXP>, it is hard to tell statisticians not to use Protected<SEXP> in ffi interface.

It might be wise to add a #[warn(drop_with_ffi)] lint, let user know Protected<SEXP> is a rust type which cannot be send across ffi boundary.

The most surprising thing is that, send a impl Drop type across ffi is not marked as unsafe.. I don't think other language (even Rust itself) can deal with such extern call very well. Maybe the better practice is that, use unsafe block with the parameter that impl Drop.

Detailed Design

This RFC adds a new grammar for both parameter and the return type. Suppose we have

struct Foo {...}
impl Drop for Foo {...}

Then for any extern fn, these should issue an warning

/// warning: parameter `a` has type `Foo`, which implemented `Drop` trait, might not safe with ffi calls.
/// note: `#[warn(drop_with_ffi)]` on by default
/// note: use `extern fn bad(unsafe a: Foo impl Drop)` instead.
extern fn bad(a:Foo) {...}
/// warning: the returun type `Foo` implemented `Drop`, its `Drop` implementation might not be called, which might cause memory leak and other .
/// note: `#[warn(drop_with_ffi)]` on by default
/// note: use `extern fn bad2() -> unsafe Foo impl Drop` instead.
extern fn bad2()->Foo {...}

If sending such variable with Drop implemented is carefully designed and considered safe, a lint #[allow(safe_drop_with_ffi)] could be added

/// warning: parameter `a` has `Drop` implemented, might not safe enough with ffi calls.
/// note: use `extern fn bad(unsafe a: Foo impl Drop)` instead
/// note: If such function is carefully designed, you could add a lint #[allow(safe_drop_with_ffi)]
/// note: `#[warn(safe_drop_with_ffi)]` on by default
extern fn bad(a:Foo impl Drop) {...}
/// warning: the return type has `Drop` implemented, might not safe enough with ffi calls.
/// note: use `extern fn bad() -> unsafe Foo impl Drop` instead
/// note: If such function is carefully designed, you could add a lint #[allow(safe_drop_with_ffi)]
/// note: `#[warn(safe_drop_with_ffi)]` on by default
extern fn bad2()->Foo impl Drop{...}

these should works fine, file no warning by default.

/// normal rust function, should not yield any warning about Foo impl Drop
fn good(a:Foo) { ... } 
/// unnccessary Drop, allow by default
fn good2(a:Foo impl Drop) { ... } 
/// i32 is not Drop, thus the code is safe, allow by default even if `#[deny(safe_drop_with_ffi)]` is set
extern fn good3(a:i32 impl Drop) { ... } 
/// I'm unsafe function, use me wisely:)
unsafe extern fn good4(a:Foo) { ... }

Drawbacks

This RFC may warn about extern "Rust" fn bad(a:Foo) and even extern "rust-call" fn bad2(a:Foo). I have no idea what is the meaning of such extern.
for a pass-through function, the following function should be safe enough (since there is no need to call Drop function), but the lint would generate warnings.

extern fn pass_through(a:Foo)->Foo {
    // deal with a
    return a; // thus whether Foo impl Drop is not the concern.
}

Alternatives

Using a lot of magic things (like macro 2.0) to prevent user writting Protect<SEXP> in either the return type or the parameter location.
Writting documents to let user know which type is allowed in ffi interface.
Adding a #[deny(no_across_ffi)] lint and a attribute #[no_across_ffi], then we could directly write #[no_across_ffi]struct Protect<T>{...}

Unresolved questions

Should we add such lint to extern "Rust" fn or extern "rust-call" fn or even extern "rust-intrinsic" fn calls? Perhaps an unsafe is needed
Is there a better grammar dealing with pass_through functions?

CAD97 · June 3, 2024, 5:39pm

I think this is (or should be) just part of warn(improper_ctypes), which honestly needs a bit of TLC anyways. Your case is also a bit of whether a type is "visibly repr(transparent)" — you want to be able to transmute between the untyped raw SEXP and Protected but not expose that permission to library consumers. (If you don't need to be able to transmute or reinterpret behind a pointer, just don't use repr(transparent). Constructing a newtype style wrapper is zero-cost^[1] even without the repr hint.)

I think the short term bandaid fix would be an attribute #[diagnostics::warn_in_ffi({bool})] which can be used to override whether improper_ctypes fires for the use of a type in an FFI signature. Longer term I'm not sure what the ideal end state should be, other than it should distinguish between by-value FFI usage and by-reference FFI usage, since the former usually implies ABI stability, whereas the latter often doesn't need to imply layout stability (e.g. when pimpl like patterns are used).

extern FFI is inherently unsafe, so it's okay if some rough edges remain on Rust's built-in support; tools that make FFI safer necessarily need to coordinate with the build environment and other side of the FFI bridge if they want to provide safety. Even for targets where linkage itself is (partially) checked like wasm (component model), state of the art still requires decorating the FFI exposed API in some way to bridge between the Rust and FFI vocabulary. So in your case, you'd have some unsafe trait RAbi which describes types that can safely be marshalled between Rust and R and whatever decoration is used to actually expose/link the functions requires this for parameters. (With an AssertAbiSafe style wrapper that just uses C ABI directly for opt out usage of ABI safe types that can't have the trait implemented on them directly because of coherence.)

(Plus based on other things I recall you mentioning about R, e.g. how errors get handled, the R target might want for a separate triple abi so "unwinding" can use the R mechanism. At a minimum, it sounded like it'll be a similar case to Lua's SJLJ error handling where Rust frames cannot be unwound by "forced unwinding"; the absence of destructors is necessary but likely not sufficient to deallocate call frames without using Rust's panic unwinding mechanism.)

Zero cost within Rust code, at least. The extern "C" calling convention might dictate that primitives and primitive-shaped user-defined types are passed in a different manner. ↩︎

Neutron3529 · June 4, 2024, 5:32am

I suddenly realized that, this might not only improper_ctypes, If we really take care, we may writting an additional C wrapper who calls the drop function (just Rf_unprotect_ptr(return_value)) and thus transfer the struct with drop implemented should be allowed in some ways.

In this case, we cannot say whether send a struct with drop is improper.

Actually, the R interface suppose all the parameter is PROTECTED and the return type should not be PROTECTED..

If we have to match the R restriction, we must wrote:

extern "R" fn WTF(a:Protected<SEXP>, b:Protected<SEXP>)->SEXP {
/// executing deconstructor of both a and b, BOOM! 
}

In this case, the only choice is that, using improper ctypes.

Things is a little bit tricky here: It is not the lib maintainer who wrote the ffi extern "C" fn ..s

Normally, the type system of Rust just disallow sending different type across ffi, no matter how many #[repr] it has.

#![crate_type = "cdylib"]
#![allow(unused_variables,dead_code)]
type Foo=*mut i32;
#[repr(transparent)]
struct MyFoo(Foo);
struct MyFooWithDrop(MyFoo);
/// this is the user-defined function, which could be called by other program.
#[no_mangle]
extern "C" fn call_foo(a:Foo){
    println!("{a:p}")
}
#[test]
fn test(){
    let a:Foo=&mut 1i32 as *mut i32;
    call_foo(a);
    let b=MyFoo(a);
    // expected `*mut i32`, found `MyFoo`
    // call_foo(b);
}

The problem is that, it is normal user rather than professional Rust expert who wrote the code.

To make compiler happy, user may change the ffi interface from Foo to MyFoo or even MyFooWithDrop.

Since Rust cannot realize whether there is a drop executer in the foreign side, the choice Rust could make is just warn the user, such struct contains a drop, you should notice that fact, and planning writting and executing a equivlent drop procedure it in the foreign side.

kornel · June 4, 2024, 4:31pm

I don't think it has anything to do with Drop, but rather you have a type that you don't expect to receive from or return to FFI, and the reason for that just happens to involve a particular implementation of Drop for that type.

So you just need an ability to mark a type as FFI-incompatible.

Drop types in FFI can be safe and useful:

extern fn create_widget() -> Box<Widget> { Box::new(Widget::new()) }
extern fn use_widget(w: &Widget) { }
extern fn destroy_widget(_: Box<Widget>) {}

This saves Box::into_raw()/Box::into_from() boilerplate, which I think is a good thing, because it uses types like a native Rust interface would, so it's documenting ownership as part of the interface instead of degrading everything to *mut Widget. It also stays clear of mem::forget that could be used by mistake in such APIs.

CAD97 · June 4, 2024, 5:38pm

It's also worth noting that extern linkage doesn't even necessarily mean FFI using C capabilities only. The other side of FFI could be more Rust, it could be C++, it could be some other language with the ability to do RAII style cleanup. The only important part is the same one as always — ensuring both sides of the FFI agree on the signature and how to handle the types involved.

Just a minor note: it's of course safe to forget to run drop glue for a type. If the only reason you want to forbid an FFI signature is that other side of FFI could bypass the drop glue, that's an incoherent complaint; Rust can as well. And, as you figured out, FFI can still drop the type properly if some consuming API is provided.

More relevant: Rust doesn't really do warnings where the only way to quiet the warning is to allow it^[1]. The majority case when a warning fires should be because there's probably a better way to write the code that does what the developer intends without triggering the lint, or in specific cases (e.g. clippy::suspicious ) that the code looks wrong at first glance and deserves a note that no, this actually is what is meant and why.

Unfortunately, #[no_mangle] is unsafe^[2] functionality that doesn't require writing the unsafe token in the source. The RFC for unsafe attributes is accepted, so at some point this will likely be spelt as #[unsafe(no_mangle)]. Because FFI linkage is unsafe, it's on the author of the FFI signature on both sides to ensure that the signature matches, and it's outside the scope of Rust to guarantee that.

If you want to make things safer, generate the "client" side of the FFI from the "host" side so one of the two is canonical and the other is generated to match.

There are of course exceptions like unsafe_code and the clippy::restriction category, but they're all allow-by-default (opt-in). ↩︎
Even without the other side of FFI, a name collision violates ODR and can't always be prevented by the linker from causing issues. ↩︎

Neutron3529 · June 5, 2024, 10:17am

How to do it? Maybe you assume a wrong background.

It is users rather than me who will create FFI functions. I can control every function I can control, but I cannot even warning a user about that, "this parameter is incompatible with FFI interface."

The workflow is shown as follows:

I, wrote a crate for Rust users who want to writting R plugins with Rust
Users, use my crate and write FFI interfaces like #[no_mangle] pub extern "C" fn Foo()->Protected<f64> since compiler said that, Protected<f64> found.
Users call the functions in R, with .Call("Foo")
as a final result, the program alloc a Rust Protected SEXP without cancalling its protection, which causes memory leak.

I can only control 1, but not 2-4, I'm finding ways to mark the type as FFI-incompatible, but I actually cannot. I can only write documents about that.

But, who read documents?

You means, raise them into an hard error and break lots of code just like what unsafe did?

The misuse of drop function may easily leads to UAF, double free and so much unsafe things. The only reason I use a warning is that, there might be some code with drop just works fine. But actually, without those code, a hard-error might be better.

The problem is that, such thing cannot be done.

It is not me who controlls the FFI interface, and R just accept arbitrary SEXP expression. R have no knowledge which Rust type is Protected or Owned.

The only thing I could do is just writting documents.

And, to ensure people will read my document, I have to play lots of visible control magic to hide all the unwanted things untouchable.

Unfortunately, it is not no_mangle cause the unsafe things. The bad is #[no_mangle] exports the function. R could always call a function with its exported name.

.Call("foo",1L,2.,"3",list(4)) # legal R expression
.Call("NoMatterHowFooHasBeenMangled",1L,2.,"3",list(4)) # legal if the exported name for `foo` is NoMatterHowFooHasBeenMangled

The name is not the thing worth discussion here.

Currently, the code is:

#[cfg_attr(doc, doc(cfg(feature="public-all")))]
#[cfg(any(doc, feature="public-all"))]
pub mod thing;
#[cfg(not(any(doc, feature="public-all")))]
mod thing;

Thus, doc is visible, but unless enable the dangerous feature, thing is not touchable.

(And then, I have to find a way to wipe the wrong available with feature public-all only flag)

kornel · June 5, 2024, 12:11pm

There's no such thing in Rust yet. I mean the feature you're proposing should be marking a type as incompatible with FFI, and fire improper_ctypes lint, regardless of whether the type has Drop or not.

I assume your Protected type is defined with #[repr(transparent)]:

#[repr(transparent)]
pub struct Protected<T>(T)

which makes it inherit the layout of the type it wraps, so it becomes FFI-compatible whenever its only field is FFI-compatible (that's by design). You would need an attribute that undoes that.

Hypothetically:

#[repr(Rust, transparent)] // to stop transparently inheriting repr(C)
// or something like
#[improper_ctype]
pub struct Protected<T>(T)

bjorn3 · June 5, 2024, 2:16pm

Symbol mangling differs between any two versions of rustc (even between nightly versions that are a single commit apart), so for all intents and purposes it is exactly as easy to call as a function that isn't exported at all. In both cases you can guess the exact address at which the function is placed. And in both cases that is UB on the side of the caller.

Topic		Replies	Views
[Pre-RFC] Rustc and Handling Unsafe Code	7	1022	October 16, 2019
Pre-Pre-RFC: FFISafe trait language design	9	855	February 19, 2020
(Pre-)RFC for ExplicitDrop language design	20	1482	August 18, 2020
Pre-RFC: Unsafe Types	28	7653	March 25, 2019
Pre-RFC - Add compile_warning! macro language design	10	4304	May 7, 2019