Unwinding through FFI after Rust 1.33

gnzlbg · March 14, 2019, 9:59am

Honestly, this problem has always had a solution that works correctly and reliably 100% of the time: write a thin wrapper in C / C++ / FooLang that catches exceptions (or whatever that language's error-handling mechanism is) and converts them to an error code that can be used on the Rust side.
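As a rough sketch of what the Rust side of such a wrapper could look like: the names `foo_parse_wrapped`, `foo_parse`, `FooError` and the status codes below are all hypothetical, and the wrapper itself is replaced by a Rust stand-in so that the example compiles without a C++ toolchain. In a real binding the wrapper would be written in the foreign language and declared in an `extern "C"` block.

```rust
use std::ffi::c_int;

// Hypothetical status codes that the foreign-language wrapper is assumed to return.
const FOO_OK: c_int = 0;
const FOO_ERR_INVALID_ARGUMENT: c_int = 1;

// Stand-in for the thin wrapper. In a real binding this would be implemented
// in C++ (catching every exception and returning a status code instead) and
// only declared on the Rust side. It is defined in Rust here purely so that
// the sketch is self-contained.
extern "C" fn foo_parse_wrapped(input: c_int, out: *mut c_int) -> c_int {
    if input < 0 {
        // Where the real wrapper would have caught an exception.
        return FOO_ERR_INVALID_ARGUMENT;
    }
    // SAFETY: the only caller below passes a pointer to a live local variable.
    unsafe { *out = input.wrapping_mul(2) };
    FOO_OK
}

/// Errors surfaced to Rust callers, one variant per known status code.
#[derive(Debug, PartialEq)]
pub enum FooError {
    InvalidArgument,
    /// A status code this binding does not know about.
    Unknown(c_int),
}

/// Safe Rust API: nothing unwinds across the boundary, because the only
/// thing that crosses it is an integer status code.
pub fn foo_parse(input: i32) -> Result<i32, FooError> {
    let mut out: c_int = 0;
    let status = foo_parse_wrapped(input, &mut out);
    match status {
        FOO_OK => Ok(out),
        FOO_ERR_INVALID_ARGUMENT => Err(FooError::InvalidArgument),
        other => Err(FooError::Unknown(other)),
    }
}
```

With this shape, a caller writes `foo_parse(21)?` and never needs to know that the underlying library reports errors through exceptions.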

The issue here is not about allowing people to solve a problem that they can't solve today.

AFAICT, the issue is that some people say that "this is too much work", "we don't want to implement a binding generator that writes the code for us", "we don't want to implement support for this in rust-bindgen", etc., and that therefore the language team and compiler team should add a solution to the language.

This sounds like a really long shot, and IMO, all those approaches are better suited to solve this problem than a language feature.

Nobody has asked or answered the question of "What do we mean by unwinding here?".

Rust panics are ABI incompatible with C SJLJ (SetJmp/LongJmp) and with C++ exceptions (and C++ exceptions are incompatible with C's SJLJ).

That is, if we say that panicking and catching panics across the FFI boundary is not UB, but implementation defined, then the implementation needs to specify per target an ABI for panicking through FFI, and if the code on both sides of the FFI does not adhere to it, then the behavior is undefined.

We can translate Rust panics into FFI by catching them on every FFI call, and we can catch all panics from FFI and translate them to Rust panics.

Yet if we set the unwinding ABI to match C++'s, C code using SJLJ would still be UB, and AFAICT, even if we set the unwinding ABI to match SJLJ, interfacing with C code using it will often be UB (e.g. SJLJ would require calling setjmp on the Rust side and implicitly passing a jump context to C, but is this context passed as the first function argument? The last one? That would be fixed by the ABI, yet every C library does this differently because they can).

And this is without mentioning the overhead this would add to FFI calls that are not #[no_unwind], which probably isn't acceptable. To remove the overhead, the Rust panic ABI would need to match that of the FFI, but then it matches either C++ or C and is incompatible with the other. We probably can't match either C++ or C, because that would be a backwards-incompatible change requiring tweaking destruction order or even allowing panics to skip destructors.

IMO even if we make it implementation-defined, the most reasonable default for an implementation would be to abort if Rust panics into FFI, to force the user to explicitly convert panics to the ABI of the language that they are interfacing with, and to remove the overhead for most FFI function calls which don't need to handle panics.
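The explicit conversion mentioned here can already be written today with `std::panic::catch_unwind`. Below is a minimal sketch, assuming the C side expects an integer status code; the names `ratio_ffi`, `checked_ratio` and the `STATUS_*` constants are made up for illustration, and a real export would additionally need an unmangled symbol name.

```rust
use std::panic;

// Hypothetical status codes agreed on with the C caller.
pub const STATUS_OK: i32 = 0;
pub const STATUS_INVALID: i32 = 1;
pub const STATUS_PANIC: i32 = 2;

// Ordinary Rust logic that may panic.
fn checked_ratio(num: i32, den: i32) -> i32 {
    if den == 0 {
        panic!("division by zero");
    }
    num / den
}

/// Entry point meant to be called from C. Panics raised by `checked_ratio`
/// are caught inside this function and reported as `STATUS_PANIC`, so they
/// do not unwind into the caller's frames.
pub extern "C" fn ratio_ffi(num: i32, den: i32, out: *mut i32) -> i32 {
    if out.is_null() {
        return STATUS_INVALID;
    }
    // The closure only captures two `i32` values, so it is unwind safe.
    match panic::catch_unwind(|| checked_ratio(num, den)) {
        Ok(value) => {
            // SAFETY: `out` was checked for null above; the caller promises
            // that it points to writable memory for one `i32`.
            unsafe { *out = value };
            STATUS_OK
        }
        // The panic payload is dropped here; `out` is left untouched.
        Err(_) => STATUS_PANIC,
    }
}
```

Note that `catch_unwind` only has something to catch when the crate is built with the default `panic = "unwind"` strategy; with `panic = "abort"` the process aborts before the `Err` arm is ever reached.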

For unwinding from FFI into Rust, an implementation would probably say that this is undefined behavior on that implementation, because there is no way to tell from the Rust side whether an FFI function will unwind, and if so, from which programming language. For example, Rust calls into C without passing a jump context, so the compiler assumes the C function can't unwind. Yet that C code calls into C++, which throws, and the C and C++ compilers, which are the same, cooperate to make that work in case it was C++ calling the C code. So now we need to specify the C++ exception ABI for a C FFI function.

@BatmanAoD wrote:

"So I believe that this could be well-defined with any of the major toolchains."

AFAIK there are multiple incompatible unwinding ABIs on Windows (SJLJ and SEH). When targeting e.g. the msvc targets, which use SEH, a C library can still use SJLJ (setjmp/longjmp) for unwinding, since those are standard C APIs. If we assume the unwinding ABI of the target to be SEH, which is what we should do on that target, the approach being discussed here would still not work, because the C libraries use SJLJ instead.

@RalfJung wrote:

"But instead people are sad because their crates don’t work any more, and then language designers are sad because they broke code."

A year and a half ago GCC stopped zero-initializing uninitialized variables, and people with programs relying on uninitialized memory being zeroed complained that their code, which had been working correctly on the only platform and toolchain that they cared about, was now broken.

This exact same thing is happening here. Some programs were invoking undefined behavior by unwinding across FFI boundaries and were appearing to work properly on some targets. A compiler change which catches this issue was then introduced, the migration path is clear, yet users argue that they shouldn't have to do anything.

There are many options available to users here. Working out the details of unwinding across FFI, platform unwinding ABIs, and converting panics, exceptions, and SJLJ across unwinding ABIs, and then adding controls to the language to tweak all of that, sounds like too much work for something that could be solved with a binding generator.

If the argument is that the binding generator needs to generate C code, then if we allow calling setjmp / longjmp from Rust, rust-bindgen could be extended to generate Rust code that handles these for the users, probably with some tuning to match the differences in ABI across the different libraries using SJLJ, and people would be able to write macros that do this for them if they don't want to use bindgen.
