Support C APIs designed for safe unwinding


#1

Two very popular C libraries, libjpeg and libpng, require unwinding for error handling. Rust doesn’t support interacting with them that way, and for me that’s a huge gap in Rust’s FFI story.

These libraries are specifically designed to be safe and not leak or break anything when their error handler unwinds (it’s plain dumb C, no destructors, and all heap allocations are tracked via the libraries’ handles). In C they are most often used with setjmp/longjmp, but the APIs are setjmp-agnostic and work with anything that can unwind C’s stack.

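use std::panic::catch_unwind;

// Rust side
extern "C" {
    pub fn c_lib_call();
}

// Called by the C library; panics to unwind back into Rust.
extern "C" fn callback_from_c() { panic!("go back to Rust"); }

fn main() {
    // Calling a foreign function requires `unsafe`.
    let _ = catch_unwind(|| unsafe { c_lib_call() });
}

/* C side */
extern void callback_from_c(void) __attribute__((noreturn));
void c_lib_call(void) {
    callback_from_c();
}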

Here’s the problem: Rust takes a paranoid approach to FFI unwinding, driven by its (non)support for C++ and arbitrary black-box languages (which is understandable, and I wouldn’t ask for those to be supported). Unfortunately, that also throws away support for these common unwind-safe C libraries.

Up to Rust 1.23, even though it was theoretically undefined behavior, I was able to configure libpng and libjpeg to panic!() on error and catch that panic in Rust. This way I was able to use these libraries with a pure-Rust wrapper, with no extra C code required. It was also almost as efficient as C, since I was able to set up catch_unwind once for all FFI calls.
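To make the "one catch_unwind for all FFI calls" pattern concrete, here is a minimal sketch. It is not the original wrapper: `fake_lib_step`, `panic_on_error` and `run_steps` are hypothetical names, the "library" is written in Rust so the snippet needs no C compiler, and it assumes a toolchain that accepts the `"C-unwind"` ABI string (stabilized in Rust 1.71, long after the versions discussed here).

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical error-callback type, standing in for a C library's error hook.
type ErrorHandler = extern "C-unwind" fn(code: i32) -> !;

/// Stand-in for a C library call. It reports failure only through the
/// handler, which must not return (like a libjpeg-style error exit).
extern "C-unwind" fn fake_lib_step(input: i32, on_error: ErrorHandler) -> i32 {
    if input < 0 {
        on_error(input);
    }
    input * 2
}

/// Error handler that unwinds back to Rust, carrying the code as the payload.
extern "C-unwind" fn panic_on_error(code: i32) -> ! {
    panic::panic_any(code)
}

/// Runs every library call under a single `catch_unwind` guard.
pub fn run_steps(inputs: &[i32]) -> Result<Vec<i32>, i32> {
    panic::catch_unwind(AssertUnwindSafe(|| {
        inputs
            .iter()
            .map(|&i| fake_lib_step(i, panic_on_error))
            .collect()
    }))
    // Recover the error code from the panic payload.
    .map_err(|payload| *payload.downcast::<i32>().expect("unexpected panic payload"))
}
```

Calling `run_steps(&[1, 2, 3])` goes through all three calls with one guard, while a negative input makes the handler unwind out of the whole batch and surfaces as `Err` with that code.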

But Rust 1.24.0 (is the change coming back in a later version?) intentionally prevents unwinding through FFI, which blocks the (unintentional) support for the libjpeg/libpng error-handling method. Now I have to create setjmp-based wrappers in C for all my uses of these libraries. I can’t have a pure-Rust project that just links to existing libraries on the system. My project has to include extra C code and depend on a C compiler. That’s a step backwards for me.

I know I could be calling an FFI function that goes through C++, JVM JNI, custom assembly, or code compiled with exotic ABIs and extra instrumentation, but I also know when I’m definitely not doing any of this. Could Rust have an “I know what I’m doing” attribute for unwinding through FFI?

When I’m calling plain-C libraries that are designed for unwinding I wish I could tell Rust not to try to save me from myself:

extern "C" {
     #[hey_rust_this_will_unwind]
     pub fn jpeg_start_compress(cinfo: &mut jpeg_compress_struct, write_all_tables: jboolean);
}

#[hey_rust_let_me_unwind_this]
extern "C" fn c_lib_error_handler(cinfo: &mut jpeg_common_struct) {
    panic!("{}", cinfo.err.msg_code);
}
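For comparison, later Rust releases express this opt-in as an ABI string rather than an attribute. The sketch below assumes a toolchain with `extern "C-unwind"` (stable since Rust 1.71). `FakeErrorState`, `fake_start_compress` and `try_compress` are hypothetical stand-ins, not libjpeg's real types or functions, and the "library" function is written in Rust so the example is self-contained.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical, heavily simplified stand-in for a library's error state.
#[repr(C)]
pub struct FakeErrorState {
    pub msg_code: i32,
}

/// Error handler that is allowed to unwind: the "C-unwind" ABI string plays
/// the role of the wished-for attribute.
extern "C-unwind" fn c_lib_error_handler(state: &mut FakeErrorState) {
    panic!("library error {}", state.msg_code);
}

/// Stand-in for a library entry point. A real one would be declared in an
/// `extern "C-unwind" { ... }` block and implemented in C.
extern "C-unwind" fn fake_start_compress(
    state: &mut FakeErrorState,
    handler: extern "C-unwind" fn(&mut FakeErrorState),
) {
    state.msg_code = 42; // pretend something went wrong
    handler(state);
}

/// Calls the library and turns an unwinding error into a `Result`.
pub fn try_compress() -> Result<(), String> {
    let mut state = FakeErrorState { msg_code: 0 };
    panic::catch_unwind(AssertUnwindSafe(|| {
        fake_start_compress(&mut state, c_lib_error_handler)
    }))
    // `panic!` with format arguments carries a `String` payload.
    .map_err(|payload| match payload.downcast::<String>() {
        Ok(msg) => *msg,
        Err(_) => "unknown panic payload".to_string(),
    })
}
```

The ABI string is part of the function's type, so the callback parameter has to be declared as `extern "C-unwind" fn(...)` as well; a plain `extern "C" fn` pointer would not accept the handler.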

#2

I don’t know much about the internals of unwinding, so I’m going to start with some naive questions. :)

I was under the impression that, before this no-unwinding-through-FFI-enforcement was added, unwinding across an FFI boundary was already UB in all cases. Not knowing much about how unwinding is implemented, I figured it’d be something about the stack frames that the unwinding “jumps over” being required to be somehow “prepared” for this, making it UB for any of those stack frames to not be a Rust frame. The original bug report makes no restrictions when it says

It’s undefined to unwind past an FFI boundary such as a pub extern “C” fn.

So, given all that, I would think that what you did for libpng/libjpeg up until Rust 1.23 was UB and just “happened to work”.

Are you saying that impression was wrong, and there are actually controlled circumstances under which Rust-triggered unwinding across non-Rust stack frames is guaranteed to work (in the sense that the involved compilers and ABIs say this is all right)? Also, if we wanted to have our own ABI some day (for things like returning unsized rvalues), would we still be able to support unwinding in the same circumstances?

Cc @diwic @nikomatsakis @arielb1 who were involved in the PR introducing this feature.


#3

There is an existing attribute for this (although it appears to be super unstable). I know nothing about unwinding though, just remembered seeing something from this when reading kyren’s super interesting issue about it.


#4

I thought the change to enforce no unwinding across FFI boundaries was reverted after Rust 1.24 broke rlua (https://github.com/rust-lang/rust/issues/48251#issuecomment-366742933), and that the revert made that library work again (https://github.com/rust-lang/rust/issues/48251#issuecomment-370142834). Do libjpeg/libpng do something with setjmp/longjmp that rlua doesn’t?


#5

AFAIK unwinding through FFI is declared to be UB in Rust because it could be UB in certain circumstances (C++ and other languages that have their own exception mechanisms, RAII, defer, split stacks, etc.), but none of these apply to plain old C. Rust can neither know nor guarantee that a given extern "C" function is pure C, so it bans them all just in case.


#6

libpng/libjpeg error handling is similar to Lua’s. However, it may be simpler to “fix” in Rust, because it doesn’t actually need to use C’s own setjmp/longjmp. libpng/libjpeg never call longjmp; they just allow the user to call such functions (I’m not sure if rlua hardcodes longjmp calls, but the GitHub thread seems to imply this). I think rlua’s problem is: setjmp → Rust fn → extern "C" fn → longjmp (unwind from C to C, with Rust in between), which is definitely wrong for the Rust fn.

OTOH I just need the catch_unwind → extern "C" fn → panic! call sequence to work (unwind from Rust to Rust, with some C calls sandwiched in between), without mixing it with setjmp/SEH or any other unwinding mechanism.


#7

Sounds plausible, but we should find this out. :) I hope someone knowledgeable appears in this thread and enlightens us. If they agree with this, I guess what you are looking for is stabilization of


#8

This was my impression as well when I fixed that I-unsound bug, which the release team then decided to regress again.

It’s also notable that during the rlua investigation it was found that MSVC is the only platform where the setjmp / longjmp actually triggers an abort, and that @acrichton seems to be working on how we generate abort landingpads on that platform, so then it won’t make a difference on MSVC either, after which the abort-on-panic landingpad can be re-enabled.

Meanwhile, it can also be worked around by a recompilation of the C library where you replace setjmp and longjmp with __intrinsic_setjmp and __intrinsic_longjmp on MSVC.


#9

Note there’s a substantial difference between rlua’s problem and the requirements of libjpeg/libpng. They’re almost completely different problems:

Almost the entire discussion about rlua was about interoperability between setjmp, C++ exceptions, SEH and Rust unwinding. libjpeg/libpng don’t need any of that. These libraries don’t do their own unwinding. They just need Rust to catch its own panic!().


#10

Thanks for the diagram, but the RLua one is more complicated than that - it’s more like: Rust main -> C (lua) setjmp() -> Rust callback (extern “C” fn) -> C (lua) longjmp(). It does not try to catch something thrown in another language, it wants to jump over Rust stack frames.

For libjpeg’s case, which I guess explicitly states that using SEH or longjmp in a callback is okay (otherwise I would not have dared to jump over its stack frames), this exact case is supported with the #[unwind] attribute, which is unstable for reasons unknown to me. If you need this attribute (and it obviously seems like you do), you should push for its stabilization and then use it, instead of relying on UB to do something specific like you do today.


#11

Ah, interesting. This sounds highly relevant then:

“Jumping over” Rust stack frames may not be sound if that causes drop code to not be run.
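A small sketch of why this matters, using only Rust: when a panic unwinds through a Rust frame, that frame's destructors run. A longjmp that skipped the same frame would not run them, which is the soundness concern. The names below (`Guard`, `rust_frame`, `drops_during_unwind`) are hypothetical and only illustrate the panic side; showing the longjmp side would need C code.

```rust
use std::panic;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many guards have been dropped.
static DROPS: AtomicUsize = AtomicUsize::new(0);

/// A value whose destructor has an observable effect.
struct Guard;

impl Drop for Guard {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// A Rust frame that owns a `Guard` while the callback unwinds through it.
fn rust_frame(callback: fn()) {
    let _guard = Guard;
    callback();
}

/// Callback that starts unwinding with a panic.
fn failing_callback() {
    panic!("unwind through rust_frame");
}

/// Returns how many destructors ran while the panic unwound.
pub fn drops_during_unwind() -> usize {
    let before = DROPS.load(Ordering::SeqCst);
    let result = panic::catch_unwind(|| rust_frame(failing_callback));
    assert!(result.is_err());
    DROPS.load(Ordering::SeqCst) - before
}
```

With panic-based unwinding the guard is dropped exactly once even though `rust_frame` never returns normally. Code that relies on such a destructor for safety is what would break if the frame were skipped instead.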


#12

Bikeshed: extern "C unwind"?