Unwinding through FFI after Rust 1.33

I’m very sad that Rust can no longer use libjpeg and libpng with graceful error handling, because unwinding through FFI is now blocked.

These two popular libraries are specifically designed to be unwound, and unwinding through them worked perfectly well on x86 and arm (and probably every other platform, I just don’t have other machines to test). These C libraries have none of the UB problems discussed around unwinding of Rust or C++ through FFI.

The libjpeg-turbo maintainer has ruled out changing libjpeg’s error handling to any non-unwinding mechanism, and there’s no Rust alternative of comparable speed and quality to libjpeg-turbo and MozJPEG.

Rust 1.33 effectively killed my mozjpeg crate, and I have no idea what to do next.

I feel like I’m missing something. libjpeg-turbo has a C API, and C is a language that has no concept of unwinding! The libjpeg-turbo documentation explicitly identifies use of longjmp out of error_exit as an option.

longjmp is the way to use it in C. Rust doesn’t have it of course, but catch_unwind + panic!() worked well as a substitute for setjmp + longjmp. The libjpeg library is specifically designed not to leak any memory or have any other side effects if the error callback unwinds (or longjmps?).

Note that I’m talking about Rust -> C -> Rust unwinding. In my previous conversations about this it was almost always misunderstood as C -> Rust -> C unwinding, which isn’t safe.
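To make the catch_unwind + panic!() pattern concrete, here is a minimal sketch in pure Rust. It is not the mozjpeg crate's code: decode_rows is a hypothetical stand-in for the C library calls, and error_exit only plays the role of libjpeg's error callback. The point is that a single catch_unwind surrounds the whole sequence of per-row calls.

```rust
use std::panic;

// Hypothetical error handler, playing the role of libjpeg's error_exit.
// It never returns: it unwinds back to the enclosing catch_unwind.
fn error_exit(row: u32) -> ! {
    panic!("decode error at row {}", row)
}

// Hypothetical stand-in for the C library: processes rows one by one and
// reports failure only through the error callback, never a return code.
fn decode_rows(rows: u32, fail_at: Option<u32>, on_error: fn(u32) -> !) -> u32 {
    let mut done = 0;
    for row in 0..rows {
        if Some(row) == fail_at {
            on_error(row);
        }
        done += 1;
    }
    done
}

// One catch_unwind around the whole group of calls, much like one setjmp
// around many libjpeg calls in C.
pub fn decode(rows: u32, fail_at: Option<u32>) -> Result<u32, String> {
    panic::catch_unwind(move || decode_rows(rows, fail_at, error_exit)).map_err(
        |payload| match payload.downcast_ref::<String>() {
            Some(msg) => msg.clone(),
            None => "unknown error".to_string(),
        },
    )
}
```

With a real C library in the middle, the panic would have to cross C stack frames, which is exactly the part whose status is being debated in this thread.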

Well, the #[ffi_returns_twice] RFC will make it possible to safely use setjmp + longjmp directly from Rust, if it doesn’t get stuck in bikeshed hell. :) In the meantime, you could include some C code in your project that wraps the relevant functions from libjpeg/libpng while calling setjmp first.

I also think it should be feasible to actually support unwinding across FFI (including between C++ and Rust in both directions!) rather than declaring it UB. From what I understand of the implementation, I don’t see any fundamental blocker, though I could definitely be missing something.

I think it’d be easier to make #[unwind] stable. It’s already implemented and works.

I’d rather not use longjmp, especially in Rust (would that even work with drop?). libjpeg doesn’t need any of the complexity that C++ interoperability brings. Wrapping each individual call in a C setjmp/longjmp pair is expensive: libjpeg may require one FFI call per line of the image, which is why it expects thousands of calls to be wrapped with a single pair.

I don’t need C -> Rust -> C or C -> Rust or Rust -> C or C++ -> Rust -> C++, or Rust -> C++ -> Rust complex unwinding. Just the trivial case of Rust -> C -> Rust is enough to support libjpeg and libpng, where the Rust start and end knows what it’s doing, and C doesn’t care (and these C libraries are specifically written in an “exception-safe” manner).

You keep editing your post before I can reply! :)

It is possible to "unwind" over Rust code using longjmp, so you could have C call setjmp then call back into Rust for a group of operations, but no, it wouldn't work with Drop, so you would have to avoid having any variables with destructors in that code (or at least accept that they wouldn't run).

Hmm... another option would be to include wrappers in your crate for each libjpeg API, but written in C++ instead of C. The error handler would also be written in C++ and would throw a C++ exception, and the API wrappers would catch the exception and translate it into a return code. That would avoid the upfront cost of setjmp.

But all these cases are C(++) -> Rust -> C(++) unwinding. That’s the exact opposite of what I need here.

Not necessarily.

For the setjmp variant, I'm suggesting that you'd have a wrapper like

fn with_setjmp<R>(f: impl FnOnce() -> R) -> Option<R>

which in turn would use a function defined in C like

int with_setjmp_c(void (*callback)(void *ctx), void *ctx);

which would call setjmp then call the provided callback, which would be written in Rust.

And then the error handler that calls longjmp could be written in either C or Rust, doesn't matter. So it would be Rust -> C (with_setjmp_c) -> Rust -> C (libjpeg APIs) -> Rust/C (error handler) -> longjmp, unwinding to with_setjmp_c.

For the C++ variant, you would have to avoid Rust code in the middle because that would trigger the unwinding abort, but you would wrap individual libjpeg APIs in C++ and also define the error handler in C++, so it would be Rust -> C++ (wrapper) -> C -> C++ (error handler), unwinding to the wrapper.
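For the setjmp variant, the Rust side of the plumbing might look roughly like the sketch below. Big caveat: the with_setjmp_c defined here is only a Rust placeholder so the example compiles on its own. The real one would be written in C, call setjmp before invoking the callback, and return nonzero when longjmp fired. The Slot type and trampoline function are hypothetical names, and explicit generics are used instead of impl Trait so the trampoline can be named.

```rust
use std::ffi::c_void;
use std::os::raw::c_int;

// Placeholder for the C function `with_setjmp_c`. The real version would be
// C code that calls setjmp first; this one only invokes the callback.
unsafe extern "C" fn with_setjmp_c(
    callback: unsafe extern "C" fn(*mut c_void),
    ctx: *mut c_void,
) -> c_int {
    unsafe { callback(ctx) };
    0 // a real implementation would return nonzero after a longjmp
}

// Carries the closure in and its result out through the `void *ctx` pointer.
struct Slot<F, R> {
    f: Option<F>,
    result: Option<R>,
}

// C-ABI trampoline that recovers the Rust closure from `ctx` and runs it.
unsafe extern "C" fn trampoline<F: FnOnce() -> R, R>(ctx: *mut c_void) {
    let slot = unsafe { &mut *(ctx as *mut Slot<F, R>) };
    if let Some(f) = slot.f.take() {
        slot.result = Some(f());
    }
}

// Returns None if the callback was abandoned via longjmp. In that case no
// destructors inside `f` would have run, so `f` should not own anything
// that needs Drop.
pub fn with_setjmp<F, R>(f: F) -> Option<R>
where
    F: FnOnce() -> R,
{
    let mut slot = Slot { f: Some(f), result: None };
    let ctx = &mut slot as *mut Slot<F, R> as *mut c_void;
    let status = unsafe { with_setjmp_c(trampoline::<F, R>, ctx) };
    if status == 0 {
        slot.result
    } else {
        None
    }
}
```

A caller would then group many library calls inside one closure, e.g. with_setjmp(|| decode_all_rows()), where decode_all_rows is whatever sequence of FFI calls needs protecting.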

Rust -> C ( with_setjmp_c ) -> Rust -> C (libjpeg APIs) -> Rust/C (error handler) -> longjmp

AFAIK that bit in the middle would potentially leak all of its memory, which is why it's inferior to:

catch_unwind -> Rust -> C (exception-safe libjpeg API) -> Rust error handler panics.

where Rust, not C, is in control over unwinding and nothing leaks.

I’ve opened a tracking issue for the unwind attribute – it turns out we didn’t have one – https://github.com/rust-lang/rust/issues/58760.

I think at this point my recommendation would be to add those attributes and make crates be nightly-only for the time being if they need this unwind behavior.

Yes, it would. I think the C++ approach is better because it avoids that issue. That said, I agree that it would be best to have Rust just support this natively.

Very interesting and informative topic here; I didn’t know of the setjmp trick to “re-route” an unwind in C. Could you provide a minimal working example within this thread?

Assuming the following Rust code:

extern "C" fn panic () // is ! compatible with extern "C"?
{
    panic!()
}

fn main ()
{
    extern "C" { fn call_from_C (rust_fn: extern "C" fn ()); }

    unsafe {
        call_from_C(panic);
    }
}

What should the body of the C function

void call_from_C (void (*rust_fn) (void));

look like?

Unfortunately, you’ve misunderstood. setjmp can’t re-route an unwind caused by a Rust panic; it can only be used as the target of a longjmp. In the case of kornel’s crate, I was suggesting that he change the error handler to call longjmp instead of panic!().

The reason C is involved is that it’s not (yet) possible to safely call setjmp directly from Rust, due to the lack of #[ffi_returns_twice]. It is safe to call longjmp from Rust – or at least not inherently unsafe. As mentioned previously, unlike panic!(), longjmp will not run any destructors for the stack frames it unwinds; not only is that likely to leak memory, some Rust libraries (e.g. scoped threads) also rely on destructors to uphold safety invariants, so you need to be very careful.
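The destructor half of that contrast can be checked in plain Rust. This is only a sketch with hypothetical names (Guard, work_then_fail): it shows that a caught panic does run Drop for the frames it unwinds. The longjmp side is deliberately not demonstrated, since jumping over these frames would skip the drop entirely.

```rust
use std::panic;
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many Guard values have been dropped.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Guard;

impl Drop for Guard {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

// Owns a value with a destructor, then fails by panicking.
fn work_then_fail() {
    let _guard = Guard;
    panic!("simulated library error");
}

// Returns how many destructors ran while the panic unwound.
pub fn drops_after_caught_panic() -> usize {
    let before = DROPS.load(Ordering::SeqCst);
    let outcome = panic::catch_unwind(work_then_fail);
    assert!(outcome.is_err());
    DROPS.load(Ordering::SeqCst) - before
}
```

This assumes the default panic = "unwind" setting; with panic = "abort" nothing is caught and the process simply ends.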

We’ve decided to revert the change back to the old default – see https://github.com/rust-lang/rust/pull/58795 for the PR and https://github.com/rust-lang/rust/issues/58794 for the tracking issue about this.

(Bringing it up as a sidebar since I haven’t seen it mentioned yet: setjmp + longjmp across arbitrary safe Rust code is unsound, because scoped APIs such as crossbeam’s, rayon’s, and others expect to be able to run cleanup code or abort before the stack above them expires.)

Thank you very much for changing the decision!

It was my understanding that unwinding across FFI boundaries is UB, under all circumstances, period. The nomicon is quite explicit about that:

As such, unwinding into Rust from another language, or unwinding into another language from Rust is Undefined Behavior. You must absolutely catch any panics at the FFI boundary!

From this discussion I glean that it is actually allowed under some conditions. That's news to me. What are the conditions under which unwinding across FFI boundaries is okay? Does the Nomicon need updating?
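For reference, the conservative pattern the Nomicon quote asks for, catching every panic before it reaches the boundary and turning it into a status code, can be sketched as follows. The names checked_callback and parse_positive are hypothetical, and the 0 / -1 return convention is just an assumption for the example.

```rust
use std::os::raw::c_int;
use std::panic;

// Ordinary Rust logic that may panic.
fn parse_positive(input: i32) -> i32 {
    if input < 0 {
        panic!("negative input");
    }
    input * 2
}

// Hypothetical callback handed to C. No panic escapes it: failures are
// reported as -1, and `out` is written only on success.
pub extern "C" fn checked_callback(input: c_int, out: *mut c_int) -> c_int {
    match panic::catch_unwind(|| parse_positive(input)) {
        Ok(value) => {
            if !out.is_null() {
                // The caller promises `out` points to writable memory.
                unsafe { *out = value };
            }
            0
        }
        Err(_) => -1,
    }
}
```

This covers the C -> Rust direction only. It does nothing for the Rust -> C -> Rust case discussed above, where the C library itself expects the error callback not to return.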

In Rust, unwinding through FFI is declared to always be UB. As in, it’s UB because Rust says it is.

There are good reasons why in general unwinding through FFI can’t be guaranteed to be safe and reliable (because other languages can have their own unwinding semantics and Rust-incompatible implementations), so Rust’s general stance is entirely sensible.

However, in practice, there’s a special very useful case that seems to work fine despite being declared to be UB: unwinding through C, when using “exception-safe” C libraries. Rust doesn’t guarantee it’ll work, but it happens to work.

That sounds to me like it’s not actually UB. It’s not useful to say “this is UB but some programs rely on rustc not exploiting this”. We should find a way to specify the guarantees these libraries need, to make sure other Rust implementations (and rustc in the future) don’t start to break programs relying on this not actually being UB.

This is also a bad idea because it gives the wrong messaging around UB. In general, it is not okay to do things that are UB even if it currently happens to work. Some things are UB because we want to exploit them in the future.

Yeah, playing chicken with UB is the sort of thing that will one day make you extremely sad.

A real world example: chrome://inducebrowsercrashforrealz used to explode the browser process via segfault, by literally writing 42 to 0x00000000. Unfortunately, this made them extremely sad when LLVM started backtracing from the null deref… which is UB in C++. Now it’s just a ud2 instruction (or the equivalent on the target arch).

The moral: invoking UB is asking for your stuff to break at the drop of a hat.