Unwinding through FFI after Rust 1.33

@gnzlbg in retrospect this makes sense, but also seems to indicate that a no_std dynamic library would potentially be in quite a bit of trouble.

Just trying to understand the possible cases here, it seems like:

  • std:
    • The panic happens
    • The panic_hook is executed
    • The panic runtime is executed, which is either an unwind or an abort
  • no_std:
    • The panic happens
    • The program immediately jumps to the panic_handler function and must never return.

Is this correct?

Assuming so, here’s two annoying questiond that Rust probably needs an answer to because it can happen:

  • What happens if std rust code calls into a dynamic lib with no_std and a panic_handler? My guess, the lib uses its panic_handler and never returns
  • What happens if std Rust calls a std dynamic lib? will the right panic_hook get used? even across compiler versions? Guess: it’s probably totally broken if that happens?

Lots of things that can’t be checked need to align for it to work (e.g. what happens if the two Rust crates linked use the same panic implementation but a different memory allocator due to a different toolchain version / linking issue, and one crates tries to free the panic payload of the other crate, etc.).

All of the EH ABIs specify that the exception record contains some field that is used to destroy the exception record. You could check the exact function pointers to figure out if the Rust exception record could have a mismatched memory allocator issue.

1 Like

@jcranmer

If this function is exactly a call to free, this dynamic check can work.

Otherwise, what happens if this function is statically linked into each dynamic library ? Two copies of the function will have different function pointers, yet they might be the exact same function. I’d guess we could maybe force these to always be dynamically linked, not sure.

The lang team is requesting users to participate in forming the solution:

4 Likes

The fact that there’s no well-defined way to trigger SEH on Windows, which is what libjpeg and libpng (the libraries originally under discussion) need to do, contradicts your assertion that “this problem has always had a solution that works correctly and reliably 100% of the time”.

How come? What works 100% of the time is using an error code in FFI, converting from/to each language error handling mechanism. On functions exposed from Rust to FFI you catch panics and return an error code, and in FFI functions called from rust you raise the error code as a panic. On the other side of FFI, e.g., if it is C++, you catch all exceptions in exposed functions and return error codes, and when calling FFI you raise the error codes as exceptions.

Rust does not need to be able to raise / catch C++ exceptions using this method.

In the libjpeg example, where you have Rust -> C -> Rust, you need your extern "C" Rust functions to return an error code, then write a C wrapper for it that raises a SEH, then write a C wrapper that catches the C in the code that calls libjpeg, and then call that wrapper from Rust, so you end up with: Rust -> C libjpeg wrapper -> libpeg -> C wrapper over Rust callback -> Rust callback.

You don’t need to be able to trigger SEH from Rust for that to work, and that always works, because you only use the native error reporting mechanism of each language, and via FFI only error codes are propagated.

1 Like

Okay, I actually did not think of this. That does seem like it should be legal and solve the problem, though I’d be curious about the performance impact.

Having to write multiple C wrappers adds a substantial amount of pain, here. I still think we need a mechanism to handle this in Rust.

1 Like

Perhaps; but in the interim, the C wrappers provide a stable mechanism to achieve the desired effect. Thus, I think it would be reasonable to change to abort now and importantly to close the existing soundness hole in the language.

3 Likes

We had this argument, and I don’t particularly want to have it again. The conclusion in the last round of the argument was “let’s time-bound this to the next couple of releases, and within that time work with the actual developers of code using this to find a solution, and when we have a reasonable solution we can push this again”.

So, let’s actually work with the developers of this code and try to find a solution. (Also, let’s not split that discussion between multiple places.)

1 Like

I’m drafting an RFC for #[unwind(Rust)] as suggested in the tracking issue linked above.

Here is what I have so far; suggestions are welcome: https://github.com/rust-lang/rfcs/blob/annotate-unwind-rust/text/XXX-annotate-unwind-rust.md

I’d like to rewrite my RFC to be more strictly directed towards the FFI interoperability use case, and I’ve written up my current thoughts in comments on the GitHub discussion, but I believe I need some constructive feedback before I can move forward productively. @gnzlbg, @kornel, @jcranmer, @josh, @mjbshaw, any thoughts?

The RFC link, again: https://github.com/rust-lang/rfcs/blob/annotate-unwind-rust/text/XXX-annotate-unwind-rust.md