Unhandled panics in Rust vs. in FFI

Hey guys, I am thinking about the difference between the consequences of unhandled panics in Rust alone and across FFI.

Take the following two call chains as examples:

// case 1: Rust alone
main() calls test(), a panic occurs in test(), and there is no catch_unwind() anywhere in the user's code.
// case 2: FFI
C++ main() calls Rust test(), a panic occurs in test(), and there is no catch_unwind() anywhere in the Rust code.

I ran some test cases on my local machine (a Mac). In both case 1 and case 2, the program prints the "thread panicked" message and is terminated. However, case 2 also prints:

fatal runtime error: failed to initiate panic, error 5
Abort trap: 6

I have searched a lot but still do not understand what "failed to initiate panic" means. When I searched for the abort trap, though, I found that it indicates an unhandled exception on macOS.

If the fatal runtime error also means an unhandled panic and is generated by Rust, then I guess case 1 doesn't show this message because the runtime library wraps the main function in catch_unwind(). (Please correct me if I am wrong.)

However, I am still confused about the difference between the results of these two cases. From my perspective, the thread is simply terminated in both. Is there any further reason why I should avoid the fatal runtime error in case 2?

I would appreciate any opinions!

The nomicon has this to say about the topic:

If an unwinding operation does encounter an ABI boundary that is not permitted to unwind, the behavior depends on the source of the unwinding (Rust panic or a foreign exception):

  • panic will cause the process to safely abort.
  • A foreign exception entering Rust will cause undefined behavior.

I believe the first point matches the case shown in your pseudocode.
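
One common way to keep a panic from ever reaching the boundary is to catch it inside the function that foreign code calls and turn it into an error code. Here is a minimal sketch of that idea. The names `test` and `test_ffi`, the status codes, and the out-pointer convention are all invented for illustration and are not from the code discussed in this thread.

```rust
use std::panic;

// Hypothetical Rust function that may panic.
fn test(input: i32) -> i32 {
    if input < 0 {
        panic!("negative input: {}", input);
    }
    input * 2
}

// Hypothetical entry point for C or C++ callers. A really exported symbol
// would also need a no_mangle attribute, which is omitted in this sketch.
// Returns 0 on success and -1 if a panic was caught.
pub extern "C" fn test_ffi(input: i32, out: *mut i32) -> i32 {
    // Catch the panic here so the unwind never reaches the ABI boundary.
    match panic::catch_unwind(|| test(input)) {
        Ok(value) => {
            if !out.is_null() {
                // Assumption: the caller passes a valid, writable pointer.
                unsafe { *out = value };
            }
            0
        }
        Err(_payload) => -1,
    }
}
```

With this shape, the C++ side only ever sees a status code. The panic message is still printed by the panic hook, but the process keeps running.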

Hi @parasyte

Thanks for your response. I know that on nightly, the process aborts when a panic unwind reaches a "C" ABI boundary. However, it seems that this is already implemented in the current stable version.

I guess developers would still care about panics they didn't catch, for the sake of soundness, right?

The article I linked to has a lot more information about it than what I can offer, as I am unfortunately very unfamiliar with the peculiarities of Rust and FFI with respect to panics and exception handling. If I had to guess, it would seem appropriate that attempting to unwind stack frames from either language is UB because the other cannot appropriately run destructors, and such things.

Unwinding stack frames is just based on the landing pads stored in the DWARF information, which is why the behavior is being defined through the "-unwind" suffixed ABIs. Here, I am more interested in how industry and the community think about uncaught panics/exceptions. Will people still want to avoid them? Or do they not care, just because the process can "safely" abort?

There is a condition on whether it can safely abort. So, no, I do not see any reason to believe that no one should care about it.

(This is a question about using / the current behavior of Rust, so should be on urlo.)

The current behavior of a panic! unwinding through an extern "C" function is Undefined Behavior, full stop. C functions are not allowed to unwind (although since most C compilers also are C++ compilers, it's very common that they have a nonstandard configuration which allows unwinding over the C ABI), and there's no predicting how the calling code may behave (e.g. trying to catch a Rust unwind as a C++ exception).

In the future, it is intended that unwinding from a function marked extern "C" will be an immediate abort of the program, not unlike attempting to unwind from a function marked noexcept in C++. It's not the prettiest abort (it's not a process::abort() style abort; it's currently lowered as the guaranteed undefined instruction), but it's consistent and defined behavior.

Doing this has been deferred until extern "C-unwind" is made available, because in practice it mostly works, and does so essentially the same way using the aforementioned flag to allow C++ unwinding across/through C code.
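
For comparison, here is a minimal sketch of what the "C-unwind" ABI is meant to allow, assuming a toolchain on which extern "C-unwind" is available. The function names are hypothetical. Both sides are written in Rust only to keep the example self-contained; the point is that the unwind crosses a "C-unwind" boundary and is caught on the other side.

```rust
use std::panic;

// Hypothetical function using the "C-unwind" ABI, which permits an unwind
// to cross the boundary instead of forbidding it.
extern "C-unwind" fn may_unwind(trigger: bool) -> i32 {
    if trigger {
        panic!("panic crossing a C-unwind boundary");
    }
    7
}

// Hypothetical caller: catches the unwind after it has crossed the boundary
// and extracts the message from the panic payload.
fn call_through_boundary(trigger: bool) -> Result<i32, String> {
    panic::catch_unwind(|| may_unwind(trigger)).map_err(|payload| {
        match payload.downcast_ref::<&'static str>() {
            Some(msg) => msg.to_string(),
            None => "non-string panic payload".to_string(),
        }
    })
}
```

Calling `call_through_boundary(true)` yields an `Err` carrying the panic message, while `call_through_boundary(false)` yields `Ok(7)`.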

The behavior you're observing is what happens to occur when unwinding from C or C++ main to the system. As you found, the abort trap is macOS reporting the unhandled unwind. Based purely on speculation, the fatal runtime error is likely printed from the Rust panic runtime due to macOS's attempt to clean up the unwind state as if it were a C++ exception. Despite Rust's unwinds using the same OS-provided mechanism as C++ exceptions to handle unwinding, there are enough differences in how they work that it's not unthinkable that it would cause that message to be printed. (One likely theory is that the Rust panic runtime itself panics when the OS tries to call into it after main, and trying to create a new panic unwind fails, printing this error message.)

So just don't unwind from an extern "C" fn, and definitely don't unwind from a C- or C++-defined main 🙂 "Just don't make mistakes" is the C/C++ way.

That error is indeed printed when __rust_start_panic (the magic symbol which starts an unwind or aborts based on -Cpanic) fails.

Error 5 is _URC_END_OF_STACK on the Itanium unwind ABI (this post originally said error 3, _URC_FATAL_PHASE1_ERROR), if you'd like to look up more details yourself. If I recall Itanium correctly, this error is produced when first starting an unwind if there are no frames registered to catch it. In this case, that is because you're attempting to unwind out of a C++ main.
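
For quick reference, the reason codes defined by the Itanium C++ ABI exception handling specification can be written down as a small lookup. `reason_name` is a hypothetical helper written for this post, not something exported by std or libunwind.

```rust
// Reason codes as listed in the Itanium C++ ABI exception handling spec.
// `reason_name` is a hypothetical helper, not part of any library.
fn reason_name(code: i32) -> Option<&'static str> {
    let name = match code {
        0 => "_URC_NO_REASON",
        1 => "_URC_FOREIGN_EXCEPTION_CAUGHT",
        2 => "_URC_FATAL_PHASE2_ERROR",
        3 => "_URC_FATAL_PHASE1_ERROR",
        4 => "_URC_NORMAL_STOP",
        5 => "_URC_END_OF_STACK",
        6 => "_URC_HANDLER_FOUND",
        7 => "_URC_INSTALL_CONTEXT",
        8 => "_URC_CONTINUE_UNWIND",
        _ => return None,
    };
    Some(name)
}
```

So the "error 5" in the message corresponds to `reason_name(5)`, that is, _URC_END_OF_STACK.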

The Rust runtime does set up a catch frame around a Rust main which translates an unwind to a graceful(ish) standard exit with an error return code.
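
As a rough illustration of what such a catch frame amounts to, here is a sketch. It is not the actual std startup code: `run_with_catch_frame` and the stand-in `test` are hypothetical names. The one detail taken from real behavior is that a Rust program whose main panics exits with status 101.

```rust
use std::panic;

// Hypothetical stand-in for the user's `test()` from case 1.
fn test() {
    panic!("panic in test()");
}

// Hypothetical helper: roughly what a catch frame around `main` achieves.
// This is NOT the real std startup code, only an illustration.
fn run_with_catch_frame(user_main: fn()) -> i32 {
    match panic::catch_unwind(user_main) {
        // Normal return: report success.
        Ok(()) => 0,
        // The unwind was caught: translate it into an error exit code.
        Err(_payload) => 101,
    }
}
```

A C++ main has no equivalent frame for Rust panics, which is why case 2 ends up at the end of the stack with nothing to catch the unwind.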

Hi, I still don't clearly understand the part:

the Rust panic runtime itself panics when the OS tries to call into it after main, and trying to create a new panic unwind fails

From my understanding, _URC_FATAL_PHASE1_ERROR is a return value used in the search phase to indicate that no stack frame is able to catch the exception or panic. In the case of C++ calling Rust, there should not be any stack frame able to catch the Rust panic, because catch_unwind only wraps a Rust main by default. Therefore, it should call terminate directly.

Why would the system try to create a new panic unwind after termination?


That was speculation, which turned out to probably be false once I went and dug to find out what error 5 was.

... Also apparently I can't read, and looked up error 3 instead for some reason. Error 5 is _URC_END_OF_STACK.

Perhaps of note, C++ catch can catch Rust panic unwinds, because it can catch any unwind. Doing so is solidly outside the C++ specification and relying on the platform to define the semantics of cross-language unwinds, of course. Even if C++ catch can't, the runtime absolutely can catch arbitrary unwinds, and is within its rights to assume they're C++ unwinds and thus following the full C++ exception ABI (which Rust unwinds don't; they just follow the subset required for unwinding, not the additional requirements to be manipulable from C++).

The reason why a new unwind may occur is if, during handling of the first unwind (perhaps the C++ runtime on macOS catches exceptions leaving main, so the panic unwind succeeds and is caught there to be discarded, calling back into Rust code), another panic occurs (however, C++ catching a Rust unwind should in theory rtabort!("Rust panics must be rethrown")).

On the other hand, the message from a panic is printed before starting the unwind/abort (for obvious reasons), so if it were the case that this fatal runtime error message came from a second panic, we should see that second panic's message. (Or, if writing the second panic message fails, either nothing or a message for that failing, i.e. from the rtabort! guard against a panic hook itself panicking, which may write sufficiently differently to itself succeed.)
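
The ordering described above (message first, unwind second) can be checked with a small experiment. This sketch assumes that a custom panic hook is a fair stand-in for the default message printer, since both run at the same point. All names are made up for the example, and installing the custom hook suppresses the default message while it is active.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

static HOOK_RAN: AtomicBool = AtomicBool::new(false);
static HOOK_RAN_BEFORE_DROP: AtomicBool = AtomicBool::new(false);

struct Guard;

impl Drop for Guard {
    fn drop(&mut self) {
        // Runs while the stack is being unwound; record whether the hook
        // had already been called by then.
        HOOK_RAN_BEFORE_DROP.store(HOOK_RAN.load(Ordering::SeqCst), Ordering::SeqCst);
    }
}

// Hypothetical experiment: returns true if the panic hook ran before the
// destructor of a local in the panicking frame.
fn hook_runs_before_unwind() -> bool {
    let previous = panic::take_hook();
    panic::set_hook(Box::new(|_info| {
        HOOK_RAN.store(true, Ordering::SeqCst);
    }));
    let result = panic::catch_unwind(|| {
        let _guard = Guard;
        panic!("demo panic");
    });
    // Restore whatever hook was installed before the experiment.
    panic::set_hook(previous);
    result.is_err() && HOOK_RAN_BEFORE_DROP.load(Ordering::SeqCst)
}
```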

In my test case, I allocate objects in both a C++ stack frame and a Rust stack frame, and implement drop to print a message so I can tell when they are destroyed. The result shows only error 5 and the abort trap; no objects are dropped, which means that libunwind never enters the cleanup phase.
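
For contrast, here is a sketch of the same kind of experiment in pure Rust where a catch frame does exist, so the search phase finds a handler and the cleanup phase runs. The names are hypothetical, and a global log replaces printing so the result can be inspected.

```rust
use std::panic;
use std::sync::Mutex;

// Records the order in which destructors run.
static DROP_LOG: Mutex<Vec<&'static str>> = Mutex::new(Vec::new());

struct Noisy(&'static str);

impl Drop for Noisy {
    fn drop(&mut self) {
        DROP_LOG.lock().unwrap().push(self.0);
    }
}

fn inner() {
    let _local = Noisy("inner");
    panic!("panic in inner()");
}

fn outer() {
    let _local = Noisy("outer");
    inner();
}

// Hypothetical experiment: catch the panic at the top and report which
// destructors ran while the stack was unwound.
fn drops_seen_when_caught() -> Vec<&'static str> {
    DROP_LOG.lock().unwrap().clear();
    let caught = panic::catch_unwind(outer);
    assert!(caught.is_err());
    DROP_LOG.lock().unwrap().clone()
}
```

Here both locals are dropped, innermost frame first. In the C++ main case nothing catches the unwind, so no destructor runs at all, which matches the observation above.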

So here is my understanding (please correct me if it is wrong): the program simply terminates, with error 5 returned from the search phase. After printing the fatal runtime error message, it calls libc::abort at the end (library/std/src/sys/unix/mod.rs), which terminates the process immediately.

The reason it printed "Abort trap: 6" is that libc::abort raises the SIGABRT signal. This is how the system responds to the unhandled panic.
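
The link between SIGABRT and the number 6 can be observed from a parent process. This is a Unix-only sketch that assumes an `sh` is available on the PATH. The helper name is made up, and the child here is a shell that signals itself rather than the actual test program.

```rust
use std::os::unix::process::ExitStatusExt;
use std::process::Command;

// Hypothetical helper (Unix only): start a child shell that sends SIGABRT
// to itself, then report which signal terminated it.
fn signal_that_killed_child() -> Option<i32> {
    let status = Command::new("sh")
        .arg("-c")
        .arg("kill -s ABRT $$")
        .status()
        .expect("failed to run sh");
    // `signal()` is Some(n) when the child was terminated by signal n.
    status.signal()
}
```

On Linux and macOS SIGABRT is signal number 6, which is the number the shell shows after "Abort trap".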

Therefore, I think no unwinding succeeds at any point in the whole process.