Should Rust programs unwind on SIGINT?

(Sorry if I'm posting under the wrong category or if this should belong in URLO instead, but I think it fits in here because I am wondering about potential changes to how Rust programs work).

RAII patterns built on Drop are not uncommon (IIRC, there were issues with this and threads before (thread guards, maybe?), but in any case, people do use Drop to clean up resources).

However, it seems that drop is not called when the program receives a SIGINT. By default, it just terminates. I just encountered this, and it seems somewhat surprising.

What's the rationale for SIGINT not unwinding and cleaning up resources? panic!() does this and terminates the program cleanly, but Ctrl+C doesn't. In my opinion, it is surprising that sending Ctrl+C to my program doesn't make it unwind and clean up.
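To make the contrast concrete, here is a minimal sketch of the panic half of that observation. It assumes the default panic=unwind strategy; `Guard`, `CLEANED_UP`, and `cleanup_runs_on_panic` are hypothetical names used only for illustration.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

/// Set to true once `Guard::drop` has run.
static CLEANED_UP: AtomicBool = AtomicBool::new(false);

/// Hypothetical RAII guard standing in for a temp dir, lock file, etc.
struct Guard;

impl Drop for Guard {
    fn drop(&mut self) {
        // Real code would release the resource here.
        CLEANED_UP.store(true, Ordering::SeqCst);
    }
}

/// Panics while holding a guard and reports whether the guard was dropped
/// during unwinding. Assumes panic=unwind (the default), not panic=abort.
fn cleanup_runs_on_panic() -> bool {
    CLEANED_UP.store(false, Ordering::SeqCst);
    let result = panic::catch_unwind(|| {
        let _guard = Guard;
        panic!("simulated failure");
    });
    result.is_err() && CLEANED_UP.load(Ordering::SeqCst)
}
```

Calling `cleanup_runs_on_panic()` from `main` should report that the guard was dropped. With the default SIGINT disposition there is no equivalent step: the process is terminated without any Rust code, including `drop`, getting a chance to run.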

Would it be possible and desirable to change this behaviour? What would be the reasons to not do this change?

The ability to make the program terminate with a panic triggered from outside is kind of similar to the ability to stop another thread without that thread cooperating (cooperating would mean, e.g., that the thread reads some shared variable or listens to a channel). Both are things that Rust cannot currently do.
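For reference, the cooperative variant that Rust does support today might look roughly like the following sketch, which uses only the standard library. The function name `run_until_stopped` and the 20 ms delay are arbitrary choices for illustration.

```rust
use std::sync::mpsc::{self, TryRecvError};
use std::thread;
use std::time::Duration;

/// Spawns a worker that keeps working until it is asked to stop over a
/// channel, then returns how many iterations it completed.
fn run_until_stopped() -> u64 {
    let (stop_tx, stop_rx) = mpsc::channel::<()>();

    let worker = thread::spawn(move || {
        let mut iterations = 0u64;
        loop {
            // One unit of (pretend) work.
            iterations += 1;

            // Cooperation point: the worker itself checks for a stop request.
            match stop_rx.try_recv() {
                Ok(()) | Err(TryRecvError::Disconnected) => break,
                Err(TryRecvError::Empty) => thread::yield_now(),
            }
        }
        iterations
    });

    // Let the worker run briefly, then ask it to stop.
    thread::sleep(Duration::from_millis(20));
    // If the worker is already gone, join() below reports why.
    let _ = stop_tx.send(());
    worker.join().expect("worker panicked")
}
```

The worker can only be stopped at the points where it chooses to look at the channel; nothing outside the thread can force it to exit in between.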

For comparison, other programming languages do offer features similar to what this question asks for. I know that e.g. in Haskell there is support for so-called “asynchronous exceptions” that can emerge from things like a SIGINT or can be thrown into a thread from the outside, if you have some kind of thread ID for that thread. Haskell is a pure functional language and ordinary Haskell code tends to allocate a lot, so the implementation of this feature uses (AFAIK) the allocation points to have the runtime check for any asynchronous exceptions. Rust has no such runtime and does not allocate everywhere, so this approach could not work for Rust, and that is only one of multiple reasons.

There’s also a need to be super careful when designing exception-handling primitives in Haskell to make sure that you aren’t reacting to asynchronous exceptions immediately while you’re inside cleanup code. In Haskell, exceptions are a bit more common for indicating expected, non-fatal errors, e.g. around IO, so the problem is probably smaller in Rust. Nevertheless, imagine a multi-threaded Rust program in which some thread is already unwinding when the SIGINT arrives: what should it do? Abort and skip all the resource cleanup after all? Or ignore the SIGINT? But if the thread has a catch_unwind, it might then never terminate. (I guess this question could be resolved somehow, e.g. by continuing to unwind at the catch_unwind if a SIGINT was received in the meantime.)

And Haskell also has the huge advantage of being garbage collected and mostly pure (== without side-effects). This means that it is usually realistic for executing code to safely panic at pretty much any point. In Rust, by contrast, writing unsafe code often involves assessing unwind safety: you need to consider whether it is safe to exit the function with a panic at every place where a panic can happen, e.g. when calling potentially panicking code or calling a callback. Conversely, this means that, for soundness, Rust code currently relies on the fact that no panic can happen in certain places.

Here’s an article about Haskell’s asynchronous exceptions and also comparing to cancelling Futures in Rust:

One potential issue is unwind safety.

Unsafe code has to be careful that, when it calls into untrusted code, an unwind does not expose unsafe state to safe code.

In straight-line code, however, it's perfectly fine for unsafe code to mess up state temporarily. E.g. Vec could increment its length before moving in the new element being pushed. (With optimizations and code movement, I don't think you even have a guarantee of order here.)
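A safe-code analogue may help illustrate the concern. The sketch below is not how Vec is implemented; `Tally` and its methods are hypothetical, and a logical invariant (the stored total equals the sum of the items) stands in for the memory-safety invariants that unsafe code has to protect.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Invariant: `total` always equals the sum of `items`.
struct Tally {
    items: Vec<u32>,
    total: u32,
}

impl Tally {
    /// Fragile: calls the (possibly panicking) callback while the
    /// invariant is temporarily broken.
    fn push_fragile(&mut self, value: u32, audit: impl FnOnce(u32)) {
        self.items.push(value);
        audit(value); // an unwind here leaves `total` out of date
        self.total += value;
    }

    /// Careful: runs the callback first, then updates the state in
    /// straight-line code that does not call out to untrusted code.
    fn push_careful(&mut self, value: u32, audit: impl FnOnce(u32)) {
        audit(value);
        self.items.push(value);
        self.total += value;
    }

    fn is_consistent(&self) -> bool {
        self.items.iter().sum::<u32>() == self.total
    }
}

/// Pushes with a panicking callback and reports whether the invariant
/// still holds once the panic has been caught.
fn consistent_after_panic(careful: bool) -> bool {
    let mut tally = Tally { items: Vec::new(), total: 0 };
    let outcome = panic::catch_unwind(AssertUnwindSafe(|| {
        if careful {
            tally.push_careful(5, |_| panic!("audit failed"));
        } else {
            tally.push_fragile(5, |_| panic!("audit failed"));
        }
    }));
    assert!(outcome.is_err());
    tally.is_consistent()
}
```

`push_careful` is only correct because a panic cannot start between its last two statements. An unwind that could begin at an arbitrary instruction, as a signal-triggered one would, removes that guarantee.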

By the way, the comparison to Futures suggests a possible approach. Take an application that is completely async, and/or only has long or blocking computations in worker threads that regularly check (via a shutdown request during long computations, or via Interrupted I/O errors) whether they should start to cooperatively shut down immediately. For such a program you could override the SIGINT handler with something that makes the asynchronous runtime start shutting down and cancel every Future, and that signals all the worker threads to stop working. The risk of missing some way for the program to get locked up so that CTRL+C can’t terminate it will of course still exist, but this offers a way to handle resource cleanup, and it should not be too hard to make sure the program is well-behaved. And we also have SIGTERM and ultimately SIGKILL when things go wrong.
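The worker-thread half of this idea can be sketched with the standard library alone. In the sketch below the shutdown flag is set directly; in a real program it would be set from a SIGINT handler, which std does not provide and which is therefore left out. `WorkerGuard` and `shut_down_workers` are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Per-worker RAII guard; counts how many workers ran their cleanup.
struct WorkerGuard {
    cleaned: Arc<AtomicUsize>,
}

impl Drop for WorkerGuard {
    fn drop(&mut self) {
        self.cleaned.fetch_add(1, Ordering::SeqCst);
    }
}

/// Starts `worker_count` workers, requests shutdown, waits for them, and
/// returns how many guards were dropped.
fn shut_down_workers(worker_count: usize) -> usize {
    let shutdown = Arc::new(AtomicBool::new(false));
    let cleaned = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..worker_count)
        .map(|_| {
            let shutdown = Arc::clone(&shutdown);
            let guard = WorkerGuard { cleaned: Arc::clone(&cleaned) };
            thread::spawn(move || {
                let _guard = guard; // dropped when the worker returns
                while !shutdown.load(Ordering::SeqCst) {
                    // One slice of (pretend) work between checks.
                    thread::sleep(Duration::from_millis(1));
                }
            })
        })
        .collect();

    // Stand-in for what a SIGINT handler would do: only set the flag.
    shutdown.store(true, Ordering::SeqCst);

    for handle in handles {
        handle.join().expect("worker panicked");
    }
    cleaned.load(Ordering::SeqCst)
}
```

Because every worker returns normally, its destructors run as usual and no unwinding is needed. The cost is the one named above: a worker that never reaches its check keeps the process alive.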

Thank you for the responses! In particular, the remark that "something has to detect the signal and something has to start unwinding somewhere" makes it clear that this is not really possible, because Rust has neither the thing to check nor the right place to do so.

The async idea is interesting, and perhaps could make its way into mainstream runtimes behind a feature flag, but that is no longer up to Rust the language to decide.

Rust tries to be a low-level language and catching and overriding OS signals probably is not something the language should do on its own. Messing with these things would be something I would find really surprising. And users may want to have arbitrary handling for CTRL+C or anything else (a text editor can ask you if you really want to quit, a web server might want to finish all the in-flight requests...).

But you absolutely can subscribe to it and handle it manually or using a library. Originally, I was a bit confused about why there is no support in the standard library, but after years of maintaining a library for handling signals (signal-hook), and seeing how people disagree with each other about what the right strategy is and what unusual use cases they come up with, I understand now that nobody has been able to figure out the right interface yet.

Wouldn't it be possible to create a signal handler with signal-hook that panics, and catch the panic in the main function? That's somewhat similar to how it is handled in Python (interrupts are automatically converted to exceptions), except that the panic is handled explicitly.

No, you cannot unwind from signal handlers.

The unwinder can unwind from signal handlers just fine. It does need compiler support for asynchronous exceptions, though. For example, gcc has -fasynchronous-unwind-tables, which makes the unwind tables exact at each instruction boundary. It also has -fnon-call-exceptions, which only emits entries for instructions that can normally trap, like memory accesses. See for example the email thread "How to benefit from asynchronous unwind tables?" by Ewgenij Gawrilow.

You can't do that because a) unwinding across an FFI boundary is UB (and the signal handler is called from C, so across an FFI boundary), and b) you could unwind into a place that requires no panics for soundness.

I have a really strong opinion on this one, which is basically that we should all strive to write crash-only software.

If the process is sent SIGINT or SIGTERM (often at system shutdown) in the ideal case there is no signal handler - i.e. the kernel just unilaterally kills the process. That's maximally efficient! Unwinding the stack and calling e.g. free() on a ton of String etc. objects is just a waste of CPU.

I suspect the reason you're bringing this up is the interesting problem around "external persistent" data, such as that created by tempfile::tempdir().

One approach I've been meaning to wire up into a crate: at least on Linux, you can use prctl(PR_SET_PDEATHSIG) to get a signal when the parent exits. So the idea here is that rather than having the parent process do e.g. std::fs::remove_dir_all() in its Drop, we instead call fork(), have that child set up PDEATHSIG, and pass the path to it over a pipe. (This would generalize to an "external process drop handler".) When the parent exits due to e.g. crashing, the child would clean up. (To handle scoping correctly, though, the parent would of course still need to ask the child to eagerly invoke the cleanup; it's debatable whether that should be sync or async.)
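A rough, std-only sketch of the same "external cleanup helper" idea follows. Note that it swaps in a different mechanism from the one described above: instead of fork() plus PDEATHSIG, the helper is a small `sh` process that waits for end-of-file on a pipe held by the parent, which it sees once the parent exits for any reason or closes the pipe. It assumes a Unix system with `sh`, `cat`, and `rm` available; `spawn_cleanup_helper` and `demo` are hypothetical names.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::process::{Child, Command, Stdio};

/// Spawns a helper process that removes `dir` once its stdin reaches
/// end-of-file, i.e. once the parent has exited or closed the pipe.
fn spawn_cleanup_helper(dir: &Path) -> io::Result<Child> {
    Command::new("sh")
        .arg("-c")
        // `cat` blocks until the write end of the pipe is closed.
        .arg(r#"cat >/dev/null; rm -rf -- "$1""#)
        .arg("cleanup-helper") // becomes $0 in the script
        .arg(dir) // becomes $1 in the script
        .stdin(Stdio::piped())
        .spawn()
}

/// Creates a scratch directory, hands it to a helper, then simulates the
/// parent going away by closing the pipe. Returns true if the directory
/// is gone once the helper has finished.
fn demo() -> io::Result<bool> {
    let dir = std::env::temp_dir()
        .join(format!("cleanup-helper-demo-{}", std::process::id()));
    fs::create_dir_all(&dir)?;

    let mut helper = spawn_cleanup_helper(&dir)?;

    // Closing our end of the pipe is what the kernel would also do if this
    // process crashed or was killed.
    drop(helper.stdin.take());
    helper.wait()?;

    Ok(!dir.exists())
}
```

Closing the pipe early doubles as the eager "please clean up now" request, so normal scoping and crash cleanup share one code path. Whether the parent then waits for the helper is the sync-versus-async question raised above.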

An advantage of this approach is one can reasonably use e.g. panic=abort and still ensure that things like temp dirs are cleaned up.

The generalization of this "external cleanup helper" is of course things like e.g. systemd-tmpfiles-cleanup.service but that has pretty conservative defaults so tempdirs can remain for a long time.

I don't think Rust can (or should) ever add a default SIGINT/SIGTERM handler.
