Termination hook for handling error from `main() -> Result<(), SomeError>`

Hello everyone,

On Fuchsia, most of our processes run in an environment where STDOUT and STDERR are redirected to an equivalent of /dev/null. Instead, we structure most of our components to emit logs to a central syslog service, which allows us to attach more structured data to each message. Overall this works well for us, except when our main functions return Result::Err. By default, the Termination implementation for Result just prints the error with eprintln!(), and so these messages get lost. Presumably services on other operating systems could run into this too, for example a Linux service running under systemd with:

...
StandardOutput=null
StandardError=null
...

Has anyone explored creating an equivalent to std::panic::set_hook for the Termination trait? It seems like it'd be a pretty easy RFC and feature to write up, especially since we have a precedent with set_hook. I'd be happy to start working on this, but I wanted to check here first whether there is prior art in this space. I couldn't find anything in the rust-lang/rfcs repository on GitHub or here.

What exactly is the benefit of using such a hook over simply keeping main’s return type () (or ExitCode once that’s stable; in the meantime using process::exit for nonzero exit codes) and handling the Result value directly?
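
For concreteness, handling the Result directly could look roughly like the sketch below. It assumes a toolchain where ExitCode is stable; `Logger`, `run`, and `report` are hypothetical names used only for illustration, and `Logger` is an in-memory stand-in rather than any real syslog client.

```rust
use std::fmt::Display;
use std::process::ExitCode;

/// Hypothetical stand-in for a structured logger; it only collects
/// messages in memory so the example is self-contained.
#[derive(Default)]
struct Logger {
    records: Vec<String>,
}

impl Logger {
    fn error(&mut self, msg: &str) {
        self.records.push(format!("ERROR: {msg}"));
    }
}

/// Hypothetical application entry point that fails.
fn run() -> Result<(), String> {
    Err("config file missing".to_string())
}

/// Send the error to the logger instead of stderr and pick an exit status.
fn report<E: Display>(result: Result<(), E>, logger: &mut Logger) -> u8 {
    match result {
        Ok(()) => 0,
        Err(e) => {
            logger.error(&e.to_string());
            1
        }
    }
}

fn main() -> ExitCode {
    let mut logger = Logger::default();
    let code = report(run(), &mut logger);
    ExitCode::from(code)
}
```

The cost is that every binary has to remember to route its result through something like `report`, which is the habit the rest of this thread is about.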

+1 to @steffahn's points here.

This has come up in discussing whether to stabilize Termination too -- is it really better to implement the trait than to just have a macro for main or similar? (And just implementing a trait has way fewer questions than deciding what a hook should look like.)
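
As a rough illustration of the "just implement the trait" option, here is a minimal sketch. It assumes a toolchain where Termination and ExitCode are stable; `Logged`, `log_error`, `LOG`, and `run` are hypothetical names, and the in-memory `LOG` merely stands in for a real logging service.

```rust
use std::fmt::Debug;
use std::process::{ExitCode, Termination};
use std::sync::Mutex;

/// Hypothetical in-memory log sink standing in for a real logging service.
static LOG: Mutex<Vec<String>> = Mutex::new(Vec::new());

fn log_error(msg: String) {
    LOG.lock().unwrap().push(msg);
}

/// Hypothetical wrapper whose Termination impl logs instead of printing.
struct Logged<E>(Result<(), E>);

impl<E: Debug> Termination for Logged<E> {
    fn report(self) -> ExitCode {
        match self.0 {
            Ok(()) => ExitCode::SUCCESS,
            Err(e) => {
                // Route the error to the logger rather than stderr.
                log_error(format!("main failed: {e:?}"));
                ExitCode::FAILURE
            }
        }
    }
}

/// Hypothetical application entry point that fails.
fn run() -> Result<(), String> {
    Err("disk full".into())
}

fn main() -> Logged<String> {
    Logged(run())
}
```

One trade-off of this shape is that `?` cannot be used directly in `main`, since `Logged` is not a `Result`; the fallible work has to live in a helper such as `run`.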

To me, main() -> Result<(), ...> is like dbg! -- great for the simple cases, but it's totally expected that once one needs more control that something different is needed.

Out of curiosity, why not wire stderr up to that same log service? (systemd supports this by default, making it easy for programs to log by just printing to stderr.)

Thanks for the comments.

The main reason I think it's worth exploring this problem space is that this is really easy for developers to trip over. Fuchsia is a large open source operating system with many binaries spread across multiple teams, and most of our developers are pretty experienced with Rust by now. That said, about 90 of our binaries use main() -> Result<(), Error> with our syslogger, and many of these callsites silently drop the main function's error result (for example, I fixed a few in http://fxrev.dev/468902). This suggests to me that it might be a pretty common issue among applications that use some kind of logger service.

On Fuchsia, we could try to change our developers' habits by centrally banning the use of main() -> Result<(), Error> and having them switch to macros, wrapper libraries, etc. However, we are an open source OS, and so (hopefully) we'll have a third-party ecosystem developing applications for Fuchsia, and it would be more difficult to help those developers avoid this issue. It'd be a lot easier and less error prone if we could address this issue for everyone by registering a termination hook.

Finally, I prototyped my idea in the termination-hook branch of erickt/rust on GitHub, if you want to see it in action. I'm not sure this is quite the right approach if we ever end up stabilizing the Termination trait, since it's only wired up to work with the Result type. Presumably we'd want something more like PanicInfo, which could work with all Termination implementations.
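
To give a feel for the shape of such a hook, here is a user-space sketch loosely modeled on std::panic::set_hook. It is not the API from the prototype and nothing like it exists in std: `set_termination_hook`, `report`, `HOOK`, `CAPTURED`, and `run` are all hypothetical names, and a real std-level hook would be called by the runtime rather than by `main` itself.

```rust
use std::fmt::Debug;
use std::sync::Mutex;

/// Hypothetical hook type: receives the error that main returned.
type Hook = Box<dyn Fn(&dyn Debug) + Send + Sync>;

/// Globally registered hook; None means "fall back to stderr".
static HOOK: Mutex<Option<Hook>> = Mutex::new(None);

/// Hypothetical in-memory sink standing in for a logging service.
static CAPTURED: Mutex<Vec<String>> = Mutex::new(Vec::new());

/// Hypothetical analogue of std::panic::set_hook for main's error value.
fn set_termination_hook(hook: impl Fn(&dyn Debug) + Send + Sync + 'static) {
    *HOOK.lock().unwrap() = Some(Box::new(hook));
}

/// Mimics the default handling of Err, but consults the hook first.
fn report<E: Debug>(result: Result<(), E>) -> i32 {
    match result {
        Ok(()) => 0,
        Err(e) => {
            match HOOK.lock().unwrap().as_ref() {
                Some(hook) => hook(&e),
                // No hook registered: print to stderr as a fallback.
                None => eprintln!("Error: {e:?}"),
            }
            1
        }
    }
}

/// Hypothetical application entry point that fails.
fn run() -> Result<(), String> {
    Err("service unavailable".to_string())
}

fn main() {
    // A logging library could register this once during initialization.
    set_termination_hook(|err| CAPTURED.lock().unwrap().push(format!("{err:?}")));
    std::process::exit(report(run()));
}
```

Because the hook only sees `&dyn Debug`, it has the same limitation noted above: it carries no structured information, which is where a PanicInfo-like type would come in.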

Hm, some of the examples in the commit you linked even have some special procedural macros on them already to handle async. Those could perhaps be modified in order to also capture the return value and properly log it, right?

Also, every example in this commit seems to first do some form of call for initializing logging. This call could possibly be included in a non-async version of the macro, too, so there would even be more benefits beyond just fixing the main-return-value handling.

I don't think that requiring everyone to either use a macro or return () is too hard or confusing.

Fuchsia is a capability-based operating system, and so we lean heavily into the Principle of Least Privilege. As part of that, we're trying to keep as little functionality as possible in our component manager (which I suppose is somewhat analogous to systemd's service manager).

Our system logger isn't a first-class service built into component manager. Instead it is a normal service, and the capability to communicate with it is routed to most components. Furthermore, we don't have a notion of STDOUT/STDERR in our component model, so any communication with the system logger has to be done in-process. We are exploring how we might want to support this, but we're still pretty early in the design.

But even if we did have the ability to route standard I/O to our system logger, it still might be advantageous to have a termination hook. We are working towards adding structured log messages to our system logger, so it would be handy to use a hook to capture that information, rather than just getting raw strings with STDERR.

Yeah, it definitely is a possibility to use a wrapper and actively encourage people to use it. I posted about this here mainly because we happened to perform a natural experiment that revealed how easy it is to miss this issue. I wanted to explore whether other people have had this problem, and whether this is something we might want to address holistically rather than case by case.