Currently the standard library has std::error::Error for error handling, which has proved to be problematic for a number of reasons. I will argue this is partly because it tries to cover two different types of errors (fast-path expected faults and unexpected failures), and that separating them gives a coherent strategy that is backwards compatible.
The std::error::Error trait
Copying from the standard library verbatim (except for explicit dyn):
pub trait Error: Debug + Display {
    fn description(&self) -> &str { ... }
    fn cause(&self) -> Option<&dyn Error> { ... }
}
That is, types that implement Error must also implement Debug and Display, and may optionally provide a description, and/or optionally link to another causal error.
There are problems with this trait that led to the Fail trait in the failure crate. They are:
- Implementations of Error are not required to be thread-safe (no Send, Sync bounds).
- Implementations of Error are not required to be Any, so downcasting isn’t guaranteed to work (no Any or 'static bounds).
- The trait contains 3 ways of formatting as text:
  - The description method, which just references a static string,
  - The Display impl, which may do more complex formatting,
  - The Debug impl, similar to above.
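Because these bounds are optional, the standard library has to provide separate implementations (of the downcasting methods, for example) for each variant of the trait object: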
impl Error + 'static
impl Error + Send + 'static
....
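To make the missing bounds concrete, here is a minimal sketch using only the standard library. ConfigMissing, PortableError, error_from_worker and is_config_missing are hypothetical names invented for this illustration. The point is that the caller has to spell out Send + Sync + 'static on the trait object before the error can cross a thread boundary or be downcast.

```rust
use std::error::Error;
use std::fmt;
use std::thread;

// Hypothetical error type, used only for this illustration.
#[derive(Debug)]
struct ConfigMissing;

impl fmt::Display for ConfigMissing {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "configuration file is missing")
    }
}

impl Error for ConfigMissing {}

// Spelling out the bounds that `Error` itself does not require.
type PortableError = Box<dyn Error + Send + Sync + 'static>;

/// Produces an error on a worker thread and hands it back to the caller.
/// With a plain `Box<dyn Error>` this would not compile, because the
/// trait object is not known to be `Send`.
fn error_from_worker() -> PortableError {
    thread::spawn(|| -> PortableError { Box::new(ConfigMissing) })
        .join()
        .expect("worker thread panicked")
}

/// Downcasting is only offered on the `'static` trait-object variants.
fn is_config_missing(err: &(dyn Error + Send + Sync + 'static)) -> bool {
    err.downcast_ref::<ConfigMissing>().is_some()
}
```

Calling is_config_missing(&*error_from_worker()) should report true, while any other error type reports false.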
The failure::Fail trait
Here is the Fail trait from the failure crate:
pub trait Fail: Display + Debug + Send + Sync + 'static {
    fn cause(&self) -> Option<&dyn Fail> { ... }
    fn backtrace(&self) -> Option<&Backtrace> { ... }
    fn context<D>(self, context: D) -> Context<D>
    where
        D: Display + Send + Sync + 'static,
        Self: Sized,
    { ... }
    fn compat(self) -> Compat<Self>
    where
        Self: Sized,
    { ... }
}
The differences are:
- Send + Sync bound so the error can be moved/referenced between threads,
- 'static bound so the error can be downcast,
- The context method, which boxes this Fail and creates another wrapping it,
- The backtrace method, which returns the backtrace carried by the failure, if it has one, and
- The compat method, for backwards compatibility with Error.
This means that Fail is perfect for unexpected errors: we can handle them anywhere, we can pass them around as Fails, and then try to downcast them to specific errors if we want, similar to:
try {
    // ...
} catch (IoException e) {
    // ...
} catch (OtherException e) {
    // ...
} catch (Exception e) {
    // ...
}
where we can behave differently depending on the type of the exception.
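In Rust the same shape can be sketched as a chain of downcasts. The example below uses std's dyn Error + 'static as a stand-in for dyn Fail so that it needs no external crate; the function name describe and the messages are made up for illustration.

```rust
use std::error::Error;
use std::io;
use std::num::ParseIntError;

/// Chooses a reaction based on the concrete type behind the trait object,
/// falling back to a catch-all branch, much like a chain of `catch` clauses.
fn describe(err: &(dyn Error + 'static)) -> String {
    if let Some(e) = err.downcast_ref::<io::Error>() {
        // Plays the role of `catch (IoException e)`.
        format!("I/O problem ({:?})", e.kind())
    } else if let Some(e) = err.downcast_ref::<ParseIntError>() {
        // Plays the role of `catch (OtherException e)`.
        format!("bad number: {}", e)
    } else {
        // Plays the role of the final `catch (Exception e)`.
        format!("something else: {}", err)
    }
}
```

Unlike a catch chain, nothing here unwinds the stack: the error is an ordinary value, and the caller decides how much of its type to inspect.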
This comes with a cost: trait objects have a performance and space cost, and so are not ideal for fast-path expected error recovery. I suspect this is why the Error trait was designed the way it was. However, for fast-path recovery we don’t need an Error trait! The error is being handled close to the code that causes it, and the author can just use their own types to handle it. We need an error trait when the handling happens away from the source and we want a choice about how much of the error's information we deal with.
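As a rough sketch of the fast-path case, the error below is a plain enum that implements no error trait at all. DigitError, parse_digit and digit_or_zero are hypothetical names chosen for this example.

```rust
/// Hypothetical fast-path error: a small `Copy` enum with no error trait.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DigitError {
    Empty,
    NotADigit(char),
}

/// Parses the first character of `input` as a decimal digit.
/// Failure is an expected, cheap outcome here.
fn parse_digit(input: &str) -> Result<u32, DigitError> {
    let c = input.chars().next().ok_or(DigitError::Empty)?;
    c.to_digit(10).ok_or(DigitError::NotADigit(c))
}

/// The caller sits right next to the source of the error and recovers
/// immediately by matching on the concrete type; no boxing is involved.
fn digit_or_zero(input: &str) -> u32 {
    match parse_digit(input) {
        Ok(d) => d,
        Err(DigitError::Empty) | Err(DigitError::NotADigit(_)) => 0,
    }
}
```

Because the error type is known statically, the compiler can keep it on the stack and the match needs no dynamic dispatch.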
Note that on the happy path a Box<dyn Fail> can be made to fit in a single pointer (failure::Error), so it’s fine for hot code.
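The size claim can be checked with std alone. In this sketch dyn Error stands in for dyn Fail, and the double Box is only the simplest way to demonstrate a single-word handle; it is an assumption of the example, not a description of how failure::Error is implemented. Fat, Thin, fat_words and thin_words are names invented here.

```rust
use std::error::Error;
use std::mem::size_of;

// A boxed trait object is a "fat" pointer: data pointer plus vtable pointer.
type Fat = Box<dyn Error + Send + Sync>;

// Boxing it once more gives a "thin", single-word handle. The extra
// allocation and indirection are only paid when an error actually occurs.
type Thin = Box<Fat>;

/// Size of the fat handle, measured in machine words.
fn fat_words() -> usize {
    size_of::<Fat>() / size_of::<usize>()
}

/// Size of the thin handle, measured in machine words.
fn thin_words() -> usize {
    size_of::<Thin>() / size_of::<usize>()
}
```

On the success path nothing is allocated, and Option<Thin> is itself one word wide, since None can reuse the null pointer value.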
Add Fail to standard library
Therefore I propose:
- Add a Fail trait (bikeshedded if necessary) that exists for unexpected errors, a.k.a. failures. Explain the concept of unexpected errors in all the documentation and the book, and explain the history of the shortcomings of the Error trait trying to be all things to all people. It doesn’t need to have the backtrace functionality, which is orthogonal, but backtraces are very useful in debugging.
- Add the context infrastructure, and explain how it can be used to add extra contextual information to errors. Explain how you can walk the error chain to get the original error, and downcast it to a specific type if you need to.
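Walking the chain and downcasting the original error might look roughly like this. The sketch uses std's Error::source as a stand-in for Fail's cause method, and WithContext, with_context and root_cause are hypothetical names, not the failure crate's API.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical wrapper that attaches a message to an underlying error.
#[derive(Debug)]
struct WithContext {
    message: String,
    source: Box<dyn Error + Send + Sync + 'static>,
}

impl fmt::Display for WithContext {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.message)
    }
}

impl Error for WithContext {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        // Expose the wrapped error as the next link in the chain.
        let inner: &(dyn Error + 'static) = &*self.source;
        Some(inner)
    }
}

/// Wraps `source` with an extra line of context.
fn with_context<E>(source: E, message: &str) -> WithContext
where
    E: Error + Send + Sync + 'static,
{
    WithContext {
        message: message.to_string(),
        source: Box::new(source),
    }
}

/// Walks the chain of sources and returns the innermost error.
fn root_cause<'a>(err: &'a (dyn Error + 'static)) -> &'a (dyn Error + 'static) {
    let mut current = err;
    while let Some(next) = current.source() {
        current = next;
    }
    current
}
```

A caller can wrap an io::Error twice with with_context, pass the result around as a trait object, and later recover the original with root_cause(&err).downcast_ref::<std::io::Error>().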
And that’s it; everything else can be iterated on in crates. Deprecate the Error trait, explaining the two use cases: Fail handles one, and the other (fast-path) doesn’t need an error trait at all. Say that you can always implement Fail for any error, including fast-path errors, if it helps; merely implementing the trait doesn’t affect performance. Encourage the use of Box<dyn Fail> for unexpected errors if you don’t want to expose the exact type of the error across crate boundaries.
A key argument is that, as well as being backwards compatible, it makes sense to use a name other than Error, since we are specifically dealing with unexpected errors, where performance is not a concern. Error sounds like a name for all errors.
What do people think? I don’t claim any of these ideas are original, but I thought it might be useful to have an article arguing the case. If anyone has prior discussion I will add links to it.