Add `thiserror::Error` proc macro to `std::prelude`

Summary

Add a built-in #[derive(Error)] macro to std::error, inspired by thiserror, to reduce boilerplate for custom error types.

Motivation

Most Rust projects need custom errors, but implementing std::error::Error, Display and From is verbose. thiserror solves this elegantly, but adds a dependency and cannot be used in all environments.

Example

#[derive(Error, Debug)]
enum MyError {
    #[error("not found")]
    NotFound,
    
    #[error("IO error")]
    Io(#[from] std::io::Error),
}
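For readers who have not used thiserror, here is a rough hand-written sketch of the boilerplate the derive in the example above stands in for. It is not the actual expansion of thiserror or of any proposed std macro, and `read_config` is a hypothetical helper added only to show `?` picking up the `From` impl.

```rust
use std::error::Error;
use std::fmt;
use std::io;

#[derive(Debug)]
enum MyError {
    NotFound,
    Io(io::Error),
}

// Roughly what `#[error("...")]` stands for: one message per variant.
impl fmt::Display for MyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MyError::NotFound => f.write_str("not found"),
            MyError::Io(_) => f.write_str("IO error"),
        }
    }
}

// In thiserror, `#[from]` also marks the field as the error's source.
impl Error for MyError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            MyError::NotFound => None,
            MyError::Io(err) => Some(err),
        }
    }
}

// Roughly what `#[from]` stands for, so that `?` converts automatically.
impl From<io::Error> for MyError {
    fn from(err: io::Error) -> Self {
        MyError::Io(err)
    }
}

// Hypothetical caller: `?` turns the `io::Error` into `MyError::Io`.
fn read_config(path: &str) -> Result<String, MyError> {
    let text = std::fs::read_to_string(path)?;
    if text.is_empty() {
        return Err(MyError::NotFound);
    }
    Ok(text)
}
```

That is three trait impls for a two-variant enum, which is the verbosity the proposal is aimed at.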

Possible Approaches

Using smaller chunks as @epage said would be great. For example starting with the following derives:

  • Display
  • From

It would also be very helpful if the API guidelines for errors would be clarified like in this reply

It might be worthwhile to consider what are the smaller chunks that are more modular, e.g. derives for

That said, I (as an outsider to std decisions) would be hesitant to move forward with a derive(Error) as I feel like crates like thiserror make bad error design too easy, negatively affecting the ecosystem. Enshrining that further by having it built-in would make the situation worse.

  • Error enums expose implementation details, making it harder to evolve an API, both for what fields get put in a variant and when wrapping other errors
  • Error chaining is a local maximum we've hit for adding context to errors, and I'd like to see us explore alternative methods more than we have.

Yeah, IMO a good derive(Display) by itself would knock off a lot of the boilerplate from making error types, while also being useful elsewhere. I think we should start there, and it definitely shouldn't be implicitly contained in a derive(Error). Unless you're wrapping other error types, this would save almost all of the boilerplate, so you would only need an impl core::error::Error for MyError {} line and then it's implemented.


displaydoc is some precedent for a derive(Display)-only macro. (It makes the particular choice of reusing doc-comments as format strings, but you can also use a separate attribute.)

I don't feel like the ecosystem has yet found the optimal way to do "library style errors". They don't feel ergonomic.

This is in contrast to anyhow/color-eyre, which feel very ergonomic to use but are only suitable if you want type-erased errors. (The difference between anyhow and eyre is basically "how much additional metadata do you want to be able to attach" and colours.)

There are some other library-style error libraries, such as snafu, error-stack, and a few others whose names I can't remember right now. I haven't (yet) tried these myself, but they offer different takes on error handling.

That said, some subsets of the problem space could definitely be moved into std, such as a general way to derive Display for enum variants. That is a part thiserror seems to handle pretty optimally to me.


+1 to the pushback, for the same reasons. Strum is another way of deriving Display, and it is mostly compatible with the thiserror syntax.

A std::error::Error derive macro should probably just hook Error::source() up to a source field / single-field enum variant, and perhaps Error::description() up to Display::to_string(), but otherwise not do the From parsing that thiserror does. Realistically, that does seem like it could be useful enough though. E.g. something like the following (hypothetically, if Display and From derives were available...)

#[derive(Debug, Display, Error, From)]
enum MyError {
    #[display("not found")]
    NotFound,

    #[display("IO error: {0}")]
    Io(#[from] std::io::Error),
}

I would also like to eventually have better support in the standard library for errors, but I'm also worried about getting stuck with a local maximum. I feel that a first step would be to clarify the API guidelines for errors, as there's still so much fragmentation across the ecosystem.

The few issues that I encounter regularly are:

  1. Formatting error chains. Despite source being there for a very long time, the API guideline PR to recommend using source instead of embedding the message of the cause has been stuck for years, and we still get tons of new libs even today repeating the mistake of including the source message (e.g. I like @joshka's proposition, but even their message makes the mistake). I assume that it stems from the fact that there's still nothing in the standard lib to print an error chain. I use a helper lib, but it's still a bit surprising how hard it is to display an error chain.
  2. Poor guidance on how to bubble up errors. Using thiserror, the simplest way to handle it is through Error enums, but @epage is 100% right that it leaks impl details. Another popular style is to have an Error struct, with a Kind enum for the possible causes (std::io::Error is the most famous case). This is nice, but still a bit cumbersome to implement. There's also no guideline to help libs pick the best style.
  3. Despite the saying that "errors are regular values in Rust", they are still second class citizens in practice. Most of them are lacking base traits such as Eq, Clone, serde::Serialize which makes it very inconvenient to do anything else than just printing them.
  4. Per-lib or per-module errors are common but make error recovery very hard to implement since the type system becomes useless: all the error causes apply to all the functions, there's no granularity.
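To make the "Error struct with a Kind enum" style from point 2 concrete, here is a minimal sketch modelled loosely on how std::io::Error is used. The names `LibError` and `ErrorKind` and both variants are made up for illustration. Note that Display prints only this layer's message and leaves the cause to source(), as point 1 recommends.

```rust
use std::error::Error as StdError;
use std::fmt;

/// Public classification of failures (hypothetical variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[non_exhaustive]
pub enum ErrorKind {
    NotFound,
    InvalidInput,
}

/// Opaque error: callers inspect `kind()`, never the internals.
#[derive(Debug)]
pub struct LibError {
    kind: ErrorKind,
    source: Option<Box<dyn StdError + Send + Sync + 'static>>,
}

impl LibError {
    pub fn new(kind: ErrorKind) -> Self {
        LibError { kind, source: None }
    }

    pub fn with_source<E>(kind: ErrorKind, source: E) -> Self
    where
        E: Into<Box<dyn StdError + Send + Sync + 'static>>,
    {
        LibError { kind, source: Some(source.into()) }
    }

    pub fn kind(&self) -> ErrorKind {
        self.kind
    }
}

impl fmt::Display for LibError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Only this layer's message; the cause stays behind `source()`.
        match self.kind {
            ErrorKind::NotFound => f.write_str("entity not found"),
            ErrorKind::InvalidInput => f.write_str("invalid input"),
        }
    }
}

impl StdError for LibError {
    fn source(&self) -> Option<&(dyn StdError + 'static)> {
        self.source
            .as_deref()
            .map(|e| e as &(dyn StdError + 'static))
    }
}
```

The wrapped error type never appears in the public signature, so it can change without breaking callers. The cost is the extra hand-written code, which is the "cumbersome" part mentioned above.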

I would appreciate some improvements to the API guidelines for errors first, before enshrining a style in the std lib. I agree however that help to derive Display would be a good start. As another data point, derive_more is also a lib supporting Display derives. Regarding displaydoc, I feel that it's a cool trick and we used it a lot at work, but in practice it turned out to be an anti-pattern to conflate formatting with documentation (the doc is bad, with formatting elements mixed in).


The design of errors as enums has a flaw – if the Display of a variant doesn't include the source error it's wrapping, then the message alone is vague and often unhelpful. But if it quotes the source error, then printing of the whole error chain repeats the same messages and grows exponentially.

std hasn't stabilised helpers for printing chains of errors, so the ecosystem is undecided whether to display just one message, or the whole chain. Display/Formatter, or Error trait may need changes.


This is only an issue when reporting errors with eprintln!("{err}") or another naive solution that relies on .to_string() without handling .source(). The problem is that there's a complexity wall if you only use std. Even just having a Display adapter similar to Path::display would be a huge improvement to reduce the boilerplate. A formatting attribute would also help; for example, anyhow::Error uses the alternate selector {:#} to control whether it formats just the error or the whole chain.
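A Display adapter along those lines can be sketched with stable std alone. `DisplayChain` is a hypothetical name, and `Layer` is a toy error type that exists only to exercise it. The choice that `{}` prints one message and `{:#}` prints the chain mirrors the anyhow convention mentioned above and is an assumption, not an established std API.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical adapter in the spirit of `Path::display`.
pub struct DisplayChain<'a>(pub &'a (dyn Error + 'static));

impl fmt::Display for DisplayChain<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // `{}` prints only the outermost message.
        write!(f, "{}", self.0)?;
        // `{:#}` additionally walks `source()` and appends each cause.
        if f.alternate() {
            let mut cause = self.0.source();
            while let Some(err) = cause {
                write!(f, ": {}", err)?;
                cause = err.source();
            }
        }
        Ok(())
    }
}

/// Toy error whose Display does NOT embed its source.
#[derive(Debug)]
pub struct Layer {
    pub msg: &'static str,
    pub source: Option<Box<Layer>>,
}

impl fmt::Display for Layer {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.msg)
    }
}

impl Error for Layer {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        self.source.as_deref().map(|e| e as &(dyn Error + 'static))
    }
}
```

With a `Layer` "failed to read config" wrapping a `Layer` "permission denied", `format!("{}", DisplayChain(&err))` gives just the outer message and `format!("{:#}", DisplayChain(&err))` gives both, joined by ": ". This only reads well if every error in the chain keeps its source out of its own Display.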

nit on error message growth

if the source is included, it grows quadratically, which is less bad than exponentially but still bad
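A small sketch of where the quadratic growth comes from, using a toy `Embedding` error (hypothetical) whose Display quotes its source. Printing every level of a chain of depth n then prints n + (n - 1) + ... + 1 = n(n + 1)/2 messages in total.

```rust
use std::error::Error;
use std::fmt;

/// Toy error that makes the mistake under discussion.
#[derive(Debug)]
struct Embedding {
    msg: &'static str,
    source: Option<Box<Embedding>>,
}

impl fmt::Display for Embedding {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.msg)?;
        // Quoting the source here is what causes the repetition.
        if let Some(src) = &self.source {
            write!(f, ": {}", src)?;
        }
        Ok(())
    }
}

impl Error for Embedding {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        self.source.as_deref().map(|e| e as &(dyn Error + 'static))
    }
}

/// Builds a chain of `depth` errors (depth >= 1), each wrapping the previous.
fn nested(depth: usize) -> Embedding {
    let mut err = Embedding { msg: "e", source: None };
    for _ in 1..depth {
        err = Embedding { msg: "e", source: Some(Box::new(err)) };
    }
    err
}

/// Counts how many messages appear when every level of the chain is printed.
fn messages_printed(err: &Embedding) -> usize {
    let mut total = 0;
    let mut current: Option<&(dyn Error + 'static)> = Some(err);
    while let Some(e) = current {
        // Each message is the single character "e", so count occurrences.
        total += e.to_string().matches('e').count();
        current = e.source();
    }
    total
}
```

For a chain of depth 4 this counts 4 + 3 + 2 + 1 = 10 messages, where a chain that keeps sources out of Display would print 4.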


Doing anything other than printing/logging an error and then exiting/cancelling an operation is a pain.

For example when writing a file integrity checker (which compares files on the file system against installed files from your Linux distro package manager): I wanted to collect all the errors encountered when processing lots of files on lots of threads (using rayon) and then report all normal and error results together. I wanted to include the error chains where appropriate and be able to add ad-hoc context (like with anyhow).

What do I mean by "where appropriate"? I wanted nice simplified error messages for the "expected" errors like file not found or permission denied. But if I got something really odd that was more likely to be an error in the tool: include all the details and the full chain. So something that does type erasure (anyhow etc.) isn't a good option.

This sort of scenario is a pain to deal with currently, as it doesn't fit into either anyhow or thiserror. Plus, transferring error types between threads becomes an issue when they are type erased. In the end I formatted the errors in worker threads and treated them as normal values, forgoing the Result machinery entirely.


The use of {:#} for printing with the chain is pretty clever, but sadly not universal.

The std APIs for getting the chain of sources have run into awkward issues with lifetimes and dyn Error support:

Most of the time when I have a std::io::Error, I want to include the file name.


Note that some io::Error-using APIs don't have filenames associated with them at all. For example, std::process::Command::spawn returns it. Just the executable filename is kind of useless in a lot of failure modes. I feel like you want to provide why the path/command/etc. matters at all for its error to make sense. Sure, /foo/bar/frobnitz failed to read, but if you tell me that the cache file /foo/bar/frobnitz failed to read, I suddenly care a lot less about its read failure in the grand scheme of things (beyond metrics collection).

Exactly! Since io::Error does not tell me the filename, I want to add it to the wrapping Error.
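A minimal sketch of that wrapping, assuming a hypothetical `ReadError` type and `read_cache` function. It records the path and also why the path matters (the "cache file" from the example above), and it leaves the io::Error reachable through source() rather than quoting it in the message.

```rust
use std::error::Error;
use std::fmt;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical wrapper adding the context `io::Error` lacks.
#[derive(Debug)]
pub struct ReadError {
    path: PathBuf,
    purpose: &'static str,
    source: io::Error,
}

impl fmt::Display for ReadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Says what was being read and why, but not the underlying cause.
        write!(f, "failed to read {} {}", self.purpose, self.path.display())
    }
}

impl Error for ReadError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

/// Hypothetical caller-side helper that attaches the context.
pub fn read_cache(path: &Path) -> Result<String, ReadError> {
    fs::read_to_string(path).map_err(|source| ReadError {
        path: path.to_path_buf(),
        purpose: "cache file",
        source,
    })
}
```

A derive cannot generate the `map_err` step, since the path only exists at the call site. That is one reason `#[from]` conversions tend to lose this kind of context.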

I’d like to strongly second points 2 and 4. thiserror also has the secondary issue of encouraging extremely bloated error sizes, which then lead to extremely bloated Result sizes returned by all the functions in a library. At the very least, any interface included in the standard library should make intermediate (or final) Boxing or Arcing ergonomic and convenient. (Tolnay has decided that he is not interested in supporting something like this; more details in dtolnay/thiserror issue #415, "`#[boxfrom]` attribute (or similar) to allow easy use of boxing with thiserror and `?`".)
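To illustrate the size concern, here is a sketch with a made-up 128-byte payload. `Details`, `FatError`, `SlimError`, `check` and `run` are all hypothetical, and the hand-written `From` impl is only my guess at what a boxing attribute would automate. Exact sizes depend on the target and on rustc's layout decisions, so the code compares them instead of hard-coding numbers.

```rust
use std::mem::size_of;

/// Hypothetical large payload standing in for a context-heavy variant.
#[derive(Debug)]
pub struct Details {
    pub context: [u8; 128],
    pub code: u32,
}

/// Stores the payload inline: every `Result<_, FatError>` pays for it.
#[derive(Debug)]
pub enum FatError {
    Parse(Details),
    Other,
}

/// Stores the payload behind a pointer: the error stays pointer-sized or close.
#[derive(Debug)]
pub enum SlimError {
    Parse(Box<Details>),
    Other,
}

// Hand-written boxing conversion so that `?` still works.
impl From<Details> for SlimError {
    fn from(details: Details) -> Self {
        SlimError::Parse(Box::new(details))
    }
}

fn check(input: &[u8]) -> Result<(), Details> {
    if input.is_empty() {
        return Err(Details { context: [0; 128], code: 1 });
    }
    Ok(())
}

pub fn run(input: &[u8]) -> Result<(), SlimError> {
    check(input)?; // boxes on the error path only
    Ok(())
}

/// Returns the sizes of the two `Result` types for comparison.
pub fn sizes() -> (usize, usize) {
    (
        size_of::<Result<(), FatError>>(),
        size_of::<Result<(), SlimError>>(),
    )
}
```

The allocation only happens when an error is actually produced, while the smaller Result is returned on every call.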

As an alternative, I’d like to point to Zig’s error design, which includes both error traces and easy-to-do flat unions such that you can construct per-function error types easily. The flat unions also ensure that error types never exceed either 2 or 4 bytes (can’t remember what the limit was).

This has obvious disadvantages: sometimes flat lists of errors erase meaningful grouping information (like “network error” is a meaningful category of errors that could be wrapped in a top-level variant) and the lack of payloads has obvious downsides which are only partially compensated for by error traces.

But it’s nonetheless worth considering if flat unions and/or per-function error types could be supported more nicely somehow.


AFAIK Zig's error design depends on the fact that it compiles all code in a single compiler invocation to know all possible error variants and allocate disjoint indices for them. This is fundamentally incompatible with rustc compiling one crate at a time.


Possibly; there’s a number of reasons it doesn’t make sense to do exactly the Zig thing. This seems like it should be resolvable at the same time as monomorphization, maybe?

Even if not, needing to do some arithmetic on errors when converting from one crate’s errors to a superset defined in another crate doesn’t seem like that big a deal (while that wouldn’t work with Zig’s concept of anyerror and such, I don’t think Rust would ever have that anyway).

I was not calling for exactly what Zig does, just pointing to it as another way to do errors which seems pretty appealing.


Monomorphization is done separately for every crate, not as a single step at the end. That could be changed, but it would often have a net negative effect on build performance (because then dependencies’ code has to be recompiled if it merely uses generics even if it is publicly non-generic).

Ability to flatten enums' discriminants across multiple enums would be beneficial even for Rust-style error types, because enum EverythingError { Network(NetworkError), Other(OtherError) } could be smaller if the discriminants of NetworkError and OtherError could be guaranteed to never overlap.

Linkers already do the non-overlapping thing for code locations! Having globally unique addresses of functions and unique enum discriminants are basically the same problem, except that linkers understand how to coordinate functions, but not enums.


Oh this makes sense. Nonetheless, it seems like a useful concept to consider even if it would be infeasible to implement directly in Rust.

(Though this feels like it should be a reasonably-solvable problem; we don’t need or necessarily want literal subtyping, we just need a From implementation, and the From implementation doing a bit of arithmetic would also be reasonable.)
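As a rough sketch of what "a From implementation doing a bit of arithmetic" could look like today, written by hand: each sub-error gets a block of codes in a flat one-byte superset. All type names, variants and the base offsets are hypothetical. Within a single crate the offsets can be plain constants, so this does not address the cross-crate coordination problem raised above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum NetworkError {
    Timeout = 0,
    Refused = 1,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum OtherError {
    Parse = 0,
    Overflow = 1,
}

/// Flat superset: a single code, with no nested discriminants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct EverythingError(u8);

// Each source enum owns a disjoint range of codes.
const NETWORK_BASE: u8 = 0;
const NETWORK_COUNT: u8 = 2;
const OTHER_BASE: u8 = NETWORK_BASE + NETWORK_COUNT;

impl From<NetworkError> for EverythingError {
    fn from(e: NetworkError) -> Self {
        EverythingError(NETWORK_BASE + e as u8)
    }
}

impl From<OtherError> for EverythingError {
    fn from(e: OtherError) -> Self {
        EverythingError(OTHER_BASE + e as u8)
    }
}

impl EverythingError {
    pub fn code(self) -> u8 {
        self.0
    }

    /// Grouping survives as a range check instead of an outer variant.
    pub fn is_network(self) -> bool {
        (NETWORK_BASE..OTHER_BASE).contains(&self.0)
    }
}
```

This keeps the error at one byte, but as noted earlier in the thread it gives up payloads, and the grouping has to be recovered by range checks.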