Unified Errors, a non-proliferation treaty, and extensible types

m4b · June 11, 2017, 6:55am

comex:

First of all, if the compiler aggregates all possible error types into one giant enum, that means every error has the size of the largest error type anywhere in the program. As long as there's some error somewhere that contains a lot of fields or a large field (e.g. an array), a type like Result<u32, MyError>, which today might be two or three words, will take up probably hundreds of bytes. With a naive implementation, that entire region of memory will be copied whenever a value of that type is moved or even returned, even if most of it was uninitialized to start with because there isn't a big error stored in it. You might be able to optimize it by storing the used size and only copying that portion of the struct, but that logic would have its own overhead, and in any case there's still the issue of wasting a ton of stack space.

This is already a real and serious issue for current errors, so I'm not sure what the point is here. E.g., the panopticon error enum is currently 72 bytes; if some upstream crate adds an error with an array or a huge struct like you suggest and it gets added to the foreign links, it will extend the size...

I assume you'd just do what error-chain, or any other type that needs to be recursive, does: use a Box or pointer. But perhaps your point is more subtle here; I'm not sure.
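To make the size argument concrete, here is a minimal sketch. The `WideError` and `BoxedError` types are made up for illustration and are not taken from panopticon or error-chain. It shows that an enum is at least as large as its largest variant, and that moving the large payload behind a `Box` brings the `Result` back down to a few words. Exact sizes depend on the target and on layout decisions the compiler is free to make, so the sketch only returns them rather than hardcoding numbers.

```rust
use std::mem::size_of;

/// Hypothetical error with one large inline payload.
#[allow(dead_code)]
pub enum WideError {
    Small(u8),
    Big([u8; 256]),
}

/// The same error with the large payload moved behind a pointer.
#[allow(dead_code)]
pub enum BoxedError {
    Small(u8),
    Big(Box<[u8; 256]>),
}

/// Returns (inline size, boxed size) of a `Result<u32, _>` in bytes.
pub fn result_sizes() -> (usize, usize) {
    (
        size_of::<Result<u32, WideError>>(),
        size_of::<Result<u32, BoxedError>>(),
    )
}
```

The trade-off is the one already mentioned in the thread: the boxed form pays for a heap allocation when the large variant is actually constructed.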

The array idea is interesting, but as you suggest, there are definitely some problems.

Finally, I agree Box<Error> solves some problems, but it introduces others; and, as you mention, you can't currently pattern match on it.
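For readers unfamiliar with the pattern-matching limitation, a rough sketch of what handling a boxed error looks like in practice is below. `BadOffset` and `describe` are hypothetical names for this example only. Instead of a `match` over variants, each candidate type is tried with `downcast_ref`, and a fallback branch is unavoidable.

```rust
use std::error::Error;
use std::fmt;
use std::io;

/// Hypothetical application error, used only for this sketch.
#[derive(Debug)]
pub struct BadOffset(pub usize);

impl fmt::Display for BadOffset {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "bad offset {}", self.0)
    }
}

impl Error for BadOffset {}

/// Classifies a type-erased error by trying each known concrete type in turn.
pub fn describe(err: &(dyn Error + 'static)) -> String {
    if let Some(e) = err.downcast_ref::<io::Error>() {
        format!("io error of kind {:?}", e.kind())
    } else if let Some(BadOffset(n)) = err.downcast_ref::<BadOffset>() {
        format!("offset {} out of range", n)
    } else {
        // This fallback cannot be removed: the compiler has no list of
        // the types that might be hiding behind the trait object.
        format!("unknown error: {}", err)
    }
}
```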

I just don't agree; I thought I presented several reasons for why this is the case, and I don't think it's a matter of beginners not learning their options efficiently, as I tried to explain, 1-3 still all apply to beginners and non-beginners alike...

This is very interesting. So when you mentioned dynamic linking previously, I started thinking about how one might implement an open type like this for dynamic libraries and the problem becomes tractable when you are given a unique ID for each library that uses/extends an open type.

In Linux dynamic linking, this is actually the DTPMOD64 relocation on x86_64 (which is, incidentally, used by __tls_get_addr).

So theoretically, what's interesting is that if you have your unique ID and other libraries' IDs taking part in the open type, I believe you can generate a unique discriminant without possibility of collision.
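As a rough illustration of that idea, and assuming only that each library really does receive a distinct 32-bit ID, one possible encoding packs the library ID and a per-library variant index into a single 64-bit discriminant. The function names and the bit widths are assumptions of this sketch, not something the loader or the compiler provides.

```rust
/// Packs a per-library ID and a per-library variant index into one
/// 64-bit discriminant. Both inputs are assumptions of this sketch: the
/// library ID stands in for whatever unique value the loader hands out.
pub fn open_discriminant(library_id: u32, local_variant: u32) -> u64 {
    (u64::from(library_id) << 32) | u64::from(local_variant)
}

/// Splits a discriminant back into (library ID, local variant index).
pub fn split_discriminant(d: u64) -> (u32, u32) {
    ((d >> 32) as u32, d as u32)
}
```

Because the two halves never overlap, two discriminants can only be equal if both the library ID and the local index are equal, which is the no-collision property described above.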

But cryptographic hashes are also an interesting approach to this problem; all interesting stuff.

Ixrec · June 12, 2017, 6:15pm

Alternative crazy idea I had:

pub fn run() -> Result<(), impl Error> {
  env_logger::init()?;
  let bytes = eat_bytes()?;
  info!("wow, bytes: {:?}", bytes);
  Ok(())
}

Where “impl Error” indicates a single concrete anonymous enum type with io::Error and SetLoggerError variants, which the compiler generates for you at compile time. It’s only as big as it has to be to accommodate the error types returned by run() (rather than every error type imported anywhere in the program), it has a statically known size, and it can be transparently replaced with a custom error type later on.
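For comparison, this is roughly what such an enum looks like when written by hand today. It is only a sketch: `RunError` is a made-up name, `ParseIntError` stands in for `SetLoggerError` so that the example needs nothing outside std, and `eat_bytes` is a stub that always fails.

```rust
use std::fmt;
use std::io;
use std::num::ParseIntError;

/// What the compiler-generated type might look like if written by hand.
#[derive(Debug)]
pub enum RunError {
    Io(io::Error),
    Parse(ParseIntError),
}

impl fmt::Display for RunError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            RunError::Io(e) => write!(f, "io: {}", e),
            RunError::Parse(e) => write!(f, "parse: {}", e),
        }
    }
}

impl std::error::Error for RunError {}

// The `From` impls are what let `?` convert each source error.
impl From<io::Error> for RunError {
    fn from(e: io::Error) -> Self {
        RunError::Io(e)
    }
}

impl From<ParseIntError> for RunError {
    fn from(e: ParseIntError) -> Self {
        RunError::Parse(e)
    }
}

/// Hypothetical stand-in for `eat_bytes`; always fails.
fn eat_bytes() -> Result<Vec<u8>, io::Error> {
    Err(io::Error::new(io::ErrorKind::UnexpectedEof, "no bytes"))
}

pub fn run(input: &str) -> Result<usize, RunError> {
    let n: usize = input.trim().parse()?; // ParseIntError -> RunError
    let bytes = eat_bytes()?; // io::Error -> RunError
    Ok(n + bytes.len())
}
```

The enum is only as large as its largest variant, which matches the size property claimed for the strawman.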

m4b · June 12, 2017, 6:38pm

That’s actually a really cool idea! I like the parallels with the hopefully forthcoming impl Trait

I believe it would be backwards compatible with current errors as well, yes?

Ixrec · June 12, 2017, 7:48pm

I’m not quite sure what you mean by “backwards compatible with current errors”, but Result<(), impl Error> is already a valid return type with the impl Trait feature flag, and I believe all “error types” are supposed to implement the Error trait today (io::Error and SetLoggerError both appear to), so I assume switching to this strawman feature would “just work” for any code that either a) uses ? and a custom error type today or b) uses Result<(), impl Error> and explicit error values today. That’s assuming callers didn’t find a way to depend on your concrete error type (which is a standard caveat for all impl Trait uses).

notriddle · June 12, 2017, 8:15pm

Sure, but impl Error is not going to generate a type that can wrap both an io::Error and a diesel::Error.

m4b · June 12, 2017, 9:12pm

Ah, I just meant that the open type version wouldn’t be backwards compatible, in that the std lib would have to add all its errors to the open Exn type, or some compiler magic would have to be involved otherwise. I guess this is more about forwards compatibility, which I believe it would have, as you stated.

m4b · June 12, 2017, 9:14pm

Ya, as notriddle said, I thought this was the interesting use case being presented with the impl Error idea (having the compiler aggregate the types into an enum for you).

m4b · June 12, 2017, 9:17pm

Oh, and all error types def don’t have to impl the Error trait, as that’s in std, and core errors won’t ever do that (or at least the impl will be behind a “std” feature flag).

le-jzr · June 12, 2017, 10:25pm

It’s an interesting idea to be sure, but if I understand the current impl Trait correctly, it abstracts away the concrete type, so it would be functionally identical to Box<Error>, just without allocation.
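A small sketch of that opacity, assuming a compiler where return-position impl Trait is stable (`parse_port` and `report` are hypothetical names): the function below returns a plain `ParseIntError` with no allocation, but callers can only use what the `Error` trait exposes.

```rust
use std::error::Error;

/// The concrete error type is `ParseIntError`, but callers only see
/// "some type that implements `Error`".
pub fn parse_port(s: &str) -> Result<u16, impl Error> {
    s.parse::<u16>()
}

/// Callers can use the trait's methods (here, `Display` via `Error`),
/// but cannot name or match on the concrete type behind `impl Error`.
pub fn report(s: &str) -> String {
    match parse_port(s) {
        Ok(port) => format!("port {}", port),
        Err(e) => format!("error: {}", e),
    }
}
```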

m4b · June 15, 2017, 7:23am

So, in the process of trying to make something very, very generic and also remove some offsetting logic out of one trait into another, I came across this problem, and the solution isn’t obvious, but it seems a case where the open error type idea shines very nicely, both ergonomically, and logically:

pub trait Pread<Ctx, E = error::Error, I = usize> : Index<I> + Index<RangeFrom<I>> + MeasureWith<Ctx, Units = I>
 where
       Ctx: Copy + Default + Debug,
       I: Add + Copy + PartialOrd,
{
    fn pread<'a, N: TryFromCtx<'a, Ctx, <Self as Index<RangeFrom<I>>>::Output, Error = E>>(&'a self, offset: I) -> result::Result<N, E> where <Self as Index<RangeFrom<I>>>::Output: 'a {
        self.pread_with(offset, Ctx::default())
    }
    fn pread_with<'a, N: TryFromCtx<'a, Ctx, <Self as Index<RangeFrom<I>>>::Output, Error = E>>(&'a self, offset: I, ctx: Ctx) -> result::Result<N, E> where <Self as Index<RangeFrom<I>>>::Output: 'a {
        let len = self.measure_with(&ctx);
        if offset >= len {
            // problem is here, what do i return? the error is/should be generic
            panic!("offset >= len")
        }
        N::try_from_ctx(&self[offset..], ctx)
    }
}

If errors were an open type, I would be able to return my custom Error enum, e.g. return Err(BadOffset(offset)) where the panic is, while still allowing downstream clients/implementors of TryFromCtx to use their own custom errors. For completeness, the definition of TryFromCtx is:

/// Tries to read `Self` from `This` using the context `Ctx`
pub trait TryFromCtx<'a, Ctx: Copy = DefaultCtx, This: ?Sized = [u8]> where Self: 'a + Sized {
    type Error;
    #[inline]
    fn try_from_ctx(from: &'a This, ctx: Ctx) -> Result<Self, Self::Error>;
}

I might be able to do this using dynamic dispatch, but I’m not really an expert there; it would require signature changes and incur a performance hit that is not acceptable for this crate’s use cases.

Anyway, just came up, and thought of this, so figured I’d post a little pain point.

As I see it, the only reasonable solution seems to be to remove the generic errors and hardcode an explicit error into the signature, which kind of sucks.

glaebhoerl · June 15, 2017, 9:12am

@m4b Sorry if I haven’t been reading closely enough and you already answered this, but: would it be possible to do exhaustive error handling (without a “welp, some other thing happened” wildcard arm) if we used an open error type? Giving that up seems like a pretty significant sacrifice.

phaylon · June 15, 2017, 1:09pm

My usual solution would be to require the generic error to impl From<MySpecificError> and return MySpecificError for my error cases.
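A stripped-down sketch of that approach is below. It deliberately drops the scroll traits and uses a closure in place of `TryFromCtx`; `BadOffset`, `ClientError`, and this simplified `pread_with` are hypothetical names for illustration only.

```rust
/// Hypothetical library-side error for an out-of-range offset.
#[derive(Debug, PartialEq)]
pub struct BadOffset(pub usize);

/// Hypothetical downstream error that can absorb `BadOffset`.
#[derive(Debug, PartialEq)]
pub enum ClientError {
    Offset(usize),
    Malformed,
}

impl From<BadOffset> for ClientError {
    fn from(e: BadOffset) -> Self {
        ClientError::Offset(e.0)
    }
}

/// Stays generic over `E`, yet can still report its own failure
/// because `E` is required to be constructible from `BadOffset`.
pub fn pread_with<T, E, F>(bytes: &[u8], offset: usize, parse: F) -> Result<T, E>
where
    E: From<BadOffset>,
    F: FnOnce(&[u8]) -> Result<T, E>,
{
    if offset >= bytes.len() {
        return Err(E::from(BadOffset(offset)));
    }
    parse(&bytes[offset..])
}
```

The cost is that every downstream error type has to provide the `From<BadOffset>` impl, which is a new requirement on implementors.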

m4b · June 15, 2017, 4:52pm

Perhaps I’m missing something obvious, but I don’t understand. The signature is generic, pread_with returns an error of type E, which is fixed/given by the TryFromCtx impl. I can’t return a reified MyError in the pread_with body unless I change the return type to be fixed to my error

m4b · June 15, 2017, 4:54pm

Yes, an open type would always require a wildcard in pattern matching.

The “impl Error”-esque suggestion, as far as I understand it, would ask the compiler to aggregate the error types from every error return into an enum, so exhaustive pattern matching might be possible there; I don’t know.

m4b · June 15, 2017, 4:58pm

Oh, I think the obvious part I missed was to call E::from(myerror) or into. Thanks for this very interesting suggestion, @phaylon.

Ixrec · June 15, 2017, 7:54pm

The impl Debug idea (I should’ve picked Debug instead of Error the first time, oh well) probably would not allow exhaustive matching either. The whole point of impl Trait is to prevent the calling code from needing to know or being able to rely on whatever the exact concrete type is, without giving up the benefits of returning a single concrete type. The same applies to returning enums of error types: if you allow calling code to exhaustively match on it, then you can’t ever change the set of error types you’re returning without breaking them, and we’re back to the same sort of inflexibility that prompted this thread in the first place. This similarity in backwards compatibility requirements is actually one of the reasons I thought of the impl Trait syntax earlier.

Personally, I’m confused as to what if any concrete goals everyone has in this thread. I originally thought the goal was allowing us to “just add” a line like foobar()?; to one of our functions without having to go modify a potentially unbounded number of function signatures and call sites just to deal with the new error type it’s returning. That’s something I can totally get behind. But if that is the goal, then I’m pretty sure exhaustive matching in the caller is just fundamentally not possible no matter what the proposed solution, so I’m not sure why that’s being pointed out as a negative.

Regarding your Pread example (now that I’ve spent several minutes parsing it…), I see how an “open error type” would be a solution, but it seems like returning a generic PReadWithError<E> enum with the two variants OffsetOutOfRange and E would also be an equally good solution that achieves all the same goals you specified. So tying back to my confusion point: What is it that makes an open error type the best solution in that example? (hopefully focusing on semantics rather than syntax, since we’ve both thrown out strawman syntaxes to minimize the typing burden)
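A minimal sketch of the wrapper-enum alternative, again with a closure standing in for `TryFromCtx` and with the second variant named `Inner` rather than `E` to avoid confusion with the type parameter (both are choices made for this example, not part of the suggestion):

```rust
/// Wrapper error with one variant for the library's own failure and one
/// for whatever the downstream parser returns.
#[derive(Debug, PartialEq)]
pub enum PReadWithError<E> {
    OffsetOutOfRange(usize),
    Inner(E),
}

pub fn pread_with<T, E, F>(
    bytes: &[u8],
    offset: usize,
    parse: F,
) -> Result<T, PReadWithError<E>>
where
    F: FnOnce(&[u8]) -> Result<T, E>,
{
    if offset >= bytes.len() {
        return Err(PReadWithError::OffsetOutOfRange(offset));
    }
    // Downstream errors pass through unchanged inside `Inner`.
    parse(&bytes[offset..]).map_err(PReadWithError::Inner)
}
```

Unlike the `From` bound, this places no requirement on the downstream error type, and callers can still match exhaustively on the two variants; the price is a changed return type.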

m4b · June 15, 2017, 8:34pm

You've made some really great points which I think get at the heart of things here.

Yes, I hadn't thought too much about impl version wrt pattern matching but I think this statement gets at the essence of why I think non exhaustiveness in matching is a feature in this case (or a side effect, depending on how you look at it)

For me, what you state right here is the goal, which I believe is point 2, especially with respect to you mentioning refactoring call sites, etc.

And again, I think your noting that non-exhaustive matching is not a negative is also really crucial here.

As for Pread suggestion, I believe another developer on the API issue in scroll where this is being suggested mentioned that as a possibility; I'd have to see how it interacts and I suspect there will be similar trade offs.

I don't know if an open type is the best solution, but I feel like it addresses 1-3 in a way that is very satisfying both on a semantic as well as ergonomic level (but perhaps as others have justifiably pointed out, not at the implementation level)

le-jzr · June 15, 2017, 9:32pm

It would be hella convenient, that much is hard to disagree with.

Though I'm still a bit apprehensive about how it would fit into the broader crate ecosystem. Backwards compatibility prevents migrating existing errors in std and other high-exposure libraries, and using it for new stuff only would be a glaring API inconsistency.

It's a minor concern all things considered, but having two established ways of working with errors that are so alien to each other, feels uncomfortable.

m4b · June 16, 2017, 3:24am

Could you clarify what you mean here? Wouldn't future versions of, say, std with errors simply declare pub type Exn := io::Error, etc.? Hence old code would still work with io::Error (it didn't go anywhere), and any new libraries/binaries could opt into the open type/whatever by simply changing their signatures, or keep the old behavior. Or similarly, perhaps like a trait, use std::exn::Exn; would allow the error to be a part of the unified open type (but only when it's imported; otherwise it's the old behavior).

bascule · June 16, 2017, 4:37am

For what it's worth, I have a PR in to error-chain to make it no_std compatible (but dependent on libcollections):

I'm not entirely happy with the state of this PR, mostly because it vendors std::error into error-chain for use in no_std environments (std::error is not in core, although it used to be).

Some ideas about how to move std::error to core::error here, although I've been told "scenarios" work will solve this entire general problem (I hope!):

github.com/rust-lang/rust

Comment by arielb1 to Move `::std::error` to `::core::error`

rust-lang:master ← thepowersgang:error-in-libcore

The libcore PR was rejected because it made `Error` fundamental, which was a breaking change. I think that can be bypassed with this hack (a combination of what I'll call the "exotic closure" trait and "predicate" trait):

in libcore:

```Rust
#[doc(hidden)]
#[unstable(feature="libstd_implementation_detail")]
pub trait ErrorOrStrMapper {
    type Output;
    fn from_error<E: Error>(self, e: E) -> Self::Output;
    fn from_str(self, s: &str) -> Self::Output;
}

#[doc(hidden)]
#[unstable(feature="libstd_implementation_detail")]
trait ErrorOrStr {
    fn convert<F: ErrorOrStrMapper>(self, f: F) -> F::Output;
}

impl<E: Error> ErrorOrStr for E {
    fn to_error<F: ErrorOrStrMapper>(self, f: F) -> F::Output {
        f.from_error(self)
    }
}

// ok because libcore knows that `! &'a str : Error`
impl<'a> ErrorOrStr for &'a str {
    fn to_error<F: ErrorOrStrMapper>(self, f: F) -> F::Output {
        f.from_str(self)
    }
}
```

in liballoc (`From<&str> for Box<Error>` must be implemented there, because all the types are available in `liballoc`, so `StringError` *must* use `Box<str>`):

```Rust
struct StringError(Box<str>);
struct BoxErrorMapper;
impl ErrorOrStrMapper for BoxErrorMapper {
    type Output = Box<Error + Send + Sync>;
    fn from_error<E: Error>(self, e: E) -> Self::Output {
        Box::new(e)
    }
    fn from_str(self, s: &str) -> Self::Output {
        // I'm not sure on that `From` - we might want to return an
        // OomError here if we fail to allocate enough memory.
        Box::new(StringError(From::from(s)))
    }
}

impl<E: ErrorOrStr> From<E> for Box<Error + Send + Sync> {
    fn from(err: E) -> Self {
        err.to_error(BoxErrorMapper)
    }
}

impl<E: ErrorOrStr> From<E> for Box<Error> {
    fn from(err: E) -> Self {
        err.to_error(BoxErrorMapper)
    }
}

impl From<Box<str>> for Box<Error + Send + Sync> {
    fn from(s: Box<str>) -> Self {
        Box::new(StringError(s))
    }
}

// ok because String is local, so we know `!String: ErrorOrStr`
impl From<Box<str>> for Box<Error> {
    fn from(s: String) -> Self {
        Box::new(StringError(s))
    }
}
```

in libcollections:

```Rust
// ok because String is local, so we know `!String: ErrorOrStr`
impl From<String> for Box<Error + Send + Sync> {
    fn from(s: String) -> Self {
        Self::from(s.into_boxed_str())
    }
}

// ok because String is local, so we know `!String: ErrorOrStr`
impl From<String> for Box<Error> {
    fn from(s: String) -> Self {
        Self::from(s.into_boxed_str())
    }
}
```

The downside is that the docs will have the `From` impl as `E: ErrorOrStr` rather than separate `E: &str` and `E: Error`.
