Should stdlib error types use Box<str|[u8]> instead of String|Vec<u8>?

In all the Rust code I've read, error types are used as read-only types. So why not optimize for this case, in other words, why not drop the capacity field from the error type?

Take NulError, for example. It is defined as struct NulError(usize, Vec<u8>);

It takes 32 bytes on amd64, but why does it need the capacity field?

pub struct NulError(usize, Box<[u8]>); would contain the same data, but take only 24 bytes.
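A quick way to check these numbers is to measure two stand-in structs with std::mem::size_of. This is only a sketch: NulErrorVec and NulErrorBox are hypothetical names that mirror the two layouts above, not the real std::ffi::NulError, and the byte counts in the comments assume a 64-bit target.

```rust
use std::mem::size_of;

// Hypothetical stand-ins for the two layouts discussed above;
// neither is the real std::ffi::NulError.
#[allow(dead_code)]
struct NulErrorVec(usize, Vec<u8>);
#[allow(dead_code)]
struct NulErrorBox(usize, Box<[u8]>);

// Returns (size with Vec<u8>, size with Box<[u8]>) in bytes.
fn sizes() -> (usize, usize) {
    (size_of::<NulErrorVec>(), size_of::<NulErrorBox>())
}

fn main() {
    let (with_vec, with_box) = sizes();
    // Vec<u8> is pointer + capacity + length, Box<[u8]> is pointer + length,
    // so on a 64-bit target these come out as 32 and 24 bytes.
    println!("usize + Vec<u8>:   {with_vec} bytes");
    println!("usize + Box<[u8]>: {with_box} bytes");
}
```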

Another example is VarError: its size is 24 bytes, but with Box<[u8]> instead of OsString (a Vec<u8> on Unix) it would be only 16 bytes.

For NulError, that Vec<u8> comes from the user input. If you round-trip that through Box<[u8]>, then you might incur an extra copy. And it also needs to support NulError::into_vec.
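To make the trade-off concrete, here is a small sketch of both points. The helper names round_trip and recover_input are made up for illustration. Vec::into_boxed_slice drops any excess capacity, which may mean a reallocation, while NulError::into_vec hands the caller's bytes back as a Vec<u8>.

```rust
use std::ffi::CString;

// Round-trip a Vec<u8> through Box<[u8]>. into_boxed_slice drops excess
// capacity, so the allocator may have to move the data when capacity > len.
fn round_trip(input: Vec<u8>) -> Vec<u8> {
    let boxed: Box<[u8]> = input.into_boxed_slice();
    // Converting back does not copy, but the spare capacity is gone.
    boxed.into_vec()
}

// Recover the original bytes from a failed CString::new call.
// Panics if `input` contains no interior NUL byte.
fn recover_input(input: Vec<u8>) -> (usize, Vec<u8>) {
    let err = CString::new(input).unwrap_err();
    (err.nul_position(), err.into_vec())
}

fn main() {
    let mut input: Vec<u8> = Vec::with_capacity(64);
    input.extend_from_slice(b"ab\0cd");
    println!("before: len = {}, capacity = {}", input.len(), input.capacity());

    let back = round_trip(input);
    println!("after:  len = {}, capacity = {}", back.len(), back.capacity());

    let (pos, bytes) = recover_input(back);
    println!("NUL at index {pos}, recovered {} bytes", bytes.len());
}
```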

As for VarError, I think you mean Box<OsStr> and not Box<[u8]>. And this can only be a historical question, since switching it to a Box<OsStr> would be a breaking change. I'm not sure exactly why OsString was selected here. It does seem like a Box<OsStr> would work here since std::env::var accepts an AsRef<OsStr>. So creating a Box<OsStr> shouldn't involve any additional copies like it might in the CString::new case. But this would be a semver incompatible change at this point, and I doubt the extra 8 bytes really makes much of a difference in practice. (And if it did, it's easy to work around it.)
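For comparison, this is roughly what a Box<OsStr> variant could look like. BoxedVarError and the two helper functions are hypothetical and only for illustration; the real std::env::VarError stores an OsString. The 16-byte figure assumes a 64-bit target and relies on the compiler using the non-null pointer as the enum's niche, which current rustc does.

```rust
use std::env::VarError;
use std::ffi::{OsStr, OsString};
use std::mem::size_of;

// Hypothetical alternative to std::env::VarError; not a real std type.
#[allow(dead_code)]
#[derive(Debug)]
enum BoxedVarError {
    NotPresent,
    NotUnicode(Box<OsStr>),
}

// Build the hypothetical error from an owned OsString.
// into_boxed_os_str drops any excess capacity the OsString had.
fn not_unicode(value: OsString) -> BoxedVarError {
    BoxedVarError::NotUnicode(value.into_boxed_os_str())
}

// Build a Box<OsStr> from anything that is AsRef<OsStr>:
// one allocation of exactly the needed length.
fn boxed_from_ref<K: AsRef<OsStr>>(key: K) -> Box<OsStr> {
    Box::from(key.as_ref())
}

fn main() {
    println!("std VarError:  {} bytes", size_of::<VarError>());
    println!("BoxedVarError: {} bytes", size_of::<BoxedVarError>());
    println!("{:?}", not_unicode(OsString::from("value")));
    println!("{:?}", boxed_from_ref("HOME"));
}
```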

With errors in general, the problem is that error messages are likely to come from format!, and format! does not allocate exactly the needed length, so converting to Box<str> could require a reallocation.
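A minimal sketch of that situation follows. The message helper and its text are made up for illustration, and the capacity that format! ends up with is an implementation detail, so the program prints it rather than assuming a value.

```rust
use std::mem::size_of;

// Hypothetical error-message builder, standing in for any format! call.
fn message(code: u32, path: &str) -> String {
    format!("error {code}: cannot open {path}")
}

fn main() {
    let msg = message(2, "/tmp/missing");
    // The capacity may exceed the length; how much is not specified.
    println!("len = {}, capacity = {}", msg.len(), msg.capacity());

    // into_boxed_str drops the excess capacity, which may reallocate.
    let boxed: Box<str> = msg.into_boxed_str();
    println!("boxed len = {}", boxed.len());

    // The handle shrinks from three words to two.
    println!(
        "String: {} bytes, Box<str>: {} bytes",
        size_of::<String>(),
        size_of::<Box<str>>()
    );
}
```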
