Insufficient `std::io::Error`


#1

IMO the std::io::Error type is not sufficient, especially for handling accept() failures correctly.

The current std::io::ErrorKind::Other variant mixes fatal and nonfatal errors, and there’s no way to tell them apart (except by parsing the message, which is obviously not portable).

Errors that can be returned by accept() on Linux:

Fatal errors

  • EBADF => ErrorKind::Other
  • EINVAL => ErrorKind::InvalidInput
  • EOPNOTSUPP => ErrorKind::Other
  • ENOTSOCK => ErrorKind::Other
  • EFAULT => ErrorKind::Other

All of those should be panics IMO.

Semi-transient errors/unclear

  • EPERM => ErrorKind::PermissionDenied
  • EMFILE => ErrorKind::Other
  • ENFILE => ErrorKind::Other
  • ENOBUFS, ENOMEM => ErrorKind::Other
  • EPROTO => ErrorKind::Other

Transient errors

  • EAGAIN or EWOULDBLOCK => ErrorKind::WouldBlock
  • ECONNABORTED => ErrorKind::ConnectionAborted
  • EINTR => ErrorKind::Interrupted

According to the man pages, accept() can also return errors that were already pending on the new socket; those should be treated like EAGAIN, i.e. they are nonfatal. Many of those are also ErrorKind::Other.

That means that it’s currently impossible to handle this correctly… ErrorKind::Other could mean anything.

Is there anything that can be done to clean this up a bit?


#2

Sorry for the noise, I have to think a bit more about this before posting…

EDIT: Restored the original post (edited a little bit)


#3

Could you talk more concretely about specifically which errors you want to handle and what “handling” them means?

You can get the raw errno value out of an io::Error, by the way: https://doc.rust-lang.org/std/io/struct.Error.html#method.raw_os_error
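A minimal sketch of what that could look like, assuming the errno values used on Linux and macOS. The constants `EMFILE` and `ENFILE` are hard-coded here as an assumption, and the helper name `is_fd_exhaustion` is hypothetical; real code would more likely take the constants from the libc crate.

```rust
use std::io;

// Assumption: errno values as defined on Linux and macOS.
// They are hard-coded only to keep the sketch free of dependencies.
const ENFILE: i32 = 23; // system-wide file table is full
const EMFILE: i32 = 24; // per-process file descriptor limit reached

/// Hypothetical helper: true if the error carries an OS code that
/// indicates file descriptor exhaustion.
fn is_fd_exhaustion(err: &io::Error) -> bool {
    // raw_os_error() returns None for errors that did not come from the OS.
    matches!(err.raw_os_error(), Some(ENFILE) | Some(EMFILE))
}
```

This only inspects the numeric code, so it is exactly as platform-specific as the constants it compares against.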


#4

Sounds to me like your problem would be mostly solved if you could distinguish between the variety of resource-exhausted errors and other Other errors, plus perhaps a change from some of the Other errors to InvalidInput? That last one might be hard, since different error values can mean different things for different system calls and different platforms. So if you want to be true to a particular OS, you’ll need to interpret the raw code regardless (and then no longer be platform-independent).


#5

Whoa, how could I miss that. That’s a bit embarrassing now…

Specifically, on OS X the default maximum number of file handles is ridiculously low (256). I’ve run into this (ENFILE, or was it EMFILE..) and took it as motivation to improve the error handling in general.

Previously I just handled all unknown errors as fatal, but here I can obviously just go on, and once there are enough free file handles, everything will be back to normal. The same is true for other kinds of errors, at least on Linux apparently:

Error handling
   Linux accept() (and accept4()) passes already-pending network errors
   on the new socket as an error code from accept().  This behavior
   differs from other BSD socket implementations.  For reliable
   operation the application should detect the network errors defined
   for the protocol after accept() and treat them like EAGAIN by
   retrying.  In the case of TCP/IP, these are ENETDOWN, EPROTO,
   ENOPROTOOPT, EHOSTDOWN, ENONET, EHOSTUNREACH, EOPNOTSUPP, and
   ENETUNREACH.

OTOH I’m very hesitant to just retry on any unknown error; I fear that this will “end” in an endless loop eating up all my CPU.
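One way to keep a retry from turning into a busy loop is to sleep between attempts and give up after a bounded number of failures. The following is only a sketch under those assumptions: the names `next_backoff` and `accept_with_backoff`, the 10 ms starting delay, and the 1 s cap are all made up for illustration.

```rust
use std::io;
use std::net::{TcpListener, TcpStream};
use std::thread;
use std::time::Duration;

// Assumption: an arbitrary upper bound for the delay between retries.
const MAX_BACKOFF: Duration = Duration::from_secs(1);

/// Doubles the delay, capped at MAX_BACKOFF.
fn next_backoff(current: Duration) -> Duration {
    (current * 2).min(MAX_BACKOFF)
}

/// Hypothetical helper: retries accept() up to `max_retries` times,
/// sleeping between attempts so a persistent error cannot spin the CPU.
fn accept_with_backoff(listener: &TcpListener, max_retries: u32) -> io::Result<TcpStream> {
    let mut delay = Duration::from_millis(10);
    let mut attempts = 0;
    loop {
        match listener.accept() {
            Ok((stream, _addr)) => return Ok(stream),
            Err(e) if attempts < max_retries => {
                attempts += 1;
                eprintln!("accept failed ({}), retrying in {:?}", e, delay);
                thread::sleep(delay);
                delay = next_backoff(delay);
            }
            // Retry budget used up: hand the last error to the caller.
            Err(e) => return Err(e),
        }
    }
}
```

This retries on every error, including fatal ones, so it trades correctness for simplicity; the bounded retry count is what keeps a fatal error from looping forever.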

Using the raw system error code is a last resort but it’s also not ideal. I think it should be possible to handle this in a robust way by just using the standard library without additionally depending on libc or nix (for the error constants).

Ideally the standard library would provide a classification of the error depending on the system call and the platform.
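To make the idea concrete, here is a rough sketch of what such a classification could look like for accept() on Linux. Nothing here exists in the standard library: the `AcceptErrorClass` enum and `classify_accept_error` function are hypothetical, and the errno constants are hard-coded Linux values, which is exactly the platform dependence this would have to hide.

```rust
use std::io;

/// Hypothetical classification of accept() failures.
#[derive(Debug, PartialEq, Eq)]
enum AcceptErrorClass {
    Transient,         // retry right away
    ResourceExhausted, // retry once resources are free again
    Fatal,             // programming error, retrying cannot help
    Unknown,           // not classified, caller decides
}

// Assumption: Linux errno values, hard-coded to avoid a libc dependency.
const EBADF: i32 = 9;
const ENOMEM: i32 = 12;
const EFAULT: i32 = 14;
const EINVAL: i32 = 22;
const ENFILE: i32 = 23;
const EMFILE: i32 = 24;
const ENONET: i32 = 64;
const EPROTO: i32 = 71;
const ENOTSOCK: i32 = 88;
const ENOPROTOOPT: i32 = 92;
const ENETDOWN: i32 = 100;
const ENETUNREACH: i32 = 101;
const ENOBUFS: i32 = 105;
const EHOSTDOWN: i32 = 112;
const EHOSTUNREACH: i32 = 113;

fn classify_accept_error(err: &io::Error) -> AcceptErrorClass {
    // Portable part: kinds that std already distinguishes.
    match err.kind() {
        io::ErrorKind::WouldBlock
        | io::ErrorKind::ConnectionAborted
        | io::ErrorKind::Interrupted => return AcceptErrorClass::Transient,
        _ => {}
    }
    // Linux-specific part: fall back to the raw errno value.
    match err.raw_os_error() {
        Some(ENFILE) | Some(EMFILE) | Some(ENOMEM) | Some(ENOBUFS) => {
            AcceptErrorClass::ResourceExhausted
        }
        // Pending network errors that the man page says to treat like EAGAIN.
        Some(ENETDOWN) | Some(EPROTO) | Some(ENOPROTOOPT) | Some(EHOSTDOWN)
        | Some(ENONET) | Some(EHOSTUNREACH) | Some(ENETUNREACH) => AcceptErrorClass::Transient,
        Some(EBADF) | Some(EINVAL) | Some(ENOTSOCK) | Some(EFAULT) => AcceptErrorClass::Fatal,
        // EOPNOTSUPP and EPERM are deliberately left unclassified here.
        _ => AcceptErrorClass::Unknown,
    }
}
```

EOPNOTSUPP is left out on purpose: it appears both in the fatal list in the first post and in the man page’s retry list, so the sketch does not pick a side.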


#6

I’m a bit suspicious that anyone actually pays attention to that paragraph of the man page 😛