Whoa, how could I miss that. That’s a bit embarrassing now…
Specifically, on OS X the default maximum number of file handles is ridiculously low (256). I’ve run into this (ENFILE, or was it EMFILE..) and took it as motivation to improve the error handling in general.
Previously I just handled all unknown errors as fatal, but here I can obviously just go on and once there are enough free file handles, it will all be back to normal.
The same is true for other kinds of errors, at least on Linux apparently:
Error handling
Linux accept() (and accept4()) passes already-pending network errors
on the new socket as an error code from accept(). This behavior
differs from other BSD socket implementations. For reliable
operation the application should detect the network errors defined
for the protocol after accept() and treat them like EAGAIN by
retrying. In the case of TCP/IP, these are ENETDOWN, EPROTO,
ENOPROTOOPT, EHOSTDOWN, ENONET, EHOSTUNREACH, EOPNOTSUPP, and
ENETUNREACH.
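A minimal sketch of how such a classification could look using only the standard library. `AcceptAction` and `classify_accept_error` are hypothetical names, and the numeric errno values are an assumption: they are the Linux values on common architectures such as x86-64 and would have to come from `libc` or `nix` for anything portable.

```rust
use std::io;

/// What an accept loop might do after `accept()` returns an error.
#[derive(Debug, PartialEq, Eq)]
pub enum AcceptAction {
    /// Transient error tied to one connection: call `accept()` again.
    Retry,
    /// Out of file handles: wait a little before calling `accept()` again.
    Backoff,
    /// Unknown error: give up instead of risking a busy loop.
    Fatal,
}

// ASSUMPTION: hard-coded errno values for Linux on x86-64. They are not
// guaranteed on other platforms or architectures.
const ENFILE: i32 = 23;
const EMFILE: i32 = 24;
const ENONET: i32 = 64;
const EPROTO: i32 = 71;
const ENOPROTOOPT: i32 = 92;
const EOPNOTSUPP: i32 = 95;
const ENETDOWN: i32 = 100;
const ENETUNREACH: i32 = 101;
const EHOSTDOWN: i32 = 112;
const EHOSTUNREACH: i32 = 113;

/// Hypothetical helper that maps an `accept()` error to an action.
pub fn classify_accept_error(err: &io::Error) -> AcceptAction {
    // Portable kinds that the standard library already exposes.
    match err.kind() {
        io::ErrorKind::WouldBlock
        | io::ErrorKind::Interrupted
        | io::ErrorKind::ConnectionAborted => return AcceptAction::Retry,
        _ => {}
    }

    // Everything else needs the raw OS code.
    match err.raw_os_error() {
        // File handle limit reached (per process or system wide).
        Some(EMFILE | ENFILE) => AcceptAction::Backoff,
        // Pending network errors that accept(2) says to treat like EAGAIN.
        Some(
            ENETDOWN | EPROTO | ENOPROTOOPT | EHOSTDOWN | ENONET | EHOSTUNREACH
            | EOPNOTSUPP | ENETUNREACH,
        ) => AcceptAction::Retry,
        // Anything unknown stays fatal.
        _ => AcceptAction::Fatal,
    }
}
```

An accept loop would call `classify_accept_error(&err)` on every failure and only continue for `Retry` and `Backoff`, so unknown errors still stop the loop.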
OTOH I’m very hesitant to just retry on any unknown error; I fear that this will “end” in an endless loop eating up all my CPU.
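One way to keep a retry from turning into a busy loop is to cap the number of consecutive failures and sleep between attempts. The sketch below is only an illustration of that idea; `RetryBudget` and its methods are hypothetical names, and the limits are arbitrary.

```rust
use std::time::Duration;

/// Hypothetical helper that limits consecutive failed `accept()` calls.
pub struct RetryBudget {
    max_consecutive: u32,
    failures: u32,
    base_delay: Duration,
    max_delay: Duration,
}

impl RetryBudget {
    pub fn new(max_consecutive: u32, base_delay: Duration, max_delay: Duration) -> Self {
        RetryBudget { max_consecutive, failures: 0, base_delay, max_delay }
    }

    /// Records a failure. Returns how long to sleep before retrying, or
    /// `None` once the budget is used up and the error should be fatal.
    pub fn on_failure(&mut self) -> Option<Duration> {
        if self.failures >= self.max_consecutive {
            return None;
        }
        // Double the delay for every consecutive failure, capped at max_delay.
        let factor = 1u32.checked_shl(self.failures).unwrap_or(u32::MAX);
        let delay = self
            .base_delay
            .checked_mul(factor)
            .unwrap_or(self.max_delay)
            .min(self.max_delay);
        self.failures += 1;
        Some(delay)
    }

    /// Resets the counter after a successful `accept()`.
    pub fn on_success(&mut self) {
        self.failures = 0;
    }
}
```

In an accept loop, a `Some(delay)` result would be passed to `std::thread::sleep` before the next attempt, and `None` would be treated like a fatal error.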
Using the raw system error code is a last resort but it’s also not ideal. I think it should be possible to handle this in a robust way by just using the standard library without additionally depending on libc or nix (for the error constants).
Ideally the standard library would provide a classification of the error depending on the system call and the platform.