io::Read read_to_end should handle OOM

Currently read_to_end() uses vec.reserve(), so if the vec can't allocate, it will cause the whole process to be aborted. This seems bad to me.

The abort is preventable by writing a custom read_to_end() implementation that uses vec.try_reserve(), but that is needlessly cumbersome and non-trivial, since libstd's read_to_end() has some optimization tricks.

So I think read_to_end() should be updated to use try_reserve() whenever possible, and report io::ErrorKind::OutOfMemory.
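To make the idea concrete, here is a minimal sketch of a fallible read loop. It is not libstd's implementation and has none of its optimization tricks; the name read_to_end_fallible and the 8 KiB chunk size are made up for illustration.

```rust
use std::io::{self, Read};

/// Hypothetical helper (not part of libstd): reads until EOF, but reports
/// allocation failure as an io::Error instead of aborting the process.
fn read_to_end_fallible<R: Read>(reader: &mut R, buf: &mut Vec<u8>) -> io::Result<usize> {
    // Assumed chunk size, chosen only for this sketch.
    const CHUNK: usize = 8 * 1024;
    let start_len = buf.len();
    let mut chunk = [0u8; CHUNK];
    loop {
        match reader.read(&mut chunk) {
            // EOF: return the number of bytes appended by this call.
            Ok(0) => return Ok(buf.len() - start_len),
            Ok(n) => {
                // try_reserve returns Err on allocation failure instead of aborting.
                buf.try_reserve(n)
                    .map_err(|_| io::Error::from(io::ErrorKind::OutOfMemory))?;
                // Capacity is already reserved, so this append does not reallocate.
                buf.extend_from_slice(&chunk[..n]);
            }
            // Retry on EINTR-style interruptions, like the default impl does.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
}
```

A caller would use it like read_to_end(), e.g. read_to_end_fallible(&mut file, &mut buf)?, and get an error back where the std version would abort.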

Previously, I suggested something similar for Vec's io::Write, but the counterargument was that Vec::write could never return Err before, so changing a hard abort into a Result would be a behavior change.

I think Hyrum's Law is less of a problem for read_to_end(), because io::Read is implemented for each type individually, so the implementation for Vec (the silly case of vec.read_to_end(&mut other_vec)) can be left aborting, while all other implementations could start caring about OOM.

There's another issue: read_to_end() can't guarantee that OOM is always handled, because anybody can implement it for any type. This varying quality of implementation is unfortunate, but libstd improving its own implementation would already be a big step forward, especially since many io::Read implementations keep the trait's default impl.


Discussions about OOM bring standard answers, so I'll preempt them:

  • Rust works on platforms other than Linux with the OOM killer enabled. Windows and embedded platforms can report OOM reliably. Various containers and sandboxes can make a process run out of memory before the whole machine is in a dire state. WASM OOMs all the time. There's also cap, which can put a hard limit on a Rust program's memory use.

  • OOM handling in C is awful and hopelessly brittle due to manual error handling and untested error paths. Rust doesn't have this problem: its drop is automatic, and is invoked on happy paths too. The ? operator is used for all kinds of errors, so error handling is much more reliable and well exercised.

  • While crashes can't always be prevented, a crash-only design is not always the best solution for every application.

  • Not all OOM cases can be handled. However, OOM most often happens when a large Vec needs more pages from the OS or a large contiguous address space, and in that situation the allocator may still have some spare memory for small objects like errors. io::Error with ErrorKind does not allocate.
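As a small illustration of the last point, here is a sketch of building and inspecting such an error. The helper names oom_error and describe are made up for this example; the std calls are io::Error::from(ErrorKind), kind(), and get_ref().

```rust
use std::io;

/// Hypothetical helper: builds the error from a bare ErrorKind,
/// so no boxed custom payload is attached to it.
fn oom_error() -> io::Error {
    io::Error::from(io::ErrorKind::OutOfMemory)
}

/// Hypothetical helper: shows how a caller could single out OOM
/// from other I/O failures and degrade gracefully.
fn describe(err: &io::Error) -> &'static str {
    match err.kind() {
        io::ErrorKind::OutOfMemory => "out of memory",
        _ => "other I/O error",
    }
}
```

Calling get_ref() on the result of oom_error() returns None, which shows there is no inner boxed error attached.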


Imo people are too conservative about Hyrum's Law here. Being afraid of hypothetical (not even observed) breakage of things that were never guaranteed in the first place is excessive. API contracts exist for a reason. Otherwise we might as well stop documenting things and tell people to just read the implementation.

Sounds reasonable to me. In theory the kernel could already be returning ENOMEM on its own. Though that's an exceedingly rare thing ime.

