io::Read read_to_end should handle OOM

kornel · October 7, 2023, 1:36pm

Currently read_to_end() uses vec.reserve(), so if the vec can't allocate, it will cause the whole process to be aborted. This seems bad to me.

The abort is preventable by writing a custom read_to_end() implementation that uses vec.try_reserve(), but that is needlessly cumbersome and non-trivial, since libstd's read_to_end() has some optimization tricks.

So I think read_to_end() should be updated to use try_reserve() whenever possible, and report io::ErrorKind::OutOfMemory.
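For illustration, here is a minimal sketch of the kind of hand-written workaround described above. The helper name `try_read_to_end` and the 8 KiB chunk size are assumptions made for this example. It is not libstd's implementation and has none of its optimizations.

```rust
use std::io::{self, ErrorKind, Read};

/// Hypothetical helper (not part of libstd): reads `reader` to EOF,
/// appending to `buf`, and returns an error instead of aborting when
/// the buffer cannot grow.
fn try_read_to_end<R: Read>(reader: &mut R, buf: &mut Vec<u8>) -> io::Result<usize> {
    let start_len = buf.len();
    // Fixed-size stack scratch buffer; the size is an arbitrary choice.
    let mut chunk = [0u8; 8 * 1024];
    loop {
        match reader.read(&mut chunk) {
            // EOF: report how many bytes were appended.
            Ok(0) => return Ok(buf.len() - start_len),
            Ok(n) => {
                // Fallible growth: an allocation failure becomes an io::Error.
                buf.try_reserve(n)
                    .map_err(|_| io::Error::from(ErrorKind::OutOfMemory))?;
                // Capacity is already reserved, so this does not reallocate.
                buf.extend_from_slice(&chunk[..n]);
            }
            // Retry reads that were interrupted by a signal.
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
}
```

Copying through an intermediate chunk costs an extra memcpy per read, which is one reason a built-in fallible version would be preferable to everyone writing their own.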

I previously suggested something similar for Vec's io::Write implementation, but the counterargument was that Vec::write could never return Err before, so changing a hard abort into a Result would be a behavior change.

I think for read_to_end() Hyrum's Law is less of a problem, because io::Read is implemented for each type individually, so the implementation for Vec (the silly case of vec.read_to_end(&mut other_vec)) can be left aborting, while all other implementations could start caring about OOM.

There's the other issue that read_to_end() cannot guarantee it will always handle OOM, because it can be implemented by anybody for any type. This varying quality of implementation is unfortunate, but libstd improving its own implementation would already be a big step forward, especially since many io::Read implementations keep the trait's default impl.

kornel · October 7, 2023, 1:37pm

Discussions about OOM bring standard answers, so I'll preempt them:

- Rust works on other platforms besides Linux with the OOM killer enabled. Windows and embedded platforms can report OOM reliably. Various containers and sandboxes can make a process run out of memory before the whole machine is in a dire state. WASM OOMs all the time. The cap crate can put a hard limit on Rust's memory use too.
- OOM handling in C is awful and hopelessly brittle due to manual error handling and untested error paths. Rust doesn't have this problem: its drop is automatic, and invoked on happy paths too. The ? operator is used for all kinds of errors, so error handling is much more reliable and well exercised.
- While crashes can't always be prevented, a crash-only design is not the best solution for every application.
- Not all OOM cases can be handled. However, OOM most often happens when a large Vec needs more pages from the OS or a large contiguous address space, and in that situation the allocator may still have some spare memory for small objects like errors. An io::Error built from an ErrorKind does not allocate.
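A small sketch of how the points above could look from the caller's side, assuming read_to_end() reports allocation failure as ErrorKind::OutOfMemory. The names `OomReader`, `load`, and `describe` are hypothetical and exist only for this example. `OomReader` simulates the failure so the error path can be exercised without actually exhausting memory.

```rust
use std::io::{self, ErrorKind, Read};

/// Hypothetical reader that always reports OOM, standing in for a
/// read whose destination buffer failed to grow.
struct OomReader;

impl Read for OomReader {
    fn read(&mut self, _buf: &mut [u8]) -> io::Result<usize> {
        // Building an io::Error from a bare ErrorKind does not allocate.
        Err(io::Error::from(ErrorKind::OutOfMemory))
    }
}

/// OOM travels through the same `?` path as every other I/O error.
/// `data` is dropped automatically on the early return.
fn load<R: Read>(mut reader: R) -> io::Result<Vec<u8>> {
    let mut data = Vec::new();
    reader.read_to_end(&mut data)?;
    Ok(data)
}

/// Callers that care can single out OOM; everyone else treats it as
/// just another I/O failure.
fn describe(result: &io::Result<Vec<u8>>) -> &'static str {
    match result {
        Ok(_) => "loaded",
        Err(e) if e.kind() == ErrorKind::OutOfMemory => "input too large for available memory",
        Err(_) => "I/O error",
    }
}
```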

the8472 · October 7, 2023, 2:15pm

Imo people are too conservative about Hyrum's Law here. Being afraid of hypothetical (not even observed) breakage of things that were never guaranteed in the first place is excessive. API contracts exist for a reason. Otherwise we might as well stop documenting things and tell people to just read the implementation.

Sounds reasonable to me. In theory the kernel could already be returning ENOMEM on its own. Though that's an exceedingly rare thing ime.

kornel · October 13, 2023, 4:31pm

github.com/rust-lang/rust

Handle out of memory errors in io:Read::read_to_end()

rust-lang:master ← kornelski:read-to-oom

opened 02:05AM - 15 Nov 23 UTC

kornelski

+34 -5

#116570 got stuck due to a [procedural confusion](https://github.com/rust-lang/r…ust/pull/116570#issuecomment-1768271068). Retrying so that it can get FCP with the proper team now. cc @joshtriplett @BurntSushi

----

I'd like to propose handling of out-of-memory errors in the default implementation of `io::Read::read_to_end()` and `fs::read()`. These methods create/grow a `Vec` with a size that is external to the program, and could be arbitrarily large.

Due to being I/O methods, they can already fail in a variety of ways, in theory even including `ENOMEM` from the OS too, so another failure case should not surprise anyone.

While this may not help much on Linux with overcommit, it's useful for other platforms like WASM. [Internals thread](https://internals.rust-lang.org/t/io-read-read-to-end-should-handle-oom/19662).

I've added documentation that makes it explicit that the OOM handling is a nice-to-have, and not a guarantee of the trait.

I haven't changed the implementation of `impl Read for &[u8]` and `VecDeque` out of caution, because in these cases users could assume `read` can't fail.

This code uses `try_reserve()` + `extend_from_slice()` which is optimized since #117503.
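As a rough illustration of the `fs::read()` case mentioned in the PR description, the sketch below reserves space fallibly based on an externally supplied size before reading. The function name `try_read_file` is hypothetical, and this is not the code from the PR. Note that the final `read_to_end()` call may still grow the buffer through whatever path the standard library uses if the file turns out larger than the metadata claimed.

```rust
use std::convert::TryFrom;
use std::fs::File;
use std::io::{self, ErrorKind, Read};
use std::path::Path;

/// Hypothetical fallible counterpart to `fs::read()` (illustration only).
fn try_read_file(path: &Path) -> io::Result<Vec<u8>> {
    let mut file = File::open(path)?;

    // The size comes from outside the program and may be arbitrarily
    // large; treat it as a hint and fall back to 0 if it is unavailable.
    let size_hint = file.metadata().map(|m| m.len()).unwrap_or(0);

    // A length that does not fit in usize cannot be allocated at all.
    let size_hint = usize::try_from(size_hint)
        .map_err(|_| io::Error::from(ErrorKind::OutOfMemory))?;

    let mut bytes = Vec::new();
    // Fallible up-front reservation instead of an aborting `reserve()`.
    bytes
        .try_reserve_exact(size_hint)
        .map_err(|_| io::Error::from(ErrorKind::OutOfMemory))?;

    file.read_to_end(&mut bytes)?;
    Ok(bytes)
}
```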

system · January 11, 2024, 4:32pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.
