I've converted most of my Rust projects to use fallible_collections, including a service running in production at "web scale". It does regularly abort itself due to actual OOM, on Linux. With plain libstd I had a couple thousand rust_oom coredumps per day. With try_reserve and proactive detection of low-memory situations I brought it down to 5-10 aborts per day, which are all in 3rd party code now.
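For readers who haven't used it, this is roughly what the try_reserve pattern looks like with plain std (stable Vec::try_reserve_exact, not the fallible_collections API). The helper name try_copy is made up for illustration; it's a minimal sketch, not code from my service.

```rust
use std::collections::TryReserveError;

/// Hypothetical helper: copies `src` into a new Vec, reporting allocation
/// failure as an error instead of aborting the process.
fn try_copy(src: &[u8]) -> Result<Vec<u8>, TryReserveError> {
    let mut out = Vec::new();
    // Reserve up front; this is the only step that can fail.
    out.try_reserve_exact(src.len())?;
    // Capacity is already available, so this does not need to grow the Vec.
    out.extend_from_slice(src);
    Ok(out)
}

fn main() {
    match try_copy(b"hello") {
        Ok(v) => println!("copied {} bytes", v.len()),
        Err(e) => eprintln!("allocation failed: {}", e),
    }

    // An impossible request comes back as an error rather than an abort.
    let mut huge: Vec<u8> = Vec::new();
    assert!(huge.try_reserve(usize::MAX).is_err());
}
```

The caller decides what to do with the error: drop a request, shed load, or propagate it further up.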
I've heard that there's an open question whether Rust should use a different approach to fallibility and have a FallibleVec type or similar, like Vec<T, SomeFallibleNonDefaultFlag>.
In my experience with the fallible_collections prototype: type-based enforcement of fallibility is both too restrictive and not restrictive enough at the same time. It sucks.
Returning a non-standard variant of a Vec (in this case FallibleVec) is "viral" and forces callers of the function to also switch their Vec type.
If the caller also has to work with other functions that use the vec, it requires those other functions to change too. Sometimes it's impossible if it involves 3rd party crates or std (e.g. io::Read). If the Vec needs to be stored in a struct, then the methods of the struct and users of the struct also need to change, causing more and more changes. This makes the switch difficult, because a small change in one function may snowball into refactoring of the whole dependency tree.
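To make the io::Read case concrete, here is a small sketch. The FallibleVec below is a hypothetical newtype written just for this example (it is not the type from fallible_collections or any std proposal), and read_all is an illustrative function name.

```rust
use std::collections::TryReserveError;
use std::io::Read;

/// Hypothetical fallible-only wrapper, defined here purely for illustration.
struct FallibleVec<T>(Vec<T>);

impl<T> FallibleVec<T> {
    fn new() -> Self {
        FallibleVec(Vec::new())
    }

    fn try_push(&mut self, item: T) -> Result<(), TryReserveError> {
        self.0.try_reserve(1)?;
        self.0.push(item);
        Ok(())
    }

    /// Escape hatch needed for any API that only accepts a plain Vec.
    fn into_inner(self) -> Vec<T> {
        self.0
    }
}

/// `Read::read_to_end` takes `&mut Vec<u8>`, so the wrapper has to be
/// unwrapped at this boundary and its type-level guarantee is lost there.
fn read_all(mut reader: impl Read, buf: FallibleVec<u8>) -> std::io::Result<FallibleVec<u8>> {
    let mut plain = buf.into_inner();
    reader.read_to_end(&mut plain)?;
    Ok(FallibleVec(plain))
}

fn main() {
    let mut buf = FallibleVec::new();
    buf.try_push(1u8).expect("allocation failed");
    let out = read_all(&[2u8, 3][..], buf).expect("read failed");
    println!("{:?}", out.into_inner());
}
```

Every function between the wrapper and such an API either changes its signature or does this unwrap-and-rewrap dance.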
It also feels too viral, because not every place that receives a Vec needs the ability to grow it. Very often data ends up in a FallibleVec only because it had to be fallible at creation time, but afterwards it's effectively treated as immutable or fixed-size. The right type for such non-growable owned data should have been Box<[T]> rather than [Fallible]Vec<T>, but a conversion from a Vec with excess capacity to a boxed slice is not free, so in low-memory situations it's actually better to keep using a needlessly-growable Vec than to switch to a more theoretically-accurate Box<[T]>.
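A tiny illustration of the conversion cost, using only std. Vec::into_boxed_slice is documented to discard excess capacity first, and that may have to reallocate, which is exactly what you don't want when memory is already tight. The function name freeze_demo is made up for the example.

```rust
/// Hypothetical demo: returns (capacity before conversion, boxed slice length).
fn freeze_demo() -> (usize, usize) {
    // Capacity was reserved generously at creation time.
    let mut v: Vec<u32> = Vec::with_capacity(1024);
    v.extend_from_slice(&[1, 2, 3]);
    let cap_before = v.capacity();

    // A boxed slice has no spare capacity, so the excess has to be
    // given back here, which may involve a reallocation.
    let frozen: Box<[u32]> = v.into_boxed_slice();
    (cap_before, frozen.len())
}

fn main() {
    let (cap_before, len) = freeze_demo();
    println!("capacity before: {}, boxed length: {}", cap_before, len);
}
```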
Adoption of a FallibleVec variant, despite its virality, is also quite insufficient for ensuring all allocations are handled fallibly throughout the codebase. This type used in APIs doesn't do anything about Vecs used inside function bodies. I've had to also find and eliminate all uses of temporary vecs inside functions: things like a .collect::<Vec<_>>() to a temporary to sort it or to convert [Owned] to [&borrowed]. It's not easy to grep for all such cases, because there are too many methods that may allocate, e.g. From/Into.
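As an example of the kind of rewrite this involves, here is a sketch of replacing a temporary .collect::<Vec<_>>() with a reserve-then-extend pair from std. The helper names try_collect_refs and sorted_view are hypothetical. I've used sort_unstable because it is documented as not allocating, whereas the stable sort may allocate a temporary buffer.

```rust
use std::collections::TryReserveError;

/// Hypothetical fallible stand-in for `items.iter().collect::<Vec<&T>>()`,
/// i.e. the [Owned] to [&borrowed] conversion.
fn try_collect_refs<T>(items: &[T]) -> Result<Vec<&T>, TryReserveError> {
    let mut refs = Vec::new();
    // The only fallible step: reserve room for every reference up front.
    refs.try_reserve_exact(items.len())?;
    // Capacity is already there, so extending does not need to grow.
    refs.extend(items.iter());
    Ok(refs)
}

/// Sorted view over borrowed data, without an aborting temporary Vec.
fn sorted_view(names: &[String]) -> Result<Vec<&String>, TryReserveError> {
    let mut refs = try_collect_refs(names)?;
    // In-place sort that is documented not to allocate.
    refs.sort_unstable();
    Ok(refs)
}

fn main() {
    let names = vec!["b".to_string(), "a".to_string()];
    match sorted_view(&names) {
        Ok(sorted) => println!("{:?}", sorted),
        Err(e) => eprintln!("allocation failed: {}", e),
    }
}
```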
Most importantly, adoption of a FallibleVec in my codebase does absolutely nothing about 3rd party code. Crates may return wrappers around aborting-Vec such as Image or Bytes. They should of course add fallibility support in their codebase, but my point is that my use of a FallibleVec type doesn't force them to do so. Even if a 3rd party crate's API doesn't use any aborting types, there's no guarantee that it doesn't use them internally. I would like to detect and potentially block use of 3rd party code that aborts on OOM, but there are no language features for that, and type-based enforcement can't do it.
My conclusions so far:
- Vec::try_reserve + Vec::try_with_capacity + Vec::try_extend are a big improvement. They're easy to adopt. Finding all places where allocations happen is a whack-a-mole, but once it's done it works well.
- Handling fallibility through a FallibleVec type is not worth it. It's a pain from a usability and interoperability perspective, and it doesn't improve anything over Vec::try_reserve. Just like Vec::try_reserve it's only a partial opt-in solution. I would also be wary of code that uses generics to allow both fallible and non-fallible flavors of Vecs, because that can cause generics bloat, and still fails to give a guarantee that the program won't abort. I would not use such a type if it was in std.
- I would like to have some additional solution that enforces fallible allocation for entire scopes or entire crates, including calls to 3rd party code, even code that uses aborting-Vec only privately. I think it would be similar to enforcing no panics.
- I've never needed any detailed information from the allocation error. From the context it's always obvious what I was trying to allocate, and details don't matter beyond the fact that it failed. Any information about allocation size is redundant. Any information about remaining free mem is unusable due to the inherent TOCTOU race, so there's really nothing useful I could do with details in allocation errors. The only option is to give up and propagate the error, so to me a zero-sized AllocError type is entirely sufficient.
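To show what I mean by the last point, here is a sketch of a zero-sized error that simply discards whatever details std's TryReserveError carries. The AllocError struct and the try_zeroed helper are defined locally for this example and are assumptions of mine, not an existing stable std API.

```rust
use std::collections::TryReserveError;
use std::fmt;

/// Hypothetical zero-sized error: it only says "allocation failed".
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct AllocError;

impl fmt::Display for AllocError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("memory allocation failed")
    }
}

impl std::error::Error for AllocError {}

impl From<TryReserveError> for AllocError {
    // Details are dropped on purpose; the call site already knows
    // what it was trying to allocate.
    fn from(_: TryReserveError) -> Self {
        AllocError
    }
}

/// Illustrative helper: a zero-filled buffer, or an error instead of an abort.
fn try_zeroed(len: usize) -> Result<Vec<u8>, AllocError> {
    let mut buf = Vec::new();
    buf.try_reserve_exact(len)?; // `?` converts via the From impl above
    buf.resize(len, 0); // capacity is already reserved
    Ok(buf)
}

fn main() {
    println!("size of AllocError: {}", std::mem::size_of::<AllocError>());
    match try_zeroed(16) {
        Ok(buf) => println!("got {} bytes", buf.len()),
        Err(e) => eprintln!("{}", e),
    }
}
```

Since the error carries no data, Result<T, AllocError> adds little or no overhead, and propagating it with ? is all the handling there is.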