Could we support unwinding from OOM, at least for collections?

Currently, running out of memory always aborts.

However, there are processes that need to survive OOM – as this thread states, kernels and microkernel servers are two examples. In these applications, one probably does not want any form of unwinding at all – it requires custom code and is not very predictable. Furthermore, such code often has soft (or even hard) real-time requirements, as well as the need to minimize fragmentation. Therefore, it often uses custom allocators instead of providing an implementation of the standard Rust allocator.

But there are other examples of code that needs to survive OOM, or at least run cleanup code. The main cases I can think of are:

  • a program that needs to persist some state to disk on OOM (before exiting) – perhaps to roll back a file format (such as a .docx or a .odf) that cannot be updated atomically, and which would be left in a corrupt state, though a better solution is OS-provided atomic renames.
  • a program that stores large amounts of data in caches, and on OOM can drop some or all of these caches and retry the allocation. Such programs often run close to the limits of available memory anyway (for performance), and are designed to run with paging and overcommit disabled.
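
To make the second case concrete, here is a minimal sketch of "drop caches and retry". It assumes an allocation API that reports failure instead of aborting; `Vec::try_reserve_exact`, which is available in today's standard library, plays that role here. The `Cache` type, `evict_one`, and `alloc_with_eviction` are hypothetical names made up for illustration, not an existing API.

```rust
use std::collections::{TryReserveError, VecDeque};

/// Hypothetical cache: a queue of byte buffers, oldest first.
struct Cache {
    entries: VecDeque<Vec<u8>>,
}

impl Cache {
    /// Drops the oldest entry. Returns false if there was nothing left to drop.
    fn evict_one(&mut self) -> bool {
        self.entries.pop_front().is_some()
    }
}

/// Tries to allocate a buffer with room for `len` bytes. Whenever the
/// allocation fails, one cache entry is evicted and the allocation is retried.
/// The error is returned only once the cache is empty.
fn alloc_with_eviction(cache: &mut Cache, len: usize) -> Result<Vec<u8>, TryReserveError> {
    loop {
        let mut buf: Vec<u8> = Vec::new();
        match buf.try_reserve_exact(len) {
            Ok(()) => return Ok(buf),
            Err(e) => {
                // Allocation failed: free something and retry, or give up.
                if !cache.evict_one() {
                    return Err(e);
                }
            }
        }
    }
}
```

Evicting one entry per retry is just the simplest policy; a real program would more likely drop a batch sized to the failed request. Note that this only helps where the allocator actually reports failure, i.e. with overcommit disabled or under a hard memory limit.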

Do these provide a sufficiently compelling reason to be able to unwind from OOM, or at least run some cleanup hooks?

@dobenour this is the sort of thing that I hope could be toggled with a Cargo feature.

When the custom allocator API lands, I hope that any such allocator can be used with collections, but it is less clear to me that the collections can all be made robust in the presence of OOM: the current algorithms could move a collection into a temporarily inconsistent state from which OOM recovery is impossible.

This kind of thing doesn't work very well in multiprogramming environments, because the process that detects the OOM condition is not, in general, the same process that holds the caches. If ntpd tries to allocate a page and fails, how will it tell the process you wrote to drop caches? It's great for kernels, since the kernel by definition knows about all memory in use and can free e.g. clean page cache pages.

Same reason that the OOM-killer generally goes for the largest process, not the one that's active on CPU at the time of the OOM detection.

ETA: catching out of memory conditions is a good idea on DOS and Mac Classic, unreliable verging on useless on most versions of Windows and Unix, and may be coming back in the brave new world of cgroup memory limits. It's very environment dependent.

Being killed by the OOM killer is IMO not the same as an OOM condition. The OOM killer is a supervisor process that kills processes that go haywire but this does not mean that it’s a hard OOM condition. If your process is killed by another process, there’s not so much you can do about it.

But there are still many other reasons where real OOM conditions can happen:

  • Memory overcommit disabled.
  • System is not Linux. AFAIK Windows, macOS, and the BSDs have no OOM killer.
  • Address space exhausted. There are still quite a few 32 bit programs, especially on Windows.

Just because some conditions cannot be handled doesn’t mean one should also dismiss those that can be.

I am specifically referring to processes that are in environments where OOM is survivable, and are designed for this. A good example is Microsoft’s SQL Server.

I might be overanalyzing this, but "OOM killer" is not a process or even a kernel thread (common mistake). It's a function which is called synchronously from the page allocator in (some) failure conditions.

Ok, so you've got 100 processes running 100 programs and you just ran out of free pages. One of those processes will make the next call to mmap, and will get ENOMEM. What will that process do in response? Remember, you didn't write 99 of those programs, and 99 of those programs had nothing to do with the current memory exhaustion anyway.

THIS is an example of a good use case for recovering from OOM, because address space limits are per-process. One process on a multi-process system can't meaningfully deal with a global OOM situation, but you can deal with local problems like address space exhaustion.

A process in a cgroup (that is allotted a subset of available memory) can meaningfully deal with a local OOM situation. And a microkernel server may not be able to deal with OOM, but must tolerate it (by returning an error), though it really ought to have enough memory reserved in advance.

But that's not the point.

  1. The process that invokes the OOM killer and the process that gets killed are generally not the same.
  2. The process is killed asynchronously with SIGKILL (or SIGTERM).

That means the process that gets killed cannot see why it is being killed; it can only react to being killed (depending on the signal).

Close all files and handles and shut down gracefully without leaving any corrupt data for instance. Write a log entry. Of course one has to be careful not to allocate any memory while doing this, but this is solvable.

Oh, and there is another point to add to the list:

  • Memory limit set with ulimit

But really what I'm trying to say: Being killed by the OOM killer is entirely unrelated to the ENOMEM case. I see no reason to not give a possibility to handle it, even if it is rare.

There are two things that are going on here.

  1. The OOM killer. Basically the kernel realizes too late that there isn’t enough page file + physical memory to satisfy one of its own memory allocation requests. If an application requests memory that the kernel cannot provide, this does not trigger the OOM killer; the kernel just reports an out of memory failure to the application. The OOM killer runs only when the kernel itself does not have enough memory to do important tasks. When the OOM killer does run, there is nothing you can do about it. The kernel will pick some process by some arbitrary means and kill it off to reclaim some memory.

  2. Regular not enough memory errors. An application attempts to allocate some memory and due to various reasons there isn’t enough to satisfy that request. The allocator returns an error saying that there isn’t enough memory. The application now can choose how it wants to respond. In Rust that currently means a hard abort with no way to respond to the out of memory condition, which some people think is unacceptable. The reason Rust currently hard aborts instead of panicking is because when you are out of memory, it is possible for the panic itself to result in an allocation that fails which can result in recursively attempting to panic. I believe the ideal solution is to have an alternative standard library where all things that allocate memory return a Result to handle the situation where the allocation fails due to not enough memory.

So basically, stop worrying about the first point. There’s nothing you can do about OOM killers. Only the second point matters.

Yes, that is what I am considering. Most cases where this is desired are ones where you would not want to actually panic (even if it did not allocate) because

  • you might not have the needed runtime support
  • you probably don’t want to panic at all in highly critical system processes which must not crash.

Here, I disagree, partially because "multi-process system" is rather subtle to define, and partially because there are ways to meaningfully deal with "I, personally, can no longer expect memory allocation to succeed."

The first point can be illustrated with memory cgroups: I may choose to run a Rust-based program as a service under systemd with a harsh memory limit, or even perhaps a generous one but a swap limit of zero. At this point, it's basically a single-process "system" as far as memory is concerned - the only thing that can make it fail an allocation is that the process itself already used up what's available.

The second point is really down to what was mentioned at the very beginning: The difference between hard termination and clean shutdown. If we presume the OOM killer is not an issue (either by platform or configuration), then even if the lack of free memory is some other process' fault, it may be beneficial to the end user for the process that experienced the failure to tidy up its desk, pack up its toys, and turn the lights out on the way down.

I've said elsewhere that there are, in the end, only three ways of handling OOM:

  • Retry
  • Lose something nonessential
  • Die

All three may be valid - and in fact, some (such as Retry) are more valid in the multiprocess setting than the single-process setting (someone else might back down).

And "Die" doesn't always mean "fall over" - the process may well be able to put its affairs in order first.

I imagine you would just want a Box::new_safe, Vec::with_capacity_safe, etc. which return Result<_, OomError>. No need to write a new standard library at all :smile:
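
A rough sketch of what such fallible constructors could look like, written as free functions so it compiles outside the standard library. `box_new_safe`, `vec_with_capacity_safe`, and `OomError` are hypothetical names taken from the suggestion above; only `std::alloc::alloc`, `Box::from_raw`, and `Vec::try_reserve_exact` are real standard library calls.

```rust
use std::alloc::{alloc, Layout};

/// Hypothetical error type for a failed allocation.
#[derive(Debug)]
struct OomError;

/// Like `Box::new`, but returns an error instead of aborting when the
/// global allocator reports failure.
fn box_new_safe<T>(value: T) -> Result<Box<T>, OomError> {
    let layout = Layout::new::<T>();
    if layout.size() == 0 {
        // Zero-sized types never touch the allocator.
        return Ok(Box::new(value));
    }
    // SAFETY: `layout` has a non-zero size, as `alloc` requires.
    let ptr = unsafe { alloc(layout) } as *mut T;
    if ptr.is_null() {
        return Err(OomError);
    }
    // SAFETY: `ptr` is non-null and was allocated by the global allocator
    // with the layout of `T`, so it is valid to write to and to hand to `Box`.
    unsafe {
        ptr.write(value);
        Ok(Box::from_raw(ptr))
    }
}

/// Like `Vec::with_capacity`, but reports allocation failure as an error.
fn vec_with_capacity_safe<T>(capacity: usize) -> Result<Vec<T>, OomError> {
    let mut v = Vec::new();
    v.try_reserve_exact(capacity).map_err(|_| OomError)?;
    Ok(v)
}
```

Collapsing every failure into a unit `OomError` throws away the distinction between "capacity overflow" and "allocator said no"; a real design might want to keep that.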

Don’t forget any operation which can increase the size of a container would also have to have safe variants! :smiley:

You could just do a reserve_safe

But then you’d have significantly worse ergonomics by having to preallocate all memory.
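
To illustrate the ergonomics cost, here is a small sketch that assumes a reserve-style fallible API is the only one available. It uses `Vec::try_reserve` from today's standard library as a stand-in for the proposed `reserve_safe`; `push_safe` and `squares` are hypothetical helpers.

```rust
use std::collections::TryReserveError;

/// Reserves room for one more element, then pushes it.
/// The push cannot reallocate because the capacity was reserved first.
fn push_safe<T>(v: &mut Vec<T>, item: T) -> Result<(), TryReserveError> {
    v.try_reserve(1)?;
    v.push(item);
    Ok(())
}

/// Every growth step needs an explicit fallible call (here hidden in
/// `push_safe`), and the error has to be threaded through the return type.
fn squares(n: u32) -> Result<Vec<u32>, TryReserveError> {
    let mut out = Vec::new();
    for i in 0..n {
        push_safe(&mut out, i * i)?;
    }
    Ok(out)
}
```

The alternative is to reserve everything up front, which is only possible when the final size is known in advance.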

This is true, this is true.

Are you talking about x86-64 Linux? Because this is exactly what happens. Memory requests by applications (via private mmap or sbrk) don't fail, since they're satisfied by mapping the zero page over the entire range. When the application tries to write to one of these, the page fault handler remaps to a free page. If it can't find one, it triggers the OOM killer. This is the principle of overcommit.

I don't remember ever seeing malloc fail on Linux.

In general, the greediest process is killed, which might well be the one triggering the killer.

Oh right, overcommit makes everything awful on Linux, I forgot about that. On Windows anyway, once you've committed some memory you have a solid guarantee that accessing it won't trigger an OOM killer. The point is that when the OOM killer is triggered there is really nothing your application can do, so Rust doesn't need to concern itself with it. Rust only needs to be concerned with the situation where the allocation function refuses to allocate some memory.

My personal view is that while dying on OOM is a good idea for many programs, there are also many cases for being able to recover from it:

  • On Windows, SQL Server routinely runs out of memory during normal operation. That’s why the CLR (which SQL Server hosts) must always report OOM specially. As I understand it, Windows does not handle paging very well, so the best way to write databases and other programs that do a lot of caching is to allocate lots of RAM for caches and drop some caches on OOM (but I could be wrong).
  • A unikernel (monolithic, single-purpose kernel that runs in a VM and contains an entire application, such as MirageOS) can always survive OOM if programmed to do so, since it runs in a VM with fixed memory.
  • Database servers often run in low-memory environments because of extensive caching.
  • Even on Linux, the OOM killer can be turned off, or a program can be run with a memory limit. Some programs are probably designed for this.
  • Many managed runtimes have documented behaviors on OOM other than aborting the process. CoreCLR throws an OutOfMemoryException and Java throws an OutOfMemoryError.
  • On systems that do allow for OOM to be caught, I would certainly expect an editor to save the documents being edited to backup files.

There are APIs in Windows specifically for marking pages of memory as cache that can be discarded in low memory situations. There are tons of APIs for managing memory in every which way, it is just up to the application to take advantage of those APIs.