
The only real solution (at least until and unless the kernel OOM killer is tuned to be massively more aggressive, which I doubt will happen) is to run a userspace OOM killer. If you don't like systemd-oomd, there are many alternatives, some of which even show a desktop notification when you're dangerously low on memory and when they actually kill processes.
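
For anyone curious what such a daemon boils down to, here's a minimal sketch: poll /proc/meminfo and SIGKILL the process the kernel itself rates worst (highest /proc/<pid>/oom_score) once MemAvailable drops below a threshold. The 512 MiB floor and 1 s poll interval are arbitrary assumptions on my part; real daemons like earlyoom are considerably more careful:

    /* Minimal userspace OOM-killer sketch. Error handling is mostly
       omitted; a real daemon would also mlockall() itself so it can
       still run when the system is thrashing. */
    #include <dirent.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    static long mem_available_kib(void) {
        FILE *f = fopen("/proc/meminfo", "r");
        char line[256];
        long kib = -1;
        while (f && fgets(line, sizeof line, f))
            if (sscanf(line, "MemAvailable: %ld kB", &kib) == 1)
                break;
        if (f) fclose(f);
        return kib;
    }

    /* Pick the process the kernel itself considers the worst offender. */
    static pid_t worst_process(void) {
        DIR *d = opendir("/proc");
        struct dirent *e;
        pid_t worst = -1;
        long worst_score = -1;
        while (d && (e = readdir(d))) {
            pid_t pid = (pid_t)atoi(e->d_name);
            if (pid <= 1) continue;  /* skip non-PID entries and init */
            char path[64];
            long score;
            snprintf(path, sizeof path, "/proc/%d/oom_score", pid);
            FILE *f = fopen(path, "r");
            if (f && fscanf(f, "%ld", &score) == 1 && score > worst_score) {
                worst_score = score;
                worst = pid;
            }
            if (f) fclose(f);
        }
        if (d) closedir(d);
        return worst;
    }

    int main(void) {
        const long threshold_kib = 512 * 1024;  /* assumption: 512 MiB floor */
        for (;;) {
            long avail = mem_available_kib();
            if (avail >= 0 && avail < threshold_kib) {
                pid_t victim = worst_process();
                if (victim > 0) kill(victim, SIGKILL);
            }
            sleep(1);  /* the polling the parent comment would rather avoid */
        }
    }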

It might be interesting to explore better kernel APIs for things like userspace OOM killers: ideally, we'd want to guarantee that the userspace OOM killer is always prioritized in low-memory situations, and ideally it'd be possible to install low-memory event listeners into the kernel rather than having to poll.
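
The event-listener part actually exists in recent kernels: PSI (since 4.20, with triggers since 5.2) lets you register a memory-pressure trigger on /proc/pressure/memory and sleep in poll() until the kernel fires it, instead of busy-polling. Rough sketch, with arbitrary thresholds (150 ms of stall time per 1 s window):

    /* PSI memory-pressure trigger: ask the kernel to wake us when some
       tasks are stalled on memory for >= 150ms within any 1s window. */
    #include <fcntl.h>
    #include <poll.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("/proc/pressure/memory", O_RDWR | O_NONBLOCK);
        if (fd < 0) { perror("open"); return 1; }

        /* "some 150000 1000000" = 150000us of stall per 1000000us window */
        const char trig[] = "some 150000 1000000";
        if (write(fd, trig, strlen(trig) + 1) < 0) { perror("write"); return 1; }

        struct pollfd pfd = { .fd = fd, .events = POLLPRI };
        for (;;) {
            if (poll(&pfd, 1, -1) < 0) { perror("poll"); return 1; }
            if (pfd.revents & POLLERR) { fprintf(stderr, "trigger gone\n"); return 1; }
            if (pfd.revents & POLLPRI)
                puts("memory pressure event: free something or kill something");
        }
    }

This is what systemd-oomd builds on; the "guarantee the killer itself survives" half of the wish is still mostly up to mlockall() and oom_score_adj.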



I disagree. The real solution is to do away with the need for OoM killer in the first place by turning off overcommit (in its current form anyway) and fixing the broken programs that rely on its behavior.
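
For concreteness: the knob for this already exists as vm.overcommit_memory=2 (strict accounting), where the commit limit is swap plus overcommit_ratio percent of RAM, and malloc fails up front instead of the OOM killer firing later. Normally you'd just run `sysctl vm.overcommit_memory=2`; the same thing via /proc, as a sketch (root required, ratio value illustrative):

    /* Equivalent of `sysctl vm.overcommit_memory=2 vm.overcommit_ratio=80`.
       Mode 2 = strict accounting: CommitLimit = swap + 80% of RAM, and
       allocations beyond it fail immediately rather than being overcommitted. */
    #include <stdio.h>

    static int write_proc(const char *path, const char *val) {
        FILE *f = fopen(path, "w");
        if (!f) return -1;
        int ok = (fputs(val, f) >= 0);
        return (fclose(f) == 0 && ok) ? 0 : -1;
    }

    int main(void) {
        if (write_proc("/proc/sys/vm/overcommit_memory", "2") != 0 ||
            write_proc("/proc/sys/vm/overcommit_ratio", "80") != 0) {
            perror("writing vm sysctls (root required)");
            return 1;
        }
        return 0;
    }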


In practice, programs crash when they receive a null pointer from malloc (whether by throwing an uncaught exception, by `if (!ptr) { abort(); }`, or by dereferencing the null pointer). So even if your solution were realistic, it would still amount to killing an essentially random process: whichever one happened to allocate after memory ran out. When you reach OOM situations and need to kill processes, there are probably better heuristics than "kill whichever process happened to allocate memory after we ran out".
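
To see why the failure point is so arbitrary, here's a toy demo of the current default: allocate 1 GiB blocks without ever touching them and count how many succeed. On a default-configured 64-bit box the loop typically sails far past physical RAM (it usually stops on address-space or VMA-count limits, not memory); exact numbers vary by system:

    /* Overcommit demo: allocate 1 GiB blocks and never touch them.
       Untouched anonymous pages cost (almost) nothing, so the default
       heuristic keeps saying yes; with vm.overcommit_memory=2 the loop
       would stop near the real commit limit instead. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        size_t gib = 0;
        for (;;) {
            void *p = malloc(1024UL * 1024 * 1024);
            if (!p) break;  /* the "whoever allocates next" failure point */
            gib++;          /* deliberately leaked and never written */
        }
        printf("malloc said yes to %zu GiB before failing\n", gib);
        return 0;
    }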


So we're covering up bad programmer behavior by lying to them and then shooting other programs in the head when everything goes south. By default. Sorry if I feel like we should be able to do better than that.

Windows, for instance, doesn't have this kind of overcommit. Allocations need to be backed by RAM+pagefile (though Windows will grow the page file if it can and needs to).
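
Concretely, on Windows a commit either gets charged against the commit limit (RAM + pagefile) up front or the allocation call itself fails synchronously; nothing gets killed later. A sketch (the exact error code varies; commit failures typically surface as ERROR_NOT_ENOUGH_MEMORY or ERROR_COMMITMENT_LIMIT):

    /* Windows commit-charge model: MEM_COMMIT either charges the whole
       range against RAM + pagefile or fails right here. */
    #include <windows.h>
    #include <stdio.h>

    int main(void) {
        SIZE_T sz = (SIZE_T)1 << 40;  /* 1 TiB, presumably over the commit limit */
        void *p = VirtualAlloc(NULL, sz, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
        if (!p) {
            printf("commit refused, GetLastError() = %lu\n", GetLastError());
            return 1;
        }
        VirtualFree(p, 0, MEM_RELEASE);
        return 0;
    }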


Things go very south on Windows if you ever use a lot of swap. I ran into this with a CAM process that used heavy swap.

You barely move the barrier: eventually the backing pagefile cannot grow any further (with a fast but small NVMe drive, that point can be reached within a few seconds). After that you get things like a Task Manager without fonts, because there is no memory left to load them...


Yeah, so, worst case it is exactly like Linux. However, at least on Windows, properly behaving software isn't being lied to about its allocations.


My argument was that _even without overcommit or swap_, we would probably want some kind of OOM killer, because "kill the process which happens to allocate first after we have run out" is probably among the worst heuristics for choosing which process to kill.


Except that's up to the application to behave that way, instead of some mysterious heuristic. If the programmer decides that terminating is the appropriate behavior, or is too lazy to do otherwise, then that is on the program. If programs have broken behavior, fix them; otherwise, what is the point of all this open source software anyway? I find it incredible that Linux developers are expected to routinely deal with API breakage on library updates, yet would rather have random processes terminated because the OS lies about memory than fix the badly behaving software!


And instead force every application which needs to store large amounts of temporary data to implement its own swapping mechanism? Don't you think we will end up with a lot of even less optimized swapping systems that way?
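
To be fair, on Linux an application doesn't need a full swapping mechanism: it can put bulk temporary data in a file-backed MAP_SHARED mapping, and the kernel will page it out to the app's own file under pressure rather than to system swap. Rough sketch (path and size are placeholder choices):

    /* App-managed "swap": back bulk temporary data with a private file
       so the kernel can write dirty pages there under memory pressure,
       independent of system swap. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void) {
        size_t sz = 8UL << 30;  /* 8 GiB of scratch space */
        int fd = open("/var/tmp/scratch.bin", O_RDWR | O_CREAT | O_TRUNC, 0600);
        if (fd < 0 || ftruncate(fd, (off_t)sz) != 0) { perror("backing file"); return 1; }

        /* MAP_SHARED makes the file the backing store: evicted dirty
           pages are written back to it rather than to system swap. */
        char *buf = mmap(NULL, sz, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (buf == MAP_FAILED) { perror("mmap"); return 1; }

        buf[0] = 42;  /* use buf like ordinary memory */

        munmap(buf, sz);
        close(fd);
        unlink("/var/tmp/scratch.bin");
        return 0;
    }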


If you need swap, you need swap... You shouldn't borrow it from other programs' executable pages (without proper accounting).


IMO, a good start would be to fix the OOM killer to kill overcommitting processes first, ordered by size of overcommit (maybe matching the uid first).
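
As a userspace approximation of that ordering, you can scan /proc and rank processes by VmSize minus VmRSS. It's only a crude proxy for "overcommit" (it also counts file mappings and legitimate untouched reservations), but it shows the shape of the heuristic:

    /* Print a rough per-process overcommit proxy: address space held
       beyond what is resident, from /proc/<pid>/status. */
    #include <dirent.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        DIR *d = opendir("/proc");
        struct dirent *e;
        while (d && (e = readdir(d))) {
            int pid = atoi(e->d_name);
            if (pid <= 0) continue;  /* skip non-PID entries */
            char path[64], line[256];
            long vmsize = 0, vmrss = 0;
            snprintf(path, sizeof path, "/proc/%d/status", pid);
            FILE *f = fopen(path, "r");
            while (f && fgets(line, sizeof line, f)) {
                sscanf(line, "VmSize: %ld kB", &vmsize);
                sscanf(line, "VmRSS: %ld kB", &vmrss);
            }
            if (f) fclose(f);
            if (vmsize > vmrss)
                printf("%d overcommitted-ish by %ld kB\n", pid, vmsize - vmrss);
        }
        if (d) closedir(d);
        return 0;
    }

Sort the output numerically to get the ordering; a real implementation would also want the per-uid grouping suggested above.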



