Quote: Avoiding the OOM Killer

Submitted by Jeremy
on December 17, 2007 - 6:50pm

"Any time the OOM killer fires, something's wrong with the system, and it's more productive to deal with that than to wish for a more accurate OOM killer."

Throttling memory allocation/paging

on
December 19, 2007 - 3:53pm

Has there been any research into alternative approaches, such as throttling processes that are allocating memory too quickly or inducing paging too often?

As has been said, the OOM killer isn't there to fix a broken system, it's there so that a broken system would be recoverable/debuggable. Currently, by the time the OOM killer kicks in, the average desktop user will have lost all patience and already flicked the reset switch, because one process was allowed to push others completely out of RAM.

And those other processes were pushed out because they were using said resources rarely compared to the broken application. So, one could say that Linux currently lets broken applications crowd out everything else, by not giving anything else a chance to work anymore.

But in the bad scenario, the reverse makes much more sense -- penalizing memory hogs, not interactive processes that use resources sparingly.

For instance, if an application starts allocating memory and swapping at an insane rate, it makes little sense to swap out/discard X11's pages as a result, because (a) the amount of recoverable memory is probably small; (b) X11 is absolutely critical in allowing the user to terminate the misbehaving process in the first place.

The alternative is to let the misbehaving process run into the swap as much as it wants, while still keeping the working sets of well-behaving processes in RAM.

Anyway, this is just an idea, I'm not sure how applicable it is in real life.
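A minimal sketch of how a process that induces paging too often might be spotted from user space, assuming a Linux /proc filesystem: the major-fault counter in /proc/<pid>/stat counts faults that had to go to disk. The function names and the one-second sampling interval are illustrative assumptions, and this only measures; it does not throttle anything.

```python
import time


def parse_fault_counts(stat_line):
    """Return (minor_faults, major_faults) from one /proc/<pid>/stat line."""
    # The command name sits in parentheses and may contain spaces,
    # so split after the last ')' before counting fields.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is the state (field 3); minflt is field 10, majflt is field 12.
    return int(rest[7]), int(rest[9])


def major_fault_rate(pid, interval=1.0):
    """Major faults per second for pid, sampled over `interval` seconds."""
    def read_major():
        with open(f"/proc/{pid}/stat") as f:
            return parse_fault_counts(f.read())[1]

    before = read_major()
    time.sleep(interval)
    return (read_major() - before) / interval
```

A consistently high rate for one process while everything else stalls would be a hint that it is the one pushing other working sets out of RAM.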

Any time the OOM killer

Anonymous (not verified)
on
December 17, 2007 - 9:16pm

Any time the OOM killer fires, something's wrong with the system, and ... the system will probably take minutes to respond to each key press, which is exactly why the OOM killer was created in the first place - to free up the largest memory hog so someone can begin figuring out what went wrong.

I'd recommend monitoring/logging/profiling/tracing the memory hog *before* it transforms the system into molasses.
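As a rough illustration of that kind of monitoring, here is a minimal sketch that lists the processes with the largest resident set size by reading /proc/<pid>/status. It assumes a Linux /proc filesystem; the function names are made up for this example, and a real setup would log the result periodically rather than print it once.

```python
import os


def parse_status(status_text):
    """Return (name, rss_kb) from /proc/<pid>/status text.

    rss_kb is None when there is no VmRSS line (e.g. kernel threads).
    """
    name, rss_kb = None, None
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key == "Name":
            name = value.strip()
        elif key == "VmRSS":
            rss_kb = int(value.split()[0])
    return name, rss_kb


def top_memory_users(limit=5, proc_root="/proc"):
    """Return up to `limit` tuples (rss_kb, pid, name), largest first."""
    usage = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                name, rss_kb = parse_status(f.read())
        except OSError:
            continue  # process exited or is not readable
        if rss_kb is not None:
            usage.append((rss_kb, int(entry), name))
    usage.sort(reverse=True)
    return usage[:limit]
```

Calling top_memory_users() from a cron job or a small loop and appending the output to a log would leave a trail to inspect after the machine has recovered.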

oom doesn't help

on
December 17, 2007 - 10:04pm

The problem is that the OOM doesn't actually help fix my system once it goes to molasses. I mean, if my choices are (a) system so slow because Firefox sucks weiner that I can't use my actually important apps or (b) OOM killer kicks in because Firefox sucks weiner and my actually important apps get killed, then I'm going to choose (c) just reboot the computer and be ashamed that my roommate's XP system never, ever has this kind of stupid-ass problem.

Quite simply, Linux shouldn't ever get slow as molasses, and the OOM killer shouldn't even exist. No single user-space app should be able to effectively kill my desktop by way of swapping everything out to disk.

memory management issues?

Anonymous (not verified)
on
December 31, 2007 - 11:03pm

I don't know what sort of system you have, but in all my linux usage, I think I've seen the OOM killer in action once - there was a bad java app that wanted to just suck up all the memory as fast as possible. But in day to day linux usage, I just don't see the OOM killer, ever - and I use linux all day every day at work, and at home.

Sure, I've seen firefox go catatonic - but I haven't seen it trigger the OOM killer yet. I restart firefox and life goes on.

BTW, expee "never ever" has memory problems? ROTFLMAO! Good thing I wasn't drinking coffee or the keyboard would have been sprayed. I've heard too many expee horror stories to believe that hype!

System resource exhaustion not just for Linux

vomlehn (not verified)
on
December 18, 2007 - 3:10pm

Actually, it is possible to get your XP system to run as slow as molasses--it's simply a matter of exhausting a critical resource. Critical resources are generally CPU time, memory, and/or I/O bandwidth. The OOM killer is only concerned with memory. The OOM killer is not intended to fix your system when you are out of memory. Rather, it is intended to free up enough memory so you can interact with the system to see what's going on.

You're very misinformed that

Anonymous (not verified)
on
December 18, 2007 - 12:30pm

You're very misinformed that XP systems don't have that problem. Memory isn't magical, it doesn't grow on trees. Completely exhaust the memory on _any_ system and it will "get slow as molasses", grind to a halt or crash.

I see this all too frequently when Visual Studio sucks up 600MB on this machine with 1GB of RAM. It starts paging so bad that it can take a good 20 seconds just to get IE to come to focus after being minimized. The biggest difference with XP is that it will (under default configuration) keep autoextending your paging (swap) file for you. Maybe removing any need for a true OOM killer, but not exactly fixing the problem either. Drive XP hard enough and it will grind so hard it appears locked up.

So really you should fix your system. Add more memory, throw in more swap, set ulimits on known memory hogs. Or just disable the OOM:

echo 0 > /proc/sys/vm/oom-kill

or

vm.oom-kill = 0

in /etc/sysctl.conf
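For the "set ulimits on known memory hogs" option, a minimal sketch of launching a program under an address-space cap, assuming Linux and Python's standard resource module. The helper names and the 256 MiB figure are illustrative assumptions. Note that RLIMIT_AS caps virtual address space, not resident memory, and it only covers memory the process itself maps.

```python
import resource
import subprocess
import sys


def _limit_address_space(limit_bytes):
    """Build a callable that lowers RLIMIT_AS in the child before exec."""
    def apply():
        _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
        # Never ask for more than the existing hard limit allows.
        if hard == resource.RLIM_INFINITY:
            new_soft = limit_bytes
        else:
            new_soft = min(limit_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (new_soft, hard))
    return apply


def run_with_memory_limit(cmd, limit_bytes, **kwargs):
    """Run cmd with its virtual address space capped at limit_bytes."""
    return subprocess.run(
        cmd, preexec_fn=_limit_address_space(limit_bytes), **kwargs
    )


def demo_hog(limit_bytes=256 * 1024 * 1024):
    """Hypothetical hog: a child Python that tries to allocate 1 GiB."""
    hog = [sys.executable, "-c", "x = bytearray(1 << 30)"]
    result = run_with_memory_limit(hog, limit_bytes, stderr=subprocess.DEVNULL)
    return result.returncode
```

With the cap in place the allocation fails inside the child, which exits with an error instead of dragging the rest of the system into swap.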

read again

on
December 22, 2007 - 3:16pm

I don't think you read what I said. My roommate's system never, ever has this problem. I don't care if it's possible for XP to run out of memory in theory, because it never fucking happens on his machine, and he runs just as much crap (including Firefox) as I do. My roommate has been using XP with no virus scanner or other bloat for longer than this particular install of Ubuntu has been here (1.5 years), and he's not even once had a system crash, a locked up desktop, a virus or other malware, or any problems other than some trouble getting our printer working (and surprise, Linux can't print to it correctly either, it always cuts off the top 1" of every page, due to hpijs driver bugs that nobody's fixed in a year). If you aren't a jackass moron who installs random crap off the web and you use hardware with stable drivers, XP works faster and more stably than any Linux desktop I have ever had. I don't and won't use it, but I'll tell you right now I get _real_ jealous sometimes thinking of all the unstable crap I have to put up with on Linux.

Well, I have used more than

NyB (not verified)
on
December 28, 2007 - 8:08am

Well, I have used more than four or five XP systems where an external USB drive doing continuous I/O for more than a couple of minutes will kill any and all network connections. At least three of those had completely different hardware (CPU, chipset, USB host chip, network controller, etc.) - the only common factor was the OS. I guess everyone's experience varies. I do find it interesting that most people mention Firefox as a main culprit for their problems - next thing we know Firefox will have a --timedemo option, right next to a --showfps one...

Have you ever thought about

Anonymous (not verified)
on
December 23, 2007 - 4:31am

Have you ever thought about installing XP? Unless you develop for Linux, it's definitely worth it, even if you do have to pay some cash for it. Heck, even if I did develop for Linux, I would consider running VMWare under XP with a few different Linux VMs. Linux works perfectly in my house, but only in a server/router role. I have one desktop that has Linux installed on it, but it is a second PC in my bedroom that I just use to screw around on -- and it is great for that.

Then, configure your system

mangoo (not verified)
on
December 18, 2007 - 8:34am

Then, configure your system accordingly and don't allow these apps to grow beyond limits in memory.

think again

on
December 22, 2007 - 3:23pm

Awesome idea, too bad that isn't even remotely possible. Let's assume that I already set my user process limits to 512MiB on this machine, which has 2GiB of RAM. Awesome, now Firefox can at most consume only 1/4th of my machine's memory, right? Well, no, actually, that's not true at all. Because, see, with the great architecture we call X, all of a process's pixmap data isn't actually owned by the process, it's owned by the X process. So, after browsing for long enough on photo gallery sites or other media-heavy sites (like, say, the ones I work on for a living), Firefox is still using less than that 512MiB limit but X is now chewing up gigabytes. Take out the 512MiB of that from my graphics card and you still have a memory problem. One which eventually causes the machine to go into swap-death.

Sure, I could set a limit on X, but then when that limit is reached you either end up with X dying (which is for all intents and purposes no different than just rebooting the machine, since all of your apps and working data go bye-bye) or with X no longer able to allocate more pixmap memory, which means all of your other apps still become dead since most interesting apps do a lot of pixmap allocation, even for simple things like text glyphs.

So, until there is some way for X to tell the kernel that some amount of internal memory should count toward a process's memory limits, there is actually no feasible way to limit the actual amount of memory a process causes your system to consume. Added to the pixmap leaks in Firefox and even the occasional pixmap leak in X, it's only a matter of time before your system runs out of memory if you don't restart Firefox regularly.

X server and resource limits

on
December 30, 2007 - 3:30pm

Sure, I could set a limit on X, but then when that limit is reached you either end up with X dying [...] or with X no longer able to allocate more pixmap memory,

X should never die because of low resources. If it does, that's a very serious bug, and one that you should immediately report to X.Org.

As you justly note, when it's out of memory, X will reject new allocations by returning a BadAlloc error to client applications. Ideally, the X server itself should be able to enforce per-client resource limits. Most of the work needed to do that was done by Mark Vojkovich a few years ago (have a look at the XRes extension); all that's left is just a little bit of hacking.

--Juliusz

Not possible?

on
December 25, 2007 - 10:30pm

See the bottom of this page:

http://www.win.tue.nl/~aeb/linux/lk/lk-9.html

To Quote:

Since 2.1.27 [...] proc file /proc/sys/vm/overcommit_memory [...]

Since 2.5.30 the values are: 0 (default): as before: guess about how much overcommitment is reasonable, 1: never refuse any malloc(), 2: be precise about the overcommit - never commit a virtual address space larger than swap space plus a fraction overcommit_ratio of the physical memory. Here /proc/sys/vm/overcommit_ratio (by default 50) is another user-settable parameter. [...] (See also Documentation/vm/overcommit-accounting.)
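To make mode 2 concrete, here is a small worked sketch of the limit described in the quote: swap space plus overcommit_ratio percent of physical memory. The function names are illustrative, and the formula is a simplification that ignores huge pages, which the kernel excludes from the RAM figure.

```python
def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio=50):
    """Approximate commit limit in kB when overcommit_memory is 2."""
    return swap_kb + ram_kb * overcommit_ratio // 100


def read_meminfo_kb(path="/proc/meminfo"):
    """Parse /proc/meminfo into a dict of integer values (mostly kB)."""
    values = {}
    with open(path) as f:
        for line in f:
            key, _, rest = line.partition(":")
            fields = rest.split()
            if fields:
                values[key] = int(fields[0])
    return values
```

For a hypothetical machine with 2 GiB of RAM and 1 GiB of swap at the default ratio of 50, commit_limit_kb(2097152, 1048576) gives 2097152 kB, so allocations beyond roughly 2 GiB of committed address space would be refused. On a live system the inputs could come from the MemTotal and SwapTotal entries returned by read_meminfo_kb().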

Linux Memory

Anonymous (not verified)
on
December 26, 2007 - 10:02pm

Thanks for the link, it's really interesting.

I thought everyone had given

Nony mouse (not verified)
on
December 17, 2007 - 9:54pm

I thought everyone had given the OOM killer up as a bad idea.