Signaling When Out of Memory

Submitted by Jeremy
on October 19, 2007 - 9:29am

The former 2.4 Linux kernel maintainer, Marcelo Tosatti, resurrected a discussion on adding support for out-of-memory notifications to the Linux kernel. He explained, "AIX contains the SIGDANGER signal to notify applications to free up some unused cached memory," and noted, "there have been a few discussions on implementing such an idea on Linux, but nothing concrete has been achieved." In his request for comments, Marcelo added, "on the kernel side Rik suggested two notification points: 'about to swap' (for desktop scenarios) and 'about to OOM' (for embedded-like scenarios)." Rik van Riel explained:

"The first threshold - 'we are about to swap' - means the application frees memory that it can. Eg. free()d memory that glibc has not yet given back to the kernel, or JVM running the garbage collector, or ...

"The second threshold - 'we are out of memory' - means that the first approach has failed and the system needs to do something else. On an embedded system, I would expect some application to exit or maybe restart itself."

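To make the two thresholds concrete, here is a minimal sketch of how an application might react to each one. It is an illustration only, not code from the thread: Linux has no SIGDANGER, so SIGUSR1 and SIGUSR2 stand in for the two hypothetical notifications, and the `state` dictionary and handler names are invented for the example.

```python
import signal

# Hypothetical stand-ins: Linux has no SIGDANGER, so SIGUSR1 and SIGUSR2
# play the roles of "about to swap" and "about to OOM" in this sketch.
SIG_ABOUT_TO_SWAP = signal.SIGUSR1
SIG_ABOUT_TO_OOM = signal.SIGUSR2

# Application state: a cache that can be rebuilt later, plus a shutdown flag.
state = {
    "cache": {"thumbnails": b"x" * 1024},
    "shutdown_requested": False,
}


def on_about_to_swap(signum, frame):
    """First threshold: give back whatever memory can be rebuilt later."""
    state["cache"].clear()


def on_about_to_oom(signum, frame):
    """Second threshold: freeing was not enough, so prepare to exit."""
    state["shutdown_requested"] = True


def install_handlers():
    """Register both handlers; must be called from the main thread."""
    signal.signal(SIG_ABOUT_TO_SWAP, on_about_to_swap)
    signal.signal(SIG_ABOUT_TO_OOM, on_about_to_oom)
```

Calling `install_handlers()` and then sending the process SIGUSR1 would exercise the first path. A real application would check `state["shutdown_requested"]` in its main loop and exit or restart itself, as Rik describes for embedded systems.
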
From: Marcelo Tosatti <marcelo@...>
Subject: OOM notifications
Date: Oct 18, 4:25 pm 2007

Hi,

AIX contains the SIGDANGER signal to notify applications to free up some
unused cached memory:

http://www.ussg.iu.edu/hypermail/linux/kernel/0007.0/0901.html

There have been a few discussions on implementing such an idea on Linux,
but nothing concrete has been achieved.

On the kernel side Rik suggested two notification points: "about to
swap" (for desktop scenarios) and "about to OOM" (for embedded-like
scenarios).

With that assumption in mind it would be necessary to either have two
special devices for notification, or somehow indicate both events
through the same file descriptor.

Comments are more than welcome.

-
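
Marcelo's second option, indicating both events through the same file descriptor, could look roughly like the sketch below from the application's side. Everything here is an assumption made for illustration: the one-byte event codes are invented, and an ordinary pipe stands in for the notification device, since no such kernel interface is specified in the message.

```python
import os
import select

# Hypothetical one-byte event codes; no real kernel interface is implied.
EVENT_ABOUT_TO_SWAP = b"S"
EVENT_ABOUT_TO_OOM = b"O"


def wait_for_event(fd, timeout_s=1.0):
    """Wait until fd is readable, then return one event code.

    Returns None if nothing arrives within the timeout.
    """
    readable, _, _ = select.select([fd], [], [], timeout_s)
    if not readable:
        return None
    return os.read(fd, 1)


# A pipe stands in for the notification device; the write end plays the
# role of the kernel posting events.
read_fd, write_fd = os.pipe()
os.write(write_fd, EVENT_ABOUT_TO_SWAP)
os.write(write_fd, EVENT_ABOUT_TO_OOM)

first = wait_for_event(read_fd)    # the "about to swap" code
second = wait_for_event(read_fd)   # the "about to OOM" code
```

With a single descriptor the application needs only one entry in its event loop and can tell the two events apart by the code it reads.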


From: Rene Herman <rene.herman@...>
Subject: Re: OOM notifications
Date: Oct 18, 4:38 pm 2007

On 10/18/2007 10:25 PM, Marcelo Tosatti wrote:

> AIX contains the SIGDANGER signal to notify applications to free up some
> unused cached memory:
>
> http://www.ussg.iu.edu/hypermail/linux/kernel/0007.0/0901.html
>
> There have been a few discussions on implementing such an idea on Linux,
> but nothing concrete has been achieved.
>
> On the kernel side Rik suggested two notification points: "about to
> swap" (for desktop scenarios) and "about to OOM" (for embedded-like
> scenarios).
>
> With that assumption in mind it would be necessary to either have two
> special devices for notification, or somehow indicate both events
> through the same file descriptor.
>
> Comments are more than welcome.

Given the desktop/embedded distinction you made, do you need both scenarios
active at the same time? If not, it seems something like a

echo -n >/proc/sys/vm/danger

could do with just one sigdanger notification point? (with suitably
defined as or in terms of the used threshold value).

Rene.
-
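
Rene's idea of a single, operator-set level can be sketched as a simple threshold check. This is only a sketch under stated assumptions: /proc/sys/vm/danger does not exist, the message does not define what the threshold value means, and this example arbitrarily treats it as a percentage of free memory computed from /proc/meminfo-style text. The `MemFree` figure is a crude proxy; a real implementation would also have to account for reclaimable cache.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[name.strip()] = int(fields[0])
    return values


def below_danger_level(meminfo, danger_percent):
    """True when free memory has dropped under the operator-set percentage."""
    free_percent = 100.0 * meminfo["MemFree"] / meminfo["MemTotal"]
    return free_percent < danger_percent


# Sample input so the sketch runs without reading the real /proc/meminfo.
SAMPLE = """MemTotal:        1024000 kB
MemFree:           40960 kB
"""
info = parse_meminfo(SAMPLE)   # 40960 / 1024000 is 4% free
```

With this sample, a danger level of 5 percent would trigger the single notification, while a level of 3 percent would not.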


From: Rik van Riel <riel@...>
Subject: Re: OOM notifications
Date: Oct 18, 4:52 pm 2007

On Thu, 18 Oct 2007 22:38:21 +0200
Rene Herman wrote:

> On 10/18/2007 10:25 PM, Marcelo Tosatti wrote:
>
> > AIX contains the SIGDANGER signal to notify applications to free up
> > some unused cached memory:
> >
> > http://www.ussg.iu.edu/hypermail/linux/kernel/0007.0/0901.html
> >
> > There have been a few discussions on implementing such an idea on
> > Linux, but nothing concrete has been achieved.
> >
> > On the kernel side Rik suggested two notification points: "about to
> > swap" (for desktop scenarios) and "about to OOM" (for embedded-like
> > scenarios).
> >
> > With that assumption in mind it would be necessary to either have
> > two special devices for notification, or somehow indicate both
> > events through the same file descriptor.
> >
> > Comments are more than welcome.
>
> Given the desktop/embedded distinction you made, do you need both
> scenarios active at the same time? If not, it seems something like a
>
> echo -n >/proc/sys/vm/danger
>
> could do with just one sigdanger notification point? (with
> suitably defined as or in terms of the used threshold value).

If you do that, how are applications to know which of the two
scenarios is happening when they get a signal?

--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan
-


From: Rene Herman <rene.herman@...>
Subject: Re: OOM notifications
Date: Oct 18, 5:06 pm 2007

On 10/18/2007 10:52 PM, Rik van Riel wrote:
> On Thu, 18 Oct 2007 22:38:21 +0200
> Rene Herman wrote:
>
>> On 10/18/2007 10:25 PM, Marcelo Tosatti wrote:
>>
>>> AIX contains the SIGDANGER signal to notify applications to free up
>>> some unused cached memory:
>>>
>>> http://www.ussg.iu.edu/hypermail/linux/kernel/0007.0/0901.html
>>>
>>> There have been a few discussions on implementing such an idea on
>>> Linux, but nothing concrete has been achieved.
>>>
>>> On the kernel side Rik suggested two notification points: "about to
>>> swap" (for desktop scenarios) and "about to OOM" (for embedded-like
>>> scenarios).
>>>
>>> With that assumption in mind it would be necessary to either have
>>> two special devices for notification, or somehow indicate both
>>> events through the same file descriptor.
>>>
>>> Comments are more than welcome.
>> Given the desktop/embedded distinction you made, do you need both
>> scenarios active at the same time? If not, it seems something like a
>>
>> echo -n >/proc/sys/vm/danger
>>
>> could do with just one sigdanger notification point? (with
>> suitably defined as or in terms of the used threshold value).
>
> If you do that, how are applications to know which of the two
> scenarios is happening when they get a signal?

They don't -- that's why I asked if you need both scenarios active at the
same time. SIGDANGER would just be SIGPLEASEFREEALLYOUCAN with the operator
deciding through setting the level at which point applications get it.

Or put differently; what's the additional value of notifying an application
that the system is about to go ballistic when you've already asked it to free
all it could earlier? SIGSEEDAMNITITOLDYOUSO?

Don't get me wrong; never saw this discussion earlier, may be sensible...

Rene.

-


From: Rik van Riel <riel@...>
Subject: Re: OOM notifications
Date: Oct 18, 5:18 pm 2007

On Thu, 18 Oct 2007 23:06:52 +0200
Rene Herman wrote:

> They don't -- that's why I asked if you need both scenarios active
> at the same time. SIGDANGER would just be SIGPLEASEFREEALLYOUCAN with
> the operator deciding through setting the level at which point
> applications get it.
>
> Or put differently; what's the additional value of notifying an
> application that the system is about to go ballistic when you've
> already asked it to free all it could earlier? SIGSEEDAMNITITOLDYOUSO?

The first threshold - "we are about to swap" - means the application
frees memory that it can. Eg. free()d memory that glibc has not yet
given back to the kernel, or JVM running the garbage collector, or ...

The second threshold - "we are out of memory" - means that the first
approach has failed and the system needs to do something else. On an
embedded system, I would expect some application to exit or maybe
restart itself.

--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan
-


From: Rene Herman <rene.herman@...>
Subject: Re: OOM notifications
Date: Oct 18, 6:01 pm 2007

On 10/18/2007 11:18 PM, Rik van Riel wrote:

> On Thu, 18 Oct 2007 23:06:52 +0200
> Rene Herman wrote:
>
>> They don't -- that's why I asked if you need both scenarios active
>> at the same time. SIGDANGER would just be SIGPLEASEFREEALLYOUCAN with
>> the operator deciding through setting the level at which point
>> applications get it.
>>
>> Or put differently; what's the additional value of notifying an
>> application that the system is about to go ballistic when you've
>> already asked it to free all it could earlier? SIGSEEDAMNITITOLDYOUSO?
>
> The first threshold - "we are about to swap" - means the application
> frees memory that it can. Eg. free()d memory that glibc has not yet
> given back to the kernel, or JVM running the garbage collector, or ...
>
> The second threshold - "we are out of memory" - means that the first
> approach has failed and the system needs to do something else. On an
> embedded system, I would expect some application to exit or maybe
> restart itself.

That first threshold sounds fine yes. To me, the second mostly sounds like a
job for SIGTERM though.

The OOM killer could after it selected the task for killing first try a TERM
on it to give a chance to exit gracefully and only when that doesn't help
make it eligible for killing on a second round through the badness calculation.

You could moreover _never_ make a task eligible for killing before it
received a SIGTERM, thereby guaranteeing that everyone got the SIGTERM
before killing anything, and it seems SIGTERM would be a more focussed
version of SIGDANGER2 then.

Would at least forego any need for multiplexing the DANGER signal.

Rene.

-
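
Rene's first variant, in which a task is sent SIGTERM when first selected and becomes eligible for killing only on a later round, can be modelled with a small simulation. This is a toy sketch rather than kernel code: the `Task` class, the task names, and the integer `badness` scores are invented stand-ins for the OOM killer's real bookkeeping.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    badness: int              # stand-in for the OOM killer's badness score
    got_sigterm: bool = False
    alive: bool = True


def oom_round(tasks):
    """Run one selection round and return (action, task name), or None.

    The live task with the highest badness is chosen. It gets "TERM" the
    first time it is selected and is only given "KILL" on a later round.
    """
    candidates = [t for t in tasks if t.alive]
    if not candidates:
        return None
    victim = max(candidates, key=lambda t: t.badness)
    if not victim.got_sigterm:
        victim.got_sigterm = True
        return ("TERM", victim.name)
    victim.alive = False
    return ("KILL", victim.name)


tasks = [Task("editor", 10), Task("leaky-daemon", 90)]
first = oom_round(tasks)    # the daemon is asked to exit gracefully
second = oom_round(tasks)   # it did not exit, so now it is killed
third = oom_round(tasks)    # the next candidate gets its own warning first
```

The simulation assumes the selected task ignores SIGTERM. In practice a well-behaved task would exit after the first round and the kill would never be needed.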


From: Ulrich Drepper <drepper@...>
Subject: Re: OOM notifications
Date: Oct 18, 6:10 pm 2007

Rene Herman wrote:
> That first threshold sounds fine yes. To me, the second mostly sounds
> like a job for SIGTERM though.

I agree. Applications shouldn't be expected to be yet more complicated
and have different levels of low memory handling. You might want to
give a process a second shot at handling SIGDANGER but after that it's
all about preparation for a shutdown.

- --
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-


From: Pavel Machek <pavel@...>
Subject: Re: OOM notifications
Date: Oct 19, 6:17 am 2007

On Thu 2007-10-18 15:10:07, Ulrich Drepper wrote:
> Rene Herman wrote:
> > That first threshold sounds fine yes. To me, the second mostly sounds
> > like a job for SIGTERM though.
>
> I agree. Applications shouldn't be expected to be yet more complicated
> and have different levels of low memory handling. You might want to
> give a process a second shot at handling SIGDANGER but after that it's
> all about preparation for a shutdown.

That works okay on a PC, but try cellphone one day.

You want management app to close the least used application. You do
not want _kernel_ to select "who to send SIGTERM to".
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
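
Pavel's point is that the choice of victim belongs in userspace. A management application of the kind he describes might pick the least recently used application along these lines. The function name, application names, and timestamps are hypothetical, and how the manager would learn about memory pressure is left open, as it is in the thread.

```python
def pick_app_to_close(last_used):
    """Return the name of the least recently used application.

    last_used maps an application name to the timestamp of the user's last
    interaction with it. Returns None when nothing is running.
    """
    if not last_used:
        return None
    return min(last_used, key=last_used.get)


# Hypothetical usage data kept by the management application.
last_used = {"dialer": 1000.0, "browser": 400.0, "music": 950.0}
choice = pick_app_to_close(last_used)
```

Here the browser has gone unused the longest, so it is the one the manager would ask to close.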


Sensible feature

Anonymous (not verified)
on October 19, 2007 - 11:20am

This sounds like a sensible feature. Instead of letting the system thrash about trying to handle what is probably a programming error, it would be wise to signal applications so they can try to sort it out first.

Yeah, like all the

Anonymous (not verified)
on October 19, 2007 - 2:29pm

Yeah, like all the applications that check the return-value of malloc :>

malloc() never returns NULL

Anonymous (not verified)
on October 22, 2007 - 12:07pm

malloc() practically never returns NULL when overcommit is active. This is why OOM is such a common situation: overcommit is the default behavior.

Overcommit is useful, but it leads to a problem similar to a bank run, when everyone goes to the bank and withdraws their savings at once. There isn't enough memory (or money) available to satisfy all of the immediate demand.
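
For readers who want to check which behavior their system uses, the following sketch reads the kernel's overcommit setting from /proc/sys/vm/overcommit_memory. The helper names are invented for the example. Note that in the default heuristic mode the kernel still refuses obviously excessive requests, so malloc() can return NULL there, just rarely.

```python
# Meaning of the values in /proc/sys/vm/overcommit_memory.
OVERCOMMIT_MODES = {
    0: "heuristic: obviously excessive requests are refused, the rest succeed",
    1: "always: every allocation request succeeds",
    2: "never: total commitments are capped by swap plus a share of RAM",
}


def read_overcommit_mode(path="/proc/sys/vm/overcommit_memory"):
    """Return the overcommit mode as an int, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def describe_overcommit(mode):
    """Translate a mode number into a short description."""
    return OVERCOMMIT_MODES.get(mode, "unknown")


mode = read_overcommit_mode()
summary = describe_overcommit(mode)
```

On a non-Linux system the file is missing, `mode` is None, and the summary is simply "unknown".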

Out of Memory is a System Security Problem!!!

Anonymous (not verified)
on October 19, 2007 - 3:14pm

If a buggy process leaks memory, it is very dangerous to let it hang a whole system that is running several processes more important than the buggy one.

The solution is not to swap more pages. The solution is to kill the buggy process, treating the leak as a bug, just as if the process had crashed, even though it did not want to crash.

Speaking as the OS: I killed this process for good reason. It was sucking up all the RAM on my system, which was running important and critical processes.

--- dear process, 640 KB are enough for my 3.98 GiB system 4G/4G ---

Give system admins hooks to make the policy

Anonymous (not verified)
on October 19, 2007 - 4:11pm

Why don't they have some app that gets woken up by the kernel on OOM events (kernel hook needed?) and make it policy driven, with the policy based on whatever the system admin has dictated (i.e. the term/kill order of apps and/or what not to kill if at all possible)? If this policy app does not exist, then fall back to whatever way it is currently handled (the user hits the power switch or a shutdown gets triggered).
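
The policy described in this comment could be as small as an ordered kill list plus a set of protected applications. The sketch below is hypothetical throughout: the policy layout, the function name, and the application names are invented, and the kernel hook that would wake the policy app is not modelled.

```python
def choose_victim(running, kill_order, never_kill=frozenset()):
    """Pick the first running app from the admin's kill order.

    Returns None when the policy names no eligible app, which tells the
    caller to fall back to the default handling.
    """
    for name in kill_order:
        if name in running and name not in never_kill:
            return name
    return None


# Hypothetical policy written by the system administrator.
policy = {
    "kill_order": ["indexer", "browser", "database"],
    "never_kill": {"database"},
}
running = {"database", "browser", "sshd"}
victim = choose_victim(running, policy["kill_order"], policy["never_kill"])
```

Here the indexer is not running and the database is protected, so the browser is chosen. If only protected or unlisted applications were running, the function would return None and the system would fall back to its existing behavior.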
