Your suggestion of summing the memory of the parent and its children
would clearly bias selection toward kdeinit if it forks most of kde's
threads, as you mentioned earlier in the thread. Imagine kdeinit, or
another server application like the one Rik mentioned, with all children
being first generation: it would always be selected if it is the only
task operating on the
system. For a web server, for instance, where each query is handled by a
separate thread, we'd obviously prefer to kill a child thread instead of
making the entire server unresponsive. Putting that type of algorithm in
the oom killer, so that it kills the parent instead, is just a non-starter.
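
To make the bias concrete, here is a toy model in Python. It is only a
sketch, not kernel code: the Task class, the two scoring functions, and the
memory figures are all made up for illustration. It compares scoring a task
by the summed memory of itself and its first-generation children against
scoring each task by its own memory alone.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Task:
    name: str
    rss_mb: int  # memory used by this task alone (hypothetical figures)
    children: List["Task"] = field(default_factory=list)


def summed_score(task: Task) -> int:
    """Own memory plus the memory of first-generation children."""
    return task.rss_mb + sum(c.rss_mb for c in task.children)


def own_score(task: Task) -> int:
    """Own memory only."""
    return task.rss_mb


def flatten(tasks: List[Task]) -> List[Task]:
    """Return every task in the tree, parents before their children."""
    out: List[Task] = []
    for t in tasks:
        out.append(t)
        out.extend(flatten(t.children))
    return out


def pick_victim(tasks: List[Task], score: Callable[[Task], int]) -> Task:
    """Select the task with the highest score."""
    return max(flatten(tasks), key=score)


# A server whose workers are all first-generation children; one of them
# has run away with memory.
workers = [Task(f"worker-{i}", 40) for i in range(50)]
workers.append(Task("worker-leaky", 900))
server = Task("httpd", 20, workers)
system = [server, Task("bash", 5)]

print(pick_victim(system, summed_score).name)  # httpd
print(pick_victim(system, own_score).name)     # worker-leaky
```

With the summed score the small parent is charged for everything it forked
and gets selected, taking the whole server down; with per-task scoring only
the runaway worker is selected.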
1 minute? Unless you've got one of SGI's 4K cpu machines where these 1000
threads would actually get any runtime _at_all_ in such circumstances,
that threshold is unreasonable.
A valid point that wasn't raised is that although we can't always detect
out-of-control forking applications, we certainly should do some due
diligence in
making sure other applications aren't unfairly penalized when you do
make -j100, for example. That's not the job of the forkbomb detector in
my heuristic, however; it's the job of the baseline itself. In such
scenarios (and when we can't allocate or free any memory), the baseline is
responsible for identifying these tasks and killing them itself because
they are using an excessive amount of memory.
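
A rough illustration of that division of labour, again as a Python toy and
not the actual heuristic: badness() here is assumed to be nothing more than
a task's share of total memory scaled to 0..1000, the loop stands in for
repeated oom kills, and the task names and sizes are invented.

```python
from typing import Dict, List, Tuple


def badness(rss_mb: int, total_mb: int) -> int:
    """Share of total memory used by a task, scaled to 0..1000 (assumed)."""
    return rss_mb * 1000 // total_mb


def oom_kill_until_fits(
    tasks: Dict[str, int], total_mb: int
) -> Tuple[List[str], Dict[str, int]]:
    """Kill the highest-scoring task until total usage fits in memory.

    Returns the names killed, in order, and the surviving tasks.
    """
    alive = dict(tasks)
    killed: List[str] = []
    while alive and sum(alive.values()) > total_mb:
        victim = max(alive, key=lambda name: badness(alive[name], total_mb))
        killed.append(victim)
        del alive[victim]
    return killed, alive


TOTAL_MB = 8192
tasks = {f"cc1-{i}": 150 for i in range(100)}  # make -j100 compile jobs
tasks["editor"] = 80                           # unrelated applications
tasks["sshd"] = 10

killed, alive = oom_kill_until_fits(tasks, TOTAL_MB)
print(len(killed), "editor" in alive, "sshd" in alive)
```

No forkbomb logic is involved: the compile jobs are selected purely because
each one scores higher than the unrelated tasks, which are left alone.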
Again, it's not protection against forkbombs: the oom killer is not the
place where you want to enforce any policy that prohibits them.
We're not protecting against a large number of first-generation children;
we're simply penalizing them because the oom killer chooses to kill a
large memory-hogging task instead of the parent first. This shouldn't be
described as "forkbomb detection" because that's outside the scope of the
oom killer, or the VM for that matter.