Did you want to respond to this?
I'm guessing at the relevancy here because the changelog is extremely
poorly worded (if I were Andrew I would have no idea how important this
patch is based on the description, other than the alarmist words of "... is
completely broken"), but if we're concerned about the coredumper not being
able to find adequate resources to allocate memory from, we can give it
access to reserves specifically. We don't need to go killing additional
tasks which may have their own coredumpers.
That's an alternative solution as well, but I'm disagreeing with the
approach here because this enforces absolutely no guarantee that the next
task to be oom killed will be the coredumper. It's much more likely that
we're just going to kill yet another task for the coredump. That task may
have a coredumper too. Who knows.
LOL, this code doesn't pretend to work; we rely heavily on it and have for
three years to prevent needless oom killing. We use the oom killer
constantly on every machine with memory isolation for enforcing a policy,
as memcg will too as it becomes more popular. Killing a job's
task is a serious matter and we'd prefer to kill as few as possible to
free memory. That's the entire motivation behind having a badness()
heuristic in the first place: so we don't kill tons of tasks to free
enough memory. Removing this check specifically will cause the oom killer
to kill tasks in cases where it is currently a no-op and memory will be
freed anyway, and we have never had a problem with these so-called
"deadlocks" that are being found only by code inspection. So please care
a little bit more about the ramifications of other people's use of Linux
before you insist that code must be removed entirely because it doesn't
do a complete job in certain cases, or because it can introduce a
deadlock in situations that are anything but real-world, even though it
has saved tons of our jobs over the course of the past three years.
I've got no objection to that, but until you find the right solution,
please don't remove what works for people in practice. Show me an oom
deadlock as the result of this and the use of sane applications. Nobody
reports them because these are such ridiculous corner cases that are being
addressed by sweeping and extreme code changes without other people's
input being considered.
Since Google uses the oom killer to enforce memory isolation on every one
of our machines, and we do it at a scale unparalleled by anybody else,
I think you should consider this input.