The goal was to make the oom killer heuristic as predictable as possible
and to kill the most memory-hogging task first, so that the oom killer
does not have to be invoked again and needlessly kill several tasks.
oom_score_adj was introduced in place of oom_adj for several reasons, as
pointed out before:
- give it a unit (proportion of available memory); oom_adj had no unit,
- allow it to work on a linear scale for more control over
prioritization; oom_adj had an exponential scale,
- give it a much higher resolution so it can be fine-tuned; it works with
a granularity of 0.1% of memory (~128M on a 128G machine), and
- allow it to describe the oom killing priority of a task regardless of
its cpuset attachment, mempolicy, or memcg, or when their respective
limits change.
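
To make the unit and the linear scale concrete, here is a simplified
sketch of how such a score can be computed. It is not the kernel's
actual code and it leaves out other details of the real heuristic. The
function and parameter names are made up for illustration. It assumes
the baseline is rss plus swap as a proportion of available memory, in
units of 0.1%, and that oom_score_adj ranges from -1000 to 1000 and is
simply added to that baseline.

```python
def oom_score(rss_pages, swap_pages, available_pages, oom_score_adj=0):
    """Illustrative badness score in units of 0.1% of available memory.

    All names here are hypothetical; this only mirrors the idea of a
    linear, proportional scale described above.
    """
    if oom_score_adj == -1000:
        # Assumed: the minimum value exempts the task entirely.
        return 0
    # Baseline: share of available memory used by the task, in 0.1% steps.
    points = (rss_pages + swap_pages) * 1000 // available_pages
    # The adjustment is linear: +100 means "treat me as using 10% more".
    points += oom_score_adj
    return max(points, 0)


# Same task, same adjustment, but the cpuset shrinks from 1000 to 500 pages:
# the adjustment still means "10% of whatever is available".
print(oom_score(100, 0, 1000, oom_score_adj=100))
print(oom_score(100, 0, 500, oom_score_adj=100))
```

Because the adjustment is expressed as a proportion, it does not need to
be rewritten when the amount of available memory changes.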
You have full control over exempting a task from consideration with
oom_score_adj, just like you did with oom_adj. Since oom_adj is
deprecated but will not be removed for two years, you can even keep using
the old interface until then.
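
As a rough illustration of exempting a task, the sketch below writes a
value to a task's oom_score_adj file. It assumes the tunable lives at
/proc/<pid>/oom_score_adj and that -1000 is the value that exempts a
task. The helper name and the proc_root parameter are hypothetical; the
latter exists only so the function can be tried against a scratch
directory. Lowering the value for a real task normally needs
appropriate privileges.

```python
import os

OOM_SCORE_ADJ_MIN = -1000  # assumed: exempts the task from oom killing
OOM_SCORE_ADJ_MAX = 1000   # assumed: makes the task the preferred victim


def set_oom_score_adj(pid, value, proc_root="/proc"):
    """Write `value` to <proc_root>/<pid>/oom_score_adj and return the path."""
    if not OOM_SCORE_ADJ_MIN <= value <= OOM_SCORE_ADJ_MAX:
        raise ValueError("oom_score_adj must be between -1000 and 1000")
    path = os.path.join(proc_root, str(pid), "oom_score_adj")
    with open(path, "w") as f:
        f.write(str(value))
    return path
```

For example, set_oom_score_adj(os.getpid(), OOM_SCORE_ADJ_MIN) would ask
that the calling process never be selected.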
Xorg tends to be killed less because of the change to the heuristic's
baseline, which is now based on rss and swap instead of total_vm. This is
separate from the issues you list above, but is a benefit to the oom
killer that desktop users especially will notice. I, personally, am
interested more in the server market and that's why I looked for a more
robust userspace tunable that would still be applicable when things like
cpusets have a node added or removed.