At this point I'd like to make the observation that spa_ebs is a very
fair scheduler if you consider "nice" to be an indication of the
relative entitlement of tasks to CPU bandwidth.
It works by mapping nice to shares using a function very similar to the
one for calculating p->load weight except it's not offset by the RT
priorities as RT is handled separately. In theory, a runnable task's
entitlement to CPU bandwidth at any time is the ratio of its shares to
the total shares held by runnable tasks on the same CPU (in reality, a
smoothed average of this sum is used to make scheduling smoother). The
dynamic priorities of the runnable tasks are then fiddled to try to keep
each tasks CPU bandwidth usage in proportion to its entitlement.
That's the theory anyway.
The actual implementation looks a bit different due to efficiency
considerations. The modifications to the above theory boil down to
keeping a running measure of the (recent) highest CPU bandwidth use per
share for tasks running on the CPU -- I call this the yardstick for this
CPU. When it's time to put a task on the run queue it's dynamic
priority is determined by comparing its CPU bandwidth per share value
with the yardstick for its CPU. If it's greater than the yardstick this
value becomes the new yardstick and the task gets given the lowest
possible dynamic priority (for its scheduling class). If the value is
zero it gets the highest possible priority (for its scheduling class)
which would be MAX_RT_PRIO for a SCHED_OTHER task. Otherwise it gets
given a priority between these two extremes proportional to ratio of its
CPU bandwidth per share value and the yardstick. Quite simple really.
The other way in which the code deviates from the original as that (for
a few years now) I no longer calculated CPU bandwidth usage directly.
I've found that the overhead is less if I keep a running average of the
size of a tasks CPU bursts and the length of its scheduling cycle (i.e.
from on CPU one time to on CPU next time) and using the ratio of these
values as a measure of bandwidth usage.
Anyway it works and gives very predictable allocations of CPU bandwidth
based on nice.
Another good feature is that (in this pure form) it's starvation free.
However, if you fiddle with it and do things like giving bonus priority
boosts to interactive tasks it becomes susceptible to starvation. This
can be fixed by using an anti starvation mechanism such as SPA's
promotion scheme and that's what spa_ebs does.
Peter
--
Peter Williams pwil3058@bigpond.net.au
"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce
-