On Thu, 2010-05-06 at 19:30 +0200, Peter Zijlstra wrote:
OK a simple example, suppose we have 3 events, A, B and C. A and B are
exclusive. With the current RR scheduling we get:
/ {A}, {B, C}, {C, A} /
Which isn't fair, because we get a 2:1:2 ratio.
If we were to do as Stephane suggested and always schedule the PMU in a
greedy fashion, we'd get:
/ {A, C}, {B, C} /
Which isn't fair either 1:1:2.
The perfect case would be if we could always schedule all events on the
PMU, that way they'd get equal service. But since we cannot, we much
approximate that.
If we define lag to be the difference between perfect service and our
approximation thereof: lag_i = S - s_i, then for a scheduler to be fair
we must place two conditions thereon:
1) There must be a bound on the maximal lag_i, this ensures progress.
2) The sum of lags must be zero, this ensures we converge to the ideal
service. [*]
We can satisfy 1 by always at least scheduling the event which has thus
far received the least service, this ensures everybody makes progress,
since by definition everybody will at one point be that one.
This by itself however is not enough, since we can schedule multiple
events, lets again try a greedy model with the previous example. That
would again reduce to the unfair:
/ {A, C}, {B, C} /
Simply because C is always schedulable and A and B alternate between
being the one who received least service.
This would however violate 2, since C will always be scheduled, its lag
will decrease without bound.
We can fix this by explicitly not scheduling C when its not eligible.
That is, we only consider eligible events for placement on the PMU.
Eligibility means an event has received less service than it would have
had under the perfect model.
The second condition will help us here: \Sum_i S - s_i = 0. That can be
written as: n*S - \Sum s_i = 0, where n = \Sum_i 1. From this it
trivially follows that: S = 1/n \Sum s_i, or in other words S =
avg(s_i).
So eligibility can be expressed as: s_i < avg(s_i).
With this, we will get a schedule like:
/ {A, C}, {B} /
We are however still fully greedy, which is still O(n), which we don't
want. However if we stop being greedy and use the same heuristic we do
now, stop filling the PMU at the first fail, we'll still be fair,
because the algorithm ensures that.
[*] it might be possible these conditions are the same one.
--