Ingo, you should have read the rest of the paragraph too. I said "it's
needed for a good task placement", I didn't say anything about time
distribution.
Try to start a few niced busy loops and then try some interactivity tests.
You should also increase the granularity; the rather small time slices can
cover up a lot of bad scheduling decisions.
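A rough sketch of such a test, for anyone who wants to try it. The helper
names, the nice level and the timings are only illustrative assumptions and
not taken from any existing test suite; it forks a few niced busy loops and
then measures how late a short sleep wakes up while they run:

```python
import os
import time

def spawn_niced_busy_loops(count, nice_level, seconds):
    """Fork `count` children that spin at `nice_level` for `seconds`."""
    pids = []
    for _ in range(count):
        pid = os.fork()
        if pid == 0:
            # Child: lower its own priority, then burn CPU until the deadline.
            try:
                os.nice(nice_level)
                deadline = time.monotonic() + seconds
                while time.monotonic() < deadline:
                    pass
            finally:
                os._exit(0)
        pids.append(pid)
    return pids

def wait_all(pids):
    """Reap the children and return their raw wait statuses."""
    return [os.waitpid(pid, 0)[1] for pid in pids]

def worst_wakeup_latency(samples, interval):
    """Sleep `interval` seconds `samples` times; return the worst oversleep."""
    worst = 0.0
    for _ in range(samples):
        start = time.monotonic()
        time.sleep(interval)
        worst = max(worst, time.monotonic() - start - interval)
    return worst

if __name__ == "__main__":
    pids = spawn_niced_busy_loops(count=4, nice_level=19, seconds=3.0)
    latency = worst_wakeup_latency(samples=100, interval=0.01)
    print("worst wakeup latency: %.3f ms" % (1e3 * latency))
    wait_all(pids)
```

This is only a crude stand-in for a real interactivity test, but comparing
the number with and without the busy loops gives a first impression.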
You're forgetting that only a few days before that announcement, the worst
issues had been fixed, which at that time I hadn't taken into account yet.
Did you read the rest of the mail? I said a little bit more than that, which
actually explains this already in large parts.
(BTW this mail also has one example where I almost begged you to explain
some of the CFS features to me in response to your splitup request - no
response.)
Accuracy is an important aspect, but it's not really the primary goal.
As I said I wanted a correct mathematical model of CFS, but due to the
complexity of CFS (of which a lot has been removed now in CFS-devel) it
was rather difficult to produce such a model.
Producing an accurate model is meant as a _tool_ for further
transformations, e.g. to analyze where further simplifications are
possible and where the 64bit math can be replaced with something simpler
without reducing scheduling quality significantly.
The added accuracy of course increases the complexity, but compared to the
already existing complexity it was still less (at least according to the
lmbench numbers), so IMO it's worth it. The advantage is that I didn't have
to worry about any effects of unexpected rounding errors. This scheduler
has to work with a wide range of clock implementations and AFAICT it's
impossible to guarantee that it works in every situation. It may not
break down completely, but I couldn't exclude unexplainable anomalies,
especially after seeing the problems in the early CFS version, which got
merged.
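To illustrate the kind of rounding effect meant here, a minimal sketch in
plain integer math. This is not the CFS code; the function names, the weight
of 3000 and the 1000ns clock step are made-up assumptions. It compares
scaling the runtime at every clock update (truncating each time) with
scaling the accumulated runtime once:

```python
NICE_0_LOAD = 1024  # assumed reference weight for this sketch

def charge_per_update(updates, delta_ns, weight):
    """Scale every clock delta separately; each division truncates."""
    total = 0
    for _ in range(updates):
        total += delta_ns * NICE_0_LOAD // weight
    return total

def charge_once(updates, delta_ns, weight):
    """Scale the accumulated runtime with a single division."""
    return updates * delta_ns * NICE_0_LOAD // weight

if __name__ == "__main__":
    per_update = charge_per_update(1000, 1000, 3000)  # 341000
    once = charge_once(1000, 1000, 3000)              # 341333
    print("drift after 1000 updates:", once - per_update)  # 333
```

The drift depends on the clock step and the weight, which is why a different
clock implementation can show an error that another one hides.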
As I also mentioned, this is only part of the problem (but one to which the
early CFS version contributed significantly). The main problem was the
limits: once the limits are exceeded, that overflow/underflow time is simply
lost and that is what finally resulted in the misbehaviour. The rounding
problems were one possible cause but not the only one. Other possibilities
would require more complex scheduling patterns, where de-/enqueuing of
tasks would push some tasks into these limits. Prime suspect here was the
sleeper bonus and the questions were: is it possible to accumulate the
bonus, and is it possible to force the punishment onto specific tasks?
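A minimal sketch of how a limit loses time, again not the CFS code: the
function name, the limit of 10 and the delta sequence are assumptions chosen
only to make the effect visible. A per-task balance is clamped after every
update and whatever is cut off is gone for good:

```python
def apply_deltas(deltas, limit=None):
    """Accumulate signed time deltas, optionally clamped to [-limit, limit].

    Returns (final balance, total time discarded by the clamping).
    """
    balance = 0
    lost = 0
    for delta in deltas:
        balance += delta
        if limit is not None:
            clamped = max(-limit, min(limit, balance))
            lost += balance - clamped  # the part beyond the limit is dropped
            balance = clamped
    return balance, lost

if __name__ == "__main__":
    deltas = [8, 8, 8, -20]
    print(apply_deltas(deltas))            # (4, 0)
    print(apply_deltas(deltas, limit=10))  # (-10, 14)
```

Without the limit the task ends up with a credit of 4; with the limit it
ends up at -10, because 14 units of credit were thrown away on the way.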
The complexity of CFS now makes it hard to quantify the problem. It's easy
to say that it will work in most cases, but e.g. the rounding fixes
changed the common case more than the worst case. The point is
what it would cost to be a little more accurate, and as my patch showed,
not much, but in the end we would have a more reliable scheduler, one that
works well not only in the common cases.
Anyway, as I said already earlier, with the step to an absolute virtual
time the biggest error source is gone, so in a way you also proved my
point that it's worth it, even if you don't want to admit it.
bye, Roman