On Aug 3, 2010, at 4:16 AM, Peter Zijlstra <peterz@infradead.org> wrote:
Yes, agreed. (Assuming that the next-earliest field is always kept up-to-date by finding the next-earliest when the task is pulled.)
Can this lead to tasks bouncing back-and-forth? Under a strict interpretation of G-EDF, each job arrival should cause at most one migration. Can you bound the maximum number of times that the retry-loop is taken per scheduling decision? Can you prove that the lock-less traversal of the table yields a consistent snapshot, or is it possible to accidentally miss a priority inversion due to concurrent job arrivals?
In practice, repeated retries are probably not much of a problem, but not having a firm bound would violate strict validation rules (you can't prove it terminates), and would also violate academic real-time rules (again, you ought to be able to prove it correct). I realize that these rules may not be something that has a high priority for Linux, but on the other hand some properties such as the max number of migrations may be implicitly assumed in schedulability tests.
I'm not saying that the proposed implementation is not compatible with published analysis, but I'd be cautious to simply assume that it is. Some of the questions that were raised in this thread make it sound like the border between global and partitioned isn't clearly drawn in the implementation yet (e.g., handling of processor affinity masks), so my opinion may change when the code stabilizes. (This isn't meant as a criticism of Dario et al.'s good work; this is just something very hard to get right, and especially so on the first attempt.)
Going back to Dario's original comments: when combined with processor affinities/partitioning, you'd either have to move budget allocations from CPU to CPU or track a global utilization sum for admission test purposes.
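
To make the second option a bit more concrete, here is a rough sketch of what tracking a single global utilization sum could look like. This is not taken from the patch set: the names (gedf_admission, gedf_admit, gedf_release, to_bw) and the 20-bit fixed-point format are assumptions made purely for illustration, and locking is left out entirely. Note also that "total utilization <= number of CPUs" is only a necessary condition for meeting hard deadlines under G-EDF, so a real admission test may have to be stricter.

```c
#include <stdbool.h>
#include <stdint.h>

#define BW_SHIFT 20                        /* fixed-point fraction bits (assumed) */
#define BW_UNIT  ((uint64_t)1 << BW_SHIFT) /* utilization 1.0 */

/* Hypothetical admission state: one sum shared by all CPUs. */
struct gedf_admission {
	unsigned int nr_cpus;  /* m, the number of CPUs in the global domain */
	uint64_t total_bw;     /* sum of admitted utilizations, fixed point */
};

/*
 * Utilization runtime/period as a fixed-point fraction (rounded down).
 * Assumes runtime_ns < 2^44 so that the shift cannot overflow.
 */
uint64_t to_bw(uint64_t runtime_ns, uint64_t period_ns)
{
	return (runtime_ns << BW_SHIFT) / period_ns;
}

/* Admit a task only if the global sum stays at or below m. */
bool gedf_admit(struct gedf_admission *a, uint64_t runtime_ns,
		uint64_t period_ns)
{
	uint64_t bw;

	if (period_ns == 0 || runtime_ns > period_ns)
		return false;

	bw = to_bw(runtime_ns, period_ns);
	if (a->total_bw + bw > (uint64_t)a->nr_cpus * BW_UNIT)
		return false;

	a->total_bw += bw;
	return true;
}

/* Return a previously admitted task's utilization to the global sum. */
void gedf_release(struct gedf_admission *a, uint64_t runtime_ns,
		  uint64_t period_ns)
{
	a->total_bw -= to_bw(runtime_ns, period_ns);
}
```

With a single sum like this, nothing has to be moved when a task migrates. The price is that the test says nothing about whether a given affinity mask can actually accommodate the task, which is exactly where the global/partitioned border gets blurry.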
- Björn