Robert Love recently submitted a group of O(1) scheduler patches for Alan Cox's 2.4-ac branch. The eight-patch set brings the ultra-scalable scheduler up to date in the 2.4 stable tree, backporting the many fixes and improvements that until now were found only in the 2.5 development tree. Robert highlights the addition of the migration thread as the most important change. Though the currently submitted patches are only for 2.4.19-pre7-ac2, Robert noted that he'd be making patches available for 2.4.18 and 2.4.19-pre7 shortly.
Robert's email follows, as does the related README file explaining each of the patches.
From: Robert Love
Subject: [PATCH] 2.4-ac updated O(1) scheduler
Date: 20 Apr 2002 20:49:56 -0400

Halo,

The attached patch updates Ingo Molnar's ultra-scalable O(1) scheduler in 2.4-ac to the latest code base. It contains various fixes, cleanups, and new features. All changes are from 2.5 or patches pending for 2.5. Specifically, this patch makes the following changes to the scheduler in Alan's tree:

- remove wake_up_sync and friends, we don't need them now
- abstract away access to need_resched
- fix scheduler deadlock on some platforms
- sched_yield optimizations and cleanup
- better use MAX_RT_PRIO define instead of magic numbers
- misc. code cleanups
- misc. fixes and optimizations

and most importantly:

- backport the migration_thread and associated code to 2.4

The migration_thread code has the interrupt-off fix and William Lee
Irwin's new migration_init routine, both of which are pending for 2.5.

Note I have sent this to Alan already, so it should be in a future
2.4-ac. You can get a better description as well as the patches in
logical chunks from:

ftp://ftp.kernel.org/pub/linux/kernel/people/rml/sched

I will also make available fully updated base O(1)-scheduler patches for
2.4.18 and the latest prepatch shortly.

Enjoy,
Robert Love
Against 2.4.19-pre7-ac2

110-remove-wake-up-sync.patch
We do not need sync wakeups anymore, as the load balancer handles
the case fine. Remove wake_up_sync and friends and the sync flag
in the __wake_up method.

120-need_resched-abstraction.patch
Abstract away access to need_resched into set_need_resched, etc.

130-frozen-lock.patch
Fix scheduler deadlock on some platforms. I'll let DaveM (the author)
explain:

Some platforms need to grab mm->page_table_lock during switch_mm().
On the other hand code like swap_out() in mm/vmscan.c needs to hold
mm->page_table_lock during wakeups which needs to grab the runqueue
lock. This creates a conflict and the resolution chosen here is to
not hold the runqueue lock during context_switch().

The implementation is specifically a "frozen" state implemented as a
spinlock, which is held around the context_switch() call. This allows
the runqueue lock to be dropped during this time yet prevent another cpu
from running the "not switched away from yet" task.

140-sched_yield.patch
Optimize sched_yield.

150-need_resched-check.patch
A new task can become runnable during schedule(). We always want to
return from scheduler with the highest priority task running, so we
should check need_resched before returning to see if we should rerun
ourselves through schedule. This used to be in the scheduler but was
removed and then readded.

160-maxrtprio-1.patch
Clean up assumptions about the value of MAX_RT_PRIO. No change to
object code; just replace magic numbers with defines.

170-migration_thread.patch
Backport of the migration_thread migration code from 2.5. This
includes my interrupt-off bugfix and wli's new migration_init code.
The migration_thread code allows arch-independent task migration
via set_cpus_allowed and allows the creation of things like task cpu
affinity interfaces.

180-misc-stuff.patch
Lots of misc stuff, almost entirely invariant and trivial cleanups.
Specifically:

- rename lock_task_rq -> task_rq_lock
- rename unlock_task_rq -> task_rq_unlock
- cleanup lock_task_rq
- list_del_init -> list_del fix in dequeue_task
- comment cleanups and additions
- load_balance fixes and cleanups
- simple optimization (rt_task -> policy!=SCHED_OTHER)
Patches don't work
% grep '^EXTRAVERSION' Makefile
EXTRAVERSION = -pre7-ac2
% for file in ~/patches/2.4.19-ac2-o1-sched/*.patch; do patch -p1 -s --dry-run < $file || echo $file; done
1 out of 5 hunks FAILED -- saving rejects to file kernel/sched.c.rej
/home/cae/patches/2.4.19-ac2-o1-sched/170-migration_thread.patch
1 out of 15 hunks FAILED -- saving rejects to file kernel/sched.c.rej
/home/cae/patches/2.4.19-ac2-o1-sched/180-misc-stuff.patch
re: Patches don't work
I just applied them this morning, and they worked fine.
First, extract 2.4.18. Then patch with 2.4.19-pre7. Then patch with 2.4.19-pre7-ac2. Then you can apply the O(1) scheduler patches...
I did...
That's what I did, though in a slightly more roundabout way (using incremental patches from 2.4.18-rc4 to 2.4.19-pre7). I had no failures on any patches until I tried the O(1) stuff.
Sound of hand smacking forehead...
I realize why my "--dry-run" test didn't work. Since I tested each patch in succession, the later ones which depend on earlier ones having been applied will fail. I am a moron.
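One way to test a dependent series like this is to apply the patches for real, in order, to a throwaway copy of the tree, so that each patch sees the changes made by the ones before it. The following is only a minimal sketch, not something from Robert's patch set: check_series is a hypothetical helper name, and it assumes GNU patch (for the -f flag) plus mktemp and cp -R.

```shell
# check_series TREE PATCH...
# Apply the patches, in the order given, to a scratch copy of TREE.
# Prints the first patch that fails and returns 1; returns 0 if all apply.
# The original TREE is never modified.
check_series() {
    tree=$1
    shift
    scratch=$(mktemp -d) || return 2
    # Copy the tree's contents (including dotfiles) into the scratch dir.
    cp -R "$tree"/. "$scratch"/ || { rm -rf "$scratch"; return 2; }
    rc=0
    for p in "$@"; do
        # Apply for real (no --dry-run) so later patches see earlier ones.
        # The input redirection is opened before the subshell changes
        # directory, so relative patch paths still work.
        if ! (cd "$scratch" && patch -p1 -s -f) < "$p" > /dev/null 2>&1; then
            echo "FAILED: $p"
            rc=1
            break
        fi
    done
    rm -rf "$scratch"
    return "$rc"
}
```

From the top of the kernel tree this could be run as: check_series . ~/patches/2.4.19-ac2-o1-sched/*.patch. The shell expands the glob in sorted order, so the numeric prefixes on the patch names give the intended sequence.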
Re: Patches don't work
Yes they do. I suspect you applied them out of order. Notice the numbers on them: each depends on the one before it.
Further, there is a big patch in ingo-O1 so you do not need to apply the multiple diffs (which are for Alan, hence the directory name).
Robert Love
preempt
Hello,
So will you be making a preemption patch that works with these scheduler patches any time soon?
I have been using ac with your preempt patches, which gives me what is, in my opinion, the best set of patches (rmap, ingo scheduler, preempt, etc).
Thanks.
-Hiryu
Re: preempt
So will you be making a preemption patch that works with these scheduler patches any time soon?
Probably only for the -ac version, and only then when Alan merges it.
Robert Love
Latency issues fixed?
At one point there were complaints about awful latency in newer 2.5 kernels in connection with the O(1) scheduler. Has this been figured out, and is the fix included in the backport?
I'm sorry but it caught my eye and it does bother me
It's O(1) with a capital letter 'O', not 0(1) with the number zero.
</anal>
O(1)
Yes, indeed... thanks for catching my typo. Fixed.
heh...
I say after you say something enough times it should be accepted as part of the English standard. ;P
Everyone seems to spot that 0/O thing. You have some of the most picky readers I have ever seen, JA. I don't think I could put up with that much nonsense, especially compared to the typical stuff other people submit. :P