Robert Love recently submitted a group of O(1) scheduler patches for Alan Cox's 2.4-ac branch. The eight-patch set brings the ultra-scalable scheduler up to date in the 2.4 stable tree, backporting the many fixes and improvements that were until now found only in the 2.5 development tree. Robert highlights the addition of the migration thread as the most important change. Though the currently submitted patches are only for 2.4.19-pre7-ac2, Robert noted that he'd be making patches available for 2.4.18 and 2.4.19-pre7 shortly.
Robert's email follows, as does the related README file explaining each of the patches.
From: Robert Love
Subject: [PATCH] 2.4-ac updated O(1) scheduler
Date: 20 Apr 2002 20:49:56 -0400

Hello,

The attached patch updates Ingo Molnar's ultra-scalable O(1) scheduler in 2.4-ac to the latest code base. It contains various fixes, cleanups, and new features. All changes are from 2.5 or patches pending for 2.5. Specifically, this patch makes the following changes to the scheduler in Alan's tree:

- remove wake_up_sync and friends, we don't need them now
- abstract away access to need_resched
- fix scheduler deadlock on some platforms
- sched_yield optimizations and cleanup
- better use MAX_RT_PRIO define instead of magic numbers
- misc. code cleanups
- misc. fixes and optimizations
and most importantly:

- backport the migration_thread and associated code to 2.4
The migration_thread code has the interrupt-off fix and William Lee
Irwin's new migration_init routine, both of which are pending for 2.5.
Note I have sent this to Alan already, so it should be in a future
2.4-ac. You can get a better description as well as the patches in
logical chunks from:
I will also make available fully updated base O(1)-scheduler patches for
2.4.18 and the latest prepatch shortly.
We do not need sync wakeups anymore, as the load balancer handles
the case fine. Remove wake_up_sync and friends and the sync flag
in the __wake_up method.
Abstract away access to need_resched into set_need_resched, etc.
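The accessor pattern can be sketched in user-space C. This is a minimal illustration only: the struct and field here stand in for the real task_struct, and the helper names merely follow the set_need_resched style named above.

```c
#include <stdbool.h>

/* Illustrative stand-in for the relevant task_struct field. */
struct task {
    volatile bool need_resched;
};

/* Accessors hide the field, so callers no longer poke it directly
 * and the underlying representation can change in one place. */
static inline void set_need_resched(struct task *p)   { p->need_resched = true; }
static inline void clear_need_resched(struct task *p) { p->need_resched = false; }
static inline bool need_resched(const struct task *p) { return p->need_resched; }
```

The point of the abstraction is exactly this indirection: if the flag later moves (as it did in 2.5, into thread_info), only the accessors change, not every call site.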
Fix scheduler deadlock on some platforms. I'll let DaveM (the author) explain:
Some platforms need to grab mm->page_table_lock during switch_mm().
On the other hand code like swap_out() in mm/vmscan.c needs to hold
mm->page_table_lock during wakeups which needs to grab the runqueue
lock. This creates a conflict and the resolution chosen here is to
not hold the runqueue lock during context_switch().
The implementation is specifically a "frozen" state implemented as a
spinlock, which is held around the context_switch() call. This allows
the runqueue lock to be dropped during this time yet prevent another cpu
from running the "not switched away from yet" task.
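A rough single-threaded model of the idea follows. The switch_lock name echoes the scheduler's per-task lock, but the boolean flag and helper names here are illustrative stand-ins for the real spinlock code, not the kernel implementation:

```c
#include <stdbool.h>

/* Illustrative model of the "frozen" state: each task carries a lock
 * held across the context switch, so another CPU that picks the task
 * off a runqueue must wait until the old CPU has fully switched away,
 * even though the runqueue lock itself has been dropped. */
struct task {
    bool switch_lock;   /* stands in for a real spinlock */
};

/* A CPU trying to run the task: fails while the switch is in flight. */
static bool try_run_task(struct task *p)
{
    if (p->switch_lock)
        return false;       /* still being switched away from */
    p->switch_lock = true;  /* this CPU now owns the switch */
    return true;
}

/* The old CPU has completed context_switch(); release the task. */
static void finish_switch(struct task *p)
{
    p->switch_lock = false;
}
```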
A new task can become runnable during schedule(). We always want to
return from scheduler with the highest priority task running, so we
should check need_resched before returning to see if we should rerun
ourselves through schedule. This used to be in the scheduler but was
removed and then readded.
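The recheck can be sketched as follows. This is a standalone model: the flag, the priority variables, and the goto structure only illustrate the "rerun schedule() if need_resched was set" logic described above.

```c
#include <stdbool.h>

static bool need_resched_flag;  /* set when a higher-priority task wakes */
static int current_prio, best_prio;

/* Stand-in for picking the highest-priority runnable task. */
static void pick_next_task(void)
{
    current_prio = best_prio;
    need_resched_flag = false;
}

/* Tail of schedule(): if a wakeup during the switch set need_resched,
 * loop back so we never return with a lower-priority task running. */
static int schedule_demo(void)
{
    int passes = 0;
need_resched:
    passes++;
    pick_next_task();
    /* simulate a higher-priority wakeup racing with the first pass
     * (lower value = higher priority) */
    if (passes == 1 && best_prio > 0) {
        best_prio = 0;
        need_resched_flag = true;
    }
    if (need_resched_flag)
        goto need_resched;
    return passes;
}
```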
Clean up assumptions about the value of MAX_RT_PRIO. No change to
object code; just replace magic numbers with defines.
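For reference, the priority layout behind these defines can be sketched as below; the constants match the O(1) scheduler's 100 real-time priorities plus 40 nice levels, while the helper function is illustrative:

```c
/* The O(1) scheduler's priority space: 0..MAX_RT_PRIO-1 are the
 * real-time priorities, MAX_RT_PRIO..MAX_PRIO-1 map the nice range.
 * Using the defines instead of the literal 100 keeps comparisons
 * readable and lets the split be changed in one place. */
#define MAX_RT_PRIO 100
#define MAX_PRIO    (MAX_RT_PRIO + 40)

static int prio_is_rt(int prio)
{
    return prio < MAX_RT_PRIO;   /* was: prio < 100 */
}
```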
Backport of the migration_thread migration code from 2.5. This
includes my interrupt-off bugfix and wli's new migration_init code.
The migration_thread code allows arch-independent task migration
via set_cpus_allowed and allows the creation of things like task CPU
affinity interfaces.
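A rough model of the decision set_cpus_allowed makes: cpus_allowed is a bitmask of the CPUs a task may run on, and migration is needed when the new mask excludes the task's current CPU. The standalone struct and helper here are simplified stand-ins, not the kernel code:

```c
#include <stdbool.h>

struct task {
    unsigned long cpus_allowed;  /* bitmask of permitted CPUs */
    int cpu;                     /* CPU the task currently runs on */
};

/* Install a new affinity mask; report whether the migration thread
 * would need to move the task off its current CPU. */
static bool set_cpus_allowed_demo(struct task *p, unsigned long new_mask)
{
    p->cpus_allowed = new_mask;
    return !(new_mask & (1UL << p->cpu));
}
```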
Lots of misc stuff, almost entirely invariant and trivial cleanups.
- rename lock_task_rq -> task_rq_lock
- rename unlock_task_rq -> task_rq_unlock
- cleanup lock_task_rq
- list_del_init -> list_del fix in dequeue_task
- comment cleanups and additions
- load_balance fixes and cleanups
- simple optimization (rt_task -> policy!=SCHED_OTHER)
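The last item can be illustrated with a standalone sketch. The policy values follow the standard scheduling classes; the helper names and struct are hypothetical, and the point is only that both tests agree for a consistent task:

```c
#define SCHED_OTHER 0
#define SCHED_FIFO  1
#define SCHED_RR    2
#define MAX_RT_PRIO 100

struct task {
    int policy;
    int prio;
};

/* rt_task() derives the answer from the priority... */
static int rt_task_demo(const struct task *p)
{
    return p->prio < MAX_RT_PRIO;
}

/* ...but where the policy field is already at hand, comparing it
 * directly is the "simple optimization" noted above. */
static int is_rt_policy(const struct task *p)
{
    return p->policy != SCHED_OTHER;
}
```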