Hello, all.

cpu_hog has been renamed to cpu_stop and moved into kernel/stop_machine.c per Peter Zijlstra's suggestion. This patchset is feature-wise identical to the second take of cpuhog[L]. The only changes are the rename, relocation and refresh against the current sched/core.

The following API renames took place.

- hog_one_cpu() -> stop_one_cpu()
- hog_one_cpu_nowait() -> stop_one_cpu_nowait()
- hog_cpus() -> stop_cpus()
- try_hog_cpus() -> try_stop_cpus()
- *_hog() callbacks -> *_cpu_stop()

Internal names have been renamed accordingly. e.g. cpuhog thread became cpu_stopper thread and so on.

This patchset contains the following four patches.

 0001-cpu_stop-implement-stop_cpu-s.patch
 0002-stop_machine-reimplement-using-cpu_stop.patch
 0003-scheduler-replace-migration_thread-with-cpu_stop.patch
 0004-scheduler-kill-paranoia-check-in-synchronize_sched_e.patch

The patches are against the current linux-2.6-tip/sched/core (09a40af5240de02d848247ab82440ad75b31ab11) and are available in the following git tree.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git cpu_stop

I retained the original acked/reviewed-by's as the changes are mostly cosmetic. If you disagree, please let me know. I'll try to push this through sched/core again once Peter acks.

diffstat follows.

 Documentation/RCU/torture.txt |   10
 arch/s390/kernel/time.c       |    1
 drivers/xen/manage.c          |   14 -
 include/linux/rcutiny.h       |    2
 include/linux/rcutree.h       |    1
 include/linux/stop_machine.h  |   59 ++--
 kernel/cpu.c                  |    8
 kernel/module.c               |   14 -
 kernel/rcutorture.c           |    2
 kernel/sched.c                |  271 +++------------------
 kernel/sched_fair.c           |   42 ++-
 kernel/stop_machine.c         |  525 ++++++++++++++++++++++++++++++++----------
 12 files changed, 514 insertions(+), 435 deletions(-)

Thanks.

-- tejun

[L] http://thread.gmane.org/gmane.linux.kernel/962635
--
Currently migration_thread is serving three purposes - migration pusher, context to execute active_load_balance() and forced context switcher for expedited RCU synchronize_sched. All three roles are hardcoded into migration_thread() and determining which job is scheduled is slightly messy.

This patch kills migration_thread and replaces all three uses with cpu_stop. The three different roles of migration_thread() are split into three separate cpu_stop callbacks - migration_cpu_stop(), active_load_balance_cpu_stop() and synchronize_sched_expedited_cpu_stop() - and each use case now simply asks cpu_stop to execute the callback as necessary.

synchronize_sched_expedited() was implemented with private preallocated resources and custom multi-cpu queueing and waiting logic, both of which are provided by cpu_stop. synchronize_sched_expedited_count is made atomic and all other shared resources along with the mutex are dropped.

synchronize_sched_expedited() also implemented a check to detect cases where not all the callbacks got executed on their assigned cpus and fell back to synchronize_sched(). If called with cpu hotplug blocked, cpu_stop already guarantees that and the condition cannot happen; otherwise, stop_machine() would break. However, this patch preserves the paranoid check using a cpumask to record on which cpus the stopper ran so that it can serve as a bisection point if something actually goes wrong there.

Because the internal execution state is no longer visible, rcu_expedited_torture_stats() is removed.

This patch also renames cpu_stop threads from "stopper/%d" to "migration/%d". The names of these threads ultimately don't matter and there's no reason to make unnecessary userland visible changes.

With this patch applied, stop_machine() and sched now share the same resources. stop_machine() is faster without wasting any resources and sched migration users are much cleaner.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar ...
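The paranoid check described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the code from the patch: cpumask_model_t, record_stopper_ran() and all_online_cpus_ran() are hypothetical names, and a 64-bit integer stands in for the kernel's cpumask.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a cpumask: bit n represents CPU n.
 * This model only supports CPU numbers 0..63.
 */
typedef uint64_t cpumask_model_t;

/* What each stopper callback would do: record the CPU it ran on. */
static void record_stopper_ran(cpumask_model_t *ran, unsigned int cpu)
{
	*ran |= (cpumask_model_t)1 << cpu;
}

/*
 * The check made after all callbacks have finished.
 * Returns true if every online CPU is present in the "ran" mask.
 * A false result is the case where the caller would fall back to
 * the slower, non-expedited path.
 */
static bool all_online_cpus_ran(cpumask_model_t online, cpumask_model_t ran)
{
	return (online & ~ran) == 0;
}
```

In this model, a caller would clear the "ran" mask, run the callback on every online CPU, and only then compare the two masks. With cpu hotplug blocked the comparison should always succeed, which is why the check is described as paranoia.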
So who guarantees busiest->active_balance_work isn't already enqueued by some other cpu's load-balancer run? --
Hello, Hmmm... maybe I'm mistaken but isn't that guaranteed by busiest->active_balance which is protected by the rq lock? active_load_balance_cpu_stop is scheduled iff busiest->active_balance was changed from zero and only active_load_balance_cpu_stop() can clear it at the end of its execution at which point the active_balance_work is safe to reuse. Thanks. -- tejun --
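To make the argument concrete, here is a minimal single-threaded model of the guard. It is a sketch, not the scheduler code: struct rq_model, try_queue_active_balance() and active_balance_done() are hypothetical names, and the rq lock is represented only by comments because the model has no concurrency.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the busiest runqueue; not the kernel's struct rq. */
struct rq_model {
	int active_balance;	/* nonzero while the work item is in flight */
	int times_queued;	/* how many times the work item was queued */
};

/*
 * Load-balancer side. In the kernel this test-and-set would happen
 * under the rq lock; the model assumes the caller is serialized.
 * Returns true only for the caller that flipped the flag from zero,
 * so the embedded work item is queued at most once at a time.
 */
static bool try_queue_active_balance(struct rq_model *rq)
{
	if (rq->active_balance)
		return false;	/* someone else already queued the work */

	rq->active_balance = 1;
	rq->times_queued++;	/* stands in for queueing the stop work */
	return true;
}

/*
 * Callback side. Only the callback clears the flag, as its last
 * step, after which the work item may be reused.
 */
static void active_balance_done(struct rq_model *rq)
{
	assert(rq->active_balance);	/* must have been set by the queuer */
	rq->active_balance = 0;
}
```

A second balancer that calls try_queue_active_balance() while the flag is set gets false and leaves the work item alone, which is the property the question asks about.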
Ah, indeed. It wasn't obvious from looking at the patch, but when looking at the full code it is fairly easy to see. --
Hmmm... it's probably worthwhile to note tho. I'll add a comment and send out the updated patches soon. Thanks. -- tejun --
