Re: [PATCH 1/4] cpu_stop: implement stop_cpu[s]()

Previous thread: [PATCH 2/4] stop_machine: reimplement using cpu_stop by Tejun Heo on Thursday, April 22, 2010 - 9:09 am. (1 message)

Next thread: [PATCHSET sched/core] cpu_stop: implement and use cpu_stop by Tejun Heo on Thursday, April 22, 2010 - 9:09 am. (6 messages)
From: Tejun Heo
Date: Thursday, April 22, 2010 - 9:09 am

Implement a simplistic per-cpu maximum priority cpu monopolization
mechanism.  A non-sleeping callback can be scheduled to run on one or
multiple cpus with maximum priority monopolizing those cpus.  This is
primarily to replace and unify RT workqueue usage in stop_machine and
scheduler migration_thread which currently is serving multiple
purposes.

Four functions are provided - stop_one_cpu(), stop_one_cpu_nowait(),
stop_cpus() and try_stop_cpus().

This is to allow clean sharing of resources among stop_cpu and all the
migration thread users.  One stopper thread per cpu is created which
is currently named "stopper/CPU".  This will eventually replace the
migration thread and take on its name.

* This facility was originally named cpuhog and lived in separate
  files, but Peter Zijlstra nacked the name, so it was renamed to
  cpu_stop and moved into stop_machine.c.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Peter Zijlstra <peterz@infradead.org>
---
 include/linux/stop_machine.h |   39 ++++-
 kernel/stop_machine.c        |  367 +++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 397 insertions(+), 9 deletions(-)

diff --git a/include/linux/stop_machine.h b/include/linux/stop_machine.h
index baba3a2..efcbd6c 100644
--- a/include/linux/stop_machine.h
+++ b/include/linux/stop_machine.h
@@ -1,15 +1,46 @@
 #ifndef _LINUX_STOP_MACHINE
 #define _LINUX_STOP_MACHINE
-/* "Bogolock": stop the entire machine, disable interrupts.  This is a
-   very heavy lock, which is equivalent to grabbing every spinlock
-   (and more).  So the "read" side to such a lock is anything which
-   disables preeempt. */
+
 #include <linux/cpu.h>
 #include <linux/cpumask.h>
+#include <linux/list.h>
 #include <asm/system.h>
 
 #if defined(CONFIG_STOP_MACHINE) && defined(CONFIG_SMP)
 
+/*
+ * stop_cpu[s]() is simplistic per-cpu maximum priority cpu
+ * monopolization mechanism.  The caller can specify a ...
From: Peter Zijlstra
Date: Monday, May 3, 2010 - 6:26 am

If you do:

 done = { .ret = -ENOENT, };


You can do away with all the ->executed bits.
--

From: Tejun Heo
Date: Monday, May 3, 2010 - 11:40 pm

Hello,


Oh, I had a piece of code which wanted to discern between -ENOENT from
non-execution and -ENOENT returned from the work function, which seems
to be gone now.  I'll check things again and drop ->executed if
everything looks okay.

Thanks.

-- 
tejun
--

From: Tejun Heo
Date: Monday, May 3, 2010 - 11:55 pm

Hello, again.


Eh... now I remember.  If we start with ->ret = 0, stop_cpus() can't
return -ENOENT when none of the specified cpus executed unless it
tracks execution status (hence the current code).  If we start with
->ret = -ENOENT, we can't tell whether all cpus executed successfully
or none executed unless we BUG_ON() an -ENOENT return from work
functions and let a 0 return override -ENOENT.

Thanks.

-- 
tejun
--
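
The trade-off in the two messages above can be sketched in userspace C.
This is only an illustration under assumptions, not the code from the
patch: struct stop_done, run_tracking_executed(), run_preset_enoent(),
stop_result() and the work_*() functions are hypothetical names, the
loop stands in for per-cpu execution, and assert() stands in for
BUG_ON().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

typedef int (*stop_fn_t)(void *arg);

/* Hypothetical completion record, loosely modelled on the discussion. */
struct stop_done {
	bool executed;	/* did at least one work function run? */
	int ret;	/* first non-zero return value seen, else 0 */
};

/* Sample work functions used only for illustration. */
static int work_ok(void *arg)     { (void)arg; return 0; }
static int work_fail(void *arg)   { (void)arg; return -EIO; }
static int work_enoent(void *arg) { (void)arg; return -ENOENT; }

/* Variant 1: start with ret = 0 and track execution separately. */
static struct stop_done run_tracking_executed(stop_fn_t fn, void *arg,
					      size_t nr_cpus)
{
	struct stop_done done = { .executed = false, .ret = 0 };

	for (size_t i = 0; i < nr_cpus; i++) {
		int ret = fn(arg);

		done.executed = true;
		if (ret && !done.ret)
			done.ret = ret;
	}
	return done;
}

/* Map variant 1's record to a single return value. */
static int stop_result(struct stop_done done)
{
	return done.executed ? done.ret : -ENOENT;
}

/*
 * Variant 2: start with -ENOENT and drop the executed flag.  This only
 * works if work functions never return -ENOENT themselves, and if the
 * first result (including 0) overrides the initial -ENOENT.
 */
static int run_preset_enoent(stop_fn_t fn, void *arg, size_t nr_cpus)
{
	int result = -ENOENT;

	for (size_t i = 0; i < nr_cpus; i++) {
		int ret = fn(arg);

		assert(ret != -ENOENT);	/* stands in for BUG_ON() */
		if (result == -ENOENT || (ret && !result))
			result = ret;
	}
	return result;
}
```

With variant 1, a work function that itself returns -ENOENT is still
distinguishable from "nothing ran" through the executed flag; variant 2
gives up that distinction in exchange for one less field.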

From: Peter Zijlstra
Date: Monday, May 3, 2010 - 6:26 am

Not sure if it's worth the hassle, but you could list_splice_init() the
complete pending list onto a local list, possibly avoiding some locks.

But since this isn't supposed to be used much, I doubt we'll ever see

You would use WARN_ONCE() and print the function that last ran and
leaked the preempt count.


--
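
The locking saving from splicing the pending list can be sketched in
userspace C.  This is a minimal model under assumptions, not the kernel
code: struct stopper, struct work, the fake_lock()/fake_unlock()
counters and the drain_*() helpers are hypothetical, and a hand-rolled
singly linked queue stands in for the kernel's list_head and
list_splice_init().

```c
#include <assert.h>
#include <stddef.h>

struct work {
	struct work *next;
	int id;
};

/* Hypothetical per-cpu queue; lock_count counts lock acquisitions. */
struct stopper {
	struct work *head;
	struct work **tail;
	unsigned long lock_count;
	int locked;
};

static void fake_lock(struct stopper *s)
{
	assert(!s->locked);
	s->locked = 1;
	s->lock_count++;
}

static void fake_unlock(struct stopper *s)
{
	assert(s->locked);
	s->locked = 0;
}

static void stopper_init(struct stopper *s)
{
	s->head = NULL;
	s->tail = &s->head;
	s->lock_count = 0;
	s->locked = 0;
}

static void queue_work(struct stopper *s, struct work *w)
{
	fake_lock(s);
	w->next = NULL;
	*s->tail = w;
	s->tail = &w->next;
	fake_unlock(s);
}

/* Pop one item per lock round trip; the last trip finds the queue empty. */
static int drain_one_by_one(struct stopper *s)
{
	int handled = 0;

	for (;;) {
		struct work *w;

		fake_lock(s);
		w = s->head;
		if (w) {
			s->head = w->next;
			if (!s->head)
				s->tail = &s->head;
		}
		fake_unlock(s);
		if (!w)
			break;
		handled++;	/* the work function would run here */
	}
	return handled;
}

/* Move the whole pending list to a local list under a single lock. */
static int drain_spliced(struct stopper *s)
{
	struct work *local;
	int handled = 0;

	fake_lock(s);
	local = s->head;
	s->head = NULL;
	s->tail = &s->head;
	fake_unlock(s);

	for (; local; local = local->next)
		handled++;	/* the work function would run here */
	return handled;
}

enum { MAX_ITEMS = 16 };

/* Queue nr_items works, drain them, return lock acquisitions used to drain. */
static unsigned long drain_lock_cost(int nr_items, int use_splice)
{
	struct work items[MAX_ITEMS];
	struct stopper s;
	unsigned long before;
	int handled;

	assert(nr_items >= 0 && nr_items <= MAX_ITEMS);
	stopper_init(&s);
	for (int i = 0; i < nr_items; i++) {
		items[i].id = i;
		queue_work(&s, &items[i]);
	}
	before = s.lock_count;
	handled = use_splice ? drain_spliced(&s) : drain_one_by_one(&s);
	assert(handled == nr_items);
	return s.lock_count - before;
}
```

In this model, draining N queued items one at a time costs N + 1 lock
acquisitions, while the spliced variant costs one.  Whether that matters
for a rarely used facility is exactly the doubt raised above.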

From: Tejun Heo
Date: Monday, May 3, 2010 - 11:36 pm

Hello,




Updated to use WARN_ONCE() and print the function symbol and argument.

Thanks.

-- 
tejun
--

From: Tejun Heo
Date: Tuesday, May 4, 2010 - 12:03 am

Now that I think more about it, there's a subtle race condition with
the above BUG_ON().  Stoppers are prepared by CPU_UP_PREPARE and
started by CPU_ONLINE but brought down by CPU_DEAD.  IOW, they're
allowed to run detached from their designated CPUs between CPU_DYING
and CPU_DEAD (the responsibility of guaranteeing that target cpus are
online is on the callers).  So, the above BUG_ON() might trigger
spuriously if a cpu goes down after being brought online but before its
cpu_stopper has had a chance to pass through the BUG_ON() test.

Thanks.

-- 
tejun
--

From: Peter Zijlstra
Date: Tuesday, May 4, 2010 - 1:43 am

Ah indeed. Ah well, drop it then, it's not worth making a more
complicated test.
--

From: Peter Zijlstra
Date: Monday, May 3, 2010 - 6:26 am

*work_buf = (struct cpu_stop_work){ .fn = fn, .arg = arg, };


--
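
The compound-literal assignment suggested above sets the named fields
and zero-initializes every other member in one statement.  A minimal
userspace sketch, assuming a simplified struct layout: the "done" member
here is a hypothetical stand-in for whatever other fields the real
struct carries, and init_work(), noop() and stale_field_cleared() are
illustrative names only.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*cpu_stop_fn_t)(void *arg);

/* Simplified, assumed layout; not the definition from the patch. */
struct cpu_stop_work {
	cpu_stop_fn_t fn;
	void *arg;
	void *done;	/* hypothetical extra member */
};

static int noop(void *arg)
{
	(void)arg;
	return 0;
}

/* Members not named in the initializer are set to zero / NULL. */
static void init_work(struct cpu_stop_work *work_buf, cpu_stop_fn_t fn,
		      void *arg)
{
	*work_buf = (struct cpu_stop_work){ .fn = fn, .arg = arg, };
}

/* Returns 1 if a stale "done" pointer is cleared by init_work(). */
static int stale_field_cleared(void)
{
	struct cpu_stop_work w;

	w.done = &w;	/* simulate leftover state in a reused buffer */
	init_work(&w, noop, &w);
	return w.fn == noop && w.arg == &w && w.done == NULL;
}
```

Compared with assigning the fields one by one, this form cannot leave a
stale value behind in a reused buffer.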

From: Tejun Heo
Date: Monday, May 3, 2010 - 11:36 pm

Updated.

-- 
tejun
--
