Re: [PATCH 2/10] x86: convert to generic helpers for IPI function calls

To: <linux-kernel@...>
Cc: <peterz@...>, <npiggin@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>, <paulmck@...>
Date: Tuesday, April 29, 2008 - 3:26 am

Hi,

This is a repost of the generic-ipi block git branch. It contains
generic helpers for issuing and handling IPI function calls. It
improves smp_call_function_single() so that it is now a scalable
interface that doesn't rely on call_lock and it also greatly
speeds up smp_call_function(). Microbenchmarks show that it is about
30% faster on call throughput on a simple 2-way SMP system. Benefits
should be much higher on bigger systems.

Changes since last post:

- Address Andrew's review comments
- Address Paul's RCU comments. Hopefully everything is covered now,
I'd much appreciate a second look at this code Paul!
- Drop s390 support, as it currently relies on smp_call_function()
not returning before other CPUs are ready to call (or have called) the
passed-in function.
- Address the x86/xen comments from Jeremy, I hope xen works as expected
now.
- Address the review comments from Peter.
- Various other little things and improvements.

--
Jens Axboe

--

To: <linux-kernel@...>
Cc: <peterz@...>, <npiggin@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>, <paulmck@...>, Jens Axboe <jens.axboe@...>
Date: Tuesday, April 29, 2008 - 3:26 am

This adds kernel/smp.c which contains helpers for IPI function calls. In
addition to supporting the existing smp_call_function() in a more efficient
manner, it also adds a more scalable variant called smp_call_function_single()
for calling a given function on a single CPU only.

The core of this is based on the x86-64 patch from Nick Piggin, lots of
changes since then. "Alan D. Brunelle" <Alan.Brunelle@hp.com> has
contributed lots of fixes and suggestions as well.

Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
arch/Kconfig | 3 +
include/linux/smp.h | 31 ++++-
init/main.c | 1 +
kernel/Makefile | 1 +
kernel/smp.c | 415 +++++++++++++++++++++++++++++++++++++++++++++++++++
5 files changed, 449 insertions(+), 2 deletions(-)
create mode 100644 kernel/smp.c

diff --git a/arch/Kconfig b/arch/Kconfig
index 694c9af..a5a0184 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -36,3 +36,6 @@ config HAVE_KPROBES

config HAVE_KRETPROBES
def_bool n
+
+config USE_GENERIC_SMP_HELPERS
+ def_bool n
diff --git a/include/linux/smp.h b/include/linux/smp.h
index 55232cc..b22b4fc 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -7,9 +7,19 @@
*/

#include <linux/errno.h>
+#include <linux/list.h>
+#include <linux/spinlock.h>
+#include <linux/cpumask.h>

extern void cpu_idle(void);

+struct call_single_data {
+ struct list_head list;
+ void (*func) (void *info);
+ void *info;
+ unsigned int flags;
+};
+
#ifdef CONFIG_SMP

#include <linux/preempt.h>
@@ -53,9 +63,27 @@ extern void smp_cpus_done(unsigned int max_cpus);
* Call a function on all other processors
*/
int smp_call_function(void(*func)(void *info), void *info, int retry, int wait);
-
+int smp_call_function_mask(cpumask_t mask, void(*func)(void *info), void *info,
+ int wait);
int smp_call_function_single(int cpuid, void (*func) (void *info), void *info,
...

To: Jens Axboe <jens.axboe@...>
Cc: <linux-kernel@...>, <peterz@...>, <npiggin@...>, <linux-arch@...>, <mingo@...>, <paulmck@...>
Date: Wednesday, April 30, 2008 - 6:56 pm

I wonder if it isn't finally time to drop this parameter? Now that
there aren't a zillion arch implementations of this to fix, we only need

Mention preemption needs to be disabled? Or allow preemption and take
appropriate precautions internally? It's not obvious to me that all the
WARN_ON(preemptible())?
J
--

To: Jens Axboe <jens.axboe@...>
Cc: <linux-kernel@...>, <peterz@...>, <npiggin@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>
Date: Tuesday, April 29, 2008 - 9:59 am

Looks much better, but there still appears to be a potential deadlock
with a CPU spinning waiting (indirectly) for a grace period to complete.
Such spinning can prevent the grace period from ever completing.

See "!!!".

!!! What if this CPU recently did a smp_call_function_mask() in no-wait mode,
and this CPU's fallback element is still waiting for a grace period
to elapse? Wouldn't we end up with a deadlock, with this CPU spinning
preventing the grace period from completing, thus preventing the element

Good, now there is no need to worry about readers seeing a recycling
event on the fallback element. I just need to go check to see what
readers do if there is no memory available and the fallback element

Here it is OK not to wait for a grace period -- this CPU owns the element,

Ah -- per-CPU fallback element, and we are presumably not permitted
to block during the course of the call. But then someone might be
using it after we exit this function!!! More on this elsewhere...

Hmmm... What about the no-wait case?

Ah, not a problem, because we always use the "d" element that is allocated
on our stack. But aren't we then passing a reference to an on-stack
variable out of scope? Don't we need to allocate "d" at the top level
of the function? Isn't gcc within its rights to get rid of this local
(perhaps re-using it for a compiler temporary or some such) as soon as

If we are using the on-stack variable, generic_exec_single() is

!!! We are passing "d" out of its scope -- need to move the declaration
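
The scoping concern above can be illustrated with a minimal user-space sketch. Everything here (struct call_data, run_call(), call_with_stack_data(), add_one()) is a hypothetical stand-in and not the code under review; it only shows the on-stack element declared at function scope so that a pointer taken inside a nested block remains valid afterwards.

```c
#include <stddef.h>

/* Hypothetical stand-in for a call request; not the kernel structure. */
struct call_data {
	void (*func)(void *info);
	void *info;
};

/* Stand-in for handing the request to the executing side. */
static void run_call(struct call_data *data)
{
	data->func(data->info);
}

/*
 * Use @preallocated if the caller supplied one, else fall back to an
 * on-stack element.
 */
static void call_with_stack_data(void (*func)(void *info), void *info,
				 struct call_data *preallocated)
{
	struct call_data d;	/* function scope: alive until we return */
	struct call_data *data = preallocated;

	if (!data) {
		/*
		 * Had "d" been declared inside this block, its lifetime
		 * would end at the closing brace and the uses of "data"
		 * below would be undefined behaviour.
		 */
		data = &d;
	}

	data->func = func;
	data->info = info;
	run_call(data);
}

/* Example callback used to exercise the sketch. */
static void add_one(void *info)
{
	++*(int *)info;
}
```

Calling call_with_stack_data(add_one, &n, NULL) takes the on-stack path; passing a caller-owned struct call_data takes the other.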

And there are no references to "d" past here. Preemption appears to be
disabled by all callers. So good.

Still might have the fallback element waiting for a grace period
past this point, which would interfere with a subsequent call to this
--

To: Jens Axboe <jens.axboe@...>
Cc: <linux-kernel@...>, <peterz@...>, <npiggin@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>
Date: Wednesday, April 30, 2008 - 7:29 am

One additional question... Why not handle memory allocation failure
by pretending that the caller to smp_call_function() had specified
"wait"? The callee is in irq context, so cannot block, right?

--

To: Paul E. McKenney <paulmck@...>
Cc: <linux-kernel@...>, <peterz@...>, <npiggin@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>
Date: Wednesday, April 30, 2008 - 7:34 am

(BTW a lot of thanks for your comments, I've read and understood most of
it, I'll reply in due time - perhaps not until next week, I'll be gone
from this afternoon and until monday).

We cannot always fall back to wait, unfortunately. If irqs are disabled,
you could deadlock between two CPUs each waiting for each other's IPI
ack.

So the good question is how to handle the problem. The easiest would be
to return ENOMEM and get rid of the fallback, but the fallback deadlocks
are so far mostly in the theoretical realm since they PROBABLY would not
occur in practice. But still not good enough, so I'm still toying with
ideas on how to make it 100% bullet proof.

--
Jens Axboe

--

To: Jens Axboe <jens.axboe@...>
Cc: <linux-kernel@...>, <peterz@...>, <npiggin@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>
Date: Wednesday, April 30, 2008 - 8:17 am

Here are some (probably totally broken) ideas:

1. Global lock so that only one smp_call_function() in the
system proceeds. Additional calls would be spinning with
irqs -enabled- on the lock, avoiding deadlock. Kind of
defeats the purpose of your list, though...

2. Maintain a global mask of current targets of smp_call_function()
CPUs. A given CPU may proceed if it is not a current target
and if none of its target CPUs are already in the mask.
This mask would be manipulated under a global lock.

3. As in #2 above, but use per-CPU counters. This allows the
current CPU to proceed if it is not a target, but also allows
concurrent smp_call_function()s to proceed even if their
lists of target CPUs overlap.

4. #2 or #3, but where CPUs can proceed freely if their allocation
succeeded.

5. If a given CPU is waiting for other CPUs to respond, it polls
its own list (with irqs disabled), thus breaking the deadlock.
This means that you cannot call smp_call_function() while holding
a lock that might be acquired by the called function, but that
is not a new prohibition -- the only safe way to hold such a
lock is with irqs disabled, and you are not allowed to call
the smp_call_function() with irqs disabled in the first place
(right?).

#5 might actually work...
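
To make idea #5 concrete, here is a rough user-space toy model, not kernel code. It assumes two simulated CPUs that each issue a wait-style call to the other while irqs are disabled, and that a CPU re-enables irqs once its own wait has finished. All names (struct sim_cpu, sim_handle_pending(), sim_cross_call()) are invented for illustration.

```c
#include <stdbool.h>

#define NR_SIM_CPUS 2

/* Hypothetical model of one CPU's state in the cross-call scenario. */
struct sim_cpu {
	bool irqs_enabled;	/* can the IPI be taken asynchronously? */
	bool waiting;		/* spinning until our own request is acked */
	bool pending;		/* a request from the peer is queued here */
};

/* Run the request queued on @cpu and ack the sender (the other CPU). */
static void sim_handle_pending(struct sim_cpu *cpus, int cpu)
{
	if (!cpus[cpu].pending)
		return;
	cpus[cpu].pending = false;
	cpus[1 - cpu].waiting = false;	/* clears the sender's wait flag */
}

/*
 * Both CPUs queue a request on each other and then spin. With @poll set,
 * a spinning CPU also services its own queue, as in idea #5. Returns true
 * if both waits complete within @max_steps iterations.
 */
static bool sim_cross_call(bool irqs_enabled, bool poll, int max_steps)
{
	struct sim_cpu cpus[NR_SIM_CPUS];
	int step, cpu;

	for (cpu = 0; cpu < NR_SIM_CPUS; cpu++) {
		cpus[cpu].irqs_enabled = irqs_enabled;
		cpus[cpu].waiting = true;
		cpus[cpu].pending = true;
	}

	for (step = 0; step < max_steps; step++) {
		for (cpu = 0; cpu < NR_SIM_CPUS; cpu++) {
			struct sim_cpu *c = &cpus[cpu];

			/* Assumption: a caller that has returned re-enables irqs. */
			if (!c->waiting)
				c->irqs_enabled = true;

			if (c->irqs_enabled || (poll && c->waiting))
				sim_handle_pending(cpus, cpu);
		}
		if (!cpus[0].waiting && !cpus[1].waiting)
			return true;
	}
	return false;	/* no progress: the mutual-wait deadlock */
}
```

In this model, sim_cross_call(false, false, n) never completes for any n, while enabling either irqs or polling lets both waits finish.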

Thanx, Paul
--

To: Paul E. McKenney <paulmck@...>
Cc: <linux-kernel@...>, <peterz@...>, <npiggin@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>
Date: Wednesday, April 30, 2008 - 8:37 am

That is what we used to do, that will obviously work. But defeats most

Yeah, #5 sounds quite promising. I'll see if I can work up a patch for
that, or if you feel so inclined, I'll definitely take patches :-)

The branch is 'generic-ipi' on git://git.kernel.dk/linux-2.6-block.git
The link is pretty slow, so it's best pull'ed off of Linus base. Or just
grab the patches from the gitweb interface:

http://git.kernel.dk/?p=linux-2.6-block.git;a=shortlog;h=refs/heads/gene...

--
Jens Axboe

--

To: Jens Axboe <jens.axboe@...>
Cc: <linux-kernel@...>, <peterz@...>, <npiggin@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>
Date: Thursday, May 1, 2008 - 10:02 pm

And here is an untested patch for getting rid of the fallback element,
and eliminating the "wait" deadlocks.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---

smp.c | 80 +++++++++++-------------------------------------------------------
1 file changed, 14 insertions(+), 66 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 36d3eca..9df96fa 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -17,7 +17,6 @@ __cacheline_aligned_in_smp DEFINE_SPINLOCK(call_function_lock);
enum {
CSD_FLAG_WAIT = 0x01,
CSD_FLAG_ALLOC = 0x02,
- CSD_FLAG_FALLBACK = 0x04,
};

struct call_function_data {
@@ -33,9 +32,6 @@ struct call_single_queue {
spinlock_t lock;
};

-static DEFINE_PER_CPU(struct call_function_data, cfd_fallback);
-static DEFINE_PER_CPU(unsigned long, cfd_fallback_used);
-
void __cpuinit init_call_single_data(void)
{
int i;
@@ -59,6 +55,7 @@ static void csd_flag_wait(struct call_single_data *data)
if (!(data->flags & CSD_FLAG_WAIT))
break;
cpu_relax();
+ generic_smp_call_function_interrupt();
} while (1);
}

@@ -84,48 +81,13 @@ static void generic_exec_single(int cpu, struct call_single_data *data)
csd_flag_wait(data);
}

-/*
- * We need to have a global per-cpu fallback of call_function_data, so
- * we can safely proceed with smp_call_function() if dynamic allocation
- * fails and we cannot fall back to on-stack allocation (if wait == 0).
- */
-static noinline void acquire_cpu_fallback(int cpu)
-{
- while (test_and_set_bit_lock(0, &per_cpu(cfd_fallback_used, cpu)))
- cpu_relax();
-}
-
-static noinline void free_cpu_fallback(struct call_single_data *csd)
-{
- struct call_function_data *data;
- int cpu;
-
- data = container_of(csd, struct call_function_data, csd);
-
- /*
- * We could drop this loop by embedding a cpu variable in
- * csd, but this should happen so extremely rarely (if ever)
- * that this seems like a better idea
- */
- for_each_possible_cpu(cpu) {
- if...

To: Paul E. McKenney <paulmck@...>
Cc: Jens Axboe <jens.axboe@...>, <linux-kernel@...>, <peterz@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>
Date: Thursday, May 1, 2008 - 10:12 pm

Hey this is coming along really nicely, thanks guys.

The only problem I have with this is that if you turn IRQs off, you
probably don't expect call function functions to be processed under
you (sure that doesn't happen now, but it could if anybody actually
starts to call IPIs under irq off).

What I _really_ wanted to do is just keep the core API as a non-deadlocky
one that has its data passed into it; and then implemented the fallbacky,
deadlocky one on top of that. In places where it makes sense, callers
could then use the new API if they want to.

We could make another rule that smp_call_function might also run functions,
but IMO that is starting to turn into spaghetti ;) Clever spaghetti though,
--

To: Nick Piggin <npiggin@...>
Cc: Jens Axboe <jens.axboe@...>, <linux-kernel@...>, <peterz@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>
Date: Friday, May 2, 2008 - 8:29 am

OK -- for some reason, I was thinking that it was illegal to
invoke smp_call_function() with irqs disabled...

Ah, I see it -- smp_call_function_mask() says:

* You must not call this function with disabled interrupts or from a
* hardware interrupt handler or from a bottom half handler.

So we have no problem with smp_call_function, then.

OK, so smp_call_function() -can- be invoked with irqs disabled?

I don't believe that you can make the fallback non-deadlocky... Perhaps
a failure of imagination on my part, of course, but I am beginning to

Well, given that you cannot call smp_call_function_mask() with irqs
disabled, my approach -does- work in that case, as an irq might come
in just after you called the function but before irqs were disabled.

So, how many places is smp_call_function() invoked with irqs disabled?

--

To: <paulmck@...>
Cc: Nick Piggin <npiggin@...>, Jens Axboe <jens.axboe@...>, <linux-kernel@...>, <peterz@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>
Date: Friday, May 2, 2008 - 8:50 am

Doing any smp_call_function with interrupts disabled is a potential
deadlock. See http://lkml.org/lkml/2004/5/2/116.

--

To: Keith Owens <kaos@...>
Cc: Nick Piggin <npiggin@...>, Jens Axboe <jens.axboe@...>, <linux-kernel@...>, <peterz@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>
Date: Friday, May 2, 2008 - 9:09 am

OK, cool, thank you for the confirmation!

Therefore, when you call smp_call_function(), you may get calls from
other CPUs showing up, and therefore my polling approach does not
introduce any new strands of spaghetti. ;-)

Thanx, Paul
--

To: Nick Piggin <npiggin@...>
Cc: Jens Axboe <jens.axboe@...>, <linux-kernel@...>, <peterz@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>
Date: Friday, May 2, 2008 - 8:42 am

And here is one scenario that makes me doubt that my imagination is
faulty:

1. CPU 0 disables irqs.

2. CPU 1 disables irqs.

3. CPU 0 invokes smp_call_function(). But CPU 1 will never respond
because its irqs are disabled.

4. CPU 1 invokes smp_call_function(). But CPU 0 will never respond
because its irqs are disabled.

Looks like inherent deadlock to me, requiring that smp_call_function()
be invoked with irqs enabled.

So, what am I missing here?

--

To: <paulmck@...>
Cc: Nick Piggin <npiggin@...>, Jens Axboe <jens.axboe@...>, <linux-kernel@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>
Date: Friday, May 2, 2008 - 8:59 am

The wish to do it anyway ;-)

I can imagine some situations where I'd like to try anyway and fall back
to a slower path when failing.

With the initial design we would simply allocate data, stick it on the
queue and call the ipi (when needed).

This is perfectly deadlock free when wait=0 and it just returns -ENOMEM
on allocation failure.

If it doesn't return -ENOMEM, I know it's been queued and will be
processed at some point; if it does fail, I can deal with it in another
way.
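
The "allocate, queue, or return -ENOMEM" contract described above might look roughly like the following user-space sketch. It is only an illustration under stated assumptions: struct call_req, struct call_queue, queue_call_nowait() and run_queued_calls() are hypothetical names, the allocator is passed in so that failure can be forced, and locking and the actual IPI are left out.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a queued call request. */
struct call_req {
	struct call_req *next;
	void (*func)(void *info);
	void *info;
};

/* Hypothetical per-target queue; real code needs locking and an IPI. */
struct call_queue {
	struct call_req *head;
};

/* Allocator hook so that allocation failure can be simulated. */
typedef void *(*alloc_fn)(size_t size);

/*
 * wait=0 style call: never blocks and never spins. On allocation failure
 * nothing is queued and -ENOMEM tells the caller to take a slower path.
 */
static int queue_call_nowait(struct call_queue *q, void (*func)(void *info),
			     void *info, alloc_fn alloc)
{
	struct call_req *req = alloc(sizeof(*req));

	if (!req)
		return -ENOMEM;

	req->func = func;
	req->info = info;
	req->next = q->head;	/* pushed at the head, so run order is LIFO */
	q->head = req;
	return 0;
}

/* Target side: run and free everything queued; returns the number run. */
static int run_queued_calls(struct call_queue *q)
{
	int n = 0;

	while (q->head) {
		struct call_req *req = q->head;

		q->head = req->next;
		req->func(req->info);
		free(req);
		n++;
	}
	return n;
}

/* Example helpers used to exercise the sketch. */
static void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

static void bump_counter(void *info)
{
	++*(int *)info;
}
```

A return of 0 means the request is on the queue and will run when the target drains it; -ENOMEM means the caller still owns the problem.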

I know I'd like to do that and I suspect Nick has a few use cases up his
sleeve as well.

--

To: Peter Zijlstra <peterz@...>
Cc: <paulmck@...>, Jens Axboe <jens.axboe@...>, <linux-kernel@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>
Date: Saturday, May 3, 2008 - 1:49 am

Yeah, I'm just talking about the wait=0 case. (btw. I'd rather the core
API takes some data rather than allocates some itself, eg because you
might want to have it on the stack).

For the wait=1 case, something very clever such as processing pending
requests in a polling loop might be cool... however I'd rather not add
such complexity until someone needs it (you could stick a comment in
there outlining your algorithm). But I'd just rather not have people rely

At least with IPIs I think we can guarantee they will be processed on

It would be handy. The "quickly kick something off on another CPU" is
pretty nice in mm/ when you have per-cpu queues or caches that might
want to be flushed.
--

To: Nick Piggin <npiggin@...>
Cc: Peter Zijlstra <peterz@...>, Jens Axboe <jens.axboe@...>, <linux-kernel@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>
Date: Saturday, May 3, 2008 - 2:11 pm

In that case we may need to go back to the global lock with only one
request being processed at a time. Otherwise, if two wait=1 requests
happen at the same time, they deadlock waiting for each other to process
their request. (See Keith Owens: http://lkml.org/lkml/2008/5/2/183).

In other words, if you want to allow parallel calls to
smp_call_function(), the simplest way to do it seems to be to do the
polling loop. The other ways I have come up with thus far are uglier
and less effective (see http://lkml.org/lkml/2008/4/30/164).

Now, what I -could- do would be to prohibit the wait=1 case from
irq-disable state from polling -- that would make sense, as the caller
probably had a reason to mask irqs, and might not take kindly to having

OK, so let me make sure I understand what is needed. One example might be
some code called from scheduler_tick(), which runs with irqs disabled.
Without the ability to call smp_call_function() directly, you have
to fire off a work queue or something. Now, if smp_call_function()
can hand you an -ENOMEM or (maybe) an -EBUSY, then you still have to
fire off the work queue, but you probably only have to do it rarely,
minimizing the performance impact.

Another possibility is when it is -nice- to call smp_call_function(),
but can just try again on the next scheduler_tick() -- ignoring dynticks
idle for the moment. In this case, you might still test the error return
to set a flag that you will check on the next scheduler_tick() call.
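
A small sketch of that "try again on the next tick" pattern, in user-space C. The names (try_call_fn, retry_pending, tick_try_remote_call()) are invented for illustration, and the cross-CPU call is reduced to a function pointer that may return -ENOMEM or -EBUSY.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical: stands in for a cross-CPU call that may fail with irqs off. */
typedef int (*try_call_fn)(void);

/* Set when the last attempt failed, so that the next tick tries again. */
static bool retry_pending;

/*
 * Called once per simulated tick. Attempts the call when new work is
 * requested or an earlier attempt failed, and remembers a failure so
 * the following tick retries.
 */
static int tick_try_remote_call(bool new_work, try_call_fn try_call)
{
	int ret;

	if (!new_work && !retry_pending)
		return 0;

	ret = try_call();
	retry_pending = (ret == -ENOMEM || ret == -EBUSY);
	return ret;
}

/* Example stand-ins used to exercise the retry path. */
static int call_fails_busy(void)
{
	return -EBUSY;
}

static int call_succeeds(void)
{
	return 0;
}
```

A failed attempt leaves retry_pending set, and the next tick retries even if no new work has arrived.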

Is this where you guys are coming from?

And you are all OK with smp_call_function() called with irqs enabled
never being able to fail, right? (Speaking of spaghetti code, why

OK, I think I might be seeing what you guys are getting at. Here is
what I believe you guys need:

1. No deadlocks, ever, not even theoretical "low probability"
deadlocks.

2. No failure returns when called with irqs enabled. On the other
hand, when irqs are disabled, failure is possible. Though hopefully
unlikely.

3. Paralle...

To: Paul E. McKenney <paulmck@...>
Cc: Peter Zijlstra <peterz@...>, Jens Axboe <jens.axboe@...>, <linux-kernel@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>
Date: Monday, May 5, 2008 - 12:15 am

I think we're talking past each other a little bit.

There are no irq-disabled calls as yet, therefore I don't think we should
add a lot of complex code just to _allow_ for it; at least, not until a
really compelling user comes up.

The main thing is to parallelise the code. The fact that we can trivially
support irq-disabled calls for nowait case (if the caller supplies the

I think I'd like to keep existing smp_call_function that disallows
irq-disabled calls and can't fail. Introduce a new one for irq-disabled
case.

For the last cases, I actually think your polling loop is pretty cool ;)
So I don't completely object to it, I just don't think we should add it
in until something wants it...

Don't let me dictate the requirements though, the only real one I had
was to make call_function_single scalable and faster, and call_function
be as optimal as possible.

--

To: Nick Piggin <npiggin@...>
Cc: Peter Zijlstra <peterz@...>, Jens Axboe <jens.axboe@...>, <linux-kernel@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>
Date: Monday, May 5, 2008 - 1:43 pm

OK... But the wait=1 case already unconditionally stores its data on the
stack. In the wait=0 case, I would guess that it would be good to supply
a function that gets called after a grace period elapses after the last
CPU invokes the to-be-called-by-all-CPUs function:

int smp_call_function_nowait(void (*func)(void *), void *info, int natomic, void (*cleanup)(struct rcu_head *rcu))

Is this what you are looking for? My guess is that the idea is to
pre-allocate the memory before entering a bunch of irq-disabled code that

Well, that would explain my confusion!!! ;-)

It is also not hard to support irq-disabled calls for the wait case one
of two ways:

o Expand the scope of call_function_lock to cover the call
to csd_flag_wait(). In the irq-disabled-wait case, use
spin_trylock_irqsave() to acquire this lock. If lock acquisition
fails, hand back -EBUSY.

o Use the polling trick. This avoids deadlock and failure as well,
but means that one form of irqs happens behind your back despite
irqs being disabled.
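
The first option (try the lock once, hand back -EBUSY on failure) can be sketched in user-space C11 as follows. This is an assumption-laden illustration: an atomic_flag stands in for call_function_lock, and call_lock_try(), call_lock_release(), call_function_wait_try() and set_flag() are hypothetical names rather than kernel interfaces.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for call_function_lock; not a kernel spinlock. */
static atomic_flag call_lock = ATOMIC_FLAG_INIT;

/* Try exactly once and never spin, so the caller cannot deadlock here. */
static int call_lock_try(void)
{
	if (atomic_flag_test_and_set_explicit(&call_lock,
					      memory_order_acquire))
		return -EBUSY;
	return 0;
}

static void call_lock_release(void)
{
	atomic_flag_clear_explicit(&call_lock, memory_order_release);
}

/*
 * Wait-style call with the lock held across the whole "send and wait"
 * window. Invoking @func directly stands in for sending the IPI and
 * waiting for the ack.
 */
static int call_function_wait_try(void (*func)(void *info), void *info)
{
	int ret = call_lock_try();

	if (ret)
		return ret;	/* another wait-style call is in flight */

	func(info);
	call_lock_release();
	return 0;
}

/* Example callback used to exercise the sketch. */
static void set_flag(void *info)
{
	*(int *)info = 1;
}
```

The caller has to be prepared for -EBUSY, which is the cost of this option compared with the polling trick.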

The fallback is to wait in the current patch (see below). This version
disallows calls with irqs disabled, so parallel calls to smp_call_function()
will process each other's function calls concurrently. Might be a bit of

That makes a lot of sense -- especially if we need to introduce a cleanup
function to handle memory being passed into smp_call_function_disabled()

OK. So you are willing to live with smp_call_function() and
smp_call_function_mask() being globally serialized? That would

Yep, it is one way to permit the irq-disabled-wait case without -EBUSY.
But it does add some hair, so I agree that it should wait until someone

OK. The below still permits parallel smp_call_function()
and smp_call_function_mask() as well as permitting parallel
smp_call_function_single(). It prohibits the irq-disabled case.

Thoughts?

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---

arch/Kconfig | 2 -
kernel/smp.c | 87 +++++++++...

To: Paul E. McKenney <paulmck@...>
Cc: Nick Piggin <npiggin@...>, Peter Zijlstra <peterz@...>, <linux-kernel@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>
Date: Wednesday, May 7, 2008 - 4:42 pm

Sorry for the delayed response here. The below looks pretty good to me,
though it'll require a few modifications to fixup smp_call_function()
from irq disabled context.

But that's doable, though I have to do a grep over the kernel to check
everything (I remember CPU send_stop being one such path). Not a huge
fan of such an interface, but hey...

Is everyone OK with going this route, initially at least? It's the

This part puzzled me, why? We default to 'n' and let converted archs
opt-in, so I don't get this part of your patch...

--
Jens Axboe

--

To: Jens Axboe <jens.axboe@...>
Cc: Nick Piggin <npiggin@...>, Peter Zijlstra <peterz@...>, <linux-kernel@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>
Date: Thursday, May 8, 2008 - 12:36 am

<red face> I made that change to get it to compile, and forgot to
revert it when making the patch. Please ignore that bit.

Thanx, Paul
--

To: Nick Piggin <npiggin@...>
Cc: Peter Zijlstra <peterz@...>, Jens Axboe <jens.axboe@...>, <linux-kernel@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>
Date: Sunday, May 4, 2008 - 6:04 pm

On the off-chance that the answer to the above question is "no", here
is a crude patch on top of Jens's earlier patch.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---

arch/Kconfig | 2 -
kernel/smp.c | 107 ++++++++++++++++++-----------------------------------------
2 files changed, 34 insertions(+), 75 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index a5a0184..5ae9360 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -38,4 +38,4 @@ config HAVE_KRETPROBES
def_bool n

config USE_GENERIC_SMP_HELPERS
- def_bool n
+ def_bool y
diff --git a/kernel/smp.c b/kernel/smp.c
index 36d3eca..d7e8dd1 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -17,7 +17,6 @@ __cacheline_aligned_in_smp DEFINE_SPINLOCK(call_function_lock);
enum {
CSD_FLAG_WAIT = 0x01,
CSD_FLAG_ALLOC = 0x02,
- CSD_FLAG_FALLBACK = 0x04,
};

struct call_function_data {
@@ -33,9 +32,6 @@ struct call_single_queue {
spinlock_t lock;
};

-static DEFINE_PER_CPU(struct call_function_data, cfd_fallback);
-static DEFINE_PER_CPU(unsigned long, cfd_fallback_used);
-
void __cpuinit init_call_single_data(void)
{
int i;
@@ -48,7 +44,7 @@ void __cpuinit init_call_single_data(void)
}
}

-static void csd_flag_wait(struct call_single_data *data)
+static void csd_flag_wait(struct call_single_data *data, int poll)
{
/* Wait for response */
do {
@@ -59,6 +55,8 @@ static void csd_flag_wait(struct call_single_data *data)
if (!(data->flags & CSD_FLAG_WAIT))
break;
cpu_relax();
+ if (poll)
+ generic_smp_call_function_interrupt();
} while (1);
}

@@ -66,7 +64,7 @@ static void csd_flag_wait(struct call_single_data *data)
* Insert a previously allocated call_single_data element for execution
* on the given CPU. data must already have ->func, ->info, and ->flags set.
*/
-static void generic_exec_single(int cpu, struct call_single_data *data)
+static void generic_exec_single(int cpu, struct call_single_data *data, int...

To: Peter Zijlstra <peterz@...>
Cc: Nick Piggin <npiggin@...>, Jens Axboe <jens.axboe@...>, <linux-kernel@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>
Date: Friday, May 2, 2008 - 10:21 am

OK, so one approach would be to check for irqs being disabled,
perhaps as follows, on top of my previous patch:

	struct call_single_data *data = NULL;

	if (!wait) {
		data = kmalloc(sizeof(*data), GFP_ATOMIC);
		if (data)
			data->flags = CSD_FLAG_ALLOC;
	}
	if (!data) {
		if (unlikely(irqs_disabled())) {
			put_cpu();
			return -ENOMEM;
		}
		data = &d;
		data->flags = CSD_FLAG_WAIT;
	}

	data->func = func;
	data->info = info;
	generic_exec_single(cpu, data);

That would prevent -ENOMEM unless you invoked the function with irqs
disabled. So normal callers would still see the current failure-free
semantics -- you really don't want to be inflicting failure when not
necessary, right?

There could only be one irq-disabled caller at a time, which could be
handled using a trylock, returning -EBUSY if the lock is already held.
Otherwise, you end up with the scenario called out above (which Keith
Owens pointed out some years ago).

Does this approach make sense?

Thanx, Paul
--

To: Peter Zijlstra <peterz@...>
Cc: Nick Piggin <npiggin@...>, Jens Axboe <jens.axboe@...>, <linux-kernel@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>
Date: Friday, May 2, 2008 - 10:30 pm

Actually, no... The irq-disabled callers would need to acquire the
spinlock -before- disabling irqs, otherwise we end up right back in
the deadlock scenario.

Thanx, Paul
--

To: Jens Axboe <jens.axboe@...>
Cc: <linux-kernel@...>, <peterz@...>, <npiggin@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>
Date: Wednesday, April 30, 2008 - 10:44 pm

OK, will give it a shot. Low bandwidth at the moment, but getting there.
If worst comes to worst, I will annotate one of your patches.

Thanx, Paul
--

To: <linux-kernel@...>
Cc: <peterz@...>, <npiggin@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>, <paulmck@...>, Jens Axboe <jens.axboe@...>, Hirokazu Takata <takata@...>
Date: Tuesday, April 29, 2008 - 3:26 am

This converts m32r to use the new helpers for smp_call_function() and
friends, and adds support for smp_call_function_single(). Not tested,
not even compiled.

Cc: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
arch/m32r/Kconfig | 1 +
arch/m32r/kernel/m32r_ksyms.c | 3 -
arch/m32r/kernel/smp.c | 128 ++++------------------------------------
arch/m32r/kernel/traps.c | 3 +-
include/asm-m32r/smp.h | 1 +
5 files changed, 17 insertions(+), 119 deletions(-)

diff --git a/arch/m32r/Kconfig b/arch/m32r/Kconfig
index de153de..a5f864c 100644
--- a/arch/m32r/Kconfig
+++ b/arch/m32r/Kconfig
@@ -296,6 +296,7 @@ config PREEMPT

config SMP
bool "Symmetric multi-processing support"
+ select USE_GENERIC_SMP_HELPERS
---help---
This enables support for systems with more than one CPU. If you have
a system with only one CPU, like most personal computers, say N. If
diff --git a/arch/m32r/kernel/m32r_ksyms.c b/arch/m32r/kernel/m32r_ksyms.c
index e6709fe..16bcb18 100644
--- a/arch/m32r/kernel/m32r_ksyms.c
+++ b/arch/m32r/kernel/m32r_ksyms.c
@@ -43,9 +43,6 @@ EXPORT_SYMBOL(dcache_dummy);
#endif
EXPORT_SYMBOL(cpu_data);

-/* Global SMP stuff */
-EXPORT_SYMBOL(smp_call_function);
-
/* TLB flushing */
EXPORT_SYMBOL(smp_flush_tlb_page);
#endif
diff --git a/arch/m32r/kernel/smp.c b/arch/m32r/kernel/smp.c
index c837bc1..74eb7bc 100644
--- a/arch/m32r/kernel/smp.c
+++ b/arch/m32r/kernel/smp.c
@@ -35,22 +35,6 @@
/*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*/

/*
- * Structure and data for smp_call_function(). This is designed to minimise
- * static memory requirements. It also looks cleaner.
- */
-static DEFINE_SPINLOCK(call_lock);
-
-struct call_data_struct {
- void (*func) (void *info);
- void *info;
- atomic_t started;
- atomic_t finished;
- int wait;
-} __attribute__ ((__aligned__(SMP_CACHE_BYTES)));
-
-static struct call...

To: <linux-kernel@...>
Cc: <peterz@...>, <npiggin@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>, <paulmck@...>, Jens Axboe <jens.axboe@...>, Russell King <rmk@...>
Date: Tuesday, April 29, 2008 - 3:26 am

This converts arm to use the new helpers for smp_call_function() and
friends, and adds support for smp_call_function_single().

Fixups and testing done by Catalin Marinas <catalin.marinas@arm.com>

Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
arch/arm/Kconfig | 1 +
arch/arm/kernel/smp.c | 157 +++++--------------------------------------------
2 files changed, 16 insertions(+), 142 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index b786e68..c72dae6 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -650,6 +650,7 @@ source "kernel/time/Kconfig"
config SMP
bool "Symmetric Multi-Processing (EXPERIMENTAL)"
depends on EXPERIMENTAL && (REALVIEW_EB_ARM11MP || MACH_REALVIEW_PB11MP)
+ select USE_GENERIC_SMP_HELPERS
help
This enables support for systems with more than one CPU. If you have
a system with only one CPU, like most personal computers, say N. If
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index eefae1d..6344466 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -68,20 +68,10 @@ enum ipi_msg_type {
IPI_TIMER,
IPI_RESCHEDULE,
IPI_CALL_FUNC,
+ IPI_CALL_FUNC_SINGLE,
IPI_CPU_STOP,
};

-struct smp_call_struct {
- void (*func)(void *info);
- void *info;
- int wait;
- cpumask_t pending;
- cpumask_t unfinished;
-};
-
-static struct smp_call_struct * volatile smp_call_function_data;
-static DEFINE_SPINLOCK(smp_call_function_lock);
-
int __cpuinit __cpu_up(unsigned int cpu)
{
struct cpuinfo_arm *ci = &per_cpu(cpu_data, cpu);
@@ -366,114 +356,15 @@ static void send_ipi_message(cpumask_t callmap, enum ipi_msg_type msg)
local_irq_restore(flags);
}

-/*
- * You must not call this function with disabled interrupts, from a
- * hardware interrupt handler, nor from a bottom half handler.
- */
-static int smp_call_function_on_cpu(void (*func)(void *info), void *info,
- int retry, int wait, cpumas...

To: <linux-kernel@...>
Cc: <peterz@...>, <npiggin@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>, <paulmck@...>, Jens Axboe <jens.axboe@...>
Date: Tuesday, April 29, 2008 - 3:26 am

This converts alpha to use the new helpers for smp_call_function() and
friends, and adds support for smp_call_function_single().

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
arch/alpha/Kconfig | 1 +
arch/alpha/kernel/core_marvel.c | 6 +-
arch/alpha/kernel/smp.c | 170 +++------------------------------------
include/asm-alpha/smp.h | 2 -
4 files changed, 14 insertions(+), 165 deletions(-)

diff --git a/arch/alpha/Kconfig b/arch/alpha/Kconfig
index 729cdbd..dbe8c28 100644
--- a/arch/alpha/Kconfig
+++ b/arch/alpha/Kconfig
@@ -528,6 +528,7 @@ config ARCH_MAY_HAVE_PC_FDC
config SMP
bool "Symmetric multi-processing support"
depends on ALPHA_SABLE || ALPHA_LYNX || ALPHA_RAWHIDE || ALPHA_DP264 || ALPHA_WILDFIRE || ALPHA_TITAN || ALPHA_GENERIC || ALPHA_SHARK || ALPHA_MARVEL
+ select USE_GENERIC_SMP_HELPERS
---help---
This enables support for systems with more than one CPU. If you have
a system with only one CPU, like most personal computers, say N. If
diff --git a/arch/alpha/kernel/core_marvel.c b/arch/alpha/kernel/core_marvel.c
index b04f1fe..ced4aae 100644
--- a/arch/alpha/kernel/core_marvel.c
+++ b/arch/alpha/kernel/core_marvel.c
@@ -660,9 +660,9 @@ __marvel_rtc_io(u8 b, unsigned long addr, int write)

#ifdef CONFIG_SMP
if (smp_processor_id() != boot_cpuid)
- smp_call_function_on_cpu(__marvel_access_rtc,
- &rtc_access, 1, 1,
- cpumask_of_cpu(boot_cpuid));
+ smp_call_function_single(boot_cpuid,
+ __marvel_access_rtc,
+ &rtc_access, 1, 1);
else
__marvel_access_rtc(&rtc_access);
#else
diff --git a/arch/alpha/kernel/smp.c b/arch/alpha/kernel/smp.c
index 2525692..95c905b 100644
--- a/arch/alpha/kernel/smp.c
+++ b/arch/alpha/kernel/smp.c
@@ -62,6 +62,7 @@ static struct {
enum ipi_message_type {
IPI_RESCHEDULE,
IPI_CALL_FUNC,
+ IPI_CALL_FUNC_SINGLE,
IPI_CPU_STOP,
};

@@ -558,51 +559,6 @@ send_ipi_message(cpumask_t to_whom, enum ipi_message_t...

To: <linux-kernel@...>
Cc: <peterz@...>, <npiggin@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>, <paulmck@...>, Jens Axboe <jens.axboe@...>
Date: Tuesday, April 29, 2008 - 3:26 am

This converts x86, x86-64, and xen to use the new helpers for
smp_call_function() and friends, and adds support for
smp_call_function_single().

Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
arch/x86/Kconfig | 1 +
arch/x86/kernel/apic_32.c | 4 +
arch/x86/kernel/entry_64.S | 3 +
arch/x86/kernel/i8259_64.c | 4 +
arch/x86/kernel/smp.c | 152 ++++------------------------
arch/x86/kernel/smpcommon.c | 56 ----------
arch/x86/mach-voyager/voyager_smp.c | 94 +++--------------
arch/x86/xen/enlighten.c | 4 +-
arch/x86/xen/mmu.c | 2 +-
arch/x86/xen/smp.c | 120 +++++++----------------
arch/x86/xen/xen-ops.h | 9 +--
include/asm-x86/hw_irq_32.h | 1 +
include/asm-x86/hw_irq_64.h | 2 +
include/asm-x86/mach-default/entry_arch.h | 1 +
include/asm-x86/mach-default/irq_vectors.h | 1 +
include/asm-x86/mach-voyager/entry_arch.h | 2 +-
include/asm-x86/mach-voyager/irq_vectors.h | 4 +-
include/asm-x86/smp.h | 19 ++--
18 files changed, 113 insertions(+), 366 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index a12dbb2..5e0dcf1 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -162,6 +162,7 @@ config GENERIC_PENDING_IRQ
config X86_SMP
bool
depends on SMP && ((X86_32 && !X86_VOYAGER) || X86_64)
+ select USE_GENERIC_SMP_HELPERS
default y

config X86_32_SMP
diff --git a/arch/x86/kernel/apic_32.c b/arch/x86/kernel/apic_32.c
index 4b99b1b..71017f7 100644
--- a/arch/x86/kernel/apic_32.c
+++ b/arch/x86/kernel/apic_32.c
@@ -1358,6 +1358,10 @@ void __init smp_intr_init(void)

/* IPI for generic function call */
set_intr_gate(CALL_FUNCTION_VECTOR, call_function_interrupt);
+
+ ...

To: Jens Axboe <jens.axboe@...>
Cc: <linux-kernel@...>, <peterz@...>, <npiggin@...>, <linux-arch@...>, <mingo@...>, <paulmck@...>
Date: Wednesday, April 30, 2008 - 5:39 pm

Seems to work fine with this:

Subject: xen: build fix for generic IPI stuff

Not even one CONFIG_XEN test build?

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
arch/x86/xen/smp.c | 3 +--
include/asm-x86/xen/events.h | 1 +
2 files changed, 2 insertions(+), 2 deletions(-)

===================================================================
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -43,8 +43,7 @@
static DEFINE_PER_CPU(int, debug_irq) = -1;

static irqreturn_t xen_call_function_interrupt(int irq, void *dev_id);
-
-static struct call_data_struct *call_data;
+static irqreturn_t xen_call_function_single_interrupt(int irq, void *dev_id);

/*
* Reschedule call back. Nothing to do,
===================================================================
--- a/include/asm-x86/xen/events.h
+++ b/include/asm-x86/xen/events.h
@@ -4,6 +4,7 @@
enum ipi_vector {
XEN_RESCHEDULE_VECTOR,
XEN_CALL_FUNCTION_VECTOR,
+ XEN_CALL_FUNCTION_SINGLE_VECTOR,

XEN_NR_IPIS,
};

--

To: Jens Axboe <jens.axboe@...>
Cc: <linux-kernel@...>, <peterz@...>, <npiggin@...>, <linux-arch@...>, <mingo@...>, <paulmck@...>
Date: Tuesday, April 29, 2008 - 4:35 pm

I added this to deal with the case where you're sending an IPI to
another VCPU which isn't currently running on a real cpu. In this case
you could end up spinning while the other VCPU is waiting for a real CPU
to run on. (Basically the same problem that spinlocks have in a virtual
environment.)

However, this is at best a partial solution to the problem, and I never
benchmarked if it really makes a difference. Since any other virtual
environment would have the same problem, it's best if we can solve it
generically. (Of course a synchronous single-target cross-cpu call is a
simple cross-cpu rpc, which could be implemented very efficiently in the
host/hypervisor by simply doing a vcpu context switch...)

J
--

To: Jeremy Fitzhardinge <jeremy@...>
Cc: <linux-kernel@...>, <peterz@...>, <npiggin@...>, <linux-arch@...>, <mingo@...>, <paulmck@...>
Date: Wednesday, April 30, 2008 - 7:35 am

So, what would your advice be? Seems safe enough to ignore for now and
attack it if it becomes a real problem.

--
Jens Axboe

--

To: Jens Axboe <jens.axboe@...>
Cc: Jeremy Fitzhardinge <jeremy@...>, <linux-kernel@...>, <peterz@...>, <npiggin@...>, <linux-arch@...>, <mingo@...>
Date: Wednesday, April 30, 2008 - 8:20 am

How about an arch-specific function/macro invoked in the spin loop?
The generic implementation would do nothing, but things like Xen
could implement as above.
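
A rough user-space sketch of that idea, assuming a per-iteration hook passed into the wait loop. All names here (wait_relax_fn, generic_wait_relax, xenlike_wait_relax, wait_for_completion_flag) are made up for illustration and are not kernel or Xen APIs; the counter stands in for the real HYPERVISOR_sched_op(SCHEDOP_yield, 0) call.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_SIM_CPUS 4

/* Hypothetical hook type: invoked once per iteration of the wait loop. */
typedef void (*wait_relax_fn)(int target_cpu);

/* Simulated state: which vcpus want to run but have no real cpu. */
static bool vcpu_stolen[NR_SIM_CPUS];
/* Counts simulated yields to the hypervisor. */
static int yield_count;

/* Generic implementation: does nothing. */
void generic_wait_relax(int target_cpu)
{
	(void)target_cpu;
}

/* Xen-like implementation: yield if the target vcpu is not running. */
void xenlike_wait_relax(int target_cpu)
{
	if (vcpu_stolen[target_cpu])
		yield_count++;	/* stand-in for a hypervisor yield */
}

/*
 * Spin until *done becomes non-zero, calling the arch hook on each
 * iteration. max_spins only bounds the simulation; a real wait loop
 * would not have it. Returns the number of iterations spent waiting.
 */
int wait_for_completion_flag(volatile int *done, int target_cpu,
			     wait_relax_fn relax, int max_spins)
{
	int spins = 0;

	assert(target_cpu >= 0 && target_cpu < NR_SIM_CPUS);
	while (!*done && spins < max_spins) {
		relax(target_cpu);
		spins++;
	}
	return spins;
}
```

With the generic hook the loop is a plain spin; with the Xen-like hook every iteration on a stolen vcpu turns into a yield, so the cost of the hook only shows up on architectures that opt in.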

Thanx, Paul
--

To: Paul E. McKenney <paulmck@...>
Cc: Jeremy Fitzhardinge <jeremy@...>, <linux-kernel@...>, <peterz@...>, <npiggin@...>, <linux-arch@...>, <mingo@...>
Date: Wednesday, April 30, 2008 - 8:31 am

Xen could just stuff that bit into its arch_send_call_function_ipi();
something like the below should be fine. My question to Jeremy was more
about whether it should be kept at all. I guess it's safer to
just keep it and retain the existing behaviour (and let Jeremy/others
evaluate it at will later on). Note that I got rid of the yield bool and
now break out of the loop as soon as we have called the hypervisor.

Jeremy, shall I add this?

diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 2dfe093..064e6dc 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -352,7 +352,17 @@ static void xen_send_IPI_mask(cpumask_t mask, enum ipi_vector vector)

void xen_smp_send_call_function_ipi(cpumask_t mask)
{
+ int cpu;
+
xen_send_IPI_mask(mask, XEN_CALL_FUNCTION_VECTOR);
+
+ /* Make sure other vcpus get a chance to run if they need to. */
+ for_each_cpu_mask(cpu, mask) {
+ if (xen_vcpu_stolen(cpu)) {
+ HYPERVISOR_sched_op(SCHEDOP_yield, 0);
+ break;
+ }
+ }
}

void xen_smp_send_call_function_single_ipi(int cpu)

--
Jens Axboe

--

To: Jens Axboe <jens.axboe@...>
Cc: Paul E. McKenney <paulmck@...>, <linux-kernel@...>, <peterz@...>, <npiggin@...>, <linux-arch@...>, <mingo@...>
Date: Wednesday, April 30, 2008 - 10:51 am

Hold off for now. Given that its effects are unmeasured, I'm not even
sure it's the right thing to do. For example, it will yield if you're
sending an IPI to a vcpu which wants to run but can't, but does nothing
for an idle vcpu. And always yielding may be a performance problem if
the IPI doesn't involve any cpu contention.

It's easy to add back if it turns out to be useful.
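
To make the objection concrete, here is a small user-space model of the yield decision. It assumes that "stolen" means runnable but not currently running, as described above; the enum, the state array and the function names are hypothetical and only loosely modelled on Xen's vcpu run states.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_VCPUS 8

/* Hypothetical vcpu states for this sketch. */
enum vcpu_state {
	VCPU_RUNNING,	/* currently on a real cpu */
	VCPU_RUNNABLE,	/* wants to run but has no real cpu ("stolen") */
	VCPU_BLOCKED,	/* idle, waiting for an event */
};

/* Simulated per-vcpu state; defaults to VCPU_RUNNING. */
enum vcpu_state vcpu_states[NR_VCPUS];

/* Assumed meaning of "stolen": runnable but not running. */
bool vcpu_is_stolen(int cpu)
{
	assert(cpu >= 0 && cpu < NR_VCPUS);
	return vcpu_states[cpu] == VCPU_RUNNABLE;
}

/*
 * Decide whether the sender would yield after sending the IPI to every
 * vcpu in mask (bit n set means vcpu n is a target). Only a runnable
 * target triggers a yield; a blocked (idle) target does not.
 */
bool should_yield_after_ipi(unsigned int mask)
{
	int cpu;

	for (cpu = 0; cpu < NR_VCPUS; cpu++) {
		if ((mask & (1u << cpu)) && vcpu_is_stolen(cpu))
			return true;
	}
	return false;
}
```

In this model an IPI sent only to blocked or running vcpus never yields, which is the idle-vcpu gap mentioned above. Whether yielding helps in the runnable case is still unmeasured.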

J
--

To: <linux-kernel@...>
Cc: <peterz@...>, <npiggin@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>, <paulmck@...>, Jens Axboe <jens.axboe@...>, Tony Luck <tony.luck@...>
Date: Tuesday, April 29, 2008 - 3:26 am

This converts ia64 to use the new helpers for smp_call_function() and
friends, and adds support for smp_call_function_single().

Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
arch/ia64/Kconfig | 1 +
arch/ia64/kernel/smp.c | 239 +++---------------------------------------------
include/asm-ia64/smp.h | 3 -
3 files changed, 16 insertions(+), 227 deletions(-)

diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 3aa6c82..9e1e115 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -286,6 +286,7 @@ config VIRT_CPU_ACCOUNTING

config SMP
bool "Symmetric multi-processing support"
+ select USE_GENERIC_SMP_HELPERS
help
This enables support for systems with more than one CPU. If you have
a system with only one CPU, say N. If you have a system with more
diff --git a/arch/ia64/kernel/smp.c b/arch/ia64/kernel/smp.c
index 9a9d4c4..c5dcd03 100644
--- a/arch/ia64/kernel/smp.c
+++ b/arch/ia64/kernel/smp.c
@@ -60,25 +60,9 @@ static struct local_tlb_flush_counts {

static DEFINE_PER_CPU(unsigned int, shadow_flush_counts[NR_CPUS]) ____cacheline_aligned;

-
-/*
- * Structure and data for smp_call_function(). This is designed to minimise static memory
- * requirements. It also looks cleaner.
- */
-static __cacheline_aligned DEFINE_SPINLOCK(call_lock);
-
-struct call_data_struct {
- void (*func) (void *info);
- void *info;
- long wait;
- atomic_t started;
- atomic_t finished;
-};
-
-static volatile struct call_data_struct *call_data;
-
#define IPI_CALL_FUNC 0
#define IPI_CPU_STOP 1
+#define IPI_CALL_FUNC_SINGLE 2
#define IPI_KDUMP_CPU_STOP 3

/* This needs to be cacheline aligned because it is written to by *other* CPUs. */
@@ -89,13 +73,13 @@ extern void cpu_halt (void);
void
lock_ipi_calllock(void)
{
- spin_lock_irq(&call_lock);
+ spin_lock_irq(&call_function_lock);
}

void
unlock_ipi_calllock(void)
{
- spin_unlock_irq(&call_lock);
+ spin_unlo...

To: <linux-kernel@...>
Cc: <peterz@...>, <npiggin@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>, <paulmck@...>, Jens Axboe <jens.axboe@...>, Kyle McMartin <kyle@...>, Matthew Wilcox <matthew@...>, Grant Grundler <grundler@...>
Date: Tuesday, April 29, 2008 - 3:26 am

This converts parisc to use the new helpers for smp_call_function() and
friends, and adds support for smp_call_function_single(). Not tested,
not even compiled.

Cc: Kyle McMartin <kyle@parisc-linux.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
arch/parisc/Kconfig | 1 +
arch/parisc/kernel/smp.c | 134 +++++++--------------------------------------
2 files changed, 22 insertions(+), 113 deletions(-)

diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
index bc7a19d..a7d4fd3 100644
--- a/arch/parisc/Kconfig
+++ b/arch/parisc/Kconfig
@@ -199,6 +199,7 @@ endchoice

config SMP
bool "Symmetric multi-processing support"
+ select USE_GENERIC_SMP_HELPERS
---help---
This enables support for systems with more than one CPU. If you have
a system with only one CPU, like most personal computers, say N. If
diff --git a/arch/parisc/kernel/smp.c b/arch/parisc/kernel/smp.c
index 85fc775..126105c 100644
--- a/arch/parisc/kernel/smp.c
+++ b/arch/parisc/kernel/smp.c
@@ -84,19 +84,11 @@ EXPORT_SYMBOL(cpu_possible_map);

DEFINE_PER_CPU(spinlock_t, ipi_lock) = SPIN_LOCK_UNLOCKED;

-struct smp_call_struct {
- void (*func) (void *info);
- void *info;
- long wait;
- atomic_t unstarted_count;
- atomic_t unfinished_count;
-};
-static volatile struct smp_call_struct *smp_call_function_data;
-
enum ipi_message_type {
IPI_NOP=0,
IPI_RESCHEDULE=1,
IPI_CALL_FUNC,
+ IPI_CALL_FUNC_SINGLE,
IPI_CPU_START,
IPI_CPU_STOP,
IPI_CPU_TEST
@@ -187,33 +179,12 @@ ipi_interrupt(int irq, void *dev_id)

case IPI_CALL_FUNC:
smp_debug(100, KERN_DEBUG "CPU%d IPI_CALL_FUNC\n", this_cpu);
- {
- volatile struct smp_call_struct *data;
- void (*func)(void *info);
- void *info;
- int wait;
-
- data = smp_call_function_data;
- func = data->func;
- info = data->info;
- wait = data->wait;
-
- mb...

To: <linux-kernel@...>
Cc: <peterz@...>, <npiggin@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>, <paulmck@...>, Jens Axboe <jens.axboe@...>
Date: Tuesday, April 29, 2008 - 3:26 am

This converts ppc to use the new helpers for smp_call_function() and
friends, and adds support for smp_call_function_single().

ppc loses the timeout functionality of smp_call_function_mask() with
this change, as the generic code does not provide that.

Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
arch/powerpc/Kconfig | 1 +
arch/powerpc/kernel/smp.c | 220 ++-----------------------------
arch/powerpc/platforms/cell/interrupt.c | 1 +
arch/powerpc/platforms/ps3/smp.c | 7 +-
arch/powerpc/platforms/pseries/xics.c | 6 +-
arch/powerpc/sysdev/mpic.c | 2 +-
include/asm-powerpc/smp.h | 5 +-
7 files changed, 23 insertions(+), 219 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 4e40c12..60f4a2f 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -110,6 +110,7 @@ config PPC
select HAVE_KPROBES
select HAVE_KRETPROBES
select HAVE_LMB
+ select USE_GENERIC_SMP_HELPERS if SMP

config EARLY_PRINTK
bool
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index be35ffa..facd49d 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -72,12 +72,8 @@ struct smp_ops_t *smp_ops;

static volatile unsigned int cpu_callin_map[NR_CPUS];

-void smp_call_function_interrupt(void);
-
int smt_enabled_at_boot = 1;

-static int ipi_fail_ok;
-
static void (*crash_ipi_function_ptr)(struct pt_regs *) = NULL;

#ifdef CONFIG_PPC64
@@ -99,12 +95,15 @@ void smp_message_recv(int msg)
{
switch(msg) {
case PPC_MSG_CALL_FUNCTION:
- smp_call_function_interrupt();
+ generic_smp_call_function_interrupt();
break;
case PPC_MSG_RESCHEDULE:
/* XXX Do we have to do this? */
set_need_resched();
break;
+ case PPC_MSG_CALL_FUNC_SINGLE:
+ generic_smp_call_function_single_interrupt();
+ break;
case PPC_MSG_DEBUGGER_BREAK:
if (crash_ipi...

To: <linux-kernel@...>
Cc: <peterz@...>, <npiggin@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>, <paulmck@...>, Jens Axboe <jens.axboe@...>, Ralf Baechle <ralf@...>
Date: Tuesday, April 29, 2008 - 3:26 am

This converts mips to use the new helpers for smp_call_function() and
friends, and adds support for smp_call_function_single(). Not tested,
but it compiles.

mips shares the same IPI for smp_call_function() and
smp_call_function_single(), since not all hardware has enough IPIs
available to support separate setups.
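
As an illustration only, sharing one IPI can be modelled in user space as a single handler that drains both kinds of pending work. The queue layout and every name below (call_queue, queue_call, shared_call_function_ipi, bump) are invented for this sketch and are not taken from the mips code, which is truncated here.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 8

typedef void (*call_fn)(void *info);

struct pending_call {
	call_fn func;
	void *info;
};

/* Hypothetical fixed-size queue of calls waiting to run on this cpu. */
struct call_queue {
	struct pending_call calls[MAX_PENDING];
	int count;
};

struct call_queue single_queue;		/* single-target requests */
struct call_queue broadcast_queue;	/* many-target requests */

/* Queue a call; returns 0 on success, -1 if the queue is full. */
int queue_call(struct call_queue *q, call_fn func, void *info)
{
	assert(func != NULL);
	if (q->count >= MAX_PENDING)
		return -1;
	q->calls[q->count].func = func;
	q->calls[q->count].info = info;
	q->count++;
	return 0;
}

/* Run and clear everything in one queue; returns how many calls ran. */
static int drain(struct call_queue *q)
{
	int i, n = q->count;

	for (i = 0; i < n; i++)
		q->calls[i].func(q->calls[i].info);
	q->count = 0;
	return n;
}

/*
 * One IPI serves both kinds of request, so the handler cannot tell
 * which one was raised and simply checks both queues every time.
 */
int shared_call_function_ipi(void)
{
	return drain(&single_queue) + drain(&broadcast_queue);
}

/* Example callback: increments the int that info points to. */
void bump(void *info)
{
	(*(int *)info)++;
}
```

The trade-off in this model is that every interrupt pays for checking a queue that may be empty, in exchange for needing only one IPI number.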

Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
arch/mips/Kconfig | 1 +
arch/mips/kernel/smp.c | 139 ++++-------------------------------------------
arch/mips/kernel/smtc.c | 1 -
include/asm-mips/smp.h | 10 ----
4 files changed, 12 insertions(+), 139 deletions(-)

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index e5a7c5d..ea70d5a 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -1763,6 +1763,7 @@ config SMP
bool "Multi-Processing support"
depends on SYS_SUPPORTS_SMP
select IRQ_PER_CPU
+ select USE_GENERIC_SMP_HELPERS
help
This enables support for systems with more than one CPU. If you have
a system with only one CPU, like most personal computers, say N. If
diff --git a/arch/mips/kernel/smp.c b/arch/mips/kernel/smp.c
index 33780cc..98a5758 100644
--- a/arch/mips/kernel/smp.c
+++ b/arch/mips/kernel/smp.c
@@ -131,145 +131,28 @@ asmlinkage __cpuinit void start_secondary(void)
cpu_idle();
}

-DEFINE_SPINLOCK(smp_call_lock);
-
-struct call_data_struct *call_data;
-
-/*
- * Run a function on all other CPUs.
- *
- * <mask> cpuset_t of all processors to run the function on.
- * <func> The function to run. This must be fast and non-blocking.
- * <info> An arbitrary pointer to pass to the function.
- * <retry> If true, keep retrying until ready.
- * <wait> If true, wait until function has completed on other CPUs.
- * [RETURNS] 0 on success, else a negative status code.
- *
- * Does not return until remote CPUs are nearly ready to execute <func>
- * or are or have executed.
- *...

To: <linux-kernel@...>
Cc: <peterz@...>, <npiggin@...>, <linux-arch@...>, <jeremy@...>, <mingo@...>, <paulmck@...>, Jens Axboe <jens.axboe@...>, Paul Mundt <lethal@...>
Date: Tuesday, April 29, 2008 - 3:26 am

This converts sh to use the new helpers for smp_call_function() and
friends, and adds support for smp_call_function_single(). Not tested,
but it compiles.

Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
---
arch/sh/Kconfig | 1 +
arch/sh/kernel/smp.c | 48 ++++++++----------------------------------------
include/asm-sh/smp.h | 12 ++----------
3 files changed, 11 insertions(+), 50 deletions(-)

diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
index 6a679c3..60cfdf5 100644
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -698,6 +698,7 @@ config CRASH_DUMP
config SMP
bool "Symmetric multi-processing support"
depends on SYS_SUPPORTS_SMP
+ select USE_GENERIC_SMP_HELPERS
---help---
This enables support for systems with more than one CPU. If you have
a system with only one CPU, like most personal computers, say N. If
diff --git a/arch/sh/kernel/smp.c b/arch/sh/kernel/smp.c
index 5d039d1..2ed8dce 100644
--- a/arch/sh/kernel/smp.c
+++ b/arch/sh/kernel/smp.c
@@ -36,13 +36,6 @@ EXPORT_SYMBOL(cpu_possible_map);
cpumask_t cpu_online_map;
EXPORT_SYMBOL(cpu_online_map);

-static atomic_t cpus_booted = ATOMIC_INIT(0);
-
-/*
- * Run specified function on a particular processor.
- */
-void __smp_call_function(unsigned int cpu);
-
static inline void __init smp_store_cpu_info(unsigned int cpu)
{
struct sh_cpuinfo *c = cpu_data + cpu;
@@ -178,42 +171,17 @@ void smp_send_stop(void)
smp_call_function(stop_this_cpu, 0, 1, 0);
}

-struct smp_fn_call_struct smp_fn_call = {
- .lock = __SPIN_LOCK_UNLOCKED(smp_fn_call.lock),
- .finished = ATOMIC_INIT(0),
-};
-
-/*
- * The caller of this wants the passed function to run on every cpu. If wait
- * is set, wait until all cpus have finished the function before returning.
- * The lock is here to protect the call structure.
- * You must not call this function with disabled interrupts or from a
- * hardware interrupt handler or from a bottom half handler.
...
