> On Fri, May 02, 2008 at 04:12:34AM +0200, Nick Piggin wrote:
> > On Thu, May 01, 2008 at 07:02:41PM -0700, Paul E. McKenney wrote:
> > > On Wed, Apr 30, 2008 at 02:37:17PM +0200, Jens Axboe wrote:
> > > > > Here are some (probably totally broken) ideas:
> > > > >
> > > > > 1. Global lock so that only one smp_call_function() in the
> > > > > system proceeds. Additional calls would be spinning with
> > > > > irqs -enabled- on the lock, avoiding deadlock. Kind of
> > > > > defeats the purpose of your list, though...
> > > >
> > > > That is what we used to do, that will obviously work. But defeats most
> > > > of the purpose, unfortunately :-)
> > > >
> > > > > 2. Maintain a global mask of current targets of smp_call_function()
> > > > > CPUs. A given CPU may proceed if it is not a current target
> > > > > and if none of its target CPUs are already in the mask.
> > > > > This mask would be manipulated under a global lock.
> > > > >
> > > > > 3. As in #2 above, but use per-CPU counters. This allows the
> > > > > current CPU to proceed if it is not a target, but also allows
> > > > > concurrent smp_call_function()s to proceed even if their
> > > > > lists of target CPUs overlap.
> > > > >
> > > > > 4. #2 or #3, but where CPUs can proceed freely if their allocation
> > > > > succeeded.
> > > > >
> > > > > 5. If a given CPU is waiting for other CPUs to respond, it polls
> > > > > its own list (with irqs disabled), thus breaking the deadlock.
> > > > > This means that you cannot call smp_call_function() while holding
> > > > > a lock that might be acquired by the called function, but that
> > > > > is not a new prohibition -- the only safe way to hold such a
> > > > > lock is with irqs disabled, and you are not allowed to call
> > > > > the smp_call_function() with irqs disabled in the first place
> > > > > (right?).
> > > > >
> > > > > #5 might actually work...
> > > >
> > > > Yeah, #5 sounds quite promising. I'll see if I can work up a patch for
> > > > that, or if you feel so inclined, I'll definitely take patches :-)
> > > >
> > > > The branch is 'generic-ipi' on git://git.kernel.dk/linux-2.6-block.git
> > > > The link is pretty slow, so it's best pull'ed off of Linus base. Or just
> > > > grab the patches from the gitweb interface:
> > > >
> > > >
http://git.kernel.dk/?p=linux-2.6-block.git;a=shortlog;h=refs/heads/generic-ipi
> > >
> > > And here is an untested patch for getting rid of the fallback element,
> > > and eliminating the "wait" deadlocks.
> >
> > Hey this is coming along really nicely, thanks guys.
> >
> > The only problem I have with this is that if you turn IRQs off, you
> > probably don't expect call function functions to be processed under
> > you (sure that doesn't happen now, but it could if anybody actually
> > starts to call IPIs under irq off).
>
> OK -- for some reason, I was thinking that it was illegal to
> invoke smp_call_function() with irqs disabled...
>
> Ah, I see it -- smp_call_function_mask() says:
>
> * You must not call this function with disabled interrupts or from a
> * hardware interrupt handler or from a bottom half handler.
>
> So we have no problem with smp_call_function, then.
>
> OK, so smp_call_function() -can- be invoked with irqs disabled?
> Hmmm... I will give this some thought.
>
> > What I _really_ wanted to do is just keep the core API as a non-deadlocky
> > one that has its data passed into it; and then implemented the fallbacky,
> > deadlocky one on top of that. In places where it makes sense, callers
> > could then use the new API if they want to.
>
> I don't believe that you can make the fallback non-deadlocky... Perhaps
> a failure of imagination on my part, of course, but I am beginning to
> doubt that...