Re: [patch] x86: improved memory barrier implementation

!MAILaRCHIVE_VOTE_RePLACE
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
To: Alan Cox <alan@...>
Cc: Linux Kernel Mailing List <linux-kernel@...>, Linus Torvalds <torvalds@...>, Andi Kleen <ak@...>
Date: Friday, September 28, 2007 - 12:54 pm

On Fri, Sep 28, 2007 at 05:07:19PM +0100, Alan Cox wrote:

And we still have the correct smp_wmb barrier for OOOSTORE systems. And
we never did any smp_wmb tricks for broken ppros in barrier code.

IOW, my patch is a complete noop for both (it changes smp_rmb to be a
noop, of course, but AFAIKS that shouldn't be relevant for either
problem -- and if it was, then you need to add special ifdefs for
them rather than make everyone else eat proverbial shit, like we do
for spin_unlock).



l:

Note that for IO, rmb/wmb/mb remain as they are.



Aside from the fact that we currently don't do anything for ppro in
smp_wmb *anyway*, I really don't think adding stuff there can fix it:
you can fix critical sections, because you can be sure that all stores
to the same memory regions are strictly fenced inside the locks, except
for the ops on the locks themselves. For those stores, you obviously can
work around the problems by using lock ops.

However, for lockless code, it is entirely possible to do conflicting
unlocked stores to the same location. Memory barriers are never going to
stop this, because they do not provide any synchronisation. They may make
some things less likely or some actual patterns effectively safe...

Anyway, I'm not arguing against adding "stuff" here for it. I repeat:
that isn't concerned with this patch.



OK, so in theory we could make smp_wmb just a simple barrier, however the
code for ooostore there is under a config option, so there is no harm in
keeping it around really.



Can I theorise that it will reduce icache misses and be done with? ;)



Doubt it. It would be really interesting to have an _exact_ trace of
how it breaks that isn't fundamentally broken already. I suspect that
information will never come out of intel. But again, if you think we
need some stronger stuff in smp_wmb, I can do a patch that depends
on the broken ppro stores config option if you just tell me what should
be there (again, remember that my patch isn't actually changing anything
already there except for smp_rmb side).



No. We already _have_ to rely on this model for barriers because we
have a lock-less spin_unlock. We can either: put explicit mfences around
all stores that aren't executed inorder WRT other stores; or make
spin_unlock use the lock prefix.

We have pretty explicitly decided to do the former, and the fact that
subsequent patches to implement these exotic stores have broken the
memory ordering model does in no way invalidate this patch, IMO.



Thanks :) Very nice feedback (I wasn't having a gentle dig at you,
honest!)

-
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
[patch] x86: improved memory barrier implementation, Nick Piggin, (Fri Sep 28, 11:48 am)
Re: [patch] x86: improved memory barrier implementation, Nick Piggin, (Fri Sep 28, 12:54 pm)
Re: [patch] x86: improved memory barrier implementation, Linus Torvalds, (Fri Sep 28, 12:15 pm)
Re: [patch] x86: improved memory barrier implementation, Linus Torvalds, (Sat Sep 29, 12:07 pm)