login
Header Space

 
 

Re: [PATCH 8/8] i386: bitops: smp_mb__{before, after}_clear_bit() definitions

Score:
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
To: Satyam Sharma <ssatyam@...>
Cc: Nick Piggin <nickpiggin@...>, Linux Kernel Mailing List <linux-kernel@...>, David Howells <dhowells@...>, Andi Kleen <ak@...>, Andrew Morton <akpm@...>
Date: Tuesday, July 24, 2007 - 1:12 pm

On Tue, 24 Jul 2007, Satyam Sharma wrote:

No. CPU memory barriers extend to all CPU's. End of discussion.

It's not about "that cacheline". The whole *point* of a CPU memory barrier 
is that it's about independent memory accesses.

Yes, for a memory barrier to be effective, all CPU's involved in the 
transaction have to have the barriers - the same way a lock needs to be 
taken by everybody in order for it to make sense - but the point is, CPU 
barriers are about *global* behaviour, not local ones.

So there's a *huge* difference between

	clear_bit(x,y);

and

	clear_bit(x,y);
	smp_mb__before_after_clear_bit();

and it has absolutely nothing to do with the particular cacheline that "y" 
is in, it's about the *global* memory ordering.

Any write you do after that "smp_mb__before_after_clear_bit()" will be 
guaranteed to be visible to _other_ CPU's *after* they have seen the bit 
being cleared. Yes, those other CPU's need to have a read barrier between 
reading the bit and reading some other thign, but the point is, this hass 
*nothing* to do with cache coherency, and the particular cache line that 
"y" is in.

And no, "smp_mb__before/after_clear_bit()" must *not* be just an empty "do 
{} while (0)". It needs to be a compiler barrier even when it has no 
actual CPU meaning, unless clear_bit() itself is guaranteed to be a 
compiler barrier (which it isn't, although the "volatile" on the asm in 
practice makes it something *close* to that).

Why? Think of the sequence like this:

	clear_bit(x,y);
	smp_mb__after_clear_bit();
	other_variable = 10;

the whole *point* of this sequence is that if another CPU does

	x = other_variable;
	smp_rmb();
	bit = test_bit(x,y)

then if it sees "x" being 10, then the bit *has* to be clear.

And this is why the compiler barrier in "smp_mb__after_clear_bit()" needs 
to be a compiler barrier:

 - it doesn't matter for the action of the "clear_bit()" itself: that one 
   is locked, and on x86 it thus also happens to be a serializing 
   instruction, and the cache coherency and lock obviously means that the 
   bit clearing *itself* is safe!

 - but it *does* matter for the compiler scheduling. If the compiler were 
   to decide that "y" and "other_variable" are totally independent, it 
   might otherwise decide to move the "other_variable = 10" assignment to 
   *before* the clear_bit(), which would make the whole code pointless!

See? We have two totally independent issues:

 - the CPU itself can re-order the visibility of accesses. x86 doesn't do 
   this very much, and doesn't do it at all across a locked instruction, 
   but it's still a real issue, even if it tends to be much easier to see 
   on other architectures.

 - the compiler doesn't care about rules of "locked instruction" at all, 
   because it has no clue. It has *different* rules about how it can 
   re-order instructions and accesses, and maybe the "asm volatile" will 
   guarantee that the compiler won't re-order things around the 
   clear_bit(), and maybe it won't. But making it a compiler barrier (by 
   using the "memory clobber" thing, *guarantees* that gcc cannot reorder 
   memory writes or reads.

See? Two different - and _totally_ independent - levels of ordering, and 
we need to make sure that both are valid.

		Linus
-
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
[PATCH 0/8] i386: bitops: Cleanup, sanitize, optimize, Satyam Sharma, (Mon Jul 23, 12:05 pm)
Re: [PATCH 0/8] i386: bitops: Cleanup, sanitize, optimize, Denis Vlasenko, (Mon Jul 30, 1:57 pm)
Re: [PATCH 0/8] i386: bitops: Cleanup, sanitize, optimize, Satyam Sharma, (Mon Jul 30, 9:07 pm)
Re: [PATCH 4/8] i386: bitops: Kill volatile-casting of memor..., Benjamin Herrenschmidt, (Tue Jul 24, 5:49 am)
[PATCH 3/8] i386: bitops: Rectify bogus "+m" constraints, Satyam Sharma, (Mon Jul 23, 12:05 pm)
[PATCH 2/8] i386: bitops: Rectify bogus "Ir" constraints, Satyam Sharma, (Mon Jul 23, 12:05 pm)
Re: [PATCH 8/8] i386: bitops: smp_mb__{before, after}_clear_..., Linus Torvalds, (Tue Jul 24, 1:12 pm)
Re: [PATCH 8/8] i386: bitops: smp_mb__{before, after}_clear_..., Jeremy Fitzhardinge, (Tue Jul 24, 3:48 am)
Re: [PATCH 6/8] i386: bitops: Don't mark memory as clobbered..., Jeremy Fitzhardinge, (Tue Jul 24, 5:31 pm)
Re: [PATCH 6/8] i386: bitops: Don't mark memory as clobbered..., Benjamin Herrenschmidt, (Tue Jul 24, 5:52 am)
Re: [PATCH 6/8] i386: bitops: Don't mark memory as clobbered..., Benjamin Herrenschmidt, (Tue Jul 24, 5:36 pm)
Re: [PATCH 6/8] i386: bitops: Don't mark memory as clobbered..., Benjamin Herrenschmidt, (Tue Jul 24, 5:37 pm)
Re: [PATCH 6/8] i386: bitops: Don't mark memory as clobbered..., Benjamin Herrenschmidt, (Tue Jul 24, 6:32 pm)
Re: [PATCH 6/8] i386: bitops: Don't mark memory as clobbered..., Jeremy Fitzhardinge, (Mon Jul 23, 1:49 pm)
[PATCH 1/8] i386: bitops: Update/correct comments, Satyam Sharma, (Mon Jul 23, 12:05 pm)
Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__..., Jeremy Fitzhardinge, (Mon Jul 23, 12:23 pm)
Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__..., Jeremy Fitzhardinge, (Mon Jul 23, 1:39 pm)
Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__..., Jeremy Fitzhardinge, (Mon Jul 23, 2:28 pm)
Re: [PATCH 7/8] i386: bitops: Kill needless usage of __asm__..., Jeremy Fitzhardinge, (Mon Jul 23, 4:40 pm)
speck-geostationary