Cc: Peter Zijlstra <peterz@...>, Stefan Richter <stefanr@...>, <jmerkey@...>, <linux-kernel@...>, Linus Torvalds <torvalds@...>, Nick Piggin <nickpiggin@...>, David Howells <dhowells@...>
I have thoroughly reviewed the Linux memory barriers, and the efficacy of
the barriers as defined in Linux is not the issue here. The code segment
under discussion sits and spins on a variable, waiting for a specific
state, and it is a spinlock, which creates a hard barrier, so no amount of
barrier usage should matter here, nor does it. Even if a processor were
late in flushing its writes, sooner or later the spinning processor would
see the change at the shared memory address -- IF IT WERE ACTUALLY A
SHARED REFERENCE. What I am seeing is not a race between processors on
load/store operations, but cases where gcc has chosen to optimize away
global references entirely.
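
To illustrate the class of problem (this is only a sketch, not the code
in question): when a loop tests a plain global, gcc is allowed to load it
once and spin on the register copy. Reading through a volatile-qualified
lvalue forces a real load on every pass. The names shared_state,
FORCE_LOAD and spin_until below are hypothetical, and the spin is bounded
only so the example terminates.

```c
#include <assert.h>

/* Hypothetical shared variable; the name is not from the code discussed. */
static int shared_state;

/*
 * Read x through a volatile-qualified lvalue so gcc must emit a real
 * load each time instead of reusing a value cached in a register.
 * __typeof__ is a GCC extension.
 */
#define FORCE_LOAD(x) (*(volatile __typeof__(x) *)&(x))

/*
 * Spin until *addr == want, giving up after max_spins iterations.
 * Returns the number of failed checks before the match, or -1 on timeout.
 *
 * Written as a plain "while (*addr != want) ;" the optimizer may hoist
 * the load out of the loop, because nothing in the loop body modifies
 * *addr as far as the compiler can tell.
 */
static long spin_until(int *addr, int want, long max_spins)
{
	long spins;

	assert(addr);
	for (spins = 0; spins < max_spins; spins++) {
		if (FORCE_LOAD(*addr) == want)
			return spins;
	}
	return -1;
}
```

A caller would use it as spin_until(&shared_state, 1, 1000000). This
only addresses the compiler dropping the reference; it says nothing about
ordering between processors, which is a separate question.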
Jeff