On Sun, Sep 30, 2007 at 04:02:09PM -0700, Davide Libenzi wrote:

Actually, CPU designers have to go quite a ways out of their way to
prevent this BUG_ON from happening.  One way that it would happen
naturally would be if the cache line containing P were owned by CPU 2,
and if CPUs 0 and 1 shared a store buffer that they both snooped.  So,
here is what could happen given careless or sadistic CPU designers:

o	CPU 0 stores &B to P, but misses the cache, so puts the
	result in the store buffer.  This means that only CPUs 0
	and 1 can see it.

o	CPU 1 fetches P, and sees &B, so stores a 1 to B.  Again,
	this value for P is visible only to CPUs 0 and 1.

o	CPU 1 executes a wmb(), which forces CPU 1's stores to happen
	in order.  But it does nothing about CPU 0's stores, nor about
	CPU 1's loads, for that matter (and the only reason that POWER
	ends up working the way you would like is because wmb() turns
	into "sync" rather than the "eieio" instruction that would have
	been used for smp_wmb() -- which is maybe what Oleg was thinking
	of, but happened to abbreviate.  If my analysis is buggy, Anton
	and Paulus will no doubt correct me...)

o	CPU 1 stores to X.

o	CPU 2 loads X, and sees that the value is 1.

o	CPU 2 does an rmb(), which orders its loads, but does nothing
	about anyone else's loads or stores.

o	CPU 2 fetches P from its cached copy, which still points to A,
	which is still zero.  So the BUG_ON fires.

o	Some time later, CPU 0 gets the cache line containing P from
	CPU 2, and updates it from the value in the store buffer, but
	too late...

Unfortunately, cache-coherence protocols don't care much about pure
time...  It is possible to make a 16-CPU machine believe that a single
variable has more than ten different values -at- -the- -same- -time-.
This is easy to do -- have all the CPUs store different values to the
same variable at the same time, then reload, collecting timestamps
between each pair of operations.  On a large SMP, the values will sit
in the store buffers for many hundreds of nanoseconds, perhaps even
several microseconds, while the cache line containing the variable
being stored to shuttles around among the CPUs.  ;-)

						Thanx, Paul
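
The sequence of steps above can be replayed in a small deterministic model. What follows is only a sketch of the hypothetical hardware described in this message, not of any real CPU: ToyMachine, run_scenario() and the variable encoding (P holding the name "A" or "B" in place of an address) are made up for illustration. The BUG_ON condition is also an assumption, since the code under discussion is not quoted here; it is taken to be "X was seen as 1, yet *P still reads 0".

```python
class ToyMachine:
    """Toy model: coherent memory plus one store buffer that is shared
    and snooped by CPUs 0 and 1 only (the hypothetical design above)."""

    def __init__(self):
        # Globally visible state; P initially points to A (stored by name).
        self.coherent = {"A": 0, "B": 0, "X": 0, "P": "A"}
        # Pending stores as (cpu, variable, value), oldest first.
        self.store_buffer = []

    def store(self, cpu, var, value, cache_miss=False):
        # A store that misses the cache waits in the shared store buffer;
        # otherwise it becomes globally visible immediately.
        if cache_miss and cpu in (0, 1):
            self.store_buffer.append((cpu, var, value))
        else:
            self.coherent[var] = value

    def load(self, cpu, var):
        # CPUs 0 and 1 snoop the shared buffer (newest entry wins);
        # every other CPU sees only the coherent state.
        if cpu in (0, 1):
            for _, buffered_var, value in reversed(self.store_buffer):
                if buffered_var == var:
                    return value
        return self.coherent[var]

    def wmb(self, cpu):
        # Orders only the calling CPU's own stores: commit them in order
        # and leave other CPUs' buffered stores untouched.
        remaining = []
        for entry in self.store_buffer:
            if entry[0] == cpu:
                self.coherent[entry[1]] = entry[2]
            else:
                remaining.append(entry)
        self.store_buffer = remaining

    def rmb(self, cpu):
        # Loads already issue in program order in this model, and rmb()
        # has no effect on any other CPU's loads or stores.
        pass

    def drain(self):
        # "Some time later": buffered stores finally reach coherent memory.
        for _, var, value in self.store_buffer:
            self.coherent[var] = value
        self.store_buffer = []


def run_scenario():
    m = ToyMachine()
    m.store(0, "P", "B", cache_miss=True)  # CPU 0: P = &B, stuck in buffer
    p1 = m.load(1, "P")                    # CPU 1 snoops the buffer, sees &B
    m.store(1, p1, 1)                      # CPU 1: *P = 1, i.e. B = 1
    m.wmb(1)                               # orders CPU 1's stores only
    m.store(1, "X", 1)                     # CPU 1: X = 1
    x = m.load(2, "X")                     # CPU 2 sees X == 1
    m.rmb(2)                               # orders CPU 2's loads only
    p2 = m.load(2, "P")                    # CPU 2's copy of P still names A
    value = m.load(2, p2)                  # A is still zero
    bug_on_fires = (x == 1 and value == 0) # assumed BUG_ON condition
    m.drain()                              # CPU 0's store lands, too late
    return {
        "cpu1_saw_P": p1,
        "x": x,
        "cpu2_saw_P": p2,
        "value": value,
        "bug_on_fires": bug_on_fires,
        "P_after_drain": m.coherent["P"],
    }


if __name__ == "__main__":
    print(run_scenario())
```

In this model the wmb() on CPU 1 commits nothing that matters, because the only buffered store belongs to CPU 0. That is the point of the scenario: neither barrier constrains when CPU 0's store to P becomes visible to CPU 2.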
