On Sat, 29 Sep 2007, Nick Piggin wrote:

I'm really saying that people to a first approximation should think "NT is an IO (DMA) thing". Whether cached or not. Exactly because they do not honor the normal memory ordering.

It may be worth noting that "clflush" falls under that heading too - even if all the actual *writes* were done with totally normal writes, if anybody does a clflush instruction, that breaks the ordering, and that turns it to "DMA ordering" again - ie we're not talking about the normal SMP ordering rules at all.

So all the spinlocks and all the smp_*mb() barriers have never really done *anything* for those things (in particular, "smp_wmb()" has *always* ignored them on i386!)

No. As far as I can tell, the fast string operations are unordered *within themselves*, but not wrt the operations around them.

In other words, you cannot depend on the ordering of stores *in* the memcpy() or memset() when it is implemented by "rep movs/stos" - but that is 100% equivalent to the fact that you cannot depend on the ordering even when it isn't - since the "memcpy()" library routine might be copying memory backwards for all you know!

The Intel memory ordering paper doesn't talk about the fast string instructions (except to say that the rules it *does* speak about do not hold), but the regular IA manuals do say (for example):

  "Code dependent upon sequential store ordering should not use the string operations for the entire data structure to be stored. Data and semaphores should be separated. Order dependent code should use a discrete semaphore uniquely stored to after any string operations to allow correctly ordered data to be seen by all processors."

and note how it says you should just store to the semaphore. If you think about it, that semaphore will involve all the memory ordering requirements that we *already* depend on, so if a semaphore is sufficient to order the fast string instruction, then by definition using a spinlock around them must be the same thing!
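The pattern in that manual quote (bulk data written with string operations, then a separate store to a discrete flag) can be sketched in userspace C. This is only an illustration under assumptions, not kernel code: the names payload, ready, publish() and try_consume() are hypothetical, and C11 release/acquire atomics stand in for the "semaphore" store and load. Whether memcpy() actually compiles to "rep movs" depends on the compiler and libc.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define PAYLOAD_SIZE 64

/* Hypothetical names: 'payload' is the bulk data and 'ready' plays the
 * role of the "discrete semaphore" from the manual quote. They are kept
 * as separate objects, as the quote asks. */
static char payload[PAYLOAD_SIZE];
static atomic_int ready; /* static storage, so it starts out as 0 */

/* Writer side: do the bulk copy first, then one separate store to the
 * flag. Nothing here relies on the order of the stores *inside* memcpy(). */
void publish(const char *msg)
{
    size_t len = strlen(msg);

    if (len >= PAYLOAD_SIZE)
        len = PAYLOAD_SIZE - 1;
    memcpy(payload, msg, len); /* may or may not become "rep movs" */
    payload[len] = '\0';

    /* Release store: the payload writes above become visible before
     * any reader can observe ready == 1. */
    atomic_store_explicit(&ready, 1, memory_order_release);
}

/* Reader side: returns 1 and copies the payload out if it has been
 * published, 0 otherwise. */
int try_consume(char *out, size_t out_size)
{
    if (out_size == 0)
        return 0;

    /* Acquire load pairs with the release store in publish(). */
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return 0;

    strncpy(out, payload, out_size - 1);
    out[out_size - 1] = '\0';
    return 1;
}
```

In real use publish() and try_consume() would run on different threads or CPUs; the reader only touches the payload after it has seen the flag, which is the same guarantee a lock around the copy would give.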
In other words, by Intel's architecture manual, fast string instructions cannot escape a "semaphore" - but that means that they cannot escape a spinlock either (since the two are exactly the same wrt memory ordering rules! In other words, whenever the Intel docs say "semaphore", think "mutual exclusion lock", not necessarily the kernel kind of "sleeping semaphore").

But it might be good to have that explicitly mentioned in the IA memory ordering thing, so I'll ask the Intel people about that.

However, I'd say that given our *current* documentation, string instructions may be *internally* out-of-order, but they would not escape a lock.

See above.

I do not believe that it's an existing bug, but the basic point that the change to "smp_rmb()" doesn't change our existing rules is true.

Linus
