On Mon, Oct 15, 2007 at 09:44:05AM +0200, Jarek Poplawski wrote:

I'd say that's exactly what Intel wanted. It's pretty common (we do it all the time in the kernel too) to create an API which places a stronger requirement on the caller than is actually required. It can make changes much less painful.

Has performance really been much of a problem for you (even before the lfence instruction, when you theoretically had to use a locked op)? I mean, I'd struggle to find a place in the Linux kernel where there is actually a measurable difference anywhere... and we're pretty performance critical and I think we have a reasonable amount of lockless code (I guess we may not have a lot of tight computational loops, though). I'd be interested to know what, if any, application had found these barriers to be problematic...

The thing is that those documents are not defining what a particular implementation does, but how the architecture is defined (i.e. what must some arbitrary software/hardware provide and what may it expect). It's pretty natural that Intel started out with a weaker guarantee than their CPUs of the time actually supported, and tightened it up after (presumably) deciding not to implement such relaxed semantics for the foreseeable future.
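To make the "stronger requirement on the caller than is actually required" point concrete, here is a minimal sketch in portable C11, not kernel code. Every name in it (sketch_read_barrier, sketch_write_barrier, sketch_mailbox, sketch_publish, sketch_consume, and the SKETCH_LOADS_ALREADY_ORDERED macro) is hypothetical and made up for illustration. The caller always issues the read barrier, because that is what the documented contract asks for. Whether the barrier costs anything is left to the implementation: a build that assumes the hardware already keeps loads in order can reduce it to a compiler-only barrier without touching any caller.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names for illustration only; not the kernel's rmb()/wmb(). */

static inline void sketch_read_barrier(void)
{
#if defined(SKETCH_LOADS_ALREADY_ORDERED)
    /* Assumption: this implementation never reorders loads, so it is
     * enough to stop the compiler from moving them. */
    atomic_signal_fence(memory_order_acquire);
#else
    /* The documented contract is weaker, so emit a real fence. */
    atomic_thread_fence(memory_order_acquire);
#endif
}

static inline void sketch_write_barrier(void)
{
    atomic_thread_fence(memory_order_release);
}

struct sketch_mailbox {
    int payload;        /* plain data, published via the flag below */
    atomic_bool ready;  /* set once payload is valid */
};

/* Writer: store the data, then the flag, with a barrier in between. */
static void sketch_publish(struct sketch_mailbox *m, int value)
{
    m->payload = value;
    sketch_write_barrier();
    atomic_store_explicit(&m->ready, true, memory_order_relaxed);
}

/* Reader: callers always go through the barrier, whatever it costs
 * on the implementation they happen to be running on. */
static bool sketch_consume(struct sketch_mailbox *m, int *out)
{
    if (!atomic_load_explicit(&m->ready, memory_order_relaxed))
        return false;
    sketch_read_barrier();
    *out = m->payload;
    return true;
}
```

The design choice is that callers are written against the weaker architectural guarantee, so tightening (or relaxing) what the barrier expands to later is a one-place change.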
