Le lundi 15 novembre 2010 à 08:25 -0600, Christoph Lameter a écrit :
Yes, this is what cmpxchg() does for sure.
Maybe its not clear, but atomic_inc_not_zero_hint() is going to be used
only in contexts we know the expected value, and not as a generic
replacement for atomic_inc_not_zero(). Even if cache line is already hot
in this cpu cache, it should be faster or same speed.
Then, in high contention contexts, using atomic_inc_not_zero_hint() with
whatever initial hint might also be a win over atomic_inc_not_zero(),
but we try to remove such contexts ;)
And two atomic_cmpxchg() are probably slower in non contended contexts,
in particular is cache line is already hot in this cpu cache.
In fact, in benchmarks, prefetch() or prefetchw() are a pain on x86, or
at least "perf tools" show artifact on them (high number of cycles
consumed on these instructions)
Andi had a patch to disable prefetch() in list iterators, and its a win.
I dont have Itanium platform to run tests. Is cmpxchg() that bad on
ia64 ? I also have old AMD cpus, so I cannot say if recent ones handle
prefetchw() better...
--