On Tue, 3 Apr 2007 14:49:48 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:
Rohit solved this puzzle.
The 2-way is a single package, hyperthreaded.
The 8-way is two-package, four cores in each.
So on the 8-way, that lock is getting transferred between the two packages
like crazy. Running the benchmark on just cpus 0 and 1 (taskset -c 0,1)
took the runtime down to eight seconds (from 52!) and the context switch
rate went up to 200,000/sec (from 45,000).
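
For reference, the effect of `taskset -c 0,1` can be approximated from inside a program with `sched_setaffinity(2)`. This is only a minimal sketch, not anything taken from the benchmark itself: the helper names `build_cpu_mask` and `pin_to_cpus_0_and_1` are made up for illustration, and whether CPUs 0 and 1 actually share a package depends on how the kernel numbered the CPUs on a given box.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Hypothetical helper: build a mask holding exactly cpus[0..n-1]. */
static void build_cpu_mask(cpu_set_t *set, const int *cpus, int n)
{
    CPU_ZERO(set);
    for (int i = 0; i < n; i++) {
        /* Guard against indices the fixed-size cpu_set_t cannot hold. */
        assert(cpus[i] >= 0 && cpus[i] < CPU_SETSIZE);
        CPU_SET(cpus[i], set);
    }
}

/*
 * Hypothetical helper: restrict the caller to CPUs 0 and 1, roughly what
 * `taskset -c 0,1` does before exec'ing the benchmark.
 * Returns 0 on success, -1 on failure with errno set.
 */
static int pin_to_cpus_0_and_1(void)
{
    static const int cpus[] = { 0, 1 };
    cpu_set_t set;

    build_cpu_mask(&set, cpus, 2);
    /* A pid of 0 means "the calling thread". */
    return sched_setaffinity(0, sizeof(set), &set);
}
```

Calling `pin_to_cpus_0_and_1()` early in `main()`, before any fork, should be enough for a test like this one, since the affinity mask is inherited by children and preserved across `execve()`.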