I think the cost of moving to another CPU depends heavily on the CPU type.
On a P4 with HT the sibling threads share the caches, so moving costs
almost nothing in lost cache hits, while on CPUs with other cache layouts
the migration cost is higher. Obviously multi-core should be cheaper than
multi-socket, since it avoids the system memory bus, but it can still
get ugly.
I have an IPC test around which showed that: it ran very fast on HT, and
progressively worse as the cache became less shared. I wonder why the
latest git works so much better?
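
For anyone who wants to see the effect on their own hardware, here is a rough sketch of that kind of measurement. It is not the IPC test mentioned above, only a minimal stand-in: two processes bounce one byte over a pair of pipes, each pinned with sched_setaffinity. The function name pingpong_usec and the default round count are made up for illustration, and it assumes Linux. It measures IPC latency between pinned tasks, which is only a proxy for migration cost.

```python
import os
import time


def pingpong_usec(cpu_parent, cpu_child, rounds=10000):
    """Mean round-trip time (microseconds) of a 1-byte pipe ping-pong
    between a parent pinned to cpu_parent and a child pinned to cpu_child.
    Hypothetical helper for illustration, Linux only."""
    p2c_r, p2c_w = os.pipe()   # parent -> child
    c2p_r, c2p_w = os.pipe()   # child -> parent
    pid = os.fork()
    if pid == 0:
        # Child: pin to its CPU, then echo every byte back.
        try:
            os.sched_setaffinity(0, {cpu_child})
            os.close(p2c_w)
            os.close(c2p_r)
            for _ in range(rounds):
                os.write(c2p_w, os.read(p2c_r, 1))
        finally:
            os._exit(0)

    # Parent: close the ends it does not use, pin, and time the loop.
    os.close(p2c_r)
    os.close(c2p_w)
    saved = os.sched_getaffinity(0)
    os.sched_setaffinity(0, {cpu_parent})
    try:
        start = time.perf_counter()
        for _ in range(rounds):
            os.write(p2c_w, b"x")
            os.read(c2p_r, 1)
        elapsed = time.perf_counter() - start
    finally:
        os.sched_setaffinity(0, saved)   # restore the original mask
        os.close(p2c_w)
        os.close(c2p_r)
        os.waitpid(pid, 0)
    return elapsed / rounds * 1e6


if __name__ == "__main__":
    cpus = sorted(os.sched_getaffinity(0))
    base = cpus[0]
    # Same CPU first, then the base CPU paired with every other allowed CPU.
    for other in cpus:
        usec = pingpong_usec(base, other)
        print(f"cpu{base} <-> cpu{other}: {usec:.2f} usec per round trip")
```

Which pairs are HT siblings, share a core cache, or sit on separate sockets is hardware specific; on Linux the files under /sys/devices/system/cpu/cpuN/topology/ should tell you. I would expect the numbers to follow the pattern described above, but treat any single run as noisy.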
--
Bill Davidsen <davidsen@tmr.com>
"We have more to fear from the bungling of the incompetent than from
the machinations of the wicked." - from Slashdot
--