On Tue, 2008-08-26 at 06:43 -0700, Paul E. McKenney wrote:
Right - while the local count will be balanced and will always end up at
zero, you have to check the remote counts for zero as well.
But after a counter flip, the dying counter will only reach zero once
per cpu.
So each cpu gets to tickle a softirq once per cycle. That softirq can
then check all remote counters, and kick off the callback list when it
finds them all zero.
Of course, this scan is very expensive: O(n^2) at worst, with each cpu
triggering a full scan of all n counters, until finally the last cpu is
done.
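
A rough user-space sketch of that naive scheme, just to make the cost
visible. This is not kernel code: NR_CPUS, dying_count[], counters_read
and both function names are made up for illustration, and it is
single-threaded, so there are no atomics or barriers.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical CPU count for this sketch */

/* Per-cpu count of readers still on the old (dying) counter. */
static int dying_count[NR_CPUS];

/* Cost accounting only: how many remote counters we looked at. */
static unsigned long counters_read;

/* Scan the counters; stop at the first one that is still non-zero. */
static bool all_counters_zero(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		counters_read++;
		if (dying_count[cpu] != 0)
			return false;
	}
	return true;
}

/*
 * Stand-in for the softirq a cpu raises when its own dying counter
 * reaches zero. Returns true when every counter is zero, i.e. when
 * the callback list could be kicked off.
 */
static bool cpu_counter_hit_zero(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	dying_count[cpu] = 0;
	return all_counters_zero();
}
```

Every cpu that drains triggers its own scan, so the number of counters
read grows roughly with the square of the cpu count when the cpus finish
one after the other.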
We could optimize this by keeping cpu masks of cpus found to have !0
counts - those that were found to have 0 will always stay zero, so we
won't have to look at them again.
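
Something along these lines, again only as a single-threaded user-space
sketch: pending_mask, start_scan_cycle() and scan_pending() are
hypothetical names, a plain 32-bit word stands in for a real cpumask,
and a kernel version would need proper atomic bit operations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 4	/* hypothetical CPU count for this sketch */

/* Per-cpu count of readers still on the old (dying) counter. */
static int dying_count[NR_CPUS];

/* Bit set: this cpu has not yet been seen with a zero count. */
static uint32_t pending_mask;

/* Cost accounting only: how many remote counters we looked at. */
static unsigned long counters_read;

/* Called after a counter flip: every cpu must be checked again. */
static void start_scan_cycle(void)
{
	assert(NR_CPUS < 32);
	pending_mask = (1u << NR_CPUS) - 1;
}

/*
 * Look only at cpus still marked pending. A cpu found at zero is
 * dropped from the mask and never read again in this cycle.
 * Returns true once no cpu is pending.
 */
static bool scan_pending(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		uint32_t bit = 1u << cpu;

		if (!(pending_mask & bit))
			continue;	/* already seen at zero */
		counters_read++;
		if (dying_count[cpu] == 0)
			pending_mask &= ~bit;
	}
	return pending_mask == 0;
}
```

Each scan gets shorter as cpus drop out of the mask. When the cpus
finish one at a time the total is still quadratic, but it is about half
of what the full rescan costs.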
Another option is to make use of a scanning hierarchy.
--