Ok, thanks.
If I understand the new code in tip/rcu correctly, you have rewritten
that block anyway.
I'll try to implement my proposal - on paper, it looks far simpler than
the current code.
On the one hand, a state machine that keeps track of a global state:
- collect the callbacks in a nxt list.
- wait for a quiescent state.
- destroy the callbacks in the nxt list.
(actually, there will be 5 states, 2 additional for "start the next rcu
cycle immediately")
On the other hand, a cpu bitmap that keeps track of the cpus that have
completed the work that must be done after a state change.
The last cpu to complete its work advances the global state.
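
To make the idea concrete, here is a rough userspace sketch of the global
state plus a flat cpu bitmap. All names (rcu_global, rcu_cpu_done, ...) are
made up for illustration and are not taken from tip/rcu. Only the three basic
states are shown; the two additional "start the next cycle immediately" states
are left out, and there is no locking at all, so this shows only the
bookkeeping, not a usable implementation.

```c
#include <assert.h>

#define NR_CPUS 4   /* assumption: small fixed cpu count, flat bitmap */

/* Hypothetical state names; only the three phases come from the proposal. */
enum rcu_gstate {
	RCU_STATE_COLLECT,   /* collect the callbacks in a nxt list */
	RCU_STATE_WAIT_QS,   /* wait for a quiescent state */
	RCU_STATE_DESTROY,   /* destroy the callbacks in the nxt list */
	RCU_STATE_COUNT
};

struct rcu_global {
	enum rcu_gstate state;
	/* bit n set: cpu n has not yet finished its work for this state */
	unsigned long pending;
};

static unsigned long all_cpus_mask(void)
{
	return (1UL << NR_CPUS) - 1;
}

static void rcu_global_init(struct rcu_global *g)
{
	g->state = RCU_STATE_COLLECT;
	g->pending = all_cpus_mask();
}

/*
 * Called by a cpu once it has done the work required by the current state.
 * Returns 1 if this cpu was the last one and therefore advanced the global
 * state, 0 otherwise. No locking: a real version would need it.
 */
static int rcu_cpu_done(struct rcu_global *g, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	g->pending &= ~(1UL << cpu);
	if (g->pending != 0)
		return 0;

	/* Last cpu: move to the next state and re-arm the bitmap. */
	g->state = (enum rcu_gstate)((g->state + 1) % RCU_STATE_COUNT);
	g->pending = all_cpus_mask();
	return 1;
}
```

On uniprocessor, rcu_cpu_done() would collapse to just the state advance,
which is the "nop bitmap" case.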
The state machine could be protected by a seqlock; the cpu bitmap could be
hierarchical, flat, or, for uniprocessor, just a nop.
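
For the seqlock part, a rough userspace approximation with C11 atomics could
look like the sketch below. This is not the kernel seqlock API; the struct,
the function names and the "cycle" counter are hypothetical and only there to
show how a reader would get a consistent snapshot of the global state. It
assumes that writers are already serialized by some other means.

```c
#include <stdatomic.h>

/* Hypothetical layout: global state plus a cycle counter, guarded by seq. */
struct seq_state {
	atomic_uint seq;     /* odd while a writer is in the middle of an update */
	atomic_int state;    /* current global state */
	atomic_ulong cycle;  /* assumption: number of the current rcu cycle */
};

static void seq_state_init(struct seq_state *s)
{
	atomic_init(&s->seq, 0);
	atomic_init(&s->state, 0);
	atomic_init(&s->cycle, 0);
}

/* Writer side. Assumes only one writer runs at a time. */
static void seq_state_write(struct seq_state *s, int state, unsigned long cycle)
{
	atomic_fetch_add(&s->seq, 1);      /* seq becomes odd: update in progress */
	atomic_store(&s->state, state);
	atomic_store(&s->cycle, cycle);
	atomic_fetch_add(&s->seq, 1);      /* seq even again: update complete */
}

/* Reader side: retry until state and cycle come from the same update. */
static void seq_state_read(struct seq_state *s, int *state, unsigned long *cycle)
{
	unsigned int start;

	do {
		start = atomic_load(&s->seq);
		*state = atomic_load(&s->state);
		*cycle = atomic_load(&s->cycle);
	} while ((start & 1u) || atomic_load(&s->seq) != start);
}
```

Readers never block the cpu that advances the state; they just retry if they
raced with an update.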
Do you have any statistics about rcu_check_callbacks? On my single-cpu
system, around 2/3 of the calls are from "normal" context, i.e.
rcu_qsctr_inc() is called.
--
Manfred