Re: [PATCH RFC] v5 expedited "big hammer" RCU grace periods

From: Paul E. McKenney
Date: Wednesday, May 20, 2009 - 8:30 am

On Wed, May 20, 2009 at 10:09:24AM +0200, Ingo Molnar wrote:

I suppose that I could take this as a cue to reminisce about the old days
in a past life with a different implementation of CPU online/offline,
but life is just too short for that sort of thing.  Not that guys my
age let that stop them.  ;-)

And in that past life, exercising CPU online/offline usually exposed
painful bugs in new code, so I cannot claim that the old-life approach
to CPU hotplug was perfect.  Interestingly enough, running uniprocessor
also exposed painful bugs more often than not.  Of course, the only way
to run uniprocessor was to offline all but one of the CPUs, so you would
hit the online/offline bugs before hitting the uniprocessor-only bugs.

The thing that worries me most about CPU hotplug in Linux is that
there is no clear hierarchy of CPU function in the offline process,
given that the offlining process invokes notifiers in the same order
as does the onlining process.  Whether this is a real defect in the CPU
hotplug design or is instead simply a symptom of my not yet being fully
comfortable with the two-phase CPU-removal process is an interesting
question to which I do not have an answer.

Either way, the thought process is different.  In my old life, CPUs shed
roles in the opposite order from that in which they acquired them.  This meant that a
given CPU was naturally guaranteed to be correctly taking interrupts for
the entire time that it was capable of running user-level processes.
Later in the offlining process, it would still take interrupts, but
would be unable to run user processes.  Still later, it would no longer
be taking interrupts, and would stop participating in RCU and in the
global TLB-flush algorithm.  There was no need to stop the whole machine
to make a given CPU go offline; in fact, most of the work was done by
the CPU in question.
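
To make the ordering argument concrete, here is a toy model in C.  It is
only a sketch: the role names, the struct, and both functions are
hypothetical, and the exact position of RCU/TLB-flush participation
relative to interrupts is an assumption.  It is not Linux's hotplug code,
nor the old implementation.  It shows that shedding roles in reverse
order keeps the invariant "a CPU that can run user processes is also
taking interrupts", while shedding them in acquisition order breaks it.

```c
/* Hypothetical roles, listed in the order a CPU acquires them. */
enum cpu_role { ROLE_RCU_TLB, ROLE_INTERRUPTS, ROLE_USER_PROCS, NR_ROLES };

struct cpu_state {
	unsigned int roles;	/* bitmask of roles currently held */
};

/* Invariant: running user processes implies taking interrupts. */
static int invariant_holds(const struct cpu_state *cpu)
{
	int user = !!(cpu->roles & (1u << ROLE_USER_PROCS));
	int irqs = !!(cpu->roles & (1u << ROLE_INTERRUPTS));

	return !user || irqs;
}

/* Acquire every role in order; return 1 if the invariant held throughout. */
static int acquire_roles(struct cpu_state *cpu)
{
	int ok = 1;

	for (int r = 0; r < NR_ROLES; r++) {
		cpu->roles |= 1u << r;
		ok &= invariant_holds(cpu);
	}
	return ok;
}

/*
 * Shed every role, either in reverse acquisition order (reverse != 0)
 * or in acquisition order (reverse == 0).  Return 1 if the invariant
 * held after every step, 0 otherwise.
 */
static int shed_roles(struct cpu_state *cpu, int reverse)
{
	int ok = 1;

	for (int i = 0; i < NR_ROLES; i++) {
		int r = reverse ? NR_ROLES - 1 - i : i;

		cpu->roles &= ~(1u << r);
		ok &= invariant_holds(cpu);
	}
	return ok;
}
```

In this model, shed_roles(cpu, 1) never violates the invariant, whereas
shed_roles(cpu, 0) drops interrupts while the CPU can still run user
processes, which is exactly the window that the reverse-order discipline
rules out.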

In the case of RCU, this meant that there was no need for double-checking
for offlined CPUs, because CPUs could reliably indicate a quiescent
state on their way out.
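
A minimal single-threaded sketch of that idea follows.  All of the names
(gp_state, gp_start(), report_qs(), cpu_leave(), gp_done()) are
hypothetical, there is no locking or memory ordering, and it is not how
any real RCU implementation is written.  The only point is that if the
outgoing CPU reports its own quiescent state before dropping out of the
online mask, the grace-period machinery never has to go looking for
CPUs that vanished without reporting.

```c
struct gp_state {
	unsigned long online_mask;	/* CPUs participating in RCU */
	unsigned long qs_pending;	/* CPUs that still owe a quiescent state */
};

/* Start a grace period: every online CPU owes a quiescent state. */
static void gp_start(struct gp_state *gp)
{
	gp->qs_pending = gp->online_mask;
}

/* A CPU reports that it has passed through a quiescent state. */
static void report_qs(struct gp_state *gp, int cpu)
{
	gp->qs_pending &= ~(1UL << cpu);
}

/*
 * Run by the outgoing CPU itself: report a quiescent state first,
 * then stop participating in RCU.
 */
static void cpu_leave(struct gp_state *gp, int cpu)
{
	report_qs(gp, cpu);
	gp->online_mask &= ~(1UL << cpu);
}

/* The grace period is over once nobody owes a quiescent state. */
static int gp_done(const struct gp_state *gp)
{
	return gp->qs_pending == 0;
}
```

With three CPUs online, a grace period that has heard from CPUs 0 and 1
completes as soon as CPU 2 calls cpu_leave(), with no separate check for
offlined CPUs.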

On the other hand, there was no equivalent of dynticks in the old days.
And it is dynticks that is responsible for most of the complexity present
in force_quiescent_state(), not CPU hotplug.

So I cannot hold up RCU as something that would be greatly simplified
by changing the CPU hotplug design, much as I might like to.  ;-)

							Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html