Cc: Nigel Cunningham <nigel@...>, Benjamin Herrenschmidt <benh@...>, Pavel Machek <pavel@...>, Rafael J. Wysocki <rjw@...>, Matthew Garrett <mjg59@...>, <linux-kernel@...>, <linux-pm@...>
Except Linus already decreed (and I heartily agree) that hibernation
and suspend-to-RAM were fundamentally completely different operations
and therefore any attempts to share code were basically just making a
big muddy mess of things. Would a thread "Remove phase-of-the-moon
calculations from network-recv code" be relevant to lunar observation
just because the two had to do with the phases of the moon? No!
Why do we care? If the wakeup request arrives before we go to sleep,
we obviously aren't asleep and so can't wake up. If it arrives after
we go to sleep then it will wake us up. Anything that depends on a
wakeup arriving mid-sequence is 100% masochistic race condition.
(1a) As I describe below, step (1) includes setting NO_BIND and NO_IO
flags on devices as they are processed. Anybody who wants to do IO
while those flags are set should just go sleep on a waitqueue.
(1b) Again, that's where the NO_BIND flag comes in. If it's set then
any device probe events must sleep, otherwise they can go through.
See points (1a) and (1b) above.
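As a rough sketch of the NO_IO gate from (1a), here is a single-threaded userspace model in C. Every name in it (`DEV_NO_IO`, `DEV_NO_BIND`, `struct model_dev`, `dev_submit_io`, `dev_suspend`, `dev_resume`) is made up for illustration and is not a kernel API. "Sleeping on a waitqueue" is only approximated here by parking the request in a small array until resume.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag names borrowed from the discussion; not real kernel flags. */
enum { DEV_NO_IO = 1u << 0, DEV_NO_BIND = 1u << 1 };

#define MAX_PARKED 8

struct io_req { int id; };

struct model_dev {
    unsigned flags;
    struct io_req *parked[MAX_PARKED]; /* stand-in for a waitqueue */
    size_t n_parked;
    int completed;                     /* requests that reached the "hardware" */
    int last_id;                       /* id of the most recently completed request */
};

/* Returns 1 if the request ran at once, 0 if it was parked, -1 if no room. */
int dev_submit_io(struct model_dev *d, struct io_req *r)
{
    assert(d != NULL && r != NULL);
    if (d->flags & DEV_NO_IO) {
        if (d->n_parked == MAX_PARKED)
            return -1;
        d->parked[d->n_parked++] = r;  /* "go sleep on a waitqueue" */
        return 0;
    }
    d->completed++;
    d->last_id = r->id;
    return 1;
}

/* Step (1): mark the device so that new IO and new binds are held off. */
void dev_suspend(struct model_dev *d)
{
    d->flags |= DEV_NO_IO | DEV_NO_BIND;
}

/* Clear the flags, then run everything that was parked, in arrival order. */
void dev_resume(struct model_dev *d)
{
    d->flags &= ~(unsigned)(DEV_NO_IO | DEV_NO_BIND);
    for (size_t i = 0; i < d->n_parked; i++) {
        d->completed++;
        d->last_id = d->parked[i]->id;
    }
    d->n_parked = 0;
}
```

In a real driver the parked submitter would block on an actual waitqueue and be woken at resume; the array only keeps the model deterministic.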
If any of those things screw up suspend-to-RAM then it is 100% the
driver's fault and no "process freezer" is going to fix it, end of
story. And "A" cannot be made reliable. At some point you shut off
interrupts right before going to sleep, and at that point any remote
wakeup event is just going to get dropped until you actually enter
sleep mode and the hardware takes over again. If you miss a wakeup
event then whatever sent it should just retry, just as with *every*
other kind of network packet.
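The dropped-wakeup window and the "sender should just retry" argument can be sketched as a small state machine in C. This is a toy model under stated assumptions: the three states, `struct host`, `host_deliver_wakeup`, `host_suspend_step`, and `send_wakeup_with_retry` are all hypothetical names, and "time passing" between retries is modeled by advancing the suspend sequence one step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical phases of the suspend sequence. */
enum host_state { HOST_AWAKE, HOST_IRQS_OFF, HOST_ASLEEP };

struct host {
    enum host_state state;
    int dropped;              /* wakeup events lost in the IRQs-off window */
};

/* Deliver one wakeup event. Returns true if the host is awake afterwards. */
bool host_deliver_wakeup(struct host *h)
{
    assert(h != NULL);
    switch (h->state) {
    case HOST_AWAKE:          /* not asleep, so there is nothing to wake */
        return true;
    case HOST_IRQS_OFF:       /* interrupts are off: the event is dropped */
        h->dropped++;
        return false;
    case HOST_ASLEEP:         /* hardware wakeup logic is in charge again */
        h->state = HOST_AWAKE;
        return true;
    }
    return false;
}

/* Advance the suspend sequence by one step. */
void host_suspend_step(struct host *h)
{
    if (h->state == HOST_AWAKE)
        h->state = HOST_IRQS_OFF;
    else if (h->state == HOST_IRQS_OFF)
        h->state = HOST_ASLEEP;
}

/* Sender side: retry until the host answers. Returns attempts used, 0 on failure. */
int send_wakeup_with_retry(struct host *h, int max_tries)
{
    for (int attempt = 1; attempt <= max_tries; attempt++) {
        if (host_deliver_wakeup(h))
            return attempt;
        host_suspend_step(h); /* time passes while the sender waits to retry */
    }
    return 0;
}
```

A sender that hits the IRQs-off window loses its first event and succeeds on the retry, which is the same recovery any lossy network protocol already relies on.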
That would be a driver bug. If you have asynchronous probing then
proper suspend handling includes being able to postpone driver probe
events until after resume. If you have synchronous probing then the
problem doesn't exist because "set_no_bind_flag" is just telling the
device not to raise any more device probe interrupts.
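Postponing probe events until after resume might look roughly like the following userspace C model. The names (`struct bus_dev`, `bus_probe_event`, `bus_suspend`, `bus_resume`) are assumptions made for this sketch, not an existing bus API, and the model is single-threaded, so the locking between an in-progress bind and the suspend path is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct bus_dev {
    bool no_bind;        /* hypothetical NO_BIND flag from the discussion */
    int pending_probes;  /* probe events postponed while suspended */
    int bound;           /* child devices successfully bound so far */
};

/* Handle one probe event. Returns true if bound now, false if postponed. */
bool bus_probe_event(struct bus_dev *d)
{
    assert(d != NULL);
    if (d->no_bind) {
        d->pending_probes++;   /* remember it, handle it after resume */
        return false;
    }
    d->bound++;
    return true;
}

/* Suspend path: from here on, new bind requests are postponed. */
void bus_suspend(struct bus_dev *d)
{
    d->no_bind = true;
}

/* Resume path: allow binding again and replay what was postponed.
 * Returns the number of probe events replayed. */
int bus_resume(struct bus_dev *d)
{
    int replayed = d->pending_probes;

    d->no_bind = false;
    d->bound += replayed;
    d->pending_probes = 0;
    return replayed;
}
```

With synchronous probing the pending counter would simply stay at zero, since the device raises no probe interrupts while the flag is set.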
While binding it will clearly be holding a mutex/spinlock on the
parent device, so the suspend process will wait for it. When binding
is done the suspend_device() code will take the device lock and tell
everything else to postpone further bind requests as above.
Oh, so you're calling every waitqueue in the kernel a "freezer" now?
We do these things at the driver level *all* *the* *time*. For
instance, you can't submit new IOs to an ATA controller while it's
renegotiating the bus speed, but that's never been a problem before.
Most drivers have an implicit NO_BIND flag: The device's interrupts
are off and/or it's in a low-power state. USB is already terribly
buggy with regard to suspend: If you hotplug a device during
suspend (like the touchpad in my powerbook powering down/up), then
the USB stack will basically hang that controller. The device is off
and the hotplug triggers interrupts and IO, *EVEN* *WITHOUT*
*USERSPACE*.
So if your driver doesn't already have a proper way of blocking IO
during suspend then it probably doesn't suspend 50% of the time
anyways. A bug which bites *every* *time* is easy to fix, one which
only bites when things hit a race condition is much harder.
That responsibility has been there ever since suspend-to-RAM support
was added. Nobody ever claimed that writing a proper driver wasn't
tricky. You have to simultaneously be able to handle hot-unplug,
IO errors, interrupts, IO requests, suspend-to-RAM, and
hibernation. If your driver mutual-exclusion is buggy then it
probably already bites you during hotplug or other similar
scenarios. Let's at least make the problems much more reproducible
so we can fix the drivers properly instead of continuing to kludge
around it for all eternity.
Cheers,
Kyle Moffett