It's the "suspending" case that causes trouble. Go back and look at
the race I diagrammed in
https://lists.linux-foundation.org/pipermail/linux-pm/2008-February/016763.html
It _did_ happen. Precisely this race occurred in Bug #10030 (the SD
card insertion/removal), although there the window was bigger because
we blocked registrations from the beginning of the system sleep
transition instead of from when the parent's suspend method was called.

It's not that the suspend method itself will want to register children.
The problem is that the method has to wait for other threads that may
already have started to register a child. If those threads are blocked
inside the registration call, the suspend method never finishes waiting
and the suspend deadlocks.

This looks pretty awkward. Won't it cause lockdep to complain about
recursive locking of dev->power.lock?

It buys us one thing: The system will continue to limp along instead of
locking up.

If drivers don't check whether registration succeeded... What can I
say? It's another bug.
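
To make the fail-instead-of-block idea concrete, here is a minimal
userspace sketch in C. It is not kernel code: toy_device,
toy_register_child(), toy_hotplug_event() and the state names are all
hypothetical, and the choice of -EBUSY as the error is an assumption.
It shows only that a registration under a suspending or suspended
parent returns an error at once, and that the caller checks for it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical power states for a toy device; not the kernel's own. */
enum toy_pm_state { TOY_ACTIVE, TOY_SUSPENDING, TOY_SUSPENDED };

struct toy_device {
	enum toy_pm_state state;
	int nr_children;
};

/*
 * Register a child under @parent.  Never blocks: if the parent has
 * begun suspending (or is already suspended), fail with -EBUSY so that
 * a thread the suspend method is waiting for can finish and return.
 */
int toy_register_child(struct toy_device *parent)
{
	assert(parent != NULL);
	if (parent->state != TOY_ACTIVE)
		return -EBUSY;
	parent->nr_children++;
	return 0;
}

/*
 * A caller that checks the result.  On failure it passes the error up;
 * the child is simply not there until somebody retries after resume.
 */
int toy_hotplug_event(struct toy_device *parent)
{
	int ret = toy_register_child(parent);

	if (ret < 0)
		return ret;	/* limp along: no child, but no deadlock */
	return 0;
}
```

A caller that ignored the return value would carry on as if the child
existed, which is the driver bug mentioned above.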

During suspend_late and resume_early _every_ device is suspended,
including the fictitious "device-tree-root" device. Hence _every_
registration is for a child of a suspended device.
Besides, you don't want to allow new devices to be registered during
suspend_late, do you? They wouldn't get suspended before the system
went to sleep.

Right now that may be the easiest solution. In fact, it may still be
the easiest solution even after we stop freezing user threads.

I'm not sure I understand. Sure, autoresume may not involve calling
the driver's resume method. But it does involve actually setting the
device back to full power, so what's the difference?

Are you still considering adding separate methods for suspend and
hibernate (maybe also for freeze and prethaw)? Perhaps the
"prevent_new_children" and "allow_new_children" methods could be added
at the same time. That would let some of this complication go away.
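
A rough sketch of how such a pair of methods might look, again in
userspace C. Everything here is hypothetical: the toy_pm_ops structure,
the callback signatures, the example driver callbacks and the calling
order are guesses for illustration, not an existing or proposed kernel
interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_dev;

/* Hypothetical per-driver callbacks, named after the suggestion above. */
struct toy_pm_ops {
	void (*prevent_new_children)(struct toy_dev *dev);
	void (*allow_new_children)(struct toy_dev *dev);
	int (*suspend)(struct toy_dev *dev);
	int (*resume)(struct toy_dev *dev);
};

struct toy_dev {
	const struct toy_pm_ops *ops;
	bool children_allowed;
	bool suspended;
};

/* Assumed core-side ordering: close the door first, then suspend. */
int toy_core_suspend(struct toy_dev *dev)
{
	int ret;

	assert(dev != NULL && dev->ops != NULL);
	if (dev->ops->prevent_new_children)
		dev->ops->prevent_new_children(dev);
	ret = dev->ops->suspend(dev);
	if (ret < 0 && dev->ops->allow_new_children)
		dev->ops->allow_new_children(dev);	/* suspend failed: reopen */
	return ret;
}

/* Assumed core-side ordering: resume first, then reopen the door. */
int toy_core_resume(struct toy_dev *dev)
{
	int ret;

	assert(dev != NULL && dev->ops != NULL);
	ret = dev->ops->resume(dev);
	if (ret == 0 && dev->ops->allow_new_children)
		dev->ops->allow_new_children(dev);
	return ret;
}

/* Trivial example driver: the callbacks only flip flags. */
void toy_prevent(struct toy_dev *dev) { dev->children_allowed = false; }
void toy_allow(struct toy_dev *dev) { dev->children_allowed = true; }
int toy_suspend(struct toy_dev *dev) { dev->suspended = true; return 0; }
int toy_resume(struct toy_dev *dev) { dev->suspended = false; return 0; }

const struct toy_pm_ops toy_example_ops = {
	.prevent_new_children = toy_prevent,
	.allow_new_children = toy_allow,
	.suspend = toy_suspend,
	.resume = toy_resume,
};
```

In a real driver, prevent_new_children would presumably also have to
wait for registrations already in flight, which this sketch leaves out.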

Alan Stern