Re: [linux-pm] Fundamental flaw in system suspend, exposed by freezer removal

From: Alan Stern
Date: Wednesday, February 27, 2008 - 9:03 am

On Wed, 27 Feb 2008, Rafael J. Wysocki wrote:


The name refers to the "suspend" method, not the type of sleep being
carried out.  We use the same method for both suspend and hibernation.
But maybe "sleeping" would be better.


All right, we can set it to RESUME_RUNNING before calling the resume
method and then set it to 0 afterwards.  The point is that the value
shouldn't remain SUSPEND_DONE while resume runs, because it should be
legal for resume to register new children.
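To make the proposed ordering concrete, here is a minimal user-space sketch in C. It is not the PM core code: the struct layout, the NOT_SLEEPING name for the value 0, the helper names, and the rule that registration fails with -EBUSY while the parent is in SUSPEND_DONE are all assumptions made for illustration. Only the state names SUSPEND_RUNNING, SUSPEND_DONE and RESUME_RUNNING come from this discussion.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical values for dev->power.sleeping. */
enum sleep_state {
	NOT_SLEEPING = 0,
	SUSPEND_RUNNING,
	SUSPEND_DONE,
	RESUME_RUNNING,
};

struct device {
	enum sleep_state sleeping;	/* stands in for dev->power.sleeping */
	int (*resume)(struct device *dev);
	int nr_children;
};

/*
 * Assumption: a child may not be registered below a parent that is
 * fully suspended, but may be registered while the parent resumes.
 */
static int register_child(struct device *parent)
{
	if (parent->sleeping == SUSPEND_DONE)
		return -EBUSY;
	parent->nr_children++;
	return 0;
}

/* Set RESUME_RUNNING before calling the method and 0 afterwards. */
static int resume_device(struct device *dev)
{
	int error = 0;

	assert(dev->sleeping == SUSPEND_DONE);
	dev->sleeping = RESUME_RUNNING;
	if (dev->resume)
		error = dev->resume(dev);
	dev->sleeping = NOT_SLEEPING;
	return error;
}

/* Example resume method that registers a new child. */
static int resume_adds_child(struct device *dev)
{
	return register_child(dev);
}
```

With this ordering, a resume method such as resume_adds_child() succeeds, whereas the same registration attempted before resume_device() is called would be rejected.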


It will get noticed in device_pm_add() while holding dpm_list_mtx.  
The information can be stored in a static private flag
"child_added_while_parent_suspends" (or maybe something more terse!).


Check whether child_added_while_parent_suspends is nonzero.
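Roughly, the detection and the later check could look like the following user-space sketch. A pthread mutex stands in for dpm_list_mtx, the struct layout is simplified, and test_and_clear_child_added() is a hypothetical helper name; none of this is the actual PM core code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

enum sleep_state { NOT_SLEEPING = 0, SUSPEND_RUNNING, SUSPEND_DONE };

struct device {
	struct device *parent;
	enum sleep_state sleeping;	/* stands in for dev->power.sleeping */
};

/* User-space stand-in for the PM core's list mutex. */
static pthread_mutex_t dpm_list_mtx = PTHREAD_MUTEX_INITIALIZER;

/* Static private flag, only touched while holding dpm_list_mtx. */
static bool child_added_while_parent_suspends;

/* Called when a device is registered; notices the race under the lock. */
static void device_pm_add(struct device *dev)
{
	assert(dev != NULL);
	pthread_mutex_lock(&dpm_list_mtx);
	if (dev->parent && dev->parent->sleeping == SUSPEND_RUNNING)
		child_added_while_parent_suspends = true;
	/* A real implementation would also add dev to the PM list here. */
	pthread_mutex_unlock(&dpm_list_mtx);
}

/* Hypothetical helper: report whether the flag was set, then clear it. */
static bool test_and_clear_child_added(void)
{
	bool added;

	pthread_mutex_lock(&dpm_list_mtx);
	added = child_added_while_parent_suspends;
	child_added_while_parent_suspends = false;
	pthread_mutex_unlock(&dpm_list_mtx);
	return added;
}
```

The suspend loop would call something like test_and_clear_child_added() after the parent's suspend method returns.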


Sure.  But it won't be the PM core's problem; it will be a bug in the
bus's driver.  We will print a warning in the log so the bug can be 
tracked down.


You misunderstand.  We can't require drivers to prevent these races 
entirely.  As an example, a properly-written, compliant driver might 
work like this:

	Task 0				Task 1
	------				------
	dev->power.sleeping =
	  SUSPEND_RUNNING;
	Call (drv->suspend)(dev)
					Register a child below dev
	suspend method prevents new
	  child registrations
	suspend method waits for
	  existing registration to
	  finish
					Check dev->power.sleeping and set
					  child_added_while_parent_suspends
					Registration completes successfully
	suspend method sees there is
	  an unsuspended child and
	  returns -EBUSY

	Check child_added_while_parent_suspends
	  and realize that we lost the race

There's nothing illegal about this; it's just an accident of timing.  
Nothing has gone wrong and we shouldn't abort the sleep.  We should
continue where we left off, by suspending the new child and then trying
to suspend the parent again.
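A single-threaded sketch of that retry logic follows. The race is simulated by having the parent's suspend method register the child itself on its first call. The struct layout, the one-child limit, and the function names are assumptions for illustration, not PM core code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct device {
	const char *name;
	struct device *child;	/* at most one child, for brevity */
	bool suspended;
	int (*suspend)(struct device *dev);
};

static bool child_added_while_parent_suspends;

/* Suspend dev, retrying if a child was registered during the attempt. */
static int suspend_device(struct device *dev)
{
	int error;

	for (;;) {
		/* Children must be suspended before their parent. */
		if (dev->child && !dev->child->suspended) {
			error = suspend_device(dev->child);
			if (error)
				return error;
		}
		child_added_while_parent_suspends = false;
		error = dev->suspend(dev);
		if (!error) {
			dev->suspended = true;
			return 0;
		}
		/* A failure that is not a lost race is a real error. */
		if (error != -EBUSY || !child_added_while_parent_suspends)
			return error;
		/* We lost the race: suspend the new child, then try again. */
	}
}

static struct device late_child;

static int leaf_suspend(struct device *dev)
{
	(void)dev;
	return 0;
}

/* Parent method; its first call plays the part of Task 1 above. */
static int racy_parent_suspend(struct device *dev)
{
	if (!dev->child) {
		late_child.name = "child";
		late_child.suspend = leaf_suspend;
		dev->child = &late_child;
		child_added_while_parent_suspends = true;
	}
	if (!dev->child->suspended)
		return -EBUSY;	/* there is an unsuspended child */
	return 0;
}

/* A method that fails with -EBUSY although no child was added. */
static int always_busy_suspend(struct device *dev)
{
	(void)dev;
	return -EBUSY;
}
```

Calling suspend_device() on a device using racy_parent_suspend() succeeds after one retry, while a device using always_busy_suspend() still fails with -EBUSY because the flag was never set.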


We'll have to fix device_del() to prevent that from happening.  Your 
in_sleep_context() approach should work.


Unfortunately the lack of "prevent_new_children" and 
"allow_new_children" methods gives us no choice.  The example above 
shows why.

Alan Stern
