login
Header Space

 
 

Re: [linux-pm] Fundamental flaw in system suspend, exposed by freezer removal

Score:
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
To: Rafael J. Wysocki <rjw@...>
Cc: Linux-pm mailing list <linux-pm@...>, Kernel development list <linux-kernel@...>
Date: Tuesday, February 26, 2008 - 11:49 am

On Tue, 26 Feb 2008, Rafael J. Wysocki wrote:


Yes.


I've got some ideas on how to implement this.

We can add a new field "suspend_called" to dev->power.  It would be
owned by the PM core (protect by dpm_list_mtx) and read-only to
drivers.  Normally it will contain 0, but when the suspend method is
running we set it to SUSPEND_RUNNING and when the method returns
successfully we set it to SUSPEND_DONE.  Before calling the resume
method we set it back to 0.  Drivers can use this field as an easy way
of checking that all the child devices have been suspended.

When a new device is registered we check its parent's suspend_called
value.  If it is SUSPEND_DONE then the caller has a bug and we have to
fail the registration.  If it is SUSPEND_RUNNING then the registration
is legal, but we remember what happened.  Then when the
currently-running suspend method returns and we reacquire the
dpm_list_mtx, we will realize that a race was lost.  If the method
completed successfully (which it shouldn't) we can resume that device
immediately without ever taking it off the dpm_active list; but either
way we should continue the suspend loop.  Now the new child will be at
the end of the dpm_active_list, so it will be suspended before the
parent is reached again.

This way we can recover from drivers that are willing to suspend their 
device even though there are unsuspended children.  The only drawback 
will be that for a short time the child will be active while its parent 
is suspended.

We should not abort the entire sleep transition simply because we lost 
a race.  With this scheme we won't even need the pm_sleep_rwsem; the 
dpm_list_mtx will provide all the necessary protection.

This is more intricate than it should be.  It would have been better to
have had "disable_new_children" and "enable_new_children" methods from
the beginning; then there wouldn't be any races at all.  That's life...

The one tricky thing to watch out for is when a suspend or resume 
method wants to unregister the device being suspended or resumed.  Even 
that should be doable (set suspend_called to UNREGISTERED or something 
like that).


Unregistration should always be allowed, and registration should be 
allowed whenever the parent isn't suspended.  For devices with no 
parent, we can imagine there is a fictitious parent at the root of the 
device tree.  Conceptually it gets suspended after every real device 
and resumed before.  Maybe even before dpm_power_up(), meaning that 
devices with no parent could be registered by a resume_early method.

When your lock-removal stuff gets into Greg's tree, I'll write all 
this.  Sound good?

Alan Stern

--
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
Re: Fundamental flaw in system suspend, exposed by freezer r..., Benjamin Herrenschmidt, (Wed Feb 27, 4:36 pm)
Re: Fundamental flaw in system suspend, exposed by freezer r..., Rafael J. Wysocki, (Mon Feb 25, 6:24 pm)
Re: [linux-pm] Fundamental flaw in system suspend, exposed b..., Rafael J. Wysocki, (Mon Feb 25, 6:25 pm)
Re: [linux-pm] Fundamental flaw in system suspend, exposed b..., Rafael J. Wysocki, (Mon Feb 25, 8:07 pm)
Re: [linux-pm] Fundamental flaw in system suspend, exposed b..., Alan Stern, (Tue Feb 26, 11:49 am)
Re: [linux-pm] Fundamental flaw in system suspend, exposed b..., Rafael J. Wysocki, (Tue Feb 26, 7:17 pm)
Re: [linux-pm] Fundamental flaw in system suspend, exposed b..., Rafael J. Wysocki, (Wed Feb 27, 3:50 pm)
Re: [linux-pm] Fundamental flaw in system suspend, exposed b..., Rafael J. Wysocki, (Fri Feb 29, 10:26 am)
Re: [linux-pm] Fundamental flaw in system suspend, exposed b..., Rafael J. Wysocki, (Fri Feb 29, 1:02 pm)
Re: [linux-pm] Fundamental flaw in system suspend, exposed b..., Rafael J. Wysocki, (Fri Feb 29, 5:57 pm)
Re: [linux-pm] Fundamental flaw in system suspend, exposed b..., Rafael J. Wysocki, (Fri Feb 29, 8:13 pm)
Re: [linux-pm] Fundamental flaw in system suspend, exposed b..., Rafael J. Wysocki, (Mon Mar 3, 12:32 pm)
Re: [linux-pm] Fundamental flaw in system suspend, exposed b..., Rafael J. Wysocki, (Thu Feb 28, 8:01 pm)
speck-geostationary