Re: [linux-pm] Fundamental flaw in system suspend, exposed by freezer removal

To: Rafael J. Wysocki <rjw@...>
Cc: Linux-pm mailing list <linux-pm@...>, Kernel development list <linux-kernel@...>
Date: Friday, February 29, 2008 - 6:46 pm

On Fri, 29 Feb 2008, Rafael J. Wysocki wrote:


That's different.  Before you were talking about acquiring
dev->power.lock _before_ calling the suspend method.  Now you're
talking about blocking child registration _after_ the parent is already
suspended.

It might work if you did it that way.  In theory it _should_ work,
since nobody should ever try to register a child below a suspended
parent.

Given that this is merely a way of preventing something which should
never happen in the first place, is it really necessary to add the
extra lock?  Certainly it's simpler just to fail the registration.  If
it turns out later that we'd be better off blocking it instead, we
can add the lock.
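
As a rough illustration of the "just fail the registration" option, here is
a userspace sketch, not kernel code. The struct, the function name, and the
choice of -EBUSY as the error code are all assumptions made for the example;
the real driver core looks different.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for struct device. */
struct sketch_device {
	struct sketch_device *parent;
	bool suspended;		/* set once the suspend method has returned */
	bool registered;
};

/*
 * Refuse to add a child below a parent that is already suspended,
 * instead of blocking the caller on an extra lock.
 * The error code is an assumption chosen for this sketch.
 */
int sketch_device_register(struct sketch_device *dev)
{
	if (dev->parent && dev->parent->suspended)
		return -EBUSY;

	dev->registered = true;
	return 0;
}
```

The caller gets an immediate error it can report or retry on after resume,
and no thread is left sleeping on a lock during the transition.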


It is as far as lockdep is concerned.  You acquire power.lock for the 
first device, then you acquire it for the second device.  Lockdep 
doesn't know the two devices are different; all it knows is that you 
have tried to acquire a lock while already holding an instance of that 
same lock.  It's the same problem that affects attempts to convert 
dev->sem to a mutex.
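
The complaint can be modelled with a toy checker that tracks held locks by
class rather than by instance. This is only a userspace sketch of the idea,
with hypothetical names throughout; the real lockdep is far more elaborate.

```c
#define SKETCH_MAX_HELD 8

/*
 * Hypothetical model of a lock: the checker sees only the class,
 * and every power.lock instance is assumed to share one class.
 */
struct sketch_lock {
	int class_id;
};

static const struct sketch_lock *sketch_held[SKETCH_MAX_HELD];
static int sketch_nheld;

/*
 * Returns 0 on success, -1 if a lock of the same class is already
 * held (the "lockdep warning" case, even for a different instance),
 * and -2 if the toy table of held locks is full.
 */
int sketch_acquire(const struct sketch_lock *lock)
{
	int i;

	for (i = 0; i < sketch_nheld; i++)
		if (sketch_held[i]->class_id == lock->class_id)
			return -1;

	if (sketch_nheld == SKETCH_MAX_HELD)
		return -2;

	sketch_held[sketch_nheld++] = lock;
	return 0;
}
```

Two locks belonging to different devices but carrying the same class_id
trip the check on the second acquisition, which is the situation described
above.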

As for the ordering of the lock and moving the device to dpm_off -- 
it's less of a problem if you don't acquire the lock until after the 
suspend method returns.  You can lock it just before reacquiring 
dpm_list_mtx, while the device is still on dpm_active.


Which is better, an oops or a hang?  As far as the user is concerned, 
either one is useless.  For kernel developers, an oops is easier to 
debug.

In the end we should just try it and see what happens.  I don't think 
we can decide which will work out better without some real-world 
experience.


That's not what I mean.  In the long run it will turn out that certain
kernel threads _want_ to be frozen.  That is, if allowed to run during
a system sleep transition they would mess things up, and their
subsystem is designed so that it can carry out a sleep transition
perfectly well without the thread running.  (An example of such a
thread is khubd.)

To accommodate these threads we can freeze them -- that's easy since the
freezer already exists.  Or we can remove the freezer and provide a new
way for these threads to block until the system wakes up.  IMO using
the existing code is better than writing new code.

All the objections to the freezer have been about using it on arbitrary
kernel threads and on all user tasks.  But if it gets used on only
those kernel threads which request it, and on no user tasks, there
shouldn't be any objections.


This is an interesting matter.  My view is that runtime PM should be
almost completely disabled when the PM core calls the device's suspend
method.  The only exception is that remote wakeup may be enabled.  If a
remote wakeup event occurs and the device resumes, then its parent's
suspend method will realize what has happened when it sees that the
device is no longer suspended.  So the parent's suspend method will
return -EBUSY and the sleep will be aborted.
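
The abort path could look roughly like the following userspace sketch. The
struct and function names are hypothetical; only the -EBUSY return value
comes from the description above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified device with a flat array of children. */
struct sketch_pm_dev {
	bool suspended;
	struct sketch_pm_dev **children;
	size_t nchildren;
};

/*
 * Parent's suspend method: if any child is no longer suspended
 * (for example because a remote wakeup resumed it), refuse to
 * suspend so that the whole sleep transition gets aborted.
 */
int sketch_parent_suspend(struct sketch_pm_dev *parent)
{
	size_t i;

	for (i = 0; i < parent->nchildren; i++)
		if (!parent->children[i]->suspended)
			return -EBUSY;

	parent->suspended = true;
	return 0;
}
```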

Right now USB does not disable runtime PM during a system sleep.  It
hasn't been necessary, thanks to the freezer.  But when we stop
freezing user tasks it will become necessary.  When that time arrives I
intend to put user threads doing runtime resume into the "icebox"  
(remember that?).  Khubd and other kernel threads could go into the
icebox also, instead of the freezer; in this way the freezer could be
removed completely.


That is indeed the difference, and it's an important difference.  The
driver knows what other threads may be carrying out registrations, and
it knows which ones should be waited for and which can safely be
blocked or disabled.  The PM core doesn't know any of these things; all
it can do is blindly block everything.  That is dangerous and can lead
to deadlocks.

Alan Stern
