Re: [linux-pm] Fundamental flaw in system suspend, exposed by freezer removal

To: Rafael J. Wysocki <rjw@...>
Cc: Linux-pm mailing list <linux-pm@...>, Kernel development list <linux-kernel@...>, Alexey Starikovskiy <astarikovskiy@...>
Date: Sunday, March 2, 2008 - 11:54 pm

On Sun, 2 Mar 2008, Rafael J. Wysocki wrote:


This is now a moot point, but I'll answer it anyway...

Drivers should not depend on any particular time interval between their 
suspend and resume methods being called.  It should be okay to call one 
and then the other a microsecond later, or 100 days later.  The driver 
shouldn't know or care.  I doubt that many drivers do.


"begin_sleep" and "end_sleep" are less ambiguous.

I was just thinking the same thing.  For instance, plenty of drivers
register children as part of their probe method, so the begin_sleep
method should prevent subsystems from probing the device.  (Not to
mention that it's kind of awkward for a driver to probe a suspended
device.)


After more thought, I'm not so sure about this.  It might be a good
idea to call the begin_sleep method just before the suspend method (or
any of its variants: freeze, hibernate, prethaw, etc.) and call the
end_sleep method just after the resume method.  This minimizes the time
drivers will spend in a peculiar non-hotplug-aware state, although it 
means that begin_sleep would have to be idempotent.

It also allows sophisticated drivers to do all their processing in the
begin_sleep (and end_sleep) method: both preventing new child
registrations and powering down the device.  At the moment I'm not sure
whether this would turn out to be a good strategy, but it might.

Alternatively, some subsystems might want to stop all child
registrations right at the beginning of the sleep transition.  This
would amount to blocking the kernel thread responsible for those
registrations when the sleep begins -- in other words, making that
thread freezable!


This isn't quite so simple either.  A notifier isn't good enough
because it doesn't provide any synchronization.  On the other hand, how
many devices ever get registered without a parent?  Does this happen at
all after system startup?  Maybe when a loadable module for a new bus
type initializes...  We could block module loading at the start of a
sleep transition, if necessary.


Fortunately that won't cause any problems for some time to come.  
There are no current plans for doing runtime PM of PCI-based USB
controllers.  This may change eventually, but not for a while.


Or have a "no runtime PM" quirk for the devices in question.


Sorry, I don't follow you.

I meant doing something approximately like this:

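	mutex_lock(&dpm_list_mtx);
	while (!list_empty(&dpm_active)) {
		/* Take the last entry; dpm_active links list_heads, not devices */
		dev = container_of(dpm_active.prev, struct device, power.entry);
		/* From here on, no children may be registered under dev */
		dev->power.sleeping = true;
		/* Drop the lock while the driver's suspend method runs */
		mutex_unlock(&dpm_list_mtx);
		error = suspend_device(dev);
		mutex_lock(&dpm_list_mtx);
		...
	}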

and make device_pm_add() fail if dev->parent->power.sleeping is true.
This will do what you want, right?
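
A minimal userspace sketch of that check, for illustration only.  The
structures are simplified stand-ins for the kernel ones, the "registered"
field and the -EBUSY error code are assumptions, and the locking around
dpm_list_mtx is left out:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct dev_pm_info {
	bool sleeping;		/* set just before the device is suspended */
	bool registered;	/* stands in for membership of dpm_active */
};

struct device {
	struct device *parent;
	struct dev_pm_info power;
};

/*
 * Model of the proposed behavior: refuse to add a child for PM
 * purposes once its parent has been marked as sleeping.
 */
static int device_pm_add(struct device *dev)
{
	assert(dev != NULL);
	if (dev->parent && dev->parent->power.sleeping)
		return -EBUSY;	/* the error code is an assumption */
	dev->power.registered = true;
	return 0;
}
```

A child added before the parent is marked gets registered as usual; one
added afterwards is turned away and the caller sees the error.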

With the begin_sleep method call added it gets a little more
complicated, since you have to reacquire dpm_list_mtx after calling the
begin_sleep method and before setting dev->power.sleeping to true, in
order to make sure that dev is still the last entry on dpm_active.  
But it's quite doable.
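
The recheck can be sketched in a single-threaded userspace model.  Here
dpm_active is an array whose last element plays the role of the list
tail, the comments mark where dpm_list_mtx would be dropped and retaken,
and hub_begin_sleep() and late_child are hypothetical names used only to
simulate a child being registered while the lock is not held:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEVS 8

struct device {
	const char *name;
	bool sleeping;
	int begin_sleep_calls;
	void (*begin_sleep)(struct device *dev);
};

/* dpm_active modeled as an array; the last element is the tail. */
static struct device *dpm_active[MAX_DEVS];
static size_t dpm_count;

static void dpm_add(struct device *dev)
{
	assert(dpm_count < MAX_DEVS);
	dpm_active[dpm_count++] = dev;
}

static struct device late_child = { .name = "late-child" };

/* Example callback: registers one child the first time it runs. */
static void hub_begin_sleep(struct device *dev)
{
	if (dev->begin_sleep_calls == 1)
		dpm_add(&late_child);
}

/* Returns the number of devices put to sleep. */
static size_t suspend_all(void)
{
	size_t suspended = 0;

	/* mutex_lock(&dpm_list_mtx) would go here */
	while (dpm_count > 0) {
		struct device *dev = dpm_active[dpm_count - 1];

		/* lock dropped around the callback */
		dev->begin_sleep_calls++;
		if (dev->begin_sleep)
			dev->begin_sleep(dev);
		/* lock retaken */

		/* A child was added meanwhile: handle the new tail first. */
		if (dpm_active[dpm_count - 1] != dev)
			continue;

		dev->sleeping = true;
		dpm_count--;	/* stands in for suspend_device() */
		suspended++;
	}
	/* mutex_unlock(&dpm_list_mtx) would go here */
	return suspended;
}
```

Note that in this model begin_sleep ends up being called twice for the
hub, which is one more reason the method would have to be idempotent.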

(BTW, I wonder if it's a good idea for device_add() to call 
device_pm_add() before calling bus_add_device().  If a suspend occurs 
in between, we could end up in a strange situation with a driver being 
asked to suspend a device before that device has been fully registered 
-- in fact the registration might still fail.)


I'm not sure either.  You can get the same effect by checking 
list_empty(&dev->power.entry).
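
As a rough userspace illustration of that check (the list helpers below
are small re-implementations written for this sketch, and
device_pm_registered() is a hypothetical name), an entry that has been
initialized but not yet added, or that has been removed with
list_del_init(), reports as empty:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal re-implementation of the kernel's circular list helpers. */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static bool list_empty(const struct list_head *head)
{
	return head->next == head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

static void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	INIT_LIST_HEAD(entry);
}

struct device {
	struct {
		struct list_head entry;
	} power;
};

/* Hypothetical helper: is the device currently on a PM list? */
static bool device_pm_registered(const struct device *dev)
{
	assert(dev != NULL);
	return !list_empty(&dev->power.entry);
}
```

This only works if the entry is always initialized before use and is
removed with list_del_init() rather than plain list_del(); both are
assumptions of the sketch.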


In theory you could even expand it to freeze_begin and prethaw_begin.


To be safe, everything should return a result and we should abort the 
sleep if anything returns an error.

It's easier to ignore a return code now than to change a method
signature later.  :-)
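
A small sketch of what checking the result might look like, assuming a
begin_sleep method that returns int.  The names begin_sleep_all(),
sleep_begun, ok_begin_sleep() and failing_begin_sleep() are made up for
the example, and any unwinding of devices that already began sleeping is
left out:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct device {
	int (*begin_sleep)(struct device *dev);	/* returns 0 or -errno */
	bool sleep_begun;
};

/* Example callbacks, one succeeding and one failing. */
static int ok_begin_sleep(struct device *dev)
{
	(void)dev;
	return 0;
}

static int failing_begin_sleep(struct device *dev)
{
	(void)dev;
	return -EIO;
}

/*
 * Call begin_sleep for each device in order and abort at the first
 * error, handing that error back to the caller.
 */
static int begin_sleep_all(struct device **devs, size_t n)
{
	assert(devs != NULL);
	for (size_t i = 0; i < n; i++) {
		int error = 0;

		if (devs[i]->begin_sleep)
			error = devs[i]->begin_sleep(devs[i]);
		if (error)
			return error;	/* abort the sleep transition */
		devs[i]->sleep_begun = true;
	}
	return 0;
}
```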


No.  Some drivers might implement just one and some drivers just the 
other.


That's true.  If the driver's author wants to do things that way, he
can.  For instance, there are several subsystems which will probe for
new children as part of their resume processing.  So if a child was
detected just before the system went to sleep and failed to get
registered, then it will be detected again when the system wakes up and
now the registration will succeed.

Here's something else to think about.  We might want to allow some 
devices to be "power-irrelevant".  That is, the device exists in the 
kernel only as a representation of some software state, not as a 
physical device.  It doesn't consume power, it doesn't have any state 
to lose during a sleep, and its driver doesn't implement suspend or 
resume methods.  For these sorts of devices, we might allow 
device_add() to skip calling device_pm_add() altogether.  USB 
interfaces are a little like this, as are SCSI hosts and MMC hosts.
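
One way this could look, as a userspace sketch only: the pm_irrelevant
flag and the on_pm_list and registered fields are hypothetical, and the
real device_add() does far more than shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct device {
	bool pm_irrelevant;	/* hypothetical: pure software state */
	bool on_pm_list;	/* stands in for membership of dpm_active */
	bool registered;
};

static void device_pm_add(struct device *dev)
{
	dev->on_pm_list = true;
}

/* Model of device_add() that skips PM registration when asked to. */
static int device_add(struct device *dev)
{
	assert(dev != NULL);
	if (!dev->pm_irrelevant)
		device_pm_add(dev);
	dev->registered = true;
	return 0;
}
```

Such a device would never show up in the suspend loop, so it could not
block or be blocked by a sleep transition.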

Alan Stern
