Hi, The following three patches are intended to start the redesign of the suspend and hibernation framework for devices. The first one is the 5th revision of the patch introducing new callbacks for suspend and hibernation. The other two patches implement the new suspend and hibernation callbacks for the platform and PCI bus types (3rd revision of both). The main differences between these patches and the previous revision: * Dropped 'struct pm_noirq_ops' and introduced 'struct pm_ext_ops' that is a derivative of 'struct pm_ops' (adds the _noirq operations). * Made device_pm_add() refuse to add devices for which dev->parent->power.status == DPM_INVALID * Clarified (hopefully) the new platform and pci driver code Please review. Thanks, Rafael --
From: Rafael J. Wysocki <rjw@sisk.pl> Introduce 'struct pm_ops' and 'struct pm_ext_ops' representing suspend and hibernation operations for bus types, device classes and device types. Modify the PM core to use 'struct pm_ops' and 'struct pm_ext_ops' objects, if defined, instead of the ->suspend() and ->resume() or, respectively, ->suspend_late() and ->resume_early() callbacks that will be considered as legacy and gradually phased out. The main purpose of doing this is to separate suspend (aka S2RAM and standby) callbacks from hibernation callbacks in such a way that the new callbacks won't take arguments and the semantics of each of them will be clearly specified. This has been requested for multiple times by many people, including Linus himself, and the reason is that within the current scheme if ->resume() is called, for example, it's difficult to say why it's been called (ie. is it a resume from RAM or from hibernation or a suspend/hibernation failure etc.?). The second purpose is to make the suspend/hibernation callbacks more flexible so that device drivers can handle more than they can within the current scheme. For example, some drivers may need to prevent new children of the device from being registered before their ->suspend() callbacks are executed or they may want to carry out some operations requiring the availability of some other devices, not directly bound via the parent-child relationship, in order to prepare for the execution of ->suspend(), etc. Ultimately, we'd like to stop using the freezing of tasks for suspend and therefore the drivers' suspend/hibernation code will have to take care of the handling of the user space during suspend/hibernation. That, in turn, would be difficult within the current scheme, without the new ->prepare() and ->complete() callbacks. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> --- arch/x86/kernel/apm_32.c | 4 drivers/base/power/main.c | 693 ++++++++++++++++++++++++++++++++++----------- ...
Unfortunately I forgot to set dev->power.status to DPM_PREPARING before calling ->prepare(), as documented. Also, dpm_prepare() could cleaned up a bit. Fixed patch follows. Thanks, Rafael --- From: Rafael J. Wysocki <rjw@sisk.pl> Introduce 'struct pm_ops' and 'struct pm_ext_ops' representing suspend and hibernation operations for bus types, device classes and device types. Modify the PM core to use 'struct pm_ops' and 'struct pm_ext_ops' objects, if defined, instead of the ->suspend() and ->resume() or, respectively, ->suspend_late() and ->resume_early() callbacks that will be considered as legacy and gradually phased out. The main purpose of doing this is to separate suspend (aka S2RAM and standby) callbacks from hibernation callbacks in such a way that the new callbacks won't take arguments and the semantics of each of them will be clearly specified. This has been requested for multiple times by many people, including Linus himself, and the reason is that within the current scheme if ->resume() is called, for example, it's difficult to say why it's been called (ie. is it a resume from RAM or from hibernation or a suspend/hibernation failure etc.?). The second purpose is to make the suspend/hibernation callbacks more flexible so that device drivers can handle more than they can within the current scheme. For example, some drivers may need to prevent new children of the device from being registered before their ->suspend() callbacks are executed or they may want to carry out some operations requiring the availability of some other devices, not directly bound via the parent-child relationship, in order to prepare for the execution of ->suspend(), etc. Ultimately, we'd like to stop using the freezing of tasks for suspend and therefore the drivers' suspend/hibernation code will have to take care of the handling of the user space during suspend/hibernation. That, in turn, would be difficult within the current scheme, without the new ->prepare() and ...
My testing shows that some drivers tend to return error codes from their ->resume() callbacks, even though the devices in question appear to work correctly after the seemingly failing suspend. For this reason, I have changed the patch so that the "invalid" status of devices is cleared by the PM core in dpm_complete(). I have also added messages printed during a resume if an error is returned by a driver's callback. Thanks, Rafael --- From: Rafael J. Wysocki <rjw@sisk.pl> Introduce 'struct pm_ops' and 'struct pm_ext_ops' representing suspend and hibernation operations for bus types, device classes and device types. Modify the PM core to use 'struct pm_ops' and 'struct pm_ext_ops' objects, if defined, instead of the ->suspend() and ->resume() or, respectively, ->suspend_late() and ->resume_early() callbacks that will be considered as legacy and gradually phased out. Change the behavior of the PM core wrt the error codes returned by device drivers' ->resume() callbacks. Namely, if an error code is returned by one of them, the device for which it's been returned is regarded as "invalid" by the PM core which will refuse to handle it from that point on (in particualr, suspend/hibernation will not be started if there is an "invalid" device in the system). The main purpose of doing this is to separate suspend (aka S2RAM and standby) callbacks from hibernation callbacks in such a way that the new callbacks won't take arguments and the semantics of each of them will be clearly specified. This has been requested for multiple times by many people, including Linus himself, and the reason is that within the current scheme if ->resume() is called, for example, it's difficult to say why it's been called (ie. is it a resume from RAM or from hibernation or a suspend/hibernation failure etc.?). The second purpose is to make the suspend/hibernation callbacks more flexible so that device drivers can handle more than they can within the current scheme. For example, some ...
Hi Rafael. Please excuse me, but I'm going to ask the questions you get from someone who hasn't followed development to date, and is thus reading the explanation without prior knowledge. Hopefully that will be helpful when you come to finalising the commit message. Does ..._ext_... mean extended? (external?) If 'extended' (or if not), does that imply that they're mutually exclusive alternatives for drivers 'Respectively' doesn't look like the right word to use, but I'm not sure I understand correctly what you're trying to say. The way it's written at the moment, it sounds to me like you're saying that suspend_late() and resume_early are deprecated, but you're modifying the PM core to use s/particualr,/particular So drivers can never validly fail to resume. That sounds fair enough. If the hardware has gone away while in lower power mode (USB, say), should Do these changes allow for other power state possibilities besides That doesn't sound good. It would be good to be able to get drivers to s/unsucessfully/unsuccessfully Regards, Nigel --
Agreed. Prepare() should still allow request_firmware and full userspace Ben. --
Pepare() is called after userspace has been frozen. (Of course, once the freezer goes away this won't matter any more.) There is a separate notifier chain which drivers can subscribe to; notifications about impending sleeps are sent out while userspace is still alive. Drivers can use that for request_firmware, memory allocation, and other things. Alan Stern --
Hi. Then that should be mentioned here so that drivers authors can know that it is possible to request firmware, allocate memory and so on at a stage when you can still rely on userspace being alive. Regards, Nigel --
This documents the current state which is that we have the freezer and we have drivers that rely on it. I can remove the comment, but then someone will invent ->prepare() relying on the availability of the user space at this point and that won't work. Thanks, Rafael --
'ext' means 'extended'. The idea is that the 'extended' version will be used by bus types / driver types that don't need to implement the _noirq callbacks. Both the platform and PCI bus types generally allow drivers to use _noirq callbacks, so they use 'struct pm_ext_ops', as well as their corresponding Yes, the changelog is wrong, because I used a separate structure for the _noirq callbacks and (quite blindly) change the name of the structure in the I think so. IMO, an error code returned by a driver's ->resume() should mean "the device hasn't resumed and is presumably dead". Otherwise, ->resume() should return Hmm, you're right. This is the other way around - if a device allocates too Thanks, will fix. Rafael --
Something's wrong here. This seems to say that the "extended" version has _fewer_ method pointers -- in which case it should be called If the device is gone, it doesn't much matter what resume() returns. Alan Stern --
Hi. What if the same driver is handling multiple instances and only some of them fail to resume? Regards, Nigel --
This was a mistake explained in another message. The "don't" should not be ->resume() will be called separately for each of them. Thanks, Rafael --
No, it has more pointers. Specifically, it is like
struct pm_ext_ops {
struct pm_ops base;
--additional pointers go here--
Yes, it does. In that cases, the error code would tell the PM core not to attempt
to resume the device's children etc. Otherwise, it's quite meaningless to the
PM core, because it really can mean anything and how's the PM core supposed
to handle _that_?
Either we decide that the error codes returned by ->resume() mean critical
errors or there's no point in returning error codes from ->resume() at all
(other than logging the errors by the core).
Well, that's getting confused. I think I'll have to rework the patch not to
really handle the errors returned by ->resume() and friends, after all, but
I'll keep the reporting of them.
However, I'd like to add a recommendation that the _new_ "resume" callbacks
should only return errors in critical situations as the indication to the PM
core that something went _really_ wrong and the device in question is quite
surely unusable.
Thanks,
Rafael
--
If the device is gone then so are its descendants, right? So it Agreed. The most important aspect is that drivers should _not_ return an error if the device is working correctly. We should fix the drivers which make this mistake. Alan Stern --
We need to do something about devices that don't want to be resumed.
There's code like this:
static int usb_resume(struct device *dev)
{
struct usb_device *udev;
if (!is_usb_device(dev)) /* Ignore PM for interfaces */
return 0;
udev = to_usb_device(dev);
/* If udev->skip_sys_resume is set then udev was already suspended
* when the system suspend started, so we don't want to resume
* udev during this system wakeup. However a reset-resume counts
* as a wakeup event, so allow a reset-resume to occur if remote
* wakeup is enabled. */
if (udev->skip_sys_resume) {
if (!(udev->reset_resume && udev->do_remote_wakeup))
return -EHOSTUNREACH;
}
return usb_external_resume_device(udev);
}
Do we want to keep this in the subsystems?
Regards
Oliver
--
Basically yes. Subsystems and drivers are allowed to keep devices suspended if they were suspended before the system went to sleep. Remember, the purpose of the resume method is to let drivers know that the system is now awake, not to force them to put their devices into a high-power state. But under these circumstances we don't want to return an error code, since nothing really went wrong in the method. So the -EHOSTUNREACH should be changed to 0. Alan Stern --
Well, sometimes it is exactly that what we desire, eg. as a side effect of lsusb. Should the callbacks have different semantics depending on the reason you call them? And how should that information be transferred? Regards Oliver --
Which callbacks are you referring to? When lsusb opens a device and does an autoresume, it does not call the same routine as the PM core does when resuming from a system sleep. lsusb ends up calling usb_autoresume_device() whereas the PM core ends up calling usb_resume(), which is the function you quoted earlier. Alan Stern --
But how is usb_autoresume_device() supposed to take to the driver? Furthermore suppose the interface woken up is storage. This will have to work across subsystem borders. Why not put it into generic code? Regards Oliver --
I'm not sure exactly what you're asking. However it's worth mentioning that once the dust from Rafael's changes settles down, I intend to submit a patch creating new PM_EVENT_* codes: PM_EVENT_USER_SUSPEND PM_EVENT_USER_RESUME PM_EVENT_REMOTE_WAKEUP PM_EVENT_AUTOSUSPEND PM_EVENT_AUTORESUME (and the corresponding PMSG_* definitions). The PM core will never use the new codes, but subsystems like USB will be able to use them internally. Does that answer your question? Alan Stern --
Hi Rafael etc. Do you mean to say in the first sentence "...that _do_ need to implement..."? If not, then extended sounds like a misnomer and the two sentences seem to contradict one another. Yeah, but right now, it seems to me to be a bogus limitation for drivers to have no way of automatically loading firmware when you're about to hibernate. (Of course I've since been reminded of the notifier chain - that should probably be mentioned here as the way of achieving this). By the way, I'm going to go on record now as saying I think dropping the freezer is a silly idea. I'm therefore currently considering including the freezer in TuxOnice from the time it gets dropped from mainline. I know that will only make it less likely that TuxOnIce gets merged, but I've given up caring about that anyway - caring about merging is pointless when the people who decide if it gets merged don't care. Regards, Nigel --
This is a tricky stuff, though, because the notifier is used for disabling the Well, I'm just not sure if dropping the freezer entirely will actually work, but we won't know that if we don't try. There's been a lot of pressure on going into this direction recently and in principle it seems to be doable at least for suspend. Hibernation is another issue, but IMO it's better to focus on suspend first. Thanks, Rafael --
Hi. For suspend, I agree with dropping its use. For hibernation... Nigel --
Well, perhaps it's better to disable user mode helpers directly from freeze_processes(). I'm not sure and that's why I added the comment about the availability of the user space during ->prepare(). Besides, for now, the freezer is necessary anyway, even for suspend. Thanks, Rafael --
The drivers should be fixed not to do that, no? -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html --
The problem is we generally don't know which drivers do that. I know a few, but there probably are others. So, the idea is to allow them to do that for some time, but log error messages indicating which drivers they are. These messages will then allow us to fix the drivers and once they've been fixed, we can apply a (very simple) patch to change the behavior of the core (this patch will be easy to revert if need be, whereas I really wouldn't like to be forced to revert the $subject patch by predictable regressions). Thanks, Rafael --
From: Rafael J. Wysocki <rjw@sisk.pl>
Implement new suspend and hibernation callbacks for the platform bus
type.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
drivers/base/platform.c | 296 ++++++++++++++++++++++++++++++++++++++--
include/linux/platform_device.h | 1
2 files changed, 289 insertions(+), 8 deletions(-)
Index: linux-2.6/include/linux/platform_device.h
===================================================================
--- linux-2.6.orig/include/linux/platform_device.h
+++ linux-2.6/include/linux/platform_device.h
@@ -53,6 +53,7 @@ struct platform_driver {
int (*suspend_late)(struct platform_device *, pm_message_t state);
int (*resume_early)(struct platform_device *);
int (*resume)(struct platform_device *);
+ struct pm_ext_ops *pm;
struct device_driver driver;
};
Index: linux-2.6/drivers/base/platform.c
===================================================================
--- linux-2.6.orig/drivers/base/platform.c
+++ linux-2.6/drivers/base/platform.c
@@ -453,6 +453,8 @@ int platform_driver_register(struct plat
drv->driver.suspend = platform_drv_suspend;
if (drv->resume)
drv->driver.resume = platform_drv_resume;
+ if (drv->pm)
+ drv->driver.pm = &drv->pm->base;
return driver_register(&drv->driver);
}
EXPORT_SYMBOL_GPL(platform_driver_register);
@@ -560,7 +562,9 @@ static int platform_match(struct device
return (strncmp(pdev->name, drv->name, BUS_ID_SIZE) == 0);
}
-static int platform_suspend(struct device *dev, pm_message_t mesg)
+#ifdef CONFIG_PM_SLEEP
+
+static int platform_legacy_suspend(struct device *dev, pm_message_t mesg)
{
int ret = 0;
@@ -570,7 +574,7 @@ static int platform_suspend(struct devic
return ret;
}
-static int platform_suspend_late(struct device *dev, pm_message_t mesg)
+static int platform_legacy_suspend_late(struct device *dev, pm_message_t mesg)
{
struct platform_driver *drv = to_platform_driver(dev->driver);
struct platform_device *pdev;
@@ -583,7 +587,7 ...I found a bug in this patch. Namely, platform_legacy_suspend_late() was used
instead of platform_legacy_suspend() and vice versa in platform_pm_freeze() and
platform_pm_freeze_noirq().
Corrected patch follows.
Thanks,
Rafael
---
From: Rafael J. Wysocki <rjw@sisk.pl>
Implement new suspend and hibernation callbacks for the platform bus
type.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
drivers/base/platform.c | 296 ++++++++++++++++++++++++++++++++++++++--
include/linux/platform_device.h | 1
2 files changed, 289 insertions(+), 8 deletions(-)
Index: linux-2.6/include/linux/platform_device.h
===================================================================
--- linux-2.6.orig/include/linux/platform_device.h
+++ linux-2.6/include/linux/platform_device.h
@@ -53,6 +53,7 @@ struct platform_driver {
int (*suspend_late)(struct platform_device *, pm_message_t state);
int (*resume_early)(struct platform_device *);
int (*resume)(struct platform_device *);
+ struct pm_ext_ops *pm;
struct device_driver driver;
};
Index: linux-2.6/drivers/base/platform.c
===================================================================
--- linux-2.6.orig/drivers/base/platform.c
+++ linux-2.6/drivers/base/platform.c
@@ -453,6 +453,8 @@ int platform_driver_register(struct plat
drv->driver.suspend = platform_drv_suspend;
if (drv->resume)
drv->driver.resume = platform_drv_resume;
+ if (drv->pm)
+ drv->driver.pm = &drv->pm->base;
return driver_register(&drv->driver);
}
EXPORT_SYMBOL_GPL(platform_driver_register);
@@ -560,7 +562,9 @@ static int platform_match(struct device
return (strncmp(pdev->name, drv->name, BUS_ID_SIZE) == 0);
}
-static int platform_suspend(struct device *dev, pm_message_t mesg)
+#ifdef CONFIG_PM_SLEEP
+
+static int platform_legacy_suspend(struct device *dev, pm_message_t mesg)
{
int ret = 0;
@@ -570,7 +574,7 @@ static int platform_suspend(struct devic
return ret;
}
-static int ...From: Rafael J. Wysocki <rjw@sisk.pl>
Implement new suspend and hibernation callbacks for the PCI bus type.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
drivers/pci/pci-driver.c | 371 ++++++++++++++++++++++++++++++++++++++++++-----
include/linux/pci.h | 2
2 files changed, 334 insertions(+), 39 deletions(-)
Index: linux-2.6/drivers/pci/pci-driver.c
===================================================================
--- linux-2.6.orig/drivers/pci/pci-driver.c
+++ linux-2.6/drivers/pci/pci-driver.c
@@ -271,7 +271,52 @@ static int pci_device_remove(struct devi
return 0;
}
-static int pci_device_suspend(struct device * dev, pm_message_t state)
+static void pci_device_shutdown(struct device *dev)
+{
+ struct pci_dev *pci_dev = to_pci_dev(dev);
+ struct pci_driver *drv = pci_dev->driver;
+
+ if (drv && drv->shutdown)
+ drv->shutdown(pci_dev);
+}
+
+#ifdef CONFIG_PM_SLEEP
+
+/*
+ * Default "suspend" method for devices that have no driver provided suspend,
+ * or not even a driver at all.
+ */
+static void pci_default_pm_suspend(struct pci_dev *pci_dev)
+{
+ pci_save_state(pci_dev);
+ /*
+ * mark its power state as "unknown", since we don't know if
+ * e.g. the BIOS will change its device state when we suspend.
+ */
+ if (pci_dev->current_state == PCI_D0)
+ pci_dev->current_state = PCI_UNKNOWN;
+}
+
+/*
+ * Default "resume" method for devices that have no driver provided resume,
+ * or not even a driver at all.
+ */
+static int pci_default_pm_resume(struct pci_dev *pci_dev)
+{
+ int retval = 0;
+
+ /* restore the PCI config space */
+ pci_restore_state(pci_dev);
+ /* if the device was enabled before suspend, reenable */
+ retval = pci_reenable_device(pci_dev);
+ /* if the device was busmaster before the suspend, make it busmaster again */
+ if (pci_dev->is_busmaster)
+ pci_set_master(pci_dev);
+
+ return retval;
+}
+
+static int pci_legacy_suspend(struct device * dev, pm_message_t state)
{
struct pci_dev * pci_dev = ...