I don't know what exactly i915_suspend() and i915_resume() are
supposed to do, because it works better without them.

After inserting "return 0;" right at the top of those two functions,
suspend (with a proper power-off) and resume (without the green screen)
work just fine.

I would like to know what they're for.

I tested suspend-to-ram and suspend-to-disk, both from the console and from
X, on the notebook's internal LCD display; everything works without these
two functions.

But, anyway, I got it down to just one line in i915_drv.c causing the hang
during suspend: "pci_set_power_state(dev->pdev, PCI_D3hot);".

And the green screen problem during resume is caused by i915_restore_vga(dev);

So, let me know where to go from here.

Thanks,
Jeff.

--- linux/drivers/char/drm/i915_drv.c.bad 2008-02-20 11:29:14 +0800
+++ linux/drivers/char/drm/i915_drv.c 2008-02-21 00:58:37 +0800
@@ -369,7 +369,7 @@
 	if (state.event == PM_EVENT_SUSPEND) {
 		/* Shut down the device */
 		pci_disable_device(dev->pdev);
-		pci_set_power_state(dev->pdev, PCI_D3hot);
+		//pci_set_power_state(dev->pdev, PCI_D3hot);
 	}
 
 	return 0;
@@ -521,7 +521,7 @@
 	for (i = 0; i < 3; i++)
 		I915_WRITE(SWF30 + (i << 2), dev_priv->saveSWF2[i]);
 
-	i915_restore_vga(dev);
+	//i915_restore_vga(dev);
 
 	return 0;
 }
--
..
Does this machine have more than one CPU core? If so, does your kernel
have CONFIG_HOTPLUG_CPU=y? (If not, enable it.)
--
They're for saving and restoring GPU state across suspend/resume. They're
particularly useful if your machine doesn't re-POST at resume time. In that
case your GPU may be totally uninitialized, so either the kernel or X has to

I know I fixed that problem in at least one configuration... Can you try:
# echo test > /sys/power/disk
# echo disk > /sys/power/state
and see if that also turns your screen green?

Also, getting a GPU register dump would be helpful. The intel_reg_dumper tool
is built as part of the xf86-video-intel driver build
(git://anongit.freedesktop.org/git/xorg/driver/xf86-video-intel), can you
pull that down and try it out?

Thanks,
Jesse
--
Yes, still green. But I got it to actually reboot with ...
echo reboot > /sys/power/disk
Attached are the two dumps from console. One prior to suspend, and one
after resume.

Thanks,
Jeff.
--
Looks like the AR registers are hosed, which is what I thought I fixed... Can
you attach your i915_drv.c file just so I can sanity check it?

Thanks,
Jesse
--
Attached.
Thanks,
Jeff.
--
Jeff, for the hang on suspend problem, I now suspect something else in
2.6.25-rc2 caused that.

Can you try the 2.6.25-rc1 version of i915_drv.c (in fact all of
drivers/char/drm from 2.6.25-rc1) but in a 2.6.25-rc2 kernel? I ask because
2.6.25-rc1 suspends to disk just fine for me and resumes w/o a green screen,
while 2.6.25-rc2 fails to suspend (hangs like you say) and gives me a green
screen.

Were there other changes in ACPI or the PM core that might have caused this, I
wonder?

Thanks,
Jesse
--
Looks like 2.6.25-rc1 also had broken suspend (my test was broken). IIRC,
Dave and I had it working at LCA using the out of tree DRM modules on
2.6.23.14 or 15... Maybe you could give that a try?

Thanks,
Jesse
--
And just to confirm that, I just tested the current DRM modules against a
2.6.23.15 kernel. It suspends to disk correctly (w/o a hang) and doesn't
give me a green screen, so something in 2.6.25 must be causing that (even
2.6.25-rc1 seems to have the problem).

Also, this patch against 2.6.25-rc1 seemed to prevent the 'green screen'
problem. 2.6.25-rc2 already has part of it...

Anyway, let me know how your testing goes.
Thanks,
Jesse
--
In 2.6.23.x there's no second ->suspend() during hibernation, so no wonder.
I'll figure out how to work around this issue in the current mainline, but a
real fix will only be possible when we have separate callbacks for
hibernation.

Thanks,
Rafael
--
In 2.6.23 it's just:
->suspend()
->resume()
*S4*
?

I ask because we still do the D3hot call in the DRM tree, so the hang should

Ok, thanks.
Jesse
--
->shutdown()
(that breaks wake up from S4 with many devices, including but not limited to
Thanks,
Rafael
--
Ok, can you give this patch a try with the 'platform' method? It should at
least tell us what ACPI would like the device to do at suspend time, but it
probably won't fix the hang.

Thanks,
Jesse

diff --git a/drivers/char/drm/i915_drv.c b/drivers/char/drm/i915_drv.c
index 4048f39..d8aa2c9 100644
--- a/drivers/char/drm/i915_drv.c
+++ b/drivers/char/drm/i915_drv.c
@@ -366,11 +366,11 @@ static int i915_suspend(struct drm_device *dev, pm_message_t state)
 
 	i915_save_vga(dev);
 
-	if (state.event == PM_EVENT_SUSPEND) {
-		/* Shut down the device */
-		pci_disable_device(dev->pdev);
-		pci_set_power_state(dev->pdev, PCI_D3hot);
-	}
+	/* Ask ACPI which state the device should be put in */
+	pci_disable_device(dev->pdev);
+	printk("calling pci_set_power_state with %d\n",
+	       acpi_pci_choose_state(dev, state));
+	pci_set_power_state(dev->pdev, acpi_pci_choose_state(dev, state));
 
 	return 0;
 }
@@ -380,7 +380,7 @@ static int i915_resume(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	int i;
 
-	pci_set_power_state(dev->pdev, PCI_D0);
+	pci_set_power_state(dev->pdev, acpi_pci_choose_state(dev, state));
 	pci_restore_state(dev->pdev);
 	if (pci_enable_device(dev->pdev))
 		return -1;
--
I can't get it to compile.
drivers/char/drm/i915_drv.c: In function 'i915_suspend':
drivers/char/drm/i915_drv.c:372: error: implicit declaration of function 'acpi_pci_choose_state'
drivers/char/drm/i915_drv.c: In function 'i915_resume':
drivers/char/drm/i915_drv.c:383: error: 'state' undeclared (first use in this function)
drivers/char/drm/i915_drv.c:383: error: (Each undeclared identifier is reported only once
drivers/char/drm/i915_drv.c:383: error: for each function it appears in.)
make[3]: *** [drivers/char/drm/i915_drv.o] Error 1
make[2]: *** [drivers/char/drm] Error 2
make[1]: *** [drivers/char] Error 2
make: *** [drivers] Error 2

Thanks,
Jeff.
--
And this change should just be reverted (leave it as PCI_D0).
Thanks,
Jesse
--
It says "calling pci_set_power_state with 3". After all that, it
still hangs, and then resumes with Mr Green.

PM: Syncing filesystems ... done.
Freezing user space processes ... (elapsed 0.00 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
PM: Shrinking memory... ^H-^Hdone (0 pages freed)
PM: Freed 0 kbytes in 0.20 seconds (0.00 MB/s)
ACPI: Preparing to enter system sleep state S4
Suspending console(s)
sd 0:0:0:0: [sda] Synchronizing SCSI cache
drm_sysfs_suspend
ACPI: PCI interrupt for device 0000:00:02.0 disabled
calling pci_set_power_state with 3
ACPI: PCI interrupt for device 0000:00:1d.7 disabled
ACPI: PCI interrupt for device 0000:00:1d.3 disabled
ACPI: PCI interrupt for device 0000:00:1d.2 disabled
ACPI: PCI interrupt for device 0000:00:1d.1 disabled
ACPI: PCI interrupt for device 0000:00:1d.0 disabled
ACPI: PCI interrupt for device 0000:00:1b.0 disabled
Disabling non-boot CPUs ...
PM: Creating hibernation image:
PM: Need to copy 25136 pages
tick-braodcast: ignoring broadcast for offline CPU #1
PM: Writing back config space on device 0000:00:02.0 at offset 1 (was 900007, writing 900003)
ACPI: PCI Interrupt 0000:00:1b.0[B] -> GSI 17 (level, low) -> IRQ 17
PCI: Setting latency timer of device 0000:00:1b.0 to 64
PCI: Setting latency timer of device 0000:00:1c.0 to 64
PCI: Setting latency timer of device 0000:00:1c.1 to 64
...

Thanks,
Jeff.
--
So it returns the right value.
Jeff, Jesse, please check one thing for me.
Please boot 2.6.25-rc2 (or better, the current head of Linus' tree) with
no_console_suspend and try to do the following:

# echo 8 > /proc/sys/kernel/printk
# echo core > /sys/power/pm_test
# echo disk > /sys/power/state

(that will run a test of the freeze/unfreeze code without creating the image)
and then

# echo mem > /sys/power/state

(that will run a test of the suspend/resume code without actually suspending).
I'd like to know if that works.
Thanks,
Rafael
--
That comes back for me, without creating the green screen. There's a long
delay between it saying "entering S4" and actually resuming back to my

This also works (after doing the echo disk > ... above). There's still a
delay between "entering S3" and the resume to my console though.

Jesse
--
Below is a patch that should work around the issue. Please try it and let me
know if it helps.

Thanks,
Rafael

---
 drivers/char/drm/i915_drv.c | 3 +++
 include/linux/suspend.h | 2 ++
 kernel/power/disk.c | 9 ++++++++-
 3 files changed, 13 insertions(+), 1 deletion(-)

Index: linux-2.6/include/linux/suspend.h
===================================================================
--- linux-2.6.orig/include/linux/suspend.h
+++ linux-2.6/include/linux/suspend.h
@@ -209,6 +209,7 @@ extern unsigned long get_safe_page(gfp_t
 
 extern void hibernation_set_ops(struct platform_hibernation_ops *ops);
 extern int hibernate(void);
+extern bool in_hibernation_power_off(void);
 #else /* CONFIG_HIBERNATION */
 static inline int swsusp_page_is_forbidden(struct page *p) { return 0; }
 static inline void swsusp_set_page_free(struct page *p) {}
@@ -216,6 +217,7 @@ static inline void swsusp_unset_page_fre
 
 static inline void hibernation_set_ops(struct platform_hibernation_ops *ops) {}
 static inline int hibernate(void) { return -ENOSYS; }
+static inline bool in_hibernation_power_off(void) { return false; }
 #endif /* CONFIG_HIBERNATION */
 
 #ifdef CONFIG_PM_SLEEP
Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -24,7 +24,7 @@
 
 #include "power.h"
 
-
+static bool entering_sleep_state;
 static int noresume = 0;
 static char resume_file[256] = CONFIG_PM_STD_PARTITION;
 dev_t swsusp_resume_device;
@@ -381,6 +381,7 @@ int hibernation_platform_enter(void)
 	if (!hibernation_ops)
 		return -ENOSYS;
 
+	entering_sleep_state = true;
 	/*
 	 * We have cancelled the power transition by running
 	 * hibernation_ops->finish() before saving the image, so we should let
@@ -412,6 +413,7 @@ int hibernation_platform_enter(void)
 	}
 	local_irq_enable();
 
+	entering_sleep_state = false;
 	/*
 	 * We don't need to reenable the nonboot...
--
I ended up applying the below patch instead, so it would build, and
unfortunately it still hung at suspend time.

So at this point, the known workarounds to the hang at suspend time are to
remove the device power down call or to boot with 'no_console_suspend'.
The 'screen turns green' problem is fixed by the extra 'inb' added in the
patch below (at least for me).

Jesse
diff --git a/drivers/char/drm/i915_drv.c b/drivers/char/drm/i915_drv.c
index 35758a6..35b5a60 100644
--- a/drivers/char/drm/i915_drv.c
+++ b/drivers/char/drm/i915_drv.c
@@ -27,6 +27,7 @@
  *
  */
 
+#include <linux/suspend.h>
 #include "drmP.h"
 #include "drm.h"
 #include "i915_drm.h"
@@ -222,6 +223,7 @@ static void i915_restore_vga(struct drm_device *dev)
 		   dev_priv->saveGR[0x18]);
 
 	/* Attribute controller registers */
+	inb(st01); /* switch back to index mode */
 	for (i = 0; i < 20; i++)
 		i915_write_ar(st01, i, dev_priv->saveAR[i], 0);
 	inb(st01); /* switch back to index mode */
@@ -249,6 +251,9 @@ static int i915_suspend(struct drm_device *dev)
 		return -ENODEV;
 	}
 
+	if (in_hibernation_power_off())
+		return 0;
+
 	pci_save_state(dev->pdev);
 	pci_read_config_byte(dev->pdev, LBB, &dev_priv->saveLBB);
 
@@ -364,7 +369,6 @@ static int i915_suspend(struct drm_device *dev)
 	i915_save_vga(dev);
 
 	/* Shut down the device */
-	pci_disable_device(dev->pdev);
 	pci_set_power_state(dev->pdev, PCI_D3hot);
 
 	return 0;
diff --git a/include/linux/suspend.h b/include/linux/suspend.h
index 1d7d4c5..58d9f67 100644
--- a/include/linux/suspend.h
+++ b/include/linux/suspend.h
@@ -209,6 +209,7 @@ extern unsigned long get_safe_page(gfp_t gfp_mask);
 
 extern void hibernation_set_ops(struct platform_hibernation_ops *ops);
 extern int hibernate(void);
+extern bool in_hibernation_power_off(void);
 #else /* CONFIG_HIBERNATION */
 static inline int swsusp_page_is_forbidden(struct page *p) { return 0; }
 static inline void swsusp_set_page_free(struct p...
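
The "switch back to index mode" comment on the added inb(st01) refers to the VGA attribute controller's index/data flip-flop: index and data writes share one port, and reading input status register 1 resets the flip-flop so that the next write is taken as an index. What follows is only a toy userspace model of that mechanism, not driver code. struct attr_ctrl and the model_* helpers are made-up names, and the sketch makes no claim about what i915_write_ar() does internally or about the actual root cause of the green screen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy attribute controller: one shared port and one flip-flop. */
struct attr_ctrl {
	bool expect_data;	/* false: next write is an index, true: data */
	uint8_t index;		/* currently selected register */
	uint8_t regs[32];	/* model register file, larger than needed */
};

/* Models a read of input status register 1: resets the flip-flop. */
static void model_read_st01(struct attr_ctrl *ac)
{
	ac->expect_data = false;
}

/* Models a write to the shared index/data port. */
static void model_write_ar_port(struct attr_ctrl *ac, uint8_t val)
{
	if (ac->expect_data)
		ac->regs[ac->index & 0x1f] = val;
	else
		ac->index = val;
	ac->expect_data = !ac->expect_data;
}

/*
 * Restore 'count' saved values as index/data pairs. With reset_first set,
 * the flip-flop is forced to index mode before the first write.
 */
static void model_restore_ar(struct attr_ctrl *ac, const uint8_t *saved,
			     int count, bool reset_first)
{
	int i;

	assert(count >= 0 && count <= 32);
	if (reset_first)
		model_read_st01(ac);
	for (i = 0; i < count; i++) {
		model_write_ar_port(ac, (uint8_t)i);	/* index */
		model_write_ar_port(ac, saved[i]);	/* data */
	}
}
```

In this model, if the flip-flop happens to be in the data state when the restore starts and nothing resets it, every index is consumed as data and every value as an index, so the register file ends up scrambled. Resetting first makes the same loop restore every register.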
--
I encountered the same patching problem, but realized that it was due
to the earlier patch that you had wanted me to test, so if you revert your
patch back to the current git, Rafael's patch will apply and compile
cleanly.

Thanks,
Jeff.
--
This thing should make i915_suspend() a noop in the last phase of hibernation,
so if it still only works when you remove the
pci_set_power_state(dev->pdev, PCI_D3hot), then I don't get it.

Can you please try the patch below instead?

Thanks,
Rafael

---
 drivers/char/drm/i915_drv.c | 5 +++--
 include/linux/suspend.h | 2 ++
 kernel/power/disk.c | 10 +++++++++-
 3 files changed, 14 insertions(+), 3 deletions(-)

Index: linux-2.6/drivers/char/drm/i915_drv.c
===================================================================
--- linux-2.6.orig/drivers/char/drm/i915_drv.c
+++ linux-2.6/drivers/char/drm/i915_drv.c
@@ -27,6 +27,7 @@
  *
  */
 
+#include <linux/suspend.h>
 #include "drmP.h"
 #include "drm.h"
 #include "i915_drm.h"
@@ -222,6 +223,7 @@ static void i915_restore_vga(struct drm_
 		   dev_priv->saveGR[0x18]);
 
 	/* Attribute controller registers */
+	inb(st01); /* switch back to index mode */
 	for (i = 0; i < 20; i++)
 		i915_write_ar(st01, i, dev_priv->saveAR[i], 0);
 	inb(st01); /* switch back to index mode */
@@ -366,9 +368,8 @@ static int i915_suspend(struct drm_devic
 
 	i915_save_vga(dev);
 
-	if (state.event == PM_EVENT_SUSPEND) {
+	if (state.event == PM_EVENT_SUSPEND && !in_hibernation_power_off()) {
 		/* Shut down the device */
-		pci_disable_device(dev->pdev);
 		pci_set_power_state(dev->pdev, PCI_D3hot);
 	}
 
Index: linux-2.6/include/linux/suspend.h
===================================================================
--- linux-2.6.orig/include/linux/suspend.h
+++ linux-2.6/include/linux/suspend.h
@@ -20...
--
There's an intentional 5 sec. wait. If the delay is longer than 5 sec., that's a

If that's 5 sec., it's fine.
Please apply the appended patch and try to hibernate. I wonder if you get the
reboot or it hangs earlier.

Thanks,
Rafael

---
 kernel/power/disk.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -405,11 +405,7 @@ int hibernation_platform_enter(void)
 
 	local_irq_disable();
 	error = device_power_down(PMSG_SUSPEND);
-	if (!error) {
-		hibernation_ops->enter();
-		/* We should never get here */
-		while (1);
-	}
+	mdelay(1000);
 	local_irq_enable();
 
 	/*
@@ -424,6 +420,7 @@ int hibernation_platform_enter(void)
 	resume_console();
  Close:
 	hibernation_ops->end();
+	kernel_restart(NULL);
 	return error;
 }
 
--
drivers/char/drm/i915_drv.c: In function 'i915_suspend':
drivers/char/drm/i915_drv.c:372: warning: passing argument 1 of 'pci_choose_state' from incompatible pointer type
drivers/char/drm/i915_drv.c:373: warning: passing argument 1 of 'pci_choose_state' from incompatible pointer type

I hope those are just warnings that can be ignored.
Ok, rebooting and will get back shortly.
Thanks,
Jeff.
--
Oops again, should be dev->pdev. Silly DRM layer obfuscation.
Jesse
--
I was just about to write that the test didn't work. Both std and str
hang even before attempting to suspend.

Anyway, I'm compiling and rebooting now.
Thanks,
Jeff.
--
Ok, so Linus' theory about something later in the resume path trying to touch
video is looking good.

Hm, looks right. Let me see if I can reproduce this on my T61.
Thanks,
Jesse
--
Given the way the PM core works, do we need to set a flag like this? I really
hope there's a better way of doing this...

Thanks,
Jesse

diff --git a/drivers/char/drm/i915_drv.c b/drivers/char/drm/i915_drv.c
index 4048f39..a2d6242 100644
--- a/drivers/char/drm/i915_drv.c
+++ b/drivers/char/drm/i915_drv.c
@@ -238,6 +238,13 @@ static void i915_restore_vga(struct drm_device *dev)
 
 }
 
+/*
+ * If we're doing a suspend to disk, we don't want to power off the device.
+ * Unfortunately, the PM core doesn't tell us if we're headed for a regular
+ * S3 state or that it's about to shut down the machine, so we use this flag.
+ */
+static int i915_hibernate;
+
 static int i915_suspend(struct drm_device *dev, pm_message_t state)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -252,6 +259,9 @@ static int i915_suspend(struct drm_device *dev, pm_message_t state)
 	if (state.event == PM_EVENT_PRETHAW)
 		return 0;
 
+	if (state.event == PM_EVENT_FREEZE)
+		i915_hibernate = 1;
+
 	pci_save_state(dev->pdev);
 	pci_read_config_byte(dev->pdev, LBB, &dev_priv->saveLBB);
 
@@ -366,7 +376,7 @@ static int i915_suspend(struct drm_device *dev, pm_message_t state)
 
 	i915_save_vga(dev);
 
-	if (state.event == PM_EVENT_SUSPEND) {
+	if (!i915_hibernate) {
 		/* Shut down the device */
 		pci_disable_device(dev->pdev);
 		pci_set_power_state(dev->pdev, PCI_D3hot);
@@ -385,6 +395,8 @@ static int i915_resume(struct drm_device *dev)
 	if (pci_enable_device(dev->pdev))
 		return -1;
 
+	i915_hibernate = 0;
+
 	pci_write_config_byte(dev->pdev, LBB, dev_priv->saveLBB);
 
 	/* Pipe & plane A info */
--
Then, the .resume() called after the image creation will clear the flag and I
don't think it's safe to allow it to survive i915_resume() ...

Thanks,
Rafael
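
The objection can be checked with a small userspace model of the hibernation sequence described in this thread: ->suspend(PMSG_FREEZE), then ->resume() after the image is created, then ->suspend(PMSG_SUSPEND) just before power off. This is a sketch under that assumption only; struct model_dev, the EV_* constants and the model_* functions are illustrative stand-ins, not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for PM_EVENT_FREEZE and PM_EVENT_SUSPEND. */
enum model_event { EV_FREEZE, EV_SUSPEND };

struct model_dev {
	bool hibernate_flag;	/* plays the role of i915_hibernate */
	bool powered_down;	/* true once the D3hot step would have run */
};

/* Mirrors the flag logic of the patch: FREEZE sets the flag. */
static void model_suspend(struct model_dev *d, enum model_event ev)
{
	if (ev == EV_FREEZE)
		d->hibernate_flag = true;
	if (!d->hibernate_flag)
		d->powered_down = true;
}

/* Mirrors the resume path of the patch: the flag is cleared. */
static void model_resume(struct model_dev *d)
{
	d->powered_down = false;
	d->hibernate_flag = false;
}

/* Runs the hibernation sequence; reports whether the last suspend
 * powered the device down. */
static bool model_hibernate_powers_down(void)
{
	struct model_dev d = { false, false };

	model_suspend(&d, EV_FREEZE);	/* before the image is created */
	assert(!d.powered_down);
	model_resume(&d);		/* after the image is created */
	model_suspend(&d, EV_SUSPEND);	/* just before power off */
	return d.powered_down;
}
```

In the model, the resume between the two suspend calls clears the flag, so the final suspend still takes the power-down path. That matches the concern raised above: the flag would have to survive the intermediate resume to have any effect.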
--
Yeah. By *not* using "->suspend()" for freezing or hibernate.
Please, Rafael - just make the f*cking suspend-to-disk use other routines
already. 99% of all hardware needs to do exactly *nothing* on
suspend-to-disk, and the ones that really do need things tend to need to
not do a whole lot.

For example, the "freeze" action for USB (which is one of the hardest
things to suspend) should literally be something like just setting the
controller STOP bit, and waiting for it to have stopped. The "unfreeze"
should be to just clear the stop bit, while the "restart" should be just a
controller reset to use the current memory image.

NONE OF THIS HAS ABSOLUTELY ANYTHING TO DO WITH SUSPEND.
It never did. I've told people so for years. Maybe actually seeing the
problems will make people realize.

So please, we shouldn't call "->suspend[_late]" or "->resume[_early]" at
all. Not with PMSG_FREEZE, not with PMSG_*anything*.

Can we please get this fixed some day?
Linus
--
On Wednesday 20 February 2008 at 3:29 pm, Linus Torvalds penned
I can't say I even come close to understanding what's going on but
getting s2ram to work on my Dell M4300 has been a nightmare. Even
after writing up how to get it to work (posted on the suspend-devel
list - but no one answered .. yet again), I'm having some quirks.

If I had a bizillion $'s, I'd buy an M4300 for Linus and give him a
million to get it to s2ram! :p

Cheers,
--
Pablo Sanchez - Blueoak Database Engineering, Inc
Ph: 819.459.1926 Toll free: 888.459.1926
Fax: 603.720.7723 (US) Text Page: pablo_p@blueoakdb.com
--
Okay, I think I'll just start sending patches for that, but rather not earlier
than in the 2.6.27 time frame. No one else works on that and I've been busy
with other things recently. Besides, I'm not even a full time kernel

Yes, we can (hopefully).
Thanks,
Rafael
--
Rafael,
If I can help, please say so.

Regards,
--
In talking with Rafael on IRC about this, I think we're agreed that we need
separate entry points. Even with a kexec based hibernate, we'll probably
want ->hibernate callbacks so we don't end up shutting down the device.

The current callback system looks like this (according to Rafael and the last
time I looked):
->suspend(PMSG_FREEZE)
->resume()
->suspend(PMSG_SUSPEND)
*enter S3 or power off*
->resume()
The fact that we get suspend/resume called once before suspend again in the
hibernate case is somewhat obnoxious, but it's even worse that we don't know
what we're about to enter after ->suspend(PMSG_SUSPEND). So in the short
term it would be nice to at least get the target state exported.

And in the long term we could have:
->suspend()
*enter S3*
->resume()
or:
->hibernate()
*kexec to another kernel to save image*
*power off*
->return_from_hibernate() (or somesuch)

Jesse
--
Yes, it's very messy.
It's messy for a few different reasons:
- the one you hit: a driver actually has a really hard time telling what
PMSG_SUSPEND really means.

- more importantly, we generally don't want to "suspend/resume" the
hardware at all around a power-off, because we're going to resume with
the state at the time of the PMSG_FREEZE, which means that the hardware
has actually *changed* and been used in between!

That second case is very fundamental for things like USB devices, which in
theory you can hold alive over a real suspend event (ie a STR event), but
which absolutely MUST NOT be resumed over a suspend-to-disk event, because
all the low-level request state is bogus!

So the "->resume" really isn't a resume at all. It's much closer to a
"->reset".

Of course, the "solution" to this all right now is that we have to reset
everything even if it *is* a suspend event, so it basically means that STR
ends up using the much weaker model that snapshot-to-disk uses.

The fundamental problem being that the two really have nothing

Yes, apart from all the complexities (suspend_late/resume_early). So in
reality it's more than that, but the suspend/resume things are clearly
nesting, and they have the potential to actually keep state around
(because we *know* this machine is not going to mess with the devices in
between).

IOW, here we actually can have as an option "assume the device is there

Enough people don't trust kexec that I suspect the right thing simply is
->freeze() // stop dma, synchronize device state
*snapshot*
->unfreeze(); // resume dma
*save image*
[ optionally ->poweroff() ] // do we really care? I'd say no
*power off*
->restore() // reset device to the frozen one

which may have four entry-points that can be illogically mapped to the
suspend/resume ones like we do now, but they really have nothing to do
with suspending/resuming.

And notice how while "freeze/restore" kind of pairs...
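
The four entry points listed above can be sketched as a callback table that is separate from suspend/resume. This is a userspace illustration only: struct model_hib_ops, struct model_ctrl and the helper names are hypothetical, and the mock controller simply follows the earlier USB description (freeze stops it, unfreeze restarts it, restore resets it).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-driver table with the four entry points. */
struct model_hib_ops {
	int (*freeze)(void *dev);	/* stop DMA, synchronize device state */
	int (*unfreeze)(void *dev);	/* resume DMA so the image can be saved */
	int (*poweroff)(void *dev);	/* optional, may be NULL */
	int (*restore)(void *dev);	/* reset device to the frozen state */
};

/* Mock controller with a stop bit and a reset counter. */
struct model_ctrl {
	int stopped;
	int resets;
};

static int ctrl_freeze(void *dev)
{
	((struct model_ctrl *)dev)->stopped = 1;
	return 0;
}

static int ctrl_unfreeze(void *dev)
{
	((struct model_ctrl *)dev)->stopped = 0;
	return 0;
}

static int ctrl_restore(void *dev)
{
	struct model_ctrl *c = dev;

	c->resets++;
	c->stopped = 0;
	return 0;
}

/* No poweroff hook: this mock device needs nothing before power off. */
static const struct model_hib_ops ctrl_ops = {
	ctrl_freeze, ctrl_unfreeze, NULL, ctrl_restore
};

/* Image-creation side: freeze, snapshot, unfreeze, save, optional poweroff. */
static int model_hibernate(const struct model_hib_ops *ops, void *dev)
{
	int err;

	assert(ops && ops->freeze && ops->unfreeze);
	err = ops->freeze(dev);
	if (err)
		return err;
	/* the snapshot would be taken here */
	err = ops->unfreeze(dev);
	if (err)
		return err;
	/* the image would be written out here */
	return ops->poweroff ? ops->poweroff(dev) : 0;
}

/* Boot-after-hibernation side: only restore is called. */
static int model_restore(const struct model_hib_ops *ops, void *dev)
{
	assert(ops && ops->restore);
	return ops->restore(dev);
}
```

Nothing in this table powers the device down unless the driver opts in through poweroff, which is the property the message argues for.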
--
Well, it seems like we'll have to fix drivers in either case, and isn't a
kexec approach fundamentally more sound and simple, design-wise? Rafael
pointed out some problems with properly setting wakeup states, but I think

Yeah, definitely. It has to be much more robust and deal with configuration
changes, etc. (within reason).

Jesse
--
Hi.
No. AFAICS, kexec is going to be more complex and ugly in many ways.
To summarise, a kexec based hibernation is going to need the following
additional requirements to just replace what we already have:

- get the original kernel to allocate storage while racing against the
rest of the system (currently allocation is done post-atomic copy &
post-freezing - no racing). This makes it potentially slower, too;
- get the original kernel to transfer the information about what swap
was allocated to the kexec'd kernel, probably together with a lot of
other information (which pages are nosave etc).
- get the original kernel to keep memory free for the kexec'd kernel
which would otherwise be usable. Not a biggy on desktops or laptops, but
think about embedded.
- people keep talking about hibernating to an ext3 fs mounted on fuse as
a limitation of the freezer. To do that with kexec, you're still going
to have to bmap the ext3 fs and pass the block list (in which case we
can also do it without kexec) or umount all the ext3/fuse part and
remount in the kexec'd kernel. Sort of defeats the purpose, doesn't it?

I also wonder about how much of a pain it's going to be setting up
userspace for this kexec'd kernel. Will you need a separate partition
just for it? If not, will the userspace be loaded into memory all the
time (more memory wasted for normal use), or loaded from ordinary
partitions at kexec time (how to do safely? - more info to transfer
between kernels?).

I'd love it if kexec really was the panacea to the freezer issues, but
problems like these make me think it isn't a viable solution.

Nigel
--
No, with a freezer-based model you can basically *never* suspend to
anything related to FUSE or a userspace USB device or anything involving
userspace iSCSI initiators or whatever. Sure, there are cases where
moving away from the current model doesn't buy you anything, but that
doesn't mean that the current model is a good thing. It's not. The

You're looking at a tiny amount of memory when compared to current
systems. It's really not a problem.

--
Matthew Garrett | mjg59@srcf.ucam.org
--
Hi.
Putting drivers and filesystems in userspace is the fundamentally broken
concept. Not just when it comes to the freezer. The whole idea is
inherently racy. You can draw silly diagrams about how the freezer
supposedly works in LCA slides and spread FUD as much as you like. In
the end, though, it's not nearly as hit-and-miss as you say, and
replacing the freezer with a kexec based freezer is only going to create

Please, quantify 'tiny'. In embedded, 5MB can be too much. I've worked
on embedded solutions. I'm not pulling problems out of thin air.

Regards,
Nigel
--
I'm really not interested in debating the matter. There are all sorts of
potential uses for the freezer, but hibernation isn't one of them. We
*need* to get rid of the freezer for suspend to RAM (because a band-aid
to ensure atomicity is kind of pointless when the operation you're
entering is inherently atomic), and once all the drivers are able to
deal with that then it's trivial to get rid of it for hibernation as
well. Arguing that the reality of userspace drivers is broken doesn't

Then the in-kernel solution has already lost anyway, and I'm desperately
unconcerned about out of tree stuff.
--
Matthew Garrett | mjg59@srcf.ucam.org
--
Hi.
Re suspend to ram, I agree. No argument there. Re hibernation, I think
your assertion that it will be trivial to get rid of it for hibernation
is just plain wrong. Perhaps you don't understand the issues as well as
you think you do.

Re arguing that the reality of userspace drivers is broken doesn't help
here: Yeah, I know. But sometimes if you point out broken ideas for long
enough, people do actually listen. Or you learn. Or both.

Frankly, I don't want to debate the issue either. What I really want is
just to have a hibernation implementation that works, is flexible,
reliable and quick, and one that I don't have to keep maintaining.
Unfortunately for me, most people seem to be more concerned with fixing
hypothetical problems than with giving users something they can actually

I know. I'd submit it, or work on breaking it into pieces and submitting
them one at a time, but that seems to me to be a waste of time.

Nigel
--
Racy with regards to other things besides trying to suspend a machine?
If so, what?

thanks,
greg k-h
--
Hi.
That depends on what sort of tangled web you want to weave. Low memory
situations is one other situation that occurs to me quickly, especially
(though not only) if your ability to swap were to depend upon a
userspace driver and/or filesystem.

Regards,
Nigel
--
Lots of them :)
We have tanks running Linux using userspace USB drivers for vision
control systems (scary, I know...) They seem to be successfully running
for many years now, and I'm interested in making sure those kinds of
things keep working.

We also have laser welding robots with userspace PCI drivers in car
manufacturing plants. And other laser cutting robots slicing wood in
patterns moving at a rate of over 3 meters a second. Again, with
userspace drivers and Linux.

Those users would also love to know of any potential problems you know

Sure, swap over a userspace filesystem or driver isn't a sane idea. And
neither is swapping over NFS over a PPP connection attached to a USB to
serial device. Yes, it's possible, and all in the kernel, but not a
wise decision.

Other than foolish configurations, if you come up with other issues
surrounding userspace drivers that could cause problems, please let me
know.

thanks,
greg k-h
--
Hi Greg.
A simple OOM condition isn't an issue? Surely a driver stalling because
some of its memory gets swapped out just before it goes to use it would
be a problem if it resulted in getting the length of a cut wrong or
caused some distorted vision or a late turn :>

Am I missing something? Maybe these drivers mlock memory to avoid those
issues or something like that?

Regards,
Nigel
--
I think they mlock their memory to prevent this from happening, it's not
hard when you control all the applications on the box :)

thanks,
greg k-h
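
For readers unfamiliar with the technique, a userspace process can pin its pages with the POSIX mlockall() call so that they are not swapped out. The sketch below is a generic illustration, not code from any of the systems mentioned; lock_process_memory() is a made-up helper name, and the call can fail without sufficient privileges or a large enough RLIMIT_MEMLOCK.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Lock all current and future pages of the calling process into RAM.
 * Returns 0 on success, or the errno value reported by mlockall().
 */
static int lock_process_memory(void)
{
	if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
		int err = errno;

		fprintf(stderr, "mlockall failed: %s\n", strerror(err));
		return err;
	}
	return 0;
}
```

A userspace driver would typically call something like this once at startup, before entering its time-critical loop, and treat a failure as a configuration error.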
--
In fact the driver can find out in which state to put the device into,
We do, if there are devices that wake us up from S4 and don't wake us up from
S5, for example. Plus this f*cking fan in my box that doesn't work after the

Agreed.
Thanks,
Rafael
--
And by "right low power states" you mean "wrong low-power states", right?
The thing is, they really *are* the wrong states for 99% of all hardware.
If you really have a piece of hardware that you want to have the
"->poweroff()" thing do the same as "->suspend()", then hey, just use the
same function (or better yet, use two different functions with a call to a
shared part).

Because IT IS NOT TRUE that ->suspend() puts the devices in the "right
power state". The power states are likely to be totally different for S3
and for poweroff, and they are going to differ in different ways depending
on the device type.

One example would be the one that started this version of the whole
discussion (shock horror! We're on subject!) ie when you do a system
shutdown, you generally do not even *want* to put individual devices into
low-power states at all, because the actual "power off the system" thing
will take care of it for you much better.

So to take just something as simple as VGA as an example: you really do
not want to suspend that device, because you want to see the poweroff
messages until the very end.

So that final device ->poweroff function really has absolutely *nothing*
in common with the device ->suspend[_late] functions, simply because
almost any sane driver would decide to do different things.

Of course, we can continue to do the insane thing and just continue to use
inappropriate and misleading function callback names, and then encoding
what the *real* action should be in the argument and/or in magic
system-wide state parameters.

So in that sense, it's certainly totally the same thing whether we call it
->shutdown or ->poweroff or ->eat_a_banana, since you could always just
look at the argument and other clues, and decide that *this* time, for
*this* kind of device, the "eat a banana" callback actually means that we
should power it off, but wouldn't it be a lot more logical to just make it
clear in the first place th...
--
In fact we have acpi_pci_choose_state() that tells the driver which power
state to put the device into in ->suspend(). If that is used, the device ends

No. Again, if there are devices that wake us up from S4, but not from S5,
they need to be handled differently in the *enter S4* case (hibernation) and

Yes, it would. Still, the common thing is, it (ie. ->poweroff) _may_ want to

To clarify, I agree that we should use different callbacks for hibernation.

Well, I agree with that.

As I said before, that's mainly because I've been busy with other stuff
recently. Now, with Alex's help, I'm hoping to take care of it soon.

Thanks,
Rafael
--
..
Something I've never understood, is why we would ever want to bother with *S4* at all?
I actually like hibernation (great for travelling), but I treat it as if
it were a complete power-off (S5?). I pull batteries, unplug drives,
boot other operating systems, etc.

And when I put it all back together again with the Linux disk inserted,
I fully expect it to "resume" from the hibernation of 3 months ago.
And it does.

Why would I ever want anything less than a full poweroff for hibernation?
Thanks.
--
First off, nobody should *ever* use that directly anyway.
Secondly, the one that people should use ("pci_choose_state()") doesn't
actually do what you claim it does. It does all kinds of wrong things, and

And again, what does this have to do with (the example I used) the
graphics hardware? Answer: nothing. The example I gave you we simply DO
THE WRONG THING FOR.

Same thing for things like USB devices - where pci_choose_state() doesn't
work to begin with. Why do we call "suspend()" on such a thing when we
don't want to suspend it? We shouldn't. We should call "freeze/unfreeze"
(which are no-ops) and then finally perhaps "poweroff", and that final
stage might want to spin things down or similar.

But *none* of it has anything to do with suspend, and none of it has
anything to do with pci_choose_state() (much less acpi_pci_choose_state)

The fact is, we should let the driver decide, and we should make it clear
to the driver writer what he is deciding about - rather than basically lie
and say "suspend the device and put it into D3" even when that's the last
thing it should ever do.

Linus
--
Well, if platform_pci_choose_state() is defined, pci_choose_state() returns
its result and on ACPI systems that points to acpi_pci_choose_state(), so in

I'm already convinced, really. :-)
Thanks,
Rafael
--
Did you check closer?
I repeat: acpi_pci_choose_state() (when called from pci_choose_state())
doesn't even look at the target 'state'. It just blindly assumes that you
want the deepest sleep-state you can have.

Which happens to be correct for normal suspend, but means that if you want
to test other states (through '/sys/devices/.../power'), that sounds
broken.

I didn't check any closer, but go check it yourself. The short and sweet:
acpi_pci_choose_state() totally ignores its 'state' argument. Do you
really think that's correct? But yes, "pci_choose_state()" effectively
does that too, apart from PM_EVENT_ON, which is never used.

(But the whole and only point of pci_choose_state() was to do the
PM_EVENT_FREEZE thing differently, which it doesn't do, so I think the
real issue here is that the interface is really rather mis-designed)

I suspect most people who ever really looked and worked on this code had a
specific device in mind, and I'm sure that all of the code individually
always ends up making sense from the standpoint of some specific device
driver. It's just that it never seems to make sense from a bigger issues
standpoint, and often seems senseless from the standpoint of other devices
of other types.

Linus
--
acpi_pm_device_sleep_state() (that is called by acpi_pci_choose_state())
takes the target state directly from the ACPI layer.

We just want to get rid of the argument passed to ->suspend() eventually, but
there may be many _suspend_ states available (eg. "mem" and "standby") and
for each of them there may be different constraints on the device's state. We
have to tell the driver which device states are possible in the target system
sleep state. Right now we arbitrarily choose the one with the lowest power

This interface is not available any more (ie. there's only "wakeup" in
You're wrong, sorry. With PM_EVENT_FREEZE it wouldn't even be necessary.
It's there, because potentially there are many possibilities with
PM_EVENT_SUSPEND and in fact it shouldn't even be used with
PM_EVENT_FREEZE.

All of this is more or less orthogonal to the issue at hand, which boils down
to the fact that we use the _suspend_ callbacks for hibernation and we
shouldn't be doing that.

Thanks,
Rafael
--
Absolutely.
Two big reasons:
- debuggability
I know we don't do this correctly right now, but I want to be able to
at least feel like we can some day actually do printk's etc through 99%
of the suspend/resume cycle. It's a *huge* thing for debugging problems
that happen in the wild, and one of the biggest issues is that we
currently usually just get a "the machine died" message when suspend or
resume doesn't work.

Yes, doing printk's to the Intel management flash stuff can help a lot
here, and I want that too, but I'd really like to shut down consoles
individually rather than having the "big hammer" approach that shuts
them up entirely over the whole suspend/resume sequence (or not at all,
if you use "no_console_suspend").

And I'd *really* like to do things like VGA-console shutdown in the
late phase (and resume early).

- it's actually likely *much* simpler for some devices.
Simple devices (and that includes things like PCI bridges etc, but
also potentially USB host controllers etc) are things that can often be
trivially suspended - all the complexity is really not in the
controller itself, but beyond, in the bus that it actually drives.

And the late-suspend/early-resume means that you don't have to worry
about things like interrupts happening while you're suspended. Yes,
putting the device into D3 will disable interrupts from that device too
(unless there are bugs), *BUT* you may be sharing an interrupt line,
and interrupts may be posted and delayed, so an earlier interrupt may
well be pending etc.

Suspending late and resuming early just avoids those issues entirely.
Sometimes these things interact. For example, firewire is certainly not
trivial to suspend as a "subsystem" thing (ie all the devices behind the
firewire bridge need to do magic things, like spinning down etc that
obviously can not happen in the final "late" phase), but the firewire
controller it...
I've been watching for kexec hibernate for a little while now, and the
last I saw was that acpi was incompatible with the kexec hibernate (but
the suspend folks were still claiming that devices needed to be put in the
'right mode' not just powered off. I've been waiting to see this resolved.

David Lang
--
david@lang.hm wrote:
..

Yeah, exactly. What's so special about poweroff on hibernation?
Why even bother with the special "S4" state there?
I want a real full poweroff, or at least I think I do. Why wouldn't I?????
--
(1) To be able to wake up with the help of devices that can't wake
the system up from S5 (power off)

You may want that, some people may not want it.
We are supposed to handle S4, the BIOS/platform may expect us to do that, so
IMO this is a good enough reason to do it. Especially that we can.

Thanks,
Rafael
--
..
That's the theory. I've read about it, but have yet to imagine
any real-life situation where it applies.

But this isn't my speciality, so.. do you have experience with any real examples?
--
Yup. The fan in my notebook behaves incorrectly after a resume from
hibernation if S5 is entered instead of S4 during it.

I don't know why exactly it happens, but that's how it goes.
Also, some machines are reported to behave incorrectly after a "shutdown"
mode hibernation, while the same machines work just fine after a "platform"
mode hibernation. So at least for these machines it seems to matter.

Thanks,
Rafael
--
so if you power off your laptop the fan doesn't work when you turn it back
given that we don't have a pure "shutdown" option available to try I don't
see how this can be said to have been tested.

currently any attempts to do a shutdown type hibernate are tangled in the
other code that is there for the suspend modes. this makes it _very_ hard
to say that the hardware requires something as opposed to the strong
possibility that the software is doing something wrong.

there are also a _lot_ of people who are not able to reliably use the
existing "platform" mode hibernation, so it's not a fair statement to say
that it's the 'right' thing to do. If you want to make it an option, fine.
But please give those of us who don't care about these other wakeup
options, and who want to be able to use other OS's while linux is stopped
an option as well.

David Lang
--
There is such an option. Put
# echo shutdown > /sys/power/disk
into the init scripts and it will do the trick.
Thanks,
Rafael
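
For anyone who would rather do this from a small helper program than from a shell init script, a rough C equivalent of the echo above might look like the sketch below. The name write_sysfs_string() is made up for this illustration, and the path is passed in as a parameter so the function can be tried on an ordinary file first.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write a short string such as "shutdown" to a sysfs-style file, e.g.
 * /sys/power/disk. Returns 0 on success, -1 on failure.
 * Hypothetical helper, for illustration only.
 */
static int write_sysfs_string(const char *path, const char *value)
{
	FILE *f;
	int ok;

	assert(path != NULL && value != NULL);

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = (fputs(value, f) >= 0);

	/*
	 * stdio buffers the data, so the actual write happens no later than
	 * fclose(); a value rejected by the kernel shows up as a failure here.
	 */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

Run as root, write_sysfs_string("/sys/power/disk", "shutdown") would then correspond to the echo line above.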
--
Try suspend-and-resume without X.
Also, try it on one of the more modern laptops - even *with* X.
Basically, the kernel wants to be able to do what X does, because it means
that when it works, it works _so_ much better than doing it in X. So
getting it working is definitely worth it.

That said, before you do anything else, try if suspend-to-RAM works.
That's the primary goal for this code anyway, and if it works that gives a
good hint. Suspend-to-disk is fundamentally different, and it's entirely
possible that for the suspend-to-disk case we should just say "screw
trying to suspend/resume graphics", since you'll have the BIOS resuming
text-mode anyway, and there are no performance or debugging advantages.

Linus
--
Linus, guess I missed this part ... so before touching anything, I tried
suspend-to-ram, and it works on console and in X.

And suspend-to-disk hangs, but I can still press and hold the power
button to power it off. Then upon powering on and resume, I get the
ugly green "console" screen. I can still type and move around.
Starting X runs fine. Ctrl-Alt-Del or switching back to console will
get back to the green screen.

Thanks,
Jeff.
--
The "press and hold for five seconds" is actually a hardware feature of
the southbridge (well, I guess there is "software" in there too, but it's
the embedded kind). So the fact that it powers off at that point means
nothing, it just means that ok, your kernel is hung, but the hardware
still works ;)

This *sounds* like some part of the suspend-to-disk sequence is doing
something stupid like trying to access the screen after it has been turned
off, which doesn't surprise me at all. My oft-stated opinion has been that
suspend-to-disk isn't a suspend at all, and should never have been
confused with "suspending" anything.

It's "snapshot-and-restore", and my opinion is that:
- it should *never* call "suspend()"/"resume()" at all (that should be
reserved purely for suspend-to-RAM and has real power management
issues!)

- it should have a totally separate "halt/unhalt/restore" thing
that has nothing what-so-ever to do with power management, and is
purely about stopping the hardware for things like USB and network
cards (which otherwise do things like scan their command lists
asynchronously) and making sure that the driver state is consistent
with that stopped hw state.

- the people who confuse snapshot/restore with suspend/resume are
horrible people that cause problems exactly because driver people then
get those things mixed up, and something like the video suspend/resume
should probably never have impacted suspend-to-disk in the first place!

.. so this implies that while the laptop apparently hung at the end of the
snapshotting, the snapshotting did actually work, and it must have hung at
the very end, presumably when it tried to actually turn the power off.

So there seem to be two (probably largely independent) problems:
- the hang at shutdown that requires you to press-and-hold the power
button to actually cut power.

At a guess: putting the VGA device into D3hot makes the ACPI code that
actually ...
Hmm, entering S4 seems like good place to call suspend() for... unless
you want separate freeze()/unfreeze(), suspend()/resume(),
suspend_s4() and halt() callbacks.
Pavel
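
To make the "separate callbacks" alternative easier to picture, here is a rough stand-alone sketch of what such a split could look like. Nothing here is an existing kernel interface: struct sleep_ops, enum sleep_phase, run_phase() and count_call() are all hypothetical names invented for this illustration, and the set of phases is only one possible reading of the callbacks named in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-driver operations; none of these names are an existing API. */
struct sleep_ops {
	int (*freeze)(void *dev);    /* quiesce for the snapshot, no power change */
	int (*unfreeze)(void *dev);  /* undo freeze once the snapshot is taken */
	int (*suspend)(void *dev);   /* suspend-to-RAM, real power management */
	int (*resume)(void *dev);    /* wake up from suspend-to-RAM */
	int (*poweroff)(void *dev);  /* final step of hibernation */
};

enum sleep_phase {
	PHASE_FREEZE,
	PHASE_UNFREEZE,
	PHASE_SUSPEND,
	PHASE_RESUME,
	PHASE_POWEROFF,
};

/* Example callback: counts how often it ran, via an int passed as dev. */
static int count_call(void *dev)
{
	++*(int *)dev;
	return 0;
}

/* Run the callback for one phase; a missing callback is treated as a no-op. */
static int run_phase(const struct sleep_ops *ops, enum sleep_phase phase, void *dev)
{
	int (*cb)(void *) = NULL;

	assert(ops != NULL);

	switch (phase) {
	case PHASE_FREEZE:   cb = ops->freeze;   break;
	case PHASE_UNFREEZE: cb = ops->unfreeze; break;
	case PHASE_SUSPEND:  cb = ops->suspend;  break;
	case PHASE_RESUME:   cb = ops->resume;   break;
	case PHASE_POWEROFF: cb = ops->poweroff; break;
	}

	return cb ? cb(dev) : 0;
}
```

The point of the split is that a driver which has nothing to do for the snapshot can simply leave freeze and unfreeze unset, instead of decoding an argument inside one overloaded suspend callback.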
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--
Totally agreed. I remember when I started getting hibernation bug reports
against this new code and boggling at how hibernate was actually done. The
driver actually gets its ->suspend routine called twice with two different
pm_message_t values. We tried to do different stuff depending on the
pm_message_t (like only putting the device in D3hot if PM_EVENT_SUSPEND), but

Sounds like a good theory... now if we could just use set_power_state in the
suspend case only. That's what the latest code *tries* to do...

Jesse
--
Here's an interesting discovery. After I found that "echo reboot >
/sys/power/disk" does reboot, I tried "echo shutdown >
/sys/power/disk", it does shutdown properly.

With "platform" it refuses to shutdown. Both reboot and shutdown still
Yes.
Thanks,
Jeff.
That kind of suggests that the ACPI platform code is hitting the
hardware directly - we've seen similar issues with PATA controllers. The
right thing to do here is almost certainly just to avoid explicitly
powering down hardware on hibernation.

--
Matthew Garrett | mjg59@srcf.ucam.org
--
Ok, what's next?
Thanks,
Jeff.
--
Ahh. You're using the BIOS to re-initialize your video, aren't you?
If STR works without X, then you have something else resuming graphics,
and that may be what then interacts badly with the fact that the kernel

Let's try to narrow it down to what the interaction is. Are you using
something like acpi_sleep=s3_bios or similar? That's what the kernel
support is supposed to make unnecessary in the long run, along with all
the video mode flickering (ie we should be able to resume to the video
mode we want, not flicker through unnecessary modes).

Linus
--
Ok, understand now.
Jeff.
--
Well, as far as I know, s2ram could be doing vbe save/restore for you.
My laptop (a Toshiba Satellite U305, Intel 945GM chipset) used to need
s2ram -f -p -m to STR correctly. In 2.6.25-rc2 I can simply STR with
echo mem > /sys/power/state.

Romano
I imagine this will be received as blasphemy, but if only ndiswrapper
were not horribly broken, this would be my day-to-day kernel. I just hope
ath5k will arrive to my chipset soon...

--
This communication contains confidential information. It is for the exclusive use of the intended addressee. If you are not the intended addressee, please note that any form of distribution, copying or use of this communication or the information in it is strictly prohibited by law. If you have received this communication in error, please immediately notify the sender by reply e-mail and destroy this message. Thank you for your cooperation.
--
Really? This contains confidential information? I'd better notify you and
destroy this message now... :)

Jesse
--
STR is always working on my X60s. No green screen, no hang. Both s2ram
and "echo mem > /sys/power/state". It's STD that's having problems.

But the strange thing is I could even restore the console after STD even
without the agp and i915 modules loaded, so I don't know how the console
vga got saved and restored.

Thanks,
Jeff.
--
Your system (either your distro suspend/resume scripts or your platform) must
be running the video BIOS at resume time, otherwise it would probably come
back blank.

Jesse
--
But I don't think so, unless acpid is doing just that. In my
suspend/resume event script, it's just doing a simple s2ram (without
options), and exit after resume.

Jeff.
--
The s2ram command has a built-in whitelist used to set up video
rePOSTing. If you want to test, reboot and do

echo mem >/sys/power/state

without i915 being loaded. If you get a console back on resume then the
platform is reinitialising video for you, but otherwise it's your
userspace.

--
Matthew Garrett | mjg59@srcf.ucam.org
--
btw., why isnt there an in-kernel whitelist, with perhaps a dynamic,
convenient /debug/s2r/whitelist append-API for distros (and testers) to
add more entries to the whitelist/blacklist? (for cases where the kernel
whitelist has not caught up yet) Which would eventually converge to
Utopia: s2ram that just works out of box.

This would be a lot more flexible (people could even temporarily extend
the whitelist via rc.local if need to be, etc.), a lot more robust and a
lot more user friendly than the "dont use /sys/power/state, rely on some
user-space tool to work around bugs" approach.

Really, i couldnt make the s2ram API/quirks situation much worse even if
i deliberately tried to design the whole code to be as hard to use and
as confusing as possible :-/ These types of half-kernel half-userspace
solutions usually result in constant finger-pointing and constant lag,
and they result in about the crappiest user experience that is possible
to achieve physically.

( Sorry about the strong words, while there's lots of good and positive
development lately i havent seen much change in this particular area
of s2ram in the past 1-2 years, and the whole chain is only as strong
as the weakest link - so someone finally has to deliver this message
to the cozy fire of s2r hackers while our testers and users are
standing out in the cold rain ... )

Ingo
--
Because all of these video quirks are just workarounds for the fact that
the kernel doesn't work properly. In general, you really don't want to
call a real-mode video bios from the kernel, so punting it to userspace
(and leaving the whitelisting there) is somewhat more straightforward.
In addition, we can then extend the whitelist without requiring kernel

We've got i915 suspend/resume now, which already fixes this for a large
number of users. Recent ATI is easy, now that we actually have specs for
ATOM. The nouveau guys are almost at the point where we can do it for
nvidia. That basically just leaves VIA.

The other s2r issues are pretty much just driver bugs at this point.
--
Matthew Garrett | mjg59@srcf.ucam.org
--
ok - sounds good :-)
Ingo
--
The big problem with that is
- the people who know about the devices are usually not kernel people
- the workarounds that the whitelist requires are quite often not a kernel
workaround.

In other words, the most common workarounds for the s2ram whitelist are
usually to do things like running vbetool in user-level to do VGA register
save/restore (VBE_POST and VBE_SAVE). Sure, the kernel could do that with
usermodehelper etc, but s2ram also has those things as command line flags
etc, so...

Linus
--
The problem with the whitelists is that they have to use quite a lot of data to
reliably match the system. The s2ram whitelist is not 100% reliable, because
it uses too little information to distinguish different versions of the same
machine model, for example.

Plus, in an ideal world, we should be able to match all possible working
graphics/chipset/BIOS combinations and that would be quite a bit of a database.
Also, there are some quirks that need to be run from the user land, AFAICS
(eg. in an i86 real-mode compatible manner).

IMO, whitelisting is not a solution. It's only a sort-of-working workaround
and as such it shouldn't be put into the kernel.

Thanks,
Rafael
--
Your s2ram script is doing your STD also? Seems counterintuitive. Anyway,
some machines also re-POST the GPU on resume from S3; maybe yours is doing
that.

Jesse
--
It's s2ram to do STR, not STD. Sorry for the confusion. But the key
point is there's no GREEN for STR. Mr Green only appears with STD.

Thanks,
Jeff.
--
Ah, ok that makes sense.
So typically, what you'd see at suspend time is this ugly call chain:
1) user requests suspend or hibernate
2) kernel kicks users off VT
3) X calls LeaveVT function of X driver
4) X driver restores whatever video state it felt like saving when it started
up
5) kernel calls suspend methods
6) machine goes to sleep

then on resume:
1) user requests wakeup
2) kernel calls resume methods
3) X takes back the VT, calling driver EnterVT methods
4) X driver EnterVT routine runs, doing whatever it wants
5) you're back to where you were (on a good day anyway)

So, on your machine, I suspect your firmware is doing enough that X doesn't
have to save/restore full video state around Enter/Leave VT (the same
functions called at VT switch time when you press ctl-alt-fx), otherwise
you'd be missing things like your backlight or text consoles.

So the advantage of the kernel suspend/resume hooks for the DRM layer is that
the kernel video drivers can do full state save/restore (which X usually
doesn't do, and isn't really designed to do), so that if your platform
*doesn't* do it all, you'll still end up with a usable machine in the end.

The fact that you'd started running into problems since we merged this just
means your platform was taking care of it for you (lucky you) and that we
have some bugs in the hibernate code that we're just discovering.

Jesse
--
Well, I'm also hoping that eventually we could even just not do the VT
switch at all, and the kernel can treat X as "just another user process"
that it freezes.

At least from a mode setting standpoint.
We'd still want to make sure that X repaints the screen if the contents
were lost, of course. And this is going to depend very intimately on the
type of graphics card and whether the video RAM is saved by STR or not -
for the Intel integrated graphics kind of situation, the video RAM will be
refreshed along with all the other memory, but for other cards we may end
up having to do the VT switch not so much for modesetting reasons as just
a way to get X to save and restore all the *other* state.

How close is the i915 driver from not having to even signal X? Or is that
just a pipedream of mine?

Linus
--
Drivers supporting kernel modesetting will have to stuff their VRAM somewhere,
It's there in the modesetting tree (though the requisite changes to avoid VT
notification aren't done, it should all work fine).

Jesse
--
That's the main reason for moving to the X series. It seems to work
very well on Linux.

Thanks,
Jeff.
--
Jeff, can you please test hibernation with the patch I've just sent to Jesse
(reproduced below for convenience)?

Thanks,
Rafael

---
drivers/char/drm/i915_drv.c | 5 +++--
include/linux/suspend.h | 2 ++
kernel/power/disk.c | 10 +++++++++-
3 files changed, 14 insertions(+), 3 deletions(-)

Index: linux-2.6/drivers/char/drm/i915_drv.c
===================================================================
--- linux-2.6.orig/drivers/char/drm/i915_drv.c
+++ linux-2.6/drivers/char/drm/i915_drv.c
@@ -27,6 +27,7 @@
*
*/

+#include <linux/suspend.h>
#include "drmP.h"
#include "drm.h"
#include "i915_drm.h"
@@ -222,6 +223,7 @@ static void i915_restore_vga(struct drm_
dev_priv->saveGR[0x18]);

/* Attribute controller registers */
+ inb(st01); /* switch back to index mode */
for (i = 0; i < 20; i++)
i915_write_ar(st01, i, dev_priv->saveAR[i], 0);
inb(st01); /* switch back to index mode */
@@ -366,9 +368,8 @@ static int i915_suspend(struct drm_devic

i915_save_vga(dev);
- if (state.event == PM_EVENT_SUSPEND) {
+ if (state.event == PM_EVENT_SUSPEND && !in_hibernation_power_off()) {
/* Shut down the device */
- pci_disable_device(dev->pdev);
pci_set_power_state(dev->pdev, PCI_D3hot);
}

Index: linux-2.6/include/linux/suspend.h
===================================================================
--- linux-2.6.orig/include/linux/suspend.h
+++ linux-2.6/include/linux/suspend.h
@@ -209,6 +209,7 @@ extern unsigned long get_safe_page(gfp_t

extern void hibernation_set_ops(struct platform_hibernation_ops *ops);
extern int hibernate(void);
+extern bool in_hibernation_power_off(void);
#else /* CONFIG_HIBERNATION */
static inline int swsusp_page_is_forbidden(struct page *p) { return 0; }
static inline void swsusp_set_page_free(struct page *p) {}
@@ -216,6 +217,7 @@ static inline void swsusp_unset_page_fre

static inline void hibernation_set_ops(struct platform_hibernation_ops *ops) {}
...
I don't understand why hibernation just doesn't use a PM_EVENT_HIBERNATE,
and be done with it?

Why should it be called PM_EVENT_SUSPEND when it isn't?
Adding some external global variables is absolutely the wrong way to fix
this.

It's not even like there are very many drivers who actually care about
"state.event" anyway: a 'git grep' returns just 35 users in the whole
tree, so if this was done this ugly way just to avoid double-checking the
other cases that compare against PM_EVENT_SUSPEND, then it really wasn't
worth it.

Linus
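
As a rough illustration of the argument above, a driver's ->suspend() could tell the cases apart from its argument alone if a distinct hibernation event existed, with no global state involved. This is only a stand-alone sketch under stated assumptions: the numeric event values, struct fake_gpu and fake_gpu_suspend() are hypothetical stand-ins made up for this example, and it is not the i915 code.

```c
#include <assert.h>
#include <stddef.h>

/* Event values assumed for illustration only. */
enum { PM_EVENT_ON, PM_EVENT_FREEZE, PM_EVENT_SUSPEND, PM_EVENT_HIBERNATE };

typedef struct pm_message {
	int event;
} pm_message_t;

/* Hypothetical stand-in for a graphics device. */
struct fake_gpu {
	int regs_saved;  /* 1 once register state has been stashed */
	int in_d3hot;    /* 1 once the device has been put into D3hot */
};

static int fake_gpu_suspend(struct fake_gpu *gpu, pm_message_t state)
{
	assert(gpu != NULL);

	switch (state.event) {
	case PM_EVENT_SUSPEND:
		/* Suspend-to-RAM: save state and really power the device down. */
		gpu->regs_saved = 1;
		gpu->in_d3hot = 1;
		break;
	case PM_EVENT_FREEZE:
	case PM_EVENT_HIBERNATE:
		/* Snapshot or final hibernation step: save state, leave power alone. */
		gpu->regs_saved = 1;
		break;
	default:
		break;
	}

	return 0;
}
```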
--
Please relax, we're debugging the thing right now and the patch doesn't
even seem to help on the other affected box.

The issue appears to be more complicated than we initially thought.
Thanks,
Rafael
--
Actually, looks like I forgot to reboot between tests (just rmmod'd &
modprobed i915), your patch actually does work.

However, making new PM event messages might be a good thing anyway, assuming
Linus takes it for 2.6.25, since it should make the migration to ->hibernate
callbacks easier.

Jesse
--
Rafael, I'd actually prefer these changes to the i915 driver. One is to avoid
the "green screen" problem and the other is to actually save state at
hibernate time in case we don't do a POST coming out of S4 (probably not
common but hey).

Jesse
Make sure hibernation works by not shutting down the video device during
hibernation power off. This is important because later stages of the
hibernation cycle end up touching the video device, which may cause a hang if
it was disabled early on. Also make sure the restoration correctly restores
the AR registers by flipping the ARX register into index mode before doing
anything.

Depends on Rafael's patch which exports hibernation state to drivers.
Signed-off-by: Jesse Barnes <jesse.barnes@intel.com>
diff --git a/drivers/char/drm/i915_drv.c b/drivers/char/drm/i915_drv.c
index 35758a6..5e73869 100644
--- a/drivers/char/drm/i915_drv.c
+++ b/drivers/char/drm/i915_drv.c
@@ -27,6 +27,7 @@
*
*/

+#include <linux/suspend.h>
#include "drmP.h"
#include "drm.h"
#include "i915_drm.h"
@@ -222,6 +223,7 @@ static void i915_restore_vga(struct drm_device *dev)
dev_priv->saveGR[0x18]);

/* Attribute controller registers */
+ inb(st01);
for (i = 0; i < 20; i++)
i915_write_ar(st01, i, dev_priv->saveAR[i], 0);
inb(st01); /* switch back to index mode */
@@ -364,8 +366,8 @@ static int i915_suspend(struct drm_device *dev)
i915_save_vga(dev);

/* Shut down the device */
- pci_disable_device(dev->pdev);
- pci_set_power_state(dev->pdev, PCI_D3hot);
+ if (!in_hibernation_power_off())
+ pci_set_power_state(dev->pdev, PCI_D3hot);

return 0;
}

--
Below is a patch that introduces PM_EVENT_HIBERNATE as requested by Linus
and (hopefully) makes hibernation with i915 work correctly.

I must admit I don't feel very comfortable introducing PM_EVENT_HIBERNATE at
this point, since such changes tend to introduce unexpected issues all over the
place, but hopefully this time it won't break anything.

I have tested it on the nx6325.
Please review and tell me if it looks good.
Jesse and Jeff, please check if your boxes hibernate correctly with this patch
applied.

Thanks,
Rafael

---
Documentation/power/devices.txt | 13 ++++++++-----
drivers/ata/ahci.c | 3 ++-
drivers/ata/ata_piix.c | 3 ++-
drivers/ata/libata-core.c | 2 +-
drivers/char/drm/i915_drv.c | 4 +++-
drivers/ide/ppc/pmac.c | 6 ++++--
drivers/macintosh/mediabay.c | 4 +++-
drivers/pci/pci.c | 1 +
drivers/scsi/aic7xxx/aic79xx_osm_pci.c | 2 +-
drivers/scsi/aic7xxx/aic7xxx_osm_pci.c | 2 +-
drivers/scsi/mesh.c | 1 +
drivers/scsi/sd.c | 5 +++--
drivers/usb/host/sl811-hcd.c | 1 +
drivers/usb/host/u132-hcd.c | 10 +++++++---
drivers/video/chipsfb.c | 3 ++-
drivers/video/nvidia/nvidia.c | 3 ++-
include/linux/pm.h | 5 +++++
kernel/power/disk.c | 4 ++--
net/rfkill/rfkill.c | 3 ++-
19 files changed, 51 insertions(+), 24 deletions(-)

Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -391,7 +391,7 @@ int hibernation_platform_enter(void)
goto Close;

suspend_console();
- error = device_suspend(PMSG_SUSPEND);
+ error = device_suspend(PMSG_HIBERNATE);
if (error)
goto Resume_console;

@@ -404,7 +404,7 @@ int ...
I'm doing this part separately, please drop it - it has nothing to do with
the rest of the patch.

I'd also suggest that you just add a helper function like

	int pm_event_powerdown(struct pm_message mesg)
	{
		return mesg.event >= PM_EVENT_SUSPEND;
	}

or something, so that you can have code like

	if (pm_event_powerdown(mesg))
		pci_set_power_state(pdev, PCI_D3hot);

instead of the test for EVENT_SUSPEND/HIBERNATE explicitly.
Of course, the places that already do a switch-statement are much better
Didn't you miss the acpi_pci_choose_state() thing that also needs this
Looks like a missing close-brace to me there - you removed the final '}'.
Or am I blind?
Apart from those issues it looks fine to me.
Linus
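
For readers who want to try the helper idea outside the kernel, a minimal stand-alone sketch follows. It assumes the event values are ordered so that the suspend and hibernate events compare greater than the freeze event, which is what the >= test relies on. The numeric values, struct fake_dev and fake_suspend() are hypothetical and exist only for this illustration; they are not what was merged.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed ordering, for illustration only: ON < FREEZE < SUSPEND < HIBERNATE. */
enum {
	PM_EVENT_ON = 0,
	PM_EVENT_FREEZE = 1,
	PM_EVENT_SUSPEND = 2,
	PM_EVENT_HIBERNATE = 3,
};

struct pm_message {
	int event;
};

/* The helper suggested above: true for suspend and anything ordered after it. */
static int pm_event_powerdown(struct pm_message mesg)
{
	return mesg.event >= PM_EVENT_SUSPEND;
}

/* Hypothetical stand-in for a PCI device; power_state 0 means D0, 3 means D3hot. */
struct fake_dev {
	int state_saved;
	int power_state;
};

/* Hypothetical driver callback: always save state, power down only when asked. */
static int fake_suspend(struct fake_dev *dev, struct pm_message mesg)
{
	assert(dev != NULL);

	dev->state_saved = 1;
	if (pm_event_powerdown(mesg))
		dev->power_state = 3;

	return 0;
}
```

The comparison only works while the event constants stay ordered this way, which is one reason a driver that already has a switch statement may be better off naming the events explicitly.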
--
In the revised patch below I redefined the PM_EVENT_* things as flags so
that I can "or" them and defined PM_EVENT_SLEEP in analogy with

OK, please have a look at the modified patch below.
Thanks,
Rafael

---
Documentation/power/devices.txt | 13 ++++++++-----
drivers/ata/ahci.c | 2 +-
drivers/ata/ata_piix.c | 2 +-
drivers/ata/libata-core.c | 2 +-
drivers/ide/ppc/pmac.c | 4 ++--
drivers/macintosh/mediabay.c | 3 ++-
drivers/pci/pci.c | 1 +
drivers/scsi/aic7xxx/aic79xx_osm_pci.c | 2 +-
drivers/scsi/aic7xxx/aic7xxx_osm_pci.c | 2 +-
drivers/scsi/mesh.c | 1 +
drivers/scsi/sd.c | 3 +--
drivers/usb/host/sl811-hcd.c | 1 +
drivers/usb/host/u132-hcd.c | 11 ++++++++---
drivers/video/chipsfb.c | 2 +-
drivers/video/nvidia/nvidia.c | 2 +-
include/linux/pm.h | 9 ++++++++-
kernel/power/disk.c | 4 ++--
net/rfkill/rfkill.c | 2 +-
18 files changed, 42 insertions(+), 24 deletions(-)

Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -391,7 +391,7 @@ int hibernation_platform_enter(void)
goto Close;

suspend_console();
- error = device_suspend(PMSG_SUSPEND);
+ error = device_suspend(PMSG_HIBERNATE);
if (error)
goto Resume_console;

@@ -404,7 +404,7 @@ int hibernation_platform_enter(void)
goto Finish;

local_irq_disable();
- error = device_power_down(PMSG_SUSPEND);
+ error = device_power_down(PMSG_HIBERNATE);
if (!error) {
hibernation_ops->enter();
/* We should never get here */
Index: linux-2.6/include/linux/pm.h
===================================================================
--- linux-2.6.orig/include...
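
The idea of treating the PM_EVENT_* constants as flags that can be or-ed together can be illustrated with a small stand-alone sketch. The bit values below are assumptions chosen for this example (the real definitions are in the include/linux/pm.h hunk, which is cut off above), and wants_low_power() is a hypothetical helper, not part of the patch.

```c
#include <assert.h>

/* Assumed bit values, for illustration only. */
#define PM_EVENT_ON        0x0
#define PM_EVENT_FREEZE    0x1
#define PM_EVENT_SUSPEND   0x2
#define PM_EVENT_HIBERNATE 0x4
/* One mask covering every event after which the system goes to sleep. */
#define PM_EVENT_SLEEP     (PM_EVENT_SUSPEND | PM_EVENT_HIBERNATE)

typedef struct pm_message {
	int event;
} pm_message_t;

/* Hypothetical helper: does this event end in a system sleep state? */
static int wants_low_power(pm_message_t mesg)
{
	/* Reject bits this sketch does not know about. */
	assert((mesg.event & ~(PM_EVENT_FREEZE | PM_EVENT_SLEEP)) == 0);

	return (mesg.event & PM_EVENT_SLEEP) != 0;
}
```

With flags, a driver that treats suspend and hibernation alike can test against the combined mask once, without relying on the numeric ordering of the constants.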
Seems to work here after basic tests. ACK.
(I discovered that -rc2 swsusp will not power down in some cases, but
it was here before the patch, too...)
Pavel

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--
All right, I'm fine with it. Now we just need to confirm that it works for
people..

Linus
--
On Sat, Feb 23, 2008 at 10:07 AM, Linus Torvalds
Looks good. Applied Rafael's patch on top of your latest git tree that
has Jesse's i915 fix.

No green screen. Tested with STD (platform), STR, and plain echo mem >
/sys/power/state.

Thanks,
Jeff.
--
Thanks for testing. Below is the final version of the patch with a changelog
etc.

Thanks,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>

During the last step of hibernation in the "platform" mode (with the
help of ACPI) we use the suspend code, including the devices'
->suspend() methods, to prepare the system for entering the ACPI S4
system sleep state. At least for some devices the operations
performed by the ->suspend() callback in that case must be different
from its operations during regular suspend. For this reason,
introduce the new PM event type PM_EVENT_HIBERNATE and pass it to
the device drivers' ->suspend() methods during the last phase of
hibernation, so that they can distinguish this case and handle it as
appropriate. Modify the drivers that handle PM_EVENT_SUSPEND in a
special way and need to handle PM_EVENT_HIBERNATE in the same way.

These changes are necessary to fix a hibernation regression related
to the i915 driver (ref. http://lkml.org/lkml/2008/2/22/488).

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Tested-by: Jeff Chua <jeff.chua.linux@gmail.com>
---
 Documentation/power/devices.txt        | 13 ++++++++-----
 drivers/ata/ahci.c                     |  2 +-
 drivers/ata/ata_piix.c                 |  2 +-
 drivers/ata/libata-core.c              |  2 +-
 drivers/ide/ppc/pmac.c                 |  4 ++--
 drivers/macintosh/mediabay.c           |  3 ++-
 drivers/pci/pci.c                      |  1 +
 drivers/scsi/aic7xxx/aic79xx_osm_pci.c |  2 +-
 drivers/scsi/aic7xxx/aic7xxx_osm_pci.c |  2 +-
 drivers/scsi/mesh.c                    |  1 +
 drivers/scsi/sd.c                      |  3 +--
 drivers/usb/host/sl811-hcd.c           |  1 +
 drivers/usb/host/u132-hcd.c            | 11 ++++++++---
 drivers/video/chipsfb.c                |  2 +-
 drivers/video/nvidia/nvidia.c          |  2 +-
 include/linux/pm.h                     |  9 ++++++++-
 k...
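
To illustrate the changelog above, here is a minimal stand-alone sketch of how a ->suspend() method might branch on the event type once PM_EVENT_HIBERNATE exists. It is not taken from the patch: the SK_-prefixed enum, struct pm_message_sketch, struct fake_dev and fake_suspend() are all hypothetical stand-ins so the example compiles outside the kernel. The fall-through from SUSPEND to HIBERNATE mirrors what the u132-hcd hunk in this thread does; a driver that needs different behaviour for the two cases would give each its own branch.

```c
/* Local stand-ins only: these mimic the kernel's PM_EVENT_* names and
 * pm_message_t, but are NOT the definitions from include/linux/pm.h. */
enum pm_event_sketch {
    SK_PM_EVENT_ON,
    SK_PM_EVENT_FREEZE,
    SK_PM_EVENT_SUSPEND,
    SK_PM_EVENT_HIBERNATE
};

struct pm_message_sketch {
    int event;
};

/* Hypothetical device: just two flags recording what suspend did. */
struct fake_dev {
    int state_saved;
    int powered_down;
};

int fake_suspend(struct fake_dev *dev, struct pm_message_sketch state)
{
    switch (state.event) {
    case SK_PM_EVENT_FREEZE:
        /* Quiesce and save state, but leave the device powered. */
        dev->state_saved = 1;
        break;
    case SK_PM_EVENT_SUSPEND:
    case SK_PM_EVENT_HIBERNATE:
        /* This sketch treats both sleep events alike:
         * save state and power the device down. */
        dev->state_saved = 1;
        dev->powered_down = 1;
        break;
    default:
        break;
    }
    return 0;
}
```

Calling fake_suspend() with SK_PM_EVENT_FREEZE sets only state_saved, while SK_PM_EVENT_SUSPEND and SK_PM_EVENT_HIBERNATE also set powered_down. The point is only that the callback can now tell the three situations apart.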
=========================
Hmm. Doesn't compile for me in 2.6.25-rc2-git7:

CC [M] drivers/usb/host/u132-hcd.o
drivers/usb/host/u132-hcd.c: In function ‘u132_suspend’:
drivers/usb/host/u132-hcd.c:3224: error: expected expression before ‘int’
drivers/usb/host/u132-hcd.c:3225: error: ‘ports’ undeclared (first use in this function)
drivers/usb/host/u132-hcd.c:3225: error: (Each undeclared identifier is reported only once
drivers/usb/host/u132-hcd.c:3225: error: for each function it appears in.)
make[3]: *** [drivers/usb/host/u132-hcd.o] Error 1
make[2]: *** [drivers/usb/host] Error 2
make[1]: *** [drivers/usb] Error 2
make: *** [drivers] Error 2

This fixes it:

Thanks
Mirco

---
From: Mirco Tischler <mt-ml@gmx.de>

Fixes the following compile error caused by commit
3a2d5b700132f35401f1d9e22fe3c2cab02c2549...
CC [M] drivers/usb/host/u132-hcd.o
drivers/usb/host/u132-hcd.c: In function ‘u132_suspend’:
drivers/usb/host/u132-hcd.c:3224: error: expected expression before ‘int’
drivers/usb/host/u132-hcd.c:3225: error: ‘ports’ undeclared (first use in this function)
...

Signed-off-by: Mirco Tischler <mt-ml@gmx.de>
---
drivers/usb/host/u132-hcd.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/usb/host/u132-hcd.c b/drivers/usb/host/u132-hcd.c
index 6fca069..58830b2 100644
--- a/drivers/usb/host/u132-hcd.c
+++ b/drivers/usb/host/u132-hcd.c
@@ -3214,6 +3214,7 @@ static int u132_suspend(struct platform_device *pdev, pm_message_t state)
return -ESHUTDOWN;
} else {
int retval = 0;
+ int ports = 0;

switch (state.event) {
case PM_EVENT_FREEZE:
@@ -3221,7 +3222,7 @@ static int u132_suspend(struct platform_device *pdev, pm_message_t state)
break;
case PM_EVENT_SUSPEND:
case PM_EVENT_HIBERNAT...
Hm, why not do:
---
drivers/usb/host/u132-hcd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6/drivers/usb/host/u132-hcd.c
===================================================================
--- linux-2.6.orig/drivers/usb/host/u132-hcd.c
+++ linux-2.6/drivers/usb/host/u132-hcd.c
@@ -3214,6 +3214,7 @@ static int u132_suspend(struct platform_
return -ESHUTDOWN;
} else {
int retval = 0;
+ int ports = MAX_U132_PORTS;

switch (state.event) {
case PM_EVENT_FREEZE:
@@ -3221,7 +3222,6 @@ static int u132_suspend(struct platform_
break;
case PM_EVENT_SUSPEND:
case PM_EVENT_HIBERNATE:
- int ports = MAX_U132_PORTS;
while (ports-- > 0) {
port_power(u132, ports, 0);
}
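
For reference, the "expected expression before ‘int’" error arises because, in the C dialects in use here, a case label must be followed by a statement, and a declaration is not a statement. The stand-alone sketch below shows two ways around that: hoisting the declaration above the switch (the approach taken in the diff above) or wrapping the case body in braces. MAX_PORTS_SKETCH and both function names are hypothetical; the counter stands in for the real port_power() calls.

```c
#define MAX_PORTS_SKETCH 4  /* hypothetical stand-in for MAX_U132_PORTS */

/* What the compiler rejected, in outline:
 *
 *     case 1:
 *         int ports = MAX_PORTS_SKETCH;   <- declaration right after a label
 */

/* Fix 1: declare the variable before the switch. */
int count_ports_hoisted(int event)
{
    int powered_off = 0;
    int ports = MAX_PORTS_SKETCH;

    switch (event) {
    case 1:
        while (ports-- > 0)
            powered_off++;  /* stands in for port_power(u132, ports, 0) */
        break;
    default:
        break;
    }
    return powered_off;
}

/* Fix 2: open a block after the label; the block is a statement,
 * so a declaration at its start is fine. */
int count_ports_braced(int event)
{
    int powered_off = 0;

    switch (event) {
    case 1: {
        int ports = MAX_PORTS_SKETCH;

        while (ports-- > 0)
            powered_off++;
        break;
    }
    default:
        break;
    }
    return powered_off;
}
```

Both variants behave identically; hoisting keeps the diff smaller, while the braced form keeps the variable's scope limited to the one case that uses it.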
--
Ah, I see it's merged already.
Thanks for the fix, btw! :-)
--
Thanks, applied.
With this, I also find that I dislike the use of suspend/resume for
freezing for STD a lot less. It's still too easy to get confused, but at
least now drivers always have total knowledge about what is really going
on. I'd not like this interface as a driver writer, but now it's not
fundamentally broken any more, just slightly confusing.

Linus
--
On Sun, Feb 24, 2008 at 2:43 AM, Linus Torvalds
Tested and working.
Thanks again,
Jeff.
--
Heh, thanks :-).
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--
Testing now.
Jeff.
--
Great news. It works. STD (platform) does not hang at suspend. And the
annoying green is gone! STR still works.

Thanks,
Jeff.
--
Great, thanks for testing.
If Jesse confirms that it also works for him, I'll prepare a cleaner final fix
tomorrow.

Thanks,
Rafael
--
Tried "idle=poll" but it has no effect.
Thanks,
Jeff.
--
