Re: [Suspend-devel] 2.6.25-rc2 System no longer powers off after suspend-to-disk. Screen becomes green.

From: Jeff Chua
Date: Wednesday, February 20, 2008 - 10:17 am

I don't know what exactly the i915_suspend() and i915_resume() are 
supposed to do because it works better without them.

After inserting "return 0;" right at the top of those two functions, 
suspend (and power-off properly), and resume (without green screen) works 
just fine.

I would like to know what they're for.

Tested suspend-to-ram, and suspend-to-disk, both console and X on notebook 
internal LCD display, all works without these two functions.

But, anyway, got down to just one line in i915_drv.c causing the hang 
during suspend. "pci_set_power_state(dev->pdev, PCI_D3hot);".

And green screen problem during resume is caused by i915_restore_vga(dev);

So, let me know where to go from here.


Thanks,
Jeff.




--- linux/drivers/char/drm/i915_drv.c.bad	2008-02-20 11:29:14 +0800
+++ linux/drivers/char/drm/i915_drv.c	2008-02-21 00:58:37 +0800
@@ -369,7 +369,7 @@
  	if (state.event == PM_EVENT_SUSPEND) {
  		/* Shut down the device */
  		pci_disable_device(dev->pdev);
-		pci_set_power_state(dev->pdev, PCI_D3hot);
+		//pci_set_power_state(dev->pdev, PCI_D3hot);
  	}

  	return 0;
@@ -521,7 +521,7 @@
  	for (i = 0; i < 3; i++)
  		I915_WRITE(SWF30 + (i << 2), dev_priv->saveSWF2[i]);

-	i915_restore_vga(dev);
+	//i915_restore_vga(dev);

  	return 0;
  }
--

From: Jeff Chua
Date: Wednesday, February 20, 2008 - 10:19 am

Tried "idle=poll" but it has no effect.

Thanks,
Jeff.
--

From: Linus Torvalds
Date: Wednesday, February 20, 2008 - 10:28 am

Try suspend-and-resume without X.

Also, try it on one of the more modern laptops - even *with* X.

Basically, the kernel wants to be able to do what X does, because it means 
that when it works, it works _so_ much better than doing it in X. So 
getting it working is definitely worth it.

That said, before you do anything else, try if suspend-to-RAM works. 

That's the primary goal for this code anyway, and if it works that gives a 
good hint. Suspend-to-disk is fundamentally different, and it's entirely 
possible that for the suspend-to-disk case we should just say "screw 
trying to suspend/resume graphics", since you'll have the BIOS resuming 
text-mode anyway, and there are no performance or debugging advantages.

		Linus
--

From: Jeff Chua
Date: Wednesday, February 20, 2008 - 10:37 am

Ok, what's next?

Thanks,
Jeff.
--

From: Linus Torvalds
Date: Wednesday, February 20, 2008 - 10:52 am

Ahh. You're using the BIOS to re-initialize your video, aren't you? 

If STR works without X, then you have something else resuming graphics, 
and that may be what then interacts badly with the fact that the kernel 

Let's try to narrow it down to what the interaction is. Are you using 
something like acpi_sleep=s3_bios or similar? That's what the kernel 
support is supposed to make unnecessary in the long run, along with all 
the video mode flickering (ie we should be able to resume to the video 
mode we want, not flicker through unnecessary modes).

			Linus
--

From: Jeff Chua
Date: Wednesday, February 20, 2008 - 11:02 am

Ok, understand now.

Jeff.
--

From: Romano Giannetti
Date: Thursday, February 21, 2008 - 12:43 pm

Well, as far as I know, s2ram could be doing vbe save/restore for you.

My laptop (a Toshiba Satellite U305, Intel 945GM chipset) used to need
s2ram -f -p -m to STR correctly. In 2.6.25-rc2 I can simply STR with
echo mem > /sys/power/state.

Romano

I imagine this will be received as blasphemy, but if only ndiswrapper
were not horribly broken, this would be my day-to-day kernel. I just hope
ath5k will support my chipset soon...



--
This communication contains confidential information. It is for the exclusive use of the intended addressee. If you are not the intended addressee, please note that any form of distribution, copying or use of this communication or the information in it is strictly prohibited by law. If you have received this communication in error, please immediately notify the sender by reply e-mail and destroy this message. Thank you for your cooperation.

--

From: Jesse Barnes
Date: Thursday, February 21, 2008 - 2:02 pm

Really?  This contains confidential information?  I'd better notify you and 
destroy this message now... :)

Jesse
--

From: Jeff Chua
Date: Thursday, February 21, 2008 - 5:20 pm

STR is always working on my X60s. No green screen, no hang. Both s2ram
and "echo mem > /sys/power/state". It's STD that's having problem.

But strange thing is I could even restore console after STD even
without agp and i915 module loaded, so I don't know how the console
vga got saved and restored.

Thanks,
Jeff.
--

From: Rafael J. Wysocki
Date: Thursday, February 21, 2008 - 5:31 pm

Jeff, can you please test hibernation with the patch I've just sent to Jesse
(reproduced below for convenience)?

Thanks,
Rafael

---
 drivers/char/drm/i915_drv.c |    5 +++--
 include/linux/suspend.h     |    2 ++
 kernel/power/disk.c         |   10 +++++++++-
 3 files changed, 14 insertions(+), 3 deletions(-)

Index: linux-2.6/drivers/char/drm/i915_drv.c
===================================================================
--- linux-2.6.orig/drivers/char/drm/i915_drv.c
+++ linux-2.6/drivers/char/drm/i915_drv.c
@@ -27,6 +27,7 @@
  *
  */
 
+#include <linux/suspend.h>
 #include "drmP.h"
 #include "drm.h"
 #include "i915_drm.h"
@@ -222,6 +223,7 @@ static void i915_restore_vga(struct drm_
 			   dev_priv->saveGR[0x18]);
 
 	/* Attribute controller registers */
+	inb(st01); /* switch back to index mode */
 	for (i = 0; i < 20; i++)
 		i915_write_ar(st01, i, dev_priv->saveAR[i], 0);
 	inb(st01); /* switch back to index mode */
@@ -366,9 +368,8 @@ static int i915_suspend(struct drm_devic
 
 	i915_save_vga(dev);
 
-	if (state.event == PM_EVENT_SUSPEND) {
+	if (state.event == PM_EVENT_SUSPEND && !in_hibernation_power_off()) {
 		/* Shut down the device */
-		pci_disable_device(dev->pdev);
 		pci_set_power_state(dev->pdev, PCI_D3hot);
 	}
 
Index: linux-2.6/include/linux/suspend.h
===================================================================
--- linux-2.6.orig/include/linux/suspend.h
+++ linux-2.6/include/linux/suspend.h
@@ -209,6 +209,7 @@ extern unsigned long get_safe_page(gfp_t
 
 extern void hibernation_set_ops(struct platform_hibernation_ops *ops);
 extern int hibernate(void);
+extern bool in_hibernation_power_off(void);
 #else /* CONFIG_HIBERNATION */
 static inline int swsusp_page_is_forbidden(struct page *p) { return 0; }
 static inline void swsusp_set_page_free(struct page *p) {}
@@ -216,6 +217,7 @@ static inline void swsusp_unset_page_fre
 
 static inline void hibernation_set_ops(struct platform_hibernation_ops *ops) {}
 static inline int ...
From: Jeff Chua
Date: Thursday, February 21, 2008 - 5:42 pm

Testing now.

Jeff.
--

From: Jeff Chua
Date: Thursday, February 21, 2008 - 6:01 pm

Great news. It works. STD (platform) does not hang at suspend. And the
annoying green is gone! STR still works.


Thanks,
Jeff.
--

From: Rafael J. Wysocki
Date: Thursday, February 21, 2008 - 6:06 pm

Great, thanks for testing.

If Jesse confirms that it also works for him, I'll prepare a cleaner final fix
tomorrow.

Thanks,
Rafael
--

From: Linus Torvalds
Date: Thursday, February 21, 2008 - 5:46 pm

I don't understand why hibernation just doesn't use a PM_EVENT_HIBERNATE, 
and be done with it?

Why should it be called PM_EVENT_SUSPEND when it isn't?

Adding some external global variables is absolutely the wrong way to fix 
this.

It's not even like there are very many drivers who actually care about 
"state.event" anyway: a 'git grep' returns just 35 users in the whole 
tree, so if this was done this ugly way just to avoid double-checking the 
other cases that compare against PM_EVENT_SUSPEND, then it really wasn't 
worth it.

		Linus
--

From: Rafael J. Wysocki
Date: Thursday, February 21, 2008 - 5:54 pm

Please relax, we're debugging the thing right now and the patch doesn't
even seem to help on the other affected box.

The issue appears to be more complicated than we initially thought.

Thanks,
Rafael
--

From: Jesse Barnes
Date: Thursday, February 21, 2008 - 6:13 pm

Actually, looks like I forgot to reboot between tests (just rmmod'd & 
modprobed i915), your patch actually does work.

However, making new PM event messages might be a good thing anyway, assuming 
Linus takes it for 2.6.25, since it should make the migration to ->hibernate 
callbacks easier.

Jesse
--

From: Jesse Barnes
Date: Thursday, February 21, 2008 - 6:44 pm

Rafael, I'd actually prefer these changes to the i915 driver.  One is to avoid 
the "green screen" problem and the other is to actually save state at 
hibernate time in case we don't do a POST coming out of S4 (probably not 
common but hey).

Jesse

Make sure hibernation works by not shutting down the video device during 
hibernation power off.  This is important because later stages of the 
hibernation cycle end up touching the video device, which may cause a hang if 
it was disabled early on.  Also make sure the restoration correctly restores 
the AR registers by flipping the ARX register into index mode before doing 
anything.

Depends on Rafael's patch which exports hibernation state to drivers.

Signed-off-by:  Jesse Barnes <jesse.barnes@intel.com>

diff --git a/drivers/char/drm/i915_drv.c b/drivers/char/drm/i915_drv.c
index 35758a6..5e73869 100644
--- a/drivers/char/drm/i915_drv.c
+++ b/drivers/char/drm/i915_drv.c
@@ -27,6 +27,7 @@
  *
  */
 
+#include <linux/suspend.h>
 #include "drmP.h"
 #include "drm.h"
 #include "i915_drm.h"
@@ -222,6 +223,7 @@ static void i915_restore_vga(struct drm_device *dev)
 			   dev_priv->saveGR[0x18]);
 
 	/* Attribute controller registers */
+	inb(st01);
 	for (i = 0; i < 20; i++)
 		i915_write_ar(st01, i, dev_priv->saveAR[i], 0);
 	inb(st01); /* switch back to index mode */
@@ -364,8 +366,8 @@ static int i915_suspend(struct drm_device *dev)
 	i915_save_vga(dev);
 
 	/* Shut down the device */
-	pci_disable_device(dev->pdev);
-	pci_set_power_state(dev->pdev, PCI_D3hot);
+	if (!in_hibernation_power_off())
+		pci_set_power_state(dev->pdev, PCI_D3hot);
 
 	return 0;
 }
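
A note on the extra inb(st01) added to i915_restore_vga() above: on VGA
hardware the attribute controller uses a single port for both index and data
and toggles between the two on every write, while a read of the input status
register puts it back into index mode. If the toggle is in the wrong state
when the restore loop starts, indexes and data get swapped. The following is a
small user-space simulation of that general behaviour, not the i915 code; the
names fake_vga, status_read, ar_port_write and restore_ar are made up for this
sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simulated attribute controller: one port, alternating index/data writes. */
struct fake_vga {
    bool expect_data;   /* flip-flop: false means the next write is an index */
    uint8_t index;      /* last latched index */
    uint8_t ar[32];     /* attribute registers (5-bit index space) */
};

/* Reading the input status register resets the flip-flop to index mode. */
static void status_read(struct fake_vga *vga)
{
    vga->expect_data = false;
}

/* A port write is taken as an index or as data, depending on the flip-flop. */
static void ar_port_write(struct fake_vga *vga, uint8_t val)
{
    if (vga->expect_data)
        vga->ar[vga->index] = val;
    else
        vga->index = val & 0x1f;
    vga->expect_data = !vga->expect_data;
}

/* Restore n saved registers, optionally resetting the flip-flop first. */
static void restore_ar(struct fake_vga *vga, const uint8_t *saved, int n,
                       bool reset_first)
{
    assert(n >= 0 && n <= 32);
    if (reset_first)
        status_read(vga);               /* the step the patch adds */
    for (int i = 0; i < n; i++) {
        ar_port_write(vga, (uint8_t)i); /* meant as index */
        ar_port_write(vga, saved[i]);   /* meant as data */
    }
    status_read(vga);                   /* leave the port in index mode */
}
```

Starting from a controller whose flip-flop was left in data mode, calling
restore_ar() with reset_first set to false writes the loop counter into the
registers and latches the saved values as indexes; with reset_first set to
true the saved values land where they belong.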

--


Below is a patch that introduces PM_EVENT_HIBERNATE as requested by Linus
and (hopefully) makes hibernation with i915 work correctly.

I must admit I don't feel very comfortable introducing PM_EVENT_HIBERNATE at
this point, since such changes tend to introduce unexpected issues all over the
place, but hopefully this time it won't break anything.

I have tested it on the nx6325.

Please review and tell me if it looks good.

Jesse and Jeff, please check if your boxes hibernate correctly with this patch
applied.

Thanks,
Rafael

---
 Documentation/power/devices.txt        |   13 ++++++++-----
 drivers/ata/ahci.c                     |    3 ++-
 drivers/ata/ata_piix.c                 |    3 ++-
 drivers/ata/libata-core.c              |    2 +-
 drivers/char/drm/i915_drv.c            |    4 +++-
 drivers/ide/ppc/pmac.c                 |    6 ++++--
 drivers/macintosh/mediabay.c           |    4 +++-
 drivers/pci/pci.c                      |    1 +
 drivers/scsi/aic7xxx/aic79xx_osm_pci.c |    2 +-
 drivers/scsi/aic7xxx/aic7xxx_osm_pci.c |    2 +-
 drivers/scsi/mesh.c                    |    1 +
 drivers/scsi/sd.c                      |    5 +++--
 drivers/usb/host/sl811-hcd.c           |    1 +
 drivers/usb/host/u132-hcd.c            |   10 +++++++---
 drivers/video/chipsfb.c                |    3 ++-
 drivers/video/nvidia/nvidia.c          |    3 ++-
 include/linux/pm.h                     |    5 +++++
 kernel/power/disk.c                    |    4 ++--
 net/rfkill/rfkill.c                    |    3 ++-
 19 files changed, 51 insertions(+), 24 deletions(-)

Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -391,7 +391,7 @@ int hibernation_platform_enter(void)
 		goto Close;
 
 	suspend_console();
-	error = device_suspend(PMSG_SUSPEND);
+	error = device_suspend(PMSG_HIBERNATE);
 	if (error)
 		goto Resume_console;
 
@@ -404,7 +404,7 @@ ...

I'm doing this part separately, please drop it - it has nothing to do with 
the rest of the patch.

I'd also suggest that you just add a helper function like

	int pm_event_powerdown(struct pm_message mesg)
	{
		return mesg.event >= PM_EVENT_SUSPEND;
	}

or something, so that you can have code like

	if (pm_event_powerdown(mesg))
		pci_set_power_state(pdev, PCI_D3hot);

instead of the test for EVENT_SUSPEND/HIBERNATE explicitly.

Of course, the places that already do a switch-statement are much better 

Didn't you miss the acpi_pci_choose_state() thing that also needs this 

Looks like a missing close-brace to me there - you removed the final '}'.

Or am I blind?

Apart from those issues it looks fine to me.

			Linus
--


In the revised patch below I redefined the PM_EVENT_* things as flags so
that I can "or" them and defined PM_EVENT_SLEEP in analogy with



OK, please have a look at the modified patch below.

Thanks,
Rafael
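
As a rough illustration of the flags idea mentioned in this message: if every
event is a distinct bit, "suspend or hibernate" becomes a single mask test.
The numeric values and all DEMO_/demo_ names below are assumptions made for
this sketch only; the real definitions are in the include/linux/pm.h hunk of
the patch, which is not reproduced in full here.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only: each event gets its own bit so they can be OR-ed. */
enum {
    DEMO_PM_EVENT_ON        = 0,
    DEMO_PM_EVENT_FREEZE    = 1 << 0,
    DEMO_PM_EVENT_SUSPEND   = 1 << 1,
    DEMO_PM_EVENT_HIBERNATE = 1 << 2,
    DEMO_PM_EVENT_SLEEP     = DEMO_PM_EVENT_SUSPEND | DEMO_PM_EVENT_HIBERNATE,
};

/* The mask test only works if the two events do not share a bit. */
static_assert((DEMO_PM_EVENT_SUSPEND & DEMO_PM_EVENT_HIBERNATE) == 0,
              "events must be distinct bits");

struct demo_pm_message { int event; };

/* True for either sleep transition, false for ON and FREEZE. */
static bool demo_event_is_sleep(struct demo_pm_message mesg)
{
    return (mesg.event & DEMO_PM_EVENT_SLEEP) != 0;
}
```

A driver that treats both transitions alike would then test
demo_event_is_sleep(mesg) instead of comparing against each event in turn.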

---
 Documentation/power/devices.txt        |   13 ++++++++-----
 drivers/ata/ahci.c                     |    2 +-
 drivers/ata/ata_piix.c                 |    2 +-
 drivers/ata/libata-core.c              |    2 +-
 drivers/ide/ppc/pmac.c                 |    4 ++--
 drivers/macintosh/mediabay.c           |    3 ++-
 drivers/pci/pci.c                      |    1 +
 drivers/scsi/aic7xxx/aic79xx_osm_pci.c |    2 +-
 drivers/scsi/aic7xxx/aic7xxx_osm_pci.c |    2 +-
 drivers/scsi/mesh.c                    |    1 +
 drivers/scsi/sd.c                      |    3 +--
 drivers/usb/host/sl811-hcd.c           |    1 +
 drivers/usb/host/u132-hcd.c            |   11 ++++++++---
 drivers/video/chipsfb.c                |    2 +-
 drivers/video/nvidia/nvidia.c          |    2 +-
 include/linux/pm.h                     |    9 ++++++++-
 kernel/power/disk.c                    |    4 ++--
 net/rfkill/rfkill.c                    |    2 +-
 18 files changed, 42 insertions(+), 24 deletions(-)

Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -391,7 +391,7 @@ int hibernation_platform_enter(void)
 		goto Close;
 
 	suspend_console();
-	error = device_suspend(PMSG_SUSPEND);
+	error = device_suspend(PMSG_HIBERNATE);
 	if (error)
 		goto Resume_console;
 
@@ -404,7 +404,7 @@ int hibernation_platform_enter(void)
 		goto Finish;
 
 	local_irq_disable();
-	error = device_power_down(PMSG_SUSPEND);
+	error = device_power_down(PMSG_HIBERNATE);
 	if (!error) {
 		hibernation_ops->enter();
 		/* We should never get here */
Index: linux-2.6/include/linux/pm.h
===================================================================
--- ...

All right, I'm fine with it. Now we just need to confirm that it works for 
people..

		Linus
--


On Sat, Feb 23, 2008 at 10:07 AM, Linus Torvalds

Looks good. Applied Rafael's patch on top of your latest git tree that
has Jesse's i915 fix.

No green screen. Tested with STD (platform), STR, and plain echo mem >
/sys/power/state.

Thanks,
Jeff.
--

From: Rafael J. Wysocki
Date: Saturday, February 23, 2008 - 11:13 am

Thanks for testing.  Below is the final version of the patch with a changelog
etc.

Thanks,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>

During the last step of hibernation in the "platform" mode (with the
help of ACPI) we use the suspend code, including the devices'
->suspend() methods, to prepare the system for entering the ACPI S4
system sleep state.  At least for some devices the operations
performed by the ->suspend() callback in that case must be different
from its operations during regular suspend.  For this reason,
introduce the new PM event type PM_EVENT_HIBERNATE and pass it to
the device drivers' ->suspend() methods during the last phase of
hibernation, so that they can distinguish this case and handle it as
appropriate.  Modify the drivers that handle PM_EVENT_SUSPEND in a
special way and need to handle PM_EVENT_HIBERNATE in the same way.

These changes are necessary to fix a hibernation regression related
to the i915 driver (ref. http://lkml.org/lkml/2008/2/22/488).

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Tested-by: Jeff Chua <jeff.chua.linux@gmail.com>
---
 Documentation/power/devices.txt        |   13 ++++++++-----
 drivers/ata/ahci.c                     |    2 +-
 drivers/ata/ata_piix.c                 |    2 +-
 drivers/ata/libata-core.c              |    2 +-
 drivers/ide/ppc/pmac.c                 |    4 ++--
 drivers/macintosh/mediabay.c           |    3 ++-
 drivers/pci/pci.c                      |    1 +
 drivers/scsi/aic7xxx/aic79xx_osm_pci.c |    2 +-
 drivers/scsi/aic7xxx/aic7xxx_osm_pci.c |    2 +-
 drivers/scsi/mesh.c                    |    1 +
 drivers/scsi/sd.c                      |    3 +--
 drivers/usb/host/sl811-hcd.c           |    1 +
 drivers/usb/host/u132-hcd.c            |   11 ++++++++---
 drivers/video/chipsfb.c                |    2 +-
 drivers/video/nvidia/nvidia.c          |    2 +-
 include/linux/pm.h                     |    9 ++++++++-
 kernel/power/disk.c            ...

Thanks, applied. 

With this, I also find that I dislike the use of suspend/resume for 
freezing for STD a lot less. It's still too easy to get confused, but at 
least now drivers always have total knowledge about what is really going 
on. I'd not like this interface as a driver writer, but now it's not 
fundamentally broken any more, just slightly confusing.

			Linus
--


On Sun, Feb 24, 2008 at 2:43 AM, Linus Torvalds

Tested and working.

Thanks again,
Jeff.
--


Hmm. Doesn't compile for me in 2.6.25-rc2-git7:

  CC [M]  drivers/usb/host/u132-hcd.o
drivers/usb/host/u132-hcd.c: In function ‘u132_suspend’:
drivers/usb/host/u132-hcd.c:3224: error: expected expression before ‘int’
drivers/usb/host/u132-hcd.c:3225: error: ‘ports’ undeclared (first use in this function)
drivers/usb/host/u132-hcd.c:3225: error: (Each undeclared identifier is reported only once
drivers/usb/host/u132-hcd.c:3225: error: for each function it appears in.)
make[3]: *** [drivers/usb/host/u132-hcd.o] Error 1
make[2]: *** [drivers/usb/host] Error 2
make[1]: *** [drivers/usb] Error 2
make: *** [drivers] Error 2

This fixes it:

Thanks
Mirco

---
From: Mirco Tischler <mt-ml@gmx.de>

Fixes the following compile error caused by commit
3a2d5b700132f35401f1d9e22fe3c2cab02c2549

...
  CC [M]  drivers/usb/host/u132-hcd.o
drivers/usb/host/u132-hcd.c: In function ‘u132_suspend’:
drivers/usb/host/u132-hcd.c:3224: error: expected expression before ‘int’
drivers/usb/host/u132-hcd.c:3225: error: ‘ports’ undeclared (first use in this function)
...

Signed-off-by: Mirco Tischler <mt-ml@gmx.de>
---
 drivers/usb/host/u132-hcd.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/usb/host/u132-hcd.c b/drivers/usb/host/u132-hcd.c
index 6fca069..58830b2 100644
--- a/drivers/usb/host/u132-hcd.c
+++ b/drivers/usb/host/u132-hcd.c
@@ -3214,6 +3214,7 @@ static int u132_suspend(struct platform_device *pdev, pm_message_t state)
                 return -ESHUTDOWN;
         } else {
                 int retval = 0;
+		int ports = 0;
 
 		switch (state.event) {
 		case PM_EVENT_FREEZE:
@@ -3221,7 +3222,7 @@ static int u132_suspend(struct platform_device *pdev, pm_message_t state)
 			break;
 		case PM_EVENT_SUSPEND:
 		case PM_EVENT_HIBERNATE:
-    ...

Hm, why not do:

---
 drivers/usb/host/u132-hcd.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6/drivers/usb/host/u132-hcd.c
===================================================================
--- linux-2.6.orig/drivers/usb/host/u132-hcd.c
+++ linux-2.6/drivers/usb/host/u132-hcd.c
@@ -3214,6 +3214,7 @@ static int u132_suspend(struct platform_
                 return -ESHUTDOWN;
         } else {
                 int retval = 0;
+		int ports = MAX_U132_PORTS;
 
 		switch (state.event) {
 		case PM_EVENT_FREEZE:
@@ -3221,7 +3222,6 @@ static int u132_suspend(struct platform_
 			break;
 		case PM_EVENT_SUSPEND:
 		case PM_EVENT_HIBERNATE:
-                        int ports = MAX_U132_PORTS;
                         while (ports-- > 0) {
                                 port_power(u132, ports, 0);
                         }
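
For anyone puzzled by the compile error itself: in C as accepted by the
compilers of that time, a case label has to be followed by a statement, and a
declaration is not a statement, hence "expected expression before 'int'".
Declaring the variable before the switch (as both patches do) or wrapping the
case body in braces avoids it. Below is a standalone sketch of the hoisted
form; demo_port_power, DEMO_MAX_PORTS and the DEMO_EVENT_* constants are
stand-ins with arbitrary values, not the real u132 symbols.

```c
#include <assert.h>

#define DEMO_MAX_PORTS 4   /* arbitrary value for the sketch */

enum { DEMO_EVENT_FREEZE, DEMO_EVENT_SUSPEND, DEMO_EVENT_HIBERNATE };

static int demo_powered_off;   /* counts power-off calls in place of real I/O */

static void demo_port_power(int port, int on)
{
    assert(port >= 0 && port < DEMO_MAX_PORTS);
    if (!on)
        demo_powered_off++;
}

static int demo_suspend(int event)
{
    int retval = 0;
    int ports = DEMO_MAX_PORTS;   /* declared before the switch, not after a label */

    switch (event) {
    case DEMO_EVENT_FREEZE:
        break;
    case DEMO_EVENT_SUSPEND:
    case DEMO_EVENT_HIBERNATE:
        /* "int ports = ...;" placed right here is what triggered the error */
        while (ports-- > 0)
            demo_port_power(ports, 0);
        break;
    }
    return retval;
}
```

Both sleep events power off every port, highest port number first; the freeze
event leaves them alone.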
--


Ah, I see it's merged already.

Thanks for the fix, btw! :-)

--


Seems to work here after basic tests. ACK.

(I discovered that -rc2 swsusp will not power down in some cases, but
it was here before the patch, too...)
								Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Jesse Barnes
Date: Thursday, February 21, 2008 - 5:23 pm

Your system (either your distro suspend/resume scripts or your platform) must 
be running the video BIOS at resume time, otherwise it would probably come 
back blank.

Jesse
--

From: Jeff Chua
Date: Thursday, February 21, 2008 - 5:42 pm

But I don't think so, unless acpid is doing just that. In my
suspend/resume event script, it's just doing a simple s2ram (without
options), and exit after resume.

Jeff.
--

From: Jesse Barnes
Date: Thursday, February 21, 2008 - 5:46 pm

Your s2ram script is doing your STD also?  Seems counterintuitive.  Anyway, 
some machines also re-POST the GPU on resume from S3; maybe yours is doing 
that.

Jesse
--

From: Jeff Chua
Date: Thursday, February 21, 2008 - 5:52 pm

It's s2ram to do STR, not STD. Sorry for the confusion. But the key
point is there's no GREEN for STR. Mr Green only appears with STD.

Thanks,
Jeff.
--

From: Jesse Barnes
Date: Thursday, February 21, 2008 - 6:02 pm

Ah, ok that makes sense.

So typically, what you'd see at suspend time is this ugly call chain:

1) user requests suspend or hibernate
2) kernel kicks users off VT
3) X calls LeaveVT function of X driver
4) X driver restores whatever video state it felt like saving when it started 
up
5) kernel calls suspend methods
6) machine goes to sleep

then on resume:

1) user requests wakeup
2) kernel calls resume methods
3) X takes back the VT, calling driver EnterVT methods
4) X driver EnterVT routine runs, doing whatever it wants
5) you're back to where you were (on a good day anyway)

So, on your machine, I suspect your firmware is doing enough that X doesn't 
have to save/restore full video state around Enter/Leave VT (the same 
functions called at VT switch time when you press ctl-alt-fx), otherwise 
you'd be missing things like your backlight or text consoles.

So the advantage of the kernel suspend/resume hooks for the DRM layer is that 
the kernel video drivers can do full state save/restore (which X usually 
doesn't do, and isn't really designed to do), so that if your platform 
*doesn't* do it all, you'll still end up with a usable machine in the end.

The fact that you'd started running into problems since we merged this just 
means your platform was taking care of it for you (lucky you) and that we 
have some bugs in the hibernate code that we're just discovering.

Jesse
--

From: Jeff Chua
Date: Thursday, February 21, 2008 - 6:27 pm

That's the main reason for moving to the X series. It seems to work
very well on Linux.

Thanks,
Jeff.
--

From: Linus Torvalds
Date: Thursday, February 21, 2008 - 6:28 pm

Well, I'm also hoping that eventually we could even just not do the VT 
switch at all, and the kernel can treat X as "just another user process" 
that it freezes.

At least from a mode setting standpoint.

We'd still want to make sure that X repaints the screen if the contents 
were lost, of course. And this is going to depend very intimately on the 
type of graphics card and whether the video RAM is saved by STR or not - 
for the Intel integrated graphics kind of situation, the video RAM will be 
refreshed along with all the other memory, but for other cards we may end 
up having to do the VT switch not so much for modesetting reasons as just 
a way to get X to save and restore all the *other* state.

How close is the i915 driver to not having to even signal X? Or is that 
just a pipedream of mine?

			Linus
--

From: Jesse Barnes
Date: Thursday, February 21, 2008 - 6:35 pm

Drivers supporting kernel modesetting will have to stuff their VRAM somewhere, 

It's there in the modesetting tree (though the requisite changes to avoid VT 
notification aren't done, it should all work fine).

Jesse
--

From: Matthew Garrett
Date: Friday, February 22, 2008 - 3:37 am

The s2ram command has a built-in whitelist used to set up video 
rePOSTing. If you want to test, reboot and do

echo mem >/sys/power/state

without i915 being loaded. If you get a console back on resume then the 
platform is reinitialising video for you, but otherwise it's your 
userspace.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Ingo Molnar
Date: Friday, February 22, 2008 - 6:06 am

btw., why isnt there an in-kernel whitelist, with perhaps a dynamic, 
convenient /debug/s2r/whitelist append-API for distros (and testers) to 
add more entries to the whitelist/blacklist? (for cases where the kernel 
whitelist has not caught up yet) Which would eventually converge to 
Utopia: s2ram that just works out of box.
 
This would be a lot more flexible (people could even temporarily extend 
the whitelist via rc.local if need be, etc.), a lot more robust and a 
lot more user friendly than the "dont use /sys/power/state, rely on some 
user-space tool to work around bugs" approach.

Really, i couldnt make the s2ram API/quirks situation much worse even if 
i deliberately tried to design the whole code to be as hard to use and 
as confusing as possible :-/ These types of half-kernel half-userspace 
solutions usually result in constant finger-pointing and constant lag, 
and they result in about the crappiest user experience that is possible 
to achieve physically.

( Sorry about the strong words, while there's lots of good and positive
  development lately i havent seen much change in this particular area
  of s2ram in the past 1-2 years, and the whole chain is only as strong
  as the weakest link - so someone finally has to deliver this message
  to the cozy fire of s2r hackers while our testers and users are 
  standing out in the cold rain ... )

	Ingo
--

From: Rafael J. Wysocki
Date: Friday, February 22, 2008 - 9:10 am

The problem with the whitelists is that they have to use quite a lot of data to
reliably match the system.  The s2ram whitelist is not 100% reliable, because
it uses too little information to distinguish different versions of the same
machine model, for example.

Plus, in an ideal world, we should be able to match all possible working
graphics/chipset/BIOS combinations and that would be quite a bit of a database.
Also, there are some quirks that need to be run from the user land, AFAICS
(eg. in an i86 real-mode compatible manner).

IMO, whitelisting is not a solution.  It's only a sort-of-working workaround
and as such it shouldn't be put into the kernel.

Thanks,
Rafael
--

From: Linus Torvalds
Date: Friday, February 22, 2008 - 9:50 am

The big problem with that is
 - the people who know about the devices are usually not kernel people
 - the workarounds that the whitelist requires is quite often not a kernel 
   workaround.

In other words, the most common workarounds for the s2ram whitelist is 
usually to do things like running vbetool in user-level to do VGA register 
save/restore (VBE_POST and VBE_SAVE). Sure, the kernel could do that with 
usermodehelper etc, but s2ram also has those things as command line flags 
etc, so...

		Linus
--

From: Matthew Garrett
Date: Friday, February 22, 2008 - 11:01 am

Because all of these video quirks are just workarounds for the fact that 
the kernel doesn't work properly. In general, you really don't want to 
call a real-mode video bios from the kernel, so punting it to userspace 
(and leaving the whitelisting there) is somewhat more straightforward. 
In addition, we can then extend the whitelist without requiring kernel 

We've got i915 suspend/resume now, which already fixes this for a large 
number of users. Recent ATI is easy, now that we actually have specs for 
ATOM. The nouveau guys are almost at the point where we can do it for 
nvidia. That basically just leaves VIA.

The other s2r issues are pretty much just driver bugs at this point.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Ingo Molnar
Date: Saturday, February 23, 2008 - 4:17 am

ok - sounds good :-)

	Ingo
--

From: Jeff Chua
Date: Wednesday, February 20, 2008 - 10:54 am

Linus, guess I missed this part ... so before touching anything, I did
try suspend-to-ram, and it works on console and in X.

And suspend-to-disk hangs, but I can still press and hold the power
button to power it off. Then upon powering on and resume, I get the
ugly green "console" screen. I can still type and move around.
Starting X runs fine. Ctrl-Alt-Del or switching back to console will
get back to the green screen.

Thanks,
Jeff.
--

From: Linus Torvalds
Date: Wednesday, February 20, 2008 - 11:37 am

The "press and hold for five seconds" is actually a hardware feature of 
the southbridge (well, I guess there is "software" in there too, but it's 
the embedded kind). So the fact that it powers off at that point means 
nothing, it just means that ok, your kernel is hung, but the hardware
still works ;)

This *sounds* like some part of the suspend-to-disk sequence is doing 
something stupid like trying to access the screen after it has been turned 
off, which doesn't surprise me at all. My oft-stated opinion has been that 
suspend-to-disk isn't a suspend at all, and should never have been 
confused with "suspending" anything.

It's "snapshot-and-restore", and my opinion is that:

 - it should *never* call "suspend()"/"resume()" at all (that should be
   reserved purely for suspend-to-RAM and has real power management 
   issues!)

 - it should have a totally separate "halt/unhalt/restore" thing 
   that has nothing what-so-ever to do with power management, and is 
   purely about stopping the hardware for things like USB and network 
   cards (which otherwise do things like scan their command lists 
   asynchronously) and making sure that the driver state is consistent 
   with that stopped hw state.

 - the people who confuse snapshot/restore with suspend/resume are 
   horrible people that cause problems exactly because driver people then 
   get those things mixed up, and something like the video suspend/resume 
   should probably never have impacted suspend-to-disk in the first place!
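
To make the split proposed in the list above a bit more concrete, here is one
way it could look: suspend-to-RAM and snapshotting get disjoint sets of
callbacks, so the snapshot path cannot end up in power management code.
Everything in this sketch (demo_driver_ops, the callback names, demo_snapshot)
is hypothetical and only mirrors the wording of the list; it is not an
existing kernel interface.

```c
#include <assert.h>
#include <stddef.h>

struct demo_device { int powered; int quiesced; };

struct demo_driver_ops {
    /* power management, suspend-to-RAM only */
    int (*suspend)(struct demo_device *dev);
    int (*resume)(struct demo_device *dev);
    /* snapshot/restore only: stop and restart the hardware, no power changes */
    int (*halt)(struct demo_device *dev);
    int (*unhalt)(struct demo_device *dev);
    int (*restore)(struct demo_device *dev);
};

static int demo_suspend(struct demo_device *dev) { dev->powered = 0; return 0; }
static int demo_resume(struct demo_device *dev)  { dev->powered = 1; return 0; }
static int demo_halt(struct demo_device *dev)    { dev->quiesced = 1; return 0; }
static int demo_unhalt(struct demo_device *dev)  { dev->quiesced = 0; return 0; }

static const struct demo_driver_ops demo_ops = {
    .suspend = demo_suspend, .resume = demo_resume,
    .halt = demo_halt, .unhalt = demo_unhalt, .restore = demo_unhalt,
};

/* Snapshot path: quiesce, take the image, carry on. Never calls suspend(). */
static int demo_snapshot(const struct demo_driver_ops *ops,
                         struct demo_device *dev)
{
    assert(ops != NULL && ops->halt != NULL && ops->unhalt != NULL);
    int err = ops->halt(dev);
    if (err)
        return err;
    /* ... the memory image would be taken while the device is quiescent ... */
    return ops->unhalt(dev);
}
```

In this arrangement a snapshot leaves the device's power state untouched, and
only the suspend()/resume() pair ever changes it.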


.. so this implies that while the laptop apparently hung at the end of the 
snapshotting, the snapshotting did actually work, and it must have hung at 
the very end, presumably when it tried to actually turn the power off.

So there seem to be two (probably largely independent) problems:

 - the hang at shutdown that requires you to press-and-hold the power 
   button to actually cut power.

   At a guess: putting the VGA device into D3hot makes the ACPI code that 
   ...
From: Jeff Chua
Date: Wednesday, February 20, 2008 - 11:49 am

Here's an interesting discovery. After I found that "echo reboot >
/sys/power/disk" does reboot, I tried "echo shutdown >
/sys/power/disk", and it does shut down properly.

With "platform" it refuses to shutdown. Both reboot and shutdown still



Yes.


Thanks,
Jeff.
From: Matthew Garrett
Date: Wednesday, February 20, 2008 - 12:25 pm

That kind of suggests that the ACPI platform code is hitting the 
hardware directly - we've seen similar issues with PATA controllers. The 
right thing to do here is almost certainly just to avoid explicitly 
powering down hardware on hibernation.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Jesse Barnes
Date: Wednesday, February 20, 2008 - 11:57 am

Totally agreed.  I remember when I started getting hibernation bug reports 
against this new code and boggling at how hibernate was actually done.  The 
driver actually gets its ->suspend routine called twice with two different 
pm_message_t values.  We tried to do different stuff depending on the 
pm_message_t (like only putting the device in D3hot if PM_EVENT_SUSPEND), but 

Sounds like a good theory... now if we could just use set_power_state in the 
suspend case only.  That's what the latest code *tries* to do...

Jesse
--

From: Pavel Machek
Date: Sunday, February 17, 2008 - 11:31 pm

Hmm, entering S4 seems like good place to call suspend() for... unless
you want separate freeze()/unfreeze(), suspend()/resume(),
suspend_s4() and halt() callbacks.
						Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Jesse Barnes
Date: Wednesday, February 20, 2008 - 10:50 am

They're for saving and restoring GPU state across suspend/resume.  They're 
particularly useful if your machine doesn't re-POST at resume time.  In that 
case your GPU may be totally uninitialized, so either the kernel or X has to 


I know I fixed that problem in at least one configuration...  Can you try:
  # echo test > /sys/power/disk
  # echo disk > /sys/power/state
and see if that also turns your screen green?

Also, getting a GPU register dump would be helpful.  The intel_reg_dumper tool 
is built as part of the xf86-video-driver build 
(git://anongit.freedesktop.org/git/xorg/driver/xf86-video-intel), can you 
pull that down and try it out?

Thanks,
Jesse
--

From: Jeff Chua
Date: Wednesday, February 20, 2008 - 11:29 am

Yes, still green. But I got it to actually reboot with ...

echo reboot > /sys/power/disk


Attached are the two dumps from console. One prior to suspend, and one
after resume.

Thanks,
Jeff.
From: Jesse Barnes
Date: Wednesday, February 20, 2008 - 11:53 am

Looks like the AR registers are hosed, which is what I thought I fixed...  Can 
you attach your i915_drv.c file just so I can sanity check it?

Thanks,
Jesse
--

From: Jeff Chua
Date: Wednesday, February 20, 2008 - 12:10 pm

Attached.

Thanks,
Jeff.
From: Jesse Barnes
Date: Wednesday, February 20, 2008 - 12:18 pm

Ok, so Linus' theory about something later in the resume path trying to touch 
video is looking good.


Hm, looks right.  Let me see if I can reproduce this on my T61.

Thanks,
Jesse
--

From: Jesse Barnes
Date: Wednesday, February 20, 2008 - 1:09 pm

Given the way the PM core works, do we need to set a flag like this?  I really 
hope there's a better way of doing this...

Thanks,
Jesse

diff --git a/drivers/char/drm/i915_drv.c b/drivers/char/drm/i915_drv.c
index 4048f39..a2d6242 100644
--- a/drivers/char/drm/i915_drv.c
+++ b/drivers/char/drm/i915_drv.c
@@ -238,6 +238,13 @@ static void i915_restore_vga(struct drm_device *dev)
 
 }
 
+/*
+ * If we're doing a suspend to disk, we don't want to power off the device.
+ * Unfortunately, the PM core doesn't tell us if we're headed for a regular
+ * S3 state or that it's about to shut down the machine, so we use this flag.
+ */
+static int i915_hibernate;
+
 static int i915_suspend(struct drm_device *dev, pm_message_t state)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -252,6 +259,9 @@ static int i915_suspend(struct drm_device *dev, pm_message_t state)
 	if (state.event == PM_EVENT_PRETHAW)
 		return 0;
 
+	if (state.event == PM_EVENT_FREEZE)
+		i915_hibernate = 1;
+
 	pci_save_state(dev->pdev);
 	pci_read_config_byte(dev->pdev, LBB, &dev_priv->saveLBB);
 
@@ -366,7 +376,7 @@ static int i915_suspend(struct drm_device *dev, pm_message_t state)
 
 	i915_save_vga(dev);
 
-	if (state.event == PM_EVENT_SUSPEND) {
+	if (!i915_hibernate) {
 		/* Shut down the device */
 		pci_disable_device(dev->pdev);
 		pci_set_power_state(dev->pdev, PCI_D3hot);
@@ -385,6 +395,8 @@ static int i915_resume(struct drm_device *dev)
 	if (pci_enable_device(dev->pdev))
 		return -1;
 
+	i915_hibernate = 0;
+
 	pci_write_config_byte(dev->pdev, LBB, dev_priv->saveLBB);
 
 	/* Pipe & plane A info */
--

From: Rafael J. Wysocki
Date: Wednesday, February 20, 2008 - 1:14 pm

Then, the .resume() called after the image creation will clear the flag and I
don't think it's safe to allow it to survive i915_resume() ...
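
The objection can be traced with a small user-space model. This is only a sketch, not kernel code: it assumes the hibernation order described elsewhere in this thread (->suspend(PMSG_FREEZE), ->resume(), ->suspend(PMSG_SUSPEND)), and every name in it (hibernate_flag, model_suspend, model_resume, hibernate_sequence_powers_down) is hypothetical and merely stands in for the patched driver logic.

```c
/* Toy model of the i915_hibernate flag from the patch above; not kernel code. */
enum pm_event { EV_FREEZE, EV_SUSPEND };

static int hibernate_flag;	/* stands in for i915_hibernate */
static int powered_down;	/* 1 if the model "entered D3hot" */

static void model_suspend(enum pm_event ev)
{
	if (ev == EV_FREEZE)
		hibernate_flag = 1;
	/* Mirrors the patched check: power down unless the flag is set. */
	powered_down = !hibernate_flag;
}

static void model_resume(void)
{
	powered_down = 0;
	hibernate_flag = 0;	/* the patch clears the flag in resume */
}

/* Runs the hibernation sequence; returns 1 if the device ends up powered down. */
int hibernate_sequence_powers_down(void)
{
	hibernate_flag = 0;
	powered_down = 0;
	model_suspend(EV_FREEZE);	/* before the snapshot */
	model_resume();			/* after the image is created */
	model_suspend(EV_SUSPEND);	/* before power off */
	return powered_down;
}
```

In this model hibernate_sequence_powers_down() returns 1: the intermediate resume has already cleared the flag, so the final suspend still takes the power-down branch.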

Thanks,
Rafael
--

From: Linus Torvalds
Date: Wednesday, February 20, 2008 - 1:29 pm

Yeah. By *not* using "->suspend()" for freezing or hibernate.

Please, Rafael - just make the f*cking suspend-to-disk use other routines 
already. 99% of all hardware needs to do exactly *nothing* on 
suspend-to-disk, and the ones that really do need things tend to need to 
not do a whole lot.

For example, the "freeze" action for USB (which is one of the hardest 
things to suspend) should literally be something like just setting the 
controller STOP bit, and waiting for it to have stopped. The "unfreeze" 
should be to just clear the stop bit, while the "restart" should be just a 
controller reset to use the current memory image.
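
As a rough illustration of how little such callbacks might need to do, here is a toy register model. It is a sketch under stated assumptions, not code for any real host controller: the bit names (CTRL_STOP, STAT_STOPPED), the struct hc_model layout and the hc_* functions are all invented for the example.

```c
#include <stdint.h>

#define CTRL_STOP    0x1u	/* hypothetical "stop" request bit */
#define STAT_STOPPED 0x1u	/* hypothetical "has stopped" status bit */

struct hc_model {
	uint32_t ctrl;
	uint32_t status;
	uint32_t schedule;	/* stands in for the in-memory request lists */
};

/* Toy hardware step: the controller reacts to the STOP bit. */
static void hc_hw_tick(struct hc_model *hc)
{
	if (hc->ctrl & CTRL_STOP)
		hc->status |= STAT_STOPPED;
	else
		hc->status &= ~STAT_STOPPED;
}

/* freeze: set STOP and wait until the controller reports that it stopped. */
int hc_freeze(struct hc_model *hc)
{
	hc->ctrl |= CTRL_STOP;
	for (int tries = 0; tries < 1000; tries++) {
		hc_hw_tick(hc);
		if (hc->status & STAT_STOPPED)
			return 0;
	}
	return -1;	/* never stopped */
}

/* unfreeze: clear STOP so the controller carries on where it left off. */
void hc_unfreeze(struct hc_model *hc)
{
	hc->ctrl &= ~CTRL_STOP;
	hc_hw_tick(hc);
}

/* restart: reset the controller, then point it at the restored memory image. */
void hc_restart(struct hc_model *hc, uint32_t schedule_from_image)
{
	hc->ctrl = 0;
	hc->status = 0;
	hc->schedule = schedule_from_image;
}
```

None of the three functions touches a power state, which is the distinction being argued for here.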

NONE OF THIS HAS ABSOLUTELY ANYTHING TO DO WITH SUSPEND.

It never did. I've told people so for years. Maybe actually seeing the 
problems will make people realize.

So please, we shouldn't call "->suspend[_late]" or "->resume[_early]" at 
all. Not with PMSG_FREEZE, not with PMSG_*anything*.

Can we please get this fixed some day? 

		Linus
--

From: Jesse Barnes
Date: Wednesday, February 20, 2008 - 1:41 pm

In talking with Rafael on IRC about this, I think we're agreed that we need 
separate entry points.  Even with a kexec based hibernate, we'll probably 
want ->hibernate callbacks so we don't end up shutting down the device.

The current callback system looks like this (according to Rafael and the last 
time I looked):
  ->suspend(PMSG_FREEZE)
  ->resume()
  ->suspend(PMSG_SUSPEND)
  *enter S3 or power off*
  ->resume()
The fact that we get suspend/resume called once before suspend again in the 
hibernate case is somewhat obnoxious, but it's even worse that we don't know 
what we're about to enter after ->suspend(PMSG_SUSPEND).  So in the short 
term it would be nice to at least get the target state exported.

And in the long term we could have:
  ->suspend()
  *enter S3*
  ->resume()
or:
  ->hibernate()
  *kexec to another kernel to save image*
  *power off*
  ->return_from_hibernate() (or somesuch)

Jesse
--

From: Linus Torvalds
Date: Wednesday, February 20, 2008 - 2:13 pm

Yes, it's very messy.

It's messy for a few different reasons:

 - the one you hit: a driver actually has a really hard time telling what 
   PMSG_SUSPEND really means.

 - more importantly, we generally don't want to "suspend/resume" the 
   hardware at all around a power-off, because we're going to resume with 
   the state at the time of the PMSG_FREEZE, which means that the hardware 
   has actually *changed* and been used in between!

that second case is very fundamental for things like USB devices, which in 
theory you can hold alive over a real suspend event (ie a STR event), but 
which absolutely MUST NOT be resumed over a suspend-to-disk event, because 
all the low-level request state is bogus!

So the "->resume" really isn't a resume at all. It's much closer to a 
"->reset".

Of course, the "solution" to this all right now is that we have to reset 
everything even if it *is* a suspend event, so it basically means that STR 
ends up using the much weaker model that snapshot-to-disk uses.

The fundamental problem being that the two really have nothing 

Yes, apart from all the complexities (suspend_late/resume_early). So in 
reality it's more than that, but the suspend/resume things are clearly 
nesting, and they have the potential to actually keep state around 
(because we *know* this machine is not going to mess with the devices in 
between).

IOW, here we actually can have as an option "assume the device is there 

Enough people don't trust kexec that I suspect the right thing simply is

	->freeze()		// stop dma, synchronize device state
	*snapshot*
	->unfreeze();		// resume dma
	*save image*
	[ optionally ->poweroff() ]	// do we really care? I'd say no
	*power off*
	->restore()		// reset device to the frozen one

which may have four entry-points that can be illogically mapped to the 
suspend/resume ones like we do now, but they really have nothing to do 
with suspending/resuming.
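
The sequence above could be sketched as a separate set of entry points. This is a user-space illustration only: struct hib_ops, the demo_* callbacks, hib_save_image() and hib_restore_image() are hypothetical names, not an existing kernel interface, and the callbacks only record that they were called.

```c
#include <stddef.h>

enum hib_step { STEP_FREEZE, STEP_UNFREEZE, STEP_POWEROFF, STEP_RESTORE };

/* Hypothetical snapshot/restore entry points, kept apart from suspend/resume. */
struct hib_ops {
	void (*freeze)(void);	/* stop DMA, synchronize device state */
	void (*unfreeze)(void);	/* resume DMA */
	void (*poweroff)(void);	/* optional, may be NULL */
	void (*restore)(void);	/* reset device to the frozen state */
};

#define HIB_LOG_MAX 8
enum hib_step hib_log[HIB_LOG_MAX];	/* order in which callbacks ran */
size_t hib_log_len;

static void hib_record(enum hib_step s)
{
	if (hib_log_len < HIB_LOG_MAX)
		hib_log[hib_log_len++] = s;
}

static void demo_freeze(void)   { hib_record(STEP_FREEZE); }
static void demo_unfreeze(void) { hib_record(STEP_UNFREEZE); }
static void demo_restore(void)  { hib_record(STEP_RESTORE); }

/* A driver that has nothing to do at power off simply leaves poweroff NULL. */
const struct hib_ops demo_ops = {
	.freeze   = demo_freeze,
	.unfreeze = demo_unfreeze,
	.poweroff = NULL,
	.restore  = demo_restore,
};

/* Everything up to the point where the machine loses power. */
void hib_save_image(const struct hib_ops *ops)
{
	ops->freeze();
	/* the snapshot would be taken here */
	ops->unfreeze();
	/* the image would be written out here */
	if (ops->poweroff)
		ops->poweroff();
}

/* After the image has been loaded back into memory. */
void hib_restore_image(const struct hib_ops *ops)
{
	ops->restore();
}
```

Calling hib_save_image(&demo_ops) followed by hib_restore_image(&demo_ops) records freeze, unfreeze, restore, in that order; the optional poweroff step is skipped because the demo driver does not provide one.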

And notice how while "freeze/restore" kind of pairs like a ...
From: Jesse Barnes
Date: Wednesday, February 20, 2008 - 2:44 pm

Well, it seems like we'll have to fix drivers in either case, and isn't a 
kexec approach fundamentally more sound and simple, design-wise?  Rafael 
pointed out some problems with properly setting wakeup states, but I think 

Yeah, definitely.  It has to be much more robust and deal with configuration 
changes, etc. (within reason).

Jesse
--

From: Linus Torvalds
Date: Wednesday, February 20, 2008 - 3:22 pm

Absolutely. 

Two big reasons:

 - debuggability

   I know we don't do this correctly right now, but I want to be able to 
   at least feel like we can some day actually do printk's etc through 99% 
   of the suspend/resume cycle. It's a *huge* thing for debugging problems 
   that happen in the wild, and one of the biggest issues is that we 
   currently usually just get a "the machine died" message when suspend or 
   resume doesn't work.

   Yes, doing printk's to the Intel management flash stuff can help a lot 
   here, and I want that too, but I'd really like to shut down consoles 
   individually rather than having the "big hammer" approach that shuts 
   them up entirely over the whole suspend/resume sequence (or not at all, 
   if you use "no_console_suspend").

   And I'd *really* like to do things like VGA-console shutdown in the 
   late phase (and resume early).

 - it's actually likely *much* simpler for some devices. 

   Simple devices (and that includes things like PCI bridges etc, but 
   also potentially USB host controllers etc) are things that can often be 
   trivially suspended - all the complexity is really not in the 
   controller itself, but beyond, in the bus that it actually drives.

   And the late-suspend/early-resume means that you don't have to worry 
   about things like interrupts happening while you're suspended. Yes, 
   putting the device into D3 will disable interrupts from that device too 
   (unless there are bugs), *BUT* you may be sharing an interrupt line, 
   and interrupts may be posted and delayed, so an earlier interrupt may 
   well be pending etc.

   suspending late and resuming early just avoids those issues entirely.

Sometimes these things interact. For example, firewire is certainly not 
trivial to suspend as a "subsystem" thing (ie all the devices behind the 
firewire bridge need to do magic things, like spinning down etc that 
obviously can not happen in the final "late" phase), but the firewire ...
From: david
Date: Thursday, February 21, 2008 - 1:30 am

I've been watching for kexec hibernate for a little while now, and the 
last I saw was that acpi was incompatible with the kexec hibernate (but 
the suspend folks were still claiming that devices needed to be put in the 
'right mode', not just powered off). I've been waiting to see this resolved.

David Lang
--

From: Mark Lord
Date: Friday, February 22, 2008 - 9:56 am

david@lang.hm wrote:
..

Yeah, exactly.  What's so special about poweroff on hibernation?
Why even bother with the special "S4" state there?
I want a real full poweroff, or at least I think I do.  Why wouldn't I?

????
--

From: Rafael J. Wysocki
Date: Friday, February 22, 2008 - 10:02 am

(1) To be able to wake up with the help of devices that can't wake
    the system up from S5 (power off)

You may want that, some people may not want it.

We are supposed to handle S4, the BIOS/platform may expect us to do that, so
IMO this is a good enough reason to do it.  Especially since we can.

Thanks,
Rafael
--

From: Mark Lord
Date: Friday, February 22, 2008 - 10:32 am

..

That's the theory.  I've read about it, but have yet to imagine
any real-life situation where it applies.

But this isn't my speciality, so.. do you have experience with any real examples?


--

From: Rafael J. Wysocki
Date: Friday, February 22, 2008 - 10:44 am

Yup.  The fan in my notebook behaves incorrectly after a resume from
hibernation if S5 is entered instead of S4 during it.

I don't know why exactly it happens, but that's how it goes.

Also, some machines are reported to behave incorrectly after a "shutdown"
mode hibernation, while the same machines work just fine after a "platform"
mode hibernation.  So at least for these machines it seems to matter.

Thanks,
Rafael
--

From: david
Date: Friday, February 22, 2008 - 12:23 pm

so if you power off your laptop the fan doesn't work when you turn it back 

given that we don't have a pure "shutdown" option available to try I don't 
see how this can be said to have been tested.

currently any attempts to do a shutdown type hibernate are tangled in the 
other code that is there for the suspend modes. this makes it _very_ hard 
to say that the hardware requires something as opposed to the strong 
possibility that the software is doing something wrong.

there are also a _lot_ of people who are not able to reliably use the 
existing "platform" mode hibernation, so it's not a fair statement to say 
that it's the 'right' thing to do. If you want to make it an option, fine. 
But please give those of us who don't care about these other wakeup 
options, and who want to be able to use other OS's while linux is stopped 
an option as well.

David Lang
--

From: Rafael J. Wysocki
Date: Friday, February 22, 2008 - 4:16 pm

There is such an option.  Put

# echo shutdown > /sys/power/disk

into the init scripts and it will do the trick.

Thanks,
Rafael
--

From: Rafael J. Wysocki
Date: Wednesday, February 20, 2008 - 3:36 pm

In fact the driver can find out in which state to put the device into,




We do, if there are devices that wake us up from S4 and don't wake us up from
S5, for example.  Plus this f*cking fan in my box that doesn't work after the




Agreed.

Thanks,
Rafael
--

From: Linus Torvalds
Date: Wednesday, February 20, 2008 - 4:13 pm

And by "right low power states" you mean "wrong low-power states", right?

The thing is, they really *are* the wrong states for 99% of all hardware.

If you really have a piece of hardware that you want to have the 
"->poweroff()" thing do the same as "->suspend()", then hey, just use the 
same function (or better yet, use two different functions with a call to a
shared part).
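
A minimal sketch of that "two functions with a shared part" shape, as a user-space toy rather than a real driver. The struct toy_dev fields and the choice of what each path does are assumptions made for illustration; the only point is that the common quiesce step is factored out while the power-state decision stays separate.

```c
/* Toy device state; the fields are invented for the example. */
struct toy_dev {
	int dma_running;
	int state_saved;
	int low_power;
};

/* Shared part: both paths need the device to stop touching memory. */
static void toy_quiesce(struct toy_dev *d)
{
	d->dma_running = 0;
}

/* suspend: quiesce, save state for resume, enter a low-power state. */
void toy_suspend(struct toy_dev *d)
{
	toy_quiesce(d);
	d->state_saved = 1;
	d->low_power = 1;
}

/* poweroff: quiesce only; the platform cuts power to the whole system. */
void toy_poweroff(struct toy_dev *d)
{
	toy_quiesce(d);
}
```

After toy_suspend() the model device is in its low-power state; after toy_poweroff() it is stopped but still powered, which a display device would need in order to keep showing messages until the end.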

Because IT IS NOT TRUE that ->suspend() puts the devices in the "right 
power state". The power states are likely to be totally different for S3 
and for poweroff, and they are going to differ in different ways depending 
on the device type.

One example would be the one that started this version of the whole 
discussion (shock horror! We're on subject!) ie when you do a system 
shutdown, you generally do not even *want* to put individual devices into 
low-power states at all, because the actual "power off the system" thing 
will take care of it for you much better.

So to take just something as simple as VGA as an example: you really do 
not want to suspend that device, because you want to see the poweroff 
messages until the very end. 

So that final device ->poweroff function really has absolutely *nothing* 
in common with the device ->suspend[_late] functions, simply because 
almost any sane driver would decide to do different things.

Of course, we can continue to do the insane thing and just continue to use 
inappropriate and misleading function callback names, and then encoding 
what the *real* action should be in the argument and/or in magic 
system-wide state parameters.

So in that sense, it's certainly totally the same thing whether we call it 
->shutdown or ->poweroff or ->eat_a_banana, since you could always just 
look at the argument and other clues, and decide that *this* time, for 
*this* kind of device, the "eat a banana" callback actually means that we 
should power it off, but wouldn't it be a lot more logical to just make it 
clear in the first place that they aren't ...
From: Rafael J. Wysocki
Date: Wednesday, February 20, 2008 - 4:35 pm

In fact we have acpi_pci_choose_state() that tells the driver which power
state to put the device into in ->suspend().  If that is used, the device ends

No.  Again, if there are devices that wake us up from S4, but not from S5,
they need to be handled differently in the *enter S4* case (hibernation) and

Yes, it would.  Still, the common thing is, it (ie. ->poweroff) _may_ want to

To clarify, I agree that we should use different callbacks for hibernation.

Well, I agree with that.

As I said before, that's mainly because I've been busy with other stuff
recently.  Now, with the Alex's help, I'm hoping to take care of it soon.

Thanks,
Rafael
--

From: Linus Torvalds
Date: Wednesday, February 20, 2008 - 5:00 pm

First off, nobody should *ever* use that directly anyway.

Secondly, the one that people should use ("pci_choose_state()") doesn't 
actually do what you claim it does. It does all kinds of wrong things, and 

And again, what does this have to do with (the example I used) the 
graphics hardware? Answer: nothing. The example I gave you we simply DO 
THE WRONG THING FOR.

Same thing for things like USB devices - where pci_choose_state() doesn't 
work to begin with. Why do we call "suspend()" on such a thing when we 
don't want to suspend it? We shouldn't. We should call "freeze/unfreeze" 
(which are no-ops) and then finally perhaps "poweroff", and that final 
stage might want to spin things down or similar.

But *none* of it has anything to do with suspend, and none of it has 
anything to do with pci_choose_state() (much less acpi_pci_choose_state)

The fact is, we should let the driver decide, and we should make it clear 
to the driver writer what he is deciding about - rather than basically lie 
and say "suspend the device and put it into D3" even when that's the last 
thing it should ever do.

			Linus
--

From: Rafael J. Wysocki
Date: Wednesday, February 20, 2008 - 5:13 pm

Well, if platform_pci_choose_state() is defined, pci_choose_state() returns
its result and on ACPI systems that points to acpi_pci_choose_state(), so in

I'm already convinced, really. :-)

Thanks,
Rafael
--

From: Linus Torvalds
Date: Wednesday, February 20, 2008 - 5:25 pm

Did you check closer?

I repeat: acpi_pci_choose_state() (when called from pci_choose_state()) 
doesn't even look at the target 'state'. It just blindly assumes that you 
want the deepest sleep-state you can have.

Which happens to be correct for normal suspend, but means that if you want 
to test other states (through '/sys/devices/.../power'), that sounds 
broken.

I didn't check any closer, but go check it yourself. The short and sweet: 
acpi_pci_choose_state() totally ignores its 'state' argument. Do you 
really think that's correct? But yes, "pci_choose_state()' effectively 
does that too, apart from PM_EVENT_ON, which is never used.

(But the whole and only point of pci_choose_state() was to do the 
PM_EVENT_FREEZE thing differently, which it doesn't do, so I think the 
real issue here is that the interface is really rather mis-designed)

I suspect most people who ever really looked and worked on this code had a 
specific device in mind, and I'm sure that all of the code individually 
always ends up making sense from the standpoint of some specific device 
driver. It's just that it never seems to make sense from a bigger issues 
standpoint, and often seems senseless from the standpoint of other devices 
of other types.

			Linus
--

From: Rafael J. Wysocki
Date: Wednesday, February 20, 2008 - 5:59 pm

acpi_pm_device_sleep_state() (that is called by acpi_pci_choose_state())
takes the target state directly from the ACPI layer.

We just want to get rid of the argument passed to ->suspend() eventually, but
there may be many _suspend_ states available (eg. "mem" and "standby") and
for each of them there may be different constraints on the device's state.  We
have to tell the driver which device states are possible in the target system
sleep state.  Right now we arbitrarily choose the one with the lowest power

This interface is not available any more (ie. there's only "wakeup" in


You're wrong, sorry.  With PM_EVENT_FREEZE it wouldn't even be necessary.
It's there, because potentially there are many possibilities with
PM_EVENT_SUSPEND and in fact it shouldn't even be used with
PM_EVENT_FREEZE.

All of this is more or less orthogonal to the issue at hand, which boils down
to the fact that we use the _suspend_ callbacks for hibernation and we
shouldn't be doing that.

Thanks,
Rafael
--

From: Mark Lord
Date: Friday, February 22, 2008 - 9:54 am

..

Something I've never understood, is why we would ever want to bother with *S4* at all?

I actually like hibernation (great for travelling), but I treat it as if
it were a complete power-off (S5?).  I pull batteries, unplug drives,
boot other operating systems, etc..

And when I put it all back together again with the Linux disk inserted,
I fully expect it to "resume" from the hibernation of 3 months ago.
And it does.

Why would I ever want anything less than a full poweroff for hibernation ????

Thanks.
--

From: Nigel Cunningham
Date: Wednesday, February 20, 2008 - 3:45 pm

Hi.


No. AFAICS, kexec is going to be more complex and ugly in many ways.

To summarise, a kexec based hibernation is going to need the following 
additional requirements to just replace what we already have:

- get the original kernel to allocate storage while racing against the 
rest of the system (currently allocation is done post-atomic copy & 
post-freezing - no racing). This makes it potentially slower, too;
- get the original kernel to transfer the information about what swap 
was allocated to the kexec'd kernel, probably together with a lot of 
other information (which pages are nosave etc).
- get the original kernel to keep memory free for the kexec'd kernel 
which would otherwise be usable. Not a biggy on desktops or laptops, but 
think about embedded.
- people keep talking about hibernating to an ext3 fs mounted on fuse as 
a limitation of the freezer. To do that with kexec, you're still going 
to have to bmap the ext3 fs and pass the block list (in which case we 
can also do it without kexec) or umount all the ext3/fuse part and 
remount in the kexec'd kernel. Sort of defeats the purpose, doesn't it?

I also wonder about how much of a pain it's going to be setting up 
userspace for this kexec'd kernel. Will you need a separate partition 
just for it? If not, will the userspace be loaded into memory all the 
time (more memory wasted for normal use), or loaded from ordinary 
partitions at kexec time (how to do safely? - more info to transfer 
between kernels?).

I'd love it if kexec really was the panacea to the freezer issues, but 
problems like these make me think it isn't a viable solution.

Nigel
--

From: Matthew Garrett
Date: Wednesday, February 20, 2008 - 5:13 pm

No, with a freezer-based model you can basically *never* suspend to 
anything related to FUSE or a userspace USB device or anything involving 
userspace iSCSI initiators or whatever. Sure, there are cases where 
moving away from the current model doesn't buy you anything, but that 
doesn't mean that the current model is a good thing. It's not. The 

You're looking at a tiny amount of memory when compared to current 
systems. It's really not a problem.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Nigel Cunningham
Date: Wednesday, February 20, 2008 - 5:40 pm

Hi.


Putting drivers and filesystems in userspace is the fundamentally broken 
concept. Not just when it comes to the freezer. The whole idea is 
inherently racy. You can draw silly diagrams about how the freezer 
supposedly works in LCA slides and spread FUD as much as you like. In 
the end, though, it's not nearly as hit-and-miss as you say, and 
replacing the freezer with a kexec based freezer is only going to create 

Please, quantify 'tiny'. In embedded, 5MB can be too much. I've worked 
on embedded solutions. I'm not pulling problems out of thin air.

Regards,

Nigel


--

From: Greg KH
Date: Wednesday, February 20, 2008 - 5:46 pm

Racy with regards to other things besides trying to suspend a machine?
If so, what?

thanks,

greg k-h
--

From: Nigel Cunningham
Date: Wednesday, February 20, 2008 - 6:17 pm

Hi.


That depends on what sort of tangled web you want to weave. Low memory 
situations are one other case that occurs to me quickly, especially 
(though not only) if your ability to swap were to depend upon a 
userspace driver and/or filesystem.

Regards,

Nigel
--

From: Greg KH
Date: Wednesday, February 20, 2008 - 9:43 pm

Lots of them :)

We have tanks running Linux using userspace USB drivers for vision
control systems (scary, I know...)  They seem to be successfully running
for many years now, and I'm interested in making sure those kinds of
things keep working.

We also have laser welding robots with userspace PCI drivers in car
manufacturing plants.  And other laser cutting robots slicing wood in
patterns moving at a rate of over 3 meters a second.  Again, with
userspace drivers and Linux.

Those users would also love to know of any potential problems you know

Sure, swap over a userspace filesystem or driver isn't a sane idea.  And
neither is swapping over NFS over a PPP connection attached to a USB to
serial device.  Yes, it's possible, and all in the kernel, but not a
wise decision.

Other than foolish configurations, if you come up with other issues
surrounding userspace drivers that could cause problems, please let me
know.

thanks,

greg k-h
--

From: Nigel Cunningham
Date: Wednesday, February 20, 2008 - 11:05 pm

Hi Greg.


A simple OOM condition isn't an issue? Surely a driver stalling because 
some of its memory gets swapped out just before it goes to use it would 
be a problem if it resulted in getting the length of a cut wrong or 
caused some distorted vision or a late turn :>

Am I missing something? Maybe these drivers mlock memory to avoid those 
issues or something like that?

Regards,

Nigel
--

From: Greg KH
Date: Wednesday, February 20, 2008 - 11:37 pm

I think they mlock their memory to prevent this from happening, it's not
hard when you control all the applications on the box :)
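
For reference, pinning a buffer with the POSIX mlock() call looks roughly like this. It is a sketch, not taken from any of the drivers mentioned: pin_buffer() and unpin_buffer() are hypothetical wrapper names, and whether the call succeeds depends on the process's privileges and its locked-memory limit.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Locks buf into RAM so it cannot be swapped out.
 * Returns 0 on success or the errno value on failure. */
int pin_buffer(void *buf, size_t len)
{
	if (mlock(buf, len) != 0)
		return errno;
	return 0;
}

/* Releases the lock taken by pin_buffer(). */
int unpin_buffer(void *buf, size_t len)
{
	if (munlock(buf, len) != 0)
		return errno;
	return 0;
}
```

A process that must never stall on a page fault would typically do this once at startup, before entering its time-critical loop.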

thanks,

greg k-h
--

From: Matthew Garrett
Date: Wednesday, February 20, 2008 - 6:10 pm

I'm really not interested in debating the matter. There are all sorts of 
potential uses for the freezer, but hibernation isn't one of them. We 
*need* to get rid of the freezer for suspend to RAM (because a band-aid 
to ensure atomicity is kind of pointless when the operation you're 
entering is inherently atomic), and once all the drivers are able to 
deal with that then it's trivial to get rid of it for hibernation as 
well. Arguing that the reality of userspace drivers is broken doesn't 

Then the in-kernel solution has already lost anyway, and I'm desperately 
unconcerned about out of tree stuff.
-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Nigel Cunningham
Date: Wednesday, February 20, 2008 - 6:25 pm

Hi.


Re suspend to ram, I agree. No argument there. Re hibernation, I think 
your assertion that it will be trivial to get rid of it for hibernation 
is just plain wrong. Perhaps you don't understand the issues as well as 
you think you do.

Re arguing that the reality of userspace drivers is broken doesn't help 
here: Yeah, I know. But sometimes if you point out broken ideas for long 
enough, people do actually listen. Or you learn. Or both.

Frankly, I don't want to debate the issue either. What I really want is 
just to have a hibernation implementation that works, is flexible, 
reliable and quick, and one that I don't have to keep maintaining. 
Unfortunately for me, most people seem to be more concerned with fixing 
hypothetical problems than with giving users something they can actually 

I know. I'd submit it, or work on breaking it into pieces and submitting 
them one at a time, but that seems to me to be a waste of time.

Nigel
--

From: Rafael J. Wysocki
Date: Wednesday, February 20, 2008 - 1:45 pm

Okay, I think I'll just start sending patches for that, but rather not earlier
than in the 2.6.27 time frame.  No one else works on that and I've been busy
with other things recently.  Besides, I'm not even a full time kernel


Yes, we can (hopefully).

Thanks,
Rafael
--

From: Alexey Starikovskiy
Date: Wednesday, February 20, 2008 - 2:26 pm

Rafael,
If I can help, please say so.

Regards,

--

From: Pablo Sanchez
Date: Wednesday, February 20, 2008 - 1:33 pm

On Wednesday 20 February 2008 at 3:29 pm, Linus Torvalds penned

I can't say I even come close to understanding what's going on but
getting s2ram to work on my Dell M4300 has been a nightmare.  Even
after writing up how to get it to work (posted on the suspend-devel
list - but no one answered .. yet again), I'm having some quirks.

If I had a bizillion $'s, I'd buy an M4300 for Linus and give him a
million to get it to s2ram!  :p

Cheers,
-- 
Pablo Sanchez - Blueoak Database Engineering, Inc
Ph:    819.459.1926          Toll free:  888.459.1926
Fax:   603.720.7723 (US)     Text Page:  pablo_p@blueoakdb.com

--

From: Jesse Barnes
Date: Wednesday, February 20, 2008 - 2:37 pm

Ok, can you give this patch a try with the 'platform' method?  It should at 
least tell us what ACPI would like the device to do at suspend time, but it 
probably won't fix the hang.

Thanks,
Jesse

diff --git a/drivers/char/drm/i915_drv.c b/drivers/char/drm/i915_drv.c
index 4048f39..d8aa2c9 100644
--- a/drivers/char/drm/i915_drv.c
+++ b/drivers/char/drm/i915_drv.c
@@ -366,11 +366,11 @@ static int i915_suspend(struct drm_device *dev, pm_message_t state)
 
 	i915_save_vga(dev);
 
-	if (state.event == PM_EVENT_SUSPEND) {
-		/* Shut down the device */
-		pci_disable_device(dev->pdev);
-		pci_set_power_state(dev->pdev, PCI_D3hot);
-	}
+	/* Ask ACPI which state the device should be put in */
+	pci_disable_device(dev->pdev);
+	printk("calling pci_set_power_state with %d\n",
+	       acpi_pci_choose_state(dev, state));
+	pci_set_power_state(dev->pdev, acpi_pci_choose_state(dev, state));
 
 	return 0;
 }
@@ -380,7 +380,7 @@ static int i915_resume(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	int i;
 
-	pci_set_power_state(dev->pdev, PCI_D0);
+	pci_set_power_state(dev->pdev, acpi_pci_choose_state(dev, state));
 	pci_restore_state(dev->pdev);
 	if (pci_enable_device(dev->pdev))
 		return -1;
--

From: Jeff Chua
Date: Wednesday, February 20, 2008 - 5:35 pm

I can't get it to compile.

drivers/char/drm/i915_drv.c: In function 'i915_suspend':
drivers/char/drm/i915_drv.c:372: error: implicit declaration of
function 'acpi_pci_choose_state'
drivers/char/drm/i915_drv.c: In function 'i915_resume':
drivers/char/drm/i915_drv.c:383: error: 'state' undeclared (first use
in this function)
drivers/char/drm/i915_drv.c:383: error: (Each undeclared identifier is
reported only once
drivers/char/drm/i915_drv.c:383: error: for each function it appears in.)
make[3]: *** [drivers/char/drm/i915_drv.o] Error 1
make[2]: *** [drivers/char/drm] Error 2
make[1]: *** [drivers/char] Error 2
make: *** [drivers] Error 2

Thanks,
Jeff.
--

From: Jesse Barnes
Date: Wednesday, February 20, 2008 - 5:39 pm

And this change should just be reverted (leave it as PCI_D0).

Thanks,
Jesse
--

From: Jeff Chua
Date: Wednesday, February 20, 2008 - 6:19 pm

drivers/char/drm/i915_drv.c: In function 'i915_suspend':
drivers/char/drm/i915_drv.c:372: warning: passing argument 1 of
'pci_choose_state' from incompatible pointer type
drivers/char/drm/i915_drv.c:373: warning: passing argument 1 of
'pci_choose_state' from incompatible pointer type

I hope those are just warnings that can be ignored.

Ok, rebooting and will get back shortly.

Thanks,
Jeff.
--

From: Jesse Barnes
Date: Wednesday, February 20, 2008 - 6:21 pm

Oops again, should be dev->pdev.  Silly DRM layer obfuscation.
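
The warning Jeff saw comes from handing the DRM wrapper to a helper that expects the PCI device. Here is a minimal userspace sketch of that relationship. Every `mock_` type and function is a hypothetical stand-in, not the kernel's definition, and the mock chooser simply maps a suspend event to D3hot instead of asking the platform.

```c
/* Mock stand-ins for kernel types; every mock_ name is hypothetical. */
enum mock_pci_power { MOCK_PCI_D0 = 0, MOCK_PCI_D3HOT = 3 };
enum mock_pm_event { MOCK_PM_EVENT_ON = 0, MOCK_PM_EVENT_SUSPEND = 2 };

struct mock_pm_message { int event; };

struct mock_pci_dev { const char *name; };

/* The DRM device only wraps the PCI device; it is not one itself. */
struct mock_drm_device { struct mock_pci_dev *pdev; };

/* Takes the PCI device, like the helper named in the compiler warning. */
static enum mock_pci_power
mock_pci_choose_state(struct mock_pci_dev *pdev, struct mock_pm_message state)
{
	(void)pdev; /* a real implementation would consult the platform */
	return state.event == MOCK_PM_EVENT_SUSPEND ? MOCK_PCI_D3HOT
						    : MOCK_PCI_D0;
}

/* Driver-side helper: unwrap the DRM device before calling the PCI layer. */
static enum mock_pci_power
mock_i915_target_state(struct mock_drm_device *dev,
		       struct mock_pm_message state)
{
	/* Passing 'dev' here would draw the incompatible-pointer warning. */
	return mock_pci_choose_state(dev->pdev, state);
}
```

In this sketch, calling `mock_i915_target_state()` with a suspend event yields `MOCK_PCI_D3HOT`, which matches the "3" printed by the debug patch later in the thread.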

Jesse
--

From: Jeff Chua
Date: Wednesday, February 20, 2008 - 6:49 pm

I was just about to write that the test didn't work. Both STD and STR
hang even before attempting to suspend.

Anyway, I'm compiling and rebooting now.

Thanks,
Jeff.
--

From: Jeff Chua
Date: Wednesday, February 20, 2008 - 7:00 pm

It says "calling pci_set_power_state with 3". After that it still
hangs, and then resumes with Mr Green.

PM: Syncing filesystems ... done.
Freezing user space processes ... (elapsed 0.00 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
PM: Shrinking memory... done (0 pages freed)
PM: Freed 0 kbytes in 0.20 seconds (0.00 MB/s)
ACPI: Preparing to enter system sleep state S4
Suspending console(s)
sd 0:0:0:0: [sda] Synchronizing SCSI cache
drm_sysfs_suspend
ACPI: PCI interrupt for device 0000:00:02.0 disabled
calling pci_set_power_state with 3
ACPI: PCI interrupt for device 0000:00:1d.7 disabled
ACPI: PCI interrupt for device 0000:00:1d.3 disabled
ACPI: PCI interrupt for device 0000:00:1d.2 disabled
ACPI: PCI interrupt for device 0000:00:1d.1 disabled
ACPI: PCI interrupt for device 0000:00:1d.0 disabled
ACPI: PCI interrupt for device 0000:00:1b.0 disabled
Disabling non-boot CPUs ...
PM: Creating hibernation image:
PM: Need to copy 25136 pages
tick-broadcast: ignoring broadcast for offline CPU #1
PM: Writing back config space on device 0000:00:02.0 at offset 1 (was
900007, writing 900003)
ACPI: PCI Interrupt 0000:00:1b.0[B] -> GSI 17 (level, low) -> IRQ 17
PCI: Setting latency timer of device 0000:00:1b.0 to 64
PCI: Setting latency timer of device 0000:00:1c.0 to 64
PCI: Setting latency timer of device 0000:00:1c.1 to 64
...


Thanks,
Jeff.
--

From: Rafael J. Wysocki
Date: Thursday, February 21, 2008 - 9:27 am

So it returns the right value.

Jeff, Jesse, please check one thing for me.

Please boot 2.6.25-rc2 (or better, the current head of Linus' tree) with
no_console_suspend and try to do the following:

# echo 8 > /proc/sys/kernel/printk
# echo core > /sys/power/pm_test
# echo disk > /sys/power/state

(that will run a test of the freeze/unfreeze code without creating the image)
and then

# echo mem > /sys/power/state

(that will run a test of the suspend/resume code without actually suspending).

I'd like to know if that works.
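
The pm_test steps above can also be driven from a small C helper. This is only a sketch: `write_knob()` and `run_core_test()` are hypothetical names, and the directory is a parameter so the code can be tried against a scratch directory before pointing it at /sys/power (which needs root). It does not set the printk level.

```c
#include <stdio.h>

/* Write one value to <root>/<file>; returns 0 on success, -1 on failure. */
static int write_knob(const char *root, const char *file, const char *value)
{
	char path[256];
	FILE *f;
	int n, rc;

	n = snprintf(path, sizeof(path), "%s/%s", root, file);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	rc = (fputs(value, f) == EOF) ? -1 : 0;
	if (fclose(f) == EOF)
		rc = -1;
	return rc;
}

/* Select the "core" test level, then trigger one sleep state
 * ("disk" or "mem") in the given power directory. */
static int run_core_test(const char *power_dir, const char *sleep_state)
{
	if (write_knob(power_dir, "pm_test", "core") != 0)
		return -1;
	return write_knob(power_dir, "state", sleep_state);
}
```

On the real system this would be `run_core_test("/sys/power", "disk")` followed by `run_core_test("/sys/power", "mem")`, mirroring the echo commands.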

Thanks,
Rafael
--

From: Jesse Barnes
Date: Thursday, February 21, 2008 - 11:34 am

That comes back for me, without creating the green screen.  There's a long 
delay between it saying "entering S4" and actually resuming back to my 

This also works (after doing the echo disk > ... above).  There's still a 
delay between "entering S3" and the resume to my console though.

Jesse
--

From: Rafael J. Wysocki
Date: Thursday, February 21, 2008 - 1:30 pm

There's an intentional 5 sec. wait.  If the delay is longer than 5 sec., that's a


If that's 5 sec., it's fine.

Please apply the appended patch and try to hibernate.  I wonder if you get the
reboot or it hangs earlier.

Thanks,
Rafael

---
 kernel/power/disk.c |    7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -405,11 +405,7 @@ int hibernation_platform_enter(void)
 
 	local_irq_disable();
 	error = device_power_down(PMSG_SUSPEND);
-	if (!error) {
-		hibernation_ops->enter();
-		/* We should never get here */
-		while (1);
-	}
+	mdelay(1000);
 	local_irq_enable();
 
 	/*
@@ -424,6 +420,7 @@ int hibernation_platform_enter(void)
 	resume_console();
  Close:
 	hibernation_ops->end();
+	kernel_restart(NULL);
 	return error;
 }
 
--

From: Rafael J. Wysocki
Date: Thursday, February 21, 2008 - 3:11 pm

Below is a patch that should work around the issue.  Please try it and let me
know if it helps.

Thanks,
Rafael

---
 drivers/char/drm/i915_drv.c |    3 +++
 include/linux/suspend.h     |    2 ++
 kernel/power/disk.c         |    9 ++++++++-
 3 files changed, 13 insertions(+), 1 deletion(-)

Index: linux-2.6/include/linux/suspend.h
===================================================================
--- linux-2.6.orig/include/linux/suspend.h
+++ linux-2.6/include/linux/suspend.h
@@ -209,6 +209,7 @@ extern unsigned long get_safe_page(gfp_t
 
 extern void hibernation_set_ops(struct platform_hibernation_ops *ops);
 extern int hibernate(void);
+extern bool in_hibernation_power_off(void);
 #else /* CONFIG_HIBERNATION */
 static inline int swsusp_page_is_forbidden(struct page *p) { return 0; }
 static inline void swsusp_set_page_free(struct page *p) {}
@@ -216,6 +217,7 @@ static inline void swsusp_unset_page_fre
 
 static inline void hibernation_set_ops(struct platform_hibernation_ops *ops) {}
 static inline int hibernate(void) { return -ENOSYS; }
+static inline bool in_hibernation_power_off(void) { return false; }
 #endif /* CONFIG_HIBERNATION */
 
 #ifdef CONFIG_PM_SLEEP
Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -24,7 +24,7 @@
 
 #include "power.h"
 
-
+static bool entering_sleep_state;
 static int noresume = 0;
 static char resume_file[256] = CONFIG_PM_STD_PARTITION;
 dev_t swsusp_resume_device;
@@ -381,6 +381,7 @@ int hibernation_platform_enter(void)
 	if (!hibernation_ops)
 		return -ENOSYS;
 
+	entering_sleep_state = true;
 	/*
 	 * We have cancelled the power transition by running
 	 * hibernation_ops->finish() before saving the image, so we should let
@@ -412,6 +413,7 @@ int hibernation_platform_enter(void)
 	}
 	local_irq_enable();
 
+	entering_sleep_state = false;
 	/*
 	 * We don't need to reenable the ...
--

From: Jesse Barnes
Date: Thursday, February 21, 2008 - 4:45 pm

I ended up applying the below patch instead, so it would build, and 
unfortunately it still hung at suspend time.

So at this point, the known workarounds to the hang at suspend time are to 
remove the device power down call or to boot with 'no_console_suspend'.  
The 'screen turns green' problem is fixed by the extra 'inb' added in the 
patch below (at least for me).
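
For readers unfamiliar with the "index mode" comment: on VGA hardware the attribute controller takes index and data writes through a single port, with an internal flip-flop deciding which one comes next, and a read of the input status register resets that flip-flop to index. The userspace sketch below models only that general behaviour. The `mock_` names are hypothetical, it touches no hardware, and it is not a claim about exactly why the extra `inb` helps on this machine.

```c
#include <stdbool.h>
#include <stdint.h>

#define MOCK_AR_COUNT 20

/* Hypothetical model of the VGA attribute controller's shared port. */
struct mock_attr_ctrl {
	bool expect_data;            /* flip-flop: false = index, true = data */
	uint8_t index;               /* last latched register index */
	uint8_t regs[MOCK_AR_COUNT]; /* attribute registers */
};

/* Models a read of the input status register: resets the flip-flop. */
static void mock_read_status(struct mock_attr_ctrl *ac)
{
	ac->expect_data = false;
}

/* Models a write to the shared index/data port; toggles the flip-flop. */
static void mock_write_port(struct mock_attr_ctrl *ac, uint8_t val)
{
	if (ac->expect_data) {
		if (ac->index < MOCK_AR_COUNT)
			ac->regs[ac->index] = val;
	} else {
		ac->index = val & 0x1f;
	}
	ac->expect_data = !ac->expect_data;
}

/* Restore all registers as index/data pairs; optionally reset the
 * flip-flop first, and always leave it in index mode afterwards. */
static void mock_restore(struct mock_attr_ctrl *ac, const uint8_t *saved,
			 bool reset_first)
{
	int i;

	if (reset_first)
		mock_read_status(ac);
	for (i = 0; i < MOCK_AR_COUNT; i++) {
		mock_write_port(ac, (uint8_t)i);
		mock_write_port(ac, saved[i]);
	}
	mock_read_status(ac);
}
```

If the model starts with the flip-flop in the data state and `reset_first` is false, every index/data pair lands shifted by one write and the registers no longer match the saved values. With `reset_first` set, they do.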

Jesse

diff --git a/drivers/char/drm/i915_drv.c b/drivers/char/drm/i915_drv.c
index 35758a6..35b5a60 100644
--- a/drivers/char/drm/i915_drv.c
+++ b/drivers/char/drm/i915_drv.c
@@ -27,6 +27,7 @@
  *
  */
 
+#include <linux/suspend.h>
 #include "drmP.h"
 #include "drm.h"
 #include "i915_drm.h"
@@ -222,6 +223,7 @@ static void i915_restore_vga(struct drm_device *dev)
 			   dev_priv->saveGR[0x18]);
 
 	/* Attribute controller registers */
+	inb(st01); /* switch back to index mode */
 	for (i = 0; i < 20; i++)
 		i915_write_ar(st01, i, dev_priv->saveAR[i], 0);
 	inb(st01); /* switch back to index mode */
@@ -249,6 +251,9 @@ static int i915_suspend(struct drm_device *dev)
 		return -ENODEV;
 	}
 
+	if (in_hibernation_power_off())
+		return 0;
+
 	pci_save_state(dev->pdev);
 	pci_read_config_byte(dev->pdev, LBB, &dev_priv->saveLBB);
 
@@ -364,7 +369,6 @@ static int i915_suspend(struct drm_device *dev)
 	i915_save_vga(dev);
 
 	/* Shut down the device */
-	pci_disable_device(dev->pdev);
 	pci_set_power_state(dev->pdev, PCI_D3hot);
 
 	return 0;
diff --git a/include/linux/suspend.h b/include/linux/suspend.h
index 1d7d4c5..58d9f67 100644
--- a/include/linux/suspend.h
+++ b/include/linux/suspend.h
@@ -209,6 +209,7 @@ extern unsigned long get_safe_page(gfp_t gfp_mask);
 
 extern void hibernation_set_ops(struct platform_hibernation_ops *ops);
 extern int hibernate(void);
+extern bool in_hibernation_power_off(void);
 #else /* CONFIG_HIBERNATION */
 static inline int swsusp_page_is_forbidden(struct page *p) { return 0; }
 static inline void swsusp_set_page_free(struct page *p) {}
@@ -216,6 +217,7 ...
--

From: Rafael J. Wysocki
Date: Thursday, February 21, 2008 - 5:28 pm

This thing should make i915_suspend() a noop in the last phase of hibernation,
so if it still only works when you remove the
pci_set_power_state(dev->pdev, PCI_D3hot), then I don't get it.

Can you please try the patch below instead?

Thanks,
Rafael



---
 drivers/char/drm/i915_drv.c |    5 +++--
 include/linux/suspend.h     |    2 ++
 kernel/power/disk.c         |   10 +++++++++-
 3 files changed, 14 insertions(+), 3 deletions(-)

Index: linux-2.6/drivers/char/drm/i915_drv.c
===================================================================
--- linux-2.6.orig/drivers/char/drm/i915_drv.c
+++ linux-2.6/drivers/char/drm/i915_drv.c
@@ -27,6 +27,7 @@
  *
  */
 
+#include <linux/suspend.h>
 #include "drmP.h"
 #include "drm.h"
 #include "i915_drm.h"
@@ -222,6 +223,7 @@ static void i915_restore_vga(struct drm_
 			   dev_priv->saveGR[0x18]);
 
 	/* Attribute controller registers */
+	inb(st01); /* switch back to index mode */
 	for (i = 0; i < 20; i++)
 		i915_write_ar(st01, i, dev_priv->saveAR[i], 0);
 	inb(st01); /* switch back to index mode */
@@ -366,9 +368,8 @@ static int i915_suspend(struct drm_devic
 
 	i915_save_vga(dev);
 
-	if (state.event == PM_EVENT_SUSPEND) {
+	if (state.event == PM_EVENT_SUSPEND && !in_hibernation_power_off()) {
 		/* Shut down the device */
-		pci_disable_device(dev->pdev);
 		pci_set_power_state(dev->pdev, PCI_D3hot);
 	}
 
Index: linux-2.6/include/linux/suspend.h
===================================================================
--- linux-2.6.orig/include/linux/suspend.h
+++ linux-2.6/include/linux/suspend.h
@@ -209,6 +209,7 @@ extern ...
--

From: Jeff Chua
Date: Thursday, February 21, 2008 - 5:48 pm

I encountered the same patching problem, but realized that it was due
to the earlier patch that you had wanted me to test, so if you revert your
patch back to the current git, Rafael's patch will apply and compile
cleanly.

Thanks,
Jeff.
--

From: Jesse Barnes
Date: Wednesday, February 20, 2008 - 3:32 pm

Jeff, for the hang on suspend problem, I now suspect something else in 
2.6.25-rc2 caused that.

Can you try the 2.6.25-rc1 version of i915_drv.c (in fact all of 
drivers/char/drm from 2.6.25-rc1) but in a 2.6.25-rc2 kernel?  I ask because 
2.6.25-rc1 suspends to disk just fine for me and resumes w/o a green screen, 
while 2.6.25-rc2 fails to suspend (hangs like you say) and gives me a green 
screen.

Were there other changes in ACPI or the PM core that might have caused this I 
wonder?

Thanks,
Jesse
--

From: Jesse Barnes
Date: Wednesday, February 20, 2008 - 4:03 pm

Looks like 2.6.25-rc1 also had broken suspend (my test was broken).  IIRC, 
Dave and I had it working at LCA using the out of tree DRM modules on 
2.6.23.14 or 15...  Maybe you could give that a try?

Thanks,
Jesse
--

From: Jesse Barnes
Date: Wednesday, February 20, 2008 - 4:34 pm

And just to confirm that, I just tested the current DRM modules against a 
2.6.23.15 kernel.  It suspends to disk correctly (w/o a hang) and doesn't 
give me a green screen, so something in 2.6.25 must be causing that (even 
2.6.25-rc1 seems to have the problem).

Also, this patch against 2.6.25-rc1 seemed to prevent the 'green screen' 
problem.  2.6.25-rc2 already has part of it...

Anyway, let me know how your testing goes.

Thanks,
Jesse
--

From: Rafael J. Wysocki
Date: Wednesday, February 20, 2008 - 4:49 pm

In 2.6.23.x there's no second ->suspend() during hibernation, so no wonder.

I'll figure out how to work around this issue in the current mainline, but a
real fix will only be possible when we have separate callbacks for
hibernation.
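
A rough userspace model of the workaround idea: the driver's suspend callback runs a second time just before the platform enters S4, and a flag tells it to skip the power-down on that call. Only the name `in_hibernation_power_off()` mirrors the helper from the patches in this thread. The `mock_` names, the event values, and the counter are hypothetical, and the assumption that the first hibernation call carries a "freeze" event is a simplification.

```c
#include <stdbool.h>

/* Hypothetical event types standing in for the PM message events. */
enum mock_event { MOCK_EVENT_FREEZE, MOCK_EVENT_SUSPEND };

static bool entering_sleep_state; /* set only around the final power-off */
static int power_down_calls;      /* how often the device went to D3hot */

static bool in_hibernation_power_off(void)
{
	return entering_sleep_state;
}

/* Driver callback: power the device down unless hibernation is finishing. */
static int mock_i915_suspend(enum mock_event event)
{
	if (event == MOCK_EVENT_SUSPEND && !in_hibernation_power_off())
		power_down_calls++; /* stands in for the D3hot transition */
	return 0;
}

/* Suspend-to-RAM: one callback, and the device is powered down. */
static void mock_suspend_to_ram(void)
{
	mock_i915_suspend(MOCK_EVENT_SUSPEND);
}

/* Hibernation: the callback runs before the image and again before S4. */
static void mock_hibernate(void)
{
	mock_i915_suspend(MOCK_EVENT_FREEZE);  /* before creating the image */
	/* ... the image would be created and written out here ... */
	entering_sleep_state = true;
	mock_i915_suspend(MOCK_EVENT_SUSPEND); /* second call, before S4 */
	entering_sleep_state = false;
}
```

In this model `mock_suspend_to_ram()` powers the device down once, while `mock_hibernate()` never does, which is the behaviour the flag is meant to give until hibernation has its own callbacks.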

Thanks,
Rafael
--

From: Jesse Barnes
Date: Wednesday, February 20, 2008 - 5:17 pm

In 2.6.23 it's just:
  ->suspend()
  ->resume()
  *S4*
?

I ask because we still do the D3hot call in the DRM tree, so the hang should 

Ok, thanks.

Jesse
--

From: Rafael J. Wysocki
Date: Wednesday, February 20, 2008 - 6:07 pm

->shutdown()

(that breaks wake up from S4 with many devices, including but not limited to

Thanks,
Rafael
--

From: Mark Lord
Date: Wednesday, February 20, 2008 - 11:47 am

..

Does this machine have more than one CPU core?  If so..
Does your kernel have CONFIG_HOTPLUG_CPU=y (if not, enable it).

??
--
