Re: hibernate event order question

Previous thread: pcspkr lockdep trace. by Dave Jones on Saturday, May 17, 2008 - 10:26 am. (2 messages)

Next thread: Re: XFS/md/blkdev warning (was Re: Linux 2.6.26-rc2) by Alistair John Strachan on Saturday, May 17, 2008 - 11:22 am. (8 messages)
From: Tobias Diedrich
Date: Saturday, May 17, 2008 - 10:50 am

Hello,

for some time now I had the problem that after a suspend to disk
network connectivitiy was down and also wake-on-lan would not work.

First a bit of information, for the real question please see (*)
below.

Now I just found the ethtool -d (register dump) option
and decided I want to try to debug this long-standing bug:
http://bugzilla.kernel.org/show_bug.cgi?id=8381

I already found out why network connectivity is down after resume:

The interfaces eth0 and eth1 are bridged into br0.
The bridging code sets the interfaces into promiscous mode, but this
is not properly restored after resume.
This bug is easily fixable by calling nv_set_multicast(dev) either
from within nv_open() or after nv_open() in nv_resume().

The second problem (does not wol after s2disk) is still unsolved
though.  However I have found that wake-on-lan works fine if I
"rmmod forcedeth" before suspending.  This is even a usable
workaround for me, since the bridge device stays configured and I
just have to reload forcedeth and readd the interfaces to the bridge
after resume.

(*)
Now for my question:
AFAICS the ordering of events is as follows:
1) User reqests hibernate
2) Tasks are frozen
3) Device suspend callbacks get called
4) Device resume callbacks get called
5) Memory image is written to disk

Now, the problem I see with this is that while step 3) prepares the
device for suspend (and wake-on-lan), step 4) may undo some of this
preparation.

In my case, I think the fix for the first bug (promiscous mode does
not get restored on resume) breaks wake-on-lan (since the new value
of NvRegPacketFilterFlags may be incompatible with wake-on-lan).

Shouldn't there be a 'prepare for poweroff'-callback, which gets called
before the system is powered off for real?

--- linux-2.6.26-rc2/drivers/net/forcedeth.c	2008-05-17 14:22:06.000000000 +0200
+++ linux-2.6.26-rc2/drivers/net/forcedeth.c	2008-05-17 19:48:56.000000000 +0200
@@ -5823,6 +5823,7 @@
 	writel(txreg, base + NvRegTransmitPoll);
 ...
From: Rafael J. Wysocki
Date: Saturday, May 17, 2008 - 12:56 pm

6) Device suspend callbacks get called again to prepare for the system sleep


Thanks for the patch, it looks sane to me.

Rafael
--

From: Tobias Diedrich
Date: Saturday, May 17, 2008 - 2:03 pm

Hmm, I guess I don't see that because by then the console is disabled or
something like that?
(It also doesn't help that the system immediately powers down after that)
Can I get debugging output during that phase out onto the serial
console and/or plain text console & simply turn the poweroff into a
halt? (halt in addition to shutdown and reboot in /sys/power/disk
might be nice)

The problem is I'm really seeing/have seen multiple issues/situations here:
1) When the device is part of a bridge, connectivity is down after
   Hibernate
2) Boot->Hibernate->WOL works->Hibernate->WOL only works with
   byte-reversed MAC-address
3) I have to reconfirm this, but I think after fixing bug 1) with
   the patch of the previous mail bug 2) turns into 4):
4) WOL doesn't work at all after Hibernate
5) nforce ethernet and netgear switch don't like each other and
   sometimes negotiate only 100Mbit instead of 1Gbit, while my
   notebook on a much longer cable never has negotiation problems
6) In the past sometimes negotiation would take very long, that one
   I haven't seen in some time so I assume it's fixed in recent
   kernels.  This one only happened if both ports were enabled IIRC

"Device suspend callbacks get called again" means I have to retest 3)
Maybe I'm imagining things or seeing a heisenbug here?

1) Is fixed by the patch in the previous mail
2) I assume to be related to DEV_HAS_CORRECT_MACADDR,
NVREG_TRANSMITPOLL_MAC_ADDR_REV and some magic byteswapping code in
the driver.
1) and 2)/4) can be worked around (for the bridge case) by removing
the forcedeth module prior to hibernate and reinserting the module
and readding the devices to the bridge after resume.

-- 
Tobias						PGP: http://9ac7e0bc.uguu.de
このメールは十割再利用されたビットで作られています。
--

From: Tobias Diedrich
Date: Saturday, May 17, 2008 - 3:36 pm

According to my pci postcard this is not the case.
I modified nv_suspend to wiggle port 0x80 with mdelay(100) between
0xa0/0x50 wiggles and mdelay(1000) before returning, so it is easily
countable.
nv_suspend is called only once per ethernet port (step 3 above).

I reconfirmed this.  Without the promiscous mode bugfix,
wake-on-lan (partially) works.  With the promiscous mode bugfix the
machine doesn't wake up.  This is also easily explainable by 6) not

For what values of 'recent'?
I'm running 2.6.26-rc2 here. :)
How do I hook into this callback?
(suspend_late maybe? Going to try that one next...)

-- 
Tobias						PGP: http://9ac7e0bc.uguu.de
このメールは十割再利用されたビットで作られています。
--

From: Tobias Diedrich
Date: Saturday, May 17, 2008 - 4:24 pm

Ok, this patch fixes the 'regression' introduced by the previous
patch (at least for me ;)):

Index: linux-2.6.26-rc2.forcedwol/drivers/net/forcedeth.c
===================================================================
--- linux-2.6.26-rc2.forcedwol.orig/drivers/net/forcedeth.c	2008-05-18 01:10:18.000000000 +0200
+++ linux-2.6.26-rc2.forcedwol/drivers/net/forcedeth.c	2008-05-18 01:11:33.000000000 +0200
@@ -5828,8 +5828,24 @@
 out:
 	return rc;
 }
+
+static void nv_shutdown(struct pci_dev *pdev)
+{
+	struct net_device *dev = pci_get_drvdata(pdev);
+	struct fe_priv *np = netdev_priv(dev);
+	u8 __iomem *base = get_hwbase(dev);
+
+	if (netif_running(dev))
+		nv_close(dev);
+
+	pci_enable_wake(pdev, PCI_D3hot, np->wolenabled);
+	pci_enable_wake(pdev, PCI_D3cold, np->wolenabled);
+	pci_disable_device(pdev);
+	pci_set_power_state(pdev, PCI_D3hot);
+}
 #else
 #define nv_suspend NULL
+#define nv_shutdown NULL
 #define nv_resume NULL
 #endif /* CONFIG_PM */
 
@@ -6000,6 +6016,7 @@
 	.remove		= __devexit_p(nv_remove),
 	.suspend	= nv_suspend,
 	.resume		= nv_resume,
+	.shutdown	= nv_shutdown,
 };
 
 static int __init init_nic(void)

-- 
Tobias						PGP: http://9ac7e0bc.uguu.de
このメールは十割再利用されたビットで作られています。
--

From: Rafael J. Wysocki
Date: Sunday, May 18, 2008 - 3:45 am

From: Tobias Diedrich
Date: Sunday, May 18, 2008 - 4:22 am

/usr/local/bin/s2disk:
|#!/bin/sh
|
|SYSPRINTK=/proc/sys/kernel/printk
|OLDPRINTK=$(cat $SYSPRINTK)
|chvt 1
|# work around forcedeth wake-on-lan bug
|#rmmod forcedeth
|echo 9 > $SYSPRINTK
|# shutdown/reboot/platform
|echo shutdown > /sys/power/disk
|echo disk > /sys/power/state
|echo "$OLDPRINTK" > $SYSPRINTK
|# work around forcedeth wake-on-lan bug
|#modprobe forcedeth
|#brctl addif br0 eth0
|#brctl addif br0 eth1
|#ifconfig eth0 up
|#ifconfig eth1 up

I now also now why I get the 'swapped mac' problem:

In my case (DEV_HAS_CORRECT_MACADDR unset and
NVREG_TRANSMITPOLL_MAC_ADDR_REV unset on probe)
nv_probe() reads the original macaddress in reversed order to
correct it and sets NVREG_TRANSMITPOLL_MAC_ADDR_REV.

However the suspend/resume path does not touch the mac address at
all, so after a cold boot and resume from disk the macaddress is
again in 'BIOS order' and NVREG_TRANSMITPOLL_MAC_ADDR_REV is unset,
but NVREG_TRANSMITPOLL_MAC_ADDR_REV gets set unconditionally in
nv_resume(), so the net effect is that the MAC is now reversed.

With the promisc fix this is hidden for my brige setup and only
noticeable at the next hibernate, where I now have to use a reversed
MAC to wakeup the device.

The following patch (on top of the other two) fixes this by saving
and restoring the memory mapped device configuration space in
suspend/resume.  The shutdown hook is still needed since the promisc
mode seems to be incompatible with wake on lan.

It also reorders the code in suspend/resume to match that in e100/e1000e.
(configuration space is also saved for devices that are not running)

Index: linux-2.6.26-rc2.forcedwol/drivers/net/forcedeth.c
===================================================================
--- linux-2.6.26-rc2.forcedwol.orig/drivers/net/forcedeth.c	2008-05-18 11:26:12.000000000 +0200
+++ linux-2.6.26-rc2.forcedwol/drivers/net/forcedeth.c	2008-05-18 13:00:23.000000000 +0200
@@ -426,6 +426,7 @@
 #define NV_PCI_REGSZ_VER1      	0x270
 #define ...
From: Rafael J. Wysocki
Date: Sunday, May 18, 2008 - 4:32 am

You don't see the devices' suspend callbacks invoked before powering off

I'm unable to comment on the device-specific things, but apart from this the

I'm not sure if it's possible to put the device into D3cold without actually

Thanks,
Rafael
--

From: Tobias Diedrich
Date: Sunday, May 18, 2008 - 5:24 am

Hmm, I think someone recommended to use shutdown since that is
supposed to be more reliable.
Anyway, I think the shutdown hook is good anyway since I suspect
that if I do a normal system shutdown with promiscous mode still on,
then wake-on-lan would also not work?
I haven't checked that though.

Ok, I've cleaned it up a bit and split it up properly (I think):
http://www.tdiedrich.de/~ranma/patches/forcedeth/2.6.26-rc2/
series:
forcedeth.resume.promisc.patch
forcedeth.hibernate.wol.patch
forcedeth.swappedmac.patch
forcedeth.reorder_suspend_resume.patch



forcedeth.resume.promisc.patch:
nv_open() resets multicast settings, call nv_set_multicast(dev) to restore them.
(Maybe this should rather be moved into nv_open())

Index: linux-2.6.26-rc2.forcedwol/drivers/net/forcedeth.c
===================================================================
--- linux-2.6.26-rc2.forcedwol.orig/drivers/net/forcedeth.c	2008-05-18 13:52:59.000000000 +0200
+++ linux-2.6.26-rc2.forcedwol/drivers/net/forcedeth.c	2008-05-18 13:53:02.000000000 +0200
@@ -5823,6 +5823,7 @@
 	writel(txreg, base + NvRegTransmitPoll);
 
 	rc = nv_open(dev);
+	nv_set_multicast(dev);
 out:
 	return rc;
 }



forcedeth.hibernate.wol.patch:
When hibernating in 'shutdown' mode, after saving the image the suspend hook
is not called again.
However, if the device is in promiscous mode, wake-on-lan will not work.
This adds a shutdown hook to setup wake-on-lan before the final shutdown.

Index: linux-2.6.26-rc2.forcedwol/drivers/net/forcedeth.c
===================================================================
--- linux-2.6.26-rc2.forcedwol.orig/drivers/net/forcedeth.c	2008-05-18 13:53:02.000000000 +0200
+++ linux-2.6.26-rc2.forcedwol/drivers/net/forcedeth.c	2008-05-18 13:53:06.000000000 +0200
@@ -5827,8 +5827,23 @@
 out:
 	return rc;
 }
+
+static void nv_shutdown(struct pci_dev *pdev)
+{
+	struct net_device *dev = pci_get_drvdata(pdev);
+	struct fe_priv *np = netdev_priv(dev);
+
+	if ...
From: Rafael J. Wysocki
Date: Sunday, May 18, 2008 - 5:29 am

Thanks,
Rafael
--

Previous thread: pcspkr lockdep trace. by Dave Jones on Saturday, May 17, 2008 - 10:26 am. (2 messages)

Next thread: Re: XFS/md/blkdev warning (was Re: Linux 2.6.26-rc2) by Alistair John Strachan on Saturday, May 17, 2008 - 11:22 am. (8 messages)