Hello, for some time now I had the problem that after a suspend to disk network connectivitiy was down and also wake-on-lan would not work. First a bit of information, for the real question please see (*) below. Now I just found the ethtool -d (register dump) option and decided I want to try to debug this long-standing bug: http://bugzilla.kernel.org/show_bug.cgi?id=8381 I already found out why network connectivity is down after resume: The interfaces eth0 and eth1 are bridged into br0. The bridging code sets the interfaces into promiscous mode, but this is not properly restored after resume. This bug is easily fixable by calling nv_set_multicast(dev) either from within nv_open() or after nv_open() in nv_resume(). The second problem (does not wol after s2disk) is still unsolved though. However I have found that wake-on-lan works fine if I "rmmod forcedeth" before suspending. This is even a usable workaround for me, since the bridge device stays configured and I just have to reload forcedeth and readd the interfaces to the bridge after resume. (*) Now for my question: AFAICS the ordering of events is as follows: 1) User reqests hibernate 2) Tasks are frozen 3) Device suspend callbacks get called 4) Device resume callbacks get called 5) Memory image is written to disk Now, the problem I see with this is that while step 3) prepares the device for suspend (and wake-on-lan), step 4) may undo some of this preparation. In my case, I think the fix for the first bug (promiscous mode does not get restored on resume) breaks wake-on-lan (since the new value of NvRegPacketFilterFlags may be incompatible with wake-on-lan). Shouldn't there be a 'prepare for poweroff'-callback, which gets called before the system is powered off for real? --- linux-2.6.26-rc2/drivers/net/forcedeth.c 2008-05-17 14:22:06.000000000 +0200 +++ linux-2.6.26-rc2/drivers/net/forcedeth.c 2008-05-17 19:48:56.000000000 +0200 @@ -5823,6 +5823,7 @@ writel(txreg, base + NvRegTransmitPoll); ...
6) Device suspend callbacks get called again to prepare for the system sleep Thanks for the patch, it looks sane to me. Rafael --
Hmm, I guess I don't see that because by then the console is disabled or something like that? (It also doesn't help that the system immediately powers down after that) Can I get debugging output during that phase out onto the serial console and/or plain text console & simply turn the poweroff into a halt? (halt in addition to shutdown and reboot in /sys/power/disk might be nice) The problem is I'm really seeing/have seen multiple issues/situations here: 1) When the device is part of a bridge, connectivity is down after Hibernate 2) Boot->Hibernate->WOL works->Hibernate->WOL only works with byte-reversed MAC-address 3) I have to reconfirm this, but I think after fixing bug 1) with the patch of the previous mail bug 2) turns into 4): 4) WOL doesn't work at all after Hibernate 5) nforce ethernet and netgear switch don't like each other and sometimes negotiate only 100Mbit instead of 1Gbit, while my notebook on a much longer cable never has negotiation problems 6) In the past sometimes negotiation would take very long, that one I haven't seen in some time so I assume it's fixed in recent kernels. This one only happened if both ports were enabled IIRC "Device suspend callbacks get called again" means I have to retest 3) Maybe I'm imagining things or seeing a heisenbug here? 1) Is fixed by the patch in the previous mail 2) I assume to be related to DEV_HAS_CORRECT_MACADDR, NVREG_TRANSMITPOLL_MAC_ADDR_REV and some magic byteswapping code in the driver. 1) and 2)/4) can be worked around (for the bridge case) by removing the forcedeth module prior to hibernate and reinserting the module and readding the devices to the bridge after resume. -- Tobias PGP: http://9ac7e0bc.uguu.de このメールは十割再利用されたビットで作られています。 --
According to my pci postcard this is not the case. I modified nv_suspend to wiggle port 0x80 with mdelay(100) between 0xa0/0x50 wiggles and mdelay(1000) before returning, so it is easily countable. nv_suspend is called only once per ethernet port (step 3 above). I reconfirmed this. Without the promiscous mode bugfix, wake-on-lan (partially) works. With the promiscous mode bugfix the machine doesn't wake up. This is also easily explainable by 6) not For what values of 'recent'? I'm running 2.6.26-rc2 here. :) How do I hook into this callback? (suspend_late maybe? Going to try that one next...) -- Tobias PGP: http://9ac7e0bc.uguu.de このメールは十割再利用されたビットで作られています。 --
Ok, this patch fixes the 'regression' introduced by the previous
patch (at least for me ;)):
Index: linux-2.6.26-rc2.forcedwol/drivers/net/forcedeth.c
===================================================================
--- linux-2.6.26-rc2.forcedwol.orig/drivers/net/forcedeth.c 2008-05-18 01:10:18.000000000 +0200
+++ linux-2.6.26-rc2.forcedwol/drivers/net/forcedeth.c 2008-05-18 01:11:33.000000000 +0200
@@ -5828,8 +5828,24 @@
out:
return rc;
}
+
+static void nv_shutdown(struct pci_dev *pdev)
+{
+ struct net_device *dev = pci_get_drvdata(pdev);
+ struct fe_priv *np = netdev_priv(dev);
+ u8 __iomem *base = get_hwbase(dev);
+
+ if (netif_running(dev))
+ nv_close(dev);
+
+ pci_enable_wake(pdev, PCI_D3hot, np->wolenabled);
+ pci_enable_wake(pdev, PCI_D3cold, np->wolenabled);
+ pci_disable_device(pdev);
+ pci_set_power_state(pdev, PCI_D3hot);
+}
#else
#define nv_suspend NULL
+#define nv_shutdown NULL
#define nv_resume NULL
#endif /* CONFIG_PM */
@@ -6000,6 +6016,7 @@
.remove = __devexit_p(nv_remove),
.suspend = nv_suspend,
.resume = nv_resume,
+ .shutdown = nv_shutdown,
};
static int __init init_nic(void)
--
Tobias PGP: http://9ac7e0bc.uguu.de
このメールは十割再利用されたビットで作られています。
--
--
/usr/local/bin/s2disk: |#!/bin/sh | |SYSPRINTK=/proc/sys/kernel/printk |OLDPRINTK=$(cat $SYSPRINTK) |chvt 1 |# work around forcedeth wake-on-lan bug |#rmmod forcedeth |echo 9 > $SYSPRINTK |# shutdown/reboot/platform |echo shutdown > /sys/power/disk |echo disk > /sys/power/state |echo "$OLDPRINTK" > $SYSPRINTK |# work around forcedeth wake-on-lan bug |#modprobe forcedeth |#brctl addif br0 eth0 |#brctl addif br0 eth1 |#ifconfig eth0 up |#ifconfig eth1 up I now also now why I get the 'swapped mac' problem: In my case (DEV_HAS_CORRECT_MACADDR unset and NVREG_TRANSMITPOLL_MAC_ADDR_REV unset on probe) nv_probe() reads the original macaddress in reversed order to correct it and sets NVREG_TRANSMITPOLL_MAC_ADDR_REV. However the suspend/resume path does not touch the mac address at all, so after a cold boot and resume from disk the macaddress is again in 'BIOS order' and NVREG_TRANSMITPOLL_MAC_ADDR_REV is unset, but NVREG_TRANSMITPOLL_MAC_ADDR_REV gets set unconditionally in nv_resume(), so the net effect is that the MAC is now reversed. With the promisc fix this is hidden for my brige setup and only noticeable at the next hibernate, where I now have to use a reversed MAC to wakeup the device. The following patch (on top of the other two) fixes this by saving and restoring the memory mapped device configuration space in suspend/resume. The shutdown hook is still needed since the promisc mode seems to be incompatible with wake on lan. It also reorders the code in suspend/resume to match that in e100/e1000e. (configuration space is also saved for devices that are not running) Index: linux-2.6.26-rc2.forcedwol/drivers/net/forcedeth.c =================================================================== --- linux-2.6.26-rc2.forcedwol.orig/drivers/net/forcedeth.c 2008-05-18 11:26:12.000000000 +0200 +++ linux-2.6.26-rc2.forcedwol/drivers/net/forcedeth.c 2008-05-18 13:00:23.000000000 +0200 @@ -426,6 +426,7 @@ #define NV_PCI_REGSZ_VER1 0x270 #define ...
You don't see the devices' suspend callbacks invoked before powering off I'm unable to comment on the device-specific things, but apart from this the I'm not sure if it's possible to put the device into D3cold without actually Thanks, Rafael --
Hmm, I think someone recommended to use shutdown since that is supposed to be more reliable. Anyway, I think the shutdown hook is good anyway since I suspect that if I do a normal system shutdown with promiscous mode still on, then wake-on-lan would also not work? I haven't checked that though. Ok, I've cleaned it up a bit and split it up properly (I think): http://www.tdiedrich.de/~ranma/patches/forcedeth/2.6.26-rc2/ series: forcedeth.resume.promisc.patch forcedeth.hibernate.wol.patch forcedeth.swappedmac.patch forcedeth.reorder_suspend_resume.patch forcedeth.resume.promisc.patch: nv_open() resets multicast settings, call nv_set_multicast(dev) to restore them. (Maybe this should rather be moved into nv_open()) Index: linux-2.6.26-rc2.forcedwol/drivers/net/forcedeth.c =================================================================== --- linux-2.6.26-rc2.forcedwol.orig/drivers/net/forcedeth.c 2008-05-18 13:52:59.000000000 +0200 +++ linux-2.6.26-rc2.forcedwol/drivers/net/forcedeth.c 2008-05-18 13:53:02.000000000 +0200 @@ -5823,6 +5823,7 @@ writel(txreg, base + NvRegTransmitPoll); rc = nv_open(dev); + nv_set_multicast(dev); out: return rc; } forcedeth.hibernate.wol.patch: When hibernating in 'shutdown' mode, after saving the image the suspend hook is not called again. However, if the device is in promiscous mode, wake-on-lan will not work. This adds a shutdown hook to setup wake-on-lan before the final shutdown. Index: linux-2.6.26-rc2.forcedwol/drivers/net/forcedeth.c =================================================================== --- linux-2.6.26-rc2.forcedwol.orig/drivers/net/forcedeth.c 2008-05-18 13:53:02.000000000 +0200 +++ linux-2.6.26-rc2.forcedwol/drivers/net/forcedeth.c 2008-05-18 13:53:06.000000000 +0200 @@ -5827,8 +5827,23 @@ out: return rc; } + +static void nv_shutdown(struct pci_dev *pdev) +{ + struct net_device *dev = pci_get_drvdata(pdev); + struct fe_priv *np = netdev_priv(dev); + + if ...
Thanks, Rafael --
| Jesse Barnes | Re: [stable] [BUG][PATCH] cpqphp: fix kernel NULL pointer dereference |
| Greg KH | [003/136] p54usb: add Zcomax XG-705A usbid |
| Magnus Damm |
