Re: [PATCH] sky2: fix hard hang with netconsoling and iface going up

Previous thread: [PATCH] fec: Fix KS8721BL_ICSR phy register offset by Sascha Hauer on Tuesday, January 27, 2009 - 3:40 am. (5 messages)

Next thread: Is this normal? by Paweł Staszewski on Tuesday, January 27, 2009 - 7:05 am. (4 messages)
From: Alexey Dobriyan
Date: Tuesday, January 27, 2009 - 5:01 am

If sky2 interface is netconsoling, then

	ifconfig down; ifconfig up

leads to hard hang after printing

	sky2 eth0: enabling interface

SysRq+b reboots the box, nothing else works (or prints something).

This is with 2.6.28.2 and 2.6.29-rc2-5ee810072175042775e39bdd3eaaa68884c27805.
Most certainly was there since day one.

Haven't investigated further.
--

From: Alexey Dobriyan
Date: Tuesday, January 27, 2009 - 6:27 am

Well, duh...


[PATCH] sky2: fix hard hang with netconsoling and iface going up

Printing anything over netconsole before hw is up and running is,
of course, not going to work.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
---

 drivers/net/sky2.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -1403,9 +1403,6 @@ static int sky2_up(struct net_device *dev)
 
  	}
 
-	if (netif_msg_ifup(sky2))
-		printk(KERN_INFO PFX "%s: enabling interface\n", dev->name);
-
 	netif_carrier_off(dev);
 
 	/* must be power of 2 */
@@ -1484,6 +1481,9 @@ static int sky2_up(struct net_device *dev)
 	sky2_write32(hw, B0_IMSK, imask);
 
 	sky2_set_multicast(dev);
+
+	if (netif_msg_ifup(sky2))
+		printk(KERN_INFO PFX "%s: enabling interface\n", dev->name);
 	return 0;
 
 err_out:
--

From: Alexey Dobriyan
Date: Tuesday, January 27, 2009 - 6:35 am

Oh, this means that most printks on error path in ->ndo_open are broken
and every early enough "I'm up!" printk is broken is well. :-(
--

From: Stephen Hemminger
Date: Tuesday, January 27, 2009 - 10:49 am

On Tue, 27 Jan 2009 16:35:04 +0300

Netconsole should be smart enough to not try and print to
a device that is not marked running.
--

From: David Miller
Date: Tuesday, January 27, 2009 - 11:19 am

From: Stephen Hemminger <shemminger@linux-foundation.org>

I think the hardware is marked running.

This is at the beginning of the driver's ->open() method.

The netpoll layer is in fact checking those states:

static void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb)
{
 ...
	if (!npinfo || !netif_running(dev) || !netif_device_present(dev)) {
		__kfree_skb(skb);
		return;
	}
--

From: David Miller
Date: Thursday, January 29, 2009 - 6:00 pm

From: Alexey Dobriyan <adobriyan@gmail.com>

Alexey, can you get this tested by the reporter to make sure
it really fixes the hang?

Once it's been tested, I'll apply it.

Thank you.
--

From: David Miller
Date: Thursday, January 29, 2009 - 6:09 pm

From: David Miller <davem@davemloft.net>

Actually, I just read the followup where you indicated that
other debugging printk's in this driver might also have the
same problem.

I'm dropping this until everything is resolved, thanks.
--

From: Alexey Dobriyan
Date: Friday, January 30, 2009 - 4:59 am

As for other printks, someone with 8139cp should experience the same bug.
--

From: David Miller
Date: Friday, January 30, 2009 - 2:45 pm

From: Alexey Dobriyan <adobriyan@gmail.com>

Then you give me no choice, I have to apply this patch :-)

Thanks!
--

From: Stephen Hemminger
Date: Friday, January 30, 2009 - 3:01 pm

On Fri, 30 Jan 2009 13:45:50 -0800 (PST)

Yes, this is a nice harmless change
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
--

From: Alexey Dobriyan
Date: Sunday, February 1, 2009 - 3:38 pm

Strangely enough, 8139cp survives with no message printed and no hang or
anything. But this is with qemu/kvm.

	static int cp_open (struct net_device *dev)
	{
	        struct cp_private *cp = netdev_priv(dev);
	        int rc;

	        if (netif_msg_ifup(cp))
	                printk(KERN_DEBUG "%s: enabling interface\n", dev->name);

	        rc = cp_alloc_rings(cp);
			...
--

From: Stephen Hemminger
Date: Monday, February 2, 2009 - 11:01 am

On Mon, 2 Feb 2009 01:38:53 +0300

Unlike other drivers, sky2 calls its own up routine on resume.
In this case, the netif_running flag is already set.
--

Previous thread: [PATCH] fec: Fix KS8721BL_ICSR phy register offset by Sascha Hauer on Tuesday, January 27, 2009 - 3:40 am. (5 messages)

Next thread: Is this normal? by Paweł Staszewski on Tuesday, January 27, 2009 - 7:05 am. (4 messages)