Re: [PATCH] netconsole: fix BUG during net device "upping"

Previous thread: Are you and your friends ok? by Ernie on Sunday, March 22, 2009 - 2:42 am. (1 message)

Next thread: [PATCH net-next] Adding licensing to bnx2x_init_values.h by Eilon Greenstein on Sunday, March 22, 2009 - 4:14 am. (3 messages)
From: Marcin Slusarz
Date: Sunday, March 22, 2009 - 4:02 am

When ndo_open (eg skge_up) function printks something, netconsole decides
it can use this device because it checks state only (netif_running) which is
set before ndo_open. Check device flags too.

[35437.623580] skge eth1: enabling interface
[35437.623601] ------------[ cut here ]------------
[35437.623603] kernel BUG at drivers/net/skge.c:2767!
[35437.623606] invalid opcode: 0000 [#1] PREEMPT
[35437.623608] last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
[35437.623611] CPU 0
[35437.623613] Modules linked in:
[35437.623617] Pid: 12711, comm: ip Not tainted 2.6.29-rc6-idle #82 To Be Filled By O.E.M.
[35437.623619] RIP: 0010:[<ffffffff803f3c30>]  [<ffffffff803f3c30>] skge_xmit_frame+0xbe/0x3ba
[35437.623628] RSP: 0018:ffff88003cc0f8b8  EFLAGS: 00010086
[35437.623630] RAX: 000000000000007f RBX: ffff88003e850000 RCX: 0000000000000001
[35437.623632] RDX: 0000000000000001 RSI: ffff88003f188720 RDI: ffff88002e568900
[35437.623635] RBP: ffff88003cc0f918 R08: 0000000000000002 R09: 0000000000000000
[35437.623637] R10: 0000000000000006 R11: 0000000000000000 R12: ffff88002e568900
[35437.623639] R13: ffff88003e850000 R14: ffffffff807180c0 R15: 0000000000000001
[35437.623642] FS:  00007f46b39086f0(0000) GS:ffffffff807dc020(0000) knlGS:00000000f6577b90
[35437.623644] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[35437.623646] CR2: 00007f46b3282110 CR3: 000000002ab18000 CR4: 00000000000006e0
[35437.623648] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[35437.623651] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[35437.623653] Process ip (pid: 12711, threadinfo ffff88003cc0e000, task ffff88002a8b5af0)
[35437.623655] Stack:
[35437.623657]  ffff88003eb43000 000000008077cc20 ffff88003f188000 ffff88003f188720
[35437.623660]  ffff88003efba800 ffff88003f188000 0000000000000000 ffff88003efd78a0
[35437.623664]  0000000000000086 ffff88003f188000 0000000000000000 0000000000000001
[35437.623667] Call Trace:
[35437.623670]  ...
From: Matt Mackall
Date: Sunday, March 22, 2009 - 6:20 pm

That's fairly unfortunate semantics for netif_running. But if Dave
agrees that it's reasonable for that to be set to true at this point in
time, then I guess we'll go with it.

-- 
http://selenic.com : development and support for Mercurial and Linux


--

From: David Miller
Date: Sunday, March 22, 2009 - 9:21 pm

From: Matt Mackall <mpm@selenic.com>

These kind of printk's simply are not allowed, we've removed such
printk's from other driver ->open() methods to fix this problem and
that's what should be done here.

I'm rejecting this patch.
--

From: Jarek Poplawski
Date: Monday, March 23, 2009 - 1:04 am

What is the rationale of this decision? printk is a basic tool,
especially designed to work in as many places as possible, and
netconsole is rather something secondary (sorry Matt)?!

Jarek P.
--

From: David Miller
Date: Monday, March 23, 2009 - 1:05 am

From: Jarek Poplawski <jarkao2@gmail.com>

And this basic tool cannot work from the drivers ->open() method.
--

From: Jarek Poplawski
Date: Monday, March 23, 2009 - 1:11 am

And in any function used in the drivers ->open(). BTW, with Marcin's
patch it can...

Jarek P.
--

From: David Miller
Date: Monday, March 23, 2009 - 1:15 am

From: Jarek Poplawski <jarkao2@gmail.com>

This issue came up before, and after we added the netif_running()
check we hit this IIF_UP one and at the time we looked into it
and the result we came up with is that you just can't do it in
a network driver's ->open()

Look up the thread, I'm too lazy...

--

From: Jarek Poplawski
Date: Monday, March 23, 2009 - 2:20 am

So I try to make appear I'm less lazy, and read this one thread only,
but can't see this IIF_UP being mentioned:

http://marc.info/?t=123306255900001&r=1&w=2

Jarek P.
--

From: Jarek Poplawski
Date: Tuesday, March 24, 2009 - 1:22 am

And from any function called anywhere on another cpu while driver's
->open() is running.

BTW, I've had a look at this and it seems the main problem is
netif_tx_stopped() isn't handled properly by the driver(s).

Jarek P.
--

Previous thread: Are you and your friends ok? by Ernie on Sunday, March 22, 2009 - 2:42 am. (1 message)

Next thread: [PATCH net-next] Adding licensing to bnx2x_init_values.h by Eilon Greenstein on Sunday, March 22, 2009 - 4:14 am. (3 messages)