2.6.24-rc6 oops in net_tx_action

Previous thread: Linux 2.6.24-rc7 by Linus Torvalds on Sunday, January 6, 2008 - 6:19 pm. (55 messages)

Next thread: [patch 1/2] show being-loaded/being-unloaded indicator for modules in oopses by Arjan van de Ven on Sunday, January 6, 2008 - 7:18 pm. (2 messages)
To: <linux-netdev@...>
Cc: <linux@...>, <linux-kernel@...>, <shemminger@...>
Date: Sunday, January 6, 2008 - 6:29 pm

Kernel is 2.6.24-rc6 + linuxpps patches, all of which touch only the
serial port driver.

2.6.23 was known stable. I haven't tested earlier 2.6.24 releases.
I think it happened once before; I got a black-screen lockup with
keyboard LEDs blinking, but that was with X running so I couldn't see a
console oops. But given that I installed 2.6.24-rc6 about 24 hours ago,
that's a disturbing pattern.

(N.B. I was pretty careful, but the following was transcribed by hand.)

BUG: unable to handle kernel paging request at virtual address 00100104
printing eip: b02b3d6a *pde=00000000
Oops: 0002 [#1]

Pid 3162, comm: ntop Not tainted (2.6.24-rc6 #36)
EIP: 0060[<b02b3d6a>] EFLAGS: 00210046 CPU: 0
EIP is at net_tx_action+0x8b/0xec
EAX: 00100100 EBX: efa63924 ECX: 0801fbff EDX: 00200200
ESI: 00000010 EDI: 00000010 EBP: 0000012c ESP: b0444fc8
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process ntop (pid: 3162, ti=b0444000 task=e9122f90 task.ti=e92ec000)
Stack: 00000000 0000000a b02b3a84 b044007b 000a7ac5 00000001 b0457a44 00000009
00000000 b0118016 e92ecf74 e92ec000 00200046 b0103c3c
Call Trace:
[<b02b3a84>] net_tx_action+0x5a/0xa8
[<b0118016>] __do_softirq+0x35/0x75
[<b0103c3c>] do_softirq+0x3e/0x8f
[<b012504d>] do_gettimeofday+0x2c/0xc6
[<b012c8d3>] handle_level_irq+0x0/0x8d
[<b0117f55>] irq_exit+0x29/0x58
[<b0103d3c>] do_IRQ+0xaf/0xc2
[<b0117b52>] sys_gettimeofday+0x27/0x53
[<b0102577>] common_interrupt+0x23/0x28
=======================
Code: 24 04 ec 61 3d b0 c7 04 24 87 01 3a b0 e8 ad 10 e6 ff e8 44 fd e4 ff c7 05 1c b9 46 b0 01 00 00 00 fa 39 fe 75 20 8b 03 8b 53 04 <89> 50 04 89 02 a1 fc b8 46 b0 c7 03 f8 b8 46 b0 89 1d fc b8 46
EIP: [<b02b3d6a>] at net_tx_action+0x8b/0xec SS:ESP 0068:b0444fc8
Kernel panic - not syncing: Fatal exception in interrupt
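
One possible reading of the register dump, sketched in Python. EAX and EDX look like the poison values that list_del() writes into a removed list entry, and the faulting address is EAX + 4. The poison constants (as in include/linux/poison.h), the 4-byte offset of `prev` on 32-bit x86, and the helper name describe_fault are assumptions, not part of the report.

```python
# Register values copied from the oops above.
FAULT_ADDR = 0x00100104
EAX = 0x00100100
EDX = 0x00200200
OOPS_ERROR_CODE = 0x0002

# Assumption: list poison constants as defined in include/linux/poison.h.
LIST_POISON1 = 0x00100100  # stored in entry->next by list_del()
LIST_POISON2 = 0x00200200  # stored in entry->prev by list_del()

# Assumption: offset of 'prev' within struct list_head on 32-bit x86.
PREV_OFFSET = 4


def describe_fault(fault_addr, eax, edx, error_code):
    """Return observations about the faulting access (hypothetical helper)."""
    return {
        "next_is_poison1": eax == LIST_POISON1,
        "prev_is_poison2": edx == LIST_POISON2,
        "fault_is_next_prev": fault_addr == eax + PREV_OFFSET,
        # x86 page-fault error code: bit 0 = protection violation (0 means
        # page not present), bit 1 = write access, bit 2 = user mode.
        "page_present": bool(error_code & 0x1),
        "write_access": bool(error_code & 0x2),
        "user_mode": bool(error_code & 0x4),
    }


if __name__ == "__main__":
    report = describe_fault(FAULT_ADDR, EAX, EDX, OOPS_ERROR_CODE)
    for key, value in report.items():
        print(f"{key}: {value}")
```

If the first three checks hold, the pattern is consistent with a list entry being unlinked a second time after it was already removed, though the dump alone does not prove that.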

Network config is a little complex; there are 5 physical network
interfaces and a bunch of netfilter rules. A quad-port 100baseT Tulip
card whic...

To: <linux@...>
Cc: <linux-netdev@...>, <linux-kernel@...>, <shemminger@...>, <jgarzik@...>
Date: Sunday, January 6, 2008 - 7:57 pm

It is probably this one:

http://marc.info/?t=119782794000003&r=1&w=2

--
Ueimor
--

To: <linux@...>, <romieu@...>
Cc: <jgarzik@...>, <linux-kernel@...>, <netdev@...>, <shemminger@...>
Date: Monday, January 7, 2008 - 2:43 am

Thanks! I got the patch from
http://marc.info/?l=linux-netdev&m=119756785219214
(Which didn't make it into -rc7; please fix!)
and am recompiling now.

Actually, I grabbed the hardware mitigation follow-on patch while I was
at it. I notice that the comment explaining the format of CSR11 and
what 0x80F10000 means got lost; perhaps it would be nice to resurrect it?

0x80F10000
80000000 = Cycle size (timer control)
78000000 = TX timer in 16 * Cycle size
07000000 = No. pkts before Int. (0 = interrupt per packet)
00F00000 = Rx timer in Cycle size
000E0000 = No. pkts before Int.
00010000 = Continuous mode (CM)
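
As a rough illustration of how the masks above split 0x80F10000 into fields, here is a minimal Python sketch. The masks come from the list; the field names are made up for this example, and reading the first "No. pkts" field as TX and the second as RX is an assumption based on their position in the list.

```python
# Masks copied from the list above; field names are hypothetical.
CSR11_FIELDS = [
    ("cycle_size", 0x80000000),
    ("tx_timer", 0x78000000),
    ("tx_pkts_before_int", 0x07000000),  # assumed to be the TX count
    ("rx_timer", 0x00F00000),
    ("rx_pkts_before_int", 0x000E0000),  # assumed to be the RX count
    ("continuous_mode", 0x00010000),
]


def decode_csr11(value):
    """Extract each field by masking and shifting down to bit 0."""
    fields = {}
    for name, mask in CSR11_FIELDS:
        # Position of the lowest set bit in the mask.
        shift = (mask & -mask).bit_length() - 1
        fields[name] = (value & mask) >> shift
    return fields


if __name__ == "__main__":
    for name, field_value in decode_csr11(0x80F10000).items():
        print(f"{name} = {field_value}")
```

Under those assumptions, 0x80F10000 sets the cycle size bit, the Rx timer to 15, and the CM bit, with the other three fields at zero.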

(Boy, that tulip driver could use a whitespace overhaul.)
--

To: <linux@...>
Cc: <romieu@...>, <jgarzik@...>, <linux-kernel@...>, <netdev@...>, <shemminger@...>
Date: Monday, January 7, 2008 - 4:23 am

From: linux@horizon.com

Jeff is busy so he's asked me to pick up the more important
driver bug fixes that get posted.

I'll push this around, thanks.
--

To: <davem@...>, <linux@...>
Cc: <jgarzik@...>, <linux-kernel@...>, <netdev@...>, <romieu@...>, <shemminger@...>
Date: Monday, January 7, 2008 - 3:13 pm

Much obliged. It's only 11 hours of uptime, but no problems so far,
even trying abusive things like "ping -f -l64 -s8000".
--
