Re: WARNING: at net/core/dev.c:2161 net_rx_action()

Previous thread: Re: What still uses the block layer? by Neil Brown on Monday, October 15, 2007 - 7:19 am. (4 messages)

Next thread: [patch 0/2] Protect crashkernel against BSS overlap by Bernhard Walle on Monday, October 15, 2007 - 7:50 am. (5 messages)
To: <linux-kernel@...>
Cc: David S. Miller <davem@...>, Linus Torvalds <torvalds@...>, Stephen Hemminger <shemminger@...>
Date: Monday, October 15, 2007 - 7:24 am

got this warning with Linus' latest -git tree:

WARNING: at net/core/dev.c:2161 net_rx_action()
[<80564db4>] net_rx_action+0xce/0x186
[<8011ba98>] __do_softirq+0x6c/0xcf
[<8011bb2d>] do_softirq+0x32/0x36
[<8011bcae>] irq_exit+0x35/0x40
[<80104fdb>] do_IRQ+0x5c/0x71
[<801048cd>] do_nmi+0x8f/0x238
[<801033a3>] common_interrupt+0x23/0x30
=======================

box is operating fine otherwise. config and bootlog attached. This
warning got introduced by:

commit bea3348eef27e6044b6161fd04c3152215f96411
Author: Stephen Hemminger <shemminger@linux-foundation.org>
Date: Wed Oct 3 16:41:36 2007 -0700

[NET]: Make NAPI polling independent of struct net_device objects.

Ingo

To: <mingo@...>
Cc: <linux-kernel@...>, <torvalds@...>, <shemminger@...>
Date: Monday, October 15, 2007 - 3:57 pm

From: Ingo Molnar <mingo@elte.hu>

This is a driver bug, the work "budget" passed into a driver's
->poll() handler should never be exceeded. That's what this
warning assertion is checking.

What ethernet card is in your system and what driver is being
used to drive it?

-

To: David Miller <davem@...>
Cc: <linux-kernel@...>, <torvalds@...>, <shemminger@...>, Jeff Garzik <jgarzik@...>
Date: Monday, October 15, 2007 - 6:03 pm

it's forcedeth.

i've checked nv_napi_poll(), and i dont see how it could return larger
than 'limit' number of packets.

it could return packets == limit though:

pkts = nv_rx_process_optimized(dev, budget);
...

if (pkts < budget) {
/* re-enable receive interrupts */
spin_lock_irqsave(&np->lock, flags);

__netif_rx_complete(dev, napi);

...
return pkts;

shouldnt that be "pkts <= budget"? But even that shouldnt cause a larger
than limit return. Weird.

there are two networking cards in the system, the other one is a:

eth1: RealTek RTL8139 at 0xf080e000, 00:c0:df:03:68:5d, IRQ 11
eth1: Identified 8139 chip type 'RTL-8139B'

but this one should be inactive (not plugged into the network). Should i
try to get a debug print out of the actual 'weight' and 'work' integers,
and of the n->poll function address?

Ingo
-

To: Ingo Molnar <mingo@...>
Cc: David Miller <davem@...>, <linux-kernel@...>, <torvalds@...>, <shemminger@...>
Date: Monday, October 15, 2007 - 6:20 pm

FWIW, forcedeth has this funky disable_irq()-based locking mechanism
that is currently the subject of some kernel.org bugzillas. I would
also check and make sure that isn't perturbing your results (even though
it seems unrelated, based on what I read here). That might include
disabling MSI if it's enabled, and/or turning on spinlock debugging.

Jeff

-

To: <mingo@...>
Cc: <linux-kernel@...>, <torvalds@...>, <shemminger@...>, <jgarzik@...>
Date: Monday, October 15, 2007 - 6:18 pm

From: Ingo Molnar <mingo@elte.hu>

No, not in this case. The driver must only netif_rx_complete()

That might help.

at least once, perhaps budget is being passed erroneously in as zero?
-

To: David Miller <davem@...>
Cc: <linux-kernel@...>, <torvalds@...>, <shemminger@...>, Jeff Garzik <jgarzik@...>
Date: Monday, October 15, 2007 - 6:07 pm

ok, i've added such a patch.

looking at the dev.c code - can napi_struct->weight be zero
legitimately? If yes then the 0 gets passed to the driver and the driver
would return 1 - violating the assertion.

Ingo
-

To: David Miller <davem@...>
Cc: <linux-kernel@...>, <torvalds@...>, <shemminger@...>, Jeff Garzik <jgarzik@...>
Date: Monday, October 15, 2007 - 6:30 pm

update:

[ 186.635916] WARNING: at net/core/dev.c:2166 net_rx_action()
[ 186.641351] [<c060d9f5>] net_rx_action+0x145/0x1b0
[ 186.646191] [<c011d752>] __do_softirq+0x42/0x90
[ 186.650784] [<c011d7c6>] do_softirq+0x26/0x30
[ 186.655202] [<c011db48>] local_bh_enable+0x48/0xa0
[ 186.660055] [<c06023e0>] lock_sock_nested+0xa0/0xc0
[ 186.664995] [<c065da16>] tcp_recvmsg+0x16/0xbc0
[ 186.669588] [<c013e94b>] __generic_file_aio_write_nolock+0x27b/0x520
[ 186.676001] [<c0601d75>] sock_common_recvmsg+0x45/0x70
[ 186.681202] [<c05ff5df>] sock_aio_read+0x11f/0x140
[ 186.686054] [<c015c086>] do_sync_read+0xc6/0x110
[ 186.690735] [<c012b9b0>] autoremove_wake_function+0x0/0x40
[ 186.696280] [<c060dcfc>] net_tx_action+0x3c/0xe0
[ 186.700961] [<c015c9c2>] vfs_read+0x132/0x140
[ 186.705378] [<c015cd41>] sys_read+0x41/0x70
[ 186.709625] [<c0102b66>] sysenter_past_esp+0x5f/0x89
[ 186.714651] =======================
[ 186.718210] work: 65, weight: 64
[ 186.721414] ->poll: (nv_napi_poll+0x0/0x760)

so nv_napi_poll() returned with 65. How is that possible? Ah ...:

(rx_processed_cnt++ < limit)) {

that should be:

(++rx_processed_cnt < limit)) {

right? Find the fix below.

Ingo

-------------------->
Subject: forcedeth: fix the NAPI poll function
From: Ingo Molnar <mingo@elte.hu>

fix the forcedeth NAPI poll function to not emit this warning:

[ 186.635916] WARNING: at net/core/dev.c:2166 net_rx_action()
[ 186.641351] [<c060d9f5>] net_rx_action+0x145/0x1b0
[ 186.646191] [<c011d752>] __do_softirq+0x42/0x90
[ 186.650784] [<c011d7c6>] do_softirq+0x26/0x30
[ 186.655202] [<c011db48>] local_bh_enable+0x48/0xa0
[ 186.660055] [<c06023e0>] lock_sock_nested+0xa0/0xc0
[ 186.664995] [<c065da16>] tcp_recvmsg+0x16/0xbc0
[ 186.669588] [<c013e94b>] __generic_file_aio_write_n...

To: Ingo Molnar <mingo@...>
Cc: David Miller <davem@...>, <linux-kernel@...>, <torvalds@...>, <shemminger@...>
Date: Tuesday, October 16, 2007 - 5:20 pm

Would the attached patch be ok with people?

It's basically the same thing, except that it aligns a bit more closely
with forcedeth rework stuff I'm doing.

I'll send upstream today unless people scream...

Jeff

To: Jeff Garzik <jgarzik@...>
Cc: David Miller <davem@...>, <linux-kernel@...>, <torvalds@...>, <shemminger@...>
Date: Wednesday, October 17, 2007 - 3:25 am

looks good to me.

Acked-by: Ingo Molnar <mingo@elte.hu>

Ingo
-

To: Jeff Garzik <jgarzik@...>
Cc: David Miller <davem@...>, <linux-kernel@...>, <torvalds@...>, <shemminger@...>
Date: Wednesday, October 17, 2007 - 6:18 am

albeit you forgot to merge half of my patch :-/ The (tested) patch below
fixes that.

Ingo

---------------->
Subject: forcedeth: fix rx-work condition in nv_rx_process_optimized() too
From: Ingo Molnar <mingo@elte.hu>

the merge of my previous fix to forcedeth.c lots an important hunk.

this patch fixes that.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
drivers/net/forcedeth.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

Index: linux/drivers/net/forcedeth.c
===================================================================
--- linux.orig/drivers/net/forcedeth.c
+++ linux/drivers/net/forcedeth.c
@@ -2408,13 +2408,13 @@ static int nv_rx_process_optimized(struc
struct fe_priv *np = netdev_priv(dev);
u32 flags;
u32 vlanflags = 0;
- u32 rx_processed_cnt = 0;
+ int rx_work = 0;
struct sk_buff *skb;
int len;

while((np->get_rx.ex != np->put_rx.ex) &&
!((flags = le32_to_cpu(np->get_rx.ex->flaglen)) & NV_RX2_AVAIL) &&
- (rx_processed_cnt++ < limit)) {
+ (rx_work < limit)) {

dprintk(KERN_DEBUG "%s: nv_rx_process_optimized: flags 0x%x.\n",
dev->name, flags);
@@ -2517,9 +2517,11 @@ next_pkt:
np->get_rx.ex = np->first_rx.ex;
if (unlikely(np->get_rx_ctx++ == np->last_rx_ctx))
np->get_rx_ctx = np->first_rx_ctx;
+
+ rx_work++;
}

- return rx_processed_cnt;
+ return rx_work;
}

static void set_bufsize(struct net_device *dev)
-

To: Ingo Molnar <mingo@...>
Cc: David Miller <davem@...>, <linux-kernel@...>, <torvalds@...>, <shemminger@...>
Date: Wednesday, October 17, 2007 - 8:21 pm

applied

-

To: Ingo Molnar <mingo@...>
Cc: David Miller <davem@...>, <linux-kernel@...>, <torvalds@...>, <shemminger@...>
Date: Monday, October 15, 2007 - 6:40 pm

Two comments:

1) we have a vague definition of "RX work processed." Due to error
conditions and goto's in that function, rx_processed_cnt may or may not
equal the number of packets actually processed.

2) man I dislike these inline C statement combinations (ranting at
original code style, not you). I would much rather waste a few extra
lines of source code and make the conditions obvious:

while (... && (rx_processed_cnt < limit)) {
rx_processed_cnt++;

...
}

or even

while (1) {
...
if (rx_processed_cnt == limit)
break;
rx_processed_cnt++;
}

The compiler certainly doesn't care, and IMO it prevents bugs.

Jeff

-

To: Jeff Garzik <jgarzik@...>
Cc: David Miller <davem@...>, <linux-kernel@...>, <torvalds@...>, <shemminger@...>
Date: Tuesday, October 16, 2007 - 1:42 am

agreed. Do you have an uptodate patch/git-URI for the forcedeth rewrite
you did? I can throw it into the testbed.

Ingo
-

To: Ingo Molnar <mingo@...>
Cc: David Miller <davem@...>, <linux-kernel@...>, <torvalds@...>, <shemminger@...>
Date: Tuesday, October 16, 2007 - 1:47 am

Branch 'fe-lock' of
git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git

It works here locally, but at this very minute I am rewriting those
changesets yet again :)

Jeff

-

To: Jeff Garzik <jeff@...>
Cc: David Miller <davem@...>, <linux-kernel@...>, <torvalds@...>, <shemminger@...>
Date: Tuesday, October 16, 2007 - 2:40 am

FYI, looks good so far, ontop of Linus' latest -git tree.

btw., one thing i always found weird about forcedeth is that it
generates an extra ~100 irqs per second even when there's no network
traffic:

11: 89752 XT-PIC-XT eth0
11: 89854 XT-PIC-XT eth0
11: 89955 XT-PIC-XT eth0
11: 90056 XT-PIC-XT eth0
11: 90157 XT-PIC-XT eth0

(irq count snapshot every second). Just in case it's easy to fix ...

Ingo
-

To: Ingo Molnar <mingo@...>
Cc: David Miller <davem@...>, <linux-kernel@...>, <torvalds@...>, <shemminger@...>
Date: Tuesday, October 16, 2007 - 3:17 am

May I presume that behavior occurs in both mainline and my fe-lock branch?

That is likely the NIC's timer irq, which is programmed to kick the irq
handler every so often. DEV_NEED_TIMERIRQ is a starting point, as well
as "Known bugs" at the top of forcedeth.c, if you're curious.

I'm _betting_ that some of that can be eliminated, but we'll see...

Jeff

-

To: Jeff Garzik <jeff@...>
Cc: David Miller <davem@...>, <linux-kernel@...>, <torvalds@...>, <shemminger@...>
Date: Tuesday, October 16, 2007 - 3:49 am

ok. Was just a FYI - you were looking for feedback :-)

Ingo
-

To: Ingo Molnar <mingo@...>
Cc: David Miller <davem@...>, <linux-kernel@...>, <torvalds@...>, <shemminger@...>
Date: Tuesday, October 16, 2007 - 12:52 pm

Any chance of getting a quick comparison of forcedeth CPU usage in your
setup, between old driver and fe-lock driver?

I'm curious about the effects of fe-lock on CPU usage, which NAPI is
known to perturb (in positive or negative directions, depending on
conditions).

I presume both drivers can work at wire speed, sans any "too many

Thanks! :)

Jeff

-

To: Jeff Garzik <jgarzik@...>
Cc: Ingo Molnar <mingo@...>, David Miller <davem@...>, <linux-kernel@...>, <torvalds@...>, <shemminger@...>
Date: Monday, October 15, 2007 - 6:41 pm

Though it wasn't clear, that's an ACK

I'll push the patch if nobody else beats me to it.

-

To: <mingo@...>
Cc: <linux-kernel@...>, <torvalds@...>, <shemminger@...>, <jgarzik@...>
Date: Monday, October 15, 2007 - 6:39 pm

From: Ingo Molnar <mingo@elte.hu>

Excellent catch:

Acked-by: David S. Miller <davem@davemloft.net>
-

To: <mingo@...>
Cc: <linux-kernel@...>, <torvalds@...>, <shemminger@...>, <jgarzik@...>
Date: Monday, October 15, 2007 - 6:21 pm

From: Ingo Molnar <mingo@elte.hu>

I touched upon this in another reply.

For forcedeth it should never be zero, it gets set in
netif_napi_add() to RX_WORK_PER_LOOP which is unconditionally
defined to 64.

Afterwards napi->weight should never be modified and that is
what is passed into n->poll().

Perhaps some memory scribble is causing it to be zero'd.
-

To: <linux-kernel@...>
Cc: David S. Miller <davem@...>, Linus Torvalds <torvalds@...>, Stephen Hemminger <shemminger@...>
Date: Monday, October 15, 2007 - 7:27 am

unfortunately the warning seems spurious - i just rebooted the box into
the same kernel and the message did not trigger again.

Ingo
-

To: <linux-kernel@...>
Cc: David S. Miller <davem@...>, Linus Torvalds <torvalds@...>, Stephen Hemminger <shemminger@...>
Date: Monday, October 15, 2007 - 12:18 pm

update: i got this warning for a second time now with another config -
so i guess if someone sent me a fix-patch or debug-patch i would
eventually trigger it within a day again.

Ingo
-

Previous thread: Re: What still uses the block layer? by Neil Brown on Monday, October 15, 2007 - 7:19 am. (4 messages)

Next thread: [patch 0/2] Protect crashkernel against BSS overlap by Bernhard Walle on Monday, October 15, 2007 - 7:50 am. (5 messages)