got this warning with Linus' latest -git tree:
WARNING: at net/core/dev.c:2161 net_rx_action()
[<80564db4>] net_rx_action+0xce/0x186
[<8011ba98>] __do_softirq+0x6c/0xcf
[<8011bb2d>] do_softirq+0x32/0x36
[<8011bcae>] irq_exit+0x35/0x40
[<80104fdb>] do_IRQ+0x5c/0x71
[<801048cd>] do_nmi+0x8f/0x238
[<801033a3>] common_interrupt+0x23/0x30
=======================
box is operating fine otherwise. config and bootlog attached. This
warning got introduced by:
commit bea3348eef27e6044b6161fd04c3152215f96411
Author: Stephen Hemminger <shemminger@linux-foundation.org>
Date: Wed Oct 3 16:41:36 2007 -0700
[NET]: Make NAPI polling independent of struct net_device objects.
Ingo
unfortunately the warning seems spurious - i just rebooted the box into the same kernel and the message did not trigger again. Ingo -
update: i got this warning for a second time now with another config - so i guess if someone sent me a fix-patch or debug-patch i would eventually trigger it within a day again. Ingo -
From: Ingo Molnar <mingo@elte.hu> This is a driver bug, the work "budget" passed into a driver's ->poll() handler should never be exceeded. That's what this warning assertion is checking. What ethernet card is in your system and what driver is being used to drive it? -
it's forcedeth.
i've checked nv_napi_poll(), and i dont see how it could return larger
than 'limit' number of packets.
it could return packets == limit though:
pkts = nv_rx_process_optimized(dev, budget);
...
if (pkts < budget) {
/* re-enable receive interrupts */
spin_lock_irqsave(&np->lock, flags);
__netif_rx_complete(dev, napi);
...
return pkts;
shouldnt that be "pkts <= budget"? But even that shouldnt cause a larger
than limit return. Weird.
there are two networking cards in the system, the other one is a:
eth1: RealTek RTL8139 at 0xf080e000, 00:c0:df:03:68:5d, IRQ 11
eth1: Identified 8139 chip type 'RTL-8139B'
but this one should be inactive (not plugged into the network). Should i
try to get a debug print out of the actual 'weight' and 'work' integers,
and of the n->poll function address?
Ingo
-
ok, i've added such a patch. looking at the dev.c code - can napi_struct->weight be zero legitimately? If yes then the 0 gets passed to the driver and the driver would return 1 - violating the assertion. Ingo -
From: Ingo Molnar <mingo@elte.hu> I touched upon this in another reply. For forcedeth it should never be zero, it gets set in netif_napi_add() to RX_WORK_PER_LOOP which is unconditionally defined to 64. Afterwards napi->weight should never be modified and that is what is passed into n->poll(). Perhaps some memory scribble is causing it to be zero'd. -
update:
[ 186.635916] WARNING: at net/core/dev.c:2166 net_rx_action()
[ 186.641351] [<c060d9f5>] net_rx_action+0x145/0x1b0
[ 186.646191] [<c011d752>] __do_softirq+0x42/0x90
[ 186.650784] [<c011d7c6>] do_softirq+0x26/0x30
[ 186.655202] [<c011db48>] local_bh_enable+0x48/0xa0
[ 186.660055] [<c06023e0>] lock_sock_nested+0xa0/0xc0
[ 186.664995] [<c065da16>] tcp_recvmsg+0x16/0xbc0
[ 186.669588] [<c013e94b>] __generic_file_aio_write_nolock+0x27b/0x520
[ 186.676001] [<c0601d75>] sock_common_recvmsg+0x45/0x70
[ 186.681202] [<c05ff5df>] sock_aio_read+0x11f/0x140
[ 186.686054] [<c015c086>] do_sync_read+0xc6/0x110
[ 186.690735] [<c012b9b0>] autoremove_wake_function+0x0/0x40
[ 186.696280] [<c060dcfc>] net_tx_action+0x3c/0xe0
[ 186.700961] [<c015c9c2>] vfs_read+0x132/0x140
[ 186.705378] [<c015cd41>] sys_read+0x41/0x70
[ 186.709625] [<c0102b66>] sysenter_past_esp+0x5f/0x89
[ 186.714651] =======================
[ 186.718210] work: 65, weight: 64
[ 186.721414] ->poll: (nv_napi_poll+0x0/0x760)
so nv_napi_poll() returned with 65. How is that possible? Ah ...:
(rx_processed_cnt++ < limit)) {
that should be:
(++rx_processed_cnt < limit)) {
right? Find the fix below.
Ingo
-------------------->
Subject: forcedeth: fix the NAPI poll function
From: Ingo Molnar <mingo@elte.hu>
fix the forcedeth NAPI poll function to not emit this warning:
[ 186.635916] WARNING: at net/core/dev.c:2166 net_rx_action()
[ 186.641351] [<c060d9f5>] net_rx_action+0x145/0x1b0
[ 186.646191] [<c011d752>] __do_softirq+0x42/0x90
[ 186.650784] [<c011d7c6>] do_softirq+0x26/0x30
[ 186.655202] [<c011db48>] local_bh_enable+0x48/0xa0
[ 186.660055] [<c06023e0>] lock_sock_nested+0xa0/0xc0
[ 186.664995] [<c065da16>] tcp_recvmsg+0x16/0xbc0
[ 186.669588] [<c013e94b>] __generic_file_aio_write_nolock+0x27b/0x520
[ 186.676001] [<c0601d75>] sock_common_recvmsg+0x45/0x70
[ 186.681202] [<c05ff5df>] sock_aio_read+0x11f/0x140
[ ...From: Ingo Molnar <mingo@elte.hu> Excellent catch: Acked-by: David S. Miller <davem@davemloft.net> -
Two comments:
1) we have a vague definition of "RX work processed." Due to error
conditions and goto's in that function, rx_processed_cnt may or may not
equal the number of packets actually processed.
2) man I dislike these inline C statement combinations (ranting at
original code style, not you). I would much rather waste a few extra
lines of source code and make the conditions obvious:
while (... && (rx_processed_cnt < limit)) {
rx_processed_cnt++;
...
}
or even
while (1) {
...
if (rx_processed_cnt == limit)
break;
rx_processed_cnt++;
}
The compiler certainly doesn't care, and IMO it prevents bugs.
Jeff
-
Though it wasn't clear, that's an ACK I'll push the patch if nobody else beats me to it. -
agreed. Do you have an uptodate patch/git-URI for the forcedeth rewrite you did? I can throw it into the testbed. Ingo -
Branch 'fe-lock' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git It works here locally, but at this very minute I am rewriting those changesets yet again :) Jeff -
FYI, looks good so far, ontop of Linus' latest -git tree. btw., one thing i always found weird about forcedeth is that it generates an extra ~100 irqs per second even when there's no network traffic: 11: 89752 XT-PIC-XT eth0 11: 89854 XT-PIC-XT eth0 11: 89955 XT-PIC-XT eth0 11: 90056 XT-PIC-XT eth0 11: 90157 XT-PIC-XT eth0 (irq count snapshot every second). Just in case it's easy to fix ... Ingo -
May I presume that behavior occurs in both mainline and my fe-lock branch? That is likely the NIC's timer irq, which is programmed to kick the irq handler every so often. DEV_NEED_TIMERIRQ is a starting point, as well as "Known bugs" at the top of forcedeth.c, if you're curious. I'm _betting_ that some of that can be eliminated, but we'll see... Jeff -
ok. Was just a FYI - you were looking for feedback :-) Ingo -
Any chance of getting a quick comparison of forcedeth CPU usage in your setup, between old driver and fe-lock driver? I'm curious about the effects of fe-lock on CPU usage, which NAPI is known to perturb (in positive or negative directions, depending on conditions). I presume both drivers can work at wire speed, sans any "too many Thanks! :) Jeff -
Would the attached patch be ok with people? It's basically the same thing, except that it aligns a bit more closely with forcedeth rework stuff I'm doing. I'll send upstream today unless people scream... Jeff
looks good to me. Acked-by: Ingo Molnar <mingo@elte.hu> Ingo -
albeit you forgot to merge half of my patch :-/ The (tested) patch below
fixes that.
Ingo
---------------->
Subject: forcedeth: fix rx-work condition in nv_rx_process_optimized() too
From: Ingo Molnar <mingo@elte.hu>
the merge of my previous fix to forcedeth.c lots an important hunk.
this patch fixes that.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
drivers/net/forcedeth.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
Index: linux/drivers/net/forcedeth.c
===================================================================
--- linux.orig/drivers/net/forcedeth.c
+++ linux/drivers/net/forcedeth.c
@@ -2408,13 +2408,13 @@ static int nv_rx_process_optimized(struc
struct fe_priv *np = netdev_priv(dev);
u32 flags;
u32 vlanflags = 0;
- u32 rx_processed_cnt = 0;
+ int rx_work = 0;
struct sk_buff *skb;
int len;
while((np->get_rx.ex != np->put_rx.ex) &&
!((flags = le32_to_cpu(np->get_rx.ex->flaglen)) & NV_RX2_AVAIL) &&
- (rx_processed_cnt++ < limit)) {
+ (rx_work < limit)) {
dprintk(KERN_DEBUG "%s: nv_rx_process_optimized: flags 0x%x.\n",
dev->name, flags);
@@ -2517,9 +2517,11 @@ next_pkt:
np->get_rx.ex = np->first_rx.ex;
if (unlikely(np->get_rx_ctx++ == np->last_rx_ctx))
np->get_rx_ctx = np->first_rx_ctx;
+
+ rx_work++;
}
- return rx_processed_cnt;
+ return rx_work;
}
static void set_bufsize(struct net_device *dev)
-
From: Ingo Molnar <mingo@elte.hu> No, not in this case. The driver must only netif_rx_complete() That might help. at least once, perhaps budget is being passed erroneously in as zero? -
FWIW, forcedeth has this funky disable_irq()-based locking mechanism that is currently the subject of some kernel.org bugzillas. I would also check and make sure that isn't perturbing your results (even though it seems unrelated, based on what I read here). That might include disabling MSI if it's enabled, and/or turning on spinlock debugging. Jeff -
