got this warning with Linus' latest -git tree:
WARNING: at net/core/dev.c:2161 net_rx_action()
[<80564db4>] net_rx_action+0xce/0x186
[<8011ba98>] __do_softirq+0x6c/0xcf
[<8011bb2d>] do_softirq+0x32/0x36
[<8011bcae>] irq_exit+0x35/0x40
[<80104fdb>] do_IRQ+0x5c/0x71
[<801048cd>] do_nmi+0x8f/0x238
[<801033a3>] common_interrupt+0x23/0x30
=======================box is operating fine otherwise. config and bootlog attached. This
warning got introduced by:commit bea3348eef27e6044b6161fd04c3152215f96411
Author: Stephen Hemminger <shemminger@linux-foundation.org>
Date: Wed Oct 3 16:41:36 2007 -0700[NET]: Make NAPI polling independent of struct net_device objects.
Ingo
From: Ingo Molnar <mingo@elte.hu>
This is a driver bug, the work "budget" passed into a driver's
->poll() handler should never be exceeded. That's what this
warning assertion is checking.What ethernet card is in your system and what driver is being
used to drive it?-
it's forcedeth.
i've checked nv_napi_poll(), and i dont see how it could return larger
than 'limit' number of packets.it could return packets == limit though:
pkts = nv_rx_process_optimized(dev, budget);
...if (pkts < budget) {
/* re-enable receive interrupts */
spin_lock_irqsave(&np->lock, flags);__netif_rx_complete(dev, napi);
...
return pkts;shouldnt that be "pkts <= budget"? But even that shouldnt cause a larger
than limit return. Weird.there are two networking cards in the system, the other one is a:
eth1: RealTek RTL8139 at 0xf080e000, 00:c0:df:03:68:5d, IRQ 11
eth1: Identified 8139 chip type 'RTL-8139B'but this one should be inactive (not plugged into the network). Should i
try to get a debug print out of the actual 'weight' and 'work' integers,
and of the n->poll function address?Ingo
-
FWIW, forcedeth has this funky disable_irq()-based locking mechanism
that is currently the subject of some kernel.org bugzillas. I would
also check and make sure that isn't perturbing your results (even though
it seems unrelated, based on what I read here). That might include
disabling MSI if it's enabled, and/or turning on spinlock debugging.Jeff
-
From: Ingo Molnar <mingo@elte.hu>
No, not in this case. The driver must only netif_rx_complete()
That might help.
at least once, perhaps budget is being passed erroneously in as zero?
-
ok, i've added such a patch.
looking at the dev.c code - can napi_struct->weight be zero
legitimately? If yes then the 0 gets passed to the driver and the driver
would return 1 - violating the assertion.Ingo
-
update:
[ 186.635916] WARNING: at net/core/dev.c:2166 net_rx_action()
[ 186.641351] [<c060d9f5>] net_rx_action+0x145/0x1b0
[ 186.646191] [<c011d752>] __do_softirq+0x42/0x90
[ 186.650784] [<c011d7c6>] do_softirq+0x26/0x30
[ 186.655202] [<c011db48>] local_bh_enable+0x48/0xa0
[ 186.660055] [<c06023e0>] lock_sock_nested+0xa0/0xc0
[ 186.664995] [<c065da16>] tcp_recvmsg+0x16/0xbc0
[ 186.669588] [<c013e94b>] __generic_file_aio_write_nolock+0x27b/0x520
[ 186.676001] [<c0601d75>] sock_common_recvmsg+0x45/0x70
[ 186.681202] [<c05ff5df>] sock_aio_read+0x11f/0x140
[ 186.686054] [<c015c086>] do_sync_read+0xc6/0x110
[ 186.690735] [<c012b9b0>] autoremove_wake_function+0x0/0x40
[ 186.696280] [<c060dcfc>] net_tx_action+0x3c/0xe0
[ 186.700961] [<c015c9c2>] vfs_read+0x132/0x140
[ 186.705378] [<c015cd41>] sys_read+0x41/0x70
[ 186.709625] [<c0102b66>] sysenter_past_esp+0x5f/0x89
[ 186.714651] =======================
[ 186.718210] work: 65, weight: 64
[ 186.721414] ->poll: (nv_napi_poll+0x0/0x760)so nv_napi_poll() returned with 65. How is that possible? Ah ...:
(rx_processed_cnt++ < limit)) {
that should be:
(++rx_processed_cnt < limit)) {
right? Find the fix below.
Ingo
-------------------->
Subject: forcedeth: fix the NAPI poll function
From: Ingo Molnar <mingo@elte.hu>fix the forcedeth NAPI poll function to not emit this warning:
[ 186.635916] WARNING: at net/core/dev.c:2166 net_rx_action()
[ 186.641351] [<c060d9f5>] net_rx_action+0x145/0x1b0
[ 186.646191] [<c011d752>] __do_softirq+0x42/0x90
[ 186.650784] [<c011d7c6>] do_softirq+0x26/0x30
[ 186.655202] [<c011db48>] local_bh_enable+0x48/0xa0
[ 186.660055] [<c06023e0>] lock_sock_nested+0xa0/0xc0
[ 186.664995] [<c065da16>] tcp_recvmsg+0x16/0xbc0
[ 186.669588] [<c013e94b>] __generic_file_aio_write_n...
Would the attached patch be ok with people?
It's basically the same thing, except that it aligns a bit more closely
with forcedeth rework stuff I'm doing.I'll send upstream today unless people scream...
Jeff
looks good to me.
Acked-by: Ingo Molnar <mingo@elte.hu>
Ingo
-
albeit you forgot to merge half of my patch :-/ The (tested) patch below
fixes that.Ingo
---------------->
Subject: forcedeth: fix rx-work condition in nv_rx_process_optimized() too
From: Ingo Molnar <mingo@elte.hu>the merge of my previous fix to forcedeth.c lots an important hunk.
this patch fixes that.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
drivers/net/forcedeth.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)Index: linux/drivers/net/forcedeth.c
===================================================================
--- linux.orig/drivers/net/forcedeth.c
+++ linux/drivers/net/forcedeth.c
@@ -2408,13 +2408,13 @@ static int nv_rx_process_optimized(struc
struct fe_priv *np = netdev_priv(dev);
u32 flags;
u32 vlanflags = 0;
- u32 rx_processed_cnt = 0;
+ int rx_work = 0;
struct sk_buff *skb;
int len;while((np->get_rx.ex != np->put_rx.ex) &&
!((flags = le32_to_cpu(np->get_rx.ex->flaglen)) & NV_RX2_AVAIL) &&
- (rx_processed_cnt++ < limit)) {
+ (rx_work < limit)) {dprintk(KERN_DEBUG "%s: nv_rx_process_optimized: flags 0x%x.\n",
dev->name, flags);
@@ -2517,9 +2517,11 @@ next_pkt:
np->get_rx.ex = np->first_rx.ex;
if (unlikely(np->get_rx_ctx++ == np->last_rx_ctx))
np->get_rx_ctx = np->first_rx_ctx;
+
+ rx_work++;
}- return rx_processed_cnt;
+ return rx_work;
}static void set_bufsize(struct net_device *dev)
-
applied
-
Two comments:
1) we have a vague definition of "RX work processed." Due to error
conditions and goto's in that function, rx_processed_cnt may or may not
equal the number of packets actually processed.2) man I dislike these inline C statement combinations (ranting at
original code style, not you). I would much rather waste a few extra
lines of source code and make the conditions obvious:while (... && (rx_processed_cnt < limit)) {
rx_processed_cnt++;...
}or even
while (1) {
...
if (rx_processed_cnt == limit)
break;
rx_processed_cnt++;
}The compiler certainly doesn't care, and IMO it prevents bugs.
Jeff
-
agreed. Do you have an uptodate patch/git-URI for the forcedeth rewrite
you did? I can throw it into the testbed.Ingo
-
Branch 'fe-lock' of
git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.gitIt works here locally, but at this very minute I am rewriting those
changesets yet again :)Jeff
-
FYI, looks good so far, ontop of Linus' latest -git tree.
btw., one thing i always found weird about forcedeth is that it
generates an extra ~100 irqs per second even when there's no network
traffic:11: 89752 XT-PIC-XT eth0
11: 89854 XT-PIC-XT eth0
11: 89955 XT-PIC-XT eth0
11: 90056 XT-PIC-XT eth0
11: 90157 XT-PIC-XT eth0(irq count snapshot every second). Just in case it's easy to fix ...
Ingo
-
May I presume that behavior occurs in both mainline and my fe-lock branch?
That is likely the NIC's timer irq, which is programmed to kick the irq
handler every so often. DEV_NEED_TIMERIRQ is a starting point, as well
as "Known bugs" at the top of forcedeth.c, if you're curious.I'm _betting_ that some of that can be eliminated, but we'll see...
Jeff
-
ok. Was just a FYI - you were looking for feedback :-)
Ingo
-
Any chance of getting a quick comparison of forcedeth CPU usage in your
setup, between old driver and fe-lock driver?I'm curious about the effects of fe-lock on CPU usage, which NAPI is
known to perturb (in positive or negative directions, depending on
conditions).I presume both drivers can work at wire speed, sans any "too many
Thanks! :)
Jeff
-
Though it wasn't clear, that's an ACK
I'll push the patch if nobody else beats me to it.
-
From: Ingo Molnar <mingo@elte.hu>
Excellent catch:
Acked-by: David S. Miller <davem@davemloft.net>
-
From: Ingo Molnar <mingo@elte.hu>
I touched upon this in another reply.
For forcedeth it should never be zero, it gets set in
netif_napi_add() to RX_WORK_PER_LOOP which is unconditionally
defined to 64.Afterwards napi->weight should never be modified and that is
what is passed into n->poll().Perhaps some memory scribble is causing it to be zero'd.
-
unfortunately the warning seems spurious - i just rebooted the box into
the same kernel and the message did not trigger again.Ingo
-
update: i got this warning for a second time now with another config -
so i guess if someone sent me a fix-patch or debug-patch i would
eventually trigger it within a day again.Ingo
-
| Greg Kroah-Hartman | [PATCH 001/196] Chinese: Add the known_regression URI to the HOWTO |
| Tarkan Erimer | Re: Dual-Licensing Linux Kernel with GPL V2 and GPL V3 |
| Amit K. Arora | [RFC] Heads up on sys_fallocate() |
| Bart Van Assche | Integration of SCST in the mainstream Linux kernel |
git: | |
| David Miller | [GIT]: Networking |
| Gerrit Renker | [PATCH 27/37] dccp: Integration of dynamic feature activation - part 2 (server side) |
| David Miller | Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock(). |
| Frans Pop | svc: failed to register lockdv1 RPC service (errno 97). |
