1. Use more efficient way to determine flag status.
2. Hardware dose not mark fragment bit against IPv6 packets,
print TCP/UDP checksum warning message for IPv4 packets only.Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
---
drivers/net/jme.c | 26 +++++++++++++-------------
1 files changed, 13 insertions(+), 13 deletions(-)diff --git a/drivers/net/jme.c b/drivers/net/jme.c
index 29cd5f4..e7068c7 100644
--- a/drivers/net/jme.c
+++ b/drivers/net/jme.c
@@ -880,27 +880,27 @@ jme_rxsum_ok(struct jme_adapter *jme, u16 flags)
if (!(flags & (RXWBFLAG_TCPON | RXWBFLAG_UDPON | RXWBFLAG_IPV4)))
return false;- if (unlikely(!(flags & RXWBFLAG_MF) &&
- (flags & RXWBFLAG_TCPON) && !(flags & RXWBFLAG_TCPCS))) {
- msg_rx_err(jme, "TCP Checksum error.\n");
- goto out_sumerr;
+ if (unlikely((flags & (RXWBFLAG_MF | RXWBFLAG_TCPON | RXWBFLAG_TCPCS))
+ == RXWBFLAG_TCPON)) {
+ if (flags & RXWBFLAG_IPV4)
+ msg_rx_err(jme, "TCP Checksum error\n");
+ return false;
}- if (unlikely(!(flags & RXWBFLAG_MF) &&
- (flags & RXWBFLAG_UDPON) && !(flags & RXWBFLAG_UDPCS))) {
- msg_rx_err(jme, "UDP Checksum error.\n");
- goto out_sumerr;
+ if (unlikely((flags & (RXWBFLAG_MF | RXWBFLAG_UDPON | RXWBFLAG_UDPCS))
+ == RXWBFLAG_UDPON)) {
+ if (flags & RXWBFLAG_IPV4)
+ msg_rx_err(jme, "UDP Checksum error.\n");
+ return false;
}- if (unlikely((flags & RXWBFLAG_IPV4) && !(flags & RXWBFLAG_IPCS))) {
+ if (unlikely((flags & (RXWBFLAG_IPV4 | RXWBFLAG_IPCS))
+ == RXWBFLAG_IPV4)) {
msg_rx_err(jme, "IPv4 Checksum error.\n");
- goto out_sumerr;
+ return false;
}return true;
-
-out_sumerr:
- return false;
}static void
--
1.6.0.6--
1) Revert ARP windows IP conflict verification support, it breaks
things, from Eric W. Biederman.2) SCTP unintentionally broken by routing changes, fix from
Wei Yongjun.3) XFRM layer code does xfrm_addr_cmp() by hand, leading to
code duplication and the usual assortment of cut&paste
bugs. Also from Wei Yongjun.4) Netfilter fixes from Patrick McHardy, Jaswinder Singh Rajput,
Jan Engelhardt, and Jesper Dangaard Brouer.5) bnx2x driver has same link reporting bug as bnx2 driver, from
Naohiro Ooiwa.6) More fib_trie rebalancing fixes from Jarek Poplawski.
7) usbnet was converted to use netdev->stats to record stats
instead of using a private copy. Unfortunately, there are
a bunch of sub-drivers that code share with usbnet, and those
did not get converted so stats got lost. Fixed by Herbert Xu.8) Fixes for the new ieee802154 stack from Dmitry Eremin-Solenikov.
9) Non-TSO packets can become TSO ones and vice versa, which is
illegal and results in packets which have an illegal state.
Fixes from Herbert Xu.10) bfin_sir doesn't compile, missed net_device_ops conversion.
Fix from Graf Yang.11) Intel ethernet driver fixes from J. Kirsher and the rest of
the Intel driver team. In particular, more bugs caught by
the DMA mapping API debugging layer.Please pull, thanks a lot.
The following changes since commit 52989765629e7d182b4f146050ebba0abf2cb0b7:
Linus Torvalds (1):
Merge git://git.kernel.org/.../davem/net-2.6are available in the git repository at:
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.git master
Alexander Duyck (1):
igb: return PCI_ERS_RESULT_DISCONNECT on permanent errorAndre Detsch (1):
e1000: return PCI_ERS_RESULT_DISCONNECT on permanent errorDavid S. Miller (2):
Merge branch 'master' of git://git.kernel.org/.../kaber/nf-2.6
Merge branch 'for-linus' of git://git.kernel.org/.../lowpan/lowpanDmitry Eremin-Solenikov (3):
...
Hm, something in this lot quickly wrecked networking here - see the
tx timeout dump below. It starts with:[ 351.004596] WARNING: at net/sched/sch_generic.c:246 dev_watchdog+0x10b/0x19c()
[ 351.011815] Hardware name: System Product Name
[ 351.016220] NETDEV WATCHDOG: eth0 (forcedeth): transmit queue 0 timed outConfig attached. Unfortunately i've got no time to do bisection
today.Ingo
[ 351.004596] WARNING: at net/sched/sch_generic.c:246 dev_watchdog+0x10b/0x19c()
[ 351.011815] Hardware name: System Product Name
[ 351.016220] NETDEV WATCHDOG: eth0 (forcedeth): transmit queue 0 timed out
[ 351.022996] Pid: 0, comm: swapper Tainted: G W 2.6.31-rc1-tip-01159-g24b7800-dirty #59554
[ 351.031850] Call Trace:
[ 351.034270] [<c105108d>] warn_slowpath_common+0x60/0x90
[ 351.039550] [<c10510fb>] warn_slowpath_fmt+0x29/0x2c
[ 351.044593] [<c18eadf6>] dev_watchdog+0x10b/0x19c
[ 351.049348] [<c1074779>] ? __lock_acquired+0xbc/0x2c3
[ 351.054474] [<c1059fb7>] ? run_timer_softirq+0x125/0x1f0
[ 351.059832] [<c105a00d>] run_timer_softirq+0x17b/0x1f0
[ 351.065046] [<c1059fb7>] ? run_timer_softirq+0x125/0x1f0
[ 351.070420] [<c18eaceb>] ? dev_watchdog+0x0/0x19c
[ 351.075173] [<c105625d>] __do_softirq+0xbc/0x16f
[ 351.079849] [<c10561a1>] ? __do_softirq+0x0/0x16f
[ 351.084632] <IRQ> [<c1055b7e>] ? irq_exit+0x3a/0x3c
[ 351.089646] [<c10340b4>] ? smp_apic_timer_interrupt+0x74/0x82
[ 351.095468] [<c102383b>] ? apic_timer_interrupt+0x2f/0x34
[ 351.100927] [<c10220a5>] ? cpu_idle+0x43/0x5e
[ 351.105334] [<c10297fc>] ? default_idle+0x6e/0xc4
[ 351.110114] [<c103adc8>] ? native_safe_halt+0xa/0xc
[ 351.115040] [<c1029801>] ? default_idle+0x73/0xc4
[ 351.119803] [<c10220ab>] ? cpu_idle+0x49/0x5e
[ 351.124241] [<c1a3170c>] ? rest_init+0x58/0x5a
[ 351.128735] [<c20797e6>] ? start_kernel+0x268/0x26d
[ 351.133688] [<...
forcedeth might have a problem, in its netif_wake_queue() logic, but
I could not see why a recent patch could make this problem visible now.CPU0/1: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ stepping 02
is not a new cpu either :)forcedeth uses an internal tx_stop without appropriate barrier.
Could you try following patch ?
(random guess as I dont have much time right now)
Thank you
diff --git a/drivers/net/forcedeth.c b/drivers/net/forcedeth.c
index 1094d29..dc6bbde 100644
--- a/drivers/net/forcedeth.c
+++ b/drivers/net/forcedeth.c
@@ -2165,7 +2165,7 @@ static int nv_start_xmit(struct sk_buff *skb, struct net_device *dev)
empty_slots = nv_get_empty_tx_slots(np);
if (unlikely(empty_slots <= entries)) {
netif_stop_queue(dev);
- np->tx_stop = 1;
+ set_mb(np->tx_stop, 1);
spin_unlock_irqrestore(&np->lock, flags);
return NETDEV_TX_BUSY;
}
@@ -2286,7 +2286,7 @@ static int nv_start_xmit_optimized(struct sk_buff *skb, struct net_device *dev)
empty_slots = nv_get_empty_tx_slots(np);
if (unlikely(empty_slots <= entries)) {
netif_stop_queue(dev);
- np->tx_stop = 1;
+ set_mb(np->tx_stop, 1);
spin_unlock_irqrestore(&np->lock, flags);
return NETDEV_TX_BUSY;
}
@@ -2564,7 +2564,7 @@ static void nv_tx_timeout(struct net_device *dev)
else
status = readl(base + NvRegIrqStatus) & NVREG_IRQSTAT_MASK;- printk(KERN_INFO "%s: Got tx_timeout. irq: %08x\n", dev->name, status);
+ printk(KERN_INFO "%s: Got tx_timeout. irq: %08x tx_stop=%d\n", dev->name, status, np->tx_stop);{
int i;
--
Oh well this patch was soooo stupid, sorry Ingo.
We might have a race in napi_schedule(), leaving interrupts disabled forever.
I cannot test this patch, I dont have the hardware...Thanks
diff --git a/drivers/net/forcedeth.c b/drivers/net/forcedeth.c
index 1094d29..3b4e076 100644
--- a/drivers/net/forcedeth.c
+++ b/drivers/net/forcedeth.c
@@ -3514,11 +3514,13 @@ static irqreturn_t nv_nic_irq(int foo, void *data)
nv_msi_workaround(np);#ifdef CONFIG_FORCEDETH_NAPI
- napi_schedule(&np->napi);
-
- /* Disable furthur irq's
- (msix not enabled with napi) */
- writel(0, base + NvRegIrqMask);
+ if (napi_schedule_prep(&np->napi)) {
+ /*
+ * Disable further irq's (msix not enabled with napi)
+ */
+ writel(0, base + NvRegIrqMask);
+ __napi_schedule(&np->napi);
+ }#else
do
@@ -3615,12 +3617,13 @@ static irqreturn_t nv_nic_irq_optimized(int foo, void *data)
nv_msi_workaround(np);#ifdef CONFIG_FORCEDETH_NAPI
- napi_schedule(&np->napi);
-
- /* Disable furthur irq's
- (msix not enabled with napi) */
- writel(0, base + NvRegIrqMask);
-
+ if (napi_schedule_prep(&np->napi)) {
+ /*
+ * Disable further irq's (msix not enabled with napi)
+ */
+ writel(0, base + NvRegIrqMask);
+ __napi_schedule(&np->napi);
+ }
#else
do
{
--
Yes, good catch. There is a race condition here with napi poll.
I would prefer to do the following to keep the code simple and clean.
writel(0, base + NvRegIrqMask);
napi_schedule(&np->napi);--
CC trimmed down to network devs only :)
It would be racy too ...
check drivers/net/amd8111e.c, drivers/net/natsemi.c ...
If this cpu inconditionaly calls writel(0, base + NvRegIrqMask);
while another cpu just called writel(np->irqmask, base + NvRegIrqMask),
we end with disabled interrupts ?--
<BCC linux-kernel>
Yes, but the next instruction is to call napi_schedule() which will
--
From: Ayaz Abdulla <aabdulla@nvidia.com>
Why? If you BCC linux-kernel, then followups don't get
logged there, that's completely and utterly pointless.Either CC: it or don't include it at all.
--
Sorry, I dont get it. You are referring to softirqs, while I speak
of hardware (NIC) interrupts, that can be masked for ever.With your patch :
CPU0 CPU1
napi_complete();
/* please note there is no smp_wmb()
in __napi_complete() after
clear_bit(NAPI_STATE_SCHED, &n->state);
(It's implied on x86 because of lock prefix)
*/
writel(np->irqmask, base + NvRegIrqMask)
writel(0, base + NvRegIrqMask);
napi_schedule(&np->napi); // might do nothing because this cpu see
// NAPI_STATE_SCHED set to oneNIC cannot deliver further interrupts.
With my patch :
CPU0 CPU1
napi_complete();
/* please note there is no smp_wmb()
in __napi_complete() after
clear_bit(NAPI_STATE_SCHED, &n->state);
*/
writel(np->irqmask, base + NvRegIrqMask)
if (napi_schedule_prep(&np->napi)) // if false, NIC can deliver further interrupts.
// if true, we mask interrupts but re-enable napi. packets will be processed,
// interrupt will be re-enabled later.My patch has a guarantee that we mask NIC interrupts *only* if napi is re-scheduled.
It probably makes no difference on x86, but it might be better to have same logic in all drivers...
--
Our hw only runs on architectures that provide coherency. i.e x86
I'm fine with keeping it with both napi_schedule_prep() and __napi_schedule() if you want it to be similar to other drivers.
Thanks!
________________________________________
From: Eric Dumazet [eric.dumazet@gmail.com]
Sent: Thursday, July 02, 2009 5:07 PM
To: Ayaz Abdulla
Cc: David Miller; netdev@vger.kernel.org
Subject: Re: [GIT]: NetworkingSorry, I dont get it. You are referring to softirqs, while I speak
of hardware (NIC) interrupts, that can be masked for ever.With your patch :
CPU0 CPU1
napi_complete();
/* please note there is no smp_wmb()
in __napi_complete() after
clear_bit(NAPI_STATE_SCHED, &n->state);
(It's implied on x86 because of lock prefix)
*/
writel(np->irqmask, base + NvRegIrqMask)
writel(0, base + NvRegIrqMask);
napi_schedule(&np->napi); // might do nothing because this cpu see
// NAPI_STATE_SCHED set to oneNIC cannot deliver further interrupts.
With my patch :
CPU0 CPU1
napi_complete();
/* please note there is no smp_wmb()
in __napi_complete() after
clear_bit(NAPI_STATE_SCHED, &n->state);
*/
writel(np->irqmask, base + NvRegIrqMask)
if (napi_schedule_prep(&np->napi)) // if false, NIC can deliver further interrupts.
// if true, we mask interrupts but re-enable napi. packets will be processed,
// interrupt will be re-enabled later.My patch has a guarantee that we mask NIC interrupts *only* if napi is re-scheduled.
It probably makes no difference on x86, but it might be better to have same logic in all drivers...
--------------------------------------------------------...
From: Ayaz Abdulla <AAbdulla@nvidia.com>
It has nothing to do with "coherency".
Re-read Eric's emails, the race can happen on x86 quite trivially.
--
I have tested it and the config that failed before works now. (as do
other configs as well)I'll have to re-test whether the failing config fails with your fix
reverted - that will have to wait until tomorrow or so.Ingo
--
update: that forcedeth system is still working fine.
I triggered other networking problems today though, a trylock
failure message in netconsole, plus a dma-debug assert from e1000e:[ 1275.442025] Hangcheck: hangcheck value past margin!
[ 1451.860024] ------------[ cut here ]------------
[ 1451.860024] BUG: spinlock trylock failure on UP on CPU#0, distcc/32540
[ 1451.860024] lock: f5318760, .magic: dead4ead, .owner: distcc/32540, .owner_cpu: 0
[ 1451.860024] Pid: 32540, comm: distcc Tainted: G W 2.6.31-rc1-tip-01236-gb62d24b-dirty #6631
[ 1451.860024] Call Trace:
[ 1451.860024] [<c17764e8>] spin_bug+0x98/0x140
[ 1451.860024] [<c1776d01>] _raw_spin_trylock+0x71/0xc0
[ 1451.860024] [<c283ffbb>] _spin_trylock+0x1b/0x80
[ 1451.860024] [<c23c7868>] netpoll_send_skb+0x198/0x470
[ 1451.860024] [<c23c7eb0>] netpoll_send_udp+0x240/0x290
[ 1451.860024] [<c2840884>] ? _spin_lock_irqsave+0x64/0x90
[ 1451.860024] [<c1d69beb>] write_msg+0xeb/0x160
[ 1451.860024] [<c10933c8>] __call_console_drivers+0x78/0xa0
[ 1451.860024] [<c10934d9>] _call_console_drivers+0xe9/0x120
[ 1451.860024] [<c10938e7>] call_console_drivers+0x137/0x260
[ 1451.860024] [<c1093aaa>] release_console_sem+0x9a/0x1a0
[ 1451.860024] [<c10949b6>] vprintk+0x416/0x500
[ 1451.860024] [<c178b8f7>] ? check_for_illegal_area+0x197/0x340
[ 1451.860024] [<c1094abb>] printk+0x1b/0x20
[ 1451.860024] [<c1092bdb>] warn_slowpath_common+0x1b/0xe0
[ 1451.860024] [<c1092ce9>] warn_slowpath_fmt+0x29/0x30
[ 1451.860024] [<c178b8f7>] check_for_illegal_area+0x197/0x340
[ 1451.860024] [<c178be4c>] debug_dma_map_page+0x1cc/0x310
[ 1451.860024] [<c239fe19>] skb_dma_map+0x219/0x580
[ 1451.860024] [<c23919b6>] ? skb_copy_bits+0x126/0x590
[ 1451.860024] [<c1ab920f>] e1000_tx_map+0x4f/0x2a0
[ 1451.860024] [<c1abbb6b>] e1000_xmit_frame+0x28b/0x5a0
[ 1451.860024] [<c23a9fc0>] dev_har...
From: Ingo Molnar <mingo@elte.hu>
I've applied Eric's patch, thanks!
--
I don't know about the DMA error but this is perfectly normal.
You're invoking netpoll from within the NIC's xmit routine,
that's why the trylock fails. So I guess you should fix spinlock
debugging instead :)Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
| Tarkan Erimer | Re: Dual-Licensing Linux Kernel with GPL V2 and GPL V3 |
| Arjan van de Ven | [Announce] Development release 0.1 of the LatencyTOP tool |
| Andrew Morton | -mm merge plans for 2.6.23 |
| Greg Kroah-Hartman | [PATCH 020/196] IDE: Convert from class_device to device for ide-tape |
git: | |
| Tantilov, Emil S | RE: [PATCH] net: sk_alloc() should not blindly overwrite memory |
| David Miller | [GIT]: Networking |
| Gerrit Renker | [PATCH 0/37] dccp: Feature negotiation - last call for comments |
