Hi Kim, On Fri, Aug 17, 2007 at 11:34:37AM -0700, K Naru wrote:Well, I've reformated your patch so that it can be applied, and very slightly arranged it in order to save 13 bytes of code and a few CPU cycles. Also, I moved the if block before the spinlock as there is no reason for this code to be run with the lock held. I have run some performance measurements on an ALIX 3C2 motherboard with a 2.6.22-stable kernel. What I see is a reduction of CPU usage by about 20% when the network is saturated, but also a reduction of the network speed by 8%! Without the patch, I can produce a continuous traffic of about 99 Mbps with about 11% CPU (system only, 89% idle). With the patch, the traffic drops to 91 Mbps but CPU usage decreases to 9%. Now, if I reduce the MTU to exactly 1000, then the traffic increases to about 98 Mbps, but it progressively reduces when the MTU moves away from 1000. So I have run some deeper tests consisting in leaving NETIF_F_IP_CSUM unset and still asking the NIC to compute the checksums. The conclusion is very clear: as soon as *any* checksum bit is set (IP, TCP, UDP), the traffic immediately drops. I think that what happens is that the NIC is not pipelined at all and that no data is transferred while a checksum is being computed. This would also explain why reducing the MTU increases performance, since it reduces the time required to compute a checksum, reducing the off time. And the more I think about it, the more I think this is the problem, because the VT6105M has a 2kB transmit buffer, so it cannot checksum a 1.5kB frame while sending another one if it does it inside the buffer. And I'm pretty sure that the checksum is computed in the buffer and that the data is not transferred twice on the bus, because playing with PCI latency timer and other parameters does not change anything. So basically, we're there with a chip which can offload the CPU by performing the checksums itself, but it reduces performance for packets larger than 1kB (or possibly 500 bytes if there's a 1.5kB packet being transferred). The driver should be adjusted to permit the user to enable and disable this feature with ethtool. Right now, its status can only be consulted, and I'm using dd on /dev/mem and /dev/kmem to change the values on the fly. Given the fact that a 20% reduction on CPU usage which was already 10% only leaves a net gain of about 2% more CPU available, I'm not convinced that there is any advantage in enabling this feature by default with this NIC. Here's the updated patch for reference (maybe you'd want to enhance it). --- linux-2.6.22-wt3/drivers/net/via-rhine.c 2007-11-22 17:48:34 +0100 +++ linux-2.6.22-wt3.via-cksum/drivers/net/via-rhine.c 2007-12-30 20:53:30 +0100 @@ -95,6 +95,8 @@ #include <linux/netdevice.h> #include <linux/etherdevice.h> #include <linux/skbuff.h> +#include <linux/in.h> +#include <linux/ip.h> #include <linux/init.h> #include <linux/delay.h> #include <linux/mii.h> @@ -343,6 +345,9 @@ /* Initial value for tx_desc.desc_length, Buffer size goes to bits 0-10 */ #define TXDESC 0x00e08000 +#define TDES1_TCPCK 0x00100000 /* Bit 20, Transmit Desc 1 */ +#define TDES1_UDPCK 0x00080000 /* Bit 19, Transmit Desc 1 */ +#define TDES1_IPCK 0x00040000 /* Bit 18, Transmit Desc 1 */ enum rx_status_bits { RxOK=0x8000, RxWholePkt=0x0300, RxErr=0x008F @@ -788,6 +793,9 @@ if (rp->quirks & rqRhineI) dev->features |= NETIF_F_SG|NETIF_F_HW_CSUM; + if (pci_rev >= VT6105M) + dev->features |= NETIF_F_IP_CSUM; /* chip does checksum */ + /* dev->name not defined before register_netdev()! */ rc = register_netdev(dev); if (rc) @@ -1260,6 +1268,18 @@ rp->tx_ring[entry].desc_length = cpu_to_le32(TXDESC | (skb->len >= ETH_ZLEN ? skb->len : ETH_ZLEN)); + if ((dev->features & NETIF_F_IP_CSUM) && + (skb->ip_summed == CHECKSUM_PARTIAL)) { + /* Offload checksum to chip. */ + const struct iphdr *ip = ip_hdr(skb); + unsigned long flag; + + flag = (ip->protocol == IPPROTO_TCP) ? TDES1_TCPCK|TDES1_IPCK : + (ip->protocol == IPPROTO_UDP) ? TDES1_UDPCK|TDES1_IPCK : + TDES1_IPCK; + rp->tx_ring[entry].desc_length |= flag; + } + /* lock eth irq */ spin_lock_irq(&rp->lock); wmb(); Best regards, Willy --
| Fernando Luis | [PATCH] affinity is not defined in non-smp kernels - i386 (v2) |
| Andrew Morton | Re: [PATCH 2/2][MTD] Add support for > 2GiB MTD devices |
| Zev Weiss | [PATCH] [MTD] mtdchar.c: Fix regression in MEMGETREGIONINFO ioctl() |
| Bart Van Assche | Integration of SCST in the mainstream Linux kernel |
git: | |
| Dan Farina | backup or mirror a repository |
| Ken Pratt | pack operation is thrashing my server |
| Junio C Hamano | What's cooking in git.git (Aug 2008, #07; Sat, 23) |
| Sverre Rabbelier | Git vs Monotone |
| Richard Stallman | Real men don't attack straw men |
| Richard Daemon | OpenBSD 4.3 running in VirtualBox? Anyone have it working properly? |
| Kent Watsen | Re: vlan trunking with a powerconnect 5224 |
| David Collier-Brown | Re: GPL version 4 |
| Jim Winstead Jr. | Re: Root Disk/Book Disk Compatibility |
| C Wayne Huling | Re: Can males come from... |
| Dong Liu | Re: CXterm for LINUX |
| David Gabrius | Re: NT vs Linux (was: Re: truth or dare) |
