Re: [PATCH] myr10ge: again fix lro_gen_skb() alignment

From: Andrew Gallatin
Date: Friday, April 24, 2009 - 9:16 am

Herbert Xu wrote:

From what I can tell, CPU utilization reporting is only broken when a
CPU is otherwise idle, so it should be accurate when you bind the IRQ
and the netserver to the same CPU. Here are results from an older,
slower Core 2 Xeon with a 4MB L2 cache:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Xeon(R) CPU            5150  @ 2.66GHz
stepping        : 6
cpu MHz         : 2659.916
cache size      : 4096 KB
physical id     : 0
siblings        : 1
core id         : 0
cpu cores       : 2
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca lahf_lm tpr_shadow
bogomips        : 5319.83
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

The Xeon was running net-next with DCA enabled and ioatdma disabled for
TCP (CONFIG_NET_DMA is not set). The sender was the weak Athlon 64,
running 2.6.22.

LRO, no soaker: (13,200 intrs/sec)
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
  87380  65536  65536    60.02      9469.74   17.44    13.31    0.302   0.461

LRO, soaker: (6,500 intrs/sec)

  87380  65536  65536    60.06      3955.74   7.11     25.02    0.294   2.072

GRO, no soaker: (13,200 intrs/sec)
  87380  65536  65536    60.02      9467.90   16.76    14.16    0.290   0.490

GRO, soaker: (6,500 intrs/sec)
  87380  65536  65536    60.02      3774.88   6.20     25.01    0.269   2.171


These results are indeed quite close, so the performance problem seems
isolated to AMD CPUs, perhaps because of their smaller caches.
Do you have an AMD machine you can use as a receiver?
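
To put a number on "quite close", the relative differences between the GRO and LRO rows can be worked out directly. The following is only an illustrative sketch: the helper name rel_diff_pct and the dictionary layout are made up for this example, and the values are simply copied from the netperf rows above (throughput in 10^6 bits/s and receiver-side service demand in us/KB).

```python
# Values copied from the netperf rows above:
# (throughput in 10^6 bits/s, receiver service demand in us/KB).
results = {
    ("LRO", "no soaker"): (9469.74, 0.461),
    ("LRO", "soaker"):    (3955.74, 2.072),
    ("GRO", "no soaker"): (9467.90, 0.490),
    ("GRO", "soaker"):    (3774.88, 2.171),
}


def rel_diff_pct(gro, lro):
    """Difference of the GRO value from the LRO value, in percent of LRO."""
    return (gro - lro) / lro * 100.0


for load in ("no soaker", "soaker"):
    lro_tput, lro_sd = results[("LRO", load)]
    gro_tput, gro_sd = results[("GRO", load)]
    print(f"{load}: throughput {rel_diff_pct(gro_tput, lro_tput):+.2f}%, "
          f"recv service demand {rel_diff_pct(gro_sd, lro_sd):+.2f}%")
```

A negative throughput figure means GRO was slower than LRO for that load; a positive service-demand figure means GRO cost more receiver CPU per KB.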

Note that the GRO results were still obtained by (bogusly) setting
CHECKSUM_UNNECESSARY.  I tried to use your patch, and I see
terrible performance. Netperf shows between 1Gb/s and 2Gb/s (compared
to 5Gb/s with GRO disabled).  I don't see bad checksums in netstat
on the receiver, but it *feels* like something of that sort.

Here's a diff of netstat -st taken on the sender before and after
a 5 second netperf:
2c2
<     157 active connections openings
---
>     159 active connections openings
7,9c7,9
<     31465934 segments received
<     72887021 segments send out
<     679 segments retransmited
---
>     32184827 segments received
>     73473546 segments send out
>     698 segments retransmited
16c16
<     4596 packets directly queued to recvmsg prequeue.
---
>     4603 packets directly queued to recvmsg prequeue.
18,21c18,21
<     15928 packets header predicted
<     18100148 acknowledgments not containing data received
<     13351873 predicted acknowledgments
<     343 times recovered from packet loss due to SACK data
---
>     15930 packets header predicted
>     18464095 acknowledgments not containing data received
>     13706813 predicted acknowledgments
>     365 times recovered from packet loss due to SACK data
23,25c23,25
<     53 congestion windows fully recovered
<     221 congestion windows partially recovered using Hoe heuristic
<     TCPDSACKUndo: 268
---
>     60 congestion windows fully recovered
>     228 congestion windows partially recovered using Hoe heuristic
>     TCPDSACKUndo: 281
27,28c27,28
<     584 fast retransmits
<     93 forward retransmits
---
>     597 fast retransmits
>     99 forward retransmits
30c30
<     674 DSACKs received
---
>     693 DSACKs received
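
The before/after counters in the sender diff are easier to judge as deltas over the run. The snippet below is a rough sketch and not part of the original measurements: the dictionary and the deltas helper are hypothetical names, the counter values are copied from the diff above, and the counter names keep netstat's own spelling. It only does the subtraction and expresses retransmits as a share of segments sent during the run.

```python
# Counters copied from the sender's netstat -st diff above: (before, after).
# Names keep netstat's own spelling so they match the diff.
sender = {
    "segments received": (31465934, 32184827),
    "segments send out": (72887021, 73473546),
    "segments retransmited": (679, 698),
    "fast retransmits": (584, 597),
    "forward retransmits": (93, 99),
    "DSACKs received": (674, 693),
}


def deltas(counters):
    """Return after - before for every counter."""
    return {name: after - before for name, (before, after) in counters.items()}


d = deltas(sender)

# Retransmitted segments as a percentage of segments sent during the run.
retrans_pct = d["segments retransmited"] / d["segments send out"] * 100.0

for name, value in d.items():
    print(f"{name:24s} +{value}")
print(f"retransmit share of sent segments: {retrans_pct:.4f}%")
```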

And on the receiver (whose netstat is confused and cannot read the
extended stats on a net-next kernel):
diff /tmp/a /tmp/b
3c3
<     12 passive connection openings
---
>     14 passive connection openings
7,8c7,8
<     3776478 segments received
<     3775846 segments send out
---
>     4495385 segments received
>     4494747 segments send out



This was using a net-next pulled half an hour ago.  The only patch
applied was your GRO patch for myri10ge.  Do you have some other local
patch which might be helping you?

Drew