It seems that the network stack becomes slower over time? Here is a list of tbench results with various kernel versions:

  Kernel            tbench throughput (MB/sec)
  2.6.22            3207.77
  2.6.24            3185.66
  2.6.25            2848.83
  2.6.26            2706.09
  2.6.27 (rc2)      2571.03

And linux-next is:

  2.6.28 (l-next)   2568.74

It shows that there is still work to be done on linux-next; it is too close to upstream in performance. Note the KT event between 2.6.24 and 2.6.25. Why is that? --
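To make the size of each step easier to see, the figures above can be turned into percentage changes. The following is a minimal sketch in Python; the names `TBENCH_RESULTS`, `percent_change` and `regression_table` are made up for illustration, and only the throughput numbers come from the message above.

```python
# tbench throughput in MB/sec, copied from the message above.
TBENCH_RESULTS = [
    ("2.6.22", 3207.77),
    ("2.6.24", 3185.66),
    ("2.6.25", 2848.83),
    ("2.6.26", 2706.09),
    ("2.6.27-rc2", 2571.03),
    ("linux-next", 2568.74),
]


def percent_change(old, new):
    """Relative change from old to new, in percent (negative means slower)."""
    return (new - old) / old * 100.0


def regression_table(results):
    """Return (version, change vs. first entry, change vs. previous entry)."""
    baseline = results[0][1]
    previous = baseline
    rows = []
    for version, throughput in results:
        rows.append((
            version,
            percent_change(baseline, throughput),
            percent_change(previous, throughput),
        ))
        previous = throughput
    return rows


if __name__ == "__main__":
    for version, vs_base, vs_prev in regression_table(TBENCH_RESULTS):
        print(f"{version:12s} {vs_base:7.2f}% vs 2.6.22 {vs_prev:7.2f}% vs previous")
```

Run this way, the 2.6.24 to 2.6.25 step stands out at roughly 10.6%, and 2.6.27-rc2 ends up a little under 20% below 2.6.22.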
is this with SLAB or with SLUB? SLUB has been known to impact network performance... Auke --
The original testing config was SLAB based so it was used throughout. --
From: Christoph Lameter <cl@linux-foundation.org> Isn't that when some major scheduler changes went in? I'm not blaming the scheduler, but rather I'm making the point that there are other subsystems in the kernel that the networking interacts with that influence performance at such a low level. This includes the memory allocator :-) --
Right this covers a significant portion of the kernel. SLAB was used since .22 was pretty early for SLUB. And around 2.6.24 we had the merges of the antifrag logic. .25 was the point where HR timers came in. By switching off hrtimers I can get some (minor) portion of performance back. There must be more things in play though. Maybe what we are seeing is general bloat in kernel execution paths due to the growth in complexity? --
From: Christoph Lameter <cl@linux-foundation.org> It could be, and any kind of analysis into this would be great. I had a change that RCU destroyed sockets and this added a tiny bit of latency, so I never added it even though it would have allowed a lot of simplification of socket handling (which I thought would make up for RCU's latency, but it didn't). --
perhaps Rick Jones who maintains netperf could enlighten us on some historic numbers? he usually seems to be happy to prop up new netperf numbers :) Auke --
While this is an excellent opening to talk about how netperf top-of-trunk can now emit keyword=value results that are (ostensibly) easier to put into a database than the regular or even CSV output formats, I cannot fully exploit it by pointing at a database of results :( rick jones --
Wouldn't surprise me. Have you considered doing profiles? e.g. just oprofiling the benchmark on the different kernels and see if there's some obvious difference in the CPU consumers? -Andi --
If I get the time I will try to do that. Another way to understand why we are accepting the regressions here may be that we give more consideration to real time issues and deterministic performance these days. Hardware speed gains compensate for the additional bloat? (I ran the old kernels on cutting edge hardware after all). --
...IIRC, somebody in the past did even bisect his (probably netperf) 2.6.24-25 regression to some scheduler change (obviously it might or might not be related to this case of yours)... -- i. --
I did find much regression with netperf TCP-RR-1/UDP-RR-1/UDP-RR-512. I start 1 server and 1 client, binding them to different logical processors on different physical CPUs. Compared with 2.6.22, the regression of TCP-RR-1 on a 16-core Tigerton is:

  2.6.23        6%
  2.6.24        6%
  2.6.25        9.7%
  2.6.26        14.5%
  2.6.27-rc1    22%

Regressions on other machines are similar. yanmin --
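For readers who want to reproduce the kind of binding described above, here is a minimal sketch of pinning a process to one logical CPU on Linux using Python's `os.sched_setaffinity`. The helper names `pin_to_cpu` and `pick_two_cpus` are hypothetical, and this is not necessarily how the server and client were bound in the test above. Choosing two CPUs that really sit in different physical packages needs topology information that this sketch does not look up.

```python
import os


def pin_to_cpu(cpu, pid=0):
    """Restrict process pid (0 means the calling process) to one logical CPU.

    Returns the resulting affinity set so the caller can verify it.
    Linux only: os.sched_setaffinity is not available on every platform.
    """
    os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)


def pick_two_cpus():
    """Pick two logical CPUs this process is allowed to run on.

    Assumption: the lowest and highest allowed CPU numbers are used as a
    stand-in for "two different CPUs"; whether they are in different
    physical packages depends on the machine.
    """
    allowed = sorted(os.sched_getaffinity(0))
    return allowed[0], allowed[-1]


if __name__ == "__main__":
    server_cpu, client_cpu = pick_two_cpus()
    print("server would run on CPU", server_cpu)
    print("client would run on CPU", client_cpu)
    print("affinity after pinning:", pin_to_cpu(server_cpu))
```

In a real run, the server and the client would each be pinned (for example by calling `pin_to_cpu` in each process before the benchmark starts) so that the scheduler cannot migrate them during the measurement.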
> On Tue, 2008-08-12 at 11:13 +0300, Ilpo J
I reverted the patch against 2.6.27-rc1, did a quick test with netperf TCP-RR-1, and didn't find any improvement. So your patch is good. Mostly, I suspect the process scheduler causes the regression. It seems that when there are only 1 or 2 tasks running on the CPU, performance isn't good. My netperf testing is just one example. --
There are AIM7 regressions that are similar to tbench:

  2.6.22    28436
  2.6.26    23064
--
On Mon, Aug 18, 2008 at 7:07 AM, Christoph Lameter wrote:
Just a shot in the dark -- is this with Group Scheduling on or off? Off is preferred for benchmarks. --
Mostly, AIM7 shows about a 4~5% regression on my machines. Since AIM7 results are stable, 4% is a big difference. --
What's the hardware configuration? Is it dual-core? I also track tbench performance with the latest kernels on a couple of quad-core machines, and didn't find such a regression, although the results did fluctuate. What's the command line you are using to start tbench? I start tbench with CPU_NUM*2. --
