Re: Mainline kernel OLTP performance update

To: Andrew Morton <akpm@...>, <netdev@...>, <sfr@...>
Cc: <matthew@...>, <matthew.r.wilcox@...>, <chinang.ma@...>, <linux-kernel@...>, <sharad.c.tripathi@...>, <arjan@...>, <andi.kleen@...>, <suresh.b.siddha@...>, <harita.chilukuri@...>, <douglas.w.styner@...>, <peter.xihong.wang@...>, <hubert.nueckel@...>, <chris.mason@...>, <srostedt@...>, <linux-scsi@...>, <andrew.vasquez@...>, <anirban.chakraborty@...>
Date: Friday, January 16, 2009 - 2:46 am

On Friday 16 January 2009 15:12:10 Andrew Morton wrote:

OK, I have these numbers to show I'm not completely off my rocker to suggest
we merge SLQB :) Given these results, how about I ask to merge SLQB as the
default in linux-next, then, if nothing catastrophic happens, merge it upstream
in the next merge window? A couple of releases after that, which gives some
time to test and tweak SLQB, we would plan to bite the bullet and emerge with
just one main slab allocator (plus SLOB).


System is a 2-socket, 4-core AMD. All debug and stats options are turned off
for all the allocators, with default parameters (i.e. SLUB using higher-order
pages, while the others tend to use order-0). SLQB is the version I recently
posted, with some of the prefetching removed according to Pekka's review
(probably a good idea to only add things like that in if/when they prove to
be an improvement).

time fio examples/netio (10 runs, lower better):
SLAB AVG=13.19 STD=0.40
SLQB AVG=13.78 STD=0.24
SLUB AVG=14.47 STD=0.23

SLAB makes a good showing here. The allocation/freeing pattern seems to be
very regular and easy (fast allocs and frees). So it could be some "lucky"
caching behaviour, I'm not exactly sure. I'll have to run more tests and
profiles here.


hackbench (10 runs, lower better):
1 GROUP
SLAB AVG=1.34 STD=0.05
SLQB AVG=1.31 STD=0.06
SLUB AVG=1.46 STD=0.07

2 GROUPS
SLAB AVG=1.20 STD=0.09
SLQB AVG=1.22 STD=0.12
SLUB AVG=1.21 STD=0.06

4 GROUPS
SLAB AVG=0.84 STD=0.05
SLQB AVG=0.81 STD=0.10
SLUB AVG=0.98 STD=0.07

8 GROUPS
SLAB AVG=0.79 STD=0.10
SLQB AVG=0.76 STD=0.15
SLUB AVG=0.89 STD=0.08

16 GROUPS
SLAB AVG=0.78 STD=0.08
SLQB AVG=0.79 STD=0.10
SLUB AVG=0.86 STD=0.05

32 GROUPS
SLAB AVG=0.86 STD=0.05
SLQB AVG=0.78 STD=0.06
SLUB AVG=0.88 STD=0.06

64 GROUPS
SLAB AVG=1.03 STD=0.05
SLQB AVG=0.90 STD=0.04
SLUB AVG=1.05 STD=0.06

128 GROUPS
SLAB AVG=1.31 STD=0.19
SLQB AVG=1.16 STD=0.36
SLUB AVG=1.29 STD=0.11

SLQB tends to be the winner here. SLAB is close at lower numbers of
groups, but drops behind a bit more as they increase.
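
A minimal sketch of how the hackbench averages above can be compared, assuming
only the AVG columns matter (it ignores the STD columns, so small gaps should
not be read as significant). The helper names `fastest` and `percent_slower`
are made up for illustration:

```python
# hackbench averages in seconds (lower is better), copied from the tables above.
# Keys are the number of groups.
hackbench = {
    1:   {"SLAB": 1.34, "SLQB": 1.31, "SLUB": 1.46},
    2:   {"SLAB": 1.20, "SLQB": 1.22, "SLUB": 1.21},
    4:   {"SLAB": 0.84, "SLQB": 0.81, "SLUB": 0.98},
    8:   {"SLAB": 0.79, "SLQB": 0.76, "SLUB": 0.89},
    16:  {"SLAB": 0.78, "SLQB": 0.79, "SLUB": 0.86},
    32:  {"SLAB": 0.86, "SLQB": 0.78, "SLUB": 0.88},
    64:  {"SLAB": 1.03, "SLQB": 0.90, "SLUB": 1.05},
    128: {"SLAB": 1.31, "SLQB": 1.16, "SLUB": 1.29},
}


def fastest(results):
    """Map each group count to the allocator with the lowest average time."""
    return {groups: min(row, key=row.get) for groups, row in results.items()}


def percent_slower(results, allocator, baseline):
    """Percent by which `allocator` is slower than `baseline` (negative = faster)."""
    return {
        groups: 100.0 * (row[allocator] - row[baseline]) / row[baseline]
        for groups, row in results.items()
    }


if __name__ == "__main__":
    winners = fastest(hackbench)
    gap = percent_slower(hackbench, "SLAB", "SLQB")
    for groups in hackbench:
        print(f"{groups:>3} groups: fastest={winners[groups]}, "
              f"SLAB vs SLQB: {gap[groups]:+.1f}%")
```

On these averages SLAB is ahead of SLQB only at 2 and 16 groups, and by 0.02s
or less in both cases.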


tbench (10 runs, higher better):
1 THREAD
SLAB AVG=239.25 STD=31.74
SLQB AVG=257.75 STD=33.89
SLUB AVG=223.02 STD=14.73

2 THREADS
SLAB AVG=649.56 STD=9.77
SLQB AVG=647.77 STD=7.48
SLUB AVG=634.50 STD=7.66

4 THREADS
SLAB AVG=1294.52 STD=13.19
SLQB AVG=1266.58 STD=35.71
SLUB AVG=1228.31 STD=48.08

8 THREADS
SLAB AVG=2750.78 STD=26.67
SLQB AVG=2758.90 STD=18.86
SLUB AVG=2685.59 STD=22.41

16 THREADS
SLAB AVG=2669.11 STD=58.34
SLQB AVG=2671.69 STD=31.84
SLUB AVG=2571.05 STD=45.39

SLAB and SLQB seem to be pretty close, winning some and losing some.
They're always within a standard deviation of one another, so we can't
draw conclusions between them. SLUB seems to be a bit slower.
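
The "within a standard deviation" check can be sketched as follows. This is
only an illustration under an assumption: two results count as
indistinguishable when their averages differ by no more than the larger of the
two STD values. The function name `within_one_std` and the `strict` flag
(which uses the smaller STD instead) are hypothetical:

```python
# tbench results as (AVG, STD), higher is better, copied from the tables above.
# Keys are the number of threads.
tbench = {
    1:  {"SLAB": (239.25, 31.74),  "SLQB": (257.75, 33.89),  "SLUB": (223.02, 14.73)},
    2:  {"SLAB": (649.56, 9.77),   "SLQB": (647.77, 7.48),   "SLUB": (634.50, 7.66)},
    4:  {"SLAB": (1294.52, 13.19), "SLQB": (1266.58, 35.71), "SLUB": (1228.31, 48.08)},
    8:  {"SLAB": (2750.78, 26.67), "SLQB": (2758.90, 18.86), "SLUB": (2685.59, 22.41)},
    16: {"SLAB": (2669.11, 58.34), "SLQB": (2671.69, 31.84), "SLUB": (2571.05, 45.39)},
}


def within_one_std(a, b, strict=False):
    """True if the two averages differ by at most one standard deviation.

    Assumption: by default the larger of the two STDs is the yardstick;
    with strict=True the smaller one is used instead.
    """
    (mean_a, std_a), (mean_b, std_b) = a, b
    spread = min(std_a, std_b) if strict else max(std_a, std_b)
    return abs(mean_a - mean_b) <= spread


if __name__ == "__main__":
    for threads, row in tbench.items():
        print(threads,
              "SLAB~SLQB:", within_one_std(row["SLAB"], row["SLQB"]),
              "SLAB~SLUB:", within_one_std(row["SLAB"], row["SLUB"]))
```

Under the default yardstick SLAB and SLQB overlap at every thread count. With
the stricter one, the 4-thread case is the only exception (a gap of 27.94
against SLAB's STD of 13.19).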


Netperf UDP unidirectional send test (10 runs, higher better):

Server and client bound to same CPU
SLAB AVG=60.111 STD=1.59382
SLQB AVG=60.167 STD=0.685347
SLUB AVG=58.277 STD=0.788328

Server and client bound to same socket, different CPUs
SLAB AVG=85.938 STD=0.875794
SLQB AVG=93.662 STD=2.07434
SLUB AVG=81.983 STD=0.864362

Server and client bound to different sockets
SLAB AVG=78.801 STD=1.44118
SLQB AVG=78.269 STD=1.10457
SLUB AVG=71.334 STD=1.16809

SLQB is up with SLAB for the first and last cases, and faster in
the second case. SLUB trails in each case. (Any ideas for better types
of netperf tests?)


Kbuild numbers don't seem to be significantly different. SLAB and SLQB
actually got exactly the same average over 10 runs. The user+sys times
tend to be almost identical between allocators, with elapsed time mainly
depending on how much time the CPU was not idle.


Intel's OLTP shows SLQB is "neutral" to SLAB. That is, literally within
their measurement confidence interval. If it comes down to it, I think we
could get them to do more runs to narrow that down, but we're talking a
couple of tenths of a percent already.


I haven't done any non-local network tests. Networking is one of the
subsystems most heavily dependent on slab performance, so if anybody
cares to run their favourite tests, that would be really helpful.

Disclaimer
----------
Now remember this is just one specific HW configuration, and some
allocators for some reason give significantly (and sometimes perplexingly)
different results between different CPU and system architectures.

The other frustrating thing is that sometimes you happen to get a lucky
or unlucky cache or NUMA layout depending on the compile, the boot, etc.
So sometimes results get a little "skewed" in a way that isn't reflected
in the STDDEV. But I've tried to minimise that by dropping caches and
restarting services etc. between individual runs.
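
A rough sketch of a harness along these lines, not the scripts actually used
for the numbers above. Assumptions: wall-clock time is what gets measured,
STD is the sample standard deviation, and caches are dropped by writing to
/proc/sys/vm/drop_caches (which needs root). Restarting services is left out.
The function names are made up for illustration:

```python
import statistics
import subprocess
import time


def drop_caches():
    """Flush dirty pages, then ask the kernel to drop clean caches (needs root)."""
    subprocess.run(["sync"], check=True)
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")  # 3 = page cache plus dentries and inodes


def time_runs(cmd, runs=10, before_each=None):
    """Run `cmd` `runs` times; return the wall-clock seconds of each run."""
    times = []
    for _ in range(runs):
        if before_each is not None:
            before_each()  # e.g. drop_caches, so each run starts cold
        start = time.perf_counter()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        times.append(time.perf_counter() - start)
    return times


def summarize(samples):
    """Return (mean, sample standard deviation) of the measurements."""
    return statistics.mean(samples), statistics.stdev(samples)
```

For example, `summarize(time_runs(["fio", "examples/netio"],
before_each=drop_caches))` would give an AVG/STD pair for the fio test when
run as root from a fio source tree.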


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Messages in current thread:
Re: Mainline kernel OLTP performance update, Nick Piggin, (Fri Jan 16, 2:46 am)