Hi,

Just ran some tbench numbers (from dbench-3.04) on a 2-socket, 8-core x86 system with 1 NUMA node per socket, on kernel 2.6.24-rc2, comparing the slab and slub allocators.

I ran from 1 to 16 client threads, 5 times each, restarting the tbench server between every run. I'm just taking the highest of the 5 results for each client count (because the scheduler placement can sometimes be poor).

It's not completely scientific, but from the graph you can tell it is relatively stable and the difference seems significant.

Summary: slub is consistently slower. When all CPUs are saturated, it is around 20% slower.

Attached is a graph (x is nrclients, y is throughput in MB/s).

If I can help with reproducing it or testing anything, let me know. I'll be trying out a few other benchmarks too... if there is anything you want me to test specifically, I can try.

Thanks,
Nick
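
For anyone who wants to reduce their own runs the same way, the best-of-5 selection and the relative slowdown described above could be computed roughly as follows. This is only a sketch: the function names `best_of_runs` and `percent_slower` and the `(nr_clients, throughput)` input shape are assumptions made for illustration, not part of tbench or of the original measurement scripts.

```python
def best_of_runs(samples):
    """Keep the highest throughput seen for each client count.

    `samples` is an iterable of (nr_clients, throughput_mb_s) pairs,
    one pair per tbench run (hypothetical input format).
    """
    best = {}
    for nr_clients, throughput in samples:
        if nr_clients not in best or throughput > best[nr_clients]:
            best[nr_clients] = throughput
    return best


def percent_slower(baseline, candidate):
    """Percent by which `candidate` trails `baseline`, per client count.

    Both arguments are dicts as returned by best_of_runs(); only client
    counts present in both are compared. Positive means candidate is slower.
    """
    common = sorted(baseline.keys() & candidate.keys())
    return {
        n: 100.0 * (baseline[n] - candidate[n]) / baseline[n]
        for n in common
    }
```

Feeding the slab results in as `baseline` and the slub results as `candidate` gives one percentage per client count, which is the quantity the summary refers to.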
