On Mon, Oct 01, 2007 at 01:50:44PM -0700, Christoph Lameter wrote:

Could you cut out the snarky remarks? It takes a long time to run a test, and testing every one of the patches you send really isn't high on anyone's priority list. The performance team have also been having problems getting stable results with recent kernels, adding to the delay.

The good news is that we do now have commitment to testing upstream kernels, so you should see results more frequently than you have been. I'm taking over from Suresh as liaison for the performance team, so if you hear *anything* from *anyone* else at Intel about performance, I want you to cc me about it. OK? And I don't want to hear any more whining about hearing different things from different people.

So, on "a well-known OLTP benchmark which prohibits publishing absolute numbers" and on an x86-64 system (I don't think exactly which model is important), we're seeing *6.51%* performance loss on slub vs slab. This is with a 2.6.23-rc3 kernel. Tuning the boot parameters, as you've asked for before (slub_min_order=2, slub_max_order=4, slub_min_objects=8) gets back 0.38% of that. It's still down 6.13% over slab.

For what it's worth, 2.6.23-rc3 already has a 1.19% regression versus RHEL 4.5, so the performance guys are really unhappy about going up to almost 8% regression.

In the detailed profiles, __slab_free is the third most expensive function, behind only spin locks. get_partial_node is right behind it in fourth place, and kmem_cache_alloc is sixth. __slab_alloc is eighth and kmem_cache_free is tenth. These positions don't change with the slub boot parameters.

Now, where do we go next? I suspect that 2.6.23-rc9 has significant changes since -rc3, but I'd like to confirm that before kicking off another (expensive) run. Please tell me which kernels would be useful to test.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
