On Thu, 18 Oct 2007, Andrew Morton wrote:

Well, the problem right now is the regression in slab_free() on SMP. AFAICT UP and NUMA are fine, and so are most loads under SMP. Concurrent allocations / frees on multiple processors are several times faster (I see up to 10 fold improvements on an 8p). However, long sequences of free operations from a single processor under SMP require too many atomic operations compared with SLAB. If I only do frees on a single processor on SMP then I can produce a 30% regression for slabs between 128 and 1024 bytes in size. I have a patchset in the works that reduces the atomic operations for those.

SLAB currently has an advantage since it uses coarser grained locking. SLAB can take a global lock and then perform queue operations on multiple objects. SLUB has fine grained locking, which increases concurrency but also the overhead of atomic operations.

The regression does not surface under UP since we do not need to do locking there. And it does not surface under NUMA since the alien cache stuff in SLAB reduces slab_free performance compared to SMP.
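
To make the locking tradeoff above concrete, here is a rough counting model in C. It is not code from either allocator: the struct, the function names, the batch size, and the cost of two atomic operations per lock/unlock pair are all assumptions chosen for illustration. It only shows why a long run of frees from one processor costs more atomic operations when each free locks individually than when one lock covers a whole batch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: it only counts atomic operations, it frees nothing. */
struct free_stats {
    unsigned long atomic_ops;    /* assumed cost: 2 per lock/unlock pair */
    unsigned long objects_freed;
};

/* SLAB-like (assumed): take one coarse lock, then queue a batch of objects. */
static void free_batch_coarse(struct free_stats *s, size_t nr_objects,
                              size_t batch)
{
    assert(batch > 0);
    while (nr_objects > 0) {
        size_t n = nr_objects < batch ? nr_objects : batch;

        s->atomic_ops += 2;       /* lock + unlock once per batch */
        s->objects_freed += n;    /* plain queue operations under the lock */
        nr_objects -= n;
    }
}

/* SLUB-like (assumed): every free locks the slab the object belongs to. */
static void free_each_fine(struct free_stats *s, size_t nr_objects)
{
    for (size_t i = 0; i < nr_objects; i++) {
        s->atomic_ops += 2;       /* lock + unlock for each object */
        s->objects_freed += 1;
    }
}
```

With 1000 frees and an assumed batch of 100, the coarse model counts 20 atomic operations and the fine grained one counts 2000. The model ignores contention entirely, which is where fine grained locking pays off when several processors allocate and free concurrently.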
