On Fri, 28 Sep 2007, Nick Piggin wrote:

The more objects in a page, the more often the fast path runs. The more the fast path runs, the lower the cache footprint and the faster the overall allocations etc.

SLAB can be configured for large queues holding lots of objects. SLUB can only reach the same through large pages because it does not have queues. One could add the ability to manage pools of cpu slabs, but that would be adding yet another layer to compensate for the problem of the small pages. Reliable large page allocations mean that we can get rid of these layers and the many workarounds that we have in place right now.

The unqueued nature of SLUB reduces memory requirements, and in general the more efficient code paths of SLUB offset the advantage that SLAB can reach by being able to put more objects onto its queues. SLAB necessarily introduces complexity and cache line use through the need to manage those queues.

Again, I have not seen any fallbacks to vmalloc in my testing. What we are doing here is mainly to address your theoretical cases, which we have so far never seen to be a problem, and to increase the reliability of allocations of page orders larger than 3 to a usable level. So far I have not dared to enable orders larger than 3 by default.

AFAICT the performance of vmalloc is not really relevant. If this were to become an issue, it would be possible to reduce the orders used to avoid fallbacks.

AFAICT SLUB's performance is superior to SLAB in most cases, and it was like that from the beginning. I am still concerned about several corner cases though (I think most of them are going to be addressed by the per cpu patches in mm). Having a number of per cpu objects comparable to or larger than SLAB's is something that could also address some of these concerns and could increase performance much further.
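As a rough illustration of the point about objects per page, the sketch below computes an upper bound on how many objects fit in one slab of a given page order. It is not kernel code: `objects_per_slab` and `SKETCH_PAGE_SIZE` are hypothetical names, the 4 KiB base page size is an assumption, and alignment padding and per-object metadata are ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB base pages; other architectures and configs differ. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Hypothetical helper, not a kernel function: upper bound on how many
 * objects of object_size bytes fit in one slab made of 2^order
 * contiguous pages. Ignores alignment padding and debug metadata.
 */
static size_t objects_per_slab(unsigned int order, size_t object_size)
{
	assert(object_size > 0);
	return (SKETCH_PAGE_SIZE << order) / object_size;
}
```

Under these assumptions a 256-byte object gives 16 objects per slab at order 0 and 128 at order 3, so a cpu slab of order 3 can serve roughly eight times as many allocations before a new slab has to be fetched.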
