Hi,

On Tue, Sep 11, 2007 at 07:31:01PM +0100, Mel Gorman wrote:

Initially 4k kmalloced tails aren't going to be mapped in userland. But let's take the kernel stack, which would generate the same problem and which is clearly going to pin the whole 64k slab/slub page.

What I think you're missing is that for Nick's worst case to trigger with the config_page_shift design, you would need the _whole_ ram to be _at_least_once_ allocated completely in kernel stacks. Unless 100% of ram gets allocated in slub as pure kernel stacks, such a scenario can never materialize.

With the SGI design + defrag, Nick's scenario can instead happen with only total_ram/64k kernel stacks allocated.

The problem with slub fragmentation isn't a new problem; it happens in today's kernels as well, and at least the slab by design is meant to _defrag_ internally. So it's practically already solved, and it provides some guarantee, unlike the buddy allocator.

It's not anyone's fault. I simply didn't push too hard towards my agenda for the reasons I just said, but I used any given opportunity to discuss it. By on-topic I meant not talking about it during the other topics, like mmap_sem or RCU with radix tree lock ;)

Well, I only meant I'm still free to disagree if I think there's a better way.

All SGI has provided so far is data to show that their I/O subsystem is much faster if the data is physically contiguous in ram (ask Linus if you want more details, or better don't ask). That's not very interesting data for my usages and with my hardware, and I guess it's more likely that config_page_shift will produce interesting numbers than their patch on my possible usage cases, but we'll never know until both are finished.

Indeed! Pagetables aren't the issue. They should still be pre-allocated in page_size chunks. The 4k entries with a 64k page size are surely no worse than a 32-byte kmalloc today; the slab by design defragments the stuff.
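Going back to the kernel-stack worst case above, here is a back-of-the-envelope sketch of the two numbers being compared. It only restates the claim as I made it; the 4 GiB of ram and the 4k stack size are assumptions picked for illustration, and the function name is made up.

```python
KIB = 1024
GIB = 1024 ** 3


def stacks_to_pin_all_ram(total_ram, large_page=64 * KIB, stack_size=4 * KIB):
    """Return (live_stacks_high_order, peak_stacks_page_shift).

    live_stacks_high_order: one live kernel stack per 64k block is enough
    to pin every block (the total_ram/64k figure from the text).
    peak_stacks_page_shift: under the argument above, the whole ram must at
    some point have been filled with stacks, i.e. total_ram/stack_size.
    """
    blocks = total_ram // large_page
    stacks_per_block = large_page // stack_size
    live_stacks_high_order = blocks
    peak_stacks_page_shift = blocks * stacks_per_block
    return live_stacks_high_order, peak_stacks_page_shift


# Assumed machine: 4 GiB of ram, 64k large pages, 4k kernel stacks.
high_order, page_shift = stacks_to_pin_all_ram(4 * GIB)
print(high_order, page_shift, page_shift // high_order)
# prints: 65536 1048576 16
```

With these assumed sizes the gap is a factor of 16, which is simply large_page/stack_size.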
There's probably room for improvement in how the slab defragments even without freeing any object, just by ordering the partial-slab list with an rbtree (or better, a heap, like CFS should also use!!) so as to always allocate new objects from the fullest partial slab. That alone would probably help a lot (not sure if slub does anything like that, I'm not that familiar with slub yet).

Disagree here...

Also note that not all users will need to turn on the tail packing. We're talking here about features that not all users will need anyway.

And we're in the same boat as ppc64, no difference.

Thanks!
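P.S. To make the "allocate from the fullest partial slab" idea concrete, here is a minimal userland sketch in Python. It is not kernel code and says nothing about what slab or slub actually do; the class name, the fixed per-slab capacity and the lazy skipping of stale heap entries are all assumptions made for the example.

```python
import heapq


class PartialSlabHeap:
    """Toy allocator that always allocates from the fullest partial slab."""

    def __init__(self, capacity):
        self.capacity = capacity  # objects per slab (hypothetical fixed size)
        self.used = {}            # slab_id -> number of live objects
        self._heap = []           # (-used, slab_id); may hold stale entries
        self._next_id = 0

    def _push(self, slab_id):
        used = self.used[slab_id]
        if 0 < used < self.capacity:  # only partial slabs are candidates
            heapq.heappush(self._heap, (-used, slab_id))

    def alloc(self):
        """Allocate one object and return the id of the slab it came from."""
        while self._heap:
            neg_used, slab_id = heapq.heappop(self._heap)
            if self.used.get(slab_id) == -neg_used:  # entry is still current
                break
        else:
            # No usable partial slab left: start a new, empty slab.
            slab_id = self._next_id
            self._next_id += 1
            self.used[slab_id] = 0
        self.used[slab_id] += 1
        self._push(slab_id)
        return slab_id

    def free(self, slab_id):
        """Free one object from the given slab."""
        self.used[slab_id] -= 1
        if self.used[slab_id] == 0:
            del self.used[slab_id]  # empty slab goes back to the page allocator
        else:
            self._push(slab_id)


heap = PartialSlabHeap(capacity=4)
for _ in range(8):
    heap.alloc()          # fills slab 0, then slab 1
for _ in range(3):
    heap.free(0)          # slab 0 drops to 1 live object
heap.free(1)              # slab 1 drops to 3 live objects
print(heap.alloc())       # prints: 1 (the fuller slab is refilled first)
```

Refilling the fuller slab first gives the nearly empty one a chance to drain completely and be handed back whole, which is the point of the ordering.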
