On Tue, Sep 11, 2007 at 04:52:19AM +1000, Nick Piggin wrote:

MM kernels also forbid mmap, so there's no chance the largepages are mlocked etc... that's not the final thing that is being measured.

Seconded. Additionally, I feel the ones that will get the main advantage from the quick hack are the crippled devices that are ~30% slower if the SG tables are large.

Yep.

Agreed. For my part, I am really convinced the only sane way to approach the VM scalability and larger physically contiguous pages problem is the CONFIG_PAGE_SHIFT patch (aka large PAGE_SIZE from Hugh for 2.4). I also have to say I always disliked the PAGE_CACHE_SIZE definition too ;). I take it only as an attempt at documentation.

Furthermore, all the issues with write-protect faults over MAP_PRIVATE regions will have to be addressed the same way with both approaches if we want real 100% 4k-granular backwards compatibility.

On this topic I'm also going to suggest that the cpu vendors add a 64k tlb using the reserved 62nd bitflag in the pte (right after the NX bit). So if alignment allows, we can map pagecache with a 64k large tlb on x86 (with a PAGE_SIZE of 64k), mixing it with the 4k tlb in the same address space if userland alignment forbids using the 64k tlb. If we want to break backwards compatibility and force all alignments to 64k and get rid of any 4k tlb to simplify the page fault code, we can do it later anyway... No idea if this is feasible to achieve on the hardware level though, it's not my problem anyway to judge this ;).

As constraints on the hardware interface, it would be ok to require the 62nd 64k-tlb bitflag to be available only on the pte that would normally have mapped a physical address that is 64k naturally aligned, and to require all later overlapping 4k ptes to be set to 0. If you've better ideas to achieve this than my interface, please let me know.
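To make the constraints above concrete, here is a rough userspace sketch of the check they imply for one group of sixteen 4k ptes. Every name in it (PTE_LARGE64K, PTE_ADDR_MASK, pte_group_valid_64k) is made up for illustration; no such bit exists in any shipping cpu and no such helper exists in the kernel. It also assumes the x86-64 layout with the physical address in bits 12..51, and that the flagged pte sits in the first slot of a naturally aligned group of 16, which the proposal implies but does not spell out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout: nothing here is a real kernel or cpu interface. */
#define SMALL_PAGE_SHIFT 12   /* 4k hardware page */
#define LARGE_TLB_SHIFT  16   /* proposed 64k tlb entry */
#define PTES_PER_LARGE   ((size_t)1 << (LARGE_TLB_SHIFT - SMALL_PAGE_SHIFT)) /* 16 */

#define PTE_LARGE64K     (1ULL << 62)            /* the reserved bit suggested above */
#define PTE_ADDR_MASK    0x000ffffffffff000ULL   /* assumed: phys address bits 12..51 */

typedef uint64_t pte_t;

/*
 * Return true if the 16 ptes starting at index `first` obey the proposed rules:
 *  - `first` is the first slot of a naturally aligned group of 16 (assumption),
 *  - the head pte carries the 64k flag,
 *  - the head pte points at a 64k naturally aligned physical address,
 *  - the 15 later ptes that the 64k mapping overlaps are all 0.
 */
bool pte_group_valid_64k(const pte_t *ptes, size_t first)
{
    const uint64_t large_mask = (1ULL << LARGE_TLB_SHIFT) - 1;
    pte_t head;
    size_t i;

    if (first % PTES_PER_LARGE != 0)
        return false;

    head = ptes[first];
    if (!(head & PTE_LARGE64K))
        return false;

    if ((head & PTE_ADDR_MASK) & large_mask)
        return false;

    for (i = 1; i < PTES_PER_LARGE; i++)
        if (ptes[first + i] != 0)
            return false;

    return true;
}
```

A group that fails this check would simply keep being mapped with ordinary 4k ptes, which is the mixed 4k/64k case described above.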

And if I'm terribly wrong and the variable order pagecache is the way to go for the long run, the 64k tlb feature will fit in that model very nicely too. The reason for the 64k magic number is that this is the minimum unit of contiguous I/O required to reach platter speed on most devices out there. And it incidentally also matches ppc64 ;).
