On Sun, Sep 16, 2007 at 03:54:56PM +0200, Goswin von Brederlow wrote:

What does the large square represent here? A "largepage"? If yes, which order? There seem to be quite a few pixels in each square...

If the largepage is the square, there can't be red pixels mixed with green pixels with the config-page-shift design; that is the whole difference... Zooming in, I see red pixels all over the squares, mixed with green pixels in the same square. This is exactly what happens with the variable order page cache, and that's why it provides zero guarantees in terms of how much ram is really "free" (free as in "available").

If I understood correctly, here you agree that mixing movable and unmovable objects in the same largepage is a bad thing, and that's incidentally what config-page-shift prevents. It avoids the mixture instead of undoing it later with defrag, when it's far too late for anything but updatedb.

With config-page-shift mmap works on 4k chunks, but it's always backed by 64k or whatever other large size you chose at compile time. And if the virtual alignment of the mmap matches the physical alignment of the physical largepage and is >= PAGE_SIZE (software PAGE_SIZE I mean), we could use the 62nd bit of the pte to use a 64k tlb (if future cpus will allow that). Nick also suggested still setting all ptes equal to make life easier for the tlb miss microcode.

Yep, exactly this is what happens, it avoids that trouble. But as far as fragmentation guarantees go, it's really more about keeping the unmovable out of our way (instead of spreading the unmovable all over the buddy randomly, or with ugly boot-time-fixed-numbers-memory-reservations) than about mapping largepages in userland. In fact, as I said, we could map kmalloced 4k entries in userland to save memory if we really wanted to hurt the fast paths to make a generic kernel to use on smaller systems, but that would be very complex.
Since those 4k entries would be 100% movable (unlike the rest of the slab, such as dentries and inodes etc..) that wouldn't make the design less reliable: it'd still be 100% reliable, and performance would be ok because that memory is userland memory; we have to set the pte anyway, regardless of whether it's a 4k page or a largepage.

Sure! 2M is surely way excessive for a 1G system, and 64k most certainly too, unless of course you're running a db or a multimedia streaming service, in which case it should be ideal.
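
To make the alignment condition for the 64k tlb case above a bit more concrete, here is a minimal userspace sketch of the check, not kernel code. It assumes a 64k software PAGE_SIZE, and it reads "alignment matches" as "both the virtual and the physical address sit on a 64k boundary and the mapping covers at least one full software page". The names SOFT_PAGE_SHIFT, SOFT_PAGE_SIZE and can_use_large_tlb are made up for illustration and do not come from any patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical compile-time choice: 64k software PAGE_SIZE (shift 16). */
#define SOFT_PAGE_SHIFT 16
#define SOFT_PAGE_SIZE  (1ULL << SOFT_PAGE_SHIFT)

/*
 * Illustrative only: returns true when a mapping could in principle be
 * covered by a single large tlb entry, under the assumptions stated above.
 *  - vaddr and paddr both start on a software-page boundary
 *  - the mapping is at least one software page long
 */
bool can_use_large_tlb(uint64_t vaddr, uint64_t paddr, uint64_t len)
{
    const uint64_t mask = SOFT_PAGE_SIZE - 1;

    if ((vaddr & mask) != 0)    /* virtual start not 64k aligned */
        return false;
    if ((paddr & mask) != 0)    /* physical start not 64k aligned */
        return false;
    return len >= SOFT_PAGE_SIZE;
}
```

A 4k-aligned mmap that fails this check would simply keep using ordinary 4k ptes into the same 64k physical page, so nothing changes for the fragmentation guarantees either way.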
