The recently-released 2.5.36 kernel now includes (among other things, like XFS) Rohit Seth's Huge TLB Pages patch (IA-32 only). This enables support for page sizes larger than 4k, but user-space apps must use special system calls to take advantage of the large pages.
Smaller pages are space-efficient, reduce memory fragmentation, and are easy to swap, so they suit most tasks. Large pages are better suited to shared memory and require fewer page-table entries.
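To put rough numbers on the page-table savings, here is a small sketch. The 256 MiB segment size is a made-up figure for illustration, and the helper name `pages_needed` is hypothetical. The 2 MiB and 4 MiB sizes are the large-page sizes IA-32 offers with and without PAE.

```python
KIB = 1024
MIB = 1024 * KIB

def pages_needed(region_bytes, page_bytes):
    """Number of pages (one page-table entry each) needed to map a region."""
    return -(-region_bytes // page_bytes)  # ceiling division

# Hypothetical shared-memory segment size, chosen only for illustration.
region = 256 * MIB

for label, size in (("4 KiB", 4 * KIB), ("2 MiB", 2 * MIB), ("4 MiB", 4 * MIB)):
    print(f"{label:>6} pages: {pages_needed(region, size)} entries")
```

Under these assumptions the same segment needs 65536 entries with 4 KiB pages, but only 128 with 2 MiB pages or 64 with 4 MiB pages.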
This patch has already been in Andrew Morton's [interview] -mm tree for some time, so it has (presumably) received some good testing.
Related Links:
Changeset details for the patch
Changelog for kernel 2.5.36:
TLB
TLB means "translation lookaside buffer", but that's still not very clear. I think a more "correct" name would be "Virtual-to-Physical address cache", since that's kinda what it is.
The TLB resides in the CPU and keeps a list of the most recently used virtual addresses and the physical addresses they map to. That provides a performance boost, because without a TLB every physical address would have to be looked up in the page tables (which are stored in relatively slow main memory).
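As a rough illustration of the caching idea, here is a toy model in Python. It is only a sketch: a real TLB is hardware, usually set-associative, and does not necessarily evict in strict least-recently-used order. The class name `ToyTLB` and the dictionary standing in for the page tables are both invented for this example.

```python
from collections import OrderedDict

class ToyTLB:
    """Toy LRU cache of virtual-page -> physical-frame translations."""

    def __init__(self, entries, page_size):
        self.entries = entries      # how many translations the cache holds
        self.page_size = page_size  # bytes per page
        self.cache = OrderedDict()  # virtual page number -> physical frame number
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr, page_table):
        vpn, offset = divmod(vaddr, self.page_size)
        if vpn in self.cache:
            self.hits += 1
            self.cache.move_to_end(vpn)        # mark as most recently used
        else:
            self.misses += 1
            self.cache[vpn] = page_table[vpn]  # slow path: page-table lookup
            if len(self.cache) > self.entries:
                self.cache.popitem(last=False)  # evict least recently used
        return self.cache[vpn] * self.page_size + offset

# Hypothetical page table: virtual page n lives in physical frame n + 100.
page_table = {vpn: vpn + 100 for vpn in range(16)}
tlb = ToyTLB(entries=4, page_size=4096)
for vaddr in (0, 8, 4096, 0):
    tlb.translate(vaddr, page_table)
print(tlb.hits, tlb.misses)
```

In this model a bigger `page_size` means each cached entry covers more memory, which is the appeal of huge pages: the same handful of entries reaches much further before a miss.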
So, who are the best users of Huge Pages?
Call me ignorant or perhaps naive in the land of VM, but what sorts of apps are huge pages aimed at?
I personally can think of three scenarios, but I don't know if I'm talking out my hindside:
So am I off in left field, or are these sorts of things the intended audience for Huge Pages?
FWIW, the reason I mention libc is that a simple "grep libc- /proc/*/maps | grep -v rw" turns up a huge list on even my mostly-quiescent box. A typical entry looks like so:
42000000-4212c000 r-xp 00000000 03:05 72290 /lib/i686/libc-2.2.5.so
If you notice, about 1.2MB is covered by libc. This same range is mapped by ALL the processes. This range could be covered by one 2MB or 4MB page, right? (That is, assuming the "rw" section that abuts it is moved outside the range covered by the single 2MB or 4MB huge-page.) Unless I'm smoking something, wouldn't that trade about 300 pages for 1? What would be the drawbacks, aside from the fact you now wouldn't be able to swap libc? (Arguably, if one library is kept in memory, shouldn't it be libc?)
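The arithmetic can be checked straight from the maps entry. This is just a sketch: the helper name `mapping_size` is made up, and it assumes the usual /proc/<pid>/maps layout where the first field is a hexadecimal start-end range.

```python
PAGE_4K = 4096
HUGE_2M = 2 * 1024 * 1024
HUGE_4M = 4 * 1024 * 1024

def mapping_size(maps_line):
    """Size in bytes of the address range in one /proc/<pid>/maps line."""
    addr_range = maps_line.split()[0]  # e.g. "42000000-4212c000"
    start, end = (int(x, 16) for x in addr_range.split("-"))
    return end - start

line = "42000000-4212c000 r-xp 00000000 03:05 72290 /lib/i686/libc-2.2.5.so"
size = mapping_size(line)

print(size)             # bytes covered by the read-only libc mapping
print(size // PAGE_4K)  # number of 4 KiB pages in that range
print(size <= HUGE_2M)  # would it fit inside a single 2 MiB page?
```

For this entry the range is 1228800 bytes, which is exactly 300 pages of 4 KiB and fits inside either a 2 MiB or a 4 MiB page.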
--Joe
Oracle and 2MB pages == 8% speedup
Heh... Same day I posted my question, LWN reports on this email from HP, which notes that 2MB PTEs for Oracle's shared memory gave an 8% speedup on their TPC-C numbers.
I wonder if the speedup is from less bookkeeping work in the kernel, or from less TLB thrash, or both?
I'm still curious if huge-page w/ libc is something that's "Good" or "Bad".
--Joe