MAP_SHARED mappings on hugetlbfs reserve huge pages at mmap() time. This guarantees all future faults against the mapping will succeed. This allows local allocations at first use improving NUMA locality whilst retaining reliability. MAP_PRIVATE mappings do not reserve pages. This can result in an application being SIGKILLed later if a huge page is not available at fault time. This makes huge pages usage very ill-advised in some cases as the unexpected application failure cannot be detected and handled as it is immediately fatal. Although an application may force instantiation of the pages using mlock(), this may lead to poor memory placement and the process may still be killed when performing COW. This patchset introduces a reliability guarantee for the process which creates a private mapping, i.e. the process that calls mmap() on a hugetlbfs file successfully. The first patch of the set is purely mechanical code move to make later diffs easier to read. The second patch will guarantee faults up until the process calls fork(). After patch two, as long as the child keeps the mappings, the parent is no longer guaranteed to be reliable. Patch 3 guarantees that the parent will always successfully COW by unmapping the pages from the child in the event there are insufficient pages in the hugepage pool in allocate a new page, be it via a static or dynamic pool. Existing hugepage-aware applications are unlikely to be affected by this change. For much of hugetlbfs's history, pages were pre-faulted at mmap() time or mmap() failed which acts in a reserve-like manner. If the pool is sized correctly already so that parent and child can fault reliably, the application will not even notice the reserves. It's only when the pool is too small for the application to function perfectly reliably that the reserves come into play. Credit goes to Andy Whitcroft for cleaning up a number of mistakes during review before the patches were released. -- Mel Gorman Part-time Phd Student Linux Technology Center University of Limerick IBM Dublin Software Lab --
| Andrea Arcangeli | [PATCH 06 of 11] rwsem contended |
| Mikulas Patocka | LFENCE instruction (was: [rfc][patch 3/3] x86: optimise barriers) |
| Rafael J. Wysocki | Re: [Bug 10030] Suspend doesn't work when SD card is inserted |
| Manu Abraham | PCIE |
git: | |
| Sverre Rabbelier | Git vs Monotone |
| Junio C Hamano | [ANNOUNCE] GIT 1.5.4 |
| Bill Lear | Meaning of "fatal: protocol error: bad line length character"? |
| Junio C Hamano | Re: [PATCH] Teach remote machinery about remotes.default config variable |
| Richard Stallman | Real men don't attack straw men |
| Stefan Beke | mail dovecot: pipe() failed: Too many open files |
| Wijnand Wiersma | Almost success: OpenBSD on Xen |
| Didier Wiroth | how can I "find xyz | xargs tar" ... like gtar |
| Greg A. Woods | Re: Fork bomb protection patch |
| Tyler Retzlaff | Re: more summer of code fun |
| Elad Efrat | Re: sysctl knob to let sugid processes dump core (pr 15994) |
| Thor Lancelot Simon | Re: FFS journal |
