[patch 18/21] mlock vma pages under mmap_sem held for read

Previous thread: [patch 20/21] account mlocked pages by Rik van Riel on Thursday, February 28, 2008 - 3:29 pm. (1 message)

Next thread: [PATCH 0/4] firewire: trivial fw-sbp2 updates by Stefan Richter on Thursday, February 28, 2008 - 3:50 pm. (5 messages)
To: <linux-kernel@...>
Cc: KOSAKI Motohiro <kosaki.motohiro@...>, Lee Schermerhorn <Lee.Schermerhorn@...>, <linux-mm@...>
Date: Thursday, February 28, 2008 - 3:29 pm

V2 -> V3:
+ rebase to 23-mm1 atop RvR's split lru series [no change]
+ fix function return types [void -> int] to fix build when
  not configured.

New in V2.

We need to hold the mmap_sem for write to initiate mlock()/munlock()
because we may need to merge/split vmas.  However, this can lead to
very long lock hold times while faulting in a large memory region to
mlock it, especially if we need to reclaim memory to lock down the
region.  This can hold off other faults against the mm [multithreaded
tasks] and other scans of the mm, such as via /proc.  To alleviate
this, downgrade the mmap_sem to read mode during the population of the
region for locking.  We [probably?] don't need to do this for
unlocking as all of the pages should be resident--they're already
mlocked.
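
As a rough illustration of the lock-mode sequence described above, here is
a small user-space model.  It is only a sketch: the toy_* names, the
single-threaded state tracking, and the 4096-byte page size are
assumptions made for illustration, not the patch's code.  The model only
mirrors the order of the mode changes [write -> read -> released ->
write]; it does not provide any real locking.

```c
#include <assert.h>

#define TOY_PAGE_SIZE 4096UL   /* assumed page size, illustration only */

/* Toy stand-in for an rwsem: tracks the mode held by a single caller. */
enum toy_mode { TOY_UNLOCKED, TOY_READ, TOY_WRITE };
struct toy_rwsem { enum toy_mode mode; };

static void toy_down_write(struct toy_rwsem *s)
{
	assert(s->mode == TOY_UNLOCKED);
	s->mode = TOY_WRITE;
}

static void toy_downgrade_write(struct toy_rwsem *s)
{
	assert(s->mode == TOY_WRITE);
	s->mode = TOY_READ;
}

static void toy_up_read(struct toy_rwsem *s)
{
	assert(s->mode == TOY_READ);
	s->mode = TOY_UNLOCKED;
}

/* Hypothetical population step: needs the lock held in some mode. */
static long toy_populate(const struct toy_rwsem *s,
			 unsigned long start, unsigned long end)
{
	assert(s->mode != TOY_UNLOCKED);
	return (long)((end - start) / TOY_PAGE_SIZE);
}

/* Entered and exited in write mode; populates in read mode. */
static long toy_mlock_range(struct toy_rwsem *s,
			    unsigned long start, unsigned long end)
{
	long pages;

	assert(s->mode == TOY_WRITE);
	toy_downgrade_write(s);          /* other readers may proceed */
	pages = toy_populate(s, start, end);
	toy_up_read(s);                  /* lock is released here ... */
	toy_down_write(s);               /* ... and retaken for write */
	return pages;
}
```

In the model, a caller that takes the toy lock with toy_down_write() and
then calls toy_mlock_range() gets the lock back in write mode, which is
what the existing callers expect.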

Now, the callers of the mlock functions [mlock_fixup() and
mlock_vma_pages_range()] expect the mmap_sem to be returned in write
mode.  Changing all callers appears to be way too much effort at this
point.  So, restore write mode before returning.  An rwsem cannot be
upgraded from read to write in place, so the read lock has to be
dropped and the semaphore retaken for write.  Note that this opens
a window where the mmap list could change in a multithreaded process.
So, at least for mlock_fixup(), where we could be called in a loop over
multiple vmas, we check that a vma still exists at the start address
and that vma still covers the page range [start,end).  If not, we return
an error, -EAGAIN, and let the caller deal with it.
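
The revalidation check described above might look roughly like the
following user-space sketch.  The toy_vma structure and toy_find_vma()
are simplified, hypothetical stand-ins for the kernel's vma list and
find_vma(); only the shape of the check [vma exists at 'start' and
still covers [start,end), else -EAGAIN] is taken from the description.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for a vma: covers [vm_start, vm_end). */
struct toy_vma {
	unsigned long vm_start;
	unsigned long vm_end;
	struct toy_vma *vm_next;   /* list sorted by address */
};

/* Like find_vma(): first vma whose vm_end is above addr, or NULL. */
static struct toy_vma *toy_find_vma(struct toy_vma *head, unsigned long addr)
{
	struct toy_vma *v;

	for (v = head; v; v = v->vm_next)
		if (v->vm_end > addr)
			return v;
	return NULL;
}

/*
 * Called after write mode has been restored: return 0 if a single
 * vma still covers [start, end), otherwise -EAGAIN.
 */
static int toy_revalidate_range(struct toy_vma *head,
				unsigned long start, unsigned long end)
{
	struct toy_vma *v;

	assert(start < end);
	v = toy_find_vma(head, start);
	if (!v || v->vm_start > start || end > v->vm_end)
		return -EAGAIN;   /* vma vanished, moved, or shrank */
	return 0;
}
```

The explicit vm_start comparison is there because a find_vma()-style
lookup can return a vma that begins above 'start' when 'start' falls in
a hole.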

Return -EAGAIN from mlock_vma_pages_range() function and mlock_fixup()
if the vma at 'start' disappears or changes so that the page range
[start,end) is no longer contained in the vma.  Again, let the caller
deal with it.  Looks like only sys_remap_file_pages() [via mmap_region()]
should actually care.

With this patch, I no longer see processes like ps(1) blocked for seconds
or minutes at a time waiting for a large [multiple gigabyte] region to be
locked down.  

Signed-off-by:  Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by:  Rik van Riel <riel@redha...