login
Header Space

 
 

Re: [PATCH 08 of 11] anon-vma-rwsem

Score:
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
To: Andrew Morton <akpm@...>
Cc: <clameter@...>, <steiner@...>, <holt@...>, <npiggin@...>, <a.p.zijlstra@...>, <kvm-devel@...>, <kanojsarcar@...>, <rdreier@...>, <swise@...>, <linux-kernel@...>, <avi@...>, <linux-mm@...>, <general@...>, <hugh@...>, <rusty@...>, <aliguori@...>, <chrisw@...>, <marcelo@...>, <dada1@...>, <paulmck@...>
Date: Wednesday, May 7, 2008 - 7:39 pm

Hi Andrew,

On Wed, May 07, 2008 at 03:59:14PM -0700, Andrew Morton wrote:
				================== mmu_notifier_register()

But the problem is that we've to stop the critical section in the
place I marked with "========" while mmu_notifier_register
runs. Otherwise the driver calling mmu_notifier_register won't know if
it's safe to start establishing secondary sptes/tlbs. If the driver
will establish sptes/tlbs with get_user_pages/follow_page the page
could be freed immediately later when zap_page_range starts.

So if CPU1 doesn't take the global_lock before proceeding in
zap_page_range (inside vmtruncate i_mmap_lock that is represented as
b->lock above) we're in trouble.

What we can do is to replace the mm_lock with a
spin_lock(&global_lock) only if all places that takes i_mmap_lock
takes the global lock first and that hurts scalability of the fast
paths that are performance critical like vmtruncate and
anon_vma->lock. Perhaps they're not so performance critical, but
surely much more performant critical than mmu_notifier_register ;).

The idea of polluting various scalable paths like truncate() syscall
in the VM with a global spinlock frightens me, I'd rather return to
invalidate_page() inside the PT lock removing both
invalidate_range_start/end. Then all serialization against the mmu
notifiers will be provided by the PT lock that the secondary mmu page
fault also has to take in get_user_pages (or follow_page). In any case
that is a better solution that won't slowdown the VM when
MMU_NOTIFIER=y even if it's a bit slower for GRU, for KVM performance
is about the same with or without invalidate_range_start/end. I didn't
think anybody could care about how long mmu_notifier_register takes
until it returns compared to all heavyweight operations that happens
to start a VM (not only in the kernel but in the guest too).

Infact if it's security that we worry about here, can put a cap of
_time_ that mmu_notifier_register can take before it fails, and we
fail to start a VM if it takes more than 5sec, that's still fine as
the failure could happen for other reasons too like vmalloc shortage
and we already handle it just fine. This 5sec delay can't possibly happen in
practice anyway in the only interesting scenario, just like the
vmalloc shortage. This is obviously a superior solution than polluting
the VM with an useless global spinlock that will destroy truncate/AIM
on numa.

Anyway Christoph, I uploaded my last version here:

       	http://www.kernel.org/pub/linux/kernel/people/andrea/patches/v2.6/2.6.25/mmu-notifier-...

(applies and runs fine on 26-rc1)

You're more than welcome to takeover from it, I kind of feel my time
now may be better spent to emulate the mmu-notifier-core with kprobes.
--
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
[PATCH 00 of 11] mmu notifier #v16, Andrea Arcangeli, (Wed May 7, 10:35 am)
[PATCH 10 of 11] export zap_page_range for XPMEM, Andrea Arcangeli, (Wed May 7, 10:36 am)
[PATCH 11 of 11] mmap sems, Andrea Arcangeli, (Wed May 7, 10:36 am)
[PATCH 09 of 11] mm_lock-rwsem, Andrea Arcangeli, (Wed May 7, 10:35 am)
[PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 10:35 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 4:56 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 5:26 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Jack Steiner, (Wed May 7, 6:42 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 5:36 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Wed May 7, 8:38 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Tue May 13, 8:06 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Tue May 13, 11:32 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Wed May 14, 12:11 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Wed May 14, 7:26 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Thu May 15, 3:57 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Christoph Lameter, (Thu May 15, 1:33 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Thu May 15, 7:52 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Fri May 16, 7:23 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Fri May 16, 7:50 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Tue May 20, 1:31 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Tue May 20, 6:01 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Tue May 20, 6:50 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Tue May 20, 7:05 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Tue May 20, 7:14 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Tue May 20, 7:26 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Thu May 15, 7:01 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Avi Kivity, (Thu May 15, 7:12 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 14, 11:18 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Christoph Lameter, (Wed May 14, 1:57 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 14, 2:27 pm)
mm notifier: Notifications when pages are unmapped., Christoph Lameter, (Fri May 16, 9:38 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Wed May 14, 12:22 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 14, 12:56 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 8:55 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 6:22 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 6:44 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 6:58 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 7:09 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 7:02 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrew Morton, (Wed May 7, 6:31 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 6:44 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Benjamin Herrenschmidt, (Wed May 7, 7:28 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 7:45 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 9:34 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Tue May 13, 8:14 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Benjamin Herrenschmidt, (Wed May 14, 1:43 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Jack Steiner, (Wed May 14, 9:15 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Wed May 14, 2:06 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrew Morton, (Wed May 7, 6:59 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 7:39 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 9:02 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 9:26 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Christoph Lameter, (Wed May 7, 9:12 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 10:56 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Christoph Lameter, (Wed May 7, 11:10 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 11:41 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Thu May 8, 12:14 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Thu May 8, 1:20 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Thu May 8, 11:03 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Thu May 8, 12:11 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Peter Zijlstra, (Fri May 9, 2:37 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Fri May 9, 2:55 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Peter Zijlstra, (Fri May 9, 3:04 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Thu May 8, 6:01 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Pekka Enberg, (Thu May 8, 1:27 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Pekka Enberg, (Thu May 8, 1:30 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Thu May 8, 1:49 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 9:32 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 7:19 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Christoph Lameter, (Wed May 7, 7:39 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 8:03 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Christoph Lameter, (Wed May 7, 8:56 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 9:39 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 9:52 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 9:57 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 10:24 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 10:32 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 9:07 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Wed May 7, 8:52 pm)
[PATCH 07 of 11] i_mmap_rwsem, Andrea Arcangeli, (Wed May 7, 10:35 am)
[PATCH 06 of 11] rwsem contended, Andrea Arcangeli, (Wed May 7, 10:35 am)
[PATCH 05 of 11] unmap vmas tlb flushing, Andrea Arcangeli, (Wed May 7, 10:35 am)
Re: [PATCH 05 of 11] unmap vmas tlb flushing, Rik van Riel, (Wed May 7, 1:46 pm)
[PATCH 04 of 11] free-pgtables, Andrea Arcangeli, (Wed May 7, 10:35 am)
Re: [PATCH 04 of 11] free-pgtables, Rik van Riel, (Wed May 7, 1:41 pm)
[PATCH 03 of 11] invalidate_page outside PT lock, Andrea Arcangeli, (Wed May 7, 10:35 am)
Re: [PATCH 03 of 11] invalidate_page outside PT lock, Rik van Riel, (Wed May 7, 1:39 pm)
Re: [PATCH 03 of 11] invalidate_page outside PT lock, Andrea Arcangeli, (Wed May 7, 1:57 pm)
[PATCH 01 of 11] mmu-notifier-core, Andrea Arcangeli, (Wed May 7, 10:35 am)
Re: [PATCH 01 of 11] mmu-notifier-core, Andrew Morton, (Wed May 7, 4:05 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Linus Torvalds, (Wed May 7, 4:30 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Andrea Arcangeli, (Wed May 7, 5:58 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Linus Torvalds, (Wed May 7, 6:11 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Andrea Arcangeli, (Wed May 7, 6:27 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Linus Torvalds, (Wed May 7, 7:00 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Andrea Arcangeli, (Wed May 7, 6:37 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Linus Torvalds, (Wed May 7, 7:38 pm)
Re: [ofa-general] Re: [PATCH 01 of 11] mmu-notifier-core, Andrea Arcangeli, (Wed May 7, 6:39 pm)
Re: [ofa-general] Re: [PATCH 01 of 11] mmu-notifier-core, Linus Torvalds, (Wed May 7, 7:03 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Andrew Morton, (Wed May 7, 4:02 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Rik van Riel, (Wed May 7, 1:35 pm)
[PATCH 02 of 11] get_task_mm, Andrea Arcangeli, (Wed May 7, 10:35 am)
Re: [PATCH 02 of 11] get_task_mm, Robin Holt, (Wed May 7, 11:59 am)
Re: [PATCH 02 of 11] get_task_mm, Andrea Arcangeli, (Wed May 7, 12:20 pm)
speck-geostationary