Re: [PATCH 01 of 11] mmu-notifier-core

Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
From: Andrea Arcangeli
Date: Wednesday, May 7, 2008 - 2:58 pm

On Wed, May 07, 2008 at 01:30:39PM -0700, Linus Torvalds wrote:

I can't follow this, please be more specific.

About the patches I merged from Christoph, I didn't touch them at all
(except for fixing a kernel crashing bug in them plus some reject
fix). Initially I didn't even add a signed-off-by: andrea, and I only
had the signed-off-by: christoph. But then he said I had to add my
signed-off-by too, while I thought at most an acked-by was
required. So if I got any attribution on Christoph work it's only
because he explicitly requested it as it was passing through my
maintenance line. In any case, all patches except mmu-notifier-core
are irrelevant in this context and I'm entirely fine to give Christoph
the whole attribution of the whole patchset including the whole
mmu-notifier-core where most of the code is mine.

We had many discussions with Christoph, Robin and Jack, but I can
assure you nobody had a single problem with regard to attribution.

About all patches except mmu-notifier-core: Christoph, Robin and
everyone (especially myself) agrees those patches can't yet be merged
in 2.6.26.

With regard to the post-2.6.26 material, I think adding a config
option to make the change at compile time, is ok. And there's no other
way to deal with it in a clean way, as vmtrunate has to teardown
pagetables, and if the i_mmap_lock is a spinlock there's no way to
notify secondary mmus about it, if the ->invalidate_range_start method
has to allocate an skb, send it through the network and wait for I/O
completion with schedule().


No upside to all people setting CONFIG_KVM=n true, but no downside
for them either, that's the important fact!

And for all the people setting CONFIG_KVM!=n, I should provide some
background here. KVM MM development is halted without this, that
includes: paging, ballooning, tlb flushing at large, pci-passthrough
removing page pin as a whole, etc...

Everyone on kvm-devel talks about mmu-notifiers, check the last VT-d
patch form Intel where Antony (IBM/qemu/kvm) wonders how to handle
things without mmu notifiers (mlock whatever).

Rusty agreed we had to get mmu notifiers in 2.6.26 so much that he has
gone as far as writing his own ultrasimple mmu notifier
implementation, unfortunately too simple as invalidate_range_start was
missing and we can't remove the page pinning and avoid doing
spte=invalid;tlbflush;unpin for every group of sptes released without
it. And without mm_lock invalidate_range_start can't be implemented in
a generic way (to work for GRU/XPMEM too).


That's only invoked in mmu_notifier_register, mm_lock is explicitly
documented as heavyweight function. In the KVM case it's only called
when a VM is created, that's irrelevant cpu cost compared to the time
it takes to the OS to boot in the VM... (especially without real mode
emulation with direct NPT-like secondary-mmu paging).

mm_lock solved the fundamental race in the range_start/end
invalidation model (that will allow GRU to do a single tlb flush for
the whole range that is going to be freed by
zap_page_range/unmap_vmas/whatever). Christoph merged mm_lock in his
EMM versions of mmu notifiers, moments after I released it, I think he
wouldn't have done it if there was a better way.


Even if you're totally right, with Nick's mmu notifiers, Rusty's mmu
notifiers, my original mmu notifiers, Christoph's first version of my
mmu notifiers, with my new mmu notifiers, with christoph EMM version
of my new mmu notifiers, with my latest mmu notifiers, and all people
making suggestions and testing the code and needing the code badly,
and further patches waiting inclusion during 2.6.27 in this area, it
must be obvious for everyone, that there's zero chance this code won't
evolve over time to perfection, but we can't wait it to be perfect
before start using it or we're screwed. Even if it's entirely broken
this will allow kvm development to continue and then we'll fix it (but
don't worry it works great at runtime and there are no race
conditions, Jack and Robin are also using it with zero problems with
GRU and XPMEM just in case the KVM testing going great isn't enough).

Furthermore the API is freezed for almost months, everyone agrees with
all fundamental blocks in mmu-notifier-core patch (to be complete
Christoph would like to replace invalidate_page with an
invalidate_range_start/end but that's a minor detail).

And most important we need something in now, regardless of which
API. We can handle a change of API totally fine later.

mm_lock() is not even part of the mmu notifier API, it's just an
internal implementation detail, so whatever problem it has, or
whatever better name we can find, isn't an high priority right now.

If you suggest a better name now I'll fix it up immediately. I hope
the mm_lock name and whatever signed-off-by error in patches after
mmu-notifier-core won't be really why this doesn't go in.

Thanks a lot for your time to review even if it wasn't as positive as
I hoped,
Andrea
--
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
[PATCH 00 of 11] mmu notifier #v16, Andrea Arcangeli, (Wed May 7, 7:35 am)
[PATCH 01 of 11] mmu-notifier-core, Andrea Arcangeli, (Wed May 7, 7:35 am)
[PATCH 02 of 11] get_task_mm, Andrea Arcangeli, (Wed May 7, 7:35 am)
[PATCH 03 of 11] invalidate_page outside PT lock, Andrea Arcangeli, (Wed May 7, 7:35 am)
[PATCH 04 of 11] free-pgtables, Andrea Arcangeli, (Wed May 7, 7:35 am)
[PATCH 05 of 11] unmap vmas tlb flushing, Andrea Arcangeli, (Wed May 7, 7:35 am)
[PATCH 06 of 11] rwsem contended, Andrea Arcangeli, (Wed May 7, 7:35 am)
[PATCH 07 of 11] i_mmap_rwsem, Andrea Arcangeli, (Wed May 7, 7:35 am)
[PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 7:35 am)
[PATCH 09 of 11] mm_lock-rwsem, Andrea Arcangeli, (Wed May 7, 7:35 am)
[PATCH 10 of 11] export zap_page_range for XPMEM, Andrea Arcangeli, (Wed May 7, 7:36 am)
[PATCH 11 of 11] mmap sems, Andrea Arcangeli, (Wed May 7, 7:36 am)
Re: [PATCH 02 of 11] get_task_mm, Robin Holt, (Wed May 7, 8:59 am)
Re: [PATCH 02 of 11] get_task_mm, Andrea Arcangeli, (Wed May 7, 9:20 am)
Re: [PATCH 01 of 11] mmu-notifier-core, Rik van Riel, (Wed May 7, 10:35 am)
Re: [PATCH 03 of 11] invalidate_page outside PT lock, Rik van Riel, (Wed May 7, 10:39 am)
Re: [PATCH 04 of 11] free-pgtables, Rik van Riel, (Wed May 7, 10:41 am)
Re: [PATCH 05 of 11] unmap vmas tlb flushing, Rik van Riel, (Wed May 7, 10:46 am)
Re: [PATCH 03 of 11] invalidate_page outside PT lock, Andrea Arcangeli, (Wed May 7, 10:57 am)
Re: [PATCH 01 of 11] mmu-notifier-core, Andrew Morton, (Wed May 7, 1:02 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Andrew Morton, (Wed May 7, 1:05 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Linus Torvalds, (Wed May 7, 1:30 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 1:56 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 2:26 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 2:36 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Andrea Arcangeli, (Wed May 7, 2:58 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Linus Torvalds, (Wed May 7, 3:11 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 3:22 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Andrea Arcangeli, (Wed May 7, 3:27 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrew Morton, (Wed May 7, 3:31 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Andrea Arcangeli, (Wed May 7, 3:37 pm)
Re: [ofa-general] Re: [PATCH 01 of 11] mmu-notifier-core, Andrea Arcangeli, (Wed May 7, 3:39 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Jack Steiner, (Wed May 7, 3:42 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 3:44 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 3:44 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 3:58 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrew Morton, (Wed May 7, 3:59 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Linus Torvalds, (Wed May 7, 4:00 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 4:02 pm)
Re: [ofa-general] Re: [PATCH 01 of 11] mmu-notifier-core, Linus Torvalds, (Wed May 7, 4:03 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 4:09 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 4:19 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Benjamin Herrenschmidt, (Wed May 7, 4:28 pm)
Re: [PATCH 01 of 11] mmu-notifier-core, Linus Torvalds, (Wed May 7, 4:38 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Christoph Lameter, (Wed May 7, 4:39 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 4:39 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 4:45 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 5:03 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Wed May 7, 5:38 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Wed May 7, 5:52 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 5:55 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Christoph Lameter, (Wed May 7, 5:56 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 6:02 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 6:07 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Christoph Lameter, (Wed May 7, 6:12 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 6:26 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 6:32 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 6:34 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 6:39 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 6:52 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 6:57 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 7:24 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 7:32 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 7:56 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Christoph Lameter, (Wed May 7, 8:10 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 8:41 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 7, 9:14 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 10:20 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Pekka Enberg, (Wed May 7, 10:27 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Pekka Enberg, (Wed May 7, 10:30 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Wed May 7, 10:49 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Thu May 8, 8:03 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Thu May 8, 9:11 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Thu May 8, 3:01 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Peter Zijlstra, (Fri May 9, 11:37 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Andrea Arcangeli, (Fri May 9, 11:55 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Peter Zijlstra, (Fri May 9, 12:04 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Tue May 13, 5:06 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Tue May 13, 5:14 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Tue May 13, 8:32 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Tue May 13, 9:11 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Benjamin Herrenschmidt, (Tue May 13, 10:43 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Tue May 13, 11:06 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Wed May 14, 4:26 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Jack Steiner, (Wed May 14, 6:15 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 14, 8:18 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Wed May 14, 9:22 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 14, 9:56 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Christoph Lameter, (Wed May 14, 10:57 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Linus Torvalds, (Wed May 14, 11:27 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Thu May 15, 12:57 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Thu May 15, 4:01 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Avi Kivity, (Thu May 15, 4:12 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Christoph Lameter, (Thu May 15, 10:33 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Thu May 15, 4:52 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Fri May 16, 4:23 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Fri May 16, 4:50 am)
mm notifier: Notifications when pages are unmapped., Christoph Lameter, (Fri May 16, 6:38 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Mon May 19, 10:31 pm)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Tue May 20, 3:01 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Tue May 20, 3:50 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Tue May 20, 4:05 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Nick Piggin, (Tue May 20, 4:14 am)
Re: [PATCH 08 of 11] anon-vma-rwsem, Robin Holt, (Tue May 20, 4:26 am)