Re: [kvm-devel] [PATCH] export notifier #1

!MAILaRCHIVE_VOTE_RePLACE
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
To: Robin Holt <holt@...>
Cc: Christoph Lameter <clameter@...>, Avi Kivity <avi@...>, Izik Eidus <izike@...>, Andrew Morton <akpm@...>, Nick Piggin <npiggin@...>, <kvm-devel@...>, Benjamin Herrenschmidt <benh@...>, <steiner@...>, <linux-kernel@...>, <linux-mm@...>, <daniel.blueman@...>, Hugh Dickins <hugh@...>
Date: Wednesday, January 23, 2008 - 1:33 pm

On Wed, Jan 23, 2008 at 06:32:30AM -0600, Robin Holt wrote:

It wasn't immediately obvious that this was an addition but really I
don't mind either ways as long as a feature workable for KVM is
included ;).


You want to be able to tell the mmu_notifier that you want the flush
repeated without locks later? Sorry but then if you're always going to
set the bitflag unconditionally, why don't you simply implement a
second notifier in addition of my current ->invalidate_page (like
->after_invalidate_page).

We can then implement a method in rmap.c for you to call to do the
final freeing of the page (pagecache/swapcache won't be collected
unless it's a truncate, as long as you keep it pinned and you
certainly don't want to wait a second round of lru scan before freeing
the page after you release the external reference, so you may need to
call this method before returning from the
->after_invalidate_page). Infact I can call that method for you in the
notifier implementation itself after all ->after_invalidate_pages have
been called. (of course only if at least one of them was implemented
and not-null)


mmu notifiers should already cover all pte clearing cases (export
notifiers definitely don't instead!). It's not as important as for
quadrics, we're mostly interested in rmap.c swapping and do_wp_page
for KVM page sharing. But in the longer term protection changes and
other things happening in the main MMU can also be tracked through mmu
notifiers (something quadrics will likely need). Tracking the
asm-generic/pgtable.h is all about trivially tracking all places
without cluttering mm/*.c, so the mm/*.c remains more hackable and
more readable.


Well as long as you send these messages somewhat serially and you
don't pretend to allocate all packets at once it should be ok. Perhaps
you should preallocate all packets statically and serialize the access
to the pool with a lock.

What I'd like to stress to be sure it's crystal clear, is that in the
mm/rmap.c path GFP_KERNEL==GFP_ATOMIC, infact both are = PF_MEMALLOC =
TIF_MEMDIE = if mempool is empty it will crash. The argument that you
need to sleep to allocate memory with GFP_KERNEL is totally bogus. If
that's the only reason, you don't need to sleep at all. alloc_pages
will not invoke the VM when called inside the VM, it will grab ram
from PF_MEMALLOC instead. At most it will schedule so the only benefit
would be lower -rt latency in the end.


As long as you keep a reference on the page too, you don't risk
any corruption by flushing after.


The window that you must close with that bitflag is the request coming
from the remote node to map the page after the linux pte has been
cleared. If you map the page in a remote node after the linux pte has
been cleared ->invalidate_page won't be called again because the page
will look unmapped in the linux VM. Now invalidate_page will clear the
bitflag, so the map requests will block. But where exactly you know
that the linux pte has been cleared so you can "unblock" the map
requests? If a page is not mapped by some linux pte, mm/rmap.c will
never be called and this is why any notification in mm/rmap.c should
track the "address space" and not the "physical page".

In effect you don't care less about the address space of the task in
the master node, so IMHO you're hooking your ->invalidate_page(page)
(instead of my ->invalidate_page(mm, address)) in the very wrong
place. You should hook it in mm/vmscan.c shrink-list so it will be
invoked regardless if the pte is mapped or not. Then your model that
passes the "page" instead of an "mm+address" will start to work
without any need clearing/setting PG_exported bifflags, and you will
automatically close the above race window without the need of ever
clearing the PG_exported bitflag etc.... So again the current design
of Christoph's patch really looks flawed to me.

If you work the "pages" you should stick to pages and to stay away
from mm/rmap.c and ignore whatever is mapped in the master address
space of the task. mm/rmap.c only deals with ptes/sptes and other
_virtual-tracked_ mappings.

If you work with get_user_pages for page allocation (instead of
alloc_pages) and the userland "virtual" addresses are your RAM backing
store, like with KVM, then you should build an rmap structure based on
_virtual_ addresses because the virtual addresses of the task will be
all you care about, and then you will register the notifier in the
"mm" and you will need ->invalidate_page to take an "address" as
parameter and not a "page". All reference counting will be automatic,
the userland task will run with all virtual memory visible
automatically. You should make sure this model tied to the ram mapped
in the userland task can't fit your needs and you really have to care
about building stuff on top of physical pages instead of letting the
VM decide which physical page or swap entry is going back your memory needs.


Ok. This wasn't my impression but again I'm fine either ways! ;)
--
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
[PATCH] mmu notifiers #v2, Andrea Arcangeli, (Sun Jan 13, 12:24 pm)
Re: [PATCH] mmu notifiers #v2, Rik van Riel, (Wed Jan 16, 1:42 pm)
Re: [PATCH] mmu notifiers #v2, Izik Eidus, (Wed Jan 16, 1:48 pm)
Re: [PATCH] mmu notifiers #v2, Andrea Arcangeli, (Thu Jan 17, 12:23 pm)
Re: [PATCH] mmu notifiers #v2, Izik Eidus, (Thu Jan 17, 2:21 pm)
Re: [PATCH] mmu notifiers #v2, Andrea Arcangeli, (Thu Jan 17, 3:32 pm)
[PATCH] mmu notifiers #v3, Andrea Arcangeli, (Mon Jan 21, 8:52 am)
Re: [PATCH] mmu notifiers #v3, Peter Zijlstra, (Tue Jan 22, 3:28 pm)
Re: [PATCH] mmu notifiers #v3, Andrea Arcangeli, (Tue Jan 22, 4:31 pm)
Re: [PATCH] mmu notifiers #v3, Hugh Dickins, (Tue Jan 22, 6:10 pm)
Re: [PATCH] mmu notifiers #v3, Christoph Lameter, (Tue Jan 22, 4:31 pm)
Re: [kvm-devel] [PATCH] mmu notifiers #v3, Avi Kivity, (Tue Jan 22, 10:12 am)
Re: [kvm-devel] [PATCH] mmu notifiers #v3, Andrea Arcangeli, (Tue Jan 22, 10:43 am)
Re: [kvm-devel] [PATCH] mmu notifiers #v4, Andrea Arcangeli, (Tue Jan 22, 4:08 pm)
Re: [kvm-devel] [PATCH] export notifier #1, Christoph Lameter, (Tue Jan 22, 4:34 pm)
Re: [kvm-devel] [PATCH] export notifier #1, Gerd Hoffmann, (Wed Jan 23, 8:51 am)
Re: [kvm-devel] [PATCH] export notifier #1, Andrea Arcangeli, (Wed Jan 23, 11:41 am)
Re: [kvm-devel] [PATCH] export notifier #1, Christoph Lameter, (Wed Jan 23, 4:40 pm)
Re: [kvm-devel] [PATCH] export notifier #1, Gerd Hoffmann, (Wed Jan 23, 1:47 pm)
Re: [kvm-devel] [PATCH] export notifier #1, Jeremy Fitzhardinge, (Thu Jan 24, 2:45 am)
Re: [kvm-devel] [PATCH] export notifier #1, Avi Kivity, (Thu Jan 24, 2:01 am)
Re: [kvm-devel] [PATCH] export notifier #1, Robin Holt, (Wed Jan 23, 9:19 am)
Re: [kvm-devel] [PATCH] export notifier #1, Avi Kivity, (Wed Jan 23, 10:17 am)
Re: [kvm-devel] [PATCH] export notifier #1, Benjamin Herrenschmidt, (Thu Jan 24, 12:03 am)
Re: [kvm-devel] [PATCH] export notifier #1, Gerd Hoffmann, (Wed Jan 23, 10:12 am)
Re: [kvm-devel] [PATCH] export notifier #1, Robin Holt, (Wed Jan 23, 10:18 am)
Re: [kvm-devel] [PATCH] export notifier #1, Gerd Hoffmann, (Wed Jan 23, 10:35 am)
Re: [kvm-devel] [PATCH] export notifier #1, Robin Holt, (Wed Jan 23, 11:48 am)
Re: [kvm-devel] [PATCH] export notifier #1, Benjamin Herrenschmidt, (Tue Jan 22, 7:36 pm)
Re: [kvm-devel] [PATCH] export notifier #1, Christoph Lameter, (Tue Jan 22, 8:40 pm)
Re: [kvm-devel] [PATCH] export notifier #1, Robin Holt, (Tue Jan 22, 9:21 pm)
Re: [kvm-devel] [PATCH] export notifier #1, Andrea Arcangeli, (Tue Jan 22, 6:31 pm)
Re: [kvm-devel] [PATCH] export notifier #1, Christoph Lameter, (Tue Jan 22, 6:53 pm)
Re: [kvm-devel] [PATCH] export notifier #1, Andrea Arcangeli, (Wed Jan 23, 7:41 am)
Re: [kvm-devel] [PATCH] export notifier #1, Christoph Lameter, (Wed Jan 23, 4:18 pm)
Re: [kvm-devel] [PATCH] export notifier #1, Andrea Arcangeli, (Thu Jan 24, 10:34 am)
Re: [kvm-devel] [PATCH] export notifier #1, Christoph Lameter, (Thu Jan 24, 4:01 pm)
Re: [kvm-devel] [PATCH] export notifier #1, Avi Kivity, (Thu Jan 24, 11:15 am)
Re: [kvm-devel] [PATCH] export notifier #1, Avi Kivity, (Thu Jan 24, 11:18 am)
Re: [kvm-devel] [PATCH] export notifier #1, Andrea Arcangeli, (Thu Jan 24, 10:41 am)
Re: [kvm-devel] [PATCH] export notifier #1, Robin Holt, (Wed Jan 23, 8:32 am)
Re: [kvm-devel] [PATCH] export notifier #1, Andrea Arcangeli, (Wed Jan 23, 1:33 pm)
Re: [kvm-devel] [PATCH] export notifier #1, Christoph Lameter, (Wed Jan 23, 4:27 pm)
Re: [kvm-devel] [PATCH] export notifier #1, Andrea Arcangeli, (Thu Jan 24, 11:42 am)
Re: [kvm-devel] [PATCH] export notifier #1, Christoph Lameter, (Thu Jan 24, 4:07 pm)
Re: [kvm-devel] [PATCH] export notifier #1, Avi Kivity, (Fri Jan 25, 2:35 am)
Re: [kvm-devel] [PATCH] export notifier #1, Avi Kivity, (Wed Jan 23, 6:27 am)
Re: [kvm-devel] [PATCH] export notifier #1, Robin Holt, (Wed Jan 23, 6:52 am)
Re: [kvm-devel] [PATCH] export notifier #1, Christoph Lameter, (Wed Jan 23, 3:47 pm)
Re: [kvm-devel] [PATCH] export notifier #1, Avi Kivity, (Thu Jan 24, 1:56 am)
Re: [kvm-devel] [PATCH] export notifier #1, Andrea Arcangeli, (Thu Jan 24, 8:26 am)
Re: [kvm-devel] [PATCH] export notifier #1, Avi Kivity, (Thu Jan 24, 8:34 am)
Re: [kvm-devel] [PATCH] export notifier #1, Andrea Arcangeli, (Wed Jan 23, 8:04 am)
Re: [kvm-devel] [PATCH] export notifier #1, Christoph Lameter, (Wed Jan 23, 3:48 pm)
Re: [kvm-devel] [PATCH] export notifier #1, Robin Holt, (Wed Jan 23, 3:58 pm)
Re: [kvm-devel] [PATCH] export notifier #1, Robin Holt, (Wed Jan 23, 8:34 am)
Re: [PATCH] mmu notifiers #v3, Rik van Riel, (Mon Jan 21, 10:21 pm)
Re: [PATCH] mmu notifiers #v2, Brice Goglin, (Wed Jan 16, 5:01 am)
Re: [PATCH] mmu notifiers #v2, Andrea Arcangeli, (Wed Jan 16, 6:19 am)
Re: [PATCH] mmu notifiers #v2, Christoph Lameter, (Mon Jan 14, 4:02 pm)
Re: [PATCH] mmu notifiers #v2, Andrea Arcangeli, (Tue Jan 15, 8:44 am)
Re: [PATCH] mmu notifiers #v2, Benjamin Herrenschmidt, (Tue Jan 15, 4:18 pm)
Re: [PATCH] mmu notifiers #v2, Andrea Arcangeli, (Tue Jan 15, 9:06 pm)
Re: [PATCH] mmu notifiers #v2, Benjamin Herrenschmidt, (Tue Jan 15, 12:28 am)
Re: [PATCH] mmu notifiers #v2, Benjamin Herrenschmidt, (Sun Jan 13, 5:11 pm)