Cc: Andrew Morton <akpm@...>, Christoph Lameter <clameter@...>, Robin Holt <holt@...>, Nick Piggin <npiggin@...>, Peter Zijlstra <a.p.zijlstra@...>, <kvm-devel@...>, Kanoj Sarcar <kanojsarcar@...>, Roland Dreier <rdreier@...>, Steve Wise <swise@...>, <linux-kernel@...>, Avi Kivity <avi@...>, <linux-mm@...>, <general@...>, Hugh Dickins <hugh@...>, Rusty Russell <rusty@...>, Anthony Liguori <aliguori@...>, Chris Wright <chrisw@...>, Marcelo Tosatti <marcelo@...>, Eric Dumazet <dada1@...>, Paul E. McKenney <paulmck@...>
On Mon, May 05, 2008 at 02:46:25PM -0500, Jack Steiner wrote:
Yes, this will also happen in case the well behaved task receives
SIGKILL, so you can test it that way too.
GRU TLB flushes aren't skipped because your flag is set but because
__mmu_notifier_release already executed
list_del_init_rcu(&grunotifier->hlist) before proceeding with
unmap_vmas.
As long as nobody can write through the already established gru tlbs
and nobody can establish new tlbs after exit_mmap run you don't
strictly need ->release.
You can remove the flag and ->release and ->clear_flush_young (if you
keep clear_flush_young implemented it should return 0). The
synchronize_rcu after mmu_notifier_register can also be dropped thanks
to mm_lock(). gru_drop_mmu_notifier should be careful with current->mm
if you're using an fd and if the fd can be passed to a different task
through unix sockets (you should probably fail any operation if
current->mm != gru->mm).
The way I use ->release in KVM is to set the root hpa to -1UL
(invalid) as a debug trap. That's only for debugging because even if
tlb entries and sptes are still established on the secondary mmu they
are only relevant when the cpu jumps to guest mode and that can never
happen again after exit_mmap is started.
Well that function needs fixing w.r.t. srcu. Are you sure you want to
search for mn->ops == gru_mmuops and not for mn == gmn? And if you
search for mn why can't you keep track of the mn being registered or
unregistered outside of the mmu_notifier layer? Set a bitflag in the
container after mmu_notifier_register returns and a clear it after
_unregister returns. I doubt saving one bitflag is worth searching the
list and your approach make it obvious that you've to protect the
bitflag and the register/unregister under write-mmap_sem
yourself. Otherwise the find function will return an object that can
be freed at any time if somebody calls unregister and
kfree. (synchronize_srcu in mmu_notifier_unregister won't wait for
anything but some outstanding srcu_read_lock)
--