Actually, so if it's that race, then we might get rid of the oops with
this total hack.
NOTE! If this is the race, then the hack really is just a hack, because it
doesn't really solve anything. We still take the spinlock, and if bad
things has happened, _that_ can still very much fail, and you get the
watchdog lockup message instead. So this doesn't really fix anything.
But if this patch changes behavior, and you no longer see the oops, that
tells us _something_. I'm not sure how useful that "something" is, but it
at least means that there are no _mapped_ pages that have that stale
anon_vma pointer in page->mapping.
Conversely, if you still see the oops (rather than the watchdog), that
means that we actually have pages that are still marked mapped, and that
despite that mapped state have a stale page->mapping pointer. I actually
find that the more likely case, because otherwise the window is _so_ small
that I don't see how you can hit the oops so reliably.
Anyway - probably worth testing, along with the verify_vma() patch. If
nothing else, if there is no new behavior, even that tells us something.
Even if that "something" is not a huge piece of information.
Linus
---
diff --git a/mm/rmap.c b/mm/rmap.c
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -302,7 +302,11 @@ struct anon_vma *page_lock_anon_vma(struct page *page)
anon_vma = (struct anon_vma *) (anon_mapping - PAGE_MAPPING_ANON);
spin_lock(&anon_vma->lock);
- return anon_vma;
+
+ if (page_mapped(page))
+ return anon_vma;
+
+ spin_unlock(&anon_vma->lock);
out:
rcu_read_unlock();
return NULL;
--