Re: [PATCH RFC] kvm: write protect memory after slot swap

Previous thread: linux-next: build failure after merge of the mmc tree by Stephen Rothwell on Sunday, October 24, 2010 - 6:12 pm. (2 messages)

Next thread: [RELEASE] LTTng 0.233 for kernel 2.6.36, lttng-modules 0.19.1 by Mathieu Desnoyers on Sunday, October 24, 2010 - 6:39 pm. (1 message)
From: Michael S. Tsirkin
Date: Sunday, October 24, 2010 - 6:21 pm

I have observed the following bug trigger:

1. userspace calls GET_DIRTY_LOG
2. kvm_mmu_slot_remove_write_access is called and makes a page ro
3. page fault happens and makes the page writeable
   fault is logged in the bitmap appropriately
4. kvm_vm_ioctl_get_dirty_log swaps slot pointers

a lot of time passes

5. guest writes into the page
6. userspace calls GET_DIRTY_LOG

At point (5), bitmap is clean and page is writeable,
thus, guest modification of memory is not logged
and GET_DIRTY_LOG returns an empty bitmap.

The rule is that all pages are either dirty in the current bitmap,
or write-protected, which is violated here.

It seems that just moving kvm_mmu_slot_remove_write_access down
to after the slot pointer swap should fix this bug.

Warning: completely untested.
Please comment.
Note: fix will be needed for -stable etc.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 arch/x86/kvm/x86.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3a09c62..4ca1d7f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2912,10 +2912,6 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
 		struct kvm_memslots *slots, *old_slots;
 		unsigned long *dirty_bitmap;
 
-		spin_lock(&kvm->mmu_lock);
-		kvm_mmu_slot_remove_write_access(kvm, log->slot);
-		spin_unlock(&kvm->mmu_lock);
-
 		r = -ENOMEM;
 		dirty_bitmap = vmalloc(n);
 		if (!dirty_bitmap)
@@ -2937,6 +2933,10 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
 		dirty_bitmap = old_slots->memslots[log->slot].dirty_bitmap;
 		kfree(old_slots);
 
+		spin_lock(&kvm->mmu_lock);
+		kvm_mmu_slot_remove_write_access(kvm, log->slot);
+		spin_unlock(&kvm->mmu_lock);
+
 		r = -EFAULT;
 		if (copy_to_user(log->dirty_bitmap, dirty_bitmap, n)) {
 			vfree(dirty_bitmap);
-- 
1.7.3-rc1
--

From: Takuya Yoshikawa
Date: Monday, October 25, 2010 - 12:27 am

On Mon, 25 Oct 2010 03:21:24 +0200

This may be the reason why my commit is a corruption magnifier.
My patch moved the vmalloc() right after
kvm_mmu_slot_remove_write_access() and made this chance bigger:
because vmalloc() takes some time.

  Thanks,
--

From: Jan Kiszka
Date: Monday, October 25, 2010 - 2:07 am

Cool, seems to be the key to the corruptions I've seen. Applying your

Assuming that a page cannot be write-enabled without having a dirty
entry in the old bitmap and due to the fact that user space won't get
hold of that old bitmap to read out the page before we reset write
access again, your patch should actually be safe.

If no one else sees some remaining race, let's get this applied upstream
ASAP and pushed down to the stable trees.

Thanks,

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
--

From: Michael S. Tsirkin
Date: Monday, October 25, 2010 - 5:05 am

Yes, works for me as well.

-- 
MST
--

From: Takuya Yoshikawa
Date: Monday, October 25, 2010 - 11:38 pm

I did some tests on my laptop:
   - kvm.git + mst's patch
   - qemu.git (upstream qemu)
and still got graphics curruption.


Corruption:
   On usual Desktop environment, I opened two terminals on different
   workspaces. Then as Jan did, I did "find /" loop on both of them.

   During these heavy updates, I tried to switch between these
   workspaces some times. Then, at some turn, some part of old
   workspace's images like terminals and mouse cursor remained
   in the new workspace.

   I could refresh these by moving mouse over the problematic parts.
   But without doing so, the images remained still.


Refresh is not working on virtual machines as I expect?


Thanks,
   Takuya
--

From: Avi Kivity
Date: Monday, October 25, 2010 - 2:32 am

Excellent catch, I stared at this code for a while and didn't see the 
bug.  Patch applied.

-- 
error compiling committee.c: too many arguments to function

--

From: Jan Kiszka
Date: Monday, October 25, 2010 - 4:40 am

BTW, while this was an annoying one for graphic emulation, wasn't it
potentially lethal for live migration?

The issue looks like is was introduced with the switch to SRCU, so every
kernel since 2.6.34 should be affected, correct?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
--

From: Avi Kivity
Date: Monday, October 25, 2010 - 4:50 am

Deadly.  Yes autofs passed it happily.


Yes.

-- 
error compiling committee.c: too many arguments to function

--

From: Avi Kivity
Date: Monday, October 25, 2010 - 4:51 am

Er, autotest.  I'm just having problems my 2.6.36+ autofs setup.

-- 
error compiling committee.c: too many arguments to function

--

From: Jan Kiszka
Date: Wednesday, November 24, 2010 - 12:16 pm

This patch was marked KVM-stable on commit, but it did not make into any
stable branch thus also none of the recent releases. Please fix (for
2.6.36 now).

Thanks,
Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
--

From: Avi Kivity
Date: Thursday, November 25, 2010 - 2:18 am

We need some bot that watches out for commits with the tag going into 
upstream and pokes the maintainers.

-- 
error compiling committee.c: too many arguments to function

--

Previous thread: linux-next: build failure after merge of the mmc tree by Stephen Rothwell on Sunday, October 24, 2010 - 6:12 pm. (2 messages)

Next thread: [RELEASE] LTTng 0.233 for kernel 2.6.36, lttng-modules 0.19.1 by Mathieu Desnoyers on Sunday, October 24, 2010 - 6:39 pm. (1 message)