Re: [PATCH 2/2] KVM, Fix QEMU-KVM is killed by guest SRAO MCE

Previous thread: [PATCH 1/2] mm, Make __get_user_pages return -EHWPOISON for HWPOISON page by Huang Ying on Tuesday, April 27, 2010 - 12:04 am. (4 messages)

Next thread: [PATCH 1/6] kconfig: print symbol type in help text by Li Zefan on Tuesday, April 27, 2010 - 12:48 am. (12 messages)
From: Huang Ying
Date: Tuesday, April 27, 2010 - 12:04 am

In common cases, guest SRAO MCE will cause corresponding poisoned page
be un-mapped and SIGBUS be sent to QEMU-KVM, then QEMU-KVM will relay
the MCE to guest OS.

But it is reported that if the poisoned page is accessed in guest
after un-mapped and before MCE is relayed to guest OS, QEMU-KVM will
be killed.

The reason is as follow. Because poisoned page has been un-mapped,
guest access will cause guest exit and kvm_mmu_page_fault will be
called. kvm_mmu_page_fault can not get the poisoned page for fault
address, so kernel and user space MMIO processing is tried in turn. In
user MMIO processing, poisoned page is accessed again, then QEMU-KVM
is killed by force_sig_info.

To fix the bug, kvm_mmu_page_fault send HWPOISON signal to QEMU-KVM
and do not try kernel and user space MMIO processing for poisoned
page.

Reported-by: Max Asbock <masbock@linux.vnet.ibm.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
---
 arch/x86/kvm/mmu.c         |   24 ++++++++++++++++++++++--
 arch/x86/kvm/paging_tmpl.h |   10 ++++++++--
 include/linux/kvm_host.h   |    1 +
 virt/kvm/kvm_main.c        |   30 ++++++++++++++++++++++++++++--
 4 files changed, 59 insertions(+), 6 deletions(-)

--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -32,6 +32,7 @@
 #include <linux/compiler.h>
 #include <linux/srcu.h>
 #include <linux/slab.h>
+#include <linux/uaccess.h>
 
 #include <asm/page.h>
 #include <asm/cmpxchg.h>
@@ -1972,6 +1973,17 @@ static int __direct_map(struct kvm_vcpu
 	return pt_write;
 }
 
+static void kvm_send_hwpoison_signal(struct kvm *kvm, gfn_t gfn)
+{
+	char buf[1];
+	void __user *hva;
+	int r;
+
+	/* Touch the page, so send SIGBUS */
+	hva = (void __user *)gfn_to_hva(kvm, gfn);
+	r = copy_from_user(buf, hva, 1);
+}
+
 static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, int write, gfn_t gfn)
 {
 	int r;
@@ -1997,7 +2009,11 @@ static int nonpaging_map(struct kvm_vcpu
 	/* mmio */
 	if (is_error_pfn(pfn)) {
 		kvm_release_pfn_clean(pfn);
-		return ...
From: Avi Kivity
Date: Tuesday, April 27, 2010 - 12:47 am

(please copy kvm@vger.kernel.org on kvm patches)



No error check?  What will a copy_from_user() of poisoned page expected 
to return?


This is duplicated several times.  Please introduce a kvm_handle_bad_page():

     if (is_error_pfn(pfn))
         return kvm_handle_bad_page(vcpu->kvm, gfn, pfn);

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

--

From: Huang Ying
Date: Tuesday, April 27, 2010 - 2:25 am

Just want to use the side effect of copy_from_user, SIGBUS will be sent
to current process because the page touched is marked as poisoned. That

OK. Will do that.

Best Regards,
Huang Ying


--

From: Avi Kivity
Date: Tuesday, April 27, 2010 - 2:30 am

What if the failure doesn't happen?  Say, someone mmap()ed over the page.

btw, better to use (void)copy_from_user(...) instead to avoid the 
initialized but not used warning the compiler may generate.


-- 
error compiling committee.c: too many arguments to function

--

From: Huang Ying
Date: Tuesday, April 27, 2010 - 7:56 pm

Sorry, not get your idea clearly. hva is re-mmap()ed? We just read the

OK. Will do that.

Best Regards,
Huang Ying


--

From: Avi Kivity
Date: Wednesday, April 28, 2010 - 2:47 am

We don't generate a signal in this case.  Does the code continue to work 
correctly (not sure what correctly is in this case... should probably 
just continue).

There's also the possibility of -EFAULT.

-- 
error compiling committee.c: too many arguments to function

--

From: Huang Ying
Date: Wednesday, April 28, 2010 - 6:31 pm

I think signal should be generated for copy_from_user, because the hva
is poisoned now. The signal will not generated only if the hva is
re-mmap()ped to some other physical page, but this should be impossible
unless we have memory hotadd/hotremove in KVM.

If the signal is not generated, lost or overwritten, guest will
continue, and if the hva is still poisoned, the page fault will be
triggered again; if the hva is not poisoned, there will be no further
page fault.

Best Regards,
Huang Ying



--

Previous thread: [PATCH 1/2] mm, Make __get_user_pages return -EHWPOISON for HWPOISON page by Huang Ying on Tuesday, April 27, 2010 - 12:04 am. (4 messages)

Next thread: [PATCH 1/6] kconfig: print symbol type in help text by Li Zefan on Tuesday, April 27, 2010 - 12:48 am. (12 messages)