This patch introduces the kvm_read_guest_page_x86 function
which reads from the physical memory of the guest. If the
guest is running in guest-mode itself with nested paging
enabled it will read from the guest's guest physical memory
instead.
The patch also changes changes the code to use this function
where it is necessary.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
arch/x86/include/asm/kvm_host.h | 7 +++++++
arch/x86/kvm/x86.c | 29 +++++++++++++++++++++++++----
2 files changed, 32 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 7851bbc..d9dfc8c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -254,6 +254,13 @@ struct kvm_mmu {
union kvm_mmu_page_role base_role;
bool direct_map;
+ /*
+ * If true the mmu runs in two-level mode.
+ * vcpu->arch.nested_mmu needs to contain meaningful values in
+ * this case.
+ */
+ bool nested;
+
u64 *pae_root;
u64 rsvd_bits_mask[2][4];
};
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 558d995..317ad26 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -379,6 +379,20 @@ int kvm_read_guest_page_tdp(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
}
EXPORT_SYMBOL_GPL(kvm_read_guest_page_tdp);
+int kvm_read_guest_page_x86(struct kvm_vcpu *vcpu, gfn_t gfn,
+ void *data, int offset, int len, u32 *error)
+{
+ struct kvm_mmu *mmu;
+
+ if (vcpu->arch.mmu.nested)
+ mmu = &vcpu->arch.nested_mmu;
+ else
+ mmu = &vcpu->arch.mmu;
+
+ return kvm_read_guest_page_tdp(vcpu, mmu, gfn, data, offset, len,
+ error);
+}
+
/*
* Load the pae pdptrs. Return true is they are all valid.
*/
@@ -386,12 +400,13 @@ int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3)
{
gfn_t pdpt_gfn = cr3 >> PAGE_SHIFT;
unsigned offset = ((cr3 & (PAGE_SIZE-1)) >> 5) << 2;
- int i;
+ int i, error;
int ret;
u64 pdpte[ARRAY_SIZE(vcpu->arch.pdptrs)];
...This is really not x86 specific (though the implementation certainly is). s390 will have exactly the same need when it gets nested virt. I think this can be folded into kvm_read_guest_page_tdp()/kvm_read_nested_guest_page(). -- error compiling committee.c: too many arguments to function --
Hmm, difficult since both mmu's are active in the npt-npt case. The arch.mmu struct contains mostly the l1 paging state initialized for shadow paging and different set_cr3/get_cr3/inject_page_fault functions. This keeps the changes to the mmu small and optimize for the common case (a nested npt fault). The arch.nested_mmu contains the l2 paging mode and is only used for nested gva_to_gpa translations (thats the reason it is only partially For the generic walk_addr I need a version of that function that takes an mmu_context parameter. Thats the reason I made two functions. The function (or at least its semantic) is useful for !x86 too, thats right. But it currently can't be made generic because the MMU implementation is architecture specific. Do you suggest to give it a more generic name so we can move it later? Joerg --
Okay, I finally read most of mmu.txt :) I agree that the implemented scheme for using arch.mmu and arch.nested_mmu is non-obvious. The most complicated thing is that .gva_to_gpa functions are exchanged between mmu and nested_mmu. This means that nested_mmu.gva_to_gpa basically walks a page table in a mode described by arch.mmu while mmu.gva_to_gpa walks the full two-dimensional page table. But I also don't yet see how a *mmu pointer would help here. From my current understanding of the idea the only place to use it would be the init_kvm_mmu path. But maybe we could also do this to make things more clear: * Rename nested_mmu to l2_mmu and use it to save the current paging mode of the l2 guest * Do not switch the the .gva_to_gpa function pointers between mmu contexts and provide a wrapper function for all callers of arch.mmu.gva_to_gpa which uses the right one for the current context. The arch.nested flag is the indicator for which one to use. Btw. the purpose of the nested-flag is to prevent to reset arch.mmu when the l2 guest changes his paging mode and we could remove this flag with a pointer-solution. Currently its a bit unclear when to use mmu or nested_mmu. With a pointer it would be unclear to the code reader when to use the pointer and when to select the mmu_contexts directly. Joerg --
I think in most cases you'd want full translation, thus the pointer. This should be the default. In specific cases you'd want just the non-nested guest translation. -- error compiling committee.c: too many arguments to function --
Hmm, for most cases == all gva_to_gpa cases. The page fault path can't use the pointer. I'll try out how this works. It shouldn't be too complicated. Joerg --
Okay, the pointer is only used in the gva_to_gpa path which is the minority of mmu context usages. I will introduce a struct kvm_mmu *walk_mmu; and use it in the page walking code. Joerg --
