Hi, sorry for sending from my personal account.
The following series are all from me:
From: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
The 3rd version of "moving dirty bitmaps to user space".
From this version, we add x86 and ppc and asm-generic people to CC lists.
[To KVM people]
Sorry for being late to reply your comments.
Avi,
- I've wrote an answer to your question in patch 5/12: drivers/vhost/vhost.c .
- I've considered to change the set_bit_user_non_atomic to an inline function,
but did not change because the other helpers in the uaccess.h are written as
macros. Anyway, I hope that x86 people will give us appropriate suggestions
about this.
- I thought that documenting about making bitmaps 64-bit aligned will be
written when we add an API to register user-allocated bitmaps. So probably
in the next series.
Avi, Alex,
- Could you check the ia64 and ppc parts, please? I tried to keep the logical
changes as small as possible.
I personally tried to build these with cross compilers. For ia64, I could check
build success with my patch series. But book3s, even without my patch series,
it failed with the following errors:
arch/powerpc/kvm/book3s_paired_singles.c: In function 'kvmppc_emulate_paired_single':
arch/powerpc/kvm/book3s_paired_singles.c:1289: error: the frame size of 2288 bytes is larger than 2048 bytes
make[1]: *** [arch/powerpc/kvm/book3s_paired_singles.o] Error 1
make: *** [arch/powerpc/kvm] Error 2
About changelog: there are two main changes from the 2nd version:
1. I changed the treatment of clean slots (see patch 1/12).
This was already applied today, thanks!
2. I changed the switch API. (see patch 11/12).
To show this API's advantage, I also did a test (see the end of this mail).
[To x86 people]
Hi, Thomas, Ingo, Peter,
Please review the patches 4,5/12. Because this is the first experience for
me to send patches to x86, please tell me if this lacks anything.
[To ...Although we always allocate a new dirty bitmap in x86's get_dirty_log(),
it is only used as a zero-source of copy_to_user() and freed right after
that when memslot is clean. This patch uses clear_user() instead of doing
this unnecessary zero-source allocation.
Performance improvement: as we can expect easily, the time needed to
allocate a bitmap is completely reduced. Furthermore, we can avoid the
tlb flush triggered by vmalloc() and get some good effects. In my test,
the improved ioctl was about 4 to 10 times faster than the original one
for clean slots.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
---
arch/x86/kvm/x86.c | 37 +++++++++++++++++++++++--------------
1 files changed, 23 insertions(+), 14 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6b2ce1d..b95a211 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2744,7 +2744,6 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
struct kvm_memory_slot *memslot;
unsigned long n;
unsigned long is_dirty = 0;
- unsigned long *dirty_bitmap = NULL;
mutex_lock(&kvm->slots_lock);
@@ -2759,27 +2758,30 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
n = kvm_dirty_bitmap_bytes(memslot);
- r = -ENOMEM;
- dirty_bitmap = vmalloc(n);
- if (!dirty_bitmap)
- goto out;
- memset(dirty_bitmap, 0, n);
-
for (i = 0; !is_dirty && i < n/sizeof(long); i++)
is_dirty = memslot->dirty_bitmap[i];
/* If nothing is dirty, don't bother messing with page tables. */
if (is_dirty) {
struct kvm_memslots *slots, *old_slots;
+ unsigned long *dirty_bitmap;
spin_lock(&kvm->mmu_lock);
kvm_mmu_slot_remove_write_access(kvm, log->slot);
spin_unlock(&kvm->mmu_lock);
- slots = kzalloc(sizeof(struct kvm_memslots), GFP_KERNEL);
- if (!slots)
- goto out_free;
+ r = -ENOMEM;
+ dirty_bitmap = vmalloc(n);
+ if (!dirty_bitmap)
+ goto out;
+ memset(dirty_bitmap, 0, n);
+ r = -ENOMEM;
+ slots = kzalloc(sizeof(struct kvm_memslots), ...We will change the vmalloc() and vfree() to do_mmap() and do_munmap() later.
This patch makes it easy and cleanup the code.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
---
virt/kvm/kvm_main.c | 27 ++++++++++++++++++++-------
1 files changed, 20 insertions(+), 7 deletions(-)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 7ab6312..3e3acad 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -435,6 +435,12 @@ out_err_nodisable:
return ERR_PTR(r);
}
+static void kvm_destroy_dirty_bitmap(struct kvm_memory_slot *memslot)
+{
+ vfree(memslot->dirty_bitmap);
+ memslot->dirty_bitmap = NULL;
+}
+
/*
* Free any memory in @free but not in @dont.
*/
@@ -447,7 +453,7 @@ static void kvm_free_physmem_slot(struct kvm_memory_slot *free,
vfree(free->rmap);
if (!dont || free->dirty_bitmap != dont->dirty_bitmap)
- vfree(free->dirty_bitmap);
+ kvm_destroy_dirty_bitmap(free);
for (i = 0; i < KVM_NR_PAGE_SIZES - 1; ++i) {
@@ -458,7 +464,6 @@ static void kvm_free_physmem_slot(struct kvm_memory_slot *free,
}
free->npages = 0;
- free->dirty_bitmap = NULL;
free->rmap = NULL;
}
@@ -520,6 +525,18 @@ static int kvm_vm_release(struct inode *inode, struct file *filp)
return 0;
}
+static int kvm_create_dirty_bitmap(struct kvm_memory_slot *memslot)
+{
+ unsigned long dirty_bytes = kvm_dirty_bitmap_bytes(memslot);
+
+ memslot->dirty_bitmap = vmalloc(dirty_bytes);
+ if (!memslot->dirty_bitmap)
+ return -ENOMEM;
+
+ memset(memslot->dirty_bitmap, 0, dirty_bytes);
+ return 0;
+}
+
/*
* Allocate some memory and give it an address in the guest physical address
* space.
@@ -653,12 +670,8 @@ skip_lpage:
/* Allocate page dirty bitmap if needed */
if ((new.flags & KVM_MEM_LOG_DIRTY_PAGES) && !new.dirty_bitmap) {
- unsigned long dirty_bytes = kvm_dirty_bitmap_bytes(&new);
-
- new.dirty_bitmap = ...During the work of KVM's dirty page logging optimization, we encountered the need of copy_in_user() for 32-bit x86 and ppc: these will be used for manipulating dirty bitmaps in user space. So we implement copy_in_user() for 32-bit with existing generic copy user helpers. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> CC: Avi Kivity <avi@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> CC: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> --- arch/x86/include/asm/uaccess_32.h | 2 ++ arch/x86/lib/usercopy_32.c | 26 ++++++++++++++++++++++++++ 2 files changed, 28 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/uaccess_32.h b/arch/x86/include/asm/uaccess_32.h index 088d09f..85d396d 100644 --- a/arch/x86/include/asm/uaccess_32.h +++ b/arch/x86/include/asm/uaccess_32.h @@ -21,6 +21,8 @@ unsigned long __must_check __copy_from_user_ll_nocache (void *to, const void __user *from, unsigned long n); unsigned long __must_check __copy_from_user_ll_nocache_nozero (void *to, const void __user *from, unsigned long n); +unsigned long __must_check copy_in_user + (void __user *to, const void __user *from, unsigned n); /** * __copy_to_user_inatomic: - Copy a block of data into user space, with less checking. diff --git a/arch/x86/lib/usercopy_32.c b/arch/x86/lib/usercopy_32.c index e218d5d..e90ffc3 100644 --- a/arch/x86/lib/usercopy_32.c +++ b/arch/x86/lib/usercopy_32.c @@ -889,3 +889,29 @@ void copy_from_user_overflow(void) WARN(1, "Buffer overflow detected!\n"); } EXPORT_SYMBOL(copy_from_user_overflow); + +/** + * copy_in_user: - Copy a block of data from user space to user space. + * @to: Destination address, in user space. + * @from: Source address, in user space. + * @n: Number of bytes to copy. + * + * Context: User context only. This function may sleep. + * + * Copy data from user space to user space. + * + * ...
During the work of KVM's dirty page logging optimization, we encountered
the need of manipulating bitmaps in user space efficiantly. To achive this,
we introduce a uaccess function for setting a bit in user space following
Avi's suggestion.
KVM is now using dirty bitmaps for live-migration and VGA. Although we need
to update them from kernel side, copying them every time for updating the
dirty log is a big bottleneck. Especially, we tested that zero-copy bitmap
manipulation improves responses of GUI manipulations a lot.
We also found one similar need in drivers/vhost/vhost.c in which the author
implemented set_bit_to_user() locally using inefficient functions: see TODO
at the top of that.
Probably, this kind of need would be common for virtualization area.
So we introduce a macro set_bit_user_non_atomic() following the implementation
style of x86's uaccess functions.
Note: there is a one restriction to this macro: bitmaps must be 64-bit
aligned (see the comment in this patch).
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
CC: Avi Kivity <avi@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
---
arch/x86/include/asm/uaccess.h | 39 +++++++++++++++++++++++++++++++++++++++
1 files changed, 39 insertions(+), 0 deletions(-)
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index abd3e0e..3138e65 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -98,6 +98,45 @@ struct exception_table_entry {
extern int fixup_exception(struct pt_regs *regs);
+/**
+ * set_bit_user_non_atomic: - set a bit of a bitmap in user space.
+ * @nr: Bit offset.
+ * @addr: Base address of a bitmap in user space.
+ *
+ * Context: User context only. This function may sleep.
+ *
+ * This macro set a bit of a bitmap in user space.
+ *
+ * Restriction: the ...During the work of KVM's dirty page logging optimization, we encountered
the need of copy_in_user() for 32-bit ppc and x86: these will be used for
manipulating dirty bitmaps in user space.
So we implement copy_in_user() for 32-bit with __copy_tofrom_user().
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
CC: Alexander Graf <agraf@suse.de>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
---
arch/powerpc/include/asm/uaccess.h | 17 +++++++++++++++++
1 files changed, 17 insertions(+), 0 deletions(-)
diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index bd0fb84..3a01ce8 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -359,6 +359,23 @@ static inline unsigned long copy_to_user(void __user *to,
return n;
}
+static inline unsigned long copy_in_user(void __user *to,
+ const void __user *from, unsigned long n)
+{
+ unsigned long over;
+
+ if (likely(access_ok(VERIFY_READ, from, n) &&
+ access_ok(VERIFY_WRITE, to, n)))
+ return __copy_tofrom_user(to, from, n);
+ if (((unsigned long)from < TASK_SIZE) ||
+ ((unsigned long)to < TASK_SIZE)) {
+ over = max((unsigned long)from, (unsigned long)to)
+ + n - TASK_SIZE;
+ return __copy_tofrom_user(to, from, n - over) + over;
+ }
+ return n;
+}
+
#else /* __powerpc64__ */
#define __copy_in_user(to, from, size) \
--
1.7.0.4
--
During the work of KVM's dirty page logging optimization, we encountered
the need of manipulating bitmaps in user space efficiantly. To achive this,
we introduce a uaccess function for setting a bit in user space following
Avi's suggestion.
KVM is now using dirty bitmaps for live-migration and VGA. Although we need
to update them from kernel side, copying them every time for updating the
dirty log is a big bottleneck. Especially, we tested that zero-copy bitmap
manipulation improves responses of GUI manipulations a lot.
We also found one similar need in drivers/vhost/vhost.c in which the author
implemented set_bit_to_user() locally using inefficient functions: see TODO
at the top of that.
Probably, this kind of need would be common for virtualization area.
So we introduce a function set_bit_user_non_atomic().
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
CC: Alexander Graf <agraf@suse.de>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
---
arch/powerpc/include/asm/uaccess.h | 19 +++++++++++++++++++
1 files changed, 19 insertions(+), 0 deletions(-)
diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 3a01ce8..f878326 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -321,6 +321,25 @@ do { \
__gu_err; \
})
+static inline int set_bit_user_non_atomic(int nr, void __user *addr)
+{
+ u8 __user *p;
+ u8 val;
+
+ p = (u8 __user *)((unsigned long)addr + nr / BITS_PER_BYTE);
+ if (!access_ok(VERIFY_WRITE, p, 1))
+ return -EFAULT;
+
+ if (__get_user(val, p))
+ return -EFAULT;
+
+ val |= 1U << (nr % BITS_PER_BYTE);
+ if (__put_user(val, p))
+ return -EFAULT;
+
+ return 0;
+}
+
/* more complex routines */
--
1.7.0.4
--
We move dirty bitmaps to user space.
- Allocation and destruction: we use do_mmap() and do_munmap().
The new bitmap space is twice longer than the original one and we
use the additional space for double buffering: this makes it
possible to update the active bitmap while letting the user space
read the other one safely. For x86, we can also remove the vmalloc()
in kvm_vm_ioctl_get_dirty_log().
- Bitmap manipulations: we replace all functions which access dirty
bitmaps with *_user() functions.
- For ia64: moving the dirty bitmaps of memory slots does not affect
ia64 much because it's using a different place to store dirty logs
rather than the dirty bitmaps of memory slots: all we have to change
are sync and get of dirty log, so we don't need set_bit_user like
functions for ia64.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
CC: Avi Kivity <avi@redhat.com>
CC: Alexander Graf <agraf@suse.de>
---
arch/ia64/kvm/kvm-ia64.c | 15 +++++++++-
arch/powerpc/kvm/book3s.c | 5 +++-
arch/x86/kvm/x86.c | 25 ++++++++----------
include/linux/kvm_host.h | 3 +-
virt/kvm/kvm_main.c | 62 +++++++++++++++++++++++++++++++++++++-------
5 files changed, 82 insertions(+), 28 deletions(-)
diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
index 17fd65c..03503e6 100644
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -1823,11 +1823,19 @@ static int kvm_ia64_sync_dirty_log(struct kvm *kvm,
n = kvm_dirty_bitmap_bytes(memslot);
base = memslot->base_gfn / BITS_PER_LONG;
+ r = -EFAULT;
+ if (!access_ok(VERIFY_WRITE, memslot->dirty_bitmap, n))
+ goto out;
+
for (i = 0; i < n/sizeof(long); ++i) {
if (dirty_bitmap[base + i])
memslot->is_dirty = true;
- memslot->dirty_bitmap[i] = dirty_bitmap[base + i];
+ if (__put_user(dirty_bitmap[base + i],
+ &memslot->dirty_bitmap[i])) {
+ r = ...Oh, it's a serious problem. I have to consider it. Thanks, Takuya --
Now that dirty bitmaps are accessible from user space, we export the addresses of these to achieve zero-copy dirty log check: KVM_GET_USER_DIRTY_LOG_ADDR We also need an API for triggering dirty bitmap switch to take the full advantage of double buffered bitmaps. KVM_SWITCH_DIRTY_LOG See the documentation in this patch for precise explanations. About performance improvement: the most important feature of switch API is the lightness. In our test, this appeared in the form of improved responses for GUI manipulations. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> CC: Avi Kivity <avi@redhat.com> CC: Alexander Graf <agraf@suse.de> --- Documentation/kvm/api.txt | 87 +++++++++++++++++++++++++++++++++++++++++++++ arch/ia64/kvm/kvm-ia64.c | 27 +++++++++----- arch/powerpc/kvm/book3s.c | 18 +++++++-- arch/x86/kvm/x86.c | 44 ++++++++++++++++------- include/linux/kvm.h | 11 ++++++ include/linux/kvm_host.h | 4 ++- virt/kvm/kvm_main.c | 63 +++++++++++++++++++++++++++++---- 7 files changed, 220 insertions(+), 34 deletions(-) diff --git a/Documentation/kvm/api.txt b/Documentation/kvm/api.txt index a237518..c106c83 100644 --- a/Documentation/kvm/api.txt +++ b/Documentation/kvm/api.txt @@ -892,6 +892,93 @@ arguments. This ioctl is only useful after KVM_CREATE_IRQCHIP. Without an in-kernel irqchip, the multiprocessing state must be maintained by userspace. +4.39 KVM_GET_USER_DIRTY_LOG_ADDR + +Capability: KVM_CAP_USER_DIRTY_LOG (>=1 see below) +Architectures: all +Type: vm ioctl +Parameters: struct kvm_user_dirty_log (in/out) +Returns: 0 on success, -1 on error + +This ioctl makes it possible to use KVM_SWITCH_DIRTY_LOG (see 4.40) instead +of the old dirty log manipulation by KVM_GET_DIRTY_LOG. + +A bit about KVM_CAP_USER_DIRTY_LOG + +Before this ioctl was introduced, dirty bitmaps for dirty page logging were +allocated in the kernel's ...
If you introduce a switch ioctl that frees the bitmap vmalloc'ed by the current set_memory_region (if its not freed already), after pointing the memslot to the user supplied one, it should be fine? --
We use new API for light dirty log access if KVM supports it.
This conflicts with Marcelo's patches. So please take this as a sample patch.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
---
kvm/include/linux/kvm.h | 11 ++++++
qemu-kvm.c | 81 ++++++++++++++++++++++++++++++++++++++++++-----
qemu-kvm.h | 1 +
3 files changed, 85 insertions(+), 8 deletions(-)
diff --git a/kvm/include/linux/kvm.h b/kvm/include/linux/kvm.h
index 6485981..efd9538 100644
--- a/kvm/include/linux/kvm.h
+++ b/kvm/include/linux/kvm.h
@@ -317,6 +317,14 @@ struct kvm_dirty_log {
};
};
+/* for KVM_GET_USER_DIRTY_LOG_ADDR */
+struct kvm_user_dirty_log {
+ __u32 slot;
+ __u32 flags;
+ __u64 dirty_bitmap;
+ __u64 dirty_bitmap_old;
+};
+
/* for KVM_SET_SIGNAL_MASK */
struct kvm_signal_mask {
__u32 len;
@@ -499,6 +507,7 @@ struct kvm_ioeventfd {
#define KVM_CAP_PPC_SEGSTATE 43
#define KVM_CAP_PCI_SEGMENT 47
+#define KVM_CAP_USER_DIRTY_LOG 55
#ifdef KVM_CAP_IRQ_ROUTING
@@ -595,6 +604,8 @@ struct kvm_clock_data {
struct kvm_userspace_memory_region)
#define KVM_SET_TSS_ADDR _IO(KVMIO, 0x47)
#define KVM_SET_IDENTITY_MAP_ADDR _IOW(KVMIO, 0x48, __u64)
+#define KVM_GET_USER_DIRTY_LOG_ADDR _IOW(KVMIO, 0x49, struct kvm_user_dirty_log)
+#define KVM_SWITCH_DIRTY_LOG _IO(KVMIO, 0x4a)
/* Device model IOC */
#define KVM_CREATE_IRQCHIP _IO(KVMIO, 0x60)
#define KVM_IRQ_LINE _IOW(KVMIO, 0x61, struct kvm_irq_level)
diff --git a/qemu-kvm.c b/qemu-kvm.c
index 91f0222..98777f0 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -143,6 +143,8 @@ struct slot_info {
unsigned long userspace_addr;
unsigned flags;
int logging_count;
+ unsigned long *dirty_bitmap;
+ unsigned long *dirty_bitmap_old;
};
struct slot_info slots[KVM_MAX_NUM_MEM_REGIONS];
@@ -232,6 +234,29 @@ int kvm_is_containing_region(kvm_context_t kvm, unsigned long phys_addr,
return 1;
}
...Does this work correctly if your user space is 32 bits (i.e. unsigned long is different size in user space and kernel) in both big- and little-endian systems? I'm not sure about all the details, but I think you cannot in general share bitmaps between user space and kernel because of this. Arnd --
That's why the bitmaps are defined as little endian u64 aligned, even on big endian 32-bit systems. Little endian bitmaps are wordsize agnostic, and u64 alignment ensures we can use long-sized bitops on mixed size systems. -- error compiling committee.c: too many arguments to function --
I'm not sure I understand how this macro is going to be used though. If you are just using this in kernel space, that's fine, please go for it. However, if the intention is to use the same macro in user space, putting it into asm-generic/bitops/* is not going to help, because those headers are not available in user space, and I wouldn't want to change that. The definition of the macro is not part of the ABI, so just duplicate it in KVM if you need it there. Arnd --
Yes, I'm just using in kernel space: qemu has its own endian related helpers. So if you allow us to place this macro in asm-generic/bitops/* it will help us. --
I really prefer anything that is generic to be outside kvm, even if kvm is the only user. -- error compiling committee.c: too many arguments to function --
No problem at all then. Thanks for the explanation. Acked-by: Arnd Bergmann <arnd@arndb.de> --
Thanks you both. I will add your Acked-by from now on! Takuya --
100 ns... this is a bit on the low side (and if you can measure it No doubt get.org -> get.opt is measurable, but get.opt->switch.opt is problematic. Have you tried profiling to see where the time is spent Can you post such a test, for an idle large guest? -- error compiling committee.c: too many arguments to function --
Sorry but no, and I agree with your guess. Anyway, I want to do some profiling to confirm this guess. BTW, If we only think about performance improvement of time, optimized get(get.opt) may be enough at this stage. But if we consider the future expansion like using user allocated bitmaps, new API's introduced for switch.opt won't become waste, I think, because we --
Result of "low workload test" (running top during migration) first,
4GB guest
picked up slots[1](len=3757047808) only
*****************************************
get.org get.opt switch.opt
1060875 310292 190335
1076754 301295 188600
655504 318284 196029
529769 301471 325
694796 70216 221172
651868 353073 196184
543339 312865 213236
1061938 72785 203090
689527 323901 249519
621364 323881 473
1063671 70703 192958
915903 336318 174008
1046462 332384 782
1037942 72783 190655
680122 318305 243544
688156 314935 193526
558658 265934 190550
652454 372135 196270
660140 68613 352
1101947 378642 186575
... ... ...
*****************************************
As expected we've got the difference more clearly.
In this case, switch.opt reduced 1/3 (.1 msec) compared to get.opt
for each iteration.
--
User allocated bitmaps have the advantage of reducing pinned memory. However we have plenty more pinned memory allocated in memory slots, so by itself, user allocated bitmaps don't justify this change. Perhaps if we optimize memory slot write protection (I have some ideas about this) we can make the performance improvement more pronounced. -- error compiling committee.c: too many arguments to function --
In that sense, what do you think about the question I sent last week?
=== REPOST 1 ===
>>
>> mark_page_dirty is called with the mmu_lock spinlock held in set_spte.
>> Must find a way to move it outside of the spinlock section.
>>
>
> Oh, it's a serious problem. I have to consider it.
Avi, Marcelo,
Sorry but I have to say that mmu_lock spin_lock problem was completely out of
my mind. Although I looked through the code, it seems not easy to move the
set_bit_user to outside of spinlock section without breaking the semantics of
its protection.
So this may take some time to solve.
But personally, I want to do something for x86's "vmallc() every time" problem
even though moving dirty bitmaps to user space cannot be achieved soon.
In that sense, do you mind if we do double buffering without moving dirty bitmaps to
user space?
I know that the resource for vmalloc() is precious for x86 but even now, at the timing
of get_dirty_log, we use the same amount of memory as double buffering.
It's really nice!
Even now we can measure the performance improvement by introducing switch ioctl
when guest is relatively idle, so the combination will be really effective!
=== REPOST 2 ===
>>
>> Can you post such a test, for an idle large guest?
>
> OK, I'll do!
Result of "low workload test" (running top during migration) first,
4GB guest
picked up slots[1](len=3757047808) only
*****************************************
get.org get.opt switch.opt
1060875 310292 190335
1076754 301295 188600
655504 318284 196029
529769 301471 325
694796 70216 221172
651868 353073 196184
543339 312865 213236
1061938 72785 203090
689527 323901 249519
621364 323881 473
1063671 70703 192958
915903 336318 174008
1046462 332384 782
1037942 72783 190655
680122 ...This is bad. I haven't encountered that one at all so far, but I guess Could you please point me to a git tree where everything's readily applied? That would make testing a lot easier. Alex --
