Hi,

After upgrading a Xen virtual machine to Debian's 2.6.26-4 kernel, I noticed that the idle counter doubled its pace on one of the machines:

service2:~# yes >/dev/null &
[1] 32578
service2:~# grep cpu0 /proc/stat; sleep 1; grep cpu0 /proc/stat
cpu0 141208 9113 57273 13379659 61012 0 792 2350 0
cpu0 141310 9113 57274 13379659 61012 0 792 2350 0
service2:~# fg
yes > /dev/null
^C
service2:~# grep cpu0 /proc/stat; sleep 1; grep cpu0 /proc/stat
cpu0 141952 9113 57277 13383481 61012 0 792 2350 0
cpu0 141953 9113 57278 13383681 61012 0 792 2350 0

One out of three machines shows this effect, with the exact same kernel and Xen versions (3.2.0, dom0 is Debian's stock Etch 2.6.18 kernel). They aren't hosted by the same machine, though: the misbehaving one is on a different installation with very similar hardware (3 vs 2 GHz). All the guests are paravirtual.

service2:~# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Xeon(TM) CPU 3.06GHz
stepping : 5
cpu MHz : 3056.482
cache size : 1024 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
apicid : 1
initial apicid : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu de tsc msr pae cx8 cmov pat pse36 clflush dts mmx fxsr sse sse2 ss ht pbe up pebs bts cid xtpr
bogomips : 6128.11
clflush size : 64
power management:

I don't know what else would be useful, please ask.
-- 
Thanks,
Feri.
--
So you're saying that they are identical Xen and guest kernel binaries, but one of three is showing doubled idle time? That seems unlikely. The source of that time is Xen itself, and I think it should be hardware independent, though I guess it's possible there's something going on in the Xen-level timekeeping.

That said, I think there's some chance that stolen time may get counted as idle time. Does the one machine with a different outcome have something else running in another virtual machine (including dom0)?
--
Above, the difference of the first numbers is 102, which, given that USER_HZ=100, means that the CPU spent all its cycles during the sleep

And here, the difference of the fourth numbers is 200, meaning that the processor spent 200% of its time in idle state during this second! (If I read the procfs documentation correctly, of course.) This seems wrong by a factor of two, as there are only 100 "ticks" in a second (actually, this kernel is tickless, but USER_HZ=100, as I'm

I was very much surprised myself, too... The version numbers surely are the same, but the binaries came from different downloads. I'll

Yes, both Xen instances run other domUs, and about one on each consumes significant CPU. The other domUs are mostly idle, and the dom0s too.
-- 
Thanks for taking time,
Feri.
--
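For anyone who wants to redo the arithmetic above, here is a minimal sketch that diffs two cpu0 lines from /proc/stat and converts the tick deltas to a percentage of the sampling interval. The field names are my assumption based on proc(5) (user, nice, system, idle, iowait, irq, softirq, steal, guest), USER_HZ=100 is taken from the message above, and the helper names are made up for illustration.

```python
# Sketch only: field order assumed from proc(5); USER_HZ=100 as stated in the thread.
USER_HZ = 100
FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest")


def parse_cpu_line(line):
    """Split a 'cpuN ...' line from /proc/stat into a {field: ticks} dict."""
    parts = line.split()
    return dict(zip(FIELDS, (int(p) for p in parts[1:])))


def tick_deltas(before, after):
    """Per-field tick difference between two samples of the same CPU line."""
    a, b = parse_cpu_line(before), parse_cpu_line(after)
    return {name: b[name] - a[name] for name in a if name in b}


def percent_of_interval(deltas, seconds=1.0):
    """Express tick deltas as a percentage of one CPU over the interval."""
    return {name: 100.0 * ticks / (USER_HZ * seconds)
            for name, ticks in deltas.items()}


# Samples copied from the first message: with `yes` running, then idle.
busy = tick_deltas("cpu0 141208 9113 57273 13379659 61012 0 792 2350 0",
                   "cpu0 141310 9113 57274 13379659 61012 0 792 2350 0")
idle = tick_deltas("cpu0 141952 9113 57277 13383481 61012 0 792 2350 0",
                   "cpu0 141953 9113 57278 13383681 61012 0 792 2350 0")

print("busy:", busy["user"], "user ticks,", busy["idle"], "idle ticks")
print("idle:", idle["idle"], "idle ticks,", idle["steal"], "steal ticks")
print("idle percent:", percent_of_interval(idle)["idle"])
```

With these samples the user delta is 102 ticks while busy and the idle delta is 200 ticks while idle, and the steal column (if the eighth field is indeed steal) does not move in either pair.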
Now I upgraded another domU, and that also shows this doubling effect, so I've got two domUs (running on xen2-ha) misbehaving, and two others (running on xen2) behaving correctly. On the first:

wferi@xen2-ha:~$ sudo xm info
host : xen2-ha
release : 2.6.18-6-xen-686
version : #1 SMP Mon Aug 18 12:56:50 UTC 2008
machine : i686
nr_cpus : 4
nr_nodes : 1
cores_per_socket : 1
threads_per_core : 2
cpu_mhz : 3056
hw_caps : bfebfbff:00000000:00000000:00000080:00004400
total_memory : 3071
free_memory : 991
node_to_cpu : node0:0-3
xen_major : 3
xen_minor : 2
xen_extra : -1
xen_caps : xen-3.0-x86_32p
xen_scheduler : credit
xen_pagesize : 4096
platform_params : virt_start=0xf5800000
xen_changeset : unavailable
cc_compiler : gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)
cc_compile_by : fs
cc_compile_domain : debian.org
cc_compile_date : Mon Mar 10 15:50:27 UTC 2008
xend_config_format : 4

while on the other the differing lines are:

host : xen2
cpu_mhz : 1993
total_memory : 4991
free_memory : 2464

Hope this adds some useful info.
-- 
Regards,
Feri.
--
I confirmed this in detail on Sept 19. Now I'm trying to resurrect this thread; maybe some visual stimulation will help...
