Upstream PV guests fail to boot because of a NULL pointer. It is possible that xen guests have irq_desc->chip_data = NULL.

Test for NULL chip_data pointer before attempting to complete an irq move.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 127b871..eb2789c 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2545,6 +2545,9 @@ void irq_force_complete_move(int irq)
 	struct irq_desc *desc = irq_to_desc(irq);
 	struct irq_cfg *cfg = desc->chip_data;
 
+	if (!cfg)
+		return;
+
 	__irq_complete_move(&desc, cfg->vector);
 }
 #else
--
Can you provide a short example of a test scenario? As in, what should I do to reproduce it? --
Take the latest upstream (well ... to be honest, a bit older than that because of some other bugs) -- take 2.6.33 and try to boot it as a PV guest. I'm using a RHEL5 Xen HV fwiw ... --
Another ingredient is to boot the guest with a configuration where its maxvcpus is greater than its vcpus. If you have RHEL 5.5 userspace then you can create a config with lines like this:

maxvcpus = 4
vcpus = 2

With that you'll crash on boot. Then you can check that irq_force_complete_move is on the stack if you have "preserve" for on_crash and use xenctx to look at the state of the vcpus.

If the Xen you're using doesn't support the maxvcpus var, then I believe you can apply the same principle in a different way, using the vcpus_avail var. Or, you can boot with > 1 vcpus and then attempt to remove one with 'xm vcpu-set'.
--
2.6.34-rc5 PV boots under Xen for me (and has pretty much since 2.6.33 + Suresh's fix for the CONFIG_RODATA_MARK). Perhaps I am missing some of the .config options you have set that make it not work? The irqbalance daemon looks to be running - but I think you are hitting this during bootup? How long do you have to wait for this to trigger? How many CPUs did you assign to your guest? OK, so your control domain is RHEL5. Mine is Jeremy's xen/next one (2.6.32). Let me try to compile RHEL5 under FC11 - any tricks necessary to do that? --
It happens during bootup. I don't have a 2.6.33 vanilla panic handy but I do have one from an earlier 2.6.32...

rip: ffffffff81256f45 delay_tsc+0x45
rsp: ffff8800fac95a98
rax: fffffffff6ef46d0 rbx: 00000002 rcx: f6ef46d0 rdx: 0010850c
rsi: 002b3bb6 rdi: 002b3bcc rbp: ffff8800fac95ab8
r8: ffffffff r9: 00000002 r10: 00000002 r11: 00000000
r12: fffffffff6dec1c4 r13: 00000002 r14: 002b3bcc r15: 00000001
cs: 0000e033 ds: 00000000 fs: 00000000 gs: 00000000

Stack:
 000000000002ef45 ffff8800fac95c88 0000000000000009 ffff8800fac93540
 ffff8800fac95ac8 ffffffff81256ef6 ffff8800fac95b48 ffffffff814c6341
 0000000000000010 ffff8800fac95b38 ffff880000000008 ffff8800fac95b58
 ffff8800fac95b08 a22d306b065d4a66 0000000000000000 0000000000000000

Code:
f3 90 65 8b 1c 25 d8 e3 00 00 44 39 eb 75 23 66 66 90 0f ae e8<e8> 46 3d dc ff 66 90 48 98 48 89

Call Trace:
 [<ffffffff81256f45>] delay_tsc+0x45 <--
 [<ffffffff81256ef6>] __const_udelay+0x46
 [<ffffffff814c6341>] panic+0x135
 [<ffffffff814ca23c>] oops_end+0xdc
 [<ffffffff81042272>] no_context+0xf2
 [<ffffffff8125946c>] __bitmap_weight+0x8c
 [<ffffffff81042505>] __bad_area_nosemaphore+0x125
 [<ffffffff8105fad4>] find_busiest_group+0x254
 [<ffffffff810425d3>] bad_area_nosemaphore+0x13
 [<ffffffff814cbccf>] do_page_fault+0x2ef
 [<ffffffff814c9595>] page_fault+0x25
 [<ffffffff810302f2>] irq_force_complete_move+0x12
 [<ffffffff81015214>] fixup_irqs+0xa4
 [<ffffffff8102ce59>] cpu_disable_common+0x1a9
 [<ffffffff8100f9c2>] check_events+0x12
 [<ffffffff810c2550>] __stop_machine+0x120
 [<ffffffff8100ff75>] xen_cpu_disable+0x25
 [<ffffffff814b0427>] take_cpu_down+0x17
 [<ffffffff810c25f9>] stop_cpu+0xa9
 [<ffffffff8108869d>] worker_thread+0x16d
 [<ffffffff8100f19d>] xen_force_evtchn_callback+0xd
 [<ffffffff8108dd00>] wake_up_bit+0x40
 [<ffffffff814c90f6>] ...
Yes. No luck reproducing the crash/panic. I am just not seeing the failure you guys are seeing. Let me build 2.6.33 vanilla once more (with CONFIG_DEBUG_MARK_RODATA=n) and check this. I will also install a vanilla RHEL5 dom0, as it looks impossible to compile a 2.6.18-era kernel under FC11. The Xen I am using is xen-unstable - so 4.0.1. I know that the IRQ balance code in the Xen hypervisor was fixed in 4.0 (it used to run out of context - now it runs in the IRQ context). Maybe this bug you are seeing (and have the fix for) is just a red herring? --
Let me try reproducing this on FC11 + 2.6.33. P. --
Rebuilding everything from scratch did it. I am seeing a similar failure where xenctx reports:

Call Trace:
 [<ffffffff8107f780>] stop_cpu+0xc6 <--
 [<ffffffff8105520e>] worker_thread+0x15d
 [<ffffffff8107f6ba>] __stop_machine+0x106
 [<ffffffff81058afb>] wake_up_bit+0x25
 [<ffffffff81038720>] spin_unlock_irqrestore+0x9
 [<ffffffff810550b1>] spin_lock_irq+0xb
 [<ffffffff810586cb>] kthread+0x7a
 [<ffffffff8100a964>] kernel_thread_helper+0x4
 [<ffffffff81009d61>] int_ret_from_sys_call+0x7
 [<ffffffff814033dd>] retint_restore_args+0x5
 [<ffffffff8100a960>] gs_change+0x13

With this guest file:

kernel = "/mnt/lab/vs11/vmlinuz"
ramdisk = "/mnt/lab/vs11/initramfs.cpio.gz"
memory = 2048
maxvcpus = 4
vcpus = 2
vif = [ 'mac=00:0F:4B:00:00:71, bridge=switch' ]
vfb = [ 'vnc=1, vnclisten=0.0.0.0,vncunused=1']
root = "debug loglevel=10 plymouth:splash=solar plymouth:debug norm console=hvc0 initcall_debug"

This is with the latest linux kernel: d93ac51c7a129db7a1431d859a3ef45a0b1f3fc5 (Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client)

With your patch the PV guest keeps on going. So:

Interestingly enough, I couldn't reproduce this on my Intel box, but on an AMD box with a very whacked TSC (cpu MHz : 2795681.405) I can.
--
Huh ... that's odd. I'll grab a dinar based system and see if I can reproduce it there. It would be interesting to know what the differences are. P. --
On Tue, 27 Apr 2010 11:24:42 -0400 I assume this is needed for 2.6.34? What about 2.6.33.x and earlier? --
Hey Andrew, I actually pinged Chris Wright to see about including this in the -stable branches. I haven't heard anything back so I'll reping him. P. --
It will be applicable for 2.6.33 and beyond. thanks, suresh --
On Wed, 28 Apr 2010 14:29:06 -0400 Well. Pinging people offlist isn't very reliable. Put Cc: <stable@kernel.org> at the end of the changelog and cc stable@kernel.org on the original patch and then the patch will reliably receive consideration for backporting. I have added Cc: <stable@kernel.org> to my copy of the patch, so the -stable guys will at least see it when I drop it after it is merged. But if the x86 maintainers were to merge your patch as you sent it, it would have no Cc: <stable@kernel.org> when it goes into Linus's tree. I worry that if the -stable maintainers see me drop a patch, but the patch in Linus's tree doesn't have the stable tag, they might not merge the fix into -stable. I bugged them about this scenario recently and the reply was a bit waffly ;) By far the safest thing to do is to include the stable tag in your changelog right at the outset. --
It was? I try my best, if I see you drop a patch, to go dig through Linus's tree to find out if it landed there. If not, I leave it in my queue, and do that for a few releases. After a long time (like 6 months), I either ping someone or just drop it from my queue, guessing that someone dropped it for some reason. Yes, that's the _easiest_ and will not get lost. thanks, greg k-h --
This looks like it should be tagged stable for 2.6.33. Is that correct? -hpa --
Never mind... I see it has already been discussed. -hpa --
Commit-ID:  bbd391a15d82e14efe9d69ba64cadb855b061dba
Gitweb:     http://git.kernel.org/tip/bbd391a15d82e14efe9d69ba64cadb855b061dba
Author:     Prarit Bhargava <prarit@redhat.com>
AuthorDate: Tue, 27 Apr 2010 11:24:42 -0400
Committer:  H. Peter Anvin <hpa@zytor.com>
CommitDate: Fri, 30 Apr 2010 14:31:38 -0700

x86: Fix NULL pointer access in irq_force_complete_move() for Xen guests

Upstream PV guests fail to boot because of a NULL pointer in irq_force_complete_move(). It is possible that xen guests have irq_desc->chip_data = NULL.

Test for NULL chip_data pointer before attempting to complete an irq move.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
LKML-Reference: <20100427152434.16193.49104.sendpatchset@prarit.bos.redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: <stable@kernel.org> [2.6.33]
---
 arch/x86/kernel/apic/io_apic.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 127b871..eb2789c 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2545,6 +2545,9 @@ void irq_force_complete_move(int irq)
 	struct irq_desc *desc = irq_to_desc(irq);
 	struct irq_cfg *cfg = desc->chip_data;
 
+	if (!cfg)
+		return;
+
 	__irq_complete_move(&desc, cfg->vector);
 }
 #else
--
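For readers who want to see the guard in isolation, here is a minimal userspace sketch of the pattern the patch adds. It is not kernel code: the mock_* types and functions are hypothetical stand-ins for irq_desc, irq_cfg, irq_force_complete_move() and __irq_complete_move(), and only the chip_data pointer and a move-in-progress flag are modeled. The return values are also an invention of the sketch, added so the behaviour can be checked; the real function returns void.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct irq_cfg; only two fields are modeled. */
struct mock_irq_cfg {
	int vector;
	int move_in_progress;
};

/* Hypothetical stand-in for struct irq_desc. chip_data may be NULL,
 * which the patch description says is possible for Xen guests. */
struct mock_irq_desc {
	void *chip_data;
};

/* Stand-in for __irq_complete_move(): callers must pass a valid cfg. */
static int mock_complete_move(struct mock_irq_cfg *cfg)
{
	assert(cfg != NULL);
	if (!cfg->move_in_progress)
		return 0;
	cfg->move_in_progress = 0;
	return 1;
}

/* Mirrors the patched function: bail out before touching cfg when
 * chip_data is NULL. Returns 1 if a pending move was completed. */
static int mock_force_complete_move(struct mock_irq_desc *desc)
{
	struct mock_irq_cfg *cfg = desc->chip_data;

	if (!cfg)
		return 0;

	return mock_complete_move(cfg);
}
```

Calling mock_force_complete_move() on a descriptor whose chip_data is NULL returns 0 without dereferencing anything. Without the early return, the cfg->vector access in the original function is the kind of dereference that faults at irq_force_complete_move+0x12 in the trace earlier in the thread.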
