On Tue, Nov 27, 2007 at 10:41:15AM -0800, Ben Woodard wrote:

Ben,

I tend to agree. I think re-enabling the APIC early in the boot process
provides a greater degree of reliability, in that it more quickly restores
the system to a state where booting on a cpu other than cpu0 is more likely
to work. But I have to say that, overall, booting a secondary kernel on
cpu0, when possible, seems to offer the highest degree of reliability.

Perhaps what we need is a 'both' solution. Re-enabling the apic to full smp
functionality early in the boot process is a good solution for the problems
which we are hypothesizing here, and would be a good thing to do in general,
but it doesn't preclude also attempting to switch back to cpu0 during a
crash. Perhaps it would be worthwhile to:

1) Investigate the early enablement of the ioapic for x86[_64]

2) Implement my previously proposed patch with the addition of a handshake
   element, such that:

   a) when the boot cpu gets the ipi from machine_crash_shutdown, it flags
      the fact that it is going to boot the kexec kernel with a global
      variable

   b) the crashing cpu loops waiting for either:
      I)   a timeout of 1 second
      II)  a reduction of the halt count to zero
      III) the setting of the flag mentioned in (a)

   c) the crashing cpu, if it sees that it is not the boot cpu AND that the
      flag from (a) is set, will halt itself; otherwise it will set the flag
      and boot the kexec image itself.

With this modification, we can try to relocate to cpu0, and if we fail, we
fall back to booting on the crashing processor.

I'll work up a patch that implements (2), unless there are strong
objections. I see no reason why we can't implement this 'both' solution.

Regards
Neil

--
/***************************************************
 *Neil Horman
 *Software Engineer
 *Red Hat, Inc.
 *nhorman@redhat.com
 *gpg keyid: 1024D / 0x92A74FA1
 *http://pgp.mit.edu
 ***************************************************/
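For illustration only, the handshake in (2) could be modelled in userspace C roughly as below. This is a sketch under stated assumptions, not the proposed patch: every name here (struct crash_handshake, crash_try_claim_boot, crash_cpu_halted, crashing_cpu_decide) is hypothetical, C11 atomics stand in for whatever primitives the kernel code would use, and a poll budget stands in for the 1 second timeout. The compare-and-swap on the flag is an addition of this sketch, so that a boot cpu and a crashing cpu cannot both decide to boot the kexec image.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* What the crashing cpu ends up doing after the handshake. */
enum crash_action {
	CRASH_HALT_SELF,	/* boot cpu has taken over, step (c) first half */
	CRASH_BOOT_KEXEC,	/* fall back to booting on the crashing cpu */
};

/* Hypothetical shared state; the mail only calls for "a global variable". */
struct crash_handshake {
	atomic_bool boot_claimed;	/* flag from step (a) */
	atomic_int halt_count;		/* cpus that have not halted yet */
};

static void crash_handshake_init(struct crash_handshake *hs, int other_cpus)
{
	assert(hs != NULL);
	atomic_init(&hs->boot_claimed, false);
	atomic_init(&hs->halt_count, other_cpus);
}

/*
 * Step (a): called by the boot cpu when it receives the crash ipi.
 * Returns true only for the first caller, so exactly one cpu may go on
 * to boot the kexec image.
 */
static bool crash_try_claim_boot(struct crash_handshake *hs)
{
	bool expected = false;

	return atomic_compare_exchange_strong(&hs->boot_claimed,
					      &expected, true);
}

/* Called by every other cpu just before it halts itself. */
static void crash_cpu_halted(struct crash_handshake *hs)
{
	atomic_fetch_sub(&hs->halt_count, 1);
}

/*
 * Steps (b) and (c): run on the crashing cpu. max_polls is a stand-in
 * for the 1 second timeout of (b)(I).
 */
static enum crash_action crashing_cpu_decide(struct crash_handshake *hs,
					     bool is_boot_cpu,
					     unsigned long max_polls)
{
	for (unsigned long i = 0; i < max_polls; i++) {
		if (atomic_load(&hs->boot_claimed))	/* (b)(III) */
			break;
		if (atomic_load(&hs->halt_count) <= 0)	/* (b)(II) */
			break;
	}

	if (is_boot_cpu) {
		/* Already on the boot cpu: nothing to relocate to. */
		atomic_store(&hs->boot_claimed, true);
		return CRASH_BOOT_KEXEC;
	}

	/* Not the boot cpu: halt if the flag was set, else set it and boot. */
	return crash_try_claim_boot(hs) ? CRASH_BOOT_KEXEC : CRASH_HALT_SELF;
}
```

In this model, a timeout with the flag still clear leads to CRASH_BOOT_KEXEC on the crashing cpu, which is the fallback described above; a boot cpu that wakes up late finds the flag already set and crash_try_claim_boot() returns false for it.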
