On Tue, Nov 27, 2007 at 11:55:03AM +0100, Andi Kleen wrote:

I understand that this is how it's supposed to work, but my analysis nevertheless points to an inability to route irqs to any processor other than the boot cpu. I conducted a test whereby I forced a crash on cpu0, and then again on cpu3. On all the other systems I have available, I'm able to boot to a kexec kernel without issue. However, on this system:

http://www.supermicro.com/Aplus/motherboard/Opteron8000/MCP55/H8QM8-2.cfm

crashing on any cpu other than the boot cpu leads to a hang in calibrate_delay on reboot.

I certainly make room for the notion that this could be an ioapic programming error, but I don't see that this system has a different ioapic than my other test systems. I've also simply tried booting the failing system with noapic, to force it not to use the ioapic, to no avail. If you could suggest another test to point to another root cause, I would happily run it. The evidence that I have at the moment suggests to me that the ioapic, while normally getting successfully programmed to deliver interrupts to the appropriate cpu, is unable to do so on this system, and that removing the traversal of the system bus on the affected system restores functionality. That suggests a system bus error to me.

I think that's part of the issue. Somehow (and in fairness I don't know how this occurs), the crash of the kernel affects the functionality of the hypertransport bus in such a way that it can no longer deliver interrupts. I'm not sure how, beyond the testing that I describe above, I can further prove or disprove this. If you could suggest a test or observation to make so that I could further diagnose this, I would appreciate it. Currently the code seems to configure the ioapic properly on all systems available to me, except the supermicro board above.

Any thoughts welcome here.
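For anyone wanting to repeat the cpu0 / cpu3 comparison, here is a minimal sketch of one way to force the crash on a chosen cpu. The message above does not say how the crash was triggered, so the method here is an assumption: pin the process to the target cpu, then write 'c' to /proc/sysrq-trigger. The helper name crash_on_cpu and its trigger_path parameter are hypothetical, and the sketch assumes it runs as root with a kexec crash kernel already loaded.

```python
import os

# Standard sysrq trigger file; writing 'c' requests a deliberate crash.
SYSRQ_TRIGGER = "/proc/sysrq-trigger"


def crash_on_cpu(cpu, trigger_path=SYSRQ_TRIGGER):
    """Pin this process to `cpu`, then write 'c' to the trigger file.

    Hypothetical helper. With the real trigger path the write does not
    return, because the kernel crashes on the cpu running this process.
    The trigger_path parameter exists only so the logic can be exercised
    against an ordinary file.
    """
    # pid 0 means "the calling process"; afterwards it may only run on `cpu`.
    os.sched_setaffinity(0, {cpu})
    with open(trigger_path, "w") as trigger:
        trigger.write("c")
    return cpu
```

Calling crash_on_cpu(0) and, after the reboot, crash_on_cpu(3) would correspond to the two cases described above. Pointing trigger_path at a scratch file exercises the pinning without taking the machine down.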
Thanks & Regards
Neil

-- 
/***************************************************
 *Neil Horman
 *Software Engineer
 *Red Hat, Inc.
 *nhorman@redhat.com
 *gpg keyid: 1024D / 0x92A74FA1
 *http://pgp.mit.edu
 ***************************************************/
