PATCH/RFC: [kdump] fix APIC shutdown sequence

From: Martin Wilck
Date: Monday, August 6, 2007 - 8:08 am

This patch fixes a problem that we have encountered
with kdump under high I/O load on some machines.
The affected machines have an Intel ICH7 chipset
with a 6702PXH PCI Express-to-PCI bridge
(8086:032c) containing an IO-APIC.

The symptom is that certain controllers connected
to the 6702PXH bridge receive no IRQs in the
kdump kernel. In the error case (about 20% of
all cases), the IRR bit of the IO-APIC pin for that
controller remains set after the kdump kernel
starts, indicating an IRQ in progress. We haven't found
a way to recover from this situation once it has
occurred, short of a system reset.

The error is caused by IRQs arriving while the APIC
subsystem is deactivated in machine_crash_shutdown().

Apparently, the IO-APIC gets stuck if it sends an IRQ
message to a local APIC and never receives an EOI for that
message. This can happen for several reasons:

1. If, under SMP, the IO-APIC logical destination field is
   set by the IRQ balancing code to one of the "other"
   CPUs (i.e. not the crashing_cpu), and an IRQ arrives
   on the respective pin after that CPU has shut down
   its local APIC (but before the IO-APIC pin is masked),
   the IRQ message can't be delivered.

2. The crashing CPU itself disables its local APIC
   before the IO-APIC, leaving a short time window
   where the IO-APIC can receive IRQs, but not
   deliver them.

3. An IRQ is received and delivered to a local APIC, but
   no CPU ever executes the IRQ handler and therefore no
   EOI is sent.
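
To make the "stuck" state concrete, here is a minimal user-space sketch
of one level-triggered IO-APIC pin. It is a simplified model, not kernel
code: struct pin_model, pin_assert_irq() and pin_eoi() are hypothetical
names invented for this illustration, and the real hardware has more
state than the three fields shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified, hypothetical model of one level-triggered IO-APIC pin. */
struct pin_model {
    uint8_t vector;      /* vector programmed into the redirection entry */
    bool    masked;      /* mask bit of the redirection entry */
    bool    remote_irr;  /* set when a message is sent, cleared by EOI */
};

/*
 * The device asserts the interrupt line.
 * Returns true if the IO-APIC sends an IRQ message for it.
 */
bool pin_assert_irq(struct pin_model *p)
{
    assert(p);
    if (p->masked || p->remote_irr)
        return false;       /* masked, or an earlier IRQ is still in progress */
    p->remote_irr = true;   /* message sent, pin now waits for an EOI */
    return true;
}

/* A local APIC broadcasts an EOI carrying the vector it has serviced. */
void pin_eoi(struct pin_model *p, uint8_t vector)
{
    assert(p);
    if (p->vector == vector)
        p->remote_irr = false;  /* pin may deliver interrupts again */
}
```

In this model, a pin whose message was sent but never acknowledged
refuses every later pin_assert_irq() call until pin_eoi() is called
with the matching vector, which mirrors the symptom seen in the kdump
kernel.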

After a lot of failed attempts, I have come up with the
following patch, which fixes the problem.

The patch first masks all IO-APIC pins to avoid a situation
where the IO-APIC can receive, but not deliver, the IRQs.
Moreover, it enables interrupts for a short period before
finally starting the kdump kernel, so that EOIs can be
sent to the APICs as necessary.
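
The ordering can be sketched in user space as follows. This is not the
attached patch: mask_all_pins(), apply_eoi() and pins_in_progress() are
hypothetical helpers that operate on an array holding the low 32 bits
of each redirection table entry, and the EOI handling is only modelled.
The bit positions (vector in bits 0-7, remote IRR in bit 14, mask in
bit 16) follow the usual IO-APIC redirection entry layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RTE_VECTOR_MASK  0xffu        /* bits 0-7: interrupt vector */
#define RTE_REMOTE_IRR   (1u << 14)   /* bit 14: IRQ sent, EOI pending */
#define RTE_MASKED       (1u << 16)   /* bit 16: pin is masked */

/* Step 1: mask every pin, leaving vector and IRR state untouched. */
void mask_all_pins(uint32_t *rte_lo, size_t npins)
{
    assert(rte_lo);
    for (size_t i = 0; i < npins; i++)
        rte_lo[i] |= RTE_MASKED;
}

/*
 * Step 2 (modelled): an EOI for `vector` clears remote IRR on every pin
 * programmed with that vector. Returns the number of pins released.
 */
size_t apply_eoi(uint32_t *rte_lo, size_t npins, uint8_t vector)
{
    size_t released = 0;

    assert(rte_lo);
    for (size_t i = 0; i < npins; i++) {
        if ((rte_lo[i] & RTE_REMOTE_IRR) &&
            (rte_lo[i] & RTE_VECTOR_MASK) == vector) {
            rte_lo[i] &= ~RTE_REMOTE_IRR;
            released++;
        }
    }
    return released;
}

/* Count the pins that are still waiting for an EOI. */
size_t pins_in_progress(const uint32_t *rte_lo, size_t npins)
{
    size_t n = 0;

    assert(rte_lo);
    for (size_t i = 0; i < npins; i++)
        if (rte_lo[i] & RTE_REMOTE_IRR)
            n++;
    return n;
}
```

Because mask_all_pins() only sets the mask bit, a later EOI can still be
matched to its pin by vector. In this model, an entry whose vector field
had been zeroed while remote IRR was set would never be released by
apply_eoi().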

Notes:
a) Simply calling disable_IO_APIC() early doesn't
work, probably because that also clears the IRQ vector
information, so that arriving EOI messages can't be
associated with pins by the IO-APIC.
b) We have tried patches that avoid re-enabling interrupts,
but so far without success. Re-enabling IRQs is of course
dangerous while dumping, and I'd rather find a way to avoid it.
c) There are indications that, besides the EOI, the PCI IRQ
pin must also be deasserted at least for a short time. That
usually requires that the driver IRQ handler is called and
tells the FW that the IRQ was received. Whether or not this
is a requirement has not been conclusively established yet.
d) The problem is only seen with the IO-APIC in the 6702PXH
PCI bridge, which is the system's secondary IO-APIC. On the
system's main IO-APIC, we see other IRQs (timer etc.) arrive
and never get an EOI, but we see no errors.

The patch below is against 2.6.23-rc1. The problem was
originally analyzed and the patch developed against the
Red Hat EL5 kernel (2.6.18-8.el5). I verified that the
problem still occurs with 2.6.23-rc1, and that the patch
below fixes the problem.

Regards
Martin

PS: patch attached in MIME format because it would otherwise be
mangled into quoted-printable by my mail relay.

-- 
Martin Wilck
PRIMERGY System Software Engineer
FSC IP ESP DE6

Fujitsu Siemens Computers GmbH
Heinz-Nixdorf-Ring 1
33106 Paderborn
Germany

Tel:			++49 5251 8 15113
Fax:			++49 5251 8 20409
Email:			mailto:martin.wilck@fujitsu-siemens.com
Internet:		http://www.fujitsu-siemens.com
Company Details:	http://www.fujitsu-siemens.com/imprint.html