From: Matthew Garrett <mjg59@srcf.ucam.org>
Some HP laptops have a problem with their DSDT reporting as
HP/SB400/10000, which includes some code which overrides all temperature
trip points to 16C if the INTIN2 input of the I/O APIC is enabled. This
input is incorrectly designated the ISA IRQ 0 via an interrupt source
override even though it is wired to the output of the master 8259A and
INTIN0 is not connected at all. So far two models have been identified,
namely nx6125 and nx6325.Use a knob provided by the I/O APIC interrupt registration code to
abandon any attempts to route IRQ 0 through the I/O APIC for these
systems.Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
---
Hello,Matthew, you have not added your sign-off with the original patch, please
do so now.Rafael, please try this change together with
patch-2.6.26-rc1-20080505-mpparse-acpi-noirq0-2 and see if this fixes your
system.I have tried the patches with a wildcard entry added to acpi_dmi_table[]
so that it trips for my system. The result is as follows:ENABLING IO-APIC IRQs
..TIMER: vector=0x31 apic1=-1 pin1=-1 apic2=-1 pin2=-1
...trying to set up timer as Virtual Wire IRQ... works.(note the absence of any timer pin), which means the changes work as
expected. Here the timer has been set up as a native local APIC interrupt
routed through the 8259A set up as a "virtual wire".Ingo, please apply the two patches where appropriate.
Maciej
patch-2.6.26-rc1-20080505-mjg59-acpi-noirq0-0
diff -up --recursive --new-file linux-2.6.26-rc1-20080505.macro/arch/x86/kernel/acpi/boot.c linux-2.6.26-rc1-20080505/arch/x86/kernel/acpi/boot.c
--- linux-2.6.26-rc1-20080505.macro/arch/x86/kernel/acpi/boot.c 2008-05-05 02:55:24.000000000 +0000
+++ linux-2.6.26-rc1-20080505/arch/x86/kernel/acpi/boot.c 2008-06-30 23:31:48.000000000 +0000
@@ -1049,6 +1049,17 @@ static int __init force_acpi_ht(const st
}/*
+ * Don't register any I/O APIC entries for the 8254 timer IRQ.
+ */
...
Out of sheer curiosity. What makes you think that IRQ0 is not
connected to INT0 of IO APIC? The IRQ0 pin2 override?Why would we trust that BIOS information if the broken BIOS does
weird things if INT2 of IO APIC is _not_ masked?IMHO on those HP systems we should skip the (bogus?) timer override,
leave IRQ0 -> IOAPIC/INT0 and mask IOAPIC/INT2. Or at least we should
test that configuration.I admit that I have lost track of who has provided which patches,
which patch does exactly what ((IO|L)A)PIC configuration, and who has
tested what combinations, and what the results were.So I've just done the following (based on x86/master as of yesterday):
Booting an HP nx6325
(1) with "acpi_skip_timer_override" to avoid configuration
of IOAPIC/INT2 and to avoid masking of IOAPIC/INT0.plus
(2) adding (hardcoded in check_timer()) some code to explicitly mask
IOAPIC/INT2 to quiesce the fan (work around the DSDT issue).With that setup my test box boots w/o problems (no fan noise, no
throttling, no stablity problems so far). Here the relevant dmesg
parts:Command line: root=/dev/sda2 video=radeonfb:noaccel debug apic=debug acpi_skip_timer_override
...
IOAPIC[0]: apic_id 2, version 0, address 0xfec00000, GSI
...
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: BIOS IRQ0 pin2 override ignored.
...
using C1E aware idle routine
...
init IO_APIC IRQs
IOAPIC[0]: Set routing entry (2-0 -> 0x30 -> IRQ 0 Mode:0 Active:0)
IOAPIC[0]: Set routing entry (2-1 -> 0x31 -> IRQ 1 Mode:0 Active:0)
IOAPIC[0]: Set routing entry (2-2 -> 0x32 -> IRQ 2 Mode:0 Active:0)
...
IO-APIC (apicid-pin) 2-16, 2-17, 2-18, 2-19, 2-20 not connected.
IOAPIC[0]: Set routing entry (2-21 -> 0x49 -> IRQ 21 Mode:1 Active:1)
IO-APIC (apicid-pin) 2-22, 2-23 not connected.
..TIMER: vector=0x30 apic1=0 pin1=0 apic2=-1 pin2=-1
CPU0: AMD Turion(tm) 64 X2 Mobile Technology TL-60 stepping 02
Using local APIC timer interrupts.
...
IO APIC #2......
...
FWIW, it's my preferred one too. So far, I haven't seen any problems with this
approach (ie. "acpi_skip_timer_override" plus masking IOAPIC/INT2 in
check_timer()), even with the C1E aware idle routine.Thanks,
Rafael
--
There is some confusion here apparently -- you certainly have not tested
the timer as an ExtINTA interrupt during the course of this discussion,
because the "8259A Virtual Wire" through the local APIC has always worked
for you. ExtINTA would be the next, fourth and final attempt before a
panic().For a reference, here is the list of configurations of the 8254 timer
interrupt in the order they are tried:1. Native I/O APIC interrupt.
2. "8259A Virtual Wire" through the I/O APIC.
3. "8259A Virtual Wire" through the local APIC of the bootstrap processor.
4. ExtINTA ("Virtual Wire") through the local APIC of the BSP.
The configurations #1 and #2 are only tried if the firmware has supplied
information about how the IRQ0 or 8259A have been wired. The #4 is only
reached in very rare cases where there is some special glue logic between
the 8259A and the APIC (seen before as a workaround for broken hardware;
unlock_ExtINT_logic() is needed for that) or a given implementation of the
8259A is not a full one (hypothetical).Maciej
--
Yes, you're right, sorry.
So in fact I didn't test B.
Thanks,
Rafael
--
The test I prepared a couple of weeks ago which Rafael ran for me. The
test forcibly rerouted IRQ0 to INTIN0, but the timer did not work withThis is why I prepared the workaround to ignore BIOS IRQ0 configuration
Unfortunately it did not work for Rafael's box and diagnostic messages
have been quiesced in the 64-bit variation of code so bootstrap logsI can be blamed for the vast majority. If you are unsure what does what,
Now that you mentioned (2) a sudden insight made me realise in the
absence of an overlapping override we register ISA IRQ2 as an
edge-triggered I/O APIC interrupt, which means its input will be
implicitly unmasked. But despite the ACPI spec does not make it
explicitly clear, there is no such thing as ISA IRQ2 -- what used to be
IRQ2 up to PC/XT got remapped to IRQ9 when the IR2 line of the 8259A was
reused as a cascade to the slave and no device since has ever been able to
make use of IRQ2. Therefore I think we should actually not register thisI am assuming by "Virtual Wire IRQ" you mean a setup we are referring to
as "8259A Virtual Wire" (without this additional qualifier the term is
synonymous to ExtINTA).Given (C) did not work a workaround has been prepared to implement (A).
As it does DMI ID matching and "knows" for this system (C) would not work
I decided there is no point in trying such a configuration.Now you say your machine actually supports (C), which makes me wonder why
Well, this is a piece of information I could not know. Documentation for
the SB400 is not publicly available. I asked whether anybody could answer
how this piece of circuitry has been designed in the chip and you are theOne or more of the following:
1. Because the 8254 is not in fact routed to INTIN0 (as seen with the
test).2. Because almost everything else requires such an override and it was
left in place in the code base by mistake/misunderstanding/etc.OK, to recap: given what you have written, I think we should refrai...
Since we add a new entry to acpi_dmi_table[], I think it's necessary
for this new entry to preserve the semantics of this table, that is
"acpi=force overrules DMI blacklist", am I right?Thanks,
Forrest
--
Can you please elaborate?
What exactly are we supposed to do apart from adding the new entries to the
table?Thanks,
Rafael
--
Assuming it works for Rafael (still haven't had time to pull my nx6125
out of storage):Signed-off-by: mjg@redhat.com
--
Matthew Garrett | mjg59@srcf.ucam.org
--
Rafael, could you please try the latest tip/master that i've just pushed
out:http://people.redhat.com/mingo/tip.git/README
it has the final form of these changes integrated - it should in theory
work out of box on your system, with no boot parameters or other
explicit quirks needed anywhere.Ingo
--
I tested patches [1/2] (your version) and [2/2] on top of today's linux-next
and they work just fine.Thanks,
Rafael
--
thanks Rafael!
Ingo
--
Well, I'm afraid that my information wasn't correct. The patches actually
don't work, but I had my own additional patch that fixed the problem applied,
which I've only just discovered. Sorry for the confusion and the issue is
still unfixed.Thanks,
Rafael
--
Appended is a patch that actually works for me (it's missing the i386 part,
but I can't test that anyway).Please note two things:
(1) The whole acpi_dmi_table[] thing originally depended on __i386__, so it
wouldn't work on x86-64 no matter what. I removed that dependecy, but
I have no idea why it was there and so I'm not sure if that's correct.
(2) The clear_IO_APIC_pin(apic1, pin1) done if disable_irq0_through_ioapic is
true is absolutely essential. The symptoms are 100% reproducible without it.Thanks,
Rafael---
arch/x86/kernel/acpi/boot.c | 43 +++++++++++++++++++++++++++++++++++++------
arch/x86/kernel/io_apic_64.c | 19 ++++++++++++++-----
2 files changed, 51 insertions(+), 11 deletions(-)Index: linux-next/arch/x86/kernel/acpi/boot.c
===================================================================
--- linux-next.orig/arch/x86/kernel/acpi/boot.c
+++ linux-next/arch/x86/kernel/acpi/boot.c
@@ -1363,8 +1363,6 @@ static void __init acpi_process_madt(voi
return;
}-#ifdef __i386__
-
static int __init disable_acpi_irq(const struct dmi_system_id *d)
{
if (!acpi_force) {
@@ -1415,6 +1413,17 @@ static int __init force_acpi_ht(const st
}/*
+ * Don't register any I/O APIC entries for the 8254 timer IRQ.
+ */
+static int __init
+dmi_disable_irq0_through_ioapic(const struct dmi_system_id *d)
+{
+ pr_notice("%s detected: disabling IRQ 0 through I/O APIC\n", d->ident);
+ force_disable_irq0_through_ioapic();
+ return 0;
+}
+
+/*
* If your system is blacklisted here, but you find that acpi=force
* works for you, please contact acpi-devel@sourceforge.net
*/
@@ -1581,11 +1590,35 @@ static struct dmi_system_id __initdata a
DMI_MATCH(DMI_PRODUCT_NAME, "TravelMate 360"),
},
},
+ /*
+ * HP laptops which use a DSDT reporting as HP/SB400/10000,
+ * which includes some code which overrides all temperature
+ * trip points to 16C if the INTIN2 input of the I/O APIC
+ * is enabled. This input is incor...
This is completely broken -- you cannot blindly assume IRQ0 is wired to
the pin #0 of the I/O APIC #0. You have to respect routing information
provided by the system.Ingo, from the sequence above, I gather this code is currently in the
tree:- replace_pin_at_irq(0, 0, 0, apic1, pin1);
- apic1 = 0;
- pin1 = 0;Please revert the change which introduced it. While I recall posting a
patch which added code like this, I clearly stated it was solely for
diagnostics of Rafael's system and not to apply to any tree.Maciej
--
I have had a look at the tree and there are two commits containing
diagnostic code that should have never been applied to the tree. Their
IDs in the linux-next tree are as follows:90221a61a71b7ad659d8741cf1e404506b174982
a74a1cc3df0be89658bc735c8aed80c8392e2c15Please revert them both. Thanks.
Maciej
--
Shouldn't the setup_timer_IRQ0_pin(apic1, pin1, cfg->vector) be removed as
well?Rafael
--
Yes. The whole commit should be reverted as mentioned in the other
e-mail (it didn't even bear my sign-off mark). This would include the
line you are referring to as well. I have submitted a patch to add a
64-bit variation of replace_pin_at_irq() separately.Maciej
--
thanks, applied to tip/x86, to give this some more testing.
the clear_IO_APIC_pin() is the most worrisome aspect of the change - but
since we are already in limited quirk mode, does it hurt? Maciej, any
preferences?Ingo
--
It makes absolutely no sense and should be harmful to call
clear_IO_APIC_pin(apic1, pin1) here, because both apic1 and pin1 should be
equal to -1 here. If it has to be called, then I suppose the DMI matching
did not work and the workaround has not been enabled.Rafael, can you please provide a full bootstrap log obtained with all the
changes, but *without* this part? Also, can you please verify DMI IDs of
your system (dmidecode or /sys/class/dmi, I am told)?I do think we should record DMI vendor and name information in the
bootstrap log. It won't take a lot of space and will be more useful than
some other IDs we never use for anything else, which we obtain from some
other tables.Maciej
--
BTW, did you even to look at the code _as_ _is_ in linux-next?
In fact, it is _impossible_ that either apic1 or pin1 are equal to -1 at this
point, because of this part:/*
* Some BIOS writers are clueless and report the ExtINTA
* I/O APIC input from the cascaded 8259A as the timer
* interrupt input. So just in case, if only one pin
* was found above, try it both directly and through the
* 8259A.
*/
if (pin1 == -1) {
pin1 = pin2;
apic1 = apic2;
no_pin1 = 1;
} else if (pin2 == -1) {
pin2 = pin1;
apic2 = apic1;
}that originates from your patch.
End even without this part apic1 and pin1 are _not_ equal to -1 on this box
(apic2 and pin2 are, but that's a different matter).Thanks,
Rafael
--
Well, I hope it has not changed too much since my recent work in this
And the conclusion is? If we leave MP-table setups aside as irrelevant
for this configuration, we can be almost certain apic2 and pin2 are both
equal to -1 at this point. This is because (unlike the MP table) ACPI has
no provisions in its tables for ExtINTA APIC interrupt inputs. Therefore
the only case apic2 and pin2 may not be equal -1 here is when the firmware
had set up one of the inputs as such in the hardware before initiating
bootstrap which has been subsequently noted by a piece of code in
enable_IO_APIC() which examines the I/O APIC for such a condition.I have taken these circumstances very much into account when preparing
the workaround, based on the assumption that if the firmware has set up an
I/O APIC line as an ExtINTA interrupt, then it means it considers it
suitable to use in such a manner. This furthermore implies the line
should be safe to be used in any valid 8259A mode of operation, such as
one we use to forward IRQ0 transparently through the 8259A (we
double-check it just in case though, as workarounds for hardware bugs in
the past made it not always true). The workaround therefore applies to
genuine IRQ0 routing as reported by ACPI only and not any possible legacy
ExtINTA fallback that we may attempt to use.Of course, as determined previously, the ExtINTA line is not safe to be
used on your box, but it has not been set up by the firmware as an ExtINTA
interrupt either, so the assumption mentioned above remains valid and has
no negative impact on your system. At this point all of apic1, apic2,
pin1 and pin2 should be equal -1, which means the reassignments you quotedWhich means the workaround has not triggered and the rest of cosideration
is therefore irrelevant. Please get us these DMI IDs, so that we can see
what's wrong with the quirk.Maciej
--
Which workaround?
There are no any workarounds related to this problem in the 20080704 linux-next
I used those DMI IDs and your quirk didn't work, even after removing the
dependency of acpi_dmi_table[] from __i386__. For this reason, I prepared an
alternative patch that did work and posted it for your information.Now, you're saying that my patch couldn't work, while in fact it did.
Thanks,
Rafael
--
I have the nx6125 version of the laptop, though it's not long for this
world. Regardless, dmi data below. Earlier, I looked at your DMI match
strings up-thread and they seemed good, so you may be barking up the
wrong tree here.ray@phoenix:~$ sudo grep . /sys/class/dmi/id/*
/sys/class/dmi/id/bios_date:03/10/2006
/sys/class/dmi/id/bios_vendor:Hewlett-Packard
/sys/class/dmi/id/bios_version:68DTT Ver. F.0E
/sys/class/dmi/id/board_name:308B
/sys/class/dmi/id/board_vendor:Hewlett-Packard
/sys/class/dmi/id/board_version:KBC Version 45.26
/sys/class/dmi/id/chassis_asset_tag:
/sys/class/dmi/id/chassis_serial:CND5500R4K
/sys/class/dmi/id/chassis_type:10
/sys/class/dmi/id/chassis_vendor:Hewlett-Packard
/sys/class/dmi/id/modalias:dmi:bvnHewlett-Packard:bvr68DTTVer.F.0E:bd03/10/2006:svnHewlett-Packard:pnHPCompaqnx6125(PZ880UA#ABA):pvrF.0E:rvnHewlett-Packard:rn308B:rvrKBCVersion45.26:cvnHewlett-Packard:ct10:cvr:
/sys/class/dmi/id/product_name:HP Compaq nx6125 (PZ880UA#ABA)
/sys/class/dmi/id/product_serial:CND5500R4K
/sys/class/dmi/id/product_uuid:4D0AF52B-ED6E-DA11-1880-66990A524D29
/sys/class/dmi/id/product_version:F.0E
/sys/class/dmi/id/sys_vendor:Hewlett-Packard
/sys/class/dmi/id/uevent:MODALIAS=dmi:bvnHewlett-Packard:bvr68DTTVer.F.0E:bd03/10/2006:svnHewlett-Packard:pnHPCompaqnx6125(PZ880UA#ABA):pvrF.0E:rvnHewlett-Packard:rn308B:rvrKBCVersion45.26:cvnHewlett-Packard:ct10:cvr:
--
Do you realize that the clear_IO_APIC_pin(apic1, pin1) thing is _only_ called
I assume you want the boot log with the ACPI debugging messages enabled?
See above.
Thanks,
Rafael
--
Well, it is the very intent of the DMI quirk to set apic1 and pin1 both
to -1, as a result of IRQ0 being absent from our I/O APIC interrupt
routing table. Therefore if the quirk did indeed work, a call to
clear_IO_APIC_pin() is useless and likely harmful as its callees don't do
range checking (my understanding of code is it results in random poking at
the local APIC through the FIX_APIC_BASE fixmap). There should be nothing
to clear too, as interrupt redirection entries for all the I/O APIC inputs
are cleared (the mask is set to 1 and the remaining fields zeroed) when
clear_IO_APIC() is called from enable_IO_APIC() upon initialization and
all the unused ones (not referred to from anywhere in the interrupt
routing table) are never touched afterwards.Maciej
--
Sorry, the patch I posted was _instead_ of your previous patch with the quirk,
because that patch didn't work. I don't know why it didn't work, however, I
can only say it didn't work after removing the __i386__ dependency of
acpi_dmi_table[].My patch is on top of the linux-next tree that didn't contain your patch.
So, my patch adds a quirk that sets disable_irq0_through_ioapic to 1 (this
variable is defined differently in my patch) and uses it to skip the part of
check_timer() that breaks my box.I hope that makes things clear.
Thanks,
Rafael
--
I don't recall seeing a note that my patch was reverted. I had a careful
look at yours now and it applies on top of some diagnostic code which
should never have been applied to the tree in the first place as it was
not meant to and would also break the majority of systems out there. I am
fairly sure this piece of code is the reason my implementation of theIt does, thanks. However I insist on getting this issue dealt with the
way I proposed -- check_timer() is too complicated and too fragile to mess
with as our history has already shown. If IRQ0 is not to be routed
through the I/O APIC, then it should never be registered as an I/O APIC
interrupt in the first place.I am fairly with the offending change removed my fix will work for you as
expected as I went through the effort of checking at the run time that
once activated it makes the kernel correctly avoid the sequence in
check_timer() that hits your system, so it is only the DMI ID matching
that could have gone wrong, but your change indicates this is actually not
the case.Maciej
--
So I'm going to apply the appended combined patch on top of linux-next and
retest.Thanks,
Rafael---
arch/x86/kernel/acpi/boot.c | 53 ++++++++++++++++++++++++++++++++++++++-----
arch/x86/kernel/io_apic_64.c | 5 ----
2 files changed, 47 insertions(+), 11 deletions(-)Index: linux-next/arch/x86/kernel/acpi/boot.c
===================================================================
--- linux-next.orig/arch/x86/kernel/acpi/boot.c
+++ linux-next/arch/x86/kernel/acpi/boot.c
@@ -83,6 +83,8 @@ int acpi_lapic;
int acpi_ioapic;
int acpi_strict;+static int disable_irq0_through_ioapic __initdata;
+
u8 acpi_sci_flags __initdata;
int acpi_sci_override_gsi __initdata;
int acpi_skip_timer_override __initdata;
@@ -990,6 +992,10 @@ void __init mp_override_legacy_irq(u8 bu
int pin;
struct mp_config_intsrc mp_irq;+ /* Skip the 8254 timer interrupt (IRQ 0) if requested. */
+ if (bus_irq == 0 && disable_irq0_through_ioapic)
+ return;
+
/*
* Convert 'gsi' to 'ioapic.pin'.
*/
@@ -1056,6 +1062,10 @@ void __init mp_config_acpi_legacy_irqs(v
for (i = 0; i < 16; i++) {
int idx;+ /* Skip the 8254 timer interrupt (IRQ 0) if requested. */
+ if (i == 0 && disable_irq0_through_ioapic)
+ continue;
+
for (idx = 0; idx < mp_irq_entries; idx++) {
struct mp_config_intsrc *irq = mp_irqs + idx;@@ -1363,8 +1373,6 @@ static void __init acpi_process_madt(voi
return;
}-#ifdef __i386__
-
static int __init disable_acpi_irq(const struct dmi_system_id *d)
{
if (!acpi_force) {
@@ -1415,6 +1423,17 @@ static int __init force_acpi_ht(const st
}/*
+ * Don't register any I/O APIC entries for the 8254 timer IRQ.
+ */
+static int __init
+dmi_disable_irq0_through_ioapic(const struct dmi_system_id *d)
+{
+ pr_notice("%s detected: disabling IRQ 0 through I/O APIC\n", d->ident);
+ disable_irq0_through_ioapic = 1;
+ return 0;
+}
+
+/*
* If your system is blacklisted here, but you find that acpi=force
...
Wonderful! Thanks for your patience and cooperation.
Maciej
--
Unfortunately, it hangs solid very early on boot, before earlyprintk=vga
kicks in.Thanks,
Rafael
--
i know you havent submitted this for inclusion yet, but FYI it wont
build on 32-bit with a config like:http://redhat.com/~mingo/misc/config-Mon_Jul__7_09_22_45_CEST_2008.bad
Ingo
--
Well, I wrote it was missing the i386 part. ;-)
Thanks,
Rafael
--
Can you please push them to linux-next, btw?
They still apply cleanly to linux-next for me.
Thanks,
Rafael
--
| Greg Kroah-Hartman | [PATCH 012/196] nozomi driver |
| Ingo Molnar | Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3 |
| Rafael J. Wysocki | [PATCH -mm 5/6] Freezer: Remove PF_NOFREEZE from bluetooth threads |
| Ingo Molnar | Re: [PATCH 00/23] per device dirty throttling -v8 |
git: | |
| David Miller | [GIT]: Networking |
| David Miller | Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock(). |
| Gerrit Renker | [PATCH 15/37] dccp: Set per-connection CCIDs via socket options |
| Natalie Protasevich | [BUG] New Kernel Bugs |
