Re: [PATCH] Replace nvidia timer override quirk with pci id list

Previous thread: [PATCH] input: driver for USB VoIP phones with CM109 chipset by Alfred E. Heggestad on Thursday, February 7, 2008 - 2:38 pm. (10 messages)

Next thread: [2.6.22.y] {00/14+1} - series for stable - on top of 2.6.22.17 by Oliver Pinter on Thursday, February 7, 2008 - 4:03 pm. (1 message)
To: <mingo@...>, <tglx@...>, <lenb@...>, <linux-kernel@...>
Date: Thursday, February 7, 2008 - 3:55 pm

[This patch was originally in the old ff tree and was intended for .24; but
somehow got lost in the arch merge. Has also shipped with OpenSUSE 10.3.
I think it should go into .25]

This replaces the old NF3/NF4 reference BIOS timer override quirk with a device
ID list. We need to ignore the timer override on these systems, but not
ignore it on NF5 based systems. Previously this was distinguished by checking
for HPET, but a lot of BIOS vendors didn't enable HPET in their pre-Vista
BIOSes.

Replace the old "for all of nvidia" quirk with a quirk containing pci device
IDs. I cobbled this list together from pci.ids and googling and it may be
incomplete, but so far I haven't had complaints.
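
As a rough illustration of the ID-list idea described above, the following standalone C sketch matches a host bridge against a small table. It is not the code from the patch: the struct, the table, and the function name are invented for this example, and the only device ID shown (0x02f0, the C51 host bridge) is the one quoted later in this thread. The real list is longer and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VENDOR_NVIDIA 0x10de  /* vendor ID as it appears in lspci -n output */

/* One entry per host bridge whose timer override should be ignored. */
struct timer_quirk_id {
    uint16_t vendor;
    uint16_t device;
};

/* Placeholder list (assumption): only the ID quoted in the thread. */
static const struct timer_quirk_id quirk_ids[] = {
    { VENDOR_NVIDIA, 0x02f0 },  /* C51 host bridge */
};

/* Return 1 if the bridge is on the list, 0 otherwise. */
static int bridge_is_listed(const struct timer_quirk_id *ids, size_t n,
                            uint16_t vendor, uint16_t device)
{
    assert(ids != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (ids[i].vendor == vendor && ids[i].device == device)
            return 1;
    }
    return 0;
}
```

The point of a table like this is that a bridge which is not listed is simply left alone, so a chipset with a correct override needs no entry.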

Cc: lenb@kernel.org

Signed-off-by: Andi Kleen <ak@suse.de>

---
arch/x86/kernel/early-quirks.c | 43 ++++++++++++++++-------------------------
1 file changed, 17 insertions(+), 26 deletions(-)

Index: linux/arch/x86/kernel/early-quirks.c
===================================================================
--- linux.orig/arch/x86/kernel/early-quirks.c
+++ linux/arch/x86/kernel/early-quirks.c
@@ -60,38 +60,21 @@ static void __init via_bugs(int num, in
#endif
}

-#ifdef CONFIG_ACPI
-#ifdef CONFIG_X86_IO_APIC
-
-static int __init nvidia_hpet_check(struct acpi_table_header *header)
-{
- return 0;
-}
-#endif /* CONFIG_X86_IO_APIC */
-#endif /* CONFIG_ACPI */
-
-static void __init nvidia_bugs(int num, int slot, int func)
+static void __init nvidia_timer(int num, int slot, int func)
{
#ifdef CONFIG_ACPI
#ifdef CONFIG_X86_IO_APIC
/*
- * All timer overrides on Nvidia are
- * wrong unless HPET is enabled.
- * Unfortunately that's not true on many Asus boards.
- * We don't know yet how to detect this automatically, but
- * at least allow a command line override.
+ * All timer overrides on Nvidia NF3/NF4 are
+ * wrong.
*/
if (acpi_use_timer_override)
return;

- if (acpi_table_parse(ACPI_SIG_HPET, nvidia_hpet_check)) {
- acpi_skip_timer_override = 1;
...

To: Andi Kleen <andi@...>
Cc: <mingo@...>, <tglx@...>, <lenb@...>, <linux-kernel@...>
Date: Thursday, February 7, 2008 - 5:21 pm

If you want to skip timer override on this board, this is a *NAK* from me.
I told you the last time, it only works reliably here on MCP51 with timer
override working. Even before Asus released a bios which had an option to
enable the hpet, I needed the override or I got erratic behaviour. Since I
got hpet enabled I gave up on arguing as the wrongly triggered quirk didn't
bug me anymore.

IIRC my nforce2 needed the override. I didn't see that in the list.

00:00.0 RAM memory: nVidia Corporation C51 Host Bridge (rev a2)
00: de 10 f0 02 06 00 b0 00 a2 00 00 05 00 00 80 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 43 10 c0 81
30: 00 00 00 00 44 00 00 00 00 00 00 00 ff 00 00 00

bye,
--
(°= =°)
//\ Prakash Punnoor /\\
V_/ \_V

To: Prakash Punnoor <prakash@...>
Cc: Andi Kleen <andi@...>, <mingo@...>, <tglx@...>, <lenb@...>, <linux-kernel@...>
Date: Friday, February 8, 2008 - 7:36 am

The list only contains IDs where the override should be ignored; so
if it has a correct one and it's not there everything is fine.

I'm appending a revised patch. Does it work for you?

-Andi

---

Replace nvidia timer override quirk with pci id list v2

[This patch was originally in the old ff tree and was intended for .24; but
somehow got lost in the arch merge. Has also shipped with OpenSUSE 10.3.
I think it should go into .25]

This replaces the old NF3/NF4 reference BIOS timer override quirk with a device
ID list. We need to ignore the timer override on these systems, but not
ignore it on NF5 based systems. Previously this was distinguished by checking
for HPET, but a lot of BIOS vendors didn't enable HPET in their pre-Vista
BIOSes.

Replace the old "for all of nvidia" quirk with a quirk containing pci device
IDs. I cobbled this list together from pci.ids and googling and it may be
incomplete, but so far I haven't had complaints.

I also straightened out the ifdef jungle a bit.

v1->v2: Readd the HPET check to handle a NF4 system of Prakash Punnoor.
This means with HPET we always assume timer overrides are ok.

Cc: lenb@kernel.org

Signed-off-by: Andi Kleen <ak@suse.de>

---
arch/x86/kernel/early-quirks.c | 43 ++++++++++++++++-------------------------
1 file changed, 17 insertions(+), 26 deletions(-)

Index: linux/arch/x86/kernel/early-quirks.c
===================================================================
--- linux.orig/arch/x86/kernel/early-quirks.c
+++ linux/arch/x86/kernel/early-quirks.c
@@ -67,37 +67,30 @@ static int __init nvidia_hpet_check(stru
{
return 0;
}
-#endif /* CONFIG_X86_IO_APIC */
-#endif /* CONFIG_ACPI */

-static void __init nvidia_bugs(int num, int slot, int func)
+static void __init nvidia_timer(int num, int slot, int func)
{
-#ifdef CONFIG_ACPI
-#ifdef CONFIG_X86_IO_APIC
- /*
- * All timer overrides on Nvidia are
- * wrong unless HPET is enabled.
- * Unfortunately that's not true on many Asus b...

To: Andi Kleen <andi@...>
Cc: <mingo@...>, <tglx@...>, <lenb@...>, <linux-kernel@...>
Date: Friday, February 8, 2008 - 11:13 am

I at least found this post: http://lkml.org/lkml/2006/8/13/2 though I remember

Sorry, I meant the opposite. I needed the acpi_skip_timer_override kernel
parameter for nforce2, thus no override. So this chipset is missing here. At
least I remember that my nforce2 needed the skipping,

I haven't tested it, but it would "work" as it would bail out in my case
because of the hpet check. The problem I see with this approach - as with
the old one - is that it simply wants to ignore the override for a whole
bunch of chipsets. (The old one is catastrophic as it doesn't even care
about chipset revision.) And checking for hpet is just a heuristic (or what
is the rationale behind it?), not a real check whether the override should
be ignored or not. Are you actually sure that so many nforceX boards have
broken bioses? References? I would prefer a DMI check that only applies the
quirk for *known* broken bioses instead of "blindly" doing it, as in my case
my mcp51 system is unstable with the quirk applied - it *never* needed the
quirk.

I found this:
http://www.kernel.org/pub/linux/kernel/people/lenb/acpi/patches/test/2.6...
20040422223905-nforce2_timer.patch

So it seems the original (proposed?) version did a DMI check for known broken
bioses. Why was this approach abandoned?

According to
http://lkml.org/lkml/2006/10/19/427
it seems only nforce2 and perhaps some nforce3 are relevant.

To sum it up, I think it is a step in the right direction, but still not
correct.
--
(°= =°)
//\ Prakash Punnoor /\\
V_/ \_V

To: Prakash Punnoor <prakash@...>
Cc: Andi Kleen <andi@...>, <mingo@...>, <tglx@...>, <lenb@...>, <linux-kernel@...>
Date: Friday, February 8, 2008 - 12:09 pm

I hope you remember correctly and mean it this time. It would be better
if you could double check.

I'm a little sceptical because we had this patch in OpenSUSE 10.3
and I didn't think there were complaints from NF2 users.
With the changes you're requesting it turns from something
very well tested into something experimental.

But NF2 should not need a timer override anyways so probably
ignoring it there is ok.

Actually checking CK804 is already an Nforce2, but you might
have NF2S which has a different ID. Do you have full lspci/lspci -n
output?

Ok I'm appending another patch that adds the NF2S too, can

It was a heuristic originally to detect the NF5 which did need
the override. That is why I first removed it because it should

Yes, it was a problem in the Nvidia reference BIOS that they sent to OEMs
to base their own BIOS on, so pretty much everybody had this problem.

We went over this with Nvidia engineers with a fine comb at this
point. If you search the mailing list archives you might even
find the discussions.

-Andi

---

Replace nvidia timer override quirk with pci id list v3

[This patch was originally in the old ff tree and was intended for .24; but
somehow got lost in the arch merge. Has also shipped with OpenSUSE 10.3.
I think it should go into .25]

This replaces the old NF3/NF4 reference BIOS timer override quirk with a device
ID list. We need to ignore the timer override on these systems, but not
ignore it on NF5 based systems. Previously this was distinguished by checking
for HPET, but a lot of BIOS vendors didn't enable HPET in their pre-Vista
BIOSes.

Replace the old "for all of nvidia" quirk with a quirk containing pci device
IDs. I cobbled this list together from pci.ids and googling and it may be
incomplete, but so far I haven't had complaints.

I also straightened out the ifdef jungle a bit.

v1->v2: Readd the HPET check to handle a NF4 system of Prakash Punnoor.
This means with HPET we always assume timer overrides are ok.
...

To: Andi Kleen <andi@...>
Cc: <mingo@...>, <tglx@...>, <lenb@...>, <linux-kernel@...>
Date: Friday, February 8, 2008 - 1:39 pm

Yes, confirmed. timer w/o the skipping stays XT-PIC on nforce2.

w/o skipping:

0: 153413 XT-PIC-XT timer
1: 10 IO-APIC-edge i8042
8: 2 IO-APIC-edge rtc
9: 0 IO-APIC-fasteoi acpi
12: 112 IO-APIC-edge i8042
14: 37 IO-APIC-edge ide0
16: 165137 IO-APIC-fasteoi eth0
17: 0 IO-APIC-fasteoi Technisat/B2C2 FlexCop II/IIb/III Digital TV PCI Driver
18: 0 IO-APIC-fasteoi NVidia nForce2
19: 7922 IO-APIC-fasteoi nvidia
NMI: 0
LOC: 153209
ERR: 0
MIS: 0

w/ skipping:
CPU0
0: 47834 IO-APIC-edge timer
1: 10 IO-APIC-edge i8042
8: 2 IO-APIC-edge rtc
9: 0 IO-APIC-fasteoi acpi
12: 112 IO-APIC-edge i8042
14: 37 IO-APIC-edge ide0
16: 152413 IO-APIC-fasteoi eth0
17: 0 IO-APIC-fasteoi Technisat/B2C2 FlexCop II/IIb/III Digital TV PCI Driver
18: 0 IO-APIC-fasteoi NVidia nForce2
19: 1582 IO-APIC-fasteoi nvidia
NMI: 0
LOC: 47736
ERR: 0
MIS: 0

lspci -n:
00:00.0 0600: 10de:01e0 (rev c1)
00:00.1 0500: 10de:01eb (rev c1)
00:00.2 0500: 10de:01ee (rev c1)
00:00.3 0500: 10de:01ed (rev c1)
00:00.4 0500: 10de:01ec (rev c1)
00:00.5 0500: 10de:01ef (rev c1)
00:01.0 0601: 10de:0060 (rev a3)
00:01.1 0c05: 10de:0064 (rev a2)
00:02.0 0c03: 10de:0067 (rev a3)
00:02.1 0c03: 10de:0067 (rev a3)
00:02.2 0c03: 10de:0068 (rev a3)
00:04.0 0200: 10de:0066 (rev a1)
00:05.0 0401: 10de:006b (rev a2)
00:06.0 0401: 10de:006a (rev a1)
00:08.0 0604: 10de:006c (rev a3)
00:09.0 0101: 10de:0065 (rev a2)
00:0d.0 0c00: 10de:006e (rev a3)
00:1e.0 0604: 10de:01e8 (rev c1)
01:08.0 0280: 13d0:2103 (rev 01)

Well, even w/o the skipping my nforce2 system wasn't unstable, AFAIK. So I
don't think just because of the XT-PIC entry people would compla...

To: Prakash Punnoor <prakash@...>
Cc: Andi Kleen <andi@...>, <mingo@...>, <tglx@...>, <lenb@...>, <linux-kernel@...>
Date: Saturday, February 9, 2008 - 8:46 am

Sort of coming in in the middle of this because I just realized this may have
something to do with the "exception Emask" errors. I have an NF2, and the
above 0: type is what I'm getting with, or without, the apparently opposite
kernel argument, acpi_use_timer_override. Should I instead be seeing 0: as
shown below? If so, what do I change to effect this? My current .config:

[root@coyote linux-2.6.24]# grep HPET .config
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_HPET=y
# CONFIG_HPET_RTC_IRQ is not set
# CONFIG_HPET_MMAP is not set

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
You are in a maze of little twisting passages, all alike.
--

To: Gene Heskett <gene.heskett@...>
Cc: Prakash Punnoor <prakash@...>, Andi Kleen <andi@...>, <mingo@...>, <tglx@...>, <lenb@...>, <linux-kernel@...>
Date: Saturday, February 9, 2008 - 10:18 am

exception Emask is a SATA problem. Shouldn't be caused by timer troubles.
This means in theory it could be that your sata driver if compiled
in is the first thing to use timers, but normally if the timers are
broken you get an earlier hang somewhere else.

-Andi

--

To: Andi Kleen <andi@...>
Cc: Prakash Punnoor <prakash@...>, <mingo@...>, <tglx@...>, <lenb@...>, <linux-kernel@...>
Date: Saturday, February 9, 2008 - 2:02 pm

The error stanza nearly always included a Timeout as cause. Except of course
for the only dmesg I actually saved, which claimed a media error. Murphy at
his finest. :) It got well and smartctl reports it hasn't remapped a sector,

This has killed me both at boot time twice, once before NASH was running, and
several times when uptimes were a day plus, but has never reappeared since
the first time I used the acpi_use_timer_override argument, and this
includes several boots without it including 2 complete, 2 or 3 minute power
downs.

I HATE it when that happens. :( Bugs you cannot duplicate even when you are
fresh out of virgins are the worst.

If I can digress, that is what I've done all my life, 57 years of chasing
electrons for a living now. The most exasperating one was in a Chyron
Character generator whose output came and went with no visible cause and
their engineers said what I was seeing was technically impossible. Turned
out they had left the 2nd gate input to a 7400 they were using for an
inverter open. And they claimed they were digital engineers.... Spit. But
it took me nearly 5 years of intermittent searching through it with a scope
probe to find it. What I had didn't always match the schematics, which
didn't help. The eureka moment was when the 10 meg scope probe killed it
completely with its additional drain to ground.

This one has my attention...

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)

"Whatever the missing mass of the universe is, I hope it's not cockroaches!"
-- Mom
--

To: Gene Heskett <gene.heskett@...>
Cc: Andi Kleen <andi@...>, <mingo@...>, <tglx@...>, <lenb@...>, <linux-kernel@...>
Date: Saturday, February 9, 2008 - 2:11 pm

Are you saying that on your nforce2 you need the override
(acpi_use_timer_override) to have a stable system? Because that would be in
contrast to all previous findings regarding nforce2. Could you provide
cat /proc/interrupts
lspci
lspci -n
--
(°= =°)
//\ Prakash Punnoor /\\
V_/ \_V

To: Prakash Punnoor <prakash@...>
Cc: Andi Kleen <andi@...>, <mingo@...>, <tglx@...>, <lenb@...>, <linux-kernel@...>
Date: Saturday, February 9, 2008 - 4:05 pm

Currently booted with it, uptime 38 hours, only diff visible is in dmesg as
has been posted here in another thread.

[root@coyote ~]# cat /proc/interrupts
CPU0
0: 869 XT-PIC-XT timer
1: 18 IO-APIC-edge i8042
3: 1 IO-APIC-edge
4: 3 IO-APIC-edge
6: 6 IO-APIC-edge floppy
8: 1 IO-APIC-edge rtc
9: 0 IO-APIC-fasteoi acpi
10: 0 IO-APIC-edge MPU401 UART
14: 2331370 IO-APIC-edge libata
15: 1492330 IO-APIC-edge libata
16: 23369893 IO-APIC-fasteoi ehci_hcd:usb1, eth0
17: 62319 IO-APIC-fasteoi ohci_hcd:usb2, NVidia nForce2
18: 35 IO-APIC-fasteoi ohci_hcd:usb3
19: 443715 IO-APIC-fasteoi sata_sil
20: 8421481 IO-APIC-fasteoi firewire_ohci, nvidia
21: 0 IO-APIC-fasteoi cx88[0], cx88[0]
22: 79353 IO-APIC-fasteoi EMU10K1
NMI: 0 Non-maskable interrupts
LOC: 44806761 Local timer interrupts
RES: 0 Rescheduling interrupts
CAL: 0 function call interrupts
TLB: 0 TLB shootdowns
TRM: 0 Thermal event interrupts
SPU: 0 Spurious interrupts
ERR: 0
MIS: 0

[root@coyote ~]# lspci
00:00.0 Host bridge: nVidia Corporation nForce2 IGP2 (rev c1)
00:00.1 RAM memory: nVidia Corporation nForce2 Memory Controller 1 (rev c1)
00:00.2 RAM memory: nVidia Corporation nForce2 Memory Controller 4 (rev c1)
00:00.3 RAM memory: nVidia Corporation nForce2 Memory Controller 3 (rev c1)
00:00.4 RAM memory: nVidia Corporation nForce2 Memory Controller 2 (rev c1)
00:00.5 RAM memory: nVidia Corporation nForce2 Memory Controller 5 (rev c1)
00:01.0 ISA bridge: nVidia Corporation nForce2 ISA Bridge (rev a4)
00:01.1 SMBus: nVidia Corporation nForce2 SMBus (MCP) (rev a2)
00:02.0 USB Controller: nVidia Corporation nForce2 USB Controller (rev a4)
00:02.1 USB Controller: nVidia Corporation nForce2 USB Controller (rev ...

To: Gene Heskett <gene.heskett@...>
Cc: Andi Kleen <andi@...>, <mingo@...>, <tglx@...>, <lenb@...>, <linux-kernel@...>
Date: Saturday, February 9, 2008 - 5:13 pm

Thanks for providing the info. According to the IDs your board and mine are
quite alike. If you don't pass acpi_use_timer_override to a vanilla kernel,
does your timer get connected to IO-APIC? Ie:

CPU0
0: 47834 IO-APIC-edge timer

In this mode, is your board stable? I never run my hw longer than 10h, so I
cannot say anything about long-term stability, but lately my nforce2 didn't
cause any trouble (and when it did, it usually was related to the PSU). I am
skipping the override, ie. my timer is connected to IO-APIC.

If in the latter mode your hw is unstable, we have a problem...
--
(°= =°)
//\ Prakash Punnoor /\\
V_/ \_V

To: Andi Kleen <andi@...>
Cc: Gene Heskett <gene.heskett@...>, Prakash Punnoor <prakash@...>, Andi Kleen <andi@...>, <mingo@...>, <tglx@...>, <lenb@...>, <linux-kernel@...>
Date: Saturday, February 9, 2008 - 10:03 am

On Sat, 9 Feb 2008 15:18:52 +0100

Badly wrong timer speed would do that, as would the timer change breaking
other IRQ delivery somewhere

Alan
--

To: Prakash Punnoor <prakash@...>
Cc: Andi Kleen <andi@...>, <mingo@...>, <tglx@...>, <lenb@...>, <linux-kernel@...>
Date: Friday, February 8, 2008 - 3:01 pm

Confirmed what? Did you test my patch on both machines?

Please always send lspci without -n too; I hate looking up hex codes

Timer override only does something in APIC mode and when you see XT-PIC
in /proc/interrupts then you're not in APIC mode. All these patches

Can you please just test the patches instead of speculating what they
might do or not do?

-Andi
--

To: Andi Kleen <andi@...>
Cc: <mingo@...>, <tglx@...>, <lenb@...>, <linux-kernel@...>
Date: Friday, February 8, 2008 - 3:00 pm

I confirmed that it (nforce2) needed the acpi_skip_timer_override. If you read

Perhaps I wasn't precise, Len Brown had it in his earlier patch descriptions:

"
workaround for nForce2 BIOS bug: XT-PIC timer in IOAPIC mode
"acpi_skip_timer_override" boot parameter
"

or

"
Since the hardware is connected to APIC pin0, it is a BIOS bug
that an ACPI interrupt source override from pin2 to IRQ0 exists.

With this simple 2.6.5 patch you can specify "acpi_skip_timer_override"
to ignore that bogus BIOS directive. The result is with your
ACPI-enabled APIC-enabled kernel, you'll get IRQ0 IO-APIC-edge timer.
"

This is exactly what I observed on the nforce2.
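
The routing choice that the quoted description refers to can be pictured with a short standalone C sketch. It is a deliberately simplified model and not kernel code: the struct and function names are invented for this example, and it assumes the only override of interest is the one that redirects IRQ0 to IO-APIC pin 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TIMER_IRQ 0

/* Simplified view of an ACPI interrupt source override entry. */
struct irq_override {
    int source_irq;  /* ISA IRQ being redirected */
    int ioapic_pin;  /* IO-APIC input the BIOS claims it is wired to */
};

/* Pick the IO-APIC pin to program for the timer. */
static int timer_ioapic_pin(const struct irq_override *ovr, bool skip_override)
{
    assert(ovr == NULL || ovr->ioapic_pin >= 0);
    if (ovr != NULL && ovr->source_irq == TIMER_IRQ && !skip_override)
        return ovr->ioapic_pin;  /* trust the BIOS table */
    return TIMER_IRQ;            /* identity mapping: IRQ0 on pin 0 */
}
```

In this model, skipping the override on a board whose timer is really wired to pin 0 gives the correct pin, while skipping it on a board whose override is genuine gives the wrong one. That is the disagreement in this thread in miniature.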

My kernels are compiled and configured for APIC. With broken BIOSes the timer
ends up as XT-PIC anyway. That is what I wanted to say and which you could
see from my cat /proc/interrupts dumps.

Why can't the kernel check for this condition and only activate the quirk then

No, I do understand C code and I know the ID of my board. So I am not
speculating, just saving myself time and hassle.

You are not even taking the time to really read what I say. I am not your
guinea pig. Why should I simply waste my time? Esp. my nforce2 system is a
production system and I usually don't mess with it. So come up with a patch
that makes sense in my eyes (and triggers on my nforce2 and does not trigger
on my mcp51), or I won't test anything and keep the NAK.

I don't think you did your research correctly coming up with the first
version of the patch, as it ignored the nforce2 altogether. And the original
version was made for nforce2 exclusively! So why should I trust that you
know what you are doing? I don't get that impression. You also didn't give
references for where you got the IDs in the patch. I at least tried to give
some references showing that putting in those IDs is *wrong*. If you can
provide those references (and not some "sea...

To: Prakash Punnoor <prakash@...>
Cc: Andi Kleen <andi@...>, <mingo@...>, <tglx@...>, <lenb@...>, <linux-kernel@...>
Date: Friday, February 8, 2008 - 5:02 pm

Ok can you please do so then? Or stop your obstructionism?

I believe my patch will and according to the test results I had so far
from other people it also works fairly well. If it doesn't work
on your systems I can fix it of course, but I need something more

Again when you see "XT-PIC" in /proc/interrupts then you're not in IO-APIC
mode.

I think Len refers to the case of the PIC being routed through the IO-APIC,
but that is a different case and you won't see XT-PIC, but "IO-APIC-level"

Well it doesn't make sense. When you have XT-PIC you're not in IO-APIC
mode and the timer override is a nop because it only changes how the
IO-APICs are programmed.
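
For readers comparing the /proc/interrupts dumps in this thread, a tiny standalone C helper can show which controller name a given line reports. This is only a string check written for illustration (the enum and function names are invented here); it says nothing about why a board ends up in one mode or the other, which is exactly what the participants disagree about.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum timer_route { ROUTE_UNKNOWN, ROUTE_XT_PIC, ROUTE_IO_APIC };

/* Classify one /proc/interrupts line by the controller name it mentions. */
static enum timer_route classify_irq_line(const char *line)
{
    assert(line != NULL);
    if (strstr(line, "XT-PIC") != NULL)
        return ROUTE_XT_PIC;
    if (strstr(line, "IO-APIC") != NULL)
        return ROUTE_IO_APIC;
    return ROUTE_UNKNOWN;
}
```

Fed the two timer lines posted earlier, it would report ROUTE_XT_PIC for "0: 153413 XT-PIC-XT timer" and ROUTE_IO_APIC for "0: 47834 IO-APIC-edge timer".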

Long ago there used to be a condition where ACPI would fall back
to XT-PIC mode if something went wrong -- perhaps you're thinking

Your objections don't make sense, so you can NAK all day. You're
talking about timer overrides in PIC mode which is just pure nonsense.

Ok if you're unwilling to test I'm ignoring you in the future.
Please stop sending me email.

-Andi
--

To: Andi Kleen <andi@...>
Cc: <mingo@...>, <tglx@...>, <lenb@...>, <linux-kernel@...>
Date: Friday, February 8, 2008 - 7:08 pm

Grr, I don't know why I am discussing with stubborn and/or arrogant devs like
you seem to be. But I actually did what you wanted and as *expected* - as I
said I understand that trivial piece of code you posted - your patch fails
here for my nforce2:

cat /proc/interrupts
CPU0
0: 832 XT-PIC-XT timer <---------------- seeing this?
1: 10 IO-APIC-edge i8042
8: 2 IO-APIC-edge rtc
9: 0 IO-APIC-fasteoi acpi
12: 84 IO-APIC-edge i8042
14: 38 IO-APIC-edge ide0
16: 184026 IO-APIC-fasteoi eth0
17: 0 IO-APIC-fasteoi Technisat/B2C2 FlexCop II/IIb/III Digital TV PCI Driver
18: 0 IO-APIC-fasteoi NVidia nForce2
19: 12460 IO-APIC-fasteoi nvidia
NMI: 0 Non-maskable interrupts
LOC: 74695 Local timer interrupts
TRM: 0 Thermal event interrupts
SPU: 0 Spurious interrupts
ERR: 0
MIS: 0

And no, I won't test it on my MCP51 as I *know* what happens: As soon as I
disable hpet, the quirk gets triggered and will lock up my system.

Actually I don't care anymore. The last time you also didn't really care
about what I said about your way of quirking the nforce boards.

I know how to make the kernel behave in this matter, it is just a pity for
other users, who don't know...

Good luck!
--
(°= =°)
//\ Prakash Punnoor /\\
V_/ \_V

To: Prakash Punnoor <prakash@...>
Cc: Andi Kleen <andi@...>, <mingo@...>, <tglx@...>, <lenb@...>, <linux-kernel@...>
Date: Saturday, February 9, 2008 - 8:17 am

Also just to make sure you tested v3 of the patch when
you saw the failure, right?

-Andi
--

To: Prakash Punnoor <prakash@...>
Cc: Andi Kleen <andi@...>, <mingo@...>, <tglx@...>, <lenb@...>, <linux-kernel@...>
Date: Saturday, February 9, 2008 - 7:59 am

Well it looks like it is ticking. What are the symptoms?

I readded the HPET check in v2 especially for you so if HPET is enabled
no quirk is triggered.

-Andi
--

To: Andi Kleen <andi@...>
Cc: <mingo@...>, <tglx@...>, <lenb@...>, <linux-kernel@...>
Date: Saturday, February 9, 2008 - 7:53 am

You are seeing it. If you read my other mails you would comprehend it as we

As I said inform yourself, what the intention of the quirk is about. I am

Still you didn't give *any* proof that mcp51 needs the quirk at all. I
therefore want that line *removed*:

+ QBRIDGE(PCI_VENDOR_ID_NVIDIA, 0x02f0, nvidia_timer), /* mcp 51/nf4 ? */

Furthermore my original bios didn't have an option to enable hpet. What then?
Kernel hangs unless I specify acpi_use_timer_override. Great.

Hint: The correct way of quirking would be *only having this one* line:

+ QBRIDGE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE2, nvidia_timer),

According to Len Brown's comment this would work for every nforce2. For every
other nforceX use DMI Scan and only quirk known broken bioses instead of
messing up working boxes.
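
What a DMI-based check of "known broken bioses" could look like is sketched below as standalone C. Everything in it is hypothetical: the struct, the function, and especially the table entry are invented placeholders, since no concrete broken BIOS versions are named in this thread. A real implementation would use the kernel's own DMI matching facilities rather than plain string compares.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Identification strings as a DMI scan might report them. */
struct bios_id {
    const char *board_vendor;
    const char *board_name;
    const char *bios_version;
};

/* Invented placeholder entry; no real board or BIOS is implied. */
static const struct bios_id known_broken[] = {
    { "ExampleVendor", "EXAMPLE-BOARD", "0001" },
};

/* Return 1 only on an exact match of all three strings. */
static int bios_is_known_broken(const struct bios_id *table, size_t n,
                                const struct bios_id *sys)
{
    assert(table != NULL && sys != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(table[i].board_vendor, sys->board_vendor) == 0 &&
            strcmp(table[i].board_name, sys->board_name) == 0 &&
            strcmp(table[i].bios_version, sys->bios_version) == 0)
            return 1;
    }
    return 0;
}
```

The trade-off raised in the replies applies directly to this shape: the table only helps boards someone has reported, which is why it is argued not to scale for a bug inherited from a reference BIOS.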

bye,
--
(°= =°)
//\ Prakash Punnoor /\\
V_/ \_V

To: Prakash Punnoor <prakash@...>
Cc: Andi Kleen <andi@...>, <mingo@...>, <tglx@...>, <lenb@...>, <linux-kernel@...>
Date: Saturday, February 9, 2008 - 10:31 am

I would like to see a boot log with v3 of my patch at least so that I can
see what is going on. Can you please supply one?

If you booted the patch as you claimed earlier that should be quite

We discussed this back then with Nvidia engineers and they stated
that only NF5 would need timer overrides.

I suppose they didn't know about the case of your NF4 though. The only
reason for that I can think of is that your particular board differs
in non trivial ways from the reference design. Ok I suppose that's

Yes that's true, but that was the case anyways even without my
patch. Without my patch timer overrides are ignored on all Nvidia
boards without HPET.

No, we have NF3 and NF4 systems which have bogus timer overrides

DMI generally doesn't scale for any issues that are in reference
BIOS, because there are so many different OEMs.

It might work for your specific board though. But again that does not
change with or without my patch. On the other hand since your board
seems to have a BIOS update available with hpet enabled it seems
to work now, so your box is covered anyways.

Admittedly the cases my patch fixes are not very wide -- it is basically
only NF5 boards that have an old BIOS without HPET. For everything
else it should ideally be a noop (if not I did something wrong).
Maybe those NF5 boards with old pre Vista certified BIOS are dying
out, but at least when I wrote the patch originally there were several
users who had NF5 boards without HPET and no BIOS update available.

-Andi
--

To: Andi Kleen <andi@...>
Cc: <mingo@...>, <tglx@...>, <lenb@...>, <linux-kernel@...>
Date: Saturday, February 9, 2008 - 11:51 am

Can I get a link which verifies your statement? I provided one which kind of
contradicts yours.
--
(°= =°)
//\ Prakash Punnoor /\\
V_/ \_V

To: Prakash Punnoor <prakash@...>
Cc: Andi Kleen <andi@...>, <mingo@...>, <tglx@...>, <lenb@...>, <linux-kernel@...>
Date: Saturday, February 9, 2008 - 12:56 pm

Sorry cannot supply links to them, they were private mail. For some
reason the Nvidia people seem to be shy to post to mailing lists.

iirc the thread which inspired this patch (together with several
bugs in both novell and kernel.org bugzilla) was

http://marc.info/?t=116175224500001&r=1&w=2
http://marc.info/?t=116230518000004&r=1&w=2

but you won't find the ultimate conclusion in there unfortunately.

So you have to trust me on that -- it's a bit similar to how I have
to trust your not yet produced boot log and test results.

-Andi
--

To: Andi Kleen <andi@...>
Cc: <mingo@...>, <tglx@...>, <lenb@...>, <linux-kernel@...>
Date: Saturday, February 9, 2008 - 1:30 pm

And I won't, as I reverted to my stable kernel again and thus patching it
again (yes it was 2.6.24 with early-quirks.c from git and your patch on top)
doesn't give more info than I already provided. Furthermore I also told you
that because of the missing nforce2 ID the practical test wasn't really
necessary.

Just add this line to your patch:

+ QBRIDGE(PCI_VENDOR_ID_NVIDIA, PCI_DEVICE_ID_NVIDIA_NFORCE2, nvidia_timer),

So that the quirk gets applied for nForce2, then your patch is - while still
wrong in my eyes - not a regression anymore (for me) and thus I would take
back my NAK (but still not ACK it).

bye,
--
(°= =°)
//\ Prakash Punnoor /\\
V_/ \_V

To: Prakash Punnoor <prakash@...>
Cc: Andi Kleen <andi@...>, <mingo@...>, <lenb@...>, <linux-kernel@...>
Date: Saturday, February 9, 2008 - 7:06 am

No worry, this patch won't go anywhere near mainline as long as it
breaks stuff and obviously you are under no obligation to re-test
patches that have not been changed, just re-submitted.

x86 changes, which are considered for mainline are staged in the mm
branch of the x86 git tree:

http://git.kernel.org/?p=linux/kernel/git/x86/linux-2.6-x86.git;a=shortl...

Instructions for checking it out are here:
http://people.redhat.com/mingo/x86.git/README

Please let us know, if there is anything which breaks your box(en).

Thanks for your feedback and patience.

tglx
--

To: Thomas Gleixner <tglx@...>
Cc: Andi Kleen <andi@...>, <mingo@...>, <lenb@...>, <linux-kernel@...>
Date: Saturday, February 9, 2008 - 8:18 am

The problem is current behaviour is already broken as it applies the quirk
*unconditionally* for all Nvidia hardware where no hpet is detected. The
latter is just a heuristic. *If* the correct Nforce2 ID gets added to the
proposed patch, behaviour would be equivalent to the current situation for
me (nforce2, mcp51). Still I am saying mcp51 doesn't belong per se in the
list of chipsets which need to be quirked, as for me it shows an adverse
effect. Taking mcp51 out would be an improvement.

If there are situations where quirking is correct and other situations where
it is incorrect for the same type of chipset, then I think the quirk should
not be applied automatically.

So I suggest something like this as a start. The quirk only gets applied for
nforce2 unconditionally, as it was intended originally. For chipsets from
nforce3 up to, but not including, nforce5 the user gets a message and no
quirk gets applied automatically. If there are known bug reports (Andi Kleen
didn't supply any references) some more information could be requested from
the reporters. Then for known broken bios versions the quirk could be
applied by DMI scan *selectively*. (This part of the code is missing here.)
I also don't know whether the list of IDs is complete.
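
The three-way policy suggested above can be written down as a small standalone C function. This is a sketch of the proposal as described in the text, not of the hand-edited patch that follows; the enum and function names are invented for this example, and the "known broken bios" input is assumed to come from a DMI scan that does not exist yet.

```c
#include <assert.h>
#include <stdbool.h>

enum chipset_family { CHIPSET_NFORCE2, CHIPSET_NFORCE3_OR_4, CHIPSET_OTHER };
enum quirk_action { ACTION_NONE, ACTION_WARN_ONLY, ACTION_SKIP_OVERRIDE };

/* Decide what to do about the BIOS timer override for one chipset. */
static enum quirk_action proposed_policy(enum chipset_family fam,
                                         bool bios_known_broken)
{
    assert(fam <= CHIPSET_OTHER);
    switch (fam) {
    case CHIPSET_NFORCE2:
        return ACTION_SKIP_OVERRIDE;  /* quirk applied unconditionally */
    case CHIPSET_NFORCE3_OR_4:
        /* only quirk when a DMI scan flags this exact BIOS */
        return bios_known_broken ? ACTION_SKIP_OVERRIDE : ACTION_WARN_ONLY;
    default:
        return ACTION_NONE;           /* nforce5 and everything else */
    }
}
```

The design choice here is to default to doing nothing visible to the hardware on nforce3/nforce4 and only print a message, so that a board with a correct override is never broken by the quirk.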

Warning: I hand edited the original proposed patch, so it won't apply and
probably won't compile. But I hope one gets the idea.

Index: linux/arch/x86/kernel/early-quirks.c
===================================================================
--- linux.orig/arch/x86/kernel/early-quirks.c
+++ linux/arch/x86/kernel/early-quirks.c
@@ -60,38 +60,21 @@ static void __init via_bugs(int num, in
#endif
}

-#ifdef CONFIG_ACPI
-#ifdef CONFIG_X86_IO_APIC
-
-static int __init nvidia_hpet_check(struct acpi_table_header *header)
-{
- ret...

To: Andi Kleen <andi@...>
Cc: <mingo@...>, <tglx@...>, <lenb@...>, <linux-kernel@...>
Date: Friday, February 8, 2008 - 11:18 am

http://lkml.org/lkml/2006/8/13/25

I killed one digit by mistake...
--
(°= =°)
//\ Prakash Punnoor /\\
V_/ \_V
