Re: [patch] PCI: modify SB700 SATA MSI quirk

Previous thread: Re: [bug] ata subsystem related crash with latest -git by Jeff Garzik on Thursday, October 18, 2007 - 5:14 am. (1 message)

Next thread: s390x: getting ipv6 bugs on mainline since 2.6.23-git3 by Andy Whitcroft on Thursday, October 18, 2007 - 5:35 am. (3 messages)
To: <gregkh@...>, <htejun@...>
Cc: <linux-kernel@...>, <linux-pci@...>, Su, Henry <Henry.Su@...>, Yang, Libin <Libin.Yang@...>, Shane Huang <Shane.Huang@...>
Date: Thursday, October 18, 2007 - 5:14 am

More ATI northbridges, such as the RS780, cannot do MSI under Linux,
just like their predecessors. Disable MSI on them.

Signed-off-by: Shane Huang <shane.huang@amd.com>

My mail client, MS Outlook, word-wraps a patch if I copy it into the
message text, so I have to send the patch as an attachment. Please
check it.

Thanks
Best Regards
Shane

To: <gregkh@...>
Cc: <linux-kernel@...>, <linux-pci@...>, <htejun@...>, Shane Huang <Shane.Huang@...>
Date: Thursday, January 24, 2008 - 6:59 am

This patch restores Tejun's commit
4be8f906435a6af241821ab5b94b2b12cb7d57d8
because there is an MSI bug on RS690+SB600 boards which leads to boot
failure. This bug is NOT the same as the one in the SB700 SATA
controller; quirk_msi_intx_disable_bug does not work for SB600.
Disabling MSI on RS690 is the workaround.

Signed-off-by: Shane Huang <shane.huang@amd.com>

My mail client, MS Outlook, word-wraps a patch if I copy it into the
message text, so I am also sending the patch as an attachment. Please
check it.

diff -ruN old/drivers/pci/quirks.c new/drivers/pci/quirks.c
--- old/drivers/pci/quirks.c 2008-01-07 05:45:38.000000000 +0800
+++ new/drivers/pci/quirks.c 2008-01-22 11:02:00.000000000 +0800
@@ -1623,6 +1623,7 @@
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_SERVERWORKS, PCI_DEVICE_ID_SERVERWORKS_GCNB_LE, quirk_disable_all_msi);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_RS400_200, quirk_disable_all_msi);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_RS480, quirk_disable_all_msi);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_RS690, quirk_disable_all_msi);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_VT3351, quirk_disable_all_msi);

/* Disable MSI on chipsets that are known to not support it */
diff -ruN old/include/linux/pci_ids.h new/include/linux/pci_ids.h
--- old/include/linux/pci_ids.h 2008-01-07 05:45:38.000000000 +0800
+++ new/include/linux/pci_ids.h 2008-01-22 11:01:55.000000000 +0800
@@ -360,6 +360,7 @@
#define PCI_DEVICE_ID_ATI_RS400_166 0x5a32
#define PCI_DEVICE_ID_ATI_RS400_200 0x5a33
#define PCI_DEVICE_ID_ATI_RS480 0x5950
+#define PCI_DEVICE_ID_ATI_RS690 0x7910
/* ATI IXP Chipset */
#define PCI_DEVICE_ID_ATI_IXP200_IDE 0x4349
#define PCI_DEVICE_ID_ATI_IXP200_SMBUS 0x4353

Thanks
Best Regards
Shane

To: Shane Huang <Shane.Huang@...>
Cc: <gregkh@...>, <linux-kernel@...>, <linux-pci@...>, <htejun@...>
Date: Thursday, January 24, 2008 - 7:15 am

This patch disables MSI for the _whole_ system, not only behind the
RS690. Is this on purpose? Is MSI really going to be broken on any
bus that's _not_ behind RS690? If not, you might want to use
quirk_disable_msi() instead (as we do for AMD8131).

Brice
--
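
The distinction Brice draws (turning MSI off for the whole system versus
only for buses behind one bridge) can be sketched with a small userspace
model. This is an illustration under assumptions, not kernel code: every
model_* name and structure below is hypothetical and only mimics the idea of
a global switch versus a per-bus flag that is checked on the way up to the
root bus.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; none of these names exist in the kernel. */
struct model_bus {
	struct model_bus *parent;	/* NULL for a root bus */
	bool no_msi;			/* set by a per-bus quirk */
};

/* Set by a system-wide quirk. */
static bool model_msi_off_everywhere;

/* System-wide quirk: every device in the machine loses MSI. */
void model_disable_all_msi(void)
{
	model_msi_off_everywhere = true;
}

/* Per-bus quirk: only devices on or below this bus lose MSI. */
void model_disable_msi_on_bus(struct model_bus *bus)
{
	bus->no_msi = true;
}

/* MSI is usable only if nothing between the bus and the root forbids it. */
bool model_msi_allowed(const struct model_bus *bus)
{
	if (model_msi_off_everywhere)
		return false;
	for (; bus != NULL; bus = bus->parent)
		if (bus->no_msi)
			return false;
	return true;
}

/* Small walk-through of both quirk styles. */
void model_demo(void)
{
	struct model_bus root = { NULL, false };
	struct model_bus behind_bridge = { &root, false };
	struct model_bus elsewhere = { &root, false };

	model_disable_msi_on_bus(&behind_bridge);
	assert(!model_msi_allowed(&behind_bridge));
	assert(model_msi_allowed(&elsewhere));	/* sibling bus is unaffected */

	model_disable_all_msi();
	assert(!model_msi_allowed(&elsewhere));	/* now everything is off */

	model_msi_off_everywhere = false;	/* reset so the demo can be rerun */
}
```

In this model the per-bus flag leaves unrelated buses alone, which is why it
is the less drastic choice whenever the fault is confined to one bridge.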

To: Brice Goglin <Brice.Goglin@...>, <gregkh@...>
Cc: <linux-kernel@...>, <linux-pci@...>, <htejun@...>, Shane Huang <Shane.Huang@...>
Date: Friday, January 25, 2008 - 6:39 am

quirk_disable_msi() does not fix the issue in my debugging, and
quirk_msi_intx_disable_bug(), which fixes the SB700 SATA MSI bug, does
not work either.
quirk_disable_all_msi is the only workaround I have found.

If anyone else has an SB600+RS690 board and can help verify this RS690
MSI disablement patch with a recent kernel version such as 2.6.24-rc7,
that would be great.

BTW:
Disabling MSI on RS690 should NOT affect SB700 MSI because, as far as I
know, there will be no RS690+SB700 combination on the market.

Thanks
Shane

--

To: <gregkh@...>
Cc: <linux-kernel@...>, <linux-pci@...>, <htejun@...>, Shane Huang <Shane.Huang@...>
Date: Thursday, January 24, 2008 - 7:12 am

The SB700 SATA MSI bug will be fixed at the hardware level in SB700
revision A21, but SB700 revisions older than A21 will also be found on
the market.
This patch modifies the original quirk commit
bc38b411fe696fad32b261f492cb4afbf1835256 instead of withdrawing it.

Signed-off-by: Shane Huang <shane.huang@amd.com>

Since my mail client, MS Outlook, has a word-wrapping problem, I am
also sending the patch as an attachment. Please check it.

diff -ruN old/drivers/pci/quirks.c new/drivers/pci/quirks.c
--- old/drivers/pci/quirks.c 2008-01-07 05:45:38.000000000 +0800
+++ new/drivers/pci/quirks.c 2008-01-22 11:31:09.000000000 +0800
@@ -1709,6 +1709,22 @@
{
	dev->dev_flags |= PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG;
}
+static void __devinit quirk_msi_intx_disable_ati_bug(struct pci_dev *dev)
+{
+	struct pci_dev *p;
+	u8 rev = 0;
+
+	/* SB700 MSI issue will be fixed at HW level from revision A21,
+	 * we need check PCI REVISION ID of SMBus controller to get SB700 revision.
+	 */
+	p = pci_get_device(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_SBX00_SMBUS, NULL);
+	if (p!=NULL) {
+		pci_read_config_byte(p, PCI_CLASS_REVISION, &rev);
+	}
+	if ((rev < 0x3B) && (rev >= 0x30)) {
+		dev->dev_flags |= PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG;
+	}
+}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_BROADCOM,
PCI_DEVICE_ID_TIGON3_5780,
quirk_msi_intx_disable_bug);
@@ -1729,17 +1745,17 @@
quirk_msi_intx_disable_bug);

DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x4390,
- quirk_msi_intx_disable_bug);
+ quirk_msi_intx_disable_ati_bug);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x4391,
- quirk_msi_intx_disable_bug);
+ quirk_msi_intx_disable_ati_bug);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x4392,
- quirk_msi_intx_disable_bug);
+ quirk_msi_intx_disable_ati_bug);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x4393,
- quirk_msi_intx_disable_bug);
+ quirk_msi_intx_disable_ati_bug);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI...

To: <gregkh@...>, <htejun@...>
Cc: <linux-kernel@...>, <linux-pci@...>, Shane Huang <Shane.Huang@...>
Date: Thursday, January 24, 2008 - 11:26 pm

I made some modifications to this patch and am sending it again. Please
check it.
The quirk for 0x4395 has been removed because 0x4395 belongs only to
SB800.

Thanks

diff -ruN linux-2.6.24-rc7_org/drivers/pci/quirks.c linux-2.6.24-rc7_new/drivers/pci/quirks.c
--- linux-2.6.24-rc7_org/drivers/pci/quirks.c 2008-01-23 14:44:53.000000000 +0800
+++ linux-2.6.24-rc7_new/drivers/pci/quirks.c 2008-01-25 10:55:21.000000000 +0800
@@ -1709,6 +1709,24 @@
{
	dev->dev_flags |= PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG;
}
+static void __devinit quirk_msi_intx_disable_ati_bug(struct pci_dev *dev)
+{
+	struct pci_dev *p;
+	u8 rev = 0;
+
+	/* SB700 MSI issue will be fixed at HW level from revision A21,
+	 * we need check PCI REVISION ID of SMBus controller to get SB700
+	 * revision.
+	 */
+	p = pci_get_device(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_SBX00_SMBUS,
+			   NULL);
+	if (p) {
+		pci_read_config_byte(p, PCI_CLASS_REVISION, &rev);
+	}
+	if ((rev < 0x3B) && (rev >= 0x30)) {
+		dev->dev_flags |= PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG;
+	}
+}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_BROADCOM,
PCI_DEVICE_ID_TIGON3_5780,
quirk_msi_intx_disable_bug);
@@ -1729,17 +1747,15 @@
quirk_msi_intx_disable_bug);

DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x4390,
- quirk_msi_intx_disable_bug);
+ quirk_msi_intx_disable_ati_bug);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x4391,
- quirk_msi_intx_disable_bug);
+ quirk_msi_intx_disable_ati_bug);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x4392,
- quirk_msi_intx_disable_bug);
+ quirk_msi_intx_disable_ati_bug);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x4393,
- quirk_msi_intx_disable_bug);
+ quirk_msi_intx_disable_ati_bug);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x4394,
- quirk_msi_intx_disable_bug);
-DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x4395,
- quirk_msi_intx_disable_bug);
+ quirk_msi_intx_disable_ati_bug);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x4...

To: Shane Huang <Shane.Huang@...>
Cc: <gregkh@...>, <linux-kernel@...>, <linux-pci@...>
Date: Thursday, January 24, 2008 - 11:35 pm

After S-O-B, you can put --- and between it and the patch body, you can
say things which you wanna mention but don't think should be included in

Hmm... So, if there's no SMBUS device the quirk applies. Is this
intended? If that can't happen, just do if (!p) return;

You tested this, right?

--
tejun
--
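
The decision the quirk has to make, including the early return Tejun
suggests for a missing SMBus controller, can be isolated in a small helper
that is easy to test outside the kernel. This is only a sketch under
assumptions: the helper and macro names are hypothetical, the revision bounds
are copied from the patch, and whether the quirk should apply when no SMBus
controller is found is a policy choice that this sketch answers with "no".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Revision bounds copied from the patch above. */
#define MODEL_SB700_REV_MIN	0x30	/* lowest revision the patch treats as affected */
#define MODEL_SB700_REV_FIXED	0x3B	/* first revision the patch treats as fixed */

/*
 * Hypothetical helper: smbus_rev points at the SMBus controller's revision
 * ID, or is NULL when no such controller was found.
 */
bool model_sb700_needs_quirk(const uint8_t *smbus_rev)
{
	if (smbus_rev == NULL)
		return false;	/* explicit early return instead of relying on rev = 0 */

	return *smbus_rev >= MODEL_SB700_REV_MIN &&
	       *smbus_rev < MODEL_SB700_REV_FIXED;
}

/* Boundary checks for the half-open range [0x30, 0x3B). */
void model_sb700_demo(void)
{
	uint8_t rev;

	assert(!model_sb700_needs_quirk(NULL));

	rev = 0x2F;
	assert(!model_sb700_needs_quirk(&rev));
	rev = 0x30;
	assert(model_sb700_needs_quirk(&rev));
	rev = 0x3A;
	assert(model_sb700_needs_quirk(&rev));
	rev = 0x3B;
	assert(!model_sb700_needs_quirk(&rev));
}
```

Making the "device not found" case explicit documents the intent, rather
than leaving it to whatever the initial value of rev happens to imply.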

To: Tejun Heo <htejun@...>
Cc: <gregkh@...>, <linux-kernel@...>, <linux-pci@...>, Shane Huang <Shane.Huang@...>
Date: Thursday, January 24, 2008 - 11:48 pm

OK, I'll have to submit another updated patch later.

Thanks
Shane

--

To: Shane Huang <Shane.Huang@...>
Cc: Tejun Heo <htejun@...>, <linux-kernel@...>, <linux-pci@...>
Date: Friday, January 25, 2008 - 12:33 am

I recommend running the scripts/checkpatch.pl script on any proposed
patches like this before you send them. It will find a lot of these
problems for you :)

thanks,

greg k-h
--

To: Shane Huang <Shane.Huang@...>
Cc: <gregkh@...>, <linux-kernel@...>, <linux-pci@...>
Date: Thursday, January 24, 2008 - 8:19 pm

Can you please get a decent email client? Thunderbird + toggle word

It would be nice if things stay under 80col limit although we don't

--
tejun
--

To: <Shane.Huang@...>
Cc: <gregkh@...>, <htejun@...>, <linux-kernel@...>, <linux-pci@...>, <Henry.Su@...>, <Libin.Yang@...>
Date: Thursday, October 18, 2007 - 6:19 am

From: "Shane Huang" <Shane.Huang@amd.com>

Can we get some detail as to why these north bridges have to have MSI
disabled completely? I can't believe that all 6 of these ATI
controllers which will now be listed in the quirk table cannot use MSI
at all.

I've discovered several cases where it was a buggy device or a driver
bug that caused someone to erroneously submit patches that disable MSI
completely for the bridge they were behind.

I don't want that to happen any more. One way to prevent that is to
have a full detailed justification for the MSI disabling in the
changelog or in the comments for the MSI quirk.

Thank you.
-

To: David Miller <davem@...>
Cc: <gregkh@...>, <htejun@...>, <linux-kernel@...>, <linux-pci@...>, Su, Henry <Henry.Su@...>, Yang, Libin <Libin.Yang@...>, Shane Huang <Shane.Huang@...>
Date: Thursday, October 18, 2007 - 6:37 am

Hi Miller:

Thank you for your response.

The reason why MSI does not work on these northbridges is still being
debugged; we are NOT able to tell whether it is a hardware issue or a
software issue at this time. But enabling MSI on them leads to OS
installation failure in many distributions, such as openSUSE and Ubuntu:
https://bugzilla.novell.com/show_bug.cgi?id=302016

So we have to disable MSI on them first, until we find the root cause;
these quirks may be just workarounds.

If you know more about this MSI problem, don't hesitate to tell us; we
can debug on some of the hardware platforms.

BTW, the kernel already disables MSI on some ATI northbridges, for
similar reasons.

Thanks

-

To: <Shane.Huang@...>
Cc: <gregkh@...>, <htejun@...>, <linux-kernel@...>, <linux-pci@...>, <Henry.Su@...>, <Libin.Yang@...>
Date: Thursday, October 18, 2007 - 7:46 am

From: "Shane Huang" <Shane.Huang@amd.com>

This logic seems backwards to me: "shoot first, ask questions later".
To me this is not how to approach this problem.

Once you turn MSI off, there is next to no incentive to fix the
problem because users aren't running into it any longer.

The only two devices in that bug report which should be using MSI
would be the SATA controller and the broadcom ethernet NIC. And by
the failed bootup logs provided by the user the problem is clearly
with the SATA controller.

One common problem we're finding is that some devices have a hardware
bug where setting INTX_DISABLE in the PCI COMMAND register masks MSI
interrupts too.

I mention this because the user in that report mentions that the
kernel upgrade causes the failure, and one thing we started doing not
too long ago was to set the INTX_DISABLE bit when MSI is enabled for a
device.

So maybe this SATA controller has this problem too. It is easy to
test, simply comment out all of the pci_intx() function calls in
drivers/pci/msi.c and perform a test boot with MSI enabled.

I would rather you approach analysis of these kinds of MSI bugs in
this manner, instead of disabling MSI wholesale. Because with the
latter approach it is nearly guaranteed that the real reason will only
be discovered with an extremely low priority.

Thank you.
-
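
The hardware bug David describes can be illustrated with a small userspace
model. This is a sketch under assumptions, not kernel code: the model_*
names and the structure are hypothetical. The only real-world fact used is
that the interrupt-disable bit is bit 10 (0x0400) of the PCI COMMAND
register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 10 of the PCI COMMAND register: INTx interrupt disable. */
#define MODEL_PCI_COMMAND_INTX_DISABLE	0x0400

/* Hypothetical model of one PCI device. */
struct model_dev {
	uint16_t command;		/* modelled PCI COMMAND register */
	bool msi_enabled;
	bool hw_intx_disable_masks_msi;	/* the hardware bug under discussion */
	bool quirk_skip_intx_disable;	/* stand-in for a per-device quirk flag */
};

/* Enable MSI; also set INTX_DISABLE unless a quirk says to leave it alone. */
void model_enable_msi(struct model_dev *d)
{
	d->msi_enabled = true;
	if (!d->quirk_skip_intx_disable)
		d->command |= MODEL_PCI_COMMAND_INTX_DISABLE;
}

/* Would an MSI raised by the device actually reach the CPU? */
bool model_msi_delivered(const struct model_dev *d)
{
	if (!d->msi_enabled)
		return false;
	if (d->hw_intx_disable_masks_msi &&
	    (d->command & MODEL_PCI_COMMAND_INTX_DISABLE))
		return false;	/* buggy hardware masks MSI as well */
	return true;
}

/* Walk through the three cases: good device, buggy device, buggy + quirk. */
void model_intx_demo(void)
{
	struct model_dev good = { 0, false, false, false };
	struct model_dev buggy = { 0, false, true, false };
	struct model_dev quirked = { 0, false, true, true };

	model_enable_msi(&good);
	model_enable_msi(&buggy);
	model_enable_msi(&quirked);

	assert(model_msi_delivered(&good));
	assert(!model_msi_delivered(&buggy));	/* interrupts silently lost */
	assert(model_msi_delivered(&quirked));	/* quirk avoids the masking */
}
```

The "buggy" case corresponds to the symptom in the report: MSI is enabled,
yet no interrupt arrives, so the driver times out during boot.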

To: David Miller <davem@...>
Cc: <Shane.Huang@...>, <gregkh@...>, <htejun@...>, <linux-kernel@...>, <linux-pci@...>, <Henry.Su@...>, <Libin.Yang@...>
Date: Friday, October 19, 2007 - 1:42 pm

And the same SATA controller could show up behind a different northbridge.
It would be unfortunate to hit the same device bug independently on each
system and work around it by doing something that won't help the next

Have we gotten around to having a device quirk for this? I bet it won't be
too long before we see a system where the SATA controller doesn't work
with INTX disabled and the ethernet controller doesn't work with it
enabled, since we've seen devices with each of these bugs.

-Daniel
*This .sig left intentionally blank*
-
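
A device quirk of the kind Daniel asks about is keyed on the device itself,
so it follows the SATA controller to whatever northbridge it sits behind. A
minimal sketch, assuming a simple lookup table: the model_* names and the
flag value are hypothetical, 0x1002 is the ATI PCI vendor ID, and the device
IDs are the ones the later patches in this thread attach the quirk to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag: device must not have INTX_DISABLE set while using MSI. */
#define MODEL_FLAG_MSI_INTX_DISABLE_BUG	0x1u

struct model_quirk {
	uint16_t vendor;
	uint16_t device;
	unsigned int flags;
};

/* Per-device table: independent of which northbridge is in the system. */
static const struct model_quirk model_quirks[] = {
	{ 0x1002, 0x4390, MODEL_FLAG_MSI_INTX_DISABLE_BUG },
	{ 0x1002, 0x4391, MODEL_FLAG_MSI_INTX_DISABLE_BUG },
	{ 0x1002, 0x4392, MODEL_FLAG_MSI_INTX_DISABLE_BUG },
	{ 0x1002, 0x4393, MODEL_FLAG_MSI_INTX_DISABLE_BUG },
	{ 0x1002, 0x4394, MODEL_FLAG_MSI_INTX_DISABLE_BUG },
};

/* Return the quirk flags for a device, or 0 if it has no entry. */
unsigned int model_lookup_quirks(uint16_t vendor, uint16_t device)
{
	size_t i;

	for (i = 0; i < sizeof(model_quirks) / sizeof(model_quirks[0]); i++)
		if (model_quirks[i].vendor == vendor &&
		    model_quirks[i].device == device)
			return model_quirks[i].flags;
	return 0;
}

/* Listed devices get the flag; everything else is left untouched. */
void model_quirk_demo(void)
{
	assert(model_lookup_quirks(0x1002, 0x4390) ==
	       MODEL_FLAG_MSI_INTX_DISABLE_BUG);
	assert(model_lookup_quirks(0x1002, 0x4395) == 0);
}
```

Because each device carries its own flags, one device in a system can keep
INTx enabled while another has it disabled, which covers the mixed case
Daniel expects to see.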

To: David Miller <davem@...>
Cc: <Shane.Huang@...>, <htejun@...>, <linux-kernel@...>, <linux-pci@...>, <Henry.Su@...>, <Libin.Yang@...>
Date: Thursday, October 18, 2007 - 11:24 am

I agree with David here. Please work to find the root cause of this
problem. If it turns out that all of these chipsets have broken MSI and
can't handle it at all (oops, that's a major bug...) then we will accept
this patch.

But for now, let's not take a band-aid that prevents others from working
to solve the real issues here.

thanks,

greg k-h
-
