We're getting a lot of storage drivers blamed for interrupt misrouting
issues. This patch provides a standard way of reporting the problem
... and, if possible, correcting it.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
---
drivers/pci/Makefile | 3 +-
drivers/pci/irq.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++
include/linux/pci.h | 7 +++++
3 files changed, 69 insertions(+), 1 deletions(-)
create mode 100644 drivers/pci/irq.c
diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
index 7d63f8c..19dacb8 100644
--- a/drivers/pci/Makefile
+++ b/drivers/pci/Makefile
@@ -3,7 +3,8 @@
#
obj-y += access.o bus.o probe.o remove.o pci.o quirks.o slot.o \
- pci-driver.o search.o pci-sysfs.o rom.o setup-res.o
+ pci-driver.o search.o pci-sysfs.o rom.o setup-res.o \
+ irq.o
obj-$(CONFIG_PROC_FS) += proc.o
# Build PCI Express stuff if needed
diff --git a/drivers/pci/irq.c b/drivers/pci/irq.c
new file mode 100644
index 0000000..6441dfa
--- /dev/null
+++ b/drivers/pci/irq.c
@@ -0,0 +1,60 @@
+/*
+ * PCI IRQ failure handing code
+ *
+ * Copyright (c) 2008 James Bottomley <James.Bottomley@HansenPartnership.com>
+ */
+
+#include <linux/acpi.h>
+#include <linux/device.h>
+#include <linux/kernel.h>
+#include <linux/pci.h>
+
+static void pci_note_irq_problem(struct pci_dev *pdev, const char *reason)
+{
+ struct pci_dev *parent = to_pci_dev(pdev->dev.parent);
+
+ dev_printk(KERN_ERR, &pdev->dev,
+ "Potentially misrouted IRQ (Bridge %s %04x:%04x)\n",
+ parent->dev.bus_id, parent->vendor, parent->device);
+ dev_printk(KERN_ERR, &pdev->dev, "%s\n", reason);
+ dev_printk(KERN_ERR, &pdev->dev, "Please report to linux-kernel@vger.kernel.org\n");
+ WARN_ON(1);
+}
+
+/**
+ * pci_lost_interrupt - reports a lost PCI interrupt
+ * @pdev: device whose interrupt is lost
+ *
+ * The primary function of this routine is to report a lost interrupt
+ * in a standard way which users can recognise (instead of ...Will the dev_printk() strings be included in the kerneloops report? And
what if there is no parent of the device? Consider device 00:02.0 on my
laptop:
00:02.0 VGA compatible controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (rev 03)
Subsystem: Fujitsu Limited. Device 13fe
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Couldn't this be written more concisely as:
if (pdev->msix_enabled) {
pci_note_irq_problem(pdev, "MSIX routing failure");
return PCI_LOST_IRQ_DISABLE_MSIX;
}
if (pdev->msi_enabled) {
pci_note_irq_problem(pdev, "MSI routing failure");
return PCI_LOST_IRQ_DISABLE_MSI;
--
No, but some of them take the prior strings (depending on the There is always a parent device ... it might be the PCI root device, in The idea was to separate the cases in case something extra needs be done. I think it's pretty much identical as far as the compiler optimises, and therefore probably not worth worrying about much. James --
James, Can you add a paragraph to Documentation/pci.txt about the usage of the new API? thanks, --
When there actually is a new API, sure. James --
Do you prefer "dev_printk(KERN_ERR, ...)" over "dev_err(...)"? Easier to grep for the former, maybe? If so, should we deprecate "dev_err()" and friends? When I converted most of the PCI core to use dev_printk(), (80ccba1186d48f ...) I used dev_err(), but I don't really care one way or the other. Maybe use pci_name(parent)? I tried to standardize the PCI core on "[%04x/%04x]" for vendor/device ID. Bjorn --
That's unfortunately different from lspci -nn: 00:00.0 Host bridge [0600]: Intel Corporation Mobile PM965/GM965/GL960 Memory Controller Hub [8086:2a00] (rev 03) Sorry, I should have reviewed your patches ;-( -- Intel are signing my paycheques ... these opinions are still mine "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." --
I can change those. I just copied what seemed to be the most prevalent. But lspci is harder to change, so we should probably follow that. --
To be honest I'm not really interested too much in the various API preferences ... they can be fixed up later by the people who care. What I am interested in is whether we can get this interface to give sufficient information to blacklist the offending motherboard if we can identify it as the source of the problem. Since the usual culprit is the bridge, that's why I'm doing parent vendor/device pairs. However, better suggestions for the information that should be displayed would be gratefully received. James --
I'm happy to fix it up later if you prefer. I only mentioned it because doing it later adds churn and risk of breakage. Bjorn --
This seems to be a function that just returns what type of IRQ you're using or how it's routed and it isn't necessarily "lost interrupt" specific. This information is clearly useful to drivers both for informational purposes and for debugging problems, so on that front it looks good. However, I think it should probably be called pci_interrupt_type(struct pci_dev *dev) or something instead, and just return an enum of either MSIX, MSI, or LINE (though I'm open to better names). From that, the driver can infer what might be going wrong, though in the case of a LINE interrupt, all you can really do is report that there's probably a platform IRQ routing problem. Jesse --
So perhaps this routine should only note but not advise? The drivers The only thing that this can't do is ACPI ... but on the other hand once the IRQ routing is set by ACPI I'm not sure the driver can do anything to fix it. So ... depending on how the feedback goes, I'll plan to reroll this as a note but not advise function. James --
If it's just a pci_irq_type function then it probably shouldn't print anything, and leave that to the caller, since it might be used for other purposes too (e.g. a driver load printk or something). In the lost interrupt case you already have to disable MSI or MSIX in the Fusion driver, so you may as well put the printk there, right? I guess I'm saying it should neither Yeah, ACPI may or may not be the problem, all you'll really know is that you've got a line interrupt that you failed to get... The driver won't be able to do much in that case aside from complain loudly. Thanks, Jesse --
Well, no; the object was to have the layer that knew (PCI) print information which could be used to identify the problem. Likely what the driver will say is something like "MSI isn't working and it's not my fault". What I want is for PCI to print something that may be helpful to people trying to diagnose the problem. Driver writers aren't going Yes ... it's just that when line interrupts fail (especially if they worked previously) it's usually ACPI to blame. James --
Yeah, I understand what you're trying to do, and given that there's only one
user of this function (your fusion patch), I don't have a strong preference.
However, the function is really just telling the driver what type of IRQ a
given PCI device currently has; it's not really a "lost interrupt" function
at all. So to me, it makes sense to just generalize it into a pci_irq_type
function and let drivers do what they will with it. It looks like this is
the real lost interrupt detection:
+ /* May fail becuase of IRQ misrouting */
+ rc = mpt_get_manufacturing_pg_0(ioc);
+ if (rc) {
+ ...
+ printk(MYIOC_s_ERR_FMT "Cannot recover IRQ routing\n",
+ ioc->name);
+ return -1;
+ }
not the pci_lost_interrupt() function. So it could just as easily be written
as such:
+ /* May fail becuase of IRQ misrouting */
+ rc = mpt_get_manufacturing_pg_0(ioc);
+ if (rc) {
+ /* Lost an IRQ, see what type... */
+ int irq_type = pci_irq_type(dev);
+
+ if (type == PCI_IRQ_MSI) {
+ /* Lost an MSI interrupt, try re-config w/o MSI */
+ free_irq(ioc->pci_irq, ioc);
+ ioc->msi_enable = 0;
+ pci_disable_msi(ioc->pcidev);
+ goto retry;
+ }
+ /* This platform is just broken... */
+ printk(MYIOC_s_ERR_FMT "Cannot recover from %s IRQ
+ routing error\n", pci_irq_type_name(type), ioc->name);
+ return -1;
+ }
or somesuch. That seems just as simple for driver writers as your initial
patch, and the function is named in accordance with what it actually does,
rather than what it's used for...
Jesse
--
It could, but if the bridge is the culprit (as it usually is for MSI problems), this print won't help identify it. Therefore, rather than give driver writers a recipe for "print this and this and go to the bridge and print this", I'd rather have a single PCI callback that prints all the (hopefully) relevant information that will allow either fixing or blacklisting. James --
So in addition to the IRQ type check we need to dump some device topology information... yeah that makes sense. I wonder if the driver core should provide something like this. Greg? In the meantime we can definitely add the IRQ type function. Thanks, Jesse --
What kind of topology do you need that is not already provided? thanks, greg k-h --
We really need to print how the device interrupt is routed. But, unfortunately, we need to identify the devices in the layout (bridges and the like by say device/vendor numbers) so we know what they are ... this is the bit that the generic device model can't do because it's embedded in the bus specific part. James --
Ok, per our discussion at the PCI BoF I applied this, so now you can apply the fusion and other bits to help debug issues. Thanks, Jesse --
