http://bugzilla.kernel.org/show_bug.cgi?id=8833
We write 0xffffffff to BARs to detect the BAR size. This temporarily
changes the BAR base to 0xfxxxxxxx, depending on the BAR size. In the bug,
the PCI MCFG base address is 0xf4000000. One PCI device (gfx) has a 256M
BAR; the detection code temporarily changes its base to 0xf0000000, which
conflicts with the MCFG decode range. A later memory-based config space
read/write address is then decoded by both the MCH and gfx, which causes
a hang. This patch disables resource decode during BAR size detection to
avoid the resource conflict.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
---
drivers/pci/probe.c | 6 ++++++
1 file changed, 6 insertions(+)

Index: linux/drivers/pci/probe.c
===================================================================
--- linux.orig/drivers/pci/probe.c 2007-09-12 10:44:19.000000000 +0800
+++ linux/drivers/pci/probe.c 2007-09-13 12:58:18.000000000 +0800
@@ -185,6 +185,11 @@ static void pci_read_bases(struct pci_de
unsigned int pos, reg, next;
u32 l, sz;
struct resource *res;
+ u16 command;
+
+ pci_read_config_word(dev, PCI_COMMAND, &command);
+ pci_write_config_word(dev, PCI_COMMAND,
+ command & (~(PCI_COMMAND_MEMORY|PCI_COMMAND_IO)));

for(pos=0; pos<howmany; pos = next) {
u64 l64;
@@ -283,6 +288,7 @@ static void pci_read_bases(struct pci_de
}
}
}
+ pci_write_config_word(dev, PCI_COMMAND, command);
}

void pci_read_bridge_bases(struct pci_bus *child)
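The arithmetic behind the conflict described in the patch can be sketched in user space. This is only an illustration, not kernel code: the helper names are made up for this note, the read-back value 0xf0000008 assumes a 32-bit prefetchable memory BAR, and the 1 MiB MCFG window length is an assumption (the thread only gives the base, 0xf4000000).

```c
#include <assert.h>
#include <stdint.h>

/* Size of a 32-bit memory BAR, given the value read back after writing
 * 0xffffffff to it. The low four bits are type/prefetch flags, not address. */
static uint32_t mem_bar_size(uint32_t readback)
{
    uint32_t mask = readback & ~(uint32_t)0xf;
    return ~mask + 1;   /* wraps to 0 when the BAR is not implemented */
}

/* Base address the device decodes while the all-ones value sits in the BAR. */
static uint32_t mem_bar_sizing_base(uint32_t readback)
{
    return readback & ~(uint32_t)0xf;
}

/* Nonzero if [a, a+alen) and [b, b+blen) overlap; 64-bit math avoids wrap. */
static int ranges_overlap(uint64_t a, uint64_t alen, uint64_t b, uint64_t blen)
{
    assert(alen > 0 && blen > 0);
    return a < b + blen && b < a + alen;
}
```

With a read-back of 0xf0000008, mem_bar_size() gives 0x10000000 (256M) and mem_bar_sizing_base() gives 0xf0000000, so ranges_overlap(0xf0000000, 0x10000000, 0xf4000000, 0x100000) is nonzero: while the device still has memory decode enabled, it claims the MCFG window.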
-
You missed the part where we have to avoid doing this for host bridges ...
http://marc.info/?l=linux-kernel&m=118809338631160&w=2
(I believe it's now queued in Greg's tree, and possibly Andrew's tree
too)

--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
-
Hmm, I didn't know there was already a fix for this, so I wasted a whole
morning :(. Yes, your patch looks better.

Thanks,
Shaohua
-
Yes, this bug is causing a lot of people to waste time. I fielded an
internal request for this patch this afternoon. I appreciate we're
post-rc6 at this point, but it does rather suck to be releasing a kernel
which freezes on boot on this class of machines.

Unfortunately if this patch does cause any machine to break, these will
be machines that worked fine up until this point, so that would be a
regression, which is worse. Life sucks.

--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
-
But it's not like the kernel ever worked on this class of machines,

If, after a while, you think the change should go into the -stable tree,
I have no objection.

thanks,
greg k-h
-
I think it shouldn't - this change will almost certainly cause a regression.
There are a lot of system devices besides the host bridges that shouldn't be
disabled during BAR probe, like interrupt controllers, power management
controllers and so on.

We need a more sophisticated fix - I'm thinking of introducing a "probe" field
in struct pci_dev which can be set by "early" quirk routines.

Ivan.
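One possible reading of that idea, sketched in user space. Everything here is hypothetical: struct sketch_pci_dev stands in for struct pci_dev, the keep_decode_on_probe flag is one guess at what a "probe" field might carry, and the quirk only marks host bridges, although the message also names interrupt and power management controllers. Only the PCI_COMMAND_* bits and the host bridge class code are standard PCI values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PCI_COMMAND_IO        0x1     /* I/O space decode enable */
#define PCI_COMMAND_MEMORY    0x2     /* memory space decode enable */
#define PCI_CLASS_BRIDGE_HOST 0x0600  /* base class 06h, subclass 00h */

/* Hypothetical stand-in for struct pci_dev; only what the sketch needs. */
struct sketch_pci_dev {
    uint16_t class_code;          /* (base class << 8) | subclass */
    uint16_t command;             /* current PCI_COMMAND value */
    bool keep_decode_on_probe;    /* hypothetical "probe" flag */
};

/* Hypothetical early quirk: mark devices whose decode must stay enabled. */
static void sketch_early_quirk(struct sketch_pci_dev *dev)
{
    assert(dev);
    if (dev->class_code == PCI_CLASS_BRIDGE_HOST)
        dev->keep_decode_on_probe = true;
}

/* PCI_COMMAND value to write while the BARs are being sized. */
static uint16_t sketch_sizing_command(const struct sketch_pci_dev *dev)
{
    assert(dev);
    if (dev->keep_decode_on_probe)
        return dev->command;      /* leave decode untouched */
    return dev->command & (uint16_t)~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY);
}
```

The point of the split is that the sizing loop itself stays generic; which devices are exempt is decided by quirks that run before the BARs are touched.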
-
Agreed. I have a similar problem on ppc where it's common to have things
like the main PIC on a PCI device. Note that another problem is (or at
least was, I haven't checked recently) the P2P bridge scanning code
that, in a similar way, can block the path to all devices below it. I
-do- have a case for example with Apple Xserve G4's where the main Apple
IO ASIC, which is a PCI device containing the PIC, the power management
controller, and various low level system control IOs is behind a pair of
P2P bridges.

One solution for us (PPC) is to require those devices and bridges to be
described in the OF tree, and generalize a bit the code we have for some
64-bit machines, which synthesizes the pci_dev's from the OF nodes
rather than probing. But that's not going to help other archs.

In fact, that's a problem we also have with
pci_assign_unassigned_resources() which will happily move things around
that must not be moved, especially when sitting behind P2P bridges.

So the root of the issue is much deeper than just a quirk here I
believe.

Cheers,
Ben.
-
I think the P2P probing code is pretty safe now - there are read-only
accesses to the bridge config, unless you request to reassign the bus

If you can get reliable PCI info from firmware, it should be relatively easy
to avoid at least the BAR sizing. You can install an "early" fixup for
PCI_ANY_ID and fill the resource fields of the pci_dev with values obtained
from firmware. Then all we need in probe.c is just to check that the resource

It's not supposed to do that. Certainly, there were problems of that sort,
but hopefully they are in the past.

Ivan.
-
In which case I will need to NAK the patch... Note that those Xserve
G4's still have the subtle issue that they -also- reassign bus
numbers :-) But that's going away the day I finally enable domains

Right now, we have code to completely build a pci_dev from the firmware
At this stage (but we are getting a bit OT), ppc has something like 3
different PCI code implementations :-) I do have some plans to fix that
by switching everybody to use pci_assign_unassigned_resources() and
friends but last time I tried, everything blew up :-) I suspect I'll
need a quirk or two in the generic code, but I'll let you know when I
get to it.

Cheers,
Ben.
-
Ok, I'll be happy to look into that :-)
Ivan.
-
Well, if it's going to cause a regression with machines that currently
work properly, I'll just drop it entirely. I would much rather not have
a machine work at all with Linux, than break other people's working

That might work out well. Matthew, want to look into this for a
possible way to get your fix into the tree in a way that will not affect
others?

thanks,
greg k-h
-
