This message contains a list of some regressions from 2.6.24 reported since
2.6.25-rc1 was released, for which there are no fixes in the mainline I know
of. If any of them have been fixed already, please let me know.If you know of any other unresolved regressions from 2.6.24, please let me know
either and I'll add them to the list. Also, please let me know if any of the
entries below are invalid.Listed regressions statistics:
Date Total Pending Unresolved
----------------------------------------
2008-03-22 159 35 31
2008-03-17 148 38 30
2008-03-16 146 42 35
2008-03-14 145 45 39
2008-03-12 143 51 41
2008-03-11 141 58 43
2008-03-10 138 66 47
2008-03-03 115 65 49
2008-02-25 90 51 39
2008-02-17 61 45 37Unresolved regressions
----------------------Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9412
Subject : Since commit 'x86: enable iommu_merge by default' (948062683004d13ca21c8c05ac052d387978a449) 2.6 is no go on SB600 AHCI
Submitter : Srihari Vijayaraghavan <sriharivijayaraghavan@yahoo.com.au>
Date : 2007-11-19 14:43 (124 days old)Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9962
Subject : mount: could not find filesystem
Submitter : Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Date : 2008-02-12 14:34 (39 days old)
References : http://lkml.org/lkml/2008/2/12/91
Handled-By : Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Yinghai Lu <yhlu.kernel@gmail.com>Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=9976
Subject : BUG: 2.6.25-rc1: iptables postrouting setup causes oops
Submitter : Ben Nizette <bn@niasdigital.com>
Date : 2008-02-12 12:46 (39 days old)
References : http://lkml.org/lkml/2008/2/12/148
Handled-By : Haavard Skinnemoen <hskinnemoen@atm...
Hmm. I just tested mine, and it still works fine (current -git,
obviously). But as usual, I tend to run different kernel configs than most
people (no modules, only drivers that are actually useful). So I think
this needs bisecting or something.Linus
--
..
I wonder if this is related to the USB suspend/resume bug I see here?
One way to tell, is to rebuild the kernel with CONFIG_USB_SUSPEND=n.
If suspend/resume then works, we might well be looking at the same bug.-ml
--
Hi,
Okay, USB now works fine on resume after fixing the iaa_watchdog issue.
But twice now, once before that fix, and again just a minute ago,
my machine has failed completely to resume after suspend-to-RAM.Again, resume here is 100% reliable in 2.6.24 and earlier.
Symptoms: the usual:
-- 2.6.25-rc8 (and -rc7 earlier), plus USB fix.
-- not reproduceable on demand.
-- black screen of death -- no backlight, no kernel messages.
-- no hard drive activity.
-- no alt-sysrq-anything.
-- 5 second hold of the power button to poweroff and then recover.Not much info to go on, but it's worth knowing there's an issue here somewhere.
Cheers
--
..
Mmmm... Now that 2.6.25-rc* is usable here, there's another symptom
that's been happening regularly enough that it's got to be a regression
of some sort.This machine has an ATI X1400 video card, which doesn't work with any
open source X server that I know of. Maybe latest RadeonHD would work
but it didn't when I tried it in January.So I'm using the ATI binary fglrx X server, but without their kernel module,
so no 3D acceleration (fine.. only affects Google Earth, really).Now.. on 2.6.25, after a suspend/resume cycle (or three), the framebuffer
frequently starts going wonky. "snowy" pixels appear, and stay.
Just a moment ago here, the entire background changed to zebra stripes.And so on.. Peculiar stuff.
I'm wondering if something to do with video mode switching,
or video register save/restore, has changed since 2.6.24.
Because it's broken in 2.6.25, yet works fine in all earlier kernels.???
--
Yes, that's probably related.
However, I'm unable to reproduce the breakage. If you can do that, it would
probably help a lot if you identified the commit causing that to happen.I also wonder if this report might be related to it:
http://marc.info/?l=linux-acpi&m=120669137831005&w=4Thanks,
Rafael
--
..
Well, it would help then if you identified the commits which hacked
at the video mode switching. Then I can revert those and see what happens.
There's probably not many commits there, and since it doesn't do it consistently
(and sometimes the machine just doesn't resume at all with 2.6.25),
it is tricky to bisect.Works perfectly in 2.6.24, though, which is what I'm now running again.
Thanks
--
I guess this is related to http://bugzilla.kernel.org/show_bug.cgi?id=10319 .
I have no idea what can cause that to happen. It apparently is not related to
CONFIG_DRM (that actually may help if you have an Intel graphics adapter),
so it looks like an ACPI issue.Thanks,
Rafael
--
..
Something's weird there. I poked at my suspend/resume script again this morning,
and discovered that it did have some vbetool hacks to deal with textmode on earlier
kernels.So I wrapped those vbetool hacks with some logic to prevent them on 2.6.25,
and for now suspend/resume seem to be working with good video on -rc8.
It'll take a bit longer to know for sure, but by the end of today we'll see.cheers
--
I already tried this, without success:
$ grep CONFIG_USB_SUSPEND /boot/config-2.6.25-rc7
# CONFIG_USB_SUSPEND is not setRegards,
Tino
--
I just tested current -git (a9edadbf790d72adf6ebed476cb5caf7743e7e4a),
and it's still broken. I attached my kernel config.Regards,
Tino
This bug is alas an orphan. It is /not/ handled by me, because it is
not an IEEE 1394 subsystem bug and I have no idea what to do about it.Note the following:
- Several or all of ohci1394's MMIO reads return ~0 (all bits set
to one) --- or 0 --- where different values are expected.- In http://lkml.org/lkml/2008/2/23/244 we get to see a
WARNING: at arch/x86/mm/ioremap.c:137 __ioremap+0xa7/0x16a()
which is "WARN_ON_ONCE(page_is_ram(pfn));".
After that, the failures start.
But before that, "Unknown symbol" messages pop up when ohci1394
is loaded. These symbols are implemented by ieee1394 on which
ohci1394 depends....
MMIO read expected 0x00008400, got 0x00000000.
MMIO read got random bits in the interrupt event register.
The values for Max Packet and IR/IT contexts came from bogus MMIO reads.
Thomas, you wrote in http://lkml.org/lkml/2008/3/17/316 that the problem
resurfaced.
- Are the "Unknown symbol"s still there? These are not supposed to
happen.
- Is the "WARNING: at arch/x86/mm/ioremap.c" still there?
- Can you reproduce it without the atheros driver?Rafael, no matter how it looks, this is not "handled-by: me". :-)
I don't know nothing about kbuild nor about arch/x86/mm nor about WLAN.
The IEEE 1394 bits in the bug of which I know something about are purely
accidental.
--
Stefan Richter
-=====-==--- --== =-==-
http://arcgraph.de/sr/
--
I removed the ath driver, see bug
I moved ohci1394.ko + ieee1394.ko file manually to avoid automatic
loading. then i used insmod on those files and forgot to insmod ieee1394
Yes.Seems so.
--
Thanks. Summary from today's bugzilla comments:
No ioremap warning, but MMIO reads still give bogus values and let
ohci1394 fail. ohci1394 still got the MMIO region 0x1'0000'0000 -
0x1'0000'07ff, FWIW. The length of the region is correct, but its
contents bogus.
--
Stefan Richter
-=====-==--- --== =-==-
http://arcgraph.de/sr/
--
Can an MMIO region reside above 0x1'0000'0000 on x86-32? ... Apparently
yes, if CONFIG_RESOURCES_64BIT=y.From Thomas' dmesg:
http://bugzilla.kernel.org/attachment.cgi?id=15397BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000ede00 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000007f0c8000 (usable)
BIOS-e820: 000000007f0c8000 - 000000007f2c9000 (ACPI NVS)
BIOS-e820: 000000007f2c9000 - 000000007feb9000 (ACPI data)
BIOS-e820: 000000007feb9000 - 000000007feef000 (ACPI NVS)
BIOS-e820: 000000007feef000 - 000000007ff00000 (ACPI data)
BIOS-e820: 000000007ff00000 - 0000000080000000 (reserved)
BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
BIOS-e820: 00000000fed14000 - 00000000fed1a000 (reserved)
BIOS-e820: 00000000fed1c000 - 00000000fed20000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
BIOS-e820: 00000000ffe00000 - 0000000100000000 (reserved)
1136MB HIGHMEM available.
896MB LOWMEM available.
...
ACPI: RSDP 000FE020, 0024 (r2 APPLE )
ACPI: XSDT 7FEFD120, 0074 (r1 APPLE Apple00 55 1000013)
ACPI: FACP 7FEFB000, 00F4 (r3 APPLE Apple00 55 Loki 5F)
ACPI: DSDT 7FEF0000, 48C0 (r1 APPLE MacBookP 10001 INTL 20050309)From Thomas' .config:
http://bugzilla.kernel.org/attachment.cgi?id=15396
CONFIG_RESOURCES_64BIT=yI have a Mac mini but run x86-64 on it. However, I've got another
i945GM based PC with x86-32 and three OHCI-1394 controllers in it.
I always had CONFIG_RESOURCES_64BIT switched off. I shall try
CONFIG_RESOURCES_64BIT=y on that PC.Thomas, did you use CONFIG_RESOURCES_64BIT=y already under Linux 2.6.24?
All, is there anything special that drivers need to take care of for
CONFIG_RESOURCES_64BIT=y?We use resource_size_t ohci_base to request the MMIO region in
drivers/ieee139...
Hmm. It would only work if PAE (HIGHMEM64G) is enabled too.
And obviously the hardware has to have working 64-bit BAR's.
AND no, I don't think our x86-32 ioremap() actually works for this case,
because while the resource data may have the full 64 bits, when the
ioremap() happens it gets truncated to 32 bits.Ingo/Thomas - should ioremap*() perhaps take "resource_size_t" or a "u64"
for the address (and then "__ioremap()" should probably take a PFN, not a
physical address, and that one can remain just a "unsigned long"?)Has anybody ever had a working 64-bit BAR on x86? Ivan? Maybe I'm missing
something..Linus
--
Hmm. No idea. I look into that on monday (tomorrow is family
Same here.
Thanks,
tglx
--
Does this patch make any difference?
(ENTIRELY untested, I checked that it compiles on x86-64, but didn't even
test a 32-bit build, I'm hoping whoever sees this issue can also fix up
the inevitable small missed pieces)Linus
---
arch/x86/mm/ioremap.c | 6 +++---
include/asm-x86/io_32.h | 6 +++---
include/asm-x86/io_64.h | 6 +++---
lib/iomap.c | 2 +-
4 files changed, 10 insertions(+), 10 deletions(-)diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 8fe576b..4afaba0 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -106,7 +106,7 @@ static int ioremap_change_attr(unsigned long vaddr, unsigned long size,
* have to convert them into an offset in a page-aligned mapping, but the
* caller shouldn't need to know that small detail.
*/
-static void __iomem *__ioremap(unsigned long phys_addr, unsigned long size,
+static void __iomem *__ioremap(resource_size_t phys_addr, unsigned long size,
enum ioremap_mode mode)
{
unsigned long pfn, offset, last_addr, vaddr;
@@ -193,13 +193,13 @@ static void __iomem *__ioremap(unsigned long phys_addr, unsigned long size,
*
* Must be freed with iounmap.
*/
-void __iomem *ioremap_nocache(unsigned long phys_addr, unsigned long size)
+void __iomem *ioremap_nocache(resource_size_t phys_addr, unsigned long size)
{
return __ioremap(phys_addr, size, IOR_MODE_UNCACHED);
}
EXPORT_SYMBOL(ioremap_nocache);-void __iomem *ioremap_cache(unsigned long phys_addr, unsigned long size)
+void __iomem *ioremap_cache(resource_size_t phys_addr, unsigned long size)
{
return __ioremap(phys_addr, size, IOR_MODE_CACHED);
}
diff --git a/include/asm-x86/io_32.h b/include/asm-x86/io_32.h
index 58d2c45..d4d8fbd 100644
--- a/include/asm-x86/io_32.h
+++ b/include/asm-x86/io_32.h
@@ -114,13 +114,13 @@ static inline void * phys_to_virt(unsigned long address)
* If the area you are trying to map is a PCI BAR you should have a
* look at pci_iomap().
*/
-extern...
Yes. With this patch applied the error messages doesn't appear anymore.
$ git describe
v2.6.25-rc6-268-gd2532dd$ git status
# On branch master
# Changed but not updated:
# (use "git add <file>..." to update what will be committed)
#
# modified: arch/x86/mm/ioremap.c
# modified: include/asm-x86/io_32.h
# modified: include/asm-x86/io_64.h
# modified: lib/iomap.c
#
# Untracked files:
# (use "git add <file>..." to include in what will be committed)
#
# arch/x86/boot/offsets.h
no changes added to commit (use "git add" and/or "git commit -a")See also attached dmesg.
[ Ahh, I was reading email in the wrong order, so I saw your other email
first. ]Does the firewire port work too?
Linus
--
Thomas, if you don't have anything to plug in, something like the
following log messages after module loading should be enough
confirmation that the issue is resolved:(new driver)
ACPI: PCI Interrupt 0000:03:03.0[A] -> GSI 19 (level, low) -> IRQ 19
firewire_ohci: Added fw-ohci device 0000:03:03.0, OHCI version 1.0
firewire_core: created device fw0: GUID 0017f2fffe66fb80, S400(or old driver)
ACPI: PCI Interrupt 0000:03:03.0[A] -> GSI 19 (level, low) -> IRQ 19
ohci1394: fw-host0: OHCI-1394 1.0 (PCI): IRQ=[19]
MMIO=[90000000-900007ff] Max Packet=[2048] IR/IT contexts=[8/8]
ieee1394: Host added: ID:BUS[0-00:1023] GUID[0017f2fffe66fb80]
--
Stefan Richter
-=====-==--- --== ==---
http://arcgraph.de/sr/
--
ioremap() fails now.
--
Stefan Richter
-=====-==--- --== ==---
http://arcgraph.de/sr/
--
64-bit ioremaps never worked on 32-bit, so we are in totally unchartered
waters now - but due to the unification we have a realistic chance to
make them work. At minimum we need the fix below in addition to Linus'
patch - does it make any difference?Ingo
---
arch/x86/mm/ioremap.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)Index: v/arch/x86/mm/ioremap.c
===================================================================
--- v.orig/arch/x86/mm/ioremap.c
+++ v/arch/x86/mm/ioremap.c
@@ -39,7 +39,7 @@ EXPORT_SYMBOL(__phys_addr);int page_is_ram(unsigned long pagenr)
{
- unsigned long addr, end;
+ resource_size_t addr, end;
int i;/*
@@ -109,7 +109,8 @@ static int ioremap_change_attr(unsigned
static void __iomem *__ioremap(resource_size_t phys_addr, unsigned long size,
enum ioremap_mode mode)
{
- unsigned long pfn, offset, last_addr, vaddr;
+ unsigned long pfn, offset, vaddr;
+ resource_size_t last_addr;
struct vm_struct *area;
pgprot_t prot;--
could you please try x86.git/latest:
http://people.redhat.com/mingo/x86.git/README
which has all the fixes and debug patches integrated (no extra patching
should be needed). If it still doesnt work then please send the new
dmesg and /sys/kernel/debug/kernel_page_tables dump.head 0fef904c33841be92f or later.
Ingo
--
The kernel boots, but hangs at "populating dev tree with devices usings
uevents". After removing both firewire drivers (ohci1394 and
firewire_ohci) the system comes up correctly.See attached file.
Modprobing either ohci1394 or firewire_ohci seems to lock up the system.
mfg
thomas
that's weird. If you do the modprobe from a VGA console and do a 'dmesg
-n 8', do you get any ioremap printk shortly before the hard lockup?Ingo
--
All i get is:
[ 1175.600768] ACPI: PCI Interrupt 0000:0c:03.0[A] -> GSI 19 (level,
low) -> IRQ 19no ioremap debug offset.
--
basically, old ioremap did this:
[ 162.485605] ACPI: PCI Interrupt 0000:0c:03.0[A] -> GSI 19 (level, low) -> IRQ 19
[ 162.485695] ioremap: 00000000(00000800) => f8978000the theory (fact?) was that the zero physical address there (the
'00000000') was some 4GB+ address truncated down to 32-bits.OTOH, before this system worked for you before, i start to suspect that
ioremap is a red herring here and that it's the code that gets to that
physical address (which is ioremap-ed) is at fault here.the hard hang might be your southbridge totally dumbfounded by the host
OS attempting to do an MMIO access to an above-4GB address?so the question is - what physical address did that ioremap do in 2.6.24
(which presumly had a working ohci1394, right?), and why did it change
to something else in -git?Ingo
--
See file attachments of this bug:
http://bugzilla.kernel.org/show_bug.cgi?id=10080
and compare lspci -vv from 2.6.25 and 2.6.24:
2.6.25:
0c:03.0 FireWire (IEEE 1394): Agere Systems FW323 (rev 61) (prog-if 10
[OHCI])
Subsystem: Agere Systems FW323
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr-
Stepping- SERR- FastB2B+ DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 248 (3000ns min, 6000ns max), Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 19
Region 0: [virtual] Memory at 100000000 (32-bit, non-prefetchable)
[size=4K]
Capabilities: [44] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA
PME(D0+,D1+,D2+,D3hot+,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
Kernel driver in use: ohci1394
Kernel modules: firewire-ohci, ohci13942.6.24:
0c:03.0 FireWire (IEEE 1394): Agere Systems FW323 (rev 61) (prog-if 10
[OHCI])
Subsystem: Agere Systems FW323
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr-
Stepping- SERR- FastB2B+ DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 248 (3000ns min, 6000ns max), Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 19
Region 0: Memory at 8c000000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [44] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA
PME(D0+,D1+,D2+,D3hot+,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME+
Kernel driver in use: ohci1394
Maybe. Is it important that i have an core duo? (32 bit only - not a the
Is the lspci output sufficient for you?--
Ok, so it didn't use to be at the 4GB mark.
This seems to be a PCI and resource alloc issue. It would be really
interesting to see where the 4GB allocation started. Ie ignore anything
else (warnings, driver loadings etc), and _just_ look at lspci -vv output
for where the memory got allocated.Did you already bisect that and I just missed it?
Linus
--
I didn't bisect this.But I recompiled with CONFIG_RESOURCES_64BIT
*unset* and everything is fine now, i.e. no error occurs anymore.lspci -vv output:
0c:03.0 FireWire (IEEE 1394): Agere Systems FW323 (rev 61) (prog-if 10
[OHCI])
Subsystem: Agere Systems FW323
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop-
ParErr- Stepping- SERR- FastB2B+ DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 248 (3000ns min, 6000ns max), Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 19
Region 0: Memory at 94406000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [44] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA
PME(D0+,D1+,D2+,D3hot+,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME+
Kernel driver in use: ohci1394
Kernel modules: firewire-ohci, ohci1394This is on ingo's git tree (branch: x86-latest) (commit
0fef904c33841be92f5ebdbdb9339dd11a133c92 - x86: ioremap of 64-bit
resource on 32-bit kernel fix)mfg
thomas
--
Note that 'Memory at 100000000' is totally bogus as this is 32-bit BAR
and respective PCI bridge is 32-bit either.Most interesting thing is that under 2.6.24 Tomas had
PCI: Bridge: 0000:00:1e.0
IO window: disabled.
MEM window: 8c000000-8c0fffff
PREFETCH window: disabled.but with 2.6.25
PCI: Bridge: 0000:00:1e.0
IO window: disabled.
MEM window: 0x00000000-0x000fffff
PREFETCH window: disabled.I'm sure that MEM window was actually 0x100000000-0x1000fffff as that
Yeah, it's resource issue for sure. The 00:1e.0 is a transparent bridge,
so I blame commit 8fa5913d54f3b1e09948e6a0db34da887e05ff1f
(PCI: remove transparent bridge sizing). It's wrong for two reasons:
- we cannot ignore standard windows of a transparent bridge as they
always positive decode, so they are potential source of address conflicts;
- that patch just broke whole bridge setup logic in unpredictable way.What confused me a lot initially is that the patch was already there
in 2.6.24. But I think that issue was somehow masked by 'unsigned
longs' used in setup-bus.c all over the place instead of resource_size_t,
which has been fixed by Ben in 2.6.25...
I think that's why Thomas has everything working again without
CONFIG_RESOURCES_64BIT.Thomas, can you put CONFIG_RESOURCES_64BIT=y back and either
revert commit 8fa5913d54f3b1e09948e6a0db34da887e05ff1f, or
just comment out these two lines in drivers/pci/setup-bus.c:if (bus->self->transparent)
break;and check if it helps?
Ivan.
--
alternatively, try x86.git/latest which has the revert below included.
Ingo
------------------>
Subject: pci: revert "PCI: remove transparent bridge sizing"
From: Ingo Molnar <mingo@elte.hu>
Date: Wed Mar 26 14:38:07 CET 2008revert commit 8fa5913d54f3b1e09948e6a0db34da887e05ff1f as it is wrong.
Requested by Ivan Kokshaysky.
Signed-off-by-if-Thomas-reports-success: Ingo Molnar <mingo@elte.hu>
---
drivers/pci/setup-bus.c | 5 -----
1 file changed, 5 deletions(-)Index: linux-x86.q/drivers/pci/setup-bus.c
===================================================================
--- linux-x86.q.orig/drivers/pci/setup-bus.c
+++ linux-x86.q/drivers/pci/setup-bus.c
@@ -486,12 +486,7 @@ void __ref pci_bus_size_bridges(struct p
break;case PCI_CLASS_BRIDGE_PCI:
- /* don't size subtractive decoding (transparent)
- * PCI-to-PCI bridges */
- if (bus->self->transparent)
- break;
pci_bridge_check_ranges(bus);
- /* fall through */
default:
pbus_size_io(bus);
/* If the bridge supports prefetchable range, size it
--
Good news everyone.
Everything works again.
lspci -vv:
0c:03.0 FireWire (IEEE 1394): Agere Systems FW323 (rev 61) (prog-if 10
[OHCI])
Subsystem: Agere Systems FW323
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop-
ParErr- Stepping- SERR- FastB2B+ DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 248 (3000ns min, 6000ns max), Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 19
Region 0: Memory at 94400000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [44] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA
PME(D0+,D1+,D2+,D3hot+,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME+
Kernel driver in use: ohci1394
Kernel modules: firewire-ohci, ohci1394See also attached cfs-debug-info for config and dmesg.
I tried ingos's x86/latest:
$ git describe
v2.6.25-rc7-684-g74d77fethanks to everybody for the help.
thomas
As the author of the "PCI: remove transparent bridge sizing" change,
I apologize for the trouble it seems to have caused. The same change
had also exposed an issue reported by Paul Martin that has been causing
an Oops while hotplugging ThinkPads to a ThinkPad Dock II.
re: http://lkml.org/lkml/2008/2/19/405
http://bugzilla.kernel.org/show_bug.cgi?id=9961
I have a fix for the ThinkPad docking Oops but if the issue being
discussed here is caused by the transparent bridge sizing removal
change I totally agree that it should be reverted.The transparent bridge sizing removal change was motivated by
insufficient PCI memory resource for a transparent bridge window
that was being created as a result of expansion ROM(s) being included
in the transparent bridge sizing calculations. A later
"PCI: Remove default PCI expansion ROM memory allocation" change
( re: http://lkml.org/lkml/2007/12/11/361 ) removes the expansion
ROM(s) from the transparent bridge sizing calculations which actually
resolves the original issue in a different manner. So, even if the
"PCI: remove transparent bridge sizing" is not problematic it is
no longer needed anyway.Gary
--
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503 IBM T/L: 775-4503
garyhade@us.ibm.com
http://www.ibm.com/linux/ltc--
Ok, I reverted it, but the docking Oops is still interesting, in that it
implies that some piece of code wasn't exactly robust. I haven't seen the
patch, but it sounds like that should be fixed independently..Linus
--
Could be but without the sizing removal I doubt that the fix I
was going to propose (see below) would be the same. What I had
found is that the removal of the transparent bridge sizing was
leaving the resource record for at least region 7 (IO) of a
hot-added transparent bridge on the docking station in a state
that was not palatable with later executed code in pdev_sort_resource().
Even though the restoration of the transparent bridge sizing
corrects the problem, pdev_sort_resource() could probably use
some bulletproofing. I will take a look at this.Thanks,
Gary--
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503 IBM T/L: 775-4503
garyhade@us.ibm.com
http://www.ibm.com/linux/ltc--- linux-2.6.25-rc6/drivers/pci/probe.c.orig 2008-03-20 12:09:14.000000000 -0700
+++ linux-2.6.25-rc6/drivers/pci/probe.c 2008-03-20 12:11:29.000000000 -0700
@@ -328,6 +328,8 @@ void __devinit pci_read_bridge_bases(str
if (!res->end)
res->end = limit + 0xfff;
}
+ if (!res->start && dev->transparent)
+ res->flags = 0;res = child->resource[1];
pci_read_config_word(dev, PCI_MEMORY_BASE, &mem_base_lo);
@@ -339,6 +341,8 @@ void __devinit pci_read_bridge_bases(str
res->start = base;
res->end = limit + 0xfffff;
}
+ if (!res->start && dev->transparent)
+ res->flags = 0;res = child->resource[2];
pci_read_config_word(dev, PCI_PREF_MEMORY_BASE, &mem_base_lo);
@@ -373,6 +377,8 @@ void __devinit pci_read_bridge_bases(str
res->start = base;
res->end = limit + 0xfffff;
}
+ if (!res->start && dev->transparent)
+ res->flags = 0;
}static struct pci_bus * pci_alloc_bus(void)
--
Ahh, ok, so it was directly caused by simply not sizing and setting up the
Thanks,
Linus
--
Indeed.
This should prevent an oops in all cases.
Ivan.
---
PCI: improved sanity check for pdev_sort_resources()Prevent potential oops with unsized PCI bridge resources.
Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
---
drivers/pci/setup-res.c | 7 +++----
1 files changed, 3 insertions(+), 4 deletions(-)diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
index 4be7ccf..fb57c8b 100644
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -226,18 +226,17 @@ void pdev_sort_resources(struct pci_dev *dev, struct resource_list *head)
if (r->flags & IORESOURCE_PCI_FIXED)
continue;- r_align = r->end - r->start;
-
+ r_align = (i < PCI_BRIDGE_RESOURCES) ? r->end - r->start + 1 :
+ r->start;
if (!(r->flags) || r->parent)
continue;
- if (!r_align) {
+ if (r_align <= 1) {
printk(KERN_WARNING "PCI: Ignore bogus resource %d "
"[%llx:%llx] of %s\n",
i, (unsigned long long)r->start,
(unsigned long long)r->end, pci_name(dev));
continue;
}
- r_align = (i < PCI_BRIDGE_RESOURCES) ? r_align + 1 : r->start;
for (list = head; ; list = list->next) {
resource_size_t align = 0;
struct resource_list *ln = list->next;
--
Hmm. This makes the code look totally nonsensical.
It seems to come from the fact that we had a very odd way to check bogus
resources (namely "start == end"), but your code makes it _really_ hard to
see what is going on.In fact, I think the old code was buggy too, because we actually *do* have
single-byte resources where start == end, as showb by google:PCI: Ignore bogus resource 1 [3f6:3f6] of Symphony Labs SL82c105
PCI: Ignore bogus resource 3 [376:376] of Symphony Labs SL82c105where that resource actually looks valid, and should have a single byte
alignment! Admittedly I think it was created with a quirk (can you get
that kind of resource from actually _probing_ a PCI device?) but I do
think that a single-byte resource is valid.So I wonder if we shouldn't just make this a bit more readable and also a
bit more explicit with something like the appended..NOTE! This will also consider a bridge resource at 0 to be an invalid
resource (since now the alignment will be zero), which is a bit odd and
makes me worry a bit. I wouldn't be surprised if some non-PC architectures
have PCI bridges at zero. But maybe they should be (or already are?)
marked IORESOURCE_PCI_FIXED?Ben - the obvious "odd PCI bus resources" architecture would be POWER. Any
commentary?Linus
---
drivers/pci/setup-res.c | 21 +++++++++++++++++----
1 files changed, 17 insertions(+), 4 deletions(-)diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
index 4be7ccf..048ed77 100644
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -211,6 +211,20 @@ int pci_assign_resource_fixed(struct pci_dev *dev, int resno)
EXPORT_SYMBOL_GPL(pci_assign_resource_fixed);
#endif+static inline resource_size_t get_resource_alignment(int resno, struct resource *r)
+{
+ resource_size_t start = r->start, end = r->end;
+ resource_size_t alignment = 0;
+
+ /* End == start == 0 - invalid resource */
+ if (end && start <= end) {
+ alignment = end -...
Good point - even though 1-byte size/alignment is invalid for a regular BAR
(minimum is 8 bytes for IO and 16 bytes for MEM, IIRC), nothing prevents
us from using this code for non-standard stuff, including single-byteWell, at this point (pdev_sort_resources call) bridge resource->start
has nothing to do with a bus address, it just a temporary storage for
required alignment, filled by sizing routines (and 0 is definitely invalid
here). I know, this is quite confusing, but I didn't want to add extra
fields to existing structures or create temporary per-bus trees...
But after pci_assign_resource() that resource can certainly be at 0,Extra 4 or 8 bytes per resource? Well, if you think that people won't start
complain too much about that, I'll be absolutely happy with that.If the new "align" field (and then, maybe, "size" instead of "end"?)
is OK, then I'm definitely willing to give it a try.Ivan.
--
In fact, this is the resource assignment code, and PCIBIOS_MIN_* is set
to non-0 for powerpc too so we don't try to assign things down at 0
anyway, so we don't have a problem. We only care about things at 0 that
have already been put there by the firmware and that we decide not to
reassign.So I don't have any objection anymore.
Cheers,
Ben.--
Adding an alignment field should be a non-issue: the size of this
structure is not likely to be a big deal (yeah, we have something like 12
of them in each PCI device etc, so smaller is better, but it's still not
going to be something anybody really notices).And yeah, it might be nice to have "size" instead of "end", but the real
problem with that one is actually that on 32-bit (without the 64-bit
resource configuration) we want it to be "size-1" in order to be able to
fit a whole 0-0xffffffff resource into a resource.And *that* would be really ugly with "size": at least right now it makes a
certain amount of sense with "end" pointing to the last entry.So while I can understand your wish for "start+len" rather than
"start+end", I don't think it's really practical.Linus
--
Actually, before we go any further, there might be a less intrusive
alternative: add just a couple of flags to the resource flags field (we
still have something like 8 unused bits on 32-bit), and use those to
implement a generic "resource_alignment()" routine.Two flags would do it:
- IORESOURCE_SIZEALIGN: size indicates alignment (regular PCI device
resources)- IORESOURCE_STARTALIGN: start field is alignment (PCI bus resources
during probing)and then the case of both flags zero (or both bits set) would actually be
"invalid", and we would also clear the IORESOURCE_STARTALIGN flag when we
actually allocate the resource (so that we don't use the "start" field as
alignment incorrectly when it no longer indicates alignment).That wouldn't be totally generic, but it would have the nice property of
automatically at least add sanity checking for that whole "res->start has
the odd meaning of 'alignment' during probing" and remove the need for a
new field, and it would allow us to have a generic "resource_alignment()"
routine that just gets a resource pointer.Linus
--
Sounds good to me. So here we go (completely untested, just for review).
Ivan.
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 2db2e4b..152ffa2 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -225,7 +225,7 @@ static void pci_read_bases(struct pci_dev *dev, unsigned int howmany, int rom)
res->flags |= l & ~PCI_BASE_ADDRESS_IO_MASK;
}
res->end = res->start + (unsigned long) sz;
- res->flags |= pci_calc_resource_flags(l);
+ res->flags |= pci_calc_resource_flags(l) | IORESOURCE_SIZEALIGN;
if (is_64bit_memory(l)) {
u32 szhi, lhi;@@ -278,7 +278,8 @@ static void pci_read_bases(struct pci_dev *dev, unsigned int howmany, int rom)
if (sz) {
res->flags = (l & IORESOURCE_ROM_ENABLE) |
IORESOURCE_MEM | IORESOURCE_PREFETCH |
- IORESOURCE_READONLY | IORESOURCE_CACHEABLE;
+ IORESOURCE_READONLY | IORESOURCE_CACHEABLE |
+ IORESOURCE_SIZEALIGN;
res->start = l & PCI_ROM_ADDRESS_MASK;
res->end = res->start + (unsigned long) sz;
}
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index f7cb8e0..5cf8456 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -65,6 +65,7 @@ static void pbus_assign_resources_sorted(struct pci_bus *bus)
res = list->res;
idx = res - &list->dev->resource[0];
if (pci_assign_resource(list->dev, idx)) {
+ /* FIXME: get rid of this */
res->start = 0;
res->end = 0;
res->flags = 0;
@@ -327,6 +328,7 @@ static void pbus_size_io(struct pci_bus *bus)
/* Alignment of the IO window is always 4K */
b_res->start = 4096;
b_res->end = b_res->start + size - 1;
+ b_res->flags |= IORESOURCE_STARTALIGN;
}/* Calculate the size of the bus and minimal alignment which
@@ -401,6 +403,7 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask, unsigned long
}
b_res->start = min_align;
b_res->end = size + min_align - 1;
+ b_res-&g...
Ivan, After adding a resource_alignment() prototype to
include/linux/ioport.h to get rid of
drivers/pci/setup-res.c: In function `pci_assign_resource':
drivers/pci/setup-res.c:141: error: implicit declaration of function `resource_alignment'
make[2]: *** [drivers/pci/setup-res.o] Error 1
make[1]: *** [drivers/pci] Error 2
I verified that pdev_sort_resources() longer encounters an Oops when
the ThinkPad is hot-added to the Dock II with the change that
exposed the problem (transparent bridge sizing removal) included.
So, although it is no suprise, I can at least say that the
pdev_sort_resources() bulletproofing has been test verified. :)However, while doing the above I did see some "bogus alignment
of resource" messages both during boot with ThinkPad attached to
the Dock II and during hot-add of the ThinkPad to the Dock II.
The messages are associated with the cardbus bridges on the
docking station. dmesg and lspci output collected after the
hot-add included below.Note that the
PCI: Cannot allocate resource region 7 of bridge 0000:02:03.0
PCI: Cannot allocate resource region 8 of bridge 0000:02:03.0
PCI: Cannot allocate resource region 9 of bridge 0000:02:03.0
messages associated with the transparent p2p bridge on the
docking station are _not_ new and come from
if (!r->start || !pr ||
request_resource(pr, r) < 0) {
printk(KERN_ERR "PCI: Cannot allocate "
"resource region %d "
"of bridge %s\n",
idx, pci_name(dev));
/*
* Something is wrong with the region.
* Invalidate the resource to prevent
* child resource allocations in this
* range.
*/
r->flags = 0;
}
in pcibios_allocate_bus_resources() [arch/x86/pci/i386.c] which
is visited during boot but not during the hot-add. This is where
the resource flags were cleared preventing the Oops from happening
during boot when the ThinkPad was already attached to the docking
station. This code should probably be modified to reduce dmesg
pollution.Will now try ...
Just tried it on an IBM x3850. No obvious problems or unexpected
messages spotted during/following boot or during/following hotplug
of PCI-X and PCIe cards.Gary
--
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503 IBM T/L: 775-4503
garyhade@us.ibm.com
http://www.ibm.com/linux/ltc
--
Thanks for testing. I've also tested an updated patch (with BUG_ON()
replaced by printk and a resource_alignment() prototype) on alpha
and x86 - works as expected.Ivan.
--
Don't do the BUG_ON(). That would just cause a broken machine, and makes
it much harder to report this issue. BUG_ON() should be used only for
totally unfixable things.In this case, the easy thing to do is to just return an error, possibly
with a printk() about bogus resources (ignoring it as a resource, the way
we do it in pdev_sort_resource()).But other than that, the thing doesn't look horrible.
Linus
--
PCI bridges at zero is perfectly valid indeed and I'm sure we have that
around at least for IO space. In fact, I'm surprised you don't have that
on x86. Typically, things like an HT segment with a P2P bridge and
behind that bridge an ISA bridge could well have the P2P bridge with a
resource forwarding 0...0x1000 IO downstream for example even on x86
no ? (I'm not -that- familiar with the crazyness of legacy ISA on x86
but I've definitely seen such setup on other archs).For MMIO, it mostly depends whether the code gets to work on raw bus
values, in which case 0 will be around, or already fixed up values (ie,
translated in CPU bus space) in which case 0 is unlikely.In the case of pdev_sort_resources(), it will manipulate already fixed
up resources, so MMIO should work, but I'm a bit worried by PIO.Ben.
--
x86 has memory there, always has had, probably always will.
Also, the reason I *think* this issue is ok is that I think the only PCI
bus resources we can see in the whole pdev_sort_resources() mess is the
ones that are behind the bus that we're not sizing for, and they've been
set up by pbus_size_mem().And pbus_size_mem() has this special magic setup where it calculates the
size and the alignment of the bus resource, and then makesr->start = alignment;
r->end = r->start + size - 1;so using "r->start" *should* be ok in this case because it really means
"alignment" in this one case.That said, I'm not going to be willing to bet my life on it.
I also wonder if we maybe should just add a separate "alignment" field to
the resources. Rather than playing games like these (and having to compare
the resource number to decide whether it is a bus resource or a normal PCI
device resource), just adding the dang field would be a whole lot saner.I dunno. I'm not going to do anything in this area before 2.6.25 is out
because this *does* make me a bit nervous, but if somebody wants to think
about this and perhaps write patches for testing, that would be good.And once more: Ivan, can you again double-check my blatherings above?
Linus
--
One way to kill off some of the assumptions and gunge would be to add
pci_resource_assigned(resource) [or indeed just resource_assigned()].
Iomap has similar problems - we have no portable defined "not mapped" at
the moment, although we use NULL technically mmio maps of 0 end up at 0
in the implementation today.Alan
--
There is IORESOURCE_UNSET... We could use that. I use it to some extent
on powerpc but x86 doesn't. Though I remember spotting a code path in
setup-res.c will not clear it when actually assigning the resource to a
bus. I can't remember if that hits in practice tho. I have a patch
anyway :-)----
[PATCH] pci: Make pci_assign_resource always clear IORESOURCE_UNSETFor bus resources pci_assign_resrouce() needs to also clear
IORESOURCE_UNSET. (For device resources, it's handled by
pci_update_resource).Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---Index: linux-merge/drivers/pci/setup-res.c
===================================================================
--- linux-merge.orig/drivers/pci/setup-res.c 2007-12-13 13:06:27.000000000 +1100
+++ linux-merge/drivers/pci/setup-res.c 2007-12-13 13:06:50.000000000 +1100
@@ -167,7 +167,8 @@ int pci_assign_resource(struct pci_dev *
(unsigned long long)res->start, pci_name(dev));
} else if (resno < PCI_BRIDGE_RESOURCES) {
pci_update_resource(dev, res, resno);
- }
+ } else
+ res->flags &= ~IORESOURCE_UNSET;return ret;
}--
0..0x1000 physical memory (== bus memory on x86) is reserved to the BIOS
as RAM in essence and that legacy will be with us for at least 100 or
maybe 200 years ;-)Ingo
--
I was talking about IO not memory mostly here. MMIO wouldn't be a
problem on powerpc as I said because we offset MMIO resources early
after probe so that they contain effectively a CPU bus address, and in
that case, 0 is definitely not going to happen for PCI devices or busses
(even if it may on the bus, but the code we are talking about won't see
it).Cheers,
Ben.--
Yeah, low IO is also reserved on PC's (the low 256 IO ports are
motherboard resources and contain stuff like legacy DMA channel setup
etc).You could imagine having it behind a PCI bridge, but in practice it's
always on the NB/SB (and if you want to support some of the odder things
like the NMI reason and the i387 error ports, they pretty much have to
be - it would be insane to make a special PCI chips on a separate bus
that does things like that).Linus
--
Sure but can't that be in a kind of southbridge ? Like HT or PCIe
segment out of the CPU gets through a virtual P2P wich then hits the
"legacy" combo blob masquerading as a PCI device ? I remember seeing
that sort of thing in the past and I -think- it was some kind of x86It's often all virtual inside a single chip.
Anyway, doesn't matter much at this stage I suppose, but it would be
nice to not use 0 as meaning invalid when sizing bridge windows and I'm
not sure at all about using "start" as an alignment indicator neither...
It will be much over-aligned in some cases, adding constraints to the
allocator where we didn't have any before no ?Ben.
--
Hrm... we use "start" today already and it hasn't choked on me so far so
I suppose that's fine. I overlooked that ...Ben.
--
ah i was fixated on ioremap-ed memory resources - the PIO space is quite
colorful indeed.Ingo
--
On a PC system 0x00-0xFF are motherboard resources (sometimes chipset,
some even swallowed by the CPU in certain cases) so 0 as disabled is sort
of safe but as shown by the pci_enable_device_bars replacement code - not
a good idea neccessarily.A lot of driver code does assume 0 == unavailable/off/disabled including
large chunks of serial, ata, ide and probably other subsystems.Alan
--
True, and we do try to avoid allocating resources at 0 most of the time,
but it happens, and in the case of bridges, it can make somewhat sense
(more in fact than for devices I'd say).Cheers,
Ben.--
no need to apologize, bugs happen. Could we perhaps add a WARN_ON()
somewhere to detect the nonsensical (for that particular hardware)
resource setup?Ingo
--
The root of the problem is again HW vendors getting wild I think...
Some bridges are fully transparent, some have explicit decode windows
but are also transparent for whatever else and we definitely don't deal
with it.Ben.
--
The Intel 94x chipsets must be OK wrt >4Gb MMIO accesses, as far as I know.
You should just get all 1s on read of non-existent MMIO address, as usual.
Not a hard hang.So I suspect something is still broken in 64-bit ioremap... It would be nice
to sort that out, even though it seems to be unrelated to this particular
problem.Ivan.
--
Just to clarify, in case the thread of discussion caused any confusion:
I only have an unaffected system which never maps MMIO space above 4G.
So far only Thomas M.'s system did this.
--
Stefan Richter
-=====-==--- --== ==--=
http://arcgraph.de/sr/
--
[...]
On my x86-32 test system, none of the devices gets an MMIO region above
the 4G map. (Reconfigured with CONFIG_RESOURCES_64BIT=y while having
CONFIG_HIGHMEM4G=y, like in Thomas M's .config.)So the only thing I can confirm is that the patch builds and runs fine
on x86-64.
--
Stefan Richter
-=====-==--- --== =-===
http://arcgraph.de/sr/
--
64-bit ioremap on x86 never worked AFAIK, the whole API is using 32-bit
physical addresses which is limited to 4GB. But ... this bug does have
the feeling of the wrong IO side physical memory being mapped. (0xff
data means there's no device there - typical of non-existent RAM/IO)could you please pull the x86/latest tree to do a kernel pagetable dump
for us? You can pull it via the instructions here:http://people.redhat.com/mingo/x86.git/README
then enable CONFIG_DEBUG_KERNEL=y and CONFIG_X86_PTDUMP=y boot the
kernel so that ohci1394 doesnt work. Then dump the pagetables and other
hw info via:http://people.redhat.com/mingo/cfs-scheduler/tools/cfs-debug-info.sh
send us the cfs-debug-info file that gets generated. We'll be able to
tell from that whether your ioremap's are OK. (x86/latest will also
printk all ioremap()s and results that are done)thanks,
Ingo
--
See attached file.
could you send the full bootlog too? (the dmesg in the cfs-debug-info
log is truncated - probably CONFIG_LOG_BUF_SHIFT is too low?)the bootlog should have such entries:
ioremap: 0x12340000(0x00001000) => 0x12340000
near the ohci1394 initialization messages.
Ingo
--
Second try. See attached file.
Hmm. This isn't with my suggested patch, is it?
You are still losing the high 32 bits, and we see:
[ 162.485695] ioremap: 00000000(00000800) => f8978000
<3>ohci1394: fw-host0: Get PHY Reg timeout [0x00008400/0x00000000/100]because it's trying to ioremap the resource that is at 0x100000000, and it
gets truncated to 0x00000000.I already committed my patch as "obviously correct", so if you're a git
user, you can just update to current git, but here it is again if you
missed it and aren't a git person.Linus
---
commit b9e76a00749521f2b080fa8a4fb15f66538ab756
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Mon Mar 24 11:22:39 2008 -0700x86-32: Pass the full resource data to ioremap()
It appears that 64-bit PCI resources cannot possibly ever have worked on
x86-32 even when the RESOURCES_64BIT config option was set, because any
driver that tried to [pci_]ioremap() the resource would have been unable
to do so because the high 32 bits would have been silently dropped on
the floor by the ioremap() routines that only used "unsigned long".Change them to use "resource_size_t" instead, which properly encodes the
whole 64-bit resource data if RESOURCES_64BIT is enabled.Acked-by: H. Peter Anvin <hpa@kernel.org>
Acked-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
arch/x86/mm/ioremap.c | 6 +++---
include/asm-x86/io_32.h | 6 +++---
include/asm-x86/io_64.h | 6 +++---
lib/iomap.c | 2 +-
4 files changed, 10 insertions(+), 10 deletions(-)diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 8fe576b..4afaba0 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -106,7 +106,7 @@ static int ioremap_change_attr(unsigned long vaddr, unsigned long size,
* have to convert them into an...
Misses your patch (email problems yesterday).
I suggest you merge the one bulk-fixing iomap and then apply your x86
ioremap fixup on top of it, if not too late..Cheers,
--
While at it, here's an updated version of my patch that applies on top
of yesterday git. I also turned "len" into a resource_size_t as it's
technically the right type to use in the generic code, even if it makes
little sense to try to map more than 32 bits on a 32 bits arch :-) Maybe
we should actually add some kind of warning here or error checking
though rather than just cropping the value.I "fixed" all archs even those that don't technically need it (like real
64 bits one) for consistency (and who knows if they ever want to make
resource_size_t even bigger ...). The patch actually fixes at least one
bug in the case of arch/ppc which needs that change.
---iomap: Fix 64 bits resources on 32 bits
Almost all implementations of pci_iomap() in the kernel, including
the generic lib/iomap.c one, copies the content of a struct resource
into unsigned long's which will break on 32 bits platforms with 64
bits resources.This fixes all definitions of pci_iomap() to use resource_size_t. I
also "fixed" the 64bits arch for consistency.Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---Index: linux-work/lib/iomap.c
===================================================================
--- linux-work.orig/lib/iomap.c 2008-03-26 10:39:25.000000000 +1100
+++ linux-work/lib/iomap.c 2008-03-27 10:20:14.000000000 +1100
@@ -257,7 +257,7 @@ EXPORT_SYMBOL(ioport_unmap);
void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long maxlen)
{
resource_size_t start = pci_resource_start(dev, bar);
- unsigned long len = pci_resource_len(dev, bar);
+ resource_size_t len = pci_resource_len(dev, bar);
unsigned long flags = pci_resource_flags(dev, bar);if (!len || !start)
Index: linux-work/arch/alpha/kernel/pci.c
===================================================================
--- linux-work.orig/arch/alpha/kernel/pci.c 2007-09-28 11:42:04.000000000 +1000
+++ linux-work/arch/alpha/kernel/pci.c 2008-03-27 10:19:59.000000000 +1100
@@ -547,8 +547,8 @@ sys_...
resource_size_t seems to make sense here. In the case when we have
64-bit resources but no PAE support we should error our when the64-bit BAR, certainly. 64-bit BAR with a value beyond 4 GB I would find
questionable at best... I suspect the answer is "no".-hpa
--
Opteron system with coprocessor on socket or HTX slot. will use 64bit
BAR above 4g and size is more than 4g too.YH
--
And it's working, with a 32-bit kernel?
(64-bit, no problem, obviously.)
-hpa
--
don't think so.
YH
--
Would you mind trying it out, with Linus' patch?
-hpa
--
myles may test that with 32 bit kernel on his setup.
YH
--
H. Peter Anvin wrote at
Hmm, right -- I didn't see this until now. Today's dmesg:
http://bugzilla.kernel.org/attachment.cgi?id=15397&action=view
[ 1.236587] firewire_ohci: Failed to remap registers
[ 243.640549] ohci1394: fw-host0: Get PHY Reg timeout
(etc.)However, the two drivers for the same device don't seem to be the
problem. Looks like firewire-ohci was attempted to be bound to the
controller much earlier than ohci1394. The error message means that
firewire-ohci's pci_request_region() succeeded but pci_iomap() failed,
hence the pci_driver.probe failed, hence firewire-ohci wasn't bound to
the device, hence subsequent loading of ohci1394 (manually, I presume)
was a valid action.IOW firewire-ohci was indeed already loaded, but not bound to the device
because of the .probe failure; and ohci1394 was loaded much later.Same thing in the report in February:
http://lkml.org/lkml/2008/2/23/244
[ 1.326958] firewire_ohci: Failed to remap registers
[ 856.943807] ohci1394: fw-host0: Get PHY Reg timeout
(here: ohci1394 manually loaded by insmod)(Let's see if bugme-daemon captures this...)
--
Stefan Richter
-=====-==--- --== =-==-
http://arcgraph.de/sr/
--
On other platforms however, you can have mmio above 32 bits without
support for 64 bits BARs: the entire PCI bus mmio region can be mapped
up there.That happens for example with 4xx embedded PowerPC. We deal with it just
fine, provided that nothing tries to stick a resource value into an
unsigned long but uses resource_size_t instead. Unfortunately, it's a
common bug, I've fixing drivers regulary.It also appears that the iomap code on various archs is buggy too,
including the generic lib/iomap.c, thus pci_iomap is broken for archs
that can have 64 bits resources and use the version in lib/.The good news is that I have done a patch to fix it, which has been
in -mm for about a month: iomap-fix-64-bits-resources-on-32-bits.patchCheers,
Ben.--
does it support if size is 4G above, like 256G?
YH
--
The patch just fixes pci_iomap to use resource_size_t, which fixes bugs
with MMIO addresses/size being cropped. It doesn't fix some limitations
we still, I think, have in the PCI code with 64 bits BARs on 32 bits
platforms (that we could, in fact, probably remove when resource_size_t
is 64 bits but we haven't done it yet, at least as far as I remember).Ben.
--
Yes, I tend to forget about it. I'll do my best to remember.
Thanks,
Rafael
--
This one seems to imply that the workaround approved by AMD/ATI didn't
work...(AMD folks CC'd...)
Jeff
--
note, the 'Subject' is wrong, we resolved the original regression on
2007-11-26, 7 days after it was reported. This is about a bug with
similar symptoms, reopened 10 days ago. Jeff, could you please change
the bugzilla entry so that it reflects that it's an SB600 problem?Ingo
--
This appears to be a race in user space that people have been hitting
for some time, but has gotten more likely with the latest kernel. It
matches the behavior of a gnome dbus bug [
http://bugzilla.gnome.org/show_bug.cgi?id=395488 ]. The way to avoid
that bug is to install the dbus-x11 package which forces dbus to start
up earlier, avoiding the race.Though the original reporter hasn't replied, Christoph Hellwig hit the
same issue and was able to confirm that installing the dbus-x11
package avoids the issue, papering over bug, wherever it may lie.
Regardless, there are reports of this issue that go back before
2.6.25, so I don't think this is a regression, just a timing issue
that's a lot easier to hit with the latest kernel.This is one of those things that'd be nice to have in a Known Issues
document in the kernel release. "Debian and derivative distributions
may need to install dbus-x11 package to avoid a known userspace issue
in the dbus package. [gnome bug 395488]"
--
Yes, I have answered in the bugzilla that installing dbus-x11
fixes/hides the problem for me:
http://bugzilla.kernel.org/show_bug.cgi?id=10041#c15Should I have also posted it here? If so, sorry (but in this case, why
not set up auto-forwarding to l-k of the bugzilla entries related to the
Rafael regression list?)Romano
--
Sorry for the disclaimer --- ¡I cannot stop it!--
La presente comunicación tiene carácter confidencial y es para el exclusivo uso del destinatario indicado en la misma. Si Ud. no es el destinatario indicado, le informamos que cualquier forma de distribución, reproducción o uso de esta comunicación y/o de la información contenida en la misma están estrictamente prohibidos por la ley. Si Ud. ha recibido esta comunicación por error, por favor, notifíquelo inmediatamente al remitente contestando a este mensaje y proceda a continuación a destruirlo. Gracias por su colaboración.This communication contains confidential information. It is for the exclusive use of the intended addressee. If you are not the intended addressee, please note that any form of distribution, copying or use of this communication or the information in it is strictly prohibited by law. If you have received this communication in error, please immediately notify the sender by reply e-mail and destroy this message. Thank you for your cooperation.
--
No, you did fine, it was my mistake; I forgot to check bugzilla before
posting. Thanks for testing.
--
Okay, I closed the bug (resolution WILL_NOT_FIX).
Thanks,
Rafael
--
http://kernelnewbies.org/Linux_2_6_25 seems to be the right place for
such stuff.Diego is for quite some time doing a great job with these changes pages,
and the big penguin might perhaps be convinced to mention it in his
announcement of the final 2.6.25 kernel?cu
Adrian--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed--
Probably fixed - hardware limit on the SB600 shown up by merging I/Os
--
For this one a patch is available in -mm. Qouted patch can be seen
via the lkml.org link above ;)
--
Well, I tried to find a link to this patch, but I gave up after some time.
Please attach it to bug #10271 or put the link to it in there.
Thanks,
Rafael
--
--
btw., this one appears to be an accounting artifact: two regressions
with similar (same?) symptoms in the same bugzilla. The first regression
was resolved after 7 days, the second was reopened 10 days ago. (It
could easily be about the same underlying (hw?) problem, but the
regression window for users was very small.)Ingo
--
Hm, perhaps we should create another bugzilla entry for the problem
described in http://bugzilla.kernel.org/show_bug.cgi?id=9412#c5.If that turns out to be a duplicate of bug #9412, it'll be easy to close.
Otherwise, it's just _very_ confusing.Thanks,
Rafael
--
| Davide Libenzi | Re: [patch 7/8] fdmap v2 - implement sys_socket2 |
| Bart Van Assche | Integration of SCST in the mainstream Linux kernel |
| Greg Kroah-Hartman | [PATCH 005/196] Chinese: add translation of SubmittingDrivers |
| Mariusz Kozlowski | [KJ PATCHES] mostly kmalloc + memset conversion to k[cz]alloc |
git: | |
| KOSAKI Motohiro | [bug?] tg3: Failed to load firmware "tigon/tg3_tso.bin" |
| Stefan Richter | Re: [GIT]: Networking |
| David Miller | Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock(). |
| Gerrit Renker | [PATCH 0/37] dccp: Feature negotiation - last call for comments |
