Re: [Sparc64 BUG] hangup under booting 2.6.22.X with Sparc64

Previous thread: [PATCH 0/13] Reduce external fragmentation by grouping pages by mobility v30 by Mel Gorman on Monday, September 10, 2007 - 7:20 am. (21 messages)

Next thread: [RESEND][PATCH 0/4] Virtual Machine Time Accounting by Laurent Vivier on Monday, September 10, 2007 - 8:02 am. (7 messages)
To: <linux-kernel@...>, <stable-bounces@...>, <davem@...>
Date: Monday, September 10, 2007 - 7:42 am

Hi!

(please CC)

We have Sun Fire V100 servers, and they hang while booting the stable
2.6.22 kernels. The servers boot fine with the 2.6.21 series, but with
the 2.6.20 series we get the same error as with 2.6.22.

This is the serial terminal dump:

Sun Fire V100 (UltraSPARC-IIe 648MHz), No Keyboard
OpenBoot 4.0, 1024 MB memory installed, Serial #53829668.
Ethernet address 0:3:ba:35:60:24, Host ID: 83356024.

Executing last command: boot
Boot device: /pci@1f,0/ide@d/disk@0,0:a File and args:
SILO Version 1.4.13
boot: teszt
Allocated 8 Megs of memory at 0x40000000 for kernel
Loaded kernel version 2.6.22

Remapping the kernel... done.
Booting Linux...

-----------

lspci:
00:00.0 Host bridge: Sun Microsystems Computer Corp. Ultra IIe
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr+ Stepping- SERR+ FastB2B-
Status: Cap- 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort+ >SERR- <PERR-
Latency: 40
Interrupt: pin ? routed to IRQ 00000001

00:03.0 Non-VGA unclassified device: ALi Corporation M7101 Power
Management Controller [PMU]
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-

00:05.0 Ethernet controller: Davicom Semiconductor, Inc. 21x4x
DEC-Tulip compatible 10/100 Ethernet (rev 31)
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr+ Stepping- SERR+ FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR-
Latency: 160 (5000ns min, 10000ns max)
Interrupt: pin A routed to IRQ 0000000a
Region 0: I/O ports at 1fe02010100 [size=256]
Region 1: Memory at 1ff00002000 (32-bit, non-prefetchable) [size=256]
Expansion ROM at 1ff00100000 [disabled] [size=256K]
Capabilities: [50] Power Management version 2
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=22...

To: <kovedi@...>
Cc: <linux-kernel@...>, <stable-bounces@...>, <davem@...>
Date: Monday, September 10, 2007 - 8:36 am

From: "Kövedi_Krisztián" <kovedi@gmail.com>

Please add "-p" to the boot command line so we can see
if any other useful messages are printed by the kernel.

There is a period between this last message and the
console initialization, which can be quite long. The
"-p" option forces all messages to be printed via the
firmware.

To: David Miller <davem@...>
Cc: <linux-kernel@...>, <stable-bounces@...>, <davem@...>
Date: Tuesday, September 11, 2007 - 1:33 am

2007/9/10, David Miller <davem@davemloft.net>:

To: <kovedi@...>
Cc: <linux-kernel@...>, <stable-bounces@...>, <davem@...>
Date: Tuesday, September 11, 2007 - 11:38 am

I think this patch should fix the problem.

Please give it a try, thanks.

diff --git a/arch/sparc64/kernel/pci.c b/arch/sparc64/kernel/pci.c
index 139b4cf..e8dac81 100644
--- a/arch/sparc64/kernel/pci.c
+++ b/arch/sparc64/kernel/pci.c
@@ -744,7 +744,7 @@ static void __devinit pci_of_scan_bus(struct pci_pbm_info *pbm,
 {
 	struct device_node *child;
 	const u32 *reg;
-	int reglen, devfn;
+	int reglen, devfn, prev_devfn;
 	struct pci_dev *dev;

 	if (ofpci_verbose)
@@ -752,14 +752,25 @@ static void __devinit pci_of_scan_bus(struct pci_pbm_info *pbm,
 		       node->full_name, bus->number);

 	child = NULL;
+	prev_devfn = -1;
 	while ((child = of_get_next_child(node, child)) != NULL) {
 		if (ofpci_verbose)
 			printk(" * %s\n", child->full_name);
 		reg = of_get_property(child, "reg", &reglen);
 		if (reg == NULL || reglen < 20)
 			continue;
+
 		devfn = (reg[0] >> 8) & 0xff;

+		/* This is a workaround for some device trees
+		 * which list PCI devices twice. On the V100
+		 * for example, device number 3 is listed twice.
+		 * Once as "pm" and once again as "lomp".
+		 */
+		if (devfn == prev_devfn)
+			continue;
+		prev_devfn = devfn;
+
 		/* create a new pci_dev for this device */
 		dev = of_create_pci_dev(pbm, child, bus, devfn, 0);
 		if (!dev)
-

To: David Miller <davem@...>
Cc: <linux-kernel@...>, <stable-bounces@...>, <davem@...>
Date: Wednesday, September 12, 2007 - 3:47 am

Thanks.
The patch works fine; the kernel boots up without error messages.

When I applied the patch, I got this error message:

patch -p1 < ~/2.6.22.6.patch
patching file arch/sparc64/kernel/pci.c
Hunk #1 FAILED at 744.
Hunk #2 FAILED at 752.
2 out of 2 hunks FAILED -- saving rejects to file arch/sparc64/kernel/pci.c.rej

But I applied the patch to the kernel by hand, and it works.

To: <kovedi@...>
Cc: <linux-kernel@...>
Date: Wednesday, September 12, 2007 - 4:11 am

From: "K

To: David Miller <davem@...>
Cc: <kovedi@...>, <linux-kernel@...>, Greg KH <gregkh@...>, Linus Torvalds <torvalds@...>, Andrew Morton <akpm@...>
Date: Thursday, September 13, 2007 - 8:59 am

Hi David!

This patch is relevant for 2.6.22.y too.
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commi...
We (Krisz and I) tested it with 2.6.22.6; the server works and no errors
have appeared so far.

--
Thanks,
Oliver

To: Oliver Pinter <oliver.pntr@...>
Cc: David Miller <davem@...>, <kovedi@...>, <linux-kernel@...>, Linus Torvalds <torvalds@...>, Andrew Morton <akpm@...>
Date: Thursday, September 13, 2007 - 10:42 am

Are you asking for this patch to go into the 2.6.22-y tree? If so,
please send it to the stable@kernel.org address so we know to add it to
the queue.

thanks,

greg k-h

To: <gregkh@...>
Cc: <oliver.pntr@...>, <kovedi@...>, <linux-kernel@...>, <torvalds@...>, <akpm@...>
Date: Thursday, September 13, 2007 - 5:54 pm

From: Greg KH <gregkh@suse.de>

I'll do that after I land in Seattle in about 7 hours and
do some testing of the patch.

To: <kovedi@...>
Cc: <linux-kernel@...>, <stable-bounces@...>, <davem@...>
Date: Tuesday, September 11, 2007 - 7:13 am

From: "K
