(ccing linux-ide) Tejun, another one of these reset issues? --
Yeah, looks like it. I just sent the patch for #upstream-fixes and will forward it to -stable once it gets into #upstream-fixes. http://article.gmane.org/gmane.linux.ide/34077 Thanks. -- tejun --
I actually have this patch applied and have since switched the computer off completely for more than an hour, twice. It booted each time (this is the patch you initially suggested to Many Maxwell on this list earlier this month). Everything seems to work fine, but my dmesg and /var/log/messages are now flooded with this:
Aug 29 23:20:39 zappa ata1: EH complete
Aug 29 23:20:41 zappa ata1: EH complete
Aug 29 23:20:47 zappa ata1: EH complete
Aug 29 23:20:49 zappa ata1: EH complete
Aug 29 23:20:51 zappa ata1: EH complete
Aug 29 23:20:53 zappa ata1: EH complete
Aug 29 23:20:55 zappa ata1: EH complete
Aug 29 23:20:56 zappa ata1: EH complete
Aug 29 23:20:59 zappa ata1: EH complete
Aug 29 23:21:01 zappa ata1: EH complete
I am curious whether it boots tomorrow after sleeping for one night :-P Regards, Konsti -- GPG KeyID EF62FCEF Fingerprint: 13C9 B16B 9844 EC15 CC2E A080 1E69 3FDA EF62 FCEF --
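To put a number on a flood like the one above, the messages can be counted per port. The following is a minimal sketch, not part of the original thread; the function name `count_eh_complete` is hypothetical, and it assumes syslog-style lines of the form shown in the report.

```python
import re
from collections import Counter

# Matches lines such as "Aug 29 23:20:39 zappa ata1: EH complete".
EH_COMPLETE = re.compile(r"\b(ata\d+): EH complete\b")


def count_eh_complete(lines):
    """Return a Counter mapping port name (e.g. 'ata1') to its 'EH complete' count."""
    counts = Counter()
    for line in lines:
        match = EH_COMPLETE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Three lines copied from the report above.
sample = [
    "Aug 29 23:20:39 zappa ata1: EH complete",
    "Aug 29 23:20:41 zappa ata1: EH complete",
    "Aug 29 23:20:47 zappa ata1: EH complete",
]
print(count_eh_complete(sample))
```

In practice the same function could be fed an open /var/log/messages file object instead of the sample list.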
Hmm... Can you post full dmesg output? We used to see things like above when ATAPI CHECK SENSE handling somehow failed to tell EH that it was an exception not worth whining about. Maybe EH action mask is not being I somehow feel pretty optimistic about that part. :-) -- tejun --
Of course :-) dmesg is attached with patch applied. At the end of the patch it (of course) continues, but only with: ata1: EH complete ata1: EH complete ata1: EH complete Hmn, to take my pants down entirely: What is this "EH"? And how does the change of the .reset function affect this? May be, I You are right, it started immediately without a hitch this morning after sleeping entirely for a couple of hours. Kind Regards, Konsti -- GPG KeyID EF62FCEF Fingerprint: 13C9 B16B 9844 EC15 CC2E A080 1E69 3FDA EF62 FCEF
Your controller is continuously raising exceptions for some reason. I have no idea why yet. Can you please apply the attached patch and post the resulting dmesg? -- tejun
Of course. Patch applied, here is dmesg output, looks interesting now - but anything works fine so far. Regards, Konsti -- GPG KeyID EF62FCEF Fingerprint: 13C9 B16B 9844 EC15 CC2E A080 1E69 3FDA EF62 FCEF
(cc'ing linux-ide) I think it's more circa 2.6.22 but in my memory terms, which sucks, Hmm... someone is scheduling EH incessantly without any error or action set. Can you please try the attached patch? -- tejun
I did and put here all dmesg gave me. Do you want to get the whole stuff from the beginning? /var/log/messages has ist, but it is 416k, I put it on a ftp server somewhere out there then. Or is something else recommended? Konsti -- GPG KeyID EF62FCEF Fingerprint: 13C9 B16B 9844 EC15 CC2E A080 1E69 3FDA EF62 FCEF
The excerpt is fine but please turn on CONFIG_KALLSYMS. The stack dump is pretty much meaningless without it. Thanks. -- tejun --
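Whether CONFIG_KALLSYMS is enabled can be checked against the kernel's .config text before rebuilding. This is a small sketch under the assumption that the configuration is available as plain text (for example the .config file in the build tree); the helper name `kconfig_enabled` is hypothetical.

```python
def kconfig_enabled(config_text, symbol):
    """Return True if `symbol` is set to 'y' in the given kernel .config text."""
    wanted = f"{symbol}=y"
    return any(line.strip() == wanted for line in config_text.splitlines())


# Illustrative .config excerpt, not taken from the reporter's machine.
sample_config = """\
# CONFIG_KALLSYMS_ALL is not set
CONFIG_KALLSYMS=y
"""
print(kconfig_enabled(sample_config, "CONFIG_KALLSYMS"))
```

Exact line matching is used on purpose so that CONFIG_KALLSYMS does not accidentally match CONFIG_KALLSYMS_ALL.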
Yea, of course... my bad. First I wondered to turn this on or deliver ;-) -- GPG KeyID EF62FCEF Fingerprint: 13C9 B16B 9844 EC15 CC2E A080 1E69 3FDA EF62 FCEF
I'm seeing a similar problem after upgrading from 2.6.25.14-108.fc9.x86_64 to 2.6.27-rc6. From what I can tell the messages Sep 23 10:27:20 pangw kernel: ata6: EH pending after 5 tries, giving up Sep 23 10:27:20 pangw kernel: ata6: EH complete are printed for a disconnected ATA port that's a neighbor of one that's occupied. these are the ports that are in use: Sep 23 10:19:40 pangw kernel: ata1.00: ATA-6: ST3160023A, 3.01, max UDMA/100 Sep 23 10:19:40 pangw kernel: ata2.00: ATAPI: _NEC DVD_RW ND-3550A, 1.05, max UDMA/33 [PATA] Sep 23 10:19:40 pangw kernel: ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300) The WDC drive was previously on ata3 and the message were then printed for ata4. This makes me thunk it might be related to managing of dual-ported sata_nv chips? before: Sep 21 19:14:10 pangw kernel: sata_nv 0000:00:08.0: PCI INT A -> Link[APSJ] -> GSI 20 (level, low) -> IRQ 20 Sep 21 19:14:10 pangw kernel: scsi2 : sata_nv Sep 21 19:14:10 pangw kernel: scsi3 : sata_nv Sep 21 19:14:10 pangw kernel: ata3: SATA max UDMA/133 cmd 0x9f0 ctl 0xbf0 bmdma 0xcc00 irq 21 Sep 21 19:14:10 pangw kernel: ata4: SATA max UDMA/133 cmd 0x970 ctl 0xb70 bmdma 0xcc08 irq 21 Sep 21 19:14:10 pangw kernel: ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300) Sep 21 19:14:10 pangw kernel: ata3.00: ATA-7: WDC WD1600JS-60MHB1, 10.02E02, max UDMA/100 Sep 21 19:14:10 pangw kernel: ata3.00: 312581808 sectors, multi 16: LBA48 Sep 21 19:14:10 pangw kernel: ata3.00: configured for UDMA/100 Sep 21 19:14:10 pangw kernel: scsi 2:0:0:0: Direct-Access ATA WDC WD1600JS-60M 10.0 PQ: 0 ANSI: 5 Sep 21 19:14:10 pangw kernel: sd 2:0:0:0: [sdb] 312581808 512-byte hardware sectors (160042 MB) Sep 21 19:14:10 pangw kernel: sd 2:0:0:0: [sdb] Write Protect is off Sep 21 19:14:10 pangw kernel: sd 2:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA Sep 21 19:14:10 pangw kernel: sd 2:0:0:0: [sdb] 312581808 512-byte hardware sectors (160042 MB) Sep 21 19:14:10 pangw ...
Hey, I had not noticed this until now! My only SATA device seems to be connected to port 2:
Sep 24 07:12:47 zappa ata2: SATA max UDMA/133 cmd 0xe80 ctl 0xe00 bmdma 0xd808 irq 21
Sep 24 07:12:47 zappa ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 24 07:12:47 zappa ata2.00: ATA-7: SAMSUNG HD753LJ, 1AA01106, max UDMA7
Sep 24 07:12:47 zappa ata2.00: 1465149168 sectors, multi 16: LBA48 NCQ (depth 0/32)
Sep 24 07:12:47 zappa ata2.00: configured for UDMA/133
whereas my message log is still flooded with ata1 messages:
Sep 21 11:34:57 zappa ata1: EH complete
Sep 21 11:34:57 zappa ata1: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Sep 21 11:34:58 zappa ata1: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen t4
Sep 21 11:34:58 zappa ata1: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen t3
Sep 21 11:34:58 zappa ata1: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen t2
Tijo, is there anything I should try out? I mean, if I got this right, the boot problems themselves were fixed by removing nv_hardreset, but is there a way to stop the log being flooded with "EH complete" now? Kind regards, Konsti -- GPG KeyID EF62FCEF Fingerprint: 13C9 B16B 9844 EC15 CC2E A080 1E69 3FDA EF62 FCEF --
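The fields of an "exception" line can be pulled apart mechanically, which makes it easier to see that no error mask is set while a reset action is requested. This is a sketch only: `parse_exception` is a hypothetical helper, and the action bit names are an assumption based on recollection of the ATA_EH_* flags in include/linux/libata.h around 2.6.27, so they should be checked against the actual source.

```python
import re

EXCEPTION = re.compile(
    r"(?P<port>ata\d+(?:\.\d+)?): exception "
    r"Emask (?P<emask>0x[0-9a-f]+) SAct (?P<sact>0x[0-9a-f]+) "
    r"SErr (?P<serr>0x[0-9a-f]+) action (?P<action>0x[0-9a-f]+)"
    r"(?P<frozen> frozen)?"
)

# ASSUMPTION: bit names recalled from libata.h (2.6.27 era); verify before relying on them.
ACTION_BITS = {0x1: "REVALIDATE", 0x2: "SOFTRESET", 0x4: "HARDRESET"}


def parse_exception(line):
    """Parse a libata 'exception' log line into a dict, or return None if it does not match."""
    m = EXCEPTION.search(line)
    if m is None:
        return None
    action = int(m.group("action"), 16)
    return {
        "port": m.group("port"),
        "emask": int(m.group("emask"), 16),
        "sact": int(m.group("sact"), 16),
        "serr": int(m.group("serr"), 16),
        "action": action,
        "action_names": [name for bit, name in sorted(ACTION_BITS.items()) if action & bit],
        "frozen": m.group("frozen") is not None,
    }


# Line copied from the report above.
sample = "Sep 21 11:34:57 zappa ata1: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen"
print(parse_exception(sample))
```

For the sample line the error mask is zero while the action value has the two reset bits set, which matches the description of EH being scheduled without any error recorded.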
Please apply the attached patch and post the resulting log. Please don't forget to turn on KALLSYMS. Thanks. -- tejun
Mainly this is the patch you posted in http://marc.info/?l=linux-kernel&m=122033025206886&w=2 which I replied with a useless reply containing no debug. After that I turned KALLSYMS on and posted, did you miss this? here: http://marc.info/?l=linux-kernel&m=122033745615187&w=2 Kind Regards, Konsti -- GPG KeyID EF62FCEF Fingerprint: 13C9 B16B 9844 EC15 CC2E A080 1E69 3FDA EF62 FCEF --
Tejun, sorry for misspelling your name as Tijo :-/ Am I hitting your spam filter with my KSYMOOPS-enabled debug output, or something like that? If not, sorry for the inconvenience, and take your time. Regards, Konsti -- GPG KeyID EF62FCEF Fingerprint: 13C9 B16B 9844 EC15 CC2E A080 1E69 3FDA EF62 FCEF --
Sorry about lack of response. I was on vacation after a series of conferences. There's a bug entry for this problem and I just posted a patch. http://bugzilla.kernel.org/show_bug.cgi?id=11615 Can you please try the patch attached there and post the result there? Thanks. -- tejun --
Well, this patch simply reinstates the old weird behaviour, with the result that a cold boot is sometimes not possible and ends in:
ataX: link too slow to response
...
ataX: COMRESET failed (errno=-16)
After that a power cycle is required. I wrote this down from memory. If a detailed log is required, I can create one tomorrow when I have my hands on the machine again. Kind Regards, Konsti -- GPG KeyID EF62FCEF Fingerprint: 13C9 B16B 9844 EC15 CC2E A080 1E69 3FDA EF62 FCEF --
Ah.. okay, so generic is not the only one affected by the original hardreset problem. Can you please post the result of "lspci -nn"? -- tejun --
00:00.0 Host bridge [0600]: nVidia Corporation nForce3 250Gb Host Bridge [10de:00e1] (rev a1) 00:01.0 ISA bridge [0601]: nVidia Corporation nForce3 250Gb LPC Bridge [10de:00e0] (rev a2) 00:01.1 SMBus [0c05]: nVidia Corporation nForce 250Gb PCI System Management [10de:00e4] (rev a1) 00:02.0 USB Controller [0c03]: nVidia Corporation CK8S USB Controller [10de:00e7] (rev a1) 00:02.1 USB Controller [0c03]: nVidia Corporation CK8S USB Controller [10de:00e7] (rev a1) 00:02.2 USB Controller [0c03]: nVidia Corporation nForce3 EHCI USB 2.0 Controller [10de:00e8] (rev a2) 00:05.0 Bridge [0680]: nVidia Corporation CK8S Ethernet Controller [10de:00df] (rev a2) 00:06.0 Multimedia audio controller [0401]: nVidia Corporation nForce3 250Gb AC'97 Audio Controller [10de:00ea] (rev a1) 00:08.0 IDE interface [0101]: nVidia Corporation CK8S Parallel ATA Controller (v2.5) [10de:00e5] (rev a2) 00:0a.0 IDE interface [0101]: nVidia Corporation CK8S Serial ATA Controller (v2.5) [10de:00e3] (rev a2) 00:0b.0 PCI bridge [0604]: nVidia Corporation nForce3 250Gb AGP Host to PCI Bridge [10de:00e2] (rev a2) 00:0e.0 PCI bridge [0604]: nVidia Corporation nForce3 250Gb PCI-to-PCI Bridge [10de:00ed] (rev a2) 00:18.0 Host bridge [0600]: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration [1022:1100] 00:18.1 Host bridge [0600]: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map [1022:1101] 00:18.2 Host bridge [0600]: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller [1022:1102] 00:18.3 Host bridge [0600]: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control [1022:1103] 01:00.0 VGA compatible controller [0300]: nVidia Corporation NV34 [GeForce FX 5200] [10de:0322] (rev a1) 02:06.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ [10ec:8139] (rev 10) -- GPG KeyID EF62FCEF Fingerprint: 13C9 B16B 9844 EC15 CC2E A080 1E69 3FDA EF62 FCEF --
Please apply the attached patch and see whether the problem goes away. Also, can you test whether hotplug works with the patch applied? CK804 had problems with hotplug w/ HRST removed. I wanna make sure nf2/3 doesn't have the same problem. Thanks. -- tejun
Erm... I never did Hotplug on SATA, should I plug out the Disk out of the Mainboard Connector to see what happens? I suspect I need another Thats no problem, but one question: go onto vanilla 2.6.27_rc7 WITH or withOUT sata_nv-reinstate-nv_hardreset.patch? Regards, Konsti -- GPG KeyID EF62FCEF Fingerprint: 13C9 B16B 9844 EC15 CC2E A080 1E69 3FDA EF62 FCEF --
Or you can boot into single mode, ro mount / with kernel messages redirected to console and hot unplug/plug the root disk and see what With. Thanks. -- tejun --
Well, this way the Situation is the following: TCP: Hash tables configured (established 131072 bind 65536) TCP reno registered NET: Registered protocol family 1 SGI XFS with ACLs, security attributes, realtime, large block/inode numbers, no debug enabled SGI XFS Quota Management subsystem msgmni has been set to 2008 io scheduler noop registered io scheduler cfq registered (default) pci 0000:01:00.0: Boot video device Linux agpgart interface v0.103 forcedeth: Reverse Engineered nForce ethernet driver. Version 0.61. ACPI: PCI Interrupt Link [LKLN] enabled at IRQ 22 forcedeth 0000:00:05.0: PCI INT A -> Link[LKLN] -> GSI 22 (level, low) -> IRQ 22 forcedeth 0000:00:05.0: setting latency timer to 64 nv_probe: set workaround bit for reversed mac addr Switched to high resolution mode on CPU 0 forcedeth 0000:00:05.0: ifname eth0, PHY OUI 0x732 @ 1, addr 00:13:8f:fd:f9:26 forcedeth 0000:00:05.0: csum timirq lnktim desc-v2 netconsole: local port 6665 netconsole: local IP 10.10.0.1 netconsole: interface eth0 netconsole: remote port 6666 netconsole: remote IP 10.10.0.18 netconsole: remote ethernet address 00:22:15:68:2c:eb netconsole: device eth0 not up yet, forcing it eth0: no link during initialization. eth0: link up. console [netcon0] enabled netconsole: network logging started Driver 'sd' needs updating - please use bus_type methods sata_nv 0000:00:0a.0: version 3.5 ACPI: PCI Interrupt Link [LTID] enabled at IRQ 21 sata_nv 0000:00:0a.0: PCI INT A -> Link[LTID] -> GSI 21 (level, low) -> IRQ 21 sata_nv 0000:00:0a.0: setting latency timer to 64 scsi0 : sata_nv scsi1 : sata_nv ata1: SATA max UDMA/133 cmd 0xf80 ctl 0xf00 bmdma 0xd800 irq 21 ata2: SATA max UDMA/133 cmd 0xe80 ctl 0xe00 bmdma 0xd808 irq 21 ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) ata1.00: ATA-7: SAMSUNG HD753LJ, 1AA01106, max UDMA7 ata1.00: 1465149168 sectors, multi 16: LBA48 NCQ (depth 0/32) ata1.00: configured for UDMA/133 isa bounce pool size: 16 pages scsi 0:0:0:0: Direct-Access ATA ...
With commit 4c1eb90a0908c0c60db2169dce08fb672e7582f1 (v2.6.27-rc8), I see no spurious EH complete events as I saw with 2.6.27-rc <= 7. Thanks, Benny --
No, MCP55 actually: $ lspci | grep IDE 00:04.0 IDE interface: nVidia Corporation MCP55 IDE (rev a1) 00:05.0 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3) 00:05.1 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3) --
Right, the commit fixes generic and CK804 while breaking nf2/3. Can you also try the following patch? http://article.gmane.org/gmane.linux.ide/34942/raw -- tejun --
Log looks clean with this patch as well. Benny --
Hm, sadly doesn't look so well: Linux version 2.6.27-rc8 (root@zappa) (gcc version 4.3.1 (Gentoo 4.3.1-r1 p 2 CEST 2008 Command line: auto BOOT_IMAGE=linux ro root=801 netconsole=6665@10.10.0.1/e 5:68:2c:eb loglevel=8 debug KERNEL supported cpus: Intel GenuineIntel AMD AuthenticAMD Centaur CentaurHauls BIOS-provided physical RAM map: BIOS-e820: 0000000000000000 - 000000000009fc00 (usable) BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved) BIOS-e820: 00000000000e8000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 000000003ffb0000 (usable) BIOS-e820: 000000003ffb0000 - 000000003ffc0000 (ACPI data) BIOS-e820: 000000003ffc0000 - 000000003fff0000 (ACPI NVS) BIOS-e820: 000000003fff0000 - 0000000040000000 (reserved) BIOS-e820: 00000000ff7c0000 - 0000000100000000 (reserved) last_pfn = 0x3ffb0 max_arch_pfn = 0x3ffffffff init_memory_mapping 0000000000 - 003fe00000 page 2M 003fe00000 - 003ffb0000 page 4k kernel direct mapping tables up to 3ffb0000 @ 8000-b000 last_map_addr: 3ffb0000 end: 3ffb0000 DMI 2.3 present. ACPI: RSDP 000F8710, 0014 (r0 ACPIAM) ACPI: RSDT 3FFB0000, 0030 (r1 A M I OEMRSDT 8000607 MSFT 97) ACPI: FACP 3FFB0200, 0084 (r2 A M I OEMFACP 8000607 MSFT 97) ACPI: DSDT 3FFB03F0, 3F26 (r1 K8UNF K8UNF201 201 INTL 2002026) ACPI: FACS 3FFC0000, 0040 ACPI: APIC 3FFB0390, 005C (r1 A M I OEMAPIC 8000607 MSFT 97) ACPI: OEMB 3FFC0040, 0056 (r1 A M I AMI_OEM 8000607 MSFT 97) (4 early reservations) ==> bootmem [0000000000 - 003ffb0000] #0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 000000100 #1 [0000200000 - 000058a730] TEXT DATA BSS ==> [0000200000 - 000058a73 #2 [000009fc00 - 0000100000] BIOS reserved ==> [000009fc00 - 000010000 #3 [0000008000 - 0000009000] PGTABLE ==> [0000008000 - 000000900 [ffffe20000000000-ffffe20000dfffff] PMD -> [ffff880001200000-ffff880001fff Zone PFN ranges: DMA 0x00000000 -> 0x00001000 DMA32 0x00001000 -> 0x00100000 ...
What I forgot to mention is that the
ata2: EH pending after 5 tries, giving up
ata2: EH complete
ata2: EH pending after 5 tries, giving up
ata2: EH complete
ata2: EH complete
ata2: EH complete
ata2: EH complete
ata2: EH complete
ata2: EH complete
ata2: EH complete
ata2: EH complete
ata2: EH complete
ata2: EH complete
flood is gone now. Tomorrow morning I will check whether it manages a cold boot after being switched off overnight. -- GPG KeyID EF62FCEF Fingerprint: 13C9 B16B 9844 EC15 CC2E A080 1E69 3FDA EF62 FCEF --
Hmm... strange. Can you please try the attached patch? It's basically the same with a bit more debug information. Thanks. -- tejun
No Problem. I had difficulties to cold boot the machine today, I had to powercycle a lot. Then I applied the patch and it bootet immediately: sata_nv 0000:00:0a.0: version 3.5 ACPI: PCI Interrupt Link [LTID] enabled at IRQ 21 sata_nv 0000:00:0a.0: PCI INT A -> Link[LTID] -> GSI 21 (level, low) -> IRQ 21 sata_nv 0000:00:0a.0: setting latency timer to 64 scsi0 : sata_nv scsi1 : sata_nv ata1: SATA max UDMA/133 cmd 0xf80 ctl 0xf00 bmdma 0xd800 irq 21 ata2: SATA max UDMA/133 cmd 0xe80 ctl 0xe00 bmdma 0xd808 irq 21 ata1: hard resetting link XXX CLASSIFY 01:00:00 ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) ata1.00: ATA-7: SAMSUNG HD753LJ, 1AA01106, max UDMA7 ata1.00: 1465149168 sectors, multi 16: LBA48 NCQ (depth 0/32) ata1.00: configured for UDMA/133 ata1: EH complete ata2: hard resetting link ata2: SATA link down (SStatus 0 SControl 300) ata2: EH complete isa bounce pool size: 16 pages scsi 0:0:0:0: Direct-Access ATA SAMSUNG HD753LJ 1AA0 PQ: 0 ANSI: 5 sd 0:0:0:0: [sda] 1465149168 512-byte hardware sectors (750156 MB) sd 0:0:0:0: [sda] Write Protect is off sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00 sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sd 0:0:0:0: [sda] 1465149168 512-byte hardware sectors (750156 MB) sd 0:0:0:0: [sda] Write Protect is off sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00 sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 sda9 > sd 0:0:0:0: [sda] Attached SCSI disk PNP: No PS/2 controller found. Probing ports directly. serio: i8042 KBD port at 0x60,0x64 irq 1 serio: i8042 AUX port at 0x60,0x64 irq 12 TCP cubic registered NET: Registered protocol family 17 XFS mounting filesystem sda1 Ending clean XFS mount for filesystem: sda1 VFS: Mounted root (xfs filesystem) readonly. Freeing unused kernel memory: 220k freed Then I switched off and smoked a cigarette, booting then lasted a bit longer: ata1: link ...
See, cigarettes are bad for you(r computer) ;-) --
Yes... But how does it know? I still have no /dev/eyes ;-) -- GPG KeyID EF62FCEF Fingerprint: 13C9 B16B 9844 EC15 CC2E A080 1E69 3FDA EF62 FCEF --
It has all the fans for a reason. :-) Eh... Joke aside. I still don't know what's going on here. Before 2.6.26, you always had clean boot, right? -- tejun --
Also, can you please repeat the test several times and see whether there are some patterns? And please also try pre-2.6.26 kernel a few times just to make sure it's not some bad coincidence. Thanks. -- tejun --
Still I have the last suggested patch running and the machine solves to boot cold any time (I am shure meanwhile, turned off the whole sunday it botted this morning and so on - any time). Consistent is this issue telling something about MISSCLASSIFIED: sata_nv 0000:00:0a.0: version 3.5 ACPI: PCI Interrupt Link [LTID] enabled at IRQ 21 sata_nv 0000:00:0a.0: PCI INT A -> Link[LTID] -> GSI 21 (level, low) -> IRQ 21 sata_nv 0000:00:0a.0: setting latency timer to 64 scsi0 : sata_nv scsi1 : sata_nv ata1: SATA max UDMA/133 cmd 0xf80 ctl 0xf00 bmdma 0xd800 irq 21 ata2: SATA max UDMA/133 cmd 0xe80 ctl 0xe00 bmdma 0xd808 irq 21 ata1: hard resetting link ata1: link is slow to respond, please be patient (ready=0) ata1: SRST failed (errno=-16) ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) ata1: link online but device misclassified, retrying ata1: hard resetting link ata1: link is slow to respond, please be patient (ready=0) ata1: SRST failed (errno=-16) ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) ata1: link online but device misclassified, retrying ata1: hard resetting link XXX CLASSIFY 01:00:00 ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300) ata1.00: ATA-7: SAMSUNG HD753LJ, 1AA01106, max UDMA7 ata1.00: 1465149168 sectors, multi 16: LBA48 NCQ (depth 0/32) ata1.00: configured for UDMA/133 ata1: EH complete ata2: hard resetting link ata2: SATA link down (SStatus 0 SControl 300) ata2: EH complete isa bounce pool size: 16 pages scsi 0:0:0:0: Direct-Access ATA SAMSUNG HD753LJ 1AA0 PQ: 0 ANSI: 5 sd 0:0:0:0: [sda] 1465149168 512-byte hardware sectors (750156 MB) sd 0:0:0:0: [sda] Write Protect is off sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00 sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sd 0:0:0:0: [sda] 1465149168 512-byte hardware sectors (750156 MB) sd 0:0:0:0: [sda] Write Protect is off sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00 sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't ...
Hmm... this is proving to be much more difficult than I expected. :-( Can you please try the attached patch? Thanks. -- tejun
Hello! The patch before, I told it always boots but the last two days I had much difficulties to boot. It was hard resetting and waiting a couple of times before bailing out with no mountable root FS. One time it was switched off three hours and the next time overnight, I powercyced a couple of times. If I am the only one experiencing this difficulties (am I the only one with this chipset/revision reporting?), shouldn't we consider this machine... broken? I change SATA cables from time to time, but these seem all to be okay. I mean, if it is really _that_ strange... I fetched 2.6.27 now and tried this patch. A short powercycle, reboot wasn't a problem yesterday, this morning also not, so looks well so far, tihs is how /var/log/messages looks now: Oct 17 07:24:49 zappa sata_nv 0000:00:0a.0: version 3.5 Oct 17 07:24:49 zappa ACPI: PCI Interrupt Link [LTID] enabled at IRQ 21 Oct 17 07:24:49 zappa sata_nv 0000:00:0a.0: PCI INT A -> Link[LTID] -> GSI 21 (level, low) -> IRQ 21 Oct 17 07:24:49 zappa sata_nv 0000:00:0a.0: setting latency timer to 64 Oct 17 07:24:49 zappa scsi0 : sata_nv Oct 17 07:24:49 zappa scsi1 : sata_nv Oct 17 07:24:49 zappa ata1: SATA max UDMA/133 cmd 0xf80 ctl 0xf00 bmdma 0xd800 irq 21 Oct 17 07:24:49 zappa ata2: SATA max UDMA/133 cmd 0xe80 ctl 0xe00 bmdma 0xd808 irq 21 Oct 17 07:24:49 zappa ata1: hard resetting link Oct 17 07:24:49 zappa ata1: SATA link down (SStatus 0 SControl 300) Oct 17 07:24:49 zappa ata1: EH complete Oct 17 07:24:49 zappa ata2: hard resetting link Oct 17 07:24:49 zappa XXX CLASSIFY 01:00:00 Oct 17 07:24:49 zappa ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300) Oct 17 07:24:49 zappa ata2.00: ATA-7: SAMSUNG HD753LJ, 1AA01106, max UDMA7 Oct 17 07:24:49 zappa ata2.00: 1465149168 sectors, multi 16: LBA48 NCQ (depth 0/32) Oct 17 07:24:49 zappa ata2.00: configured for UDMA/133 Oct 17 07:24:49 zappa ata2: EH complete Oct 17 07:24:49 zappa isa bounce pool size: 16 pages Oct 17 07:24:49 zappa scsi 1:0:0:0: Direct-Access ATA ...
Eh... I just bought a used opteron system with nf2/3. I will receive the machine tomorrow. Hopefully, I'll be able to find out what the heck is going on here. Thanks. -- tejun --
Hello Tejun! After my short reply I had 2.6.27 running with [-- Attachment #2: sata_nv-nf2-hrst-debug-take2.patch --], and it has been fine so far. It booted immediately every time I power-cycled the machine. Hot boot, cold boot and reboot all seem to be no problem, and /var/log/messages is no longer flooded either. I just wanted to let you know that, whatever the investigation turns up, on _my_ machine this incarnation is just fine. Regards, Konsti -- GPG KeyID EF62FCEF Fingerprint: 13C9 B16B 9844 EC15 CC2E A080 1E69 3FDA EF62 FCEF --
Great. My test machine just confirmed the fix too (my first purchase was borked so I had to get another one, hence the delay). I'll forward the fix upstream. Thanks a lot. -- tejun --
No Problem at all. If something gets borked - which absolutely is allowed to happen - I have fun to sort this out. Regards, Konsti -- GPG KeyID EF62FCEF Fingerprint: 13C9 B16B 9844 EC15 CC2E A080 1E69 3FDA EF62 FCEF --
I'm experiencing this issue with CD-RWs only. Power cycling allows me to eject
the CD-RW. I'm using an ASUS G50V laptop with kernel 2.6.27-gentoo-r4.
lspci | grep ATA
00:1f.2 SATA controller: Intel Corporation Mobile SATA AHCI Controller (rev 03)
dmesg output:
[ 189.184114] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 189.184160] ata2.00: cmd a0/01:00:00:00:10/00:00:00:00:00/a0 tag 0 dma 4096 in
[ 189.184164] cdb 28 00 00 05 70 74 00 00 02 00 00 00 00 00 00 00
[ 189.184168] res 40/00:03:00:fe:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
[ 189.184176] ata2.00: status: { DRDY }
[ 189.184191] ata2: hard resetting link
[ 194.538122] ata2: link is slow to respond, please be patient (ready=0)
[ 199.185071] ata2: COMRESET failed (errno=-16)
[ 199.185087] ata2: hard resetting link
[ 204.539127] ata2: link is slow to respond, please be patient (ready=0)
[ 209.231125] ata2: COMRESET failed (errno=-16)
[ 209.231153] ata2: hard resetting link
[ 214.585124] ata2: link is slow to respond, please be patient (ready=0)
[ 244.267114] ata2: COMRESET failed (errno=-16)
[ 244.267130] ata2: limiting SATA link speed to 1.5 Gbps
[ 244.267136] ata2: hard resetting link
[ 249.315112] ata2: COMRESET failed (errno=-16)
[ 249.315124] ata2: reset failed, giving up
[ 249.315130] ata2.00: disabled
[ 249.315153] ata2: EH complete
--
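The timestamps in the log above show how long each reset attempt ran before giving up. The sketch below pairs each "hard resetting link" line with the following "COMRESET failed" line on the same port and reports the elapsed time; the helper name `reset_attempts` is hypothetical, and it assumes dmesg lines carry a bracketed timestamp in seconds as in the excerpt.

```python
import re

# Matches "[ 189.184191] ata2: hard resetting link" style port messages.
STAMP = re.compile(r"\[\s*(?P<t>\d+\.\d+)\]\s+(?P<port>ata\d+): (?P<msg>.*)")


def reset_attempts(lines):
    """Return a list of (port, seconds) for each hard reset that ended in 'COMRESET failed'."""
    started = {}
    results = []
    for line in lines:
        m = STAMP.search(line)
        if m is None:
            continue
        t, port, msg = float(m.group("t")), m.group("port"), m.group("msg")
        if msg.startswith("hard resetting link"):
            started[port] = t
        elif msg.startswith("COMRESET failed") and port in started:
            results.append((port, round(t - started.pop(port), 1)))
    return results


# Lines copied from the dmesg excerpt above.
sample = [
    "[ 189.184191] ata2: hard resetting link",
    "[ 194.538122] ata2: link is slow to respond, please be patient (ready=0)",
    "[ 199.185071] ata2: COMRESET failed (errno=-16)",
    "[ 199.185087] ata2: hard resetting link",
    "[ 209.231125] ata2: COMRESET failed (errno=-16)",
    "[ 209.231153] ata2: hard resetting link",
    "[ 244.267114] ata2: COMRESET failed (errno=-16)",
    "[ 244.267136] ata2: hard resetting link",
    "[ 249.315112] ata2: COMRESET failed (errno=-16)",
]
print(reset_attempts(sample))
```

For this excerpt the attempts last roughly 10, 10, 35 and 5 seconds, which looks like a fixed schedule of reset timeouts rather than the device answering late at random.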
It's a different failure on a different controller. Can you please file a bug report on bugzilla.kernel.org and... 1. Reproduce the problem with kernel-2.6.28-rc8. 2. Attach boot and the failure kernel log. 3. Attach the output of "lspci -nn". Thanks. -- tejun --
Yes, with 2.6.25 the system always booted cleanly. I update from time to time, and with some 2.6.26_rcX I started having problems with the system sometimes failing to cold boot. For a long time I suspected a hardware issue, but once I went back to 2.6.25 the problem was gone. Then I updated to 2.6.27_rcX because I was hunting down an nfsv4 error, which I took to be either a bug or a problem in front of the screen. That is when I noticed the cold boot problem again, which I had almost forgotten about in the meantime or considered closed on the way from 2.6.26_rcX over 2.6.26 to 2.6.27_rcX. Konsti -- GPG KeyID EF62FCEF Fingerprint: 13C9 B16B 9844 EC15 CC2E A080 1E69 3FDA EF62 FCEF --
Replying to myself... Indeed, my hard disk was plugged into SATA2 and SATA1 was left empty. I have moved it to SATA1 now. Now:
Sep 25 08:35:29 zappa ata1: SATA max UDMA/133 cmd 0xf80 ctl 0xf00 bmdma 0xd800 irq 21
Sep 25 08:35:29 zappa ata2: SATA max UDMA/133 cmd 0xe80 ctl 0xe00 bmdma 0xd808 irq 21
Sep 25 08:35:29 zappa ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Sep 25 08:35:29 zappa ata1.00: ATA-7: SAMSUNG HD753LJ, 1AA01106, max UDMA7
Sep 25 08:35:29 zappa ata1.00: 1465149168 sectors, multi 16: LBA48 NCQ (depth 0/32)
Sep 25 08:35:29 zappa ata1.00: configured for UDMA/133
Sep 25 08:35:29 zappa ata2: EH pending after 5 tries, giving up
Sep 25 08:35:29 zappa ata2: EH complete
Sep 25 08:35:29 zappa ata2: EH pending after 5 tries, giving up
Sep 25 08:35:29 zappa ata2: EH complete
Kind Regards, Konsti -- GPG KeyID EF62FCEF Fingerprint: 13C9 B16B 9844 EC15 CC2E A080 1E69 3FDA EF62 FCEF --
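The SStatus values printed in these logs (113, 123, 0) are hexadecimal register dumps and can be split into their fields. The sketch below follows the SATA SStatus layout (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8); the function name `decode_sstatus` and the wording of the descriptions are illustrative, not taken from the kernel.

```python
# Field meanings per the SATA SStatus register layout; descriptions are paraphrased.
DET = {
    0x0: "no device detected, no PHY communication",
    0x1: "device detected, PHY communication not established",
    0x3: "device detected, PHY communication established",
    0x4: "PHY offline",
}
SPD = {0x0: "no negotiated speed", 0x1: "1.5 Gbps", 0x2: "3.0 Gbps"}


def decode_sstatus(hex_text):
    """Decode an SStatus value as printed by libata (hex digits without a 0x prefix)."""
    value = int(hex_text, 16)
    det = value & 0xF
    spd = (value >> 4) & 0xF
    ipm = (value >> 8) & 0xF
    return {
        "det": DET.get(det, "reserved"),
        "spd": SPD.get(spd, "reserved"),
        "ipm": ipm,
    }


for text in ("113", "123", "0"):
    print(text, decode_sstatus(text))
```

Decoded this way, 113 and 123 differ only in the speed field, and 0 corresponds to the "SATA link down" lines for the empty port.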
