Notable changes: 1) Link Power Management (LPM) rewrite 2) ATA transport class 3) Start hammering out support for >512-byte sectors Please pull from 'upstream-linus' branch of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev.git upstream-linus to receive the following updates: Documentation/ABI/testing/sysfs-ata | 99 +++++ drivers/ata/Makefile | 2 +- drivers/ata/ahci.c | 3 - drivers/ata/ahci.h | 4 +- drivers/ata/ahci_platform.c | 3 - drivers/ata/ata_generic.c | 54 +++- drivers/ata/ata_piix.c | 34 ++- drivers/ata/libahci.c | 194 +++------ drivers/ata/libata-core.c | 389 ++++++++---------- drivers/ata/libata-eh.c | 254 ++++++++++-- drivers/ata/libata-pmp.c | 66 +++- drivers/ata/libata-scsi.c | 204 +++++----- drivers/ata/libata-sff.c | 14 +- drivers/ata/libata-transport.c | 774 +++++++++++++++++++++++++++++++++++ drivers/ata/libata-transport.h | 18 + drivers/ata/libata.h | 21 +- drivers/ata/pata_bf54x.c | 6 +- drivers/ata/pata_cmd640.c | 12 + drivers/ata/pata_pdc202xx_old.c | 23 + drivers/ata/pata_samsung_cf.c | 2 +- drivers/ata/pata_scc.c | 4 +- drivers/ata/pata_sil680.c | 16 +- drivers/ata/pata_sl82c105.c | 11 + drivers/ata/sata_fsl.c | 27 +- drivers/ata/sata_inic162x.c | 2 +- drivers/ata/sata_mv.c | 25 +- drivers/ata/sata_nv.c | 57 +-- drivers/ata/sata_sil24.c | 14 +- drivers/ata/sata_via.c | 2 +- include/linux/ata.h | 46 ++- include/linux/libata.h | 65 ++- 31 files changed, 1815 insertions(+), 630 deletions(-) create mode 100644 Documentation/ABI/testing/sysfs-ata create mode 100644 ...
Hrm, this is causing issues for me... When I resume from suspend on my thinkpad X201s I get: [ 2454.478673] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [ 2454.487736] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded [ 2454.487742] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE filtered out [repeated a few times] [ 2454.492688] ata1.00: configured for UDMA/100 [ 2454.506690] ata1.00: configured for UDMA/100 [ 2454.506696] ata1: EH complete [...] [ 2455.097874] ata1.00: configured for UDMA/100 [ 2455.097882] ata1: EH complete [...] [ 2455.386756] ata1.00: failed to enable DIPM, Emask 0x100 [ 2455.386768] ata1: hard resetting link [...] [ 2455.847925] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [ 2455.849372] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded [ 2455.849380] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out [ditto] [ 2455.854171] ata1.00: configured for UDMA/100 [ 2455.854480] ata1.00: failed to enable DIPM, Emask 0x100 [ 2455.854484] ata1: disabling LPM on the link [ 2455.854489] ata1: limiting SATA link speed to 1.5 Gbps [ 2455.854494] ata1.00: limiting speed to UDMA/100:PIO3 Tejun, any ideas how I can debug this? regards, Kyle --
Hello, Hmm... DIPM commands are failing with AC_ERR_OTHER. Other than DIPM not being configured and speed capped at 1.5Gbps, the machine works fine afterwards, right? Are you up for applying debug patches? Thanks. -- tejun --
Yup, it's chugging along happily, though I notice kernel builds are taking a minute or two longer (but I don't bother to time them so it's purely subjective.) Be happy to apply any debug patches, but I plan on replacing the drive with a faster one at some point in the next few weeks. --Kyle --
Can you please apply the following patch and report the resulting
kernel log? You're on ahci, right?
diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
index ebc08d6..b1c39db 100644
--- a/drivers/ata/libahci.c
+++ b/drivers/ata/libahci.c
@@ -1560,6 +1560,10 @@ static void ahci_error_intr(struct ata_port *ap, u32 irq_stat)
}
/* okay, let's hand over to EH */
+ if (active_qc && ata_tag_internal(active_qc->tag))
+ ata_dev_printk(active_qc->dev, KERN_WARNING,
+ "ahci: internal command failure, irq_stat=0x%x\n",
+ irq_stat);
if (irq_stat & PORT_IRQ_FREEZE)
ata_port_freeze(ap);
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 7f77c67..7cf236b 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -1668,6 +1668,10 @@ unsigned ata_exec_internal_sg(struct ata_device *dev,
/* perform minimal error analysis */
if (qc->flags & ATA_QCFLAG_FAILED) {
+ ata_dev_printk(dev, KERN_WARNING,
+ "internal command failure: stat=0x%x ehi_desc=\"%s\"\n",
+ qc->tf.command, dev->link->eh_info.desc);
+
if (qc->result_tf.command & (ATA_ERR | ATA_DF))
qc->err_mask |= AC_ERR_DEV;
--
Ok, running it, I haven't seen the issue across three suspends... Will update when it inevitably happens. --Kyle --
[10272.544858] ata1.00: ahci: internal command failure, irq_stat=0x400001 [10272.544943] ata1.00: internal command failure: stat=0xef ehi_desc="irq_stat 0x00400001, PHY RDY changed" [10272.544951] ata1.00: failed to enable DIPM, Emask 0x100 [10272.544960] ata1: hard resetting link [10273.004718] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [10273.006108] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded [10273.006116] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out [10273.006410] ata1.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES) succeeded [10273.006417] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out [10273.009337] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded [10273.009344] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out [10273.009611] ata1.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES) succeeded [10273.009631] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out [10273.010945] ata1.00: configured for UDMA/100 [10273.011206] ata1.00: ahci: internal command failure, irq_stat=0x400001 [10273.011241] ata1.00: internal command failure: stat=0xef ehi_desc="irq_stat 0x00400001, PHY RDY changed" [10273.011247] ata1.00: failed to enable DIPM, Emask 0x100 [10273.011250] ata1: disabling LPM on the link [10273.011255] ata1: limiting SATA link speed to 1.5 Gbps [10273.011259] ata1.00: limiting speed to UDMA/100:PIO3 regards, Kyle --
Hello, Hmmm, so it's a timing problem. For some reason, a PHY event is being reported after reset is complete which unfortunately gets reported while DIPM command is in progress. Hmmm... what's broken is that the PHY event happens after reset is complete. Can you please attach the output of "lspci -nn" and "hdparm -I /dev/sda"? Looks a lot like a timing issue. Maybe we'll have to add a delay somewhere. :-( -- tejun --
Sure, np: 00:1f.2 SATA controller [0106]: Intel Corporation 5 Series/3400 Series Chipset 6 port SATA AHCI Controller [8086:3b2f] (rev 06) (prog-if 01 [AHCI 1.0]) Subsystem: Lenovo Device [17aa:2168] Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0 Interrupt: pin B routed to IRQ 42 Region 0: I/O ports at 1860 [size=8] Region 1: I/O ports at 1814 [size=4] Region 2: I/O ports at 1818 [size=8] Region 3: I/O ports at 1810 [size=4] Region 4: I/O ports at 1840 [size=32] Region 5: Memory at f2727000 (32-bit, non-prefetchable) [size=2K] Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit- Address: fee0100c Data: 4161 Capabilities: [70] Power Management version 3 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-) Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME- Capabilities: [a8] SATA HBA v1.0 BAR4 Offset=00000004 Capabilities: [b0] PCI Advanced Features AFCap: TP+ FLR+ AFCtrl: FLR- AFStatus: TP- Kernel driver in use: ahci /dev/sda: ATA device, with non-removable media Model Number: HITACHI HTS725032A9A364 Serial Number: 100407PCKC04VPHVB55K Firmware Revision: PC3ZC70F Transport: Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6; Revision: ATA8-AST T13 Project D1697 Revision 0b Standards: Used: unknown (minor revision code 0x0028) Supported: 8 7 6 5 Likely used: 8 Configuration: Logical max current cylinders 16383 16383 heads 16 16 sectors/track 63 63 -- CHS current addressable sectors: 16514064 LBA user addressable sectors: 268435455 LBA48 user addressable sectors: 625142448 Logical Sector size: 512 bytes Physical Sector size: 512 bytes device size with M = 1024*1024: 305245 MBytes device size ...
Hello, again.
Nothing peculiar. Hmm... okay, I think I've found what went wrong.
Can you please try the following patch and see whether the problem
goes away?
diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 5e59050..17a6378 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -3275,6 +3275,7 @@ static int ata_eh_set_lpm(struct ata_link *link, enum ata_lpm_policy policy,
struct ata_port *ap = ata_is_host_link(link) ? link->ap : NULL;
struct ata_eh_context *ehc = &link->eh_context;
struct ata_device *dev, *link_dev = NULL, *lpm_dev = NULL;
+ enum ata_lpm_policy old_policy = link->lpm_policy;
unsigned int hints = ATA_LPM_EMPTY | ATA_LPM_HIPM;
unsigned int err_mask;
int rc;
@@ -3338,6 +3339,14 @@ static int ata_eh_set_lpm(struct ata_link *link, enum ata_lpm_policy policy,
goto fail;
}
+ /*
+ * Low level driver acked the transition. Issue DIPM command
+ * with the new policy set.
+ */
+ link->lpm_policy = policy;
+ if (ap && ap->slave_link)
+ ap->slave_link->lpm_policy = policy;
+
/* host config updated, enable DIPM if transitioning to MIN_POWER */
ata_for_each_dev(dev, link, ENABLED) {
if (policy == ATA_LPM_MIN_POWER && ata_id_has_dipm(dev->id)) {
@@ -3353,12 +3362,14 @@ static int ata_eh_set_lpm(struct ata_link *link, enum ata_lpm_policy policy,
}
}
- link->lpm_policy = policy;
- if (ap && ap->slave_link)
- ap->slave_link->lpm_policy = policy;
return 0;
fail:
+ /* restore the old policy */
+ link->lpm_policy = old_policy;
+ if (ap && ap->slave_link)
+ ap->slave_link->lpm_policy = old_policy;
+
/* if no device or only one more chance is left, disable LPM */
if (!dev || ehc->tries[dev->devno] <= 2) {
ata_link_printk(link, KERN_WARNING,
--
Thanks, I'll let you know when/if it drops to 1.5gbps. --Kyle --
Looks to stay at 3.0 Gbps now... but gives up on EH after 5 retries? [ 4342.048572] ata1: hard resetting link [ 4342.507282] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [ 4342.510118] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded [ 4342.510127] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out [ 4342.510413] ata1.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES) succeeded [ 4342.510421] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out [ 4342.514766] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded [ 4342.514774] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out [ 4342.515024] ata1.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES) succeeded [ 4342.515031] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out [ 4342.516697] ata1.00: configured for UDMA/100 [ 4342.516977] ata1: hard resetting link [ 4342.976018] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [ 4342.978892] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded [ 4342.978900] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out [ 4342.979186] ata1.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES) succeeded [ 4342.979193] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out [ 4342.983577] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded [ 4342.983585] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out [ 4342.983865] ata1.00: ACPI cmd ef/5f:00:00:00:00:a0 (SET FEATURES) succeeded [ 4342.983872] ata1.00: ACPI cmd ef/10:03:00:00:00:a0 (SET FEATURES) filtered out [ 4342.984989] ata1.00: configured for UDMA/100 [ 4342.985408] ata1: hard resetting link [ 4343.445045] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [ 4343.447845] ata1.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded [ 4343.447856] ata1.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out [ ...
Hmmm... something is setting EH_PENDING while EH is in progress. Let's see what's doing it. Can you please apply the following patch on top of the previous one and report the kernel log? Thanks. diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c index 5e59050..8f0d854 100644 --- a/drivers/ata/libata-eh.c +++ b/drivers/ata/libata-eh.c @@ -896,6 +896,10 @@ static void ata_eh_set_pending(struct ata_port *ap, int fastdrain) ap->pflags |= ATA_PFLAG_EH_PENDING; + ata_port_printk(ap, KERN_WARNING, "XXX SET EH PENDING, fastdrain=%d\n", + fastdrain); + dump_stack(); + if (!fastdrain) return; --
This gets triggered a bunch on boot-up: [ 36.981370] ata1: XXX SET EH PENDING, fastdrain=0 [ 36.981380] Pid: 0, comm: swapper Not tainted 2.6.37-0.rc4.git1.2.fc15.x86_64 #1 [ 36.981386] Call Trace: [ 36.981393] <IRQ> [<ffffffff8131d3df>] ata_eh_set_pending+0x3e/0x85 [ 36.981407] [<ffffffff8131db36>] ata_do_link_abort+0x44/0x94 [ 36.981412] [<ffffffff8131dbac>] ata_port_abort+0x10/0x12 [ 36.981421] [<ffffffff8132b0ab>] ahci_interrupt+0x467/0x5a1 [ 36.981428] [<ffffffff810afbc1>] ? check_for_new_grace_period.clone.21+0x8b/0x97 [ 36.981434] [<ffffffff8106fb68>] ? sched_clock_cpu+0x42/0xc6 [ 36.981440] [<ffffffff810abbfe>] handle_IRQ_event+0x58/0x122 [ 36.981447] [<ffffffff81024d64>] ? apic_write+0x16/0x18 [ 36.981452] [<ffffffff810adf27>] handle_edge_irq+0xfa/0x145 [ 36.981458] [<ffffffff8100d2dd>] handle_irq+0x88/0x90 [ 36.981463] [<ffffffff8147d508>] do_IRQ+0x58/0xb0 [ 36.981468] [<ffffffff81477513>] ret_from_intr+0x0/0x11 [ 36.981470] <EOI> [<ffffffff81274eab>] ? intel_idle+0x115/0x13d [ 36.981480] [<ffffffff81274e8a>] ? intel_idle+0xf4/0x13d [ 36.981485] [<ffffffff8139f894>] cpuidle_idle_call+0x9e/0x104 [ 36.981489] [<ffffffff8100932e>] cpu_idle+0xae/0x105 [ 36.981493] [<ffffffff8145f2ce>] rest_init+0x72/0x74 [ 36.981498] [<ffffffff81b53c31>] start_kernel+0x3e0/0x3eb [ 36.981503] [<ffffffff81b532cb>] x86_64_start_reservations+0xb6/0xba [ 36.981507] [<ffffffff81b533d5>] x86_64_start_kernel+0x106/0x115 [ 37.200683] ata1: XXX SET EH PENDING, fastdrain=1 [ 37.200692] Pid: 1465, comm: ck-history Not tainted 2.6.37-0.rc4.git1.2.fc15.x86_64 #1 [ 37.200696] Call Trace: [ 37.200699] <IRQ> [<ffffffff8131d3df>] ata_eh_set_pending+0x3e/0x85 [ 37.200715] [<ffffffff8131dae5>] ata_port_schedule_eh+0x43/0x50 [ 37.200720] [<ffffffff813150d7>] ata_qc_complete+0xd2/0x152 [ 37.200723] [<ffffffff8131521d>] ata_qc_complete_multiple+0xc6/0xcb [ 37.200727] [<ffffffff8132b141>] ahci_interrupt+0x4fd/0x5a1 [ ...
Hello, Kyle. Yeah, that's expected. Can you please do the usual testing routine and trigger the EH giving up message and then attach the full kernel log? Thanks. -- tejun --
Ahh, ok, sure. Sorry 'bout that. --Kyle --
OK, attached is a couple days worth of suspend/resume, hopefully that turns up what you're looking for. regards, Kyle
Hello,
Heh, yeah, actually it did. Can you please apply the following patch
on top and see whether the problem goes away?
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 7f77c67..f23d6d4 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4807,9 +4807,6 @@ static void ata_verify_xfer(struct ata_queued_cmd *qc)
{
struct ata_device *dev = qc->dev;
- if (ata_tag_internal(qc->tag))
- return;
-
if (ata_is_nodata(qc->tf.protocol))
return;
@@ -4858,14 +4855,23 @@ void ata_qc_complete(struct ata_queued_cmd *qc)
if (unlikely(qc->err_mask))
qc->flags |= ATA_QCFLAG_FAILED;
- if (unlikely(qc->flags & ATA_QCFLAG_FAILED)) {
- /* always fill result TF for failed qc */
+ /*
+ * Finish internal commands without any further processing
+ * and always with the result TF filled.
+ */
+ if (unlikely(ata_tag_internal(qc->tag))) {
fill_result_tf(qc);
+ __ata_qc_complete(qc);
+ return;
+ }
- if (!ata_tag_internal(qc->tag))
- ata_qc_schedule_eh(qc);
- else
- __ata_qc_complete(qc);
+ /*
+ * Non-internal qc has failed. Fill the result TF and
+ * summon EH.
+ */
+ if (unlikely(qc->flags & ATA_QCFLAG_FAILED)) {
+ fill_result_tf(qc);
+ ata_qc_schedule_eh(qc);
return;
}
--
Hi All, I have two dell laptops 1015. For studying Linux kernel, I want to let one run debugging Linux as client, the other as host. The two laptops don't have serial port, just have LAN and usb ports. How can I configure them to get early kernel print message? I ever asked a lot of people, but don't get answers. Please give me some practical advices, thanks in advance. stanley --
Take a look at <Documentation/networking/netconsole.txt> in the kernel sources. -- Regards/Gruss, Boris. --
Hi Borislav, I will check your recommend document. Thank you so much. Best wishes, Stanley -----邮件原件----- 发件人: Borislav Petkov [mailto:bp@alien8.de] 发送时间: 2010年12月2日 21:17 收件人: Stanley Gan 抄送: 'LKML' 主题: Re: debug kernel using two laptops without serial port Take a look at <Documentation/networking/netconsole.txt> in the kernel sources. -- Regards/Gruss, Boris. --
