Re: [PATCH] SATA / AHCI: Do not play with the link PM during suspend to RAM

Previous thread: Re: [PATCH] netfilter: add CHECKSUM target by Patrick McHardy on Friday, July 9, 2010 - 8:17 am. (5 messages)

Next thread: [PATCH 1/2] Add trace events to mmap and brk by Eric B Munson on Friday, July 9, 2010 - 8:53 am. (6 messages)
From: Stephan Diestelhorst
Date: Friday, July 9, 2010 - 8:50 am

Hi,
  I have n issue with suepnd to RAM and I/O load on a disk. Symptoms
are that the disk does not respond to requests when woken up, producing
only I/O errors on all tested kernels (newest 2.6.35-rc4 (Ubuntu
mainline PPA build)):

[ 1719.580169] sd 0:0:0:0: [sda] Unhandled error code
[ 1719.580174] sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 1719.580178] sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 0f 51 e7 88 00 00 b0 00
[ 1719.580186] end_request: I/O error, dev sda, sector 257025928
[ 1719.580798] Aborting journal on device dm-1-8.
[ 1719.580912] EXT4-fs error (device dm-1) in ext4_reserve_inode_write: Journal has aborted
[ 1719.580959] EXT4-fs (dm-1): Remounting filesystem read-only
[ 1719.581004] sd 0:0:0:0: [sda] Unhandled error code
[ 1719.581007] sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 1719.581010] sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 0f 51 a1 88 00 00 08 00
[ 1719.581016] end_request: I/O error, dev sda, sector 257008008
[ 1719.581026] Buffer I/O error on device dm-1, logical block 2129920
[ 1719.581027] lost page write due to I/O error on dm-1
[ 1719.581149] 
[ 1719.581214] sd 0:0:0:0: [sda] Unhandled error code
[ 1719.581217] sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 1719.581220] sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 0e 4d a1 88 00 00 08 00
[ 1719.581227] end_request: I/O error, dev sda, sector 239968648
[ 1719.581254] JBD2: I/O error detected when updating journal superblock for dm-1-8.
[ 1719.581268] journal commit I/O error

This can be triggered most reliably with multiple "direct" writes to
disk, I create the load with the attached script. If the issue is
triggered, suspend (through pm-suspend) takes very long.

IMHO the interesting log output during suspend is:
[ 1668.150125] Suspending console(s) (use no_console_suspend to debug)
[ 1668.150460] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[ 1668.174958] sd 0:0:0:0: [sda] Stopping disk
[ 1668.198045] ACPI handle has ...
From: Stephan Diestelhorst
Date: Friday, July 9, 2010 - 2:47 pm

Almighty google suggested to try "pci=nomsi", which seems to have
cured the issue for me for now. Is that plausible? I'll keep this
under observation.

Thanks,
  Stephan
From: Rafael J. Wysocki
Date: Friday, July 9, 2010 - 2:53 pm

Hmm.  How does your /proc/interrupts look like?

Also, do you have a link to this "Google suggestion"?

Rafael
--

From: Stephan Diestelhorst
Date: Friday, July 9, 2010 - 4:04 pm

This has been yet another red herring. After trying out the kernel
option three times with two different kernels, it failed yet again
with the same symptoms. 

I have attached /proc/interrupts for 2.6.35-rc4, once with pci=nomsi

It was some german forum, a guy with completely different HW, but the
same symptom. I thought trying out the option wouldn't hurt.

Maybe it came for example from http://lkml.org/lkml/2008/12/20/3
originally.

Stephan
From: Rafael J. Wysocki
Date: Friday, July 9, 2010 - 5:06 pm

I have a box where this problem is kind of reproducible, but it happens _very_
rarely.  Also I can't reproduce it on demand running suspend-resume in a tight
loop.  Are you able to reproduce it more regurarly?

Also, what kind of disk do you use?

Rafael
--

From: Stephan Diestelhorst
Date: Friday, July 9, 2010 - 11:50 pm

For me it is much more reproducible. If I run multiple direct writing
dd-s to the disk in question I trigger it rather reliably (~75% or
higher). See the attached script from an earlier email.

It is a Samsung HM321HI in a Samsung Eikee R525 notebook, please also 
see my smartctl -a log, attached earlier.

Interesting, I have a similar symptom on one of my home servers,
which has a *Samsung* SpinPoint F1 and it went away with different
disks. So maybe these disks are either faulty themselves or they
trigger the issue more often?

I also have a LVM on top of LUKS on the disk. So the I/O will also
add some computational overhead for encryption.

Stephan
--

From: Tejun Heo
Date: Saturday, July 10, 2010 - 3:03 am

Can you please try the following git tree?

  git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git libata-irq-expect

Thanks.

-- 
tejun
--

From: Rafael J. Wysocki
Date: Saturday, July 10, 2010 - 6:45 am

Well, for now I got this:

[   36.833075] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[   36.833085] ata1.00: failed command: SMART
[   36.833099] ata1.00: cmd b0/d5:01:06:4f:c2/00:00:00:00:00/00 tag 0 pio 512 in
[   36.833101]          res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[   36.833107] ata1.00: status: { DRDY }
[   36.833118] ata1: hard resetting link
[   37.316053] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   37.316840] ata1.00: configured for UDMA/133
[   37.316888] ata1: EH complete

during initialization.  Apart from this it seems to work fine.

But in fact I'll only be able to say it helps if it survives a week-or-so
without suspend failure.

Rafael
--


That didn't help, but the appended patch fixes the problem for me.

Thanks,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: SATA / AHCI: Do not play with the link PM during suspend to RAM

My Acer Ferrari One occasionally loses communication with the disk
(which in fact is an Intel SSD) during suspend to RAM.  The symptom
is that the IDENTIFY command times out during suspend and the device
is dropped by the kernel, so it is not available during resume and
the system is unuseable as a result.  The failure is not readily
reproducible, although it happens once every several suspends and
it always happens after the disk has been shut down by the SCSI
layer's suspend routine.

I was able to track this issue down to the link PM manipulations
carried out by ata_host_suspend(), which probably means that the
SSD's firmware is not implemented correctly.  However, the AHCI
driver, which is used on the affected box, doesn't really need to do
anything with the link PM during suspend to RAM, because the whole
controller is going to be put into D3 by ata_pci_device_do_suspend()
immediately and it will undergo full reset during the subsequent
resume anyway.  For this reason, make the AHCI driver avoid calling
ata_host_suspend() during suspend to RAM which fixes the problem and
makes sense as a general optimization.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/ata/ahci.c |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

Index: linux-2.6/drivers/ata/ahci.c
===================================================================
--- linux-2.6.orig/drivers/ata/ahci.c
+++ linux-2.6/drivers/ata/ahci.c
@@ -595,6 +595,7 @@ static int ahci_pci_device_suspend(struc
 	struct ahci_host_priv *hpriv = host->private_data;
 	void __iomem *mmio = hpriv->mmio;
 	u32 ctl;
+	int rc = 0;
 
 	if (mesg.event & PM_EVENT_SUSPEND &&
 	    hpriv->flags & AHCI_HFLAG_NO_SUSPEND) {
@@ -614,7 +615,15 @@ static int ahci_pci_device_suspend(struc
 		readl(mmio + ...
From: Tejun Heo
Date: Friday, July 30, 2010 - 7:18 am

Hello, Rafael.

Sorry about the delay.  There was a tiny crisis here and the whole
link pm seems to need a lot more work than I originally expected.  I'm
working on it now.  I'll probably have something for you to test in a
few days.

Thanks.

-- 
tejun
--

From: Tejun Heo
Date: Thursday, August 5, 2010 - 9:08 am

Hello, Rafael.

Can you please try the following patch and see whether the problem
goes away?

Thanks.

 drivers/ata/ahci.c          |    3
 drivers/ata/ahci.h          |    1
 drivers/ata/ahci_platform.c |    3
 drivers/ata/ata_piix.c      |   24 +++
 drivers/ata/libahci.c       |  161 +++++++-------------------
 drivers/ata/libata-core.c   |  269 ++++++++++----------------------------------
 drivers/ata/libata-eh.c     |  176 +++++++++++++++++++++++++---
 drivers/ata/libata-pmp.c    |   49 +++++++-
 drivers/ata/libata-scsi.c   |   74 ++++--------
 drivers/ata/libata.h        |   12 +
 include/linux/libata.h      |   40 +++---
 11 files changed, 393 insertions(+), 419 deletions(-)

diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index f252253..cfdc22b 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -1190,9 +1190,6 @@ static int ahci_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 		ata_port_pbar_desc(ap, AHCI_PCI_BAR,
 				   0x100 + ap->port_no * 0x80, "port");

-		/* set initial link pm policy */
-		ap->pm_policy = NOT_AVAILABLE;
-
 		/* set enclosure management message type */
 		if (ap->flags & ATA_FLAG_EM)
 			ap->em_message_type = hpriv->em_msg_type;
diff --git a/drivers/ata/ahci.h b/drivers/ata/ahci.h
index 7113c57..6d07948 100644
--- a/drivers/ata/ahci.h
+++ b/drivers/ata/ahci.h
@@ -201,7 +201,6 @@ enum {
 	AHCI_HFLAG_MV_PATA		= (1 << 4), /* PATA port */
 	AHCI_HFLAG_NO_MSI		= (1 << 5), /* no PCI MSI */
 	AHCI_HFLAG_NO_PMP		= (1 << 6), /* no PMP */
-	AHCI_HFLAG_NO_HOTPLUG		= (1 << 7), /* ignore PxSERR.DIAG.N */
 	AHCI_HFLAG_SECT255		= (1 << 8), /* max 255 sectors */
 	AHCI_HFLAG_YES_NCQ		= (1 << 9), /* force NCQ cap on */
 	AHCI_HFLAG_NO_SUSPEND		= (1 << 10), /* don't suspend */
diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
index 5e11b16..0f69afe 100644
--- a/drivers/ata/ahci_platform.c
+++ b/drivers/ata/ahci_platform.c
@@ -120,9 +120,6 @@ static int __init ahci_probe(struct platform_device ...
From: Rafael J. Wysocki
Date: Thursday, August 5, 2010 - 12:58 pm

I'm going to LinuxCon shortly and I'm afraid I won't be able to test it until
I get back home.  However, it seems that Stephan could reproduce the issue
more easily, so parhaps he'll be able to test it earlier.

Thanks,

--

From: Stephan Diestelhorst
Date: Thursday, August 5, 2010 - 11:30 pm

Hi Tejun,

<snip>

to which revision does the patch apply? I didn't get it to apply
cleanly to Linus' kernel HEAD or the 2.6.34 stable tag. Maybe I am
missing something, since I am a git-n00b (use Mercurial all the time)?

Thanks,
  Stephan
-- 
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst@amd.com, Tel. +49 (0)351 448 356 719

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--

From: Tejun Heo
Date: Friday, August 6, 2010 - 12:06 am

Hello,


It applies cleanly to v2.6.35.

Thanks.

-- 
tejun
--

From: Stephan Diestelhorst
Date: Friday, August 6, 2010 - 2:04 am

HI,


Arrrgh. My "great" company Exchange mail server thought it was a good
idea to mess with the white-space of the mail. That's why the patch did
not apply. Compiling and testing now, sorry.

Stephan

-- 
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst@amd.com, Tel. +49 (0)351 448 356 719

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--

From: Stephan Diestelhorst
Date: Tuesday, August 17, 2010 - 12:51 am

Hi Tejun,


I've finally managed to get this to compile and test. (Hit a bug with
Debian's make-kpkg and other nuisances...)

The problem is still there. On some resumes I get the dreadful dead
disk again:

end_request: I/O error , dev sda sector ...
sd 0:0:0:0: [sda] Unhandled error code
sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 0:0:0:0: [sda] CDB: Read(10): 28 00 0e a4 77 a8 0 00 08 00
(many of those)

Can't access /var/log/messages right now, due to broken I/O. Will try
to trigger it again and check for the qc timeout messages..

Stephan

-- 
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst@amd.com, Tel. +49 (0)351 448 356 719

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--

From: Tejun Heo
Date: Tuesday, August 17, 2010 - 1:08 am

Hello,


Yeah, it would great to have the log.  So, it seems like the hardware
is actually buggy then.  :-(

Thanks.

-- 
tejun
--

From: Stephan Diestelhorst
Date: Tuesday, August 17, 2010 - 2:32 am

Indeed. Like I said, I have similar issues on a another Samsung HDD in
an AMD system. I have not yet got around to try the fix there, but I
suspect it is the same thing.

I have attached the full /var/log/messages and /var/log/kern.log with
multiple suspend-to-ram runs and the last one failing.

Would it make sense to add Rafael's workaround upstream, maybe enabling
it only for particular platforms / HDDs / with a parameter?

Thanks,
  Stephan
-- 
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst@amd.com, Tel. +49 (0)351 448 356 719

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632
From: Tejun Heo
Date: Tuesday, August 17, 2010 - 3:15 am

Hello,


Hmm... are you sure the patch is applied?  There's no debug message

Yeah, maybe.  The problem is that I'm a bit reluctant to do that for
all cases as it may cause other obscure failures and we don't know
whether the problem is controller or device specific at this point,
so...

Thanks.

-- 
tejun
--

From: Stephan Diestelhorst
Date: Tuesday, August 17, 2010 - 3:29 am

Hi,


...

Let me get back to you. :-/

Stephan

-- 
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst@amd.com, Tel. +49 (0)351 448 356 719

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--

From: Stephan Diestelhorst
Date: Tuesday, August 17, 2010 - 3:51 am

I *think* I have applied the patch correctly. Please find a copy of
"git show" in the build directory attached. This should be the right
thing, shouldn't it?

Maybe I forgot to speify a particular debug option / verbosity?

I've also confirmed that the "XXX ahci_set_ipm" is present in
libahci.ko. So either I've screwed up badly when compiling the initrd,
the code is not executed or the printout does not make it into any
logfile anymore.

Stephan
-- 
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst@amd.com, Tel. +49 (0)351 448 356 719

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632
From: Tejun Heo
Date: Tuesday, August 17, 2010 - 8:04 am

Hello,


Yeah, looks right to me but if you have enabled IPM, there gotta be

Hmmm... all messages are at KERN_INFO level and they don't have any

Yeah, that's weird.  You're enabling IPM, right?

Thanks.

-- 
tejun
--

From: Stephan Diestelhorst
Date: Tuesday, August 17, 2010 - 2:28 pm

Hi,


Erm... Honestly, I have no clue. What is IPM? How do I enable it? This is a
Kubuntu Lucid 10.04 distribution, and I have not touched too much. In
particular, the kernels have been from upstream git, just with the Ubuntu config
copied over.

Maybe it is just not enabled? I am guessing that IPM might be IDE power
management? Or intelligent, integrated? Google turns up this email thread
as one of the first hits and nothing else conclusive.

Thanks,
  Stephan
--

From: Tejun Heo
Date: Tuesday, August 17, 2010 - 11:12 pm

Hello,


It's interface power management, also called link power management.
You can check whether it's enabled by

$ cat /sys/class/scsi_host/host0/link_power_management_policy

If it says max_performance, it's disabled.  If it says anything else,
it's enabled.

Thanks.

-- 
tejun
--

From: Stephan Diestelhorst
Date: Thursday, August 19, 2010 - 9:23 am

Hi Tejun,


It says "max_performance", I have not touched anyhting. So it has been
like that all the time. Would this explain why your patch did not show
the debug printout?

Thanks,
  Stephan
-- 
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst@amd.com, Tel. +49 (0)351 448 356 719

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--

From: Tejun Heo
Date: Monday, August 23, 2010 - 5:03 am

Hello, sorry about the delay.


Hmm... okay.  Yeah, if you haven't been using IPM at all, there won't
be any debug messages but at the same time the posted patch should
have had the same effect as Rafael's patch as IPM path isn't traveled
at all.  Can you please check the followings?

* You're actually running the correct patched kernel and modules.  It
  probably is a good idea to add a printk message.  ie. Apply the
  patch and add a printk() in ata_host_request_pm() in libata-core.c
  and make sure the debug messages appears.

* Rafael's patch actually fixes the problem.  If you haven't been
  using IPM at all, Rafael's patch and mine should behave exactly the
  same (ie. no IPM operation at all during suspend/resume).  It could
  be that you're seeing a different issue.

Rafael, can you please test my patch and see how your case behaves?

Thanks.

-- 
tejun
--

From: Rafael J. Wysocki
Date: Monday, August 23, 2010 - 11:58 am

From: Tejun Heo
Date: Tuesday, August 24, 2010 - 12:37 am

Yeap, that one.  I can prep a test git branch if necessary.

Thanks.

-- 
tejun
--

From: Rafael J. Wysocki
Date: Tuesday, August 24, 2010 - 1:39 pm

No need to, but it's going to take a few days to verify on my box.

Thanks,
Rafael
--

From: Rafael J. Wysocki
Date: Thursday, August 26, 2010 - 4:09 pm

Well, no luck.  I was able to reproduce the issue on my box with this patch
applied on top of 2.6.32-rc2.

Which probably means that the link power management is not really involved
here and seems to turn up this statement:

rc = ata_host_request_pm(host, mesg, 0, ATA_EHI_QUIET, 1);

in ata_host_suspend() as the culprit.

Does it make sense?

Thanks,
Rafael
--

From: Rafael J. Wysocki
Date: Thursday, August 26, 2010 - 4:46 pm

Thanks,
Rafael
--

From: Tejun Heo
Date: Thursday, September 2, 2010 - 2:06 am

Hello, Rafael.


So, LPM doesn't have anything to do with the problem.  The fact that
this only happens on specific machines is strange.  Maybe the BIOS is
doing something fishy to the controller on disk spindown during
suspend.  Can the BIOS tell that the system is going for suspend by
the time sd suspend is called?  I'll prep another patch which will
make EH skip certain steps and ignore failures during suspend, which
basically mimics what your patch does but in safer way.

Thanks.

-- 
tejun
--

From: Tejun Heo
Date: Thursday, September 2, 2010 - 3:02 am

For some mysterious reason, certain hardware reacts badly to usual EH
actions while the system is going for suspend.  As the devices won't
be needed until the system is resumed, ask EH to skip usual autopsy
and recovery and proceed directly to suspend.

Signed-off-by: Tejun Heo <tj@kernel.org>
---
Can you please verify this one?

Thanks.

 drivers/ata/libata-core.c |   14 +++++++++++++-
 drivers/ata/libata-eh.c   |    4 ++++
 include/linux/libata.h    |    1 +
 3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 9ceb493..92cd5f3 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -5427,6 +5427,7 @@ static int ata_host_request_pm(struct ata_host *host, pm_message_t mesg,
  */
 int ata_host_suspend(struct ata_host *host, pm_message_t mesg)
 {
+	unsigned int ehi_flags = ATA_EHI_QUIET;
 	int rc;

 	/*
@@ -5435,7 +5436,18 @@ int ata_host_suspend(struct ata_host *host, pm_message_t mesg)
 	 */
 	ata_lpm_enable(host);

-	rc = ata_host_request_pm(host, mesg, 0, ATA_EHI_QUIET, 1);
+	/*
+	 * On some hardware, device fails to respond after spun down
+	 * for suspend.  As the device won't be used before being
+	 * resumed, we don't need to touch the device.  Ask EH to skip
+	 * the usual stuff and proceed directly to suspend.
+	 *
+	 * http://thread.gmane.org/gmane.linux.ide/46764
+	 */
+	if (mesg.event == PM_EVENT_SUSPEND)
+		ehi_flags |= ATA_EHI_NO_AUTOPSY | ATA_EHI_NO_RECOVERY;
+
+	rc = ata_host_request_pm(host, mesg, 0, ehi_flags, 1);
 	if (rc == 0)
 		host->dev->power.power_state = mesg;
 	return rc;
diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 0108731..95838b3 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -3242,6 +3242,10 @@ static int ata_eh_skip_recovery(struct ata_link *link)
 	if (link->flags & ATA_LFLAG_DISABLED)
 		return 1;

+	/* skip if explicitly requested */
+	if (ehc->i.flags & ATA_EHI_NO_RECOVERY)
+		return ...
From: Stephan Diestelhorst
Date: Thursday, September 2, 2010 - 7:33 am

Thanks Tejun for this patch. Will test in a second!

Cheers,
  Stephan

-- 
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst@amd.com, Tel. +49 (0)351 448 356 719  (AMD: 29-719)

--

From: Rafael J. Wysocki
Date: Thursday, September 2, 2010 - 1:11 pm

I'll wait, then, because it takes a couple of days to reproduce it on my machine.

Thanks,
Rafael
--

From: Stephan Diestelhorst
Date: Thursday, September 2, 2010 - 1:52 pm

On Thu, Sep 2, 2010 at 4:33 PM, Stephan Diestelhorst

Applied, compiled and worked for the 5 iterations I managed to test it.
Much better than your previous patch, thanks a lot! I'd give it some more spins
tomorrow, though.


Thanks,
  Stephan
--

From: Stephan Diestelhorst
Date: Tuesday, September 7, 2010 - 4:54 am

Alright, I have tested for more than 20 iterations now. It works
great! Thanks for the fix!

Are there any plans for upstream inclusion?

Thanks,
  Stephan


-- 
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst@amd.com, Tel. +49 (0)351 448 356 719

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--

From: Rafael J. Wysocki
Date: Thursday, September 2, 2010 - 1:16 pm

Putting the issue at hand aside, I'm not really sure if using SCSI EH for
suspending the controller is a good idea.  It seems overly complicated and
it doesn't match the new PCI suspend model with separate ->suspend(),
->freeze() and ->poweroff() callbacks.  Moreover, the passing of pm_message_t
back and forth doesn't make things clear either.

Would it be possible to rework this thing entirely at one point?

Rafael
--

From: Tejun Heo
Date: Thursday, September 2, 2010 - 1:25 pm

Hello,


Well, I think I would need more than that to rework the whole thing.
There are a lot of benefits in sharing the same path between probing /
error handling and suspend/resuming.  ATA has a lot of quirks which
have to be dealt with and it will be very fragile to scatter handling
logics over multiple separate paths.  We definitely can try to make
the plumbing from power management easier to follow.

Thanks.

-- 
tejun
--

From: Rafael J. Wysocki
Date: Thursday, September 2, 2010 - 1:28 pm

That would be very nice.  In particular, I'd like to get rid of the
pm_message_t thing if possible.  And I'd like to avoid putting the
controller into D3 before creating hibernation image. :-)

Thanks,
Rafael
--

From: Tejun Heo
Date: Thursday, September 2, 2010 - 1:33 pm

Hello,


Oh, yeah, things like that can definitely be changed, but I think it
would still need to be piped through EH.  That's how the queue gets
quiesced for those special operations and resume is basically probing,
so it doesn't make much sense to split them.  Please let me know how
it should work from power management POV and I'll be happy to convert
libata to fit the new behavior.

Thanks.

-- 
tejun
--

From: Tejun Heo
Date: Tuesday, September 7, 2010 - 5:05 am

For some mysterious reason, certain hardware reacts badly to usual EH
actions while the system is going for suspend.  As the devices won't
be needed until the system is resumed, ask EH to skip usual autopsy
and recovery and proceed directly to suspend.

Signed-off-by: Tejun Heo <tj@kernel.org>
Tested-by: Stephan Diestelhorst <stephan.diestelhorst@amd.com>
Cc: stable@kernel.org
---
 drivers/ata/libata-core.c |   14 +++++++++++++-
 drivers/ata/libata-eh.c   |    4 ++++
 include/linux/libata.h    |    1 +
 3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 9ceb493..92cd5f3 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -5427,6 +5427,7 @@ static int ata_host_request_pm(struct ata_host *host, pm_message_t mesg,
  */
 int ata_host_suspend(struct ata_host *host, pm_message_t mesg)
 {
+	unsigned int ehi_flags = ATA_EHI_QUIET;
 	int rc;

 	/*
@@ -5435,7 +5436,18 @@ int ata_host_suspend(struct ata_host *host, pm_message_t mesg)
 	 */
 	ata_lpm_enable(host);

-	rc = ata_host_request_pm(host, mesg, 0, ATA_EHI_QUIET, 1);
+	/*
+	 * On some hardware, device fails to respond after spun down
+	 * for suspend.  As the device won't be used before being
+	 * resumed, we don't need to touch the device.  Ask EH to skip
+	 * the usual stuff and proceed directly to suspend.
+	 *
+	 * http://thread.gmane.org/gmane.linux.ide/46764
+	 */
+	if (mesg.event == PM_EVENT_SUSPEND)
+		ehi_flags |= ATA_EHI_NO_AUTOPSY | ATA_EHI_NO_RECOVERY;
+
+	rc = ata_host_request_pm(host, mesg, 0, ehi_flags, 1);
 	if (rc == 0)
 		host->dev->power.power_state = mesg;
 	return rc;
diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 0108731..95838b3 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -3242,6 +3242,10 @@ static int ata_eh_skip_recovery(struct ata_link *link)
 	if (link->flags & ATA_LFLAG_DISABLED)
 		return 1;

+	/* skip if explicitly requested */
+	if ...
From: Stephan Diestelhorst
Date: Tuesday, August 24, 2010 - 9:11 am

-- 
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst@amd.com, Tel. +49 (0)351 448 356 719

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--

From: Stephan Diestelhorst
Date: Thursday, August 26, 2010 - 9:15 am

Just did the following: Rebased Rafaels patch to 2.6.35 and tried it
again (with added prints to make sure I am running the right one) and
did >10 suspend to ram / resume cycles under I/O write load. All of
them worked fine (for comparison: your patch resulted in RO HDD at
first attempt).

(I had some extra prints around the suspend functions changed in
 Rafael's patch, tried with and without, no change--works flawlessly.)

What do you make of this?

Thanks,
  Stephan
-- 
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst@amd.com, Tel. +49 (0)351 448 356 719

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--

From: Rafael J. Wysocki
Date: Thursday, August 26, 2010 - 11:24 am

I think my patch actually does more than the Tejun's one.  I need to have a
deeper look at them both.

I'm still testing the Tejun's patch on my system where I was able to reproduce
the problem, but so far it's been working.

Thanks,
Rafael
--

From: Rafael J. Wysocki
Date: Friday, August 27, 2010 - 4:35 pm

I reproduced the problem with the Tejun's patch applied, so I'm now quite
sure the problem is related to the suspend of controller ports (which is done
by scheduling SCSI error handling on the controller).

Anyway, below is a new version of my patch that plays a bit nicer with
the resume code.  Can you please check if it still fixes the problem for you?

Thanks,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: SATA / AHCI: Do not play with the link PM during suspend to RAM (v2)

My Acer Ferrari One occasionally loses communication with the HDD
(which in fact is an Intel SSD) during suspend to RAM.  The symptom
is that the IDENTIFY command times out during suspend and the device
is dropped by the kernel, so it is not available during resume and
the system is unuseable as a result.  The failure is not readily
reproducible, although it happens once every several suspends and
it always happens after the disk has been shut down by the SCSI
layer's suspend routine.

I was able to trace this issue down to the scheduling of error
handling for all of the controller's ports carried out by
ata_host_suspend(), which indicates quirky hardware.  However, the
AHCI driver, which is used on the affected box, doesn't really need
to do anything with the controller's ports during suspend to RAM,
because the controller is going to be put into D3 immediately by
ata_pci_device_do_suspend() and it will undergo full reset during
the subsequent resume anyway.  For this reason, make the AHCI driver
avoid calling ata_host_suspend() during suspend to RAM which works
around the problem and makes sense as a general optimization.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/ata/ahci.c        |   11 ++++++++++-
 drivers/ata/libata-core.c |   20 ++++++++++++++++++++
 include/linux/libata.h    |    1 +
 3 files changed, 31 insertions(+), 1 deletion(-)

Index: linux-2.6/drivers/ata/ahci.c
===================================================================
--- ...
From: Stephan Diestelhorst
Date: Thursday, September 2, 2010 - 7:31 am

Applied to 2.6.35.3 and tested. Works perfectly fine (> 10 s2ram
cycles under heavy I/O load).

Many thanks,
  Stephan

-- 
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst@amd.com, Tel. +49 (0)351 448 356 719

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--

From: Stephan Diestelhorst
Date: Tuesday, August 24, 2010 - 9:07 am

Did that. Actually also added some printks to the XXX function, called
early during boot. Output confirms that your patch is loaded. And even

That next on my list...

Many thanks!

Stephan

-- 
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst@amd.com, Tel. +49 (0)351 448 356 719

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--

From: Rafael J. Wysocki
Date: Tuesday, August 17, 2010 - 4:19 am

Well, I wonder what the real reason for doing the link power management
thing at this particular point in the suspend code path is.  It just seems to
disable the link power management, but then the controller is put into a
low-power state and is reset from scratch during resume, so I'm not quite
sure how skipping that code could possibly lead to any problems.

Perhaps we could move the link PM manipulation to the prepare stage of suspend?

Thanks,
Rafael
--

From: Tejun Heo
Date: Tuesday, August 17, 2010 - 4:29 am

Hello,


Stranger things have happened in the ATA la-la land. :-) Also, it
makes non-lpm and lpm cases leave the controller and device in
different states when it goes to sleep, which _really_ bothers me.
Combined with the timing dependent nature of DIPM, I worry this might
lead to very obscure issues and would much prefer to make sure
everything is in fixed, known, fully powered state before committing
to any major operations.  I might be paranoid tho.  I'll think more

Yeah, one possibility is that the devices misbehave if they receive
LPM commands while suspended.  Does commenting out sd_suspend resolve
the issue too?

Thanks.

-- 
tejun
--

From: Stephan Diestelhorst
Date: Tuesday, August 17, 2010 - 5:10 am

Hi,


If you want me to test anything... let me know. Since I do not know
much about the ATA code, I do not know what to change where. (A simple
grep for sd_suspend in drivers/ata didn't turn up anything.)

Thanks,
  Stephan
-- 
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst@amd.com, Tel. +49 (0)351 448 356 719

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--

From: Tejun Heo
Date: Tuesday, August 17, 2010 - 5:09 am

Hello,


Oh, sure, the following should be enough.  Thanks.

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 8802e48..892ccc7 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2456,6 +2456,8 @@ static int sd_suspend(struct device *dev, pm_message_t mesg)
 	struct scsi_disk *sdkp = scsi_disk_get_from_dev(dev);
 	int ret = 0;

+	return 0;
+
 	if (!sdkp)
 		return 0;	/* this can happen */

-- 
tejun
--


<snip>

Sorry for taking ages. Vacation and catching up after it are to blame,
as is me forgetting to build a proper initrd...

Thanks for the patch! It certainly changes behaviour, however, in a
very strange way for me. With your patch my machine does not suspend
to ram anymore (a simple echo mem > /proc/sys/state blocks), and
nothing happens in dmesg if there is a lot of write I/O while
suspending. (A number of parallel dd's with oflag=direct)

If I stop the I/O, the system eventually goes into suspend to RAM.
However, that takes a while, after the I/O has stopped, and also
from "Preparing system for suspend" log entry until it is actually
done.

Is this intentional? Let me know how I can debug this further!
Ideally I'd like to be able to suspend the machine under I/O load,
too. (E.g. during a compile job.)

Can you reproduce this at your end, too?

Many thanks,
  Stephan
--


Well, I didn't try suspending with a number of parallel dd's with oflag=direct
in the background, but otherwise I'm not reproducing the issue with
the patch applied.

Rafael
--


Mhmhm, I have tried to reproduce my issue again, and also added some
dev_printk's around your code to understand where the delay is
happening.

However, I have not been able to reproduce the issue (with and without
the debug output) anymore, and I am happy to report that for now your
patch helps.

I'd like to keep this under observation for a little while longer,
though.

Many thanks,
  Stephan

-- 
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst@amd.com, Tel. +49 (0)351 448 356 719

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--


Good.

What you might be seeing is that the patch generally changes the timing of
suspend and since it is done asynchronously by default the change might trigger

You can try to remove the noise produced by asynchronous suspend from the
picture by dong "echo 0 > /sys/power/pm_async" (just once after bootup).

Thanks,
Rafael
--

From: Rafael J. Wysocki
Date: Saturday, July 10, 2010 - 6:08 am

They may be doing something that causes the issue to appear.

That said, on my test box this only happens during suspend and it's an Intel

There are only ext3/ext4 partitions on the disk in my case.

Rafael
--

From: Maciej Rutecki
Date: Monday, July 12, 2010 - 8:35 am

I created a Bugzilla entry at 
https://bugzilla.kernel.org/show_bug.cgi?id=16370
for your bug report, please add your address to the CC list in there, thanks!

-- 
Maciej Rutecki
http://www.maciek.unixy.pl
--

Previous thread: Re: [PATCH] netfilter: add CHECKSUM target by Patrick McHardy on Friday, July 9, 2010 - 8:17 am. (5 messages)

Next thread: [PATCH 1/2] Add trace events to mmap and brk by Eric B Munson on Friday, July 9, 2010 - 8:53 am. (6 messages)