Re: 2.6.25-rc6 hangs at resume after suspend to RAM on Mac mini Core Duo

From: Tino Keitel
Date: Wednesday, March 19, 2008 - 11:05 pm

Hi folks,

whereas I had working suspend with 2.6.25-rc5, I had a hang at resume
with git HEAD from March 19th (something after 2.6.25-rc6). I tried it
again and got another hang, so it seems to be reproducible.

Any hints what commit might have broken suspend? I won't have the time
to start a git bisect or any resume debugging using CONFIG_PM_TRACE_RTC
until next Tuesday.

This is a Mac mini Core Duo. Here is the lspci -nn output:

00:00.0 Host bridge [0600]: Intel Corporation Mobile 945GM/PM/GMS,
943/940GML and 945GT Express Memory Controller Hub [8086:27a0] (rev 03)

00:02.0 VGA compatible controller [0300]: Intel Corporation Mobile
945GM/GMS, 943/940GML Express Integrated Graphics Controller
[8086:27a2] (rev 03)

00:07.0 Performance counters [1101]: Intel Corporation Unknown device
[8086:27a3] (rev 03)

00:1b.0 Audio device [0403]: Intel Corporation 82801G (ICH7 Family)
High Definition Audio Controller [8086:27d8] (rev 02)

00:1c.0 PCI bridge [0604]: Intel Corporation 82801G (ICH7 Family) PCI
Express Port 1 [8086:27d0] (rev 02)

00:1c.1 PCI bridge [0604]: Intel Corporation 82801G (ICH7 Family) PCI
Express Port 2 [8086:27d2] (rev 02)

00:1d.0 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family)
USB UHCI Controller #1 [8086:27c8] (rev 02)

00:1d.1 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family)
USB UHCI Controller #2 [8086:27c9] (rev 02)

00:1d.2 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family)
USB UHCI Controller #3 [8086:27ca] (rev 02)

00:1d.3 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family)
USB UHCI Controller #4 [8086:27cb] (rev 02)

00:1d.7 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family)
USB2 EHCI Controller [8086:27cc] (rev 02)

00:1e.0 PCI bridge [0604]: Intel Corporation 82801 Mobile PCI Bridge
[8086:2448] (rev e2)

00:1f.0 ISA bridge [0601]: Intel Corporation 82801GBM (ICH7-M) LPC
Interface Bridge [8086:27b9] (rev 02)

00:1f.1 IDE interface [0101]: Intel Corporation 82801G (ICH7 ...
From: Andrew Morton
Date: Thursday, March 20, 2008 - 3:25 am

Please try Linus's current tree - quite a few things got reverted today
which might have fixed this.
--

From: Tino Keitel
Date: Thursday, March 20, 2008 - 3:54 am

OK, I'll try it.

Regards,
Tino
--

From: Tino Keitel
Date: Tuesday, March 25, 2008 - 4:13 pm

I tried HEAD from yesterday, but resume still fails. So it looks like I
need to start a git bisect.

Regards,
Tino
--

From: Rafael J. Wysocki
Date: Friday, March 28, 2008 - 4:03 pm

Can you please try to boot with acpi_new_pts_ordering and retest?

Thanks,
Rafael
--

From: Tino Keitel
Date: Monday, March 31, 2008 - 2:28 pm

On Sat, Mar 29, 2008 at 00:03:15 +0100, Rafael J. Wysocki wrote:


I just tried current -git (a9edadbf790d72adf6ebed476cb5caf7743e7e4a),
without success. I still got the same hang at resume.

Regards,
Tino

--

From: Soeren Sonnenburg
Date: Monday, March 31, 2008 - 11:23 pm

I am having the hang on resume problem too (macbook pro), cf.
http://bugzilla.kernel.org/show_bug.cgi?id=10319, 2.6.24 was fine.

Soeren
-- 
For the one fact about the future of which we can be certain is that it
will be utterly fantastic. -- Arthur C. Clarke, 1962
--

From: Tino Keitel
Date: Monday, March 31, 2008 - 11:27 pm

What about 2.6.25-rc5? Up to that version, resume works for me; -rc6 is broken.

Regards,
Tino
--

From: Soeren Sonnenburg
Date: Monday, March 31, 2008 - 11:38 pm

No, it never worked in 2.6.25 (at least from rc2 on) and still does not in
rc7-current-git. But note that on the console I could suspend to RAM and
resume with s2ram, only the display no longer comes back; from within X
it freezes, though.

Soeren

--

From: Rafael J. Wysocki
Date: Tuesday, April 1, 2008 - 2:05 pm

Please try -rc8.

Thanks,
Rafael
--

From: Soeren Sonnenburg
Date: Wednesday, April 2, 2008 - 10:33 am

OK, I did. Now not only does the console stay black, but blindly typing
reboot/s2ram does not do anything after a resume... in this respect rc7
was better, though not usable (as the display remained off).

Soeren
--

From: Rafael J. Wysocki
Date: Wednesday, April 2, 2008 - 10:56 am

Felix's MacBook suspends successfully with current -git, AFAICS.  Perhaps
you can compare .configs?

Rafael
--

From: Soeren Sonnenburg
Date: Wednesday, April 2, 2008 - 11:42 am

Well, it suspends, since rc2/3 or so, but it simply does not resume as it
used to in 2.6.24. Anyway, the config is here:
http://nn7.de/debugging/config.gz .
Also note that this is a macbookpro1,1 and that I have to use
http://marc.info/?l=linux-kernel&m=120674502201007&w=4 to be able to
boot.

Soeren
--

From: Tino Keitel
Date: Tuesday, April 1, 2008 - 5:55 am

I'll try to find the time for a bisect this week.

Regards,
Tino
--

From: Tino Keitel
Date: Tuesday, April 1, 2008 - 11:37 am

git-bisect revealed this:

$ git bisect good
e82cc1288fa57857c6af8c57f3d07096d4bcd9d9 is first bad commit
commit e82cc1288fa57857c6af8c57f3d07096d4bcd9d9
Author: David Brownell <david-b@pacbell.net>
Date:   Fri Mar 7 13:49:42 2008 -0800

    USB: fix ehci unlink regressions
    
    The recent EHCI driver update to split the IAA watchdog timer out from
    the other timers made several things work better, but not everything;
    and it created a couple new issues in bugzilla.  Ergo this patch:
    
      - Handle a should-be-rare SMP race between the watchdog firing
        and (very late) IAA interrupts;
    
      - Remove a shouldn't-have-been-added WARN_ON() test;
    
      - Guard against one observed OOPS;
    
      - If this watchdog fires during clean HC shutdown, it should act
        as a NOP instead of interfering with the shutdown sequence;
    
      - Guard against silicon errata hypothesized by some vendors:
          * IAA status latch broken, but IAAD cleared OK;
          * IAAD wasn't cleared when IAA status got reported;
    
    The WARN_ON is in bugzilla as 10168; the OOPS as 10078; these are
    both regressions.
    
    Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
    Tested-by: Gordon Farquharson <gordonfarquharson@gmail.com>
251bc90076ea0b3a14d0b88dd250581fa815b5de M	drivers

For the linux-usb people: this commit breaks resume from suspend to RAM
on my Mac mini Core Duo. It worked with 2.6.24, 2.6.25-rc5, and was
broken since 2.6.25-rc6 (also with -rc7 and later). See
http://lkml.org/lkml/2008/3/20/23 for more information.

Regards,
Tino
--
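
For readers following along, the bisect reported above can be sketched
roughly as follows. This is only an illustration: the tag names
v2.6.25-rc5 and v2.6.25-rc6 correspond to the versions reported as
working and broken, the helper name start_resume_bisect is hypothetical,
and the build/boot/suspend test between steps has to be done by hand.

```shell
#!/bin/sh
# Sketch only: start a bisect between a known-good and a known-bad kernel.
# start_resume_bisect is a hypothetical helper name, not part of git.
start_resume_bisect() {
    good="$1"   # e.g. v2.6.25-rc5 (resume works)
    bad="$2"    # e.g. v2.6.25-rc6 (resume hangs)
    # "git bisect start <bad> <good>" marks both ends and checks out
    # a revision roughly halfway between them.
    git bisect start "$bad" "$good"
}

# Typical manual loop after "start_resume_bisect v2.6.25-rc5 v2.6.25-rc6":
#   1. build and boot the revision git checked out
#   2. suspend to RAM and try to resume
#   3. run "git bisect good" if resume worked, "git bisect bad" if it hung
#   4. repeat until git prints "<sha> is first bad commit"
#   5. run "git bisect reset" to return to the original branch
```

Because a hang at resume cannot be detected by a script on the same
machine, the good/bad verdict is left as a manual step rather than
automated with "git bisect run".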

From: Alan Stern
Date: Tuesday, April 1, 2008 - 12:06 pm

Can you get any information out of the computer after the hang?  It 
would help to see the dmesg log (with CONFIG_USB_DEBUG enabled) and an 
Alt-SysRq-T stack dump.

Alan Stern

--

From: Tino Keitel
Date: Tuesday, April 1, 2008 - 1:25 pm

On Tue, Apr 01, 2008 at 15:06:51 -0400, Alan Stern wrote:


Looks difficult. The text console stays black after resume, and there
is no serial interface.

Regards,
Tino
--

From: Alan Stern
Date: Tuesday, April 1, 2008 - 2:02 pm

Network console?  SSH?  Or even telnet?

Alan Stern

--

From: Tino Keitel
Date: Tuesday, April 1, 2008 - 2:06 pm

Network is dead, too.

Regards,
Tino
--

From: Alan Stern
Date: Wednesday, April 2, 2008 - 7:25 am

Okay, then try this.  Boot with "no_console_suspend" on the command
line.  Do "echo devices >/sys/power/pm_test" before starting the
suspend.  Then "echo mem >/sys/power/state", and while the system is
paused try unplugging/replugging a USB device.

Maybe that will provide usable information.

Alan Stern

--
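
The pm_test steps described above can be sketched as a small helper.
This is a minimal sketch, not a definitive procedure: the function name
pm_test_suspend and the PM_SYSFS override are hypothetical (the override
exists only so the function can be dry-run against a scratch directory);
the two sysfs files and the values written to them are the ones quoted
in the thread.

```shell
#!/bin/sh
# Sketch only: run a test suspend at a given pm_test level.
# PM_SYSFS is a hypothetical override, not a kernel feature.
pm_test_suspend() {
    level="$1"                       # "devices" or "core" in this thread
    dir="${PM_SYSFS:-/sys/power}"

    if [ ! -w "$dir/pm_test" ]; then
        echo "no writable $dir/pm_test (needs root and pm_test support)" >&2
        return 1
    fi

    # Select how far the test suspend should go ...
    echo "$level" > "$dir/pm_test" || return 1
    # ... then trigger the (test) suspend to RAM.
    echo mem > "$dir/state"
}
```

Run as root, for example "pm_test_suspend devices", after booting with
no_console_suspend on the kernel command line so that messages stay
visible while the system is paused.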

From: Alan Stern
Date: Wednesday, April 2, 2008 - 12:26 pm

Tino, try this patch.  It fixed Mark Lord's problem, which looked the 
same as yours.

Alan Stern



--- rc8/drivers/usb/host/ehci-hub.c	2008-03-11 11:18:40.000000000 -0400
+++ linux/drivers/usb/host/ehci-hub.c	2008-04-02 13:28:50.000000000 -0400
@@ -135,8 +135,6 @@
 		hcd->state = HC_STATE_QUIESCING;
 	}
 	ehci->command = ehci_readl(ehci, &ehci->regs->command);
-	if (ehci->reclaim)
-		end_unlink_async(ehci);
 	ehci_work(ehci);
 
 	/* Unlike other USB host controller types, EHCI doesn't have
@@ -180,6 +178,9 @@
 	ehci_halt (ehci);
 	hcd->state = HC_STATE_SUSPENDED;
 
+	if (ehci->reclaim)
+		end_unlink_async(ehci);
+
 	/* allow remote wakeup */
 	mask = INTR_MASK;
 	if (!device_may_wakeup(&hcd->self.root_hub->dev))



--

From: Tino Keitel
Date: Wednesday, April 2, 2008 - 12:47 pm

Fine, it seems to solve the hang; I see no problems after the resume.
Thanks a lot.

Regards,
Tino
--

From: Soeren Sonnenburg
Date: Wednesday, April 2, 2008 - 1:37 pm

Using this patch fixes the hang for my macbookpro1,1 too. However the
console stays black after resume, though I can s2ram multiple times and
reboot...

Soeren
--

From: David Brownell
Date: Tuesday, April 1, 2008 - 1:06 pm

Most curious.  Based on what I read earlier, I was wondering about
the handling of the unlink watchdog during a known-dodgy path.


This "fires during a clean shutdown" path.  If there's any way this
patch would affect resume handling, that's where I'd expect it to
kick in.  Virtually nothing else *could* cause problems there.

If you'd like to experiment, modify the "if (...)" at the top of the
ehci_iaa_watchdog() function and make it just check to see if there's

FWIW the former has been confirmed as existing on some current AMD/ATI

--

From: Pavel Machek
Date: Monday, March 31, 2008 - 2:50 pm

You should put "acpi_new_pts_ordering" on kernel command line.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Tino Keitel
Date: Monday, March 31, 2008 - 3:01 pm

I did, just forgot to mention it.

Regards,
Tino
--

From: Rafael J. Wysocki
Date: Monday, March 31, 2008 - 3:17 pm

OK, thanks.

Have you tried to do:

# echo core > /sys/power/pm_test
# echo mem > /sys/power/state

(it's good to boot the kernel with no_console_suspend and do
"echo 8 > /proc/sys/kernel/printk" before that to see the messages)?

Rafael
--

From: Tino Keitel
Date: Tuesday, April 1, 2008 - 1:04 am

Thanks, I'll try this evening. What should I look for in the kernel
messages?

Regards,
Tino
--

From: Rafael J. Wysocki
Date: Tuesday, April 1, 2008 - 2:03 pm

Any irregularities.  Please just post the dmesg output if the system survives.

If there are any oopses etc., you should see them. :-)

Thanks,
Rafael
--

From: Rafael J. Wysocki
Date: Thursday, March 20, 2008 - 3:10 am

No ideas for now.

I have created a Bugzilla entry for this problem at:
http://bugzilla.kernel.org/show_bug.cgi?id=10291


Thanks,
Rafael
--
