Hi folks,

whereas I had working suspend with 2.6.25-rc5, I had a hang at resume with git HEAD from March 19th (something after 2.6.25-rc6). I tried it again and got another hang, so it seems to be reproducible.

Any hints as to which commit might have broken suspend? I won't have the time to start a git bisect or any resume debugging using CONFIG_PM_TRACE_RTC until next Tuesday.

This is a Mac mini Core Duo. Here is the lspci -nn output:

00:00.0 Host bridge [0600]: Intel Corporation Mobile 945GM/PM/GMS, 943/940GML and 945GT Express Memory Controller Hub [8086:27a0] (rev 03)
00:02.0 VGA compatible controller [0300]: Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller [8086:27a2] (rev 03)
00:07.0 Performance counters [1101]: Intel Corporation Unknown device [8086:27a3] (rev 03)
00:1b.0 Audio device [0403]: Intel Corporation 82801G (ICH7 Family) High Definition Audio Controller [8086:27d8] (rev 02)
00:1c.0 PCI bridge [0604]: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 [8086:27d0] (rev 02)
00:1c.1 PCI bridge [0604]: Intel Corporation 82801G (ICH7 Family) PCI Express Port 2 [8086:27d2] (rev 02)
00:1d.0 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #1 [8086:27c8] (rev 02)
00:1d.1 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #2 [8086:27c9] (rev 02)
00:1d.2 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #3 [8086:27ca] (rev 02)
00:1d.3 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #4 [8086:27cb] (rev 02)
00:1d.7 USB Controller [0c03]: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller [8086:27cc] (rev 02)
00:1e.0 PCI bridge [0604]: Intel Corporation 82801 Mobile PCI Bridge [8086:2448] (rev e2)
00:1f.0 ISA bridge [0601]: Intel Corporation 82801GBM (ICH7-M) LPC Interface Bridge [8086:27b9] (rev 02)
00:1f.1 IDE interface [0101]: Intel Corporation 82801G (ICH7 ...
Please try Linus's current tree - quite a few things got reverted today which might have fixed this. --
I tried HEAD from yesterday, but resume still fails. So it looks like I need to start a git bisect. Regards, Tino --
Can you please try to boot with acpi_new_pts_ordering and retest? Thanks, Rafael --
On Sat, Mar 29, 2008 at 00:03:15 +0100, Rafael J. Wysocki wrote: I just tried current -git (a9edadbf790d72adf6ebed476cb5caf7743e7e4a), without success. I still got the same hang at resume. Regards, Tino --
I am having the hang on resume problem too (macbook pro), cf. http://bugzilla.kernel.org/show_bug.cgi?id=10319, 2.6.24 was fine. Soeren -- For the one fact about the future of which we can be certain is that it will be utterly fantastic. -- Arthur C. Clarke, 1962 --
What about 2.6.25-rc5? Up to this, resume works for me. -rc6 is broken. Regards, Tino --
No, it never worked in 2.6.25 (at least from rc2 on) and still does not in rc7-current-git. Note, though, that on the console I could suspend to RAM and resume, but the display no longer comes back with s2ram. From within X it freezes. Soeren --
OK, I did. Now not only does the console stay black, but blindly typing reboot/s2ram does nothing after a resume. In this respect rc7 was better, though not usable (as the display remained off). Soeren --
Felix's MacBook suspends successfully with current -git, AFAICS. Perhaps you can compare .configs? Rafael --
Well, it suspends (since rc2/3 or so), but it simply does not resume as it used to in 2.6.24. Anyway, the config is here: http://nn7.de/debugging/config.gz . Also note that this is a macbookpro1,1 and that I have to use http://marc.info/?l=linux-kernel&m=120674502201007&w=4 to be able to boot. Soeren --
Any chance to bisect? -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html --
I'll try to find the time for a bisect this week. Regards, Tino --
git-bisect revealed this:
$ git bisect good
e82cc1288fa57857c6af8c57f3d07096d4bcd9d9 is first bad commit
commit e82cc1288fa57857c6af8c57f3d07096d4bcd9d9
Author: David Brownell <david-b@pacbell.net>
Date: Fri Mar 7 13:49:42 2008 -0800

    USB: fix ehci unlink regressions

    The recent EHCI driver update to split the IAA watchdog timer out from
    the other timers made several things work better, but not everything;
    and it created a couple new issues in bugzilla.  Ergo this patch:

     - Handle a should-be-rare SMP race between the watchdog firing
       and (very late) IAA interrupts;

     - Remove a shouldn't-have-been-added WARN_ON() test;

     - Guard against one observed OOPS;

     - If this watchdog fires during clean HC shutdown, it should act
       as a NOP instead of interfering with the shutdown sequence;

     - Guard against silicon errata hypothesized by some vendors:
         * IAA status latch broken, but IAAD cleared OK;
         * IAAD wasn't cleared when IAA status got reported;

    The WARN_ON is in bugzilla as 10168; the OOPS as 10078; these are
    both regressions.

    Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
    Tested-by: Gordon Farquharson <gordonfarquharson@gmail.com>

251bc90076ea0b3a14d0b88dd250581fa815b5de M drivers
For the linux-usb people: this commit breaks resume from suspend to RAM
on my Mac mini Core Duo. It worked with 2.6.24, 2.6.25-rc5, and was
broken since 2.6.25-rc6 (also with -rc7 and later). See
http://lkml.org/lkml/2008/3/20/23 for more information.
Regards,
Tino
--
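
For anyone wanting to repeat a bisect like the one above, the following is only a rough sketch. It assumes the release tags v2.6.25-rc5 (resume works) and v2.6.25-rc6 (resume hangs) as the starting points, as reported in this thread; bisect_begin is a hypothetical helper name, not part of git.

```shell
#!/bin/sh
# Sketch only: start a bisect between a known-good and a known-bad ref.
# bisect_begin is a hypothetical helper; "git bisect start <bad> <good>"
# is the real git command it wraps.
bisect_begin() {
    repo=$1
    good=$2
    bad=$3
    git -C "$repo" bisect start "$bad" "$good"
}

# Inside a kernel checkout this would be:
#   bisect_begin . v2.6.25-rc5 v2.6.25-rc6
#
# git then checks out a commit in between. Build it, boot it, try a
# suspend/resume cycle, and report the outcome by hand:
#   git bisect good    # resume worked
#   git bisect bad     # resume hung
# Repeat until git names the first bad commit, then clean up with:
#   git bisect reset
```

Each step needs a reboot into the freshly built kernel, so the good/bad marking is done by hand here rather than with "git bisect run".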
Can you get any information out of the computer after the hang? It would help to see the dmesg log (with CONFIG_USB_DEBUG enabled) and an Alt-SysRq-T stack dump. Alan Stern --
On Tue, Apr 01, 2008 at 15:06:51 -0400, Alan Stern wrote: Looks difficult. The text console stays black after resume, and there is no serial interface. Regards, Tino --
Network console? SSH? Or even telnet? Alan Stern --
Okay, then try this. Boot with "no_console_suspend" on the command line. Do "echo devices >/sys/power/pm_test" before starting the suspend. Then "echo mem >/sys/power/state", and while the system is paused try unplugging/replugging a USB device. Maybe that will provide usable information. Alan Stern --
Tino, try this patch. It fixed Mark Lord's problem, which looked the same as yours.

Alan Stern

--- rc8/drivers/usb/host/ehci-hub.c	2008-03-11 11:18:40.000000000 -0400
+++ linux/drivers/usb/host/ehci-hub.c	2008-04-02 13:28:50.000000000 -0400
@@ -135,8 +135,6 @@
 		hcd->state = HC_STATE_QUIESCING;
 	}
 	ehci->command = ehci_readl(ehci, &ehci->regs->command);
-	if (ehci->reclaim)
-		end_unlink_async(ehci);
 	ehci_work(ehci);
 
 	/* Unlike other USB host controller types, EHCI doesn't have
@@ -180,6 +178,9 @@
 	ehci_halt (ehci);
 	hcd->state = HC_STATE_SUSPENDED;
 
+	if (ehci->reclaim)
+		end_unlink_async(ehci);
+
 	/* allow remote wakeup */
 	mask = INTR_MASK;
 	if (!device_may_wakeup(&hcd->self.root_hub->dev))
--
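
One possible way to test a patch like this, as a sketch only: save the mail's diff to a file and apply it from the top of the kernel tree. The helper name apply_checked and both paths are hypothetical placeholders; the patch file must be given as an absolute path because the helper changes directory.

```shell
#!/bin/sh
# Sketch only: dry-run a patch first, then apply it for real.
# apply_checked is a hypothetical helper name.
#   $1 = top of the source tree
#   $2 = absolute path to the saved patch file
apply_checked() {
    tree=$1
    patchfile=$2
    # -p1 strips the first path component ("rc8/" and "linux/" in the
    # headers of the patch above), leaving drivers/usb/host/ehci-hub.c.
    ( cd "$tree" && patch -p1 --dry-run < "$patchfile" >/dev/null ) || return 1
    ( cd "$tree" && patch -p1 < "$patchfile" )
}

# Example (placeholder paths):
#   apply_checked /usr/src/linux /tmp/ehci-resume.patch
```

The dry run leaves the tree untouched if a hunk does not apply, which is useful when the tree is not exactly the rc8 the patch was made against.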
Fine, it seems to solve the hang, I see no problems after the resume. Thanks a lot. Regards, Tino --
Using this patch fixes the hang for my macbookpro1,1 too. However the console stays black after resume, though I can s2ram multiple times and reboot... Soeren --
Most curious. Based on what I read earlier, I was wondering about the handling of the unlink watchdog during a known-dodgy path: this "fires during a clean shutdown" path. If there's any way this patch would affect resume handling, that's where I'd expect it to kick in. Virtually nothing else *could* cause problems there. If you'd like to experiment, modify the "if (...)" at the top of the ehci_iaa_watchdog() function and make it just check to see if there's
FWIW the former has been confirmed as existing on some current AMD/ATI --
You should put "acpi_new_pts_ordering" on kernel command line. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html --
I did, just forgot to mention it. Regards, Tino --
OK, thanks. Have you tried to do:

# echo core > /sys/power/pm_test
# echo mem > /sys/power/state

(it's good to boot the kernel with no_console_suspend and do "echo 8 > /proc/sys/kernel/printk" before that to see the messages)?

Rafael
--
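
The sequence above could be wrapped roughly as follows. This is a sketch, not a tested tool: pm_test_cycle and the PM_ROOT variable are hypothetical names, and PM_ROOT exists only so the function can be pointed at a scratch directory instead of the real /proc and /sys. It assumes the kernel provides /sys/power/pm_test (CONFIG_PM_DEBUG) and was booted with no_console_suspend.

```shell
#!/bin/sh
# Sketch only: raise the console log level, pick a pm_test level, then
# start a suspend-to-RAM test cycle. Must run as root on a real system.
# pm_test_cycle and PM_ROOT are hypothetical names.
#   $1 = pm_test level, e.g. "core" or "devices" (default: core)
pm_test_cycle() {
    level=${1:-core}
    root=${PM_ROOT:-}
    # Show all kernel messages on the console.
    echo 8 > "$root/proc/sys/kernel/printk" || return 1
    # Restrict the suspend test to the requested level.
    echo "$level" > "$root/sys/power/pm_test" || return 1
    # Start the (test) suspend cycle.
    echo mem > "$root/sys/power/state"
}

# Example:
#   pm_test_cycle core
#   dmesg > pm_test-core.log    # if the system survives
```

Passing "devices" instead of "core" gives the variant suggested elsewhere in this thread, where the system pauses long enough to unplug and replug a USB device.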
Thanks, I'll try this evening. What should I look for in the kernel messages? Regards, Tino --
Any irregularities. Please just post the dmesg output if the system survives. If there are any oopses etc., you should see them. :-) Thanks, Rafael --
No ideas for now. I have created a Bugzilla entry for this problem at: http://bugzilla.kernel.org/show_bug.cgi?id=10291 Thanks, Rafael --
