Ok,
I don't think there really is anything very interesting here, but we're
hopefully whittling down the list of regressions, and fixing various
random other small issues while at it.
Some smallish MIPS updates, networking (and network driver) fixes, removal
of a long obsolete framebuffer driver, etc etc. The shortlog really tells
the story.
We should be getting close to a 2.6.21 release, so please update any
regression reports you've done.
Linus
---
Adrian Bunk (6):
[DCCP]: make dccp_write_xmit_timer() static again
9p: make struct v9fs_cached_file_operations static
drivers/spi/: fix section mismatches
drivers/eisa/pci_eisa.c:pci_eisa_init() should be init
drivers/mfd/sm501.c: fix an off-by-one
net/sunrpc/svcsock.c: fix a check
Alan Cox (2):
tty: minor merge correction
pata_pdc202xx_old: LBA48 bug
Alan Stern (1):
UHCI: Fix problem caused by lack of terminating QH
Albert Lee (5):
pdc202xx_new: Enable ATAPI DMA
libata: reorder HSM_ST_FIRST for easier decoding (take 3)
libata: Clear tf before doing request sense (take 3)
libata: Limit max sector to 128 for TORiSAN DVD drives (take 3)
libata: Limit ATAPI DMA to R/W commands only for TORiSAN DVD drives (take 3)
Alexey Dobriyan (1):
[NET]: Correct accept(2) recovery after sock_attach_fd()
Alexey Kuznetsov (1):
[NET]: Fix neighbour destructor handling.
Andi Kleen (3):
x86-64: Disable local APIC timer use on AMD systems with C1E
x86-64: Let oprofile reserve MSR on all CPUs
x86-64: Increase NMI watchdog probing timeout
Andreas Oberritter (2):
V4L/DVB (5495): Tda10086: fix DiSEqC message length
V4L/DVB (5496): Pluto2: fix incorrect TSCR register setting
Andrew Morton (4):
proc: fix linkage with CONFIG_SYSCTL=y, CONFIG_PROC_SYSCTL=n
revert "retries in ext3_prepare_write() violate ordering requirements"
revert "retries in ext4_prepare_write() violate orderin...

This email lists some known regressions in Linus' tree compared to 2.6.20.

If you find your name in the Cc header, you are either submitter of one
of the bugs, maintainer of an affected subsystem or driver, a patch of
yours caused a breakage or I'm considering you in any other way possibly
involved with one or more of these issues.

Due to the huge number of recipients, please trim the Cc when answering.


Subject    : snd_hda_intel doesn't work with ASUS M2V mainboard
References : http://bugzilla.kernel.org/show_bug.cgi?id=8273
Submitter  : Hans-Georg Rist <hg.rist@web.de>
Status     : unknown


Subject    : snd_intel8x0: divide error: 0000
References : http://lkml.org/lkml/2007/3/5/252
Submitter  : Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Status     : unknown


Subject    : hal daemon crashes after pulling a USB serial device
References : http://www.opensubscriber.com/message/linux-usb-devel@lists.sourceforge.net/6369800.ht...
Submitter  : Andi Kleen <ak@suse.de>
Handled-By : Oliver Neukum <oneukum@suse.de>
Status     : problem is being debugged


Subject    : USB: iPod doesn't work (CONFIG_USB_SUSPEND)
References : http://lkml.org/lkml/2007/3/21/320
Submitter  : Tino Keitel <tino.keitel@gmx.de>
Caused-By  : Marcelo Tosatti <marcelo@kvack.org>
             commit 1d619f128ba911cd3e6d6ad3475f146eb92f5c27
Handled-By : Oliver Neukum <oneukum@suse.de>
Status     : problem is being debugged


Subject    : USB: Oops when changing DVB-T adapter
References : http://lkml.org/lkml/2007/3/9/212
Submitter  : CIJOML <cijoml@volny.cz>
Handled-By : Markus Rechberger <markus.rechberger@amd.com>
Patch      : http://lkml.org/lkml/2007/4/5/154
Status     : patches available

This email lists some known regressions in Linus' tree compared to 2.6.20.

If you find your name in the Cc header, you are either submitter of one
of the bugs, maintainer of an affected subsystem or driver, a patch of
yours caused a breakage or I'm considering you in any other way possibly
involved with one or more of these issues.

Due to the huge number of recipients, please trim the Cc when answering.


Subject    : suspend to disk works only once
References : http://lkml.org/lkml/2007/4/13/240
Submitter  : Tobias Diedrich <ranma+kernel@tdiedrich.de>
Status     : unknown


Subject    : ThinkPad X60: resume no longer works (PCI related?)
             workaround: booting with "hpet=disable"
References : http://lkml.org/lkml/2007/3/13/3
Submitter  : Dave Jones <davej@redhat.com>
             Jeremy Fitzhardinge <jeremy@goop.org>
Caused-By  : PCI merge
             commit 78149df6d565c36675463352d0bfe0000b02b7a7
Handled-By : Eric W. Biederman <ebiederm@xmission.com>
             Rafael J. Wysocki <rjw@sisk.pl>
Status     : problem is being debugged


Subject    : Suspend to RAM doesn't work anymore (ACPI?)
References : http://lkml.org/lkml/2007/3/19/128
             http://bugzilla.kernel.org/show_bug.cgi?id=8247
Submitter  : Tobias Doerffel <tobias.doerffel@gmail.com>
Handled-By : Rafael J. Wysocki <rjw@sisk.pl>
             Len Brown <len.brown@intel.com>
Status     : problem is being debugged


Subject    : resume from RAM corrupts vesafb console
References : http://lkml.org/lkml/2007/3/26/76
Submitter  : Marcus Better <marcus@better.se>
Handled-By : Pavel Machek <pavel@ucw.cz>
Status     : problem is being debugged


Subject    : suspend to disk hangs (CONFIG_NO_HZ)
References : http://lkml.org/lkml/2007/3/25/217
Submitter  : Jeff Chua <jeff.chua.linux@gmail.com>
Status     : unknown

Still hangs on -rc6.

Thanks,
Jeff.

On Sat, Apr 14, 2007 at 02:38:08AM +0200, Adrian Bunk wrote:

> Subject    : ThinkPad X60: resume no longer works (PCI related?)
>              workaround: booting with "hpet=disable"
> References : http://lkml.org/lkml/2007/3/13/3
> Submitter  : Dave Jones <davej@redhat.com>
>              Jeremy Fitzhardinge <jeremy@goop.org>
> Caused-By  : PCI merge
>              commit 78149df6d565c36675463352d0bfe0000b02b7a7
> Handled-By : Eric W. Biederman <ebiederm@xmission.com>
>              Rafael J. Wysocki <rjw@sisk.pl>
> Status     : problem is being debugged

I'm at a loss on this one. git bisect was inconclusive.
I even tried beating up on Eric's console-over-usb to try and get more
useful info, but I failed miserably.

	Dave

--
http://www.codemonkey.org.uk

Already fixed in rc5-git9, see
http://bugzilla.kernel.org/show_bug.cgi?id=8247

Tobias

Hi Marcus,

A screen with blinking green blocks implies that your display is in text
mode, not in graphics mode. I don't know what options you are using, but
have you tried using:

acpi_sleep=s3_mode

If the above does not work, also try

acpi_sleep=s3_bios,s3_mode

If it is still not working, you can add this to your suspend script:

vbetool vbemode set <VESA mode ID>

where VESA mode ID = "vga=" value - 512 (0x200)

Tony

PS: If your BIOS setup has an option to re-POST the graphics card on
resume, that is a big help.

Tony

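The subtraction in the message above can be sketched in a few lines of
Python. This is only an illustration of the stated rule (VESA mode ID =
"vga=" value - 512); the function name vesa_mode_id and the sample value
vga=791 are assumptions made for the example, not something taken from the
thread.

```python
def vesa_mode_id(vga_value: int) -> int:
    """Convert a kernel "vga=" boot value into a VESA mode ID.

    Implements the rule quoted in the message:
    VESA mode ID = "vga=" value - 512 (0x200).
    """
    offset = 0x200  # 512, the constant given in the message
    if vga_value < offset:
        # Values below 0x200 leave nothing to subtract from, so the
        # rule does not yield a VESA mode ID for them.
        raise ValueError(f"vga value {vga_value:#x} is below {offset:#x}")
    return vga_value - offset


# vga=791 (0x317) is an illustrative input only (hypothetical here).
mode = vesa_mode_id(791)
print(f"VESA mode ID: {mode} ({mode:#x})")
```

The resulting number is what would be passed in place of <VESA mode ID> in
the vbetool line; whether a given mode actually restores the display
depends on the video BIOS.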
Will try, but I'm using "s2ram -f -a3" which should mean precisely the
above IIUC.

Marcus

Just for clarification, do you suspend from VESA framebuffer console or
from VGA text console?

If from the latter, that's actually worse from the user's point of view,
but I can modify vgacon so that it saves its

Okay.

Tony

From VESA console.

Marcus

Have you tried other combinations?

s2ram -m -p -f
s2ram -s -p -f

Tony

Yes, I tried these slightly different combinations:

s2ram -f -a3 -s: Works! The screen becomes green but is restored quickly.
It prints the following messages:

  Allocated buffer at 0x11000 (base is 0x0)
  ES: 0x1100 EBX: 0x0000
  Save video state failed
  Calling restore_state_from
  Function not supported?
  Restore video state failed
  Switching back to vt1

s2ram -f -a3 -p: Screen goes green and then blank. Everything hangs,
doesn't react to keyboard input.

s2ram -f -a3 -m: Works!

(Tested with 2.6.21-rc7.)

Thanks,

Marcus

Uhuh. This is the second report of this strangeness. On thinkpad r60, -a3
used to work, and now it needs more options.

Can you locate the patch causing this?

	Pavel

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Thanks.

Should we consider this regression resolved? There is really nothing
much vesafb can do to restore its previous state, except through the use
of userland tools.

Tony

Yes, as far as I am concerned...

Marcus

This email lists some known regressions in Linus' tree compared to 2.6.20.

If you find your name in the Cc header, you are either submitter of one
of the bugs, maintainer of an affected subsystem or driver, a patch of
yours caused a breakage or I'm considering you in any other way possibly
involved with one or more of these issues.

Due to the huge number of recipients, please trim the Cc when answering.


Subject    : ali_pata: boot from CD fails
References : http://lkml.org/lkml/2007/3/31/160
Submitter  : Stephen Clark <Stephen.Clark@seclark.us>
Status     : unknown


Subject    : kernels fail to boot with drives on ATIIXP controller
             (ACPI/IRQ related)
References : https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=229621
             http://lkml.org/lkml/2007/3/4/257
Submitter  : Michal Jaegermann <michal@ellpspace.math.ualberta.ca>
Status     : unknown


Subject    : boot failure: rtl8139: exception in interrupt routine
References : http://lkml.org/lkml/2007/3/31/160
Submitter  : Stephen Clark <Stephen.Clark@seclark.us>
Status     : unknown


Subject    : laptops with e1000: lockups
References : https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=229603
Submitter  : Dave Jones <davej@redhat.com>
Handled-By : Jesse Brandeburg <jesse.brandeburg@intel.com>
Status     : problem is being debugged


Subject    : forcedeth: interface hangs under load
References : http://lkml.org/lkml/2007/4/3/39
Submitter  : Ingo Molnar <mingo@elte.hu>
Handled-By : Ingo Molnar <mingo@elte.hu>
             Ayaz Abdulla <aabdulla@nvidia.com>
Status     : problem is being debugged

For me, suspend to disk works only once (has been the case for all
.21-rcs IIRC, but I didn't get around to reporting it so far).

There are some threads about an issue like this, which is supposed to be
fixed by disabling CONFIG_PCI_MSI, but on my system the problem persists
nonetheless.

On the second suspend attempt, the last message I see is
"Suspending console(s)"

If I find the time, I'll try to bisect it this weekend.

.config:
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.21-rc6
# Fri Apr 13 23:08:52 2007
#
CONFIG_X86_64=y
CONFIG_64BIT=y
CONFIG_X86=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_ZONE_DMA32=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_CMPXCHG=y
CONFIG_EARLY_PRINTK=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_DMI=y
CONFIG_AUDIT_ARCH=y
CONFIG_GENERIC_BUG=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
# CONFIG_IPC_NS is not set
CONFIG_SYSVIPC_SYSCTL=y
# CONFIG_POSIX_MQUEUE is not set
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
# CONFIG_TASKSTATS is not set
# CONFIG_UTS_NS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_SYSFS_DEPRECATED is not set
# CONFIG_RELAY is not set
# CONFIG_BLK_DEV_INITRD is not set
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is no...

Does CONFIG_HPET_TIMER=n make any difference?
Does the latest -git work?
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
bisect results:

git-bisect start
# good: [fa285a3d7924a0e3782926e51f16865c5129a2f7] Linux 2.6.20
git-bisect good fa285a3d7924a0e3782926e51f16865c5129a2f7
# bad: [2eb1ae149a28c1b8ade687c5fbab3c37da4c0fba] Linux 2.6.21-rc1
git-bisect bad 2eb1ae149a28c1b8ade687c5fbab3c37da4c0fba
# bad: [574009c1a895aeeb85eaab29c235d75852b09eb8] Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
git-bisect bad 574009c1a895aeeb85eaab29c235d75852b09eb8
# good: [43187902cbfafe73ede0144166b741fb0f7d04e1] Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6
git-bisect good 43187902cbfafe73ede0144166b741fb0f7d04e1
# good: [beda9f3a13bbb22cde92a45f230a02ef2afef6a9] kbuild: more Makefile cleanups
git-bisect good beda9f3a13bbb22cde92a45f230a02ef2afef6a9
# bad: [7edc136ab688f751037a86e8a051151d7962d33f] Char: isicom, support higher rates
git-bisect bad 7edc136ab688f751037a86e8a051151d7962d33f
# good: [6267276f3fdda9ad0d5ca451bdcbdf42b802d64b] optional ZONE_DMA: deal with cases of ZONE_DMA meaning the first zone
git-bisect good 6267276f3fdda9ad0d5ca451bdcbdf42b802d64b
# bad: [b4ac91a0eac36f347a509afda07e4305e931de61] uml: chan_user.h formatting fixes
git-bisect bad b4ac91a0eac36f347a509afda07e4305e931de61
# bad: [bf0059b23fd2f0b304f647d87fad0aa626ecf0c0] M68KNOMMU: user ARRAY_SIZE macro when appropriate
git-bisect bad bf0059b23fd2f0b304f647d87fad0aa626ecf0c0
# good: [c1725f2af89f1eda3cb9007290971b55084569a4] ARM26: Use ARRAY_SIZE macro when appropriate
git-bisect good c1725f2af89f1eda3cb9007290971b55084569a4
# bad: [9b87ed790714bd3a8d492feb24f6c48f8bb59c3a] m32r: fix do_page_fault and update_mmu_cache
git-bisect bad 9b87ed790714bd3a8d492feb24f6c48f8bb59c3a
# bad: [d12c610e08022a1b84d6bd4412c189214d32e713] swsusp-change-code-ordering-in-userc-sanity
git-bisect bad d12c610e08022a1b84d6bd4412c189214d32e713
# bad: [ed746e3b18f4df18afa3763155972c5835f284c5] swsusp: Change code ordering in disk.c
git-bisect bad ed746e3b18f4df18afa3763155972c5835f284c5
# good: [e3c7db621bed4afb8e23...

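The good/bad marking in the bisect log above is a binary search over
history. A minimal sketch of the idea in Python follows, assuming a
strictly linear history (real git bisect also copes with merges, which
this does not); the names first_bad_commit, history and the commit labels
c0..c15 are hypothetical and used for illustration only.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str],
                     is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit in a linear, oldest-to-newest history.

    commits[0] must be known good and commits[-1] known bad, like the
    initial "git-bisect good" / "git-bisect bad" pair in the log.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):   # would be recorded as "git-bisect bad"
            bad = mid
        else:                      # would be recorded as "git-bisect good"
            good = mid
    return commits[bad]


# Hypothetical history c0..c15; pretend the regression entered at c11.
history = [f"c{i}" for i in range(16)]
culprit = first_bad_commit(history, lambda c: int(c[1:]) >= 11)
print(culprit)
```

Each test halves the remaining range, which is why a good/bad pair as far
apart as 2.6.20 and 2.6.21-rc1 can be narrowed down in a dozen or so
build-and-suspend cycles.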
Doesn't apply cleanly against -rc6, but fixes the problem when
reverted from -rc1.
Index: linux-2.6.21-rc1/kernel/power/disk.c
===================================================================
--- linux-2.6.21-rc1.orig/kernel/power/disk.c 2007-04-14 14:16:59.000000000 +0200
+++ linux-2.6.21-rc1/kernel/power/disk.c 2007-04-14 14:17:03.000000000 +0200
@@ -87,24 +87,52 @@
 	}
 }
 
-static void unprepare_processes(void)
-{
-	thaw_processes();
-	pm_restore_console();
-}
-
 static int prepare_processes(void)
 {
 	int error = 0;
 
 	pm_prepare_console();
+
+	error = disable_nonboot_cpus();
+	if (error)
+		goto enable_cpus;
+
 	if (freeze_processes()) {
 		error = -EBUSY;
-		unprepare_processes();
+		goto thaw;
 	}
+
+	if (pm_disk_mode == PM_DISK_TESTPROC) {
+		printk("swsusp debug: Waiting for 5 seconds.\n");
+		mdelay(5000);
+		goto thaw;
+	}
+
+	error = platform_prepare();
+	if (error)
+		goto thaw;
+
+	/* Free memory before shutting down devices. */
+	if (!(error = swsusp_shrink_memory()))
+		return 0;
+
+	platform_finish();
+ thaw:
+	thaw_processes();
+ enable_cpus:
+	enable_nonboot_cpus();
+	pm_restore_console();
 	return error;
 }
 
+static void unprepare_processes(void)
+{
+	platform_finish();
+	thaw_processes();
+	enable_nonboot_cpus();
+	pm_restore_console();
+}
+
 /**
  * pm_suspend_disk - The granpappy of hibernation power management.
  *
@@ -122,45 +150,29 @@
 	if (error)
 		return error;
-	if (pm_disk_mode == PM_DISK_TESTPROC) {
-		printk("swsusp debug: Waiting for 5 seconds.\n");
-		mdelay(5000);
-		goto Thaw;
-	}
-	/* Free memory before shutting down devices. */
-	error = swsusp_shrink_memory();
-	if (error)
-		goto Thaw;
-
-	error = platform_prepare();
-	if (error)
-		goto Thaw;
+	if (pm_disk_mode == PM_DISK_TESTPROC)
+		return 0;
 	suspend_console();
 	error = device_suspend(PMSG_FREEZE);
 	if (error) {
-		printk(KERN_ERR "PM: Some devices failed to suspend\n");
-		goto Resume_devices;
+		resume_console(...

Now, this was already reported in http://lkml.org/lkml/2007/3/16/126
and I even flagged that message in my local folder, but apparently
forgot to follow up on it... *sigh*

--
Tobias						PGP: http://9ac7e0bc.uguu.de
This email is made of 100% recycled bits.

Unless I misunderstood something, all of the problems Maxim described in
this email are fixed for him in -rc6.
But it's quite possible that you are running into a different issue.
cu
Adrian
Yes, it's likely.

Tobias, I'm unable to reproduce the problem with your .config, but my
hardware is certainly different.

Which suspend mode do you use? If that's "platform", can you try to use
"shutdown" or "reboot" and see if that helps?

Rafael

--
If you don't have the time to read,
you don't have the time or the tools to write.
		- Stephen King

Sure.

shutdown/reboot works fine, only platform is broken.

--
Tobias						PGP: http://9ac7e0bc.uguu.de

Thanks.
Now, I suspect the problem is somehow related to the hardware, so it would help
a lot if we could identify the piece of hardware (or driver) involved.
AFAICT, your system is a non-SMP one, so we can rule out
disable/enable_nonboot_cpus(). To confirm that the problem is related to
platform_finish(), can you please apply the appended debug patch and
see if the suspend in the 'platform' mode works with it?
Also, would it be feasible for you to use 'shutdown' as a workaround in case
the source of the problem is difficult to find and/or fix?
Rafael
---
kernel/power/disk.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
Index: linux-2.6.21-rc6/kernel/power/disk.c
===================================================================
--- linux-2.6.21-rc6.orig/kernel/power/disk.c
+++ linux-2.6.21-rc6/kernel/power/disk.c
@@ -170,8 +170,8 @@ int pm_suspend_disk(void)
 
 	if (in_suspend) {
 		enable_nonboot_cpus();
-		platform_finish();
 		device_resume();
+		platform_finish();
 		resume_console();
 		pr_debug("PM: writing image.\n");
 		error = swsusp_write();
@@ -189,8 +189,8 @@ int pm_suspend_disk(void)
 Enable_cpus:
 	enable_nonboot_cpus();
 Resume_devices:
-	platform_finish();
 	device_resume();
+	platform_finish();
 	resume_console();
 Thaw:
 	unprepare_processes();

Yes, it's an Asus M2N-SLI-Deluxe mainboard with an Athlon64 3200+

--
Tobias						PGP: http://9ac7e0bc.uguu.de

Well, I thought it would, but it also would break some other people's systems.
That's the _real_ problem. Let's see if we can learn more.
Can you please revert it for now, apply the appended one and try to
suspend/resume twice in the 'platform' mode (it may or may not work)?
Rafael
---
kernel/power/disk.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
Index: linux-2.6.21-rc6/kernel/power/disk.c
===================================================================
--- linux-2.6.21-rc6.orig/kernel/power/disk.c
+++ linux-2.6.21-rc6/kernel/power/disk.c
@@ -267,12 +267,15 @@ static int software_resume(void)
 	error = swsusp_read();
 	if (error) {
 		swsusp_free();
-		platform_finish();
 		goto Thaw;
 	}
 
 	pr_debug("PM: Preparing devices for restore.\n");
 
+	error = platform_prepare();
+	if (error)
+		goto Thaw;
+
 	suspend_console();
 	error = device_suspend(PMSG_PRETHAW);
 	if (error)
@@ -285,6 +288,7 @@ static int software_resume(void)
 	enable_nonboot_cpus();
 Free:
 	swsusp_free();
+	platform_finish();
 	device_resume();
 	resume_console();
 Thaw:

Ok.

The patch doesn't apply cleanly to 2.6.21-rc6:

|patching file kernel/power/disk.c
|Hunk #1 FAILED at 267.
|Hunk #2 succeeded at 265 (offset -23 lines).
|1 out of 2 hunks FAILED -- saving rejects to file
|kernel/power/disk.c.rej

wiggle helps, seems the first part of Hunk #1 is already applied in
2.6.21-rc6.

With CONFIG_PM_DEBUG=y and CONFIG_DISABLE_CONSOLE_SUSPEND=y I see that
the second suspend hangs at "i8042 i8042: EARLY resume".

This is kinda interesting because I'm normally using a USB keyboard and
sure enough, if I hook up a normal keyboard and disable USB legacy
support in the BIOS, then suspend to disk works multiple times.

I'd still rather like to use my USB keyboard though. ;)

--
Tobias						PGP: http://9ac7e0bc.uguu.de

And I can now confirm that unpatched 2.6.21-rc6 works fine as long as
USB legacy support is disabled (however without legacy support I can't
use the USB keyboard to control grub).

--
Tobias						PGP: http://9ac7e0bc.uguu.de

Well, I think that when you're using the USB keyboard and the USB legacy
support, the i8042 driver thinks it has a keyboard to handle and tries to
handle it during the suspend, which fails. I don't know why it fails
during the second suspend, though.

I think using the 'shutdown' mode of suspend would be better. There's
little point in using 'platform' on desktop systems anyway.

Frankly, I don't know what to do about it. If we move platform_finish()
after device_resume(), some systems may be broken and I think there are
more such systems than there are systems that set USB legacy support in
the BIOS and have no PS/2 keyboards attached.

Pavel, what do you think?

Rafael

And NVidia southbridge, so OHCI not UHCI (plus EHCI) ... one experiment
would be to disable the EHCI (high speed USB) support in BIOS, to make for
a simpler hardware configuration, and see if that makes BIOS happier. (Or
better, just take EHCI out of your Linux config.) Likewise, taking the
8042 drivers out of Linux. I wouldn't be surprised if those factors
didn't matter, but it'd be good

The "legacy" support in at least some cases involves BIOS having a small
USB stack -- enough to handle a keyboard or mouse in "boot mode" (plus
sometimes a USB disk or CDROM) -- and poking the i8042 chip to act as if
*IT* received the data bytes that really came over USB.

I sure don't know the ins-and-outs of such schemes (ISTR there are others),
but my guess is that either the 8042 or OHCI got confused, at least in
conjunction with the lowlevel magic ACPI was doing.

What I'm curious about is exactly why the patch matters. What ACPI magic
is being invoked to confuse, or unconfuse, those controllers?

- Dave

Well, my theory is the following:
Without the patch, platform_finish() runs before the i8042's .resume() which is
done as though a real keyboard were present, but the ACPI magic is not done
and this confuses the heck out of the controller. Still, it doesn't go mad at
this point just yet (it probably isn't fully functional either, although we
don't see that, because it's not really used), but next, during the subsequent
suspend, it gets poked while device_power_up() is running and goes belly up.

I think the patch helps, because it makes the ACPI magic be done while the
i8042's .resume() is being executed.
Which makes me think the following patch might help:
drivers/input/serio/i8042.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
Index: linux-2.6.21-rc6/drivers/input/serio/i8042.c
===================================================================
--- linux-2.6.21-rc6.orig/drivers/input/serio/i8042.c 2007-04-07 12:15:19.000000000 +0200
+++ linux-2.6.21-rc6/drivers/input/serio/i8042.c 2007-04-15 18:30:01.000000000 +0200
@@ -846,7 +846,8 @@ static long i8042_panic_blink(long count
 static int i8042_suspend(struct platform_device *dev, pm_message_t state)
 {
 	if (dev->dev.power.power_state.event != state.event) {
-		if (state.event == PM_EVENT_SUSPEND)
+		if (state.event == PM_EVENT_SUSPEND
+		    || state.event == PM_EVENT_PRETHAW)
 			i8042_controller_reset();
 
 		dev->dev.power.power_state = state;
---

--
Tobias						PGP: http://9ac7e0bc.uguu.de

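The effect of the i8042_suspend() change can be modelled in a few lines.
This is a rough Python sketch of the condition only, not kernel code:
PmEvent and resets_controller are hypothetical stand-ins for the
PM_EVENT_* values and for the decision to call i8042_controller_reset(),
and the real driver does more than this.

```python
from enum import Enum, auto


class PmEvent(Enum):
    """Stand-ins for the kernel's PM_EVENT_* values (illustrative only)."""
    ON = auto()
    FREEZE = auto()
    PRETHAW = auto()
    SUSPEND = auto()


def resets_controller(current: PmEvent, requested: PmEvent,
                      patched: bool) -> bool:
    """Model of when i8042_suspend() would reset the controller."""
    if current == requested:
        # Mirrors the outer "!=" check: no state change, nothing to do.
        return False
    if requested is PmEvent.SUSPEND:
        # Reset on suspend, both before and after the patch.
        return True
    # The patch additionally resets on PRETHAW.
    return patched and requested is PmEvent.PRETHAW


for patched in (False, True):
    for event in (PmEvent.SUSPEND, PmEvent.PRETHAW, PmEvent.FREEZE):
        print(patched, event.name,
              resets_controller(PmEvent.ON, event, patched))
```

In this model the only case that changes with the patch is a PRETHAW
request, which is the event passed to device_suspend() in
software_resume() in the earlier patch.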
Well, this means i8042 can be ruled out, so the problem probably is
related to the ACPI resume which makes it _much_ more difficult to debug.

Can you compile the ACPI drivers: processor, thermal, fan, battery, etc.
as modules, boot the kernel with init=/bin/bash and see if the problem is
still present (please keep CONFIG_SERIO_I8042 unset just in case)?

Rafael

I first tried it with acpi+cpufreq completely disabled (works).
Then I tried it with acpi enabled, but everything as modules and those
not loaded (init=/bin/bash, hangs at second suspend).

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.21-rc7
# Sun Apr 22 09:26:07 2007
#
CONFIG_X86_64=y
CONFIG_64BIT=y
CONFIG_X86=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_ZONE_DMA32=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_CMPXCHG=y
CONFIG_EARLY_PRINTK=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_DMI=y
CONFIG_AUDIT_ARCH=y
CONFIG_GENERIC_BUG=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
# CONFIG_IPC_NS is not set
CONFIG_SYSVIPC_SYSCTL=y
# CONFIG_POSIX_MQUEUE is not set
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
# CONFIG_TASKSTATS is not set
# CONFIG_UTS_NS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_SYSFS_DEPRECATED is not set
# CONFIG_RELAY is not set
# CONFIG_BLK_DEV_INITRD is not set
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
CONFIG_EMBEDDED=y
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SHMEM=y
CONFIG_SLAB=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SH...

Have you tried with ACPI and without cpufreq?

Rafael

Yes, the second one was with ACPI enabled and cpufreq disabled
(CONFIG_X86_ACPI_CPUFREQ is not set).

--
Tobias						PGP: http://9ac7e0bc.uguu.de

Yeah, lack of PRETHAW support could be an issue. As you may recall, it
was added because otherwise statically linked USB host controllers came
up under the mistaken belief that they were getting a real resume event
rather than a restart-after-power-off ... and there needed to be a way to
force a hard reset. Seems like a similar issue here.

This is weird as i8042 does not use suspend_late/resume_early hooks and
so it is impossible for it to hang there. None of the input drivers use these

I would say that every box that does not use PS/2 keyboard does this. IOW
every box with USB keyboard has legacy emulation turned on so quite a few
of them...

--
Dmitry

Yes. Tobias, can you please post the dmesg output from after a successful

Quite some people I know use USB keyboards with notebooks, but in these
cases the PS/2 keyboard is still attached (except for notebooks in which
the built-in

I have such a machine nearby, so I'll see if I can reproduce the problem.

Greetings,
Rafael

Here you go:

[ 0.000000] Linux version 2.6.21-rc6 (ranma@melchior) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #16 PREEMPT Sun Apr 15 09:39:32 CEST 2007
[ 0.000000] Command line: root=/dev/sda5 resume=/dev/sda6 vga=6 apic=verbose ro
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
[ 0.000000] BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
[ 0.000000] BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
[ 0.000000] BIOS-e820: 0000000000100000 - 000000003fee0000 (usable)
[ 0.000000] BIOS-e820: 000000003fee0000 - 000000003fee3000 (ACPI NVS)
[ 0.000000] BIOS-e820: 000000003fee3000 - 000000003fef0000 (ACPI data)
[ 0.000000] BIOS-e820: 000000003fef0000 - 000000003ff00000 (reserved)
[ 0.000000] BIOS-e820: 00000000f0000000 - 00000000f4000000 (reserved)
[ 0.000000] BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
[ 0.000000] Entering add_active_range(0, 0, 159) 0 entries of 256 used
[ 0.000000] Entering add_active_range(0, 256, 261856) 1 entries of 256 used
[ 0.000000] end_pfn_map = 1048576
[ 0.000000] DMI 2.4 present.
[ 0.000000] ACPI: RSDP 000F7B80, 0024 (r2 Nvidia)
[ 0.000000] ACPI: XSDT 3FEE30C0, 004C (r1 Nvidia ASUSACPI 42302E31 AWRD 0)
[ 0.000000] ACPI: FACP 3FEEC540, 00F4 (r3 Nvidia ASUSACPI 42302E31 AWRD 0)
[ 0.000000] ACPI: DSDT 3FEE3240, 92AD (r1 NVIDIA AWRDACPI 1000 MSFT 3000000)
[ 0.000000] ACPI: FACS 3FEE0000, 0040
[ 0.000000] ACPI: SSDT 3FEEC740, 00F4 (r1 PTLTD POWERNOW 1 LTP 1)
[ 0.000000] ACPI: HPET 3FEEC880, 0038 (r1 Nvidia ASUSACPI 42302E31 AWRD 98)
[ 0.000000] ACPI: MCFG 3FEEC900, 003C (r1 Nvidia ASUSACPI 42302E31 AWRD 0)
[ 0.000000] ACPI: APIC 3FEEC680, 007C (r1 Nvidia ASUSACPI 42302E31 AWRD 0)
[ 0.000000] Entering add_active_range(0, 0, 159) 0 entries of 256 used
[ 0.000000] Entering add_active_range(0, 256, 261856) 1 entries ...

Thanks.
[--snip--]
Hmm, it looks like i8042 is the last thing on the dpm_off_irq list. Still,
if the ACPI resume fails, the next messages may not make it to the console
(it's not very probable, though).
I've tried to reproduce your problem on another box on which I have no PS/2
keyboard (USB keyboard/mouse only) and the USB legacy support set, but I can't.
There must be something very special in your configuration.
Have you tried the patch that I posted some time ago (appended again for
convenience)?
Rafael
drivers/input/serio/i8042.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
Index: linux-2.6.21-rc6/drivers/input/serio/i8042.c
===================================================================
--- linux-2.6.21-rc6.orig/drivers/input/serio/i8042.c 2007-04-07 12:15:19.000000000 +0200
+++ linux-2.6.21-rc6/drivers/input/serio/i8042.c 2007-04-15 18:30:01.000000000 +0200
@@ -846,7 +846,8 @@ static long i8042_panic_blink(long count
 static int i8042_suspend(struct platform_device *dev, pm_message_t state)
 {
 	if (dev->dev.power.power_state.event != state.event) {
-		if (state.event == PM_EVENT_SUSPEND)
+		if (state.event == PM_EVENT_SUSPEND
+		    || state.event == PM_EVENT_PRETHAW)
 			i8042_controller_reset();
 
 		dev->dev.power.power_state = state;

One person reporting a regression against a -rc kernel can mean
hundreds or thousands of people who will run into the same issue after
cu
Adrian
Well, in this particular case it is not very likely to happen. I have
three x86_64 machines here with totally different chipsets/devices on
which I'm not seeing anything like that and I believe we'd have more
reports before if that were a common issue.

That said, I'm not going to ignore it. I'll do my best to debug and fix
it, if Tobias helps me. :-)

Greetings,
Rafael

Still no luck with
Linux melchior 2.6.21-rc6-gd791d413-dirty #4 PREEMPT Sat Apr 14 09:34:21 CEST 2007 x86_64 GNU/Linux

Hmm, I just noticed that CONFIG_HPET_TIMER was forced back on after
make oldconfig... Is that expected on amd64?

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.21-rc6
# Sat Apr 14 09:33:36 2007
#
CONFIG_X86_64=y
CONFIG_64BIT=y
CONFIG_X86=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_ZONE_DMA32=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_CMPXCHG=y
CONFIG_EARLY_PRINTK=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_DMI=y
CONFIG_AUDIT_ARCH=y
CONFIG_GENERIC_BUG=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
# CONFIG_IPC_NS is not set
CONFIG_SYSVIPC_SYSCTL=y
# CONFIG_POSIX_MQUEUE is not set
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
# CONFIG_TASKSTATS is not set
# CONFIG_UTS_NS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_SYSFS_DEPRECATED is not set
# CONFIG_RELAY is not set
# CONFIG_BLK_DEV_INITRD is not set
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SHMEM=y
CONFIG_SLAB=y
CONFIG_...

Yes it is (on i386 you can disable it).
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
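Editorial aside on the exchange above: one way to confirm what "make oldconfig" did to a single option is to look it up in the resulting config file. The sketch below is only an illustration; the helper name config_state is made up here, and it assumes a plain-text config file (for /proc/config.gz, which CONFIG_IKCONFIG_PROC=y provides, the file would first have to be decompressed).

```shell
#!/bin/sh
# Hypothetical helper: print the value of one option from a kernel config
# file, or "unset" if the option is commented out or missing.
config_state() {
    option=$1
    file=$2
    # Match either "OPTION=value" or "# OPTION is not set".
    line=$(grep -E "^(# )?${option}(=| is not set)" "$file" | head -n 1)
    case "$line" in
        "${option}="*) echo "${line#*=}" ;;   # strip everything up to "="
        *)             echo "unset" ;;
    esac
}

# Only report if a .config is present in the current directory.
if [ -r .config ]; then
    echo "CONFIG_HPET_TIMER: $(config_state CONFIG_HPET_TIMER .config)"
fi
```

Run before and after "make oldconfig", this would show whether the option was switched back on.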
Can you boot with init=/bin/bash and see if the problem is present in this configuration?

Rafael
Doesn't help.

Maybe interesting: In the init=/bin/bash run, the first suspend try was without swap and thus bailed out. After swapon, the second try already hung, despite not having 'really' suspended at all on the first try. I tried it once more, with swap on the first try and got the same 'second try doesn't work' result.

git-bisect so far:

git-bisect start
# good: [fa285a3d7924a0e3782926e51f16865c5129a2f7] Linux 2.6.20
git-bisect good fa285a3d7924a0e3782926e51f16865c5129a2f7
# bad: [2eb1ae149a28c1b8ade687c5fbab3c37da4c0fba] Linux 2.6.21-rc1
git-bisect bad 2eb1ae149a28c1b8ade687c5fbab3c37da4c0fba
# bad: [574009c1a895aeeb85eaab29c235d75852b09eb8] Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
git-bisect bad 574009c1a895aeeb85eaab29c235d75852b09eb8
# good: [43187902cbfafe73ede0144166b741fb0f7d04e1] Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6
git-bisect good 43187902cbfafe73ede0144166b741fb0f7d04e1
# good: [beda9f3a13bbb22cde92a45f230a02ef2afef6a9] kbuild: more Makefile cleanups
git-bisect good beda9f3a13bbb22cde92a45f230a02ef2afef6a9
# bad: [7edc136ab688f751037a86e8a051151d7962d33f] Char: isicom, support higher rates
git-bisect bad 7edc136ab688f751037a86e8a051151d7962d33f

--
Tobias
PGP: http://9ac7e0bc.uguu.de
This mail is made of 100% recycled bits.

Coming up next :)

--
Tobias
PGP: http://9ac7e0bc.uguu.de
This mail is made of 100% recycled bits.
On Thu, Apr 05, 2007 at 07:50:11PM -0700, Linus Torvalds wrote:

This one breaks resume for me (from STR) on a vaio SZ. Reverting this commit allows resuming again but leaves me with some periodic and unpleasant:

[ 155.232000] BUG: soft lockup detected on CPU#1!
[ 155.232000] [<c0104cf2>] show_trace_log_lvl+0x1a/0x2f
[ 155.232000] [<c0105344>] show_trace+0x12/0x14
[ 155.232000] [<c01053c8>] dump_stack+0x16/0x18
[ 155.232000] [<c0147240>] softlockup_tick+0xa7/0xb6
[ 155.232000] [<c01284d3>] run_local_timers+0x12/0x14
[ 155.232000] [<c012887a>] update_process_times+0x3e/0x63
[ 155.232000] [<c0137656>] tick_sched_timer+0x50/0x95
[ 155.232000] [<c01340e0>] hrtimer_interrupt+0x10b/0x18b
[ 155.232000] [<c01137b7>] smp_apic_timer_interrupt+0x6c/0x7e
[ 155.232000] [<c0104840>] apic_timer_interrupt+0x28/0x30
[ 155.232000] [<c0102318>] cpu_idle+0x1b/0xc7
[ 155.232000] [<c011297a>] start_secondary+0x32b/0x333
[ 155.232000] [<00000000>] run_init_process+0x3fefed10/0x19
[ 155.232000] =======================

FWIW: I hit the same BUG() in -rc5.

full boot+suspend+resume log: http://oioio.altervista.org/linux/kern-2.6.21-rc6.log
.config: http://oioio.altervista.org/linux/config-2.6.21-rc6-1

I'm available to test more patches or to provide other info.

--
Strange, strange...

First of all, try to boot with clocksource=acpi_pm (I want to test whether HPET working as a clocksource is the problem).

Then try to boot with hpet=disable or unset CONFIG_HPET_TIMER (this will disable HPET both as clocksource and as clockevent).

Please also send the contents of /proc/timer_list (I want to know whether the APIC timer is enabled there or not).

Best regards,
Maxim Levitsky
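Editorial aside: the checks requested above can be gathered in one go after each test boot. This is a rough sketch, not something Maxim posted. The function names are made up for illustration, and it assumes the running kernel exposes the clocksource sysfs file and /proc/timer_list; anything missing is simply skipped.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the kernel command line in $2 contains
# the exact parameter $1 as a whole word.
cmdline_has() {
    case " $2 " in
        *" $1 "*) return 0 ;;
    esac
    return 1
}

# Print which of the suggested boot parameters are active, the clocksource
# currently in use, and the timer list.
report_timer_setup() {
    booted_with=$(cat /proc/cmdline)
    for param in clocksource=acpi_pm hpet=disable; do
        if cmdline_has "$param" "$booted_with"; then
            echo "$param: present on command line"
        else
            echo "$param: not present"
        fi
    done
    cs=/sys/devices/system/clocksource/clocksource0/current_clocksource
    if [ -r "$cs" ]; then
        echo "current clocksource: $(cat "$cs")"
    fi
    if [ -r /proc/timer_list ]; then
        cat /proc/timer_list
    fi
}

if [ -r /proc/cmdline ]; then
    report_timer_setup
fi
```

Redirecting the output to a file per boot makes it easy to attach all three variants to a reply.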
Yes... strange. I can't reproduce the resume breakage anymore, with or without your patch. I still have the soft lockup anyway after resuming. I'll still keep trying, for now just disregard my previous mail.

--

A couple more bits of info (probably useless but...):
- I noticed the resume problem in -rc6-mm1 but reverting the same patch there doesn't make the laptop resume again
- last known successfully resuming kernel: 2.6.21-rc5-mm3 (and without hitting the BUG() above after resume)

--
i just got the crash below (with slab debug enabled) on -rc6-git4. I
never saw this one before, and as you can see from the recompile count,
i've rebuilt this tree a fair number of times - and the config didn't
change much.
I promptly re-tried the same bzImage but the crash did not reoccur.
So we've got a memory corruptor of some sort in v2.6.21-to-be. I'm 100%
sure that i never saw this under any v2.6.20 variant or on any prior
kernel. The crash site corresponds to a module-refcount dec:
(gdb) list *0x00000000c013c1f4
0xc013c1f4 is in module_put (kernel/module.c:801).
796
797 void module_put(struct module *module)
798 {
799 if (module) {
800 unsigned int cpu = get_cpu();
801 local_dec(&module->ref[cpu].count);
802 /* Maybe they're waiting for us to drop reference? */
803 if (unlikely(!module_is_live(module)))
804 wake_up_process(module->waiter);
805 put_cpu();
(gdb)
NOTE: i'm still using a bzImage kernel, so there are no true modules in
the kernel. (This also makes it pretty likely that this is not a build
artifact either.)
(config and full bootlog attached.)
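Editorial aside on the register dump in the oops below: with slab debugging enabled, freed objects are filled with the byte 0x6b, so edx and edi holding 6b6b6b6b suggest that the module pointer was read from already-freed memory. The faulting address in eax then sits a fixed distance above that pattern, which would be consistent with the offset of the per-CPU refcount inside struct module (that last reading is an assumption, not something verified against this build). The arithmetic, as a small sketch:

```shell
#!/bin/sh
# Values copied from the oops: edx/edi hold the slab free-poison pattern,
# eax holds the address that faulted.
poison=0x6b6b6b6b
fault=0x6b6b6ceb

# Distance between the faulting address and the poison pattern.
offset=$(( $fault - $poison ))
printf 'fault is poison + %d (0x%x)\n' "$offset" "$offset"
# prints: fault is poison + 384 (0x180)
```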
Ingo
---------------------->
BUG: unable to handle kernel paging request at virtual address 6b6b6ceb
printing eip:
c013c1f5
*pde = 0203000c
Oops: 0002 [#1]
SMP
Modules linked in:
CPU: 0
EIP: 0060:[<c013c1f5>] Not tainted VLI
EFLAGS: 00010256 (2.6.21-rc6 #273)
EIP is at module_put+0x19/0x2d
eax: 6b6b6ceb ebx: f72fee2c ecx: c03c9b36 edx: 6b6b6b6b
esi: f7428f54 edi: 6b6b6b6b ebp: f737bf38 esp: f737bf38
ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
Process udev (pid: 1768, ti=f737a000 task=f7488000 task.ti=f737a000)
Stack: f737bf50 c019e832 f749092c 00000010 f72feda4 f746487c f737bf78 c0167c7f
00000000 00000000 f72f6ba4 c2928d48 f72feda4 f746487c f7be81d4 00000000
f737bf80 c0167d3b f737...

I couldn't get suspend-to-disk to work with 2.6.21-rc6. I've tried set/unset CONFIG_NO_HZ/CONFIG_HPET_TIMER, but nothing worked. With rc5 and Maxim's patch, it worked with CONFIG_NO_HZ unset. This is on ThinkPad X60s.

Jeff.
Do you think you could bisect it?

You'd have to apply Maxim's patch by hand at each bisection step (up until the point where it's already applied in the git tree, of course), so it's not a totally mindless bisection, but it should still be fairly painless, since there are only 277 commits between -rc5 and -rc6 (so bisection should rather quickly narrow it down).

Linus
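Editorial aside: the "apply the patch by hand at each step" procedure described above could look roughly like the sketch below. It is an illustration under assumptions, not a script from this thread: the helper names are made up, and the patch is assumed to have been saved to a local file whose path is passed in. The first function shows why 277 commits is painless: each round halves the range.

```shell
#!/bin/sh
# Worst-case number of build-and-test rounds to narrow N commits down to one.
bisect_rounds() {
    n=$1
    rounds=0
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))        # the larger half may survive each round
        rounds=$(( rounds + 1 ))
    done
    echo "$rounds"
}

# Before building: carry the patch by hand unless the tree already has it.
# $1 is a hypothetical path to the saved patch, e.g. /tmp/maxim-fix.patch.
bisect_prepare() {
    if git apply --reverse --check "$1" 2>/dev/null; then
        echo "patch already present at this commit"
    elif git apply --check "$1" 2>/dev/null; then
        git apply "$1"
    else
        echo "patch does not apply cleanly here; fix it up by hand" >&2
        return 1
    fi
}

# After testing: drop the hand-applied change, then record the verdict.
# $1 is "good" or "bad".
bisect_mark() {
    git reset --hard && git bisect "$1"
}

echo "277 commits: at most $(bisect_rounds 277) rounds"
# prints: 277 commits: at most 9 rounds
```

A session would start with "git bisect start", "git bisect bad v2.6.21-rc6" and "git bisect good v2.6.21-rc5" (assuming the usual release tags), then alternate bisect_prepare, a build and suspend test, and bisect_mark. Note that "git reset --hard" throws away every uncommitted change to tracked files, so it should only be used in a tree kept for bisecting.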
Linus,

I did that last night and realized that I could suspend to disk/ram with 2.6.21-rc6 and CONFIG_NO_HZ unset. I must have done something wrong before.

Thank you,
Jeff.
I'm sitting on five patches which look like 2.6.21 material, but which would normally go through subsystem maintainers:

pcmcia:
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc6/2.6.21-rc6-mm...
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc6/2.6.21-rc6-mm...

driver core:
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc6/2.6.21-rc6-mm...

netdev:
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc6/2.6.21-rc6-mm...
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc6/2.6.21-rc6-mm...

net:
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc6/2.6.21-rc6-mm...

please send acks, nacks or smacks asap, thanks.
We should not encourage using platform_device_register_simple as we want to obsolete this function.

--
Dmitry
It sounded like this was specific to Ingo. I haven't heard anybody else ACK this one.

Need to send this up, but I'm intentionally avoiding work as we are having a big Easter bash here in Raleigh. Silly bunny-related traditions that have nothing to do with Jesus take priority ;-)

I have a couple other bug fixes to push, but that will wait until Tuesday.

Jeff
I'm not sure, it sounds a bit like something I saw a while ago. I would have to check to be sure. I made a quick debugging patch (sent to netdev) and it went away, so I think my last thought was a miscompilation.
