Re: tg3: unable to handle null pointer dereference [Re: Linux 2.6.21-rc6]

From: Linus Torvalds
Date: Thursday, April 5, 2007 - 7:50 pm

Ok,
 I don't think there really is anything very interesting here, but we're 
hopefully whittling down the list of regressions, and fixing various 
random other small issues while at it.

Some smallish MIPS updates, networking (and network driver) fixes, removal 
of a long obsolete framebuffer driver, etc etc. The shortlog really tells 
the story.

We should be getting close to a 2.6.21 release, so please update any 
regression reports you've done,

		Linus

---
Adrian Bunk (6):
      [DCCP]: make dccp_write_xmit_timer() static again
      9p: make struct v9fs_cached_file_operations static
      drivers/spi/: fix section mismatches
      drivers/eisa/pci_eisa.c:pci_eisa_init() should be init
      drivers/mfd/sm501.c: fix an off-by-one
      net/sunrpc/svcsock.c: fix a check

Alan Cox (2):
      tty: minor merge correction
      pata_pdc202xx_old: LBA48 bug

Alan Stern (1):
      UHCI: Fix problem caused by lack of terminating QH

Albert Lee (5):
      pdc202xx_new: Enable ATAPI DMA
      libata: reorder HSM_ST_FIRST for easier decoding (take 3)
      libata: Clear tf before doing request sense (take 3)
      libata: Limit max sector to 128 for TORiSAN DVD drives (take 3)
      libata: Limit ATAPI DMA to R/W commands only for TORiSAN DVD drives (take 3)

Alexey Dobriyan (1):
      [NET]: Correct accept(2) recovery after sock_attach_fd()

Alexey Kuznetsov (1):
      [NET]: Fix neighbour destructor handling.

Andi Kleen (3):
      x86-64: Disable local APIC timer use on AMD systems with C1E
      x86-64: Let oprofile reserve MSR on all CPUs
      x86-64: Increase NMI watchdog probing timeout

Andreas Oberritter (2):
      V4L/DVB (5495): Tda10086: fix DiSEqC message length
      V4L/DVB (5496): Pluto2: fix incorrect TSCR register setting

Andrew Morton (4):
      proc: fix linkage with CONFIG_SYSCTL=y, CONFIG_PROC_SYSCTL=n
      revert "retries in ext3_prepare_write() violate ordering requirements"
      revert "retries in ext4_prepare_write() violate ...
From: Nishanth Aravamudan
Date: Friday, April 6, 2007 - 2:40 pm

2.6.21-rc5 is ok. 2.6.21-rc6 results in

[   14.241665] Unable to handle kernel NULL pointer dereference (address 0000000000000000)
[   14.250025] swapper[1]: Oops 11003706212352 [1]
[   14.254753] Modules linked in:
[   14.258046] 
[   14.258047] Pid: 1, CPU 7, comm:              swapper
[   14.264962] psr : 00001210084a6010 ifs : 8000000000000610 ip  : [<a000000100495371>]    Not tainted
[   14.274399] ip is at tg3_chip_reset+0xf1/0x12c0
[   14.279124] unat: 0000000000000000 pfs : 0000000000000610 rsc : 0000000000000003
[   14.286862] rnat: e000001005bc7d40 bsps: e000001005bc0000 pr  : 68105a9195655599
[   14.294598] ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70433f
[   14.302338] csd : 0000000000000000 ssd : 0000000000000000
[   14.307946] b0  : a0000001004952c0 b6  : a00000010038b2e0 b7  : a000000100486580
[   14.315688] f6  : 1003e000000054e304351 f7  : 1003e0000000000000640
[   14.322164] f8  : 1003e000000054e2dd251 f9  : 1003e0000000000000064
[   14.328643] f10 : 10015e7d113fff182eec0 f11 : 1003e000000000073e88a
[   14.335116] r1  : a000000100d4be30 r2  : a000000100b68fc0 r3  : a000000100b68eb0
[   14.342851] r8  : 0000000000000000 r9  : 0000000000000200 r10 : a00000010089d1a8
[   14.350597] r11 : a000000100486580 r12 : e000001005bc7d70 r13 : e000001005bc0000
[   14.358332] r14 : 0000000000000002 r15 : e000001005d08f10 r16 : e000001005d08ee0
[   14.366072] r17 : e000001005d08748 r18 : e000001005d08758 r19 : 0000000000000000
[   14.373815] r20 : e000001005d08748 r21 : 0000000000000000 r22 : 0000000040027401
[   14.381557] r23 : 0000000000027401 r24 : 0000000040000000 r25 : a00000010089d2f0
[   14.389293] r26 : a000000100b5b5c0 r27 : 0000000000000000 r28 : 0000000000000000
[   14.397035] r29 : 0000000000000000 r30 : 0000000000000000 r31 : e000001005d08708
[   14.404847] 
[   14.404848] Call Trace:
[   14.409160]  [<a000000100013900>] show_stack+0x80/0xa0
[   14.409162]                                 sp=e000001005bc7900 bsp=e000001005bc1120
[   ...
From: Michael Chan
Date: Friday, April 6, 2007 - 3:57 pm

Sorry, I think this should fix it:

[TG3]: Fix crash during tg3_init_one().

The driver will crash when the chip has been initialized by EFI before
tg3_init_one().  In this case, the driver will call tg3_chip_reset()
before allocating consistent memory.

The bug is fixed by checking for tp->hw_status before accessing it
during tg3_chip_reset().

Signed-off-by: Michael Chan <mchan@broadcom.com>

diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index 0acee9f..256969e 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -4834,8 +4834,10 @@ static int tg3_chip_reset(struct tg3 *tp)
 	 * sharing or irqpoll.
 	 */
 	tp->tg3_flags |= TG3_FLAG_CHIP_RESETTING;
-	tp->hw_status->status = 0;
-	tp->hw_status->status_tag = 0;
+	if (tp->hw_status) {
+		tp->hw_status->status = 0;
+		tp->hw_status->status_tag = 0;
+	}
 	tp->last_tag = 0;
 	smp_mb();
 	synchronize_irq(tp->pdev->irq);
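
The guard in the hunk above can be illustrated outside the kernel. What follows is a minimal userspace sketch, not driver code: struct nic_model, struct hw_status_model and clear_status_for_reset() are hypothetical stand-ins that model only the fields touched by the hunk, assuming the status block pointer is still NULL when the reset runs before consistent memory has been allocated.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures; only the fields
 * touched by the hunk are modelled. */
struct hw_status_model {
	unsigned int status;
	unsigned int status_tag;
};

struct nic_model {
	struct hw_status_model *hw_status; /* NULL until the block is allocated */
	unsigned int last_tag;
};

/* Mirrors the guarded stores: clear the status block only if it exists.
 * Returns 1 if the block was cleared, 0 if the stores were skipped. */
static int clear_status_for_reset(struct nic_model *nic)
{
	int cleared = 0;

	if (nic->hw_status) {
		nic->hw_status->status = 0;
		nic->hw_status->status_tag = 0;
		cleared = 1;
	}
	nic->last_tag = 0;
	return cleared;
}
```

With hw_status still NULL the helper skips the two stores and only resets last_tag; once the block exists, it is cleared exactly as before the patch.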




-

From: David Miller
Date: Friday, April 6, 2007 - 5:36 pm

From: "Michael Chan" <mchan@broadcom.com>

Applied, thanks Michael.
-

From: Nishanth Aravamudan
Date: Friday, April 6, 2007 - 6:53 pm

FWIW, tested, no panic.

Tested-by: Nishanth Aravamudan <nacc@us.ibm.com>

Thanks,
Nish

-- 
Nishanth Aravamudan <nacc@us.ibm.com>
IBM Linux Technology Center
-

From: Soeren Sonnenburg
Date: Friday, April 6, 2007 - 3:44 pm

regression update for 21-rc6:

1) all s2ram and NO_HZ related things seem to be resolved on my macbook
pro, also 
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y

don't break resume anymore.

2) However I am still having problems with
+CONFIG_HIGH_RES_TIMERS=y
+CONFIG_HPET=y
+CONFIG_HPET_MMAP=y
although the machine resumes, I've managed to get the attached oops.

3) Subject    : SATA breakage on resume
References : http://lkml.org/lkml/2007/3/7/233
Submitter  : Thomas Gleixner <tglx@linutronix.de>
             Soeren Sonnenburg <kernel@nn7.de>
Status     : unknown

I am still seeing these messages after a suspend/resume cycle (though
all devices work even after multiple suspend/resume cycles)

ATA: abnormal status 0x80 on port 0x000140df
ata3.01: revalidation failed (errno=-2)
ata3: failed to recover some devices, retrying in 5 secs
ata1.00: configured for UDMA/33
ATA: abnormal status 0x7F on port 0x000140df
ATA: abnormal status 0x7F on port 0x000140df
ata3.01: configured for UDMA/133

So that's been a big step forward...
Soeren
-- 
For the one fact about the future of which we can be certain is that it
will be utterly fantastic. -- Arthur C. Clarke, 1962
From: Linus Torvalds
Date: Friday, April 6, 2007 - 4:04 pm

[ Added some people to the cc.. Len, Thomas, Ingo - look for the exact 
  report on linux-kernel, but basically it's a "irq 9: nobody cared" issue 
  with acpi_irq on irq9 ]



Ok, interesting. I'd have blamed ACPI for this one (stuck IRQ9 is almost 
always some ACPI event that got stuck or the SCI got mis-routed and/or 
marked with the wrong polarity), although from your message I take it you 
don't get it without high-res timers? 

In fact,  I have a theory.. Your backtrace is:

 [<c0119637>] smp_apic_timer_interrupt+0x57/0x90
 [<c0142d30>] retrigger_next_event+0x0/0xb0
 [<c0104d30>] apic_timer_interrupt+0x28/0x30
 [<c0142d30>] retrigger_next_event+0x0/0xb0
 [<c0140068>] __kfifo_put+0x8/0x90
 [<c0130fe5>] on_each_cpu+0x35/0x60
 [<c0143538>] clock_was_set+0x18/0x20
 [<c0135cdc>] timekeeping_resume+0x7c/0xa0
 [<c02aabe1>] __sysdev_resume+0x11/0x80
 [<c02ab0c7>] sysdev_resume+0x47/0x80
 [<c02b0b05>] device_power_up+0x5/0x10

and the thing is, I don't think we should have interrupts enabled at this 
point in time! I suspect that the timer resume enables interrupts too 
early! We should be doing the whole "device_power_up()" sequence with 

This seems to be normal, and related to some unknown timing issue. If the 
thing works for you apart from the message, I'd just ignore it..

		Linus
-

From: Ingo Molnar
Date: Saturday, April 7, 2007 - 1:12 am

yeah, i think you are right. timekeeping_resume() itself does not 
re-enable interrupts, it's clock_was_set() that does it implicitly:

void clock_was_set(void)
{
        /* Retrigger the CPU local events everywhere */
        on_each_cpu(retrigger_next_event, NULL, 0, 1);
}

on_each_cpu() is safe on SMP during resume 'bootup', because we only 
have a single CPU at that point, and smp_call_function() does:

        spin_lock(&call_lock);
        cpus = num_online_cpus() - 1;
        if (!cpus) {
                spin_unlock(&call_lock);

so we just return. Note that the built-in warning of smp_call_function() 
does not trigger because it's done too late:

        /* Can deadlock when called with interrupts disabled */
        WARN_ON(irqs_disabled());

we should move this up to the head of the function. But for this bug in 
question to trigger we'd have to use an UP kernel, which has this code 
for on_each_cpu():

#define on_each_cpu(func,info,retry,wait)       \
        ({                                      \
                local_irq_disable();            \
                func(info);                     \
                local_irq_enable();             \

ouch!

the solution is this: what we want to call here in timekeeping_resume is 
not clock_was_set() but retrigger_next_event() for the current CPU. The 
patch below should fix it. Soeren, can you confirm that you are using a 
!CONFIG_SMP kernel, and if yes, does the patch below fix the resume 
problem for you?
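
The effect described above can be modelled in userspace. This is only a sketch under stated assumptions: a single flag (irqs_on) stands in for the CPU interrupt state, and model_on_each_cpu(), model_retrigger(), resume_via_broadcast() and resume_local_only() are made-up names that mirror the shape of the UP on_each_cpu() macro quoted above, not the kernel implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model: one flag stands in for the CPU interrupt-enable state. */
static bool irqs_on;
static int retriggers;

static void model_irq_disable(void) { irqs_on = false; }
static void model_irq_enable(void)  { irqs_on = true; }

/* Stand-in for retrigger_next_event(): pretend to reprogram the timer. */
static void model_retrigger(void *info)
{
	(void)info;
	retriggers++;
}

/* Shape of the UP on_each_cpu() macro: interrupts are enabled on exit
 * regardless of the state the caller was in. */
static void model_on_each_cpu(void (*func)(void *), void *info)
{
	model_irq_disable();
	func(info);
	model_irq_enable();
}

/* Resume path going through the broadcast helper. */
static bool resume_via_broadcast(void)
{
	model_irq_disable();   /* resume runs with interrupts off */
	model_on_each_cpu(model_retrigger, NULL);
	return irqs_on;        /* true: interrupts leaked back on */
}

/* Resume path retriggering on the current CPU only. */
static bool resume_local_only(void)
{
	model_irq_disable();
	model_retrigger(NULL);
	return irqs_on;        /* false: interrupt state left untouched */
}
```

In this model resume_via_broadcast() returns true (interrupts were switched back on behind the caller's back), while resume_local_only() returns false.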

	Ingo

---------------------------->
Subject: [patch] high-res timers: UP resume fix
From: Ingo Molnar <mingo@elte.hu>

Soeren Sonnenburg reported that upon resume he is getting
this backtrace:

 [<c0119637>] smp_apic_timer_interrupt+0x57/0x90
 [<c0142d30>] retrigger_next_event+0x0/0xb0
 [<c0104d30>] apic_timer_interrupt+0x28/0x30
 [<c0142d30>] retrigger_next_event+0x0/0xb0
 [<c0140068>] __kfifo_put+0x8/0x90
 [<c0130fe5>] on_each_cpu+0x35/0x60
 [<c0143538>] clock_was_set+0x18/0x20
 ...
From: Ingo Molnar
Date: Saturday, April 7, 2007 - 1:25 am

hm, you seem to have a CONFIG_SMP=y kernel. I don't immediately see where 
we re-enable interrupts in the SMP case, but could you try my patch 
nevertheless?

	Ingo
-

From: Thomas Gleixner
Date: Saturday, April 7, 2007 - 1:48 am

We do in on_each_cpu() unconditionally. I missed that.

	tglx


-

From: Ingo Molnar
Date: Saturday, April 7, 2007 - 1:50 am

doh, indeed!

	Ingo
-

From: Rafael J. Wysocki
Date: Saturday, April 7, 2007 - 2:48 am

BTW, the on_each_cpu() in clock_was_set() is unnecessary, because
timekeeping_resume() is always run on one CPU.

Greetings,
Rafael
-

From: Ingo Molnar
Date: Saturday, April 7, 2007 - 2:47 am

yes - but that's not the only place where we do clock_was_set(), and the 
on_each_cpu() is necessary in every other case. So i think the right 
solution was the patch i did: to split the resume functionality from the 
clock_was_set() functionality.

	Ingo
-

From: Thomas Gleixner
Date: Saturday, April 7, 2007 - 2:51 am

Right, I reused it and just did not notice that interrupts are enabled
unconditionally in on_each_cpu().

	tglx


-

From: Rafael J. Wysocki
Date: Saturday, April 7, 2007 - 2:53 am

Agreed.

Rafael
-

From: Pavel Machek
Date: Wednesday, April 11, 2007 - 7:00 am

I wonder if we should add BUG_ON(interrupts_enabled) just before
enabling interrupts to catch similar mistakes early?
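
One way such a check could look, as a userspace sketch only: model_irqs_enabled and early_enable_count are hypothetical, and the counter stands in for the BUG_ON(). In the kernel the condition would presumably be expressed with irqs_disabled(), which the WARN_ON() in smp_call_function() already uses.

```c
#include <stdbool.h>

static bool model_irqs_enabled;          /* stand-in for the CPU interrupt flag */
static unsigned int early_enable_count;  /* stand-in for BUG_ON() firing */

/* Point in the resume path where interrupts are meant to be switched on.
 * If they are already on, something enabled them too early; count it
 * (a kernel version would BUG or WARN here instead). */
static void resume_enable_irqs_checked(void)
{
	if (model_irqs_enabled)
		early_enable_count++;
	model_irqs_enabled = true;
}
```

The first call from a state with interrupts off passes silently; any call made while the flag is already set is counted as an early enable.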

							Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-

From: Thomas Gleixner
Date: Saturday, April 7, 2007 - 1:51 am

Acked-by: Thomas Gleixner <tglx@linutronix.de>


-

From: Ingo Molnar
Date: Saturday, April 7, 2007 - 2:49 am

find updated patch below - only the patch description changed: i removed 
the 'UP' thing (patch has relevance on SMP too), and added Thomas' ack.

	Ingo

---------------------------->
Subject: [patch] high-res timers: resume fix
From: Ingo Molnar <mingo@elte.hu>

Soeren Sonnenburg reported that upon resume he is getting
this backtrace:

 [<c0119637>] smp_apic_timer_interrupt+0x57/0x90
 [<c0142d30>] retrigger_next_event+0x0/0xb0
 [<c0104d30>] apic_timer_interrupt+0x28/0x30
 [<c0142d30>] retrigger_next_event+0x0/0xb0
 [<c0140068>] __kfifo_put+0x8/0x90
 [<c0130fe5>] on_each_cpu+0x35/0x60
 [<c0143538>] clock_was_set+0x18/0x20
 [<c0135cdc>] timekeeping_resume+0x7c/0xa0
 [<c02aabe1>] __sysdev_resume+0x11/0x80
 [<c02ab0c7>] sysdev_resume+0x47/0x80
 [<c02b0b05>] device_power_up+0x5/0x10

it turns out that on resume we mistakenly re-enable interrupts.
Do the timer retrigger only on the current CPU.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h |    3 +++
 kernel/hrtimer.c        |   12 ++++++++++++
 2 files changed, 15 insertions(+)

Index: linux/include/linux/hrtimer.h
===================================================================
--- linux.orig/include/linux/hrtimer.h
+++ linux/include/linux/hrtimer.h
@@ -206,6 +206,7 @@ struct hrtimer_cpu_base {
 struct clock_event_device;
 
 extern void clock_was_set(void);
+extern void hres_timers_resume(void);
 extern void hrtimer_interrupt(struct clock_event_device *dev);
 
 /*
@@ -236,6 +237,8 @@ static inline ktime_t hrtimer_cb_get_tim
  */
 static inline void clock_was_set(void) { }
 
+static inline void hres_timers_resume(void) { }
+
 /*
  * In non high resolution mode the time reference is taken from
  * the base softirq time variable.
Index: linux/kernel/hrtimer.c
===================================================================
--- linux.orig/kernel/hrtimer.c
+++ linux/kernel/hrtimer.c
@@ -459,6 +459,18 @@ void ...
From: Rafael J. Wysocki
Date: Saturday, April 7, 2007 - 3:02 am

Hm, I'm probably missing something obvious, but where is it going to be called
from?

Rafael
-

From: Ingo Molnar
Date: Saturday, April 7, 2007 - 3:05 am

doh! :) Find new patch below :-/ Soeren, please test this one.

	Ingo

---------------------------->
Subject: [patch] high-res timers: resume fix
From: Ingo Molnar <mingo@elte.hu>

Soeren Sonnenburg reported that upon resume he is getting
this backtrace:

 [<c0119637>] smp_apic_timer_interrupt+0x57/0x90
 [<c0142d30>] retrigger_next_event+0x0/0xb0
 [<c0104d30>] apic_timer_interrupt+0x28/0x30
 [<c0142d30>] retrigger_next_event+0x0/0xb0
 [<c0140068>] __kfifo_put+0x8/0x90
 [<c0130fe5>] on_each_cpu+0x35/0x60
 [<c0143538>] clock_was_set+0x18/0x20
 [<c0135cdc>] timekeeping_resume+0x7c/0xa0
 [<c02aabe1>] __sysdev_resume+0x11/0x80
 [<c02ab0c7>] sysdev_resume+0x47/0x80
 [<c02b0b05>] device_power_up+0x5/0x10

it turns out that on resume we mistakenly re-enable interrupts.
Do the timer retrigger only on the current CPU.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/hrtimer.h |    3 +++
 kernel/hrtimer.c        |   12 ++++++++++++
 kernel/timer.c          |    2 +-
 3 files changed, 16 insertions(+), 1 deletion(-)

Index: linux/include/linux/hrtimer.h
===================================================================
--- linux.orig/include/linux/hrtimer.h
+++ linux/include/linux/hrtimer.h
@@ -206,6 +206,7 @@ struct hrtimer_cpu_base {
 struct clock_event_device;
 
 extern void clock_was_set(void);
+extern void hres_timers_resume(void);
 extern void hrtimer_interrupt(struct clock_event_device *dev);
 
 /*
@@ -236,6 +237,8 @@ static inline ktime_t hrtimer_cb_get_tim
  */
 static inline void clock_was_set(void) { }
 
+static inline void hres_timers_resume(void) { }
+
 /*
  * In non high resolution mode the time reference is taken from
  * the base softirq time variable.
Index: linux/kernel/hrtimer.c
===================================================================
--- linux.orig/kernel/hrtimer.c
+++ linux/kernel/hrtimer.c
@@ -459,6 +459,18 @@ void clock_was_set(void)
 }
 
 /*
+ * During resume ...
From: Soeren Sonnenburg
Date: Saturday, April 7, 2007 - 3:45 am

OK, I did about 5 suspend/resume cycles with

CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_HPET=y
CONFIG_HPET_MMAP=y

and no oops / no problem ...

So I guess the fix take #3 is good :-)

One thing not directly related to this patch (but probably to all the
timer stuff) that I noticed with -rc6 is that it takes 10 seconds to
suspend (it was ~2 seconds before)

-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.
-

From: Soeren Sonnenburg
Date: Sunday, April 8, 2007 - 8:57 am

On Fri, 2007-04-06 at 16:04 -0700, Linus Torvalds wrote:


Argh! Now after intensive use over the last 2 days, I realized that the
internal harddisk works OK, but the dvd-drive did not after the 7th
suspend/resume cycle - the device was suddenly gone (I could not even
eject the disc I just inserted), more verbose dmesg follows:

ata1: port is slow to respond, please be patient (Status 0x80)
ata1: port failed to respond (30 secs, Status 0x80)
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ata1.00: qc timeout (cmd 0xa1)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata1.00: revalidation failed (errno=-5)
ata1: failed to recover some devices, retrying in 5 secs
ata1: port is slow to respond, please be patient (Status 0x80)
ata1: port failed to respond (30 secs, Status 0x80)
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ATA: abnormal status 0x80 on port 0x000101f7
ata1.00: qc timeout (cmd 0xa1)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata1.00: revalidation failed (errno=-5)
ata1.00: limiting speed to UDMA/33:PIO3
ata1: failed to recover some devices, retrying in 5 secs
ata1: port is slow to respond, please be patient (Status 0x80)
ata1: port failed to respond (30 secs, Status 0x80)
ATA: abnormal status 0x80 on port 0x000101f7
last message repeated 4 times
ata1.00: qc timeout (cmd 0xa1)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata1.00: revalidation failed (errno=-5)
ata1.00: disabled

Soeren
-- 
For the one fact about the future of which we can be certain is that it
will be utterly fantastic. -- Arthur C. Clarke, 1962
-

From: Michal Piotrowski
Date: Saturday, April 7, 2007 - 1:48 am

Hi all,

This looks like a lockdep problem.
2.6.21-rc6
+ hrtimers_debug.patch (from Ingo)
- skge_wol_support (commit a504e64ab42bcc27074ea37405d06833ed6e0820) dropped due to
swsusp problems

[14016.726946] BUG: at /mnt/md0/devel/linux-git/kernel/lockdep.c:2427 check_flags()
[14016.734331]  [<c0105039>] show_trace_log_lvl+0x1a/0x2f
[14016.739507]  [<c0105720>] show_trace+0x12/0x14
[14016.743982]  [<c01057d2>] dump_stack+0x16/0x18
[14016.748460]  [<c013b57f>] check_flags+0x95/0x143
[14016.753106]  [<c013e334>] lock_acquire+0x29/0x82
[14016.757741]  [<c01369dc>] down_write+0x3a/0x54
[14016.762203]  [<c0163be2>] sys_munmap+0x23/0x3f
[14016.766661]  [<c0104060>] syscall_call+0x7/0xb
[14016.771134]  =======================
[14016.774712] irq event stamp: 43076
[14016.778111] hardirqs last  enabled at (43075): [<c0104189>] syscall_exit_work+0x11/0x26
[14016.786166] hardirqs last disabled at (43076): [<c0103f09>] ret_from_exception+0x9/0xc
[14016.794118] softirqs last  enabled at (42608): [<c012653b>] __do_softirq+0xe4/0xea
[14016.801706] softirqs last disabled at (42599): [<c01069b5>] do_softirq+0x64/0xd1

http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc6/git-console.log
http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc6/git-config

BTW. I noticed some strange fio (1.15) behavior
Starting 16 processes
file:io_u.c:65, assert idx < f->num_maps failed[  1605/ 36442 kb/s] [eta 00m:32s]
fio: pid=13734, got signal=11
file:io_u.c:65, assert idx < f->num_maps failed[ 10452/     0 kb/s] [eta 00m:23s]
fio: pid=13731, got signal=11

Regards,
Michal

-- 
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
-

From: Randy Dunlap
Date: Saturday, April 7, 2007 - 11:37 am

Is it too late to get a v2.6.21-rc6 tag ?

---
~Randy
-

From: Linus Torvalds
Date: Saturday, April 7, 2007 - 11:46 am

It's definitely there, I can see it in gitweb..

Do you have some really ancient git that didn't fetch the tags 
automatically?

		Linus
-

From: Randy Dunlap
Date: Saturday, April 7, 2007 - 11:50 am

Could be.  I'll check that.

Thanks.
---
~Randy
-

From: Linus Torvalds
Date: Saturday, April 7, 2007 - 11:51 am

Oh, my bad. I'd tagged it, but I didn't *sign* the tag, so it was just a 
tag-reference (and git fetch won't fetch them by default).

I replaced the v2.6.21-rc6 tag with a signed one. Do 

	git fetch --tags

to get the thing.

		Linus
-

From: Gene Heskett
Date: Saturday, April 7, 2007 - 1:58 pm

FWIW, this last reversion didn't do it quite right: the device-mapper was 
at 253 prior to this patch's parent patch, and now it's at 252, which is 
still a 'dump it all' change for both tar & dump.  Until things settle, 
I'm going to test and probably use the instructions that Dave Dillow just 
sent me, which should put it at 238 regardless.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
"Looks clean and obviously correct to me, but then _everything_ I write
 always looks obviously correct yo me."

	- Linus
-

From: Greg KH
Date: Sunday, April 8, 2007 - 5:42 pm

Feel free to forward it on with:
	Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

As it was just a documentation update, I figured it was safe to wait for
2.6.22, but I have no objection to it going in now.

thanks,

greg k-h
-

From: Jeff Garzik
Date: Sunday, April 8, 2007 - 5:59 pm

It sounded like this was specific to Ingo.  I haven't heard anybody else 

ACK this one.  Need to send this up, but I'm intentionally avoiding work 
as we are having a big Easter bash here in Raleigh.  Silly bunny-related 
traditions that have nothing to do with Jesus take priority ;-)

I have a couple other bug fixes to push, but that will wait until Tuesday.

	Jeff


-

From: Chris Wedgwood
Date: Tuesday, April 10, 2007 - 12:57 am

I'm not sure, it sounds a bit like something I saw a while ago.  I
would have to check for sure, I made a quick debugging patch (sent to
netdev) and it went away so I think my last thought was a
miscompilation.

-

From: Ingo Molnar
Date: Wednesday, April 11, 2007 - 12:38 am

the bug has turned into an 'interface hang under high load' (i.e. the 
hack patch above is not necessary, but the problem is still there). It 
still affects the latest forcedeth.c in -rc6. I.e. it's still an 
unresolved regression. The last state i'm aware of is that I have sent 
Ayaz ethtool output as well of the hang, as requested.

	Ingo
-

From: Dmitry Torokhov
Date: Monday, April 9, 2007 - 8:32 pm

We should not encourage using platform_device_register_simple as we want
to obsolete this function.

-- 
Dmitry
-

From: Jeff Chua
Date: Tuesday, April 10, 2007 - 7:35 am

I couldn't get suspend-to-disk to work with 2.6.21-rc6. I've tried
set/unset CONFIG_NO_HZ/CONFIG_HPET_TIMER, but nothing worked.

With rc5 and Maxim's patch, it worked with CONFIG_NO_HZ unset.

This is on ThinkPad X60s.

Jeff.
-

From: Linus Torvalds
Date: Tuesday, April 10, 2007 - 8:35 am

Do you think you could bisect it? You'd have to apply Maxim's patch by 
hand at each bisection step (up until the point where it's already applied 
in the git tree, of course), so it's not a totally mindless bisection, but 
it should still be fairly painless, since there are only 277 commits 
between -rc5 and -rc6 (so bisection should rather quickly narrow it down)
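
For a rough idea of the effort: under a simple halving model (an assumption; git's own estimate can differ by a step, and merge history complicates the picture), the worst-case number of test builds is the smallest k with 2^k >= n. bisect_steps() below is a hypothetical helper written for this illustration, not part of git.

```c
/* Worst-case number of halving steps needed to isolate one commit out of
 * n candidates: the smallest k such that 2^k >= n. */
static unsigned int bisect_steps(unsigned int n)
{
	unsigned int steps = 0;
	unsigned int span = 1;

	while (span < n) {
		span *= 2;
		steps++;
	}
	return steps;
}
```

For the 277 commits mentioned above this gives 9, since 2^8 = 256 < 277 <= 512 = 2^9.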

		Linus
-

From: Jeff Chua
Date: Wednesday, April 11, 2007 - 9:16 pm

Linus,

I did that last night and realized that I could suspend to disk/ram
with 2.6.21-rc6 with CONFIG_NO_HZ unset. I must have done something wrong
before.

Thank you,
Jeff.
-

From: Ingo Molnar
Date: Thursday, April 12, 2007 - 2:55 am

i just got the crash below (with slab debug enabled) on -rc6-git4. I 
never saw this one before, and as you can see from the recompile count, 
i've rebuilt this tree a fair number of times - and the config didn't 
change much.

I promptly re-tried the same bzImage but the crash did not reoccur.

So we've got a memory corruptor of some sort in v2.6.21-to-be. I'm 100% 
sure that i never saw this under any v2.6.20 variant or on any prior 
kernel. The crash site corresponds to a module-refcount dec:

(gdb) list *0x00000000c013c1f4
0xc013c1f4 is in module_put (kernel/module.c:801).
796
797     void module_put(struct module *module)
798     {
799             if (module) {
800                     unsigned int cpu = get_cpu();
801                     local_dec(&module->ref[cpu].count);
802                     /* Maybe they're waiting for us to drop reference? */
803                     if (unlikely(!module_is_live(module)))
804                             wake_up_process(module->waiter);
805                     put_cpu();
(gdb)

NOTE: i'm still using a bzImage kernel, so there are no true modules in 
the kernel. (This also makes it pretty likely that this is not a build 
artifact either.)

(config and full bootlog attached.)

	Ingo

---------------------->
BUG: unable to handle kernel paging request at virtual address 6b6b6ceb
 printing eip:
c013c1f5
*pde = 0203000c
Oops: 0002 [#1]
SMP 
Modules linked in:
CPU:    0
EIP:    0060:[<c013c1f5>]    Not tainted VLI
EFLAGS: 00010256   (2.6.21-rc6 #273)
EIP is at module_put+0x19/0x2d
eax: 6b6b6ceb   ebx: f72fee2c   ecx: c03c9b36   edx: 6b6b6b6b
esi: f7428f54   edi: 6b6b6b6b   ebp: f737bf38   esp: f737bf38
ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068
Process udev (pid: 1768, ti=f737a000 task=f7488000 task.ti=f737a000)
Stack: f737bf50 c019e832 f749092c 00000010 f72feda4 f746487c f737bf78 c0167c7f 
       00000000 00000000 f72f6ba4 c2928d48 f72feda4 f746487c f7be81d4 00000000 
       f737bf80 c0167d3b f737bf98 c01658b2 ...
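
A note on reading the register dump above, offered as an interpretation rather than something stated in the report: with slab debugging enabled, freed objects are filled with the poison byte 0x6b (the kernel's POISON_FREE value), so edx/edi = 6b6b6b6b looks like a pointer loaded from already-freed memory, and the faulting address 6b6b6ceb is that value plus 0x180. The helper names in the sketch below are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define POISON_FREE_BYTE 0x6bu  /* byte written into freed slab objects */

/* 32-bit word with every byte set to the free-poison byte (0x6b6b6b6b). */
static uint32_t poison_word32(void)
{
	return 0x01010101u * POISON_FREE_BYTE;
}

/* True if the word is exactly the repeated free-poison pattern. */
static bool is_free_poison32(uint32_t word)
{
	return word == poison_word32();
}

/* Offset of a faulting address from a fully poisoned base pointer. */
static uint32_t offset_from_poison(uint32_t fault_addr)
{
	return fault_addr - poison_word32();
}
```

Applied to the dump, is_free_poison32(0x6b6b6b6b) holds for edx and edi, and offset_from_poison(0x6b6b6ceb) yields 0x180, which would be consistent with module_put() dereferencing a field at that offset inside a poisoned struct module pointer.
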
From: Mattia Dongili
Date: Thursday, April 12, 2007 - 8:14 am

On Thu, Apr 05, 2007 at 07:50:11PM -0700, Linus Torvalds wrote:

This one breaks resume for me (from STR) on a vaio SZ. Reverting this
commit allows resuming again but leaves me with some periodic and unpleasant:

[  155.232000] BUG: soft lockup detected on CPU#1!
[  155.232000]  [<c0104cf2>] show_trace_log_lvl+0x1a/0x2f
[  155.232000]  [<c0105344>] show_trace+0x12/0x14
[  155.232000]  [<c01053c8>] dump_stack+0x16/0x18
[  155.232000]  [<c0147240>] softlockup_tick+0xa7/0xb6
[  155.232000]  [<c01284d3>] run_local_timers+0x12/0x14
[  155.232000]  [<c012887a>] update_process_times+0x3e/0x63
[  155.232000]  [<c0137656>] tick_sched_timer+0x50/0x95
[  155.232000]  [<c01340e0>] hrtimer_interrupt+0x10b/0x18b
[  155.232000]  [<c01137b7>] smp_apic_timer_interrupt+0x6c/0x7e
[  155.232000]  [<c0104840>] apic_timer_interrupt+0x28/0x30
[  155.232000]  [<c0102318>] cpu_idle+0x1b/0xc7
[  155.232000]  [<c011297a>] start_secondary+0x32b/0x333
[  155.232000]  [<00000000>] run_init_process+0x3fefed10/0x19
[  155.232000]  =======================

FWIW: I hit the same BUG() in -rc5.
full boot+suspend+resume log: http://oioio.altervista.org/linux/kern-2.6.21-rc6.log
.config: http://oioio.altervista.org/linux/config-2.6.21-rc6-1

I'm available to test more patches or to provide other info.
-- 
-

From: Mattia Dongili
Date: Thursday, April 12, 2007 - 10:02 am

A bit more info (probably useless but...):
- I noticed the resume problem in -rc6-mm1 but reverting the same patch
  there doesn't make the laptop resume again
- last known successfully resuming kernel: 2.6.21-rc5-mm3 (and without
  hitting the BUG() above after resume)

-- 
-

From: Maxim Levitsky
Date: Thursday, April 12, 2007 - 11:26 am

Strange, strange...


First of all, try to boot with clocksource=acpi_pm
(I want to test whether HPET working as a clocksource is a problem)

Then try to boot with hpet=disable or unset CONFIG_HPET_TIMER
(This will disable HPET both as clocksource and clockevent)

Please also send the contents of /proc/timer_list
(I want to know whether the APIC timer is enabled there or not)


Best regards,
	Maxim Levitsky

-

From: Mattia Dongili
Date: Friday, April 13, 2007 - 1:52 am

Yes... strange. I can't reproduce the resume breakage anymore, with or
without your patch. I still have the soft lockup anyway after resuming.
I'll still keep trying, for now just disregard my previous mail.

-- 
-

From: Tobias Diedrich
Date: Friday, April 13, 2007 - 2:29 pm

For me, suspend to disk works only once (has been the case for all
.21-rcs IIRC, but I didn't get around to reporting it so far).
There are some threads about an issue like this, which is supposed
to be fixed by disabling CONFIG_PCI_MSI, but on my system the
problem persists nonetheless.

On the second suspend attempt, the last message I see is
"Suspending console(s)"

If I find the time, I'll try to bisect it this weekend.

.config:
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.21-rc6
# Fri Apr 13 23:08:52 2007
#
CONFIG_X86_64=y
CONFIG_64BIT=y
CONFIG_X86=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_ZONE_DMA32=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_CMPXCHG=y
CONFIG_EARLY_PRINTK=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_DMI=y
CONFIG_AUDIT_ARCH=y
CONFIG_GENERIC_BUG=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
# CONFIG_IPC_NS is not set
CONFIG_SYSVIPC_SYSCTL=y
# CONFIG_POSIX_MQUEUE is not set
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
# CONFIG_TASKSTATS is not set
# CONFIG_UTS_NS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_SYSFS_DEPRECATED is not set
# CONFIG_RELAY is not set
# CONFIG_BLK_DEV_INITRD is not set
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL ...
From: Adrian Bunk
Date: Friday, April 13, 2007 - 4:50 pm

Does CONFIG_HPET_TIMER=n make any difference?
Does the latest -git work?
 
cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

-

From: Tobias Diedrich
Date: Friday, April 13, 2007 - 11:50 pm

Coming up next :)

-- 
Tobias						PGP: http://9ac7e0bc.uguu.de
This mail is made from 100% recycled bits.
-

From: Tobias Diedrich
Date: Saturday, April 14, 2007 - 1:16 am

Still no luck with
Linux melchior 2.6.21-rc6-gd791d413-dirty #4 PREEMPT Sat Apr 14 09:34:21 CEST 2007 x86_64 GNU/Linux

Hmm, I just noticed that CONFIG_HPET_TIMER was forced back on after
make oldconfig...  Is that expected on amd64?

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.21-rc6
# Sat Apr 14 09:33:36 2007
#
CONFIG_X86_64=y
CONFIG_64BIT=y
CONFIG_X86=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_ZONE_DMA32=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_CMPXCHG=y
CONFIG_EARLY_PRINTK=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_DMI=y
CONFIG_AUDIT_ARCH=y
CONFIG_GENERIC_BUG=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
# CONFIG_IPC_NS is not set
CONFIG_SYSVIPC_SYSCTL=y
# CONFIG_POSIX_MQUEUE is not set
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
# CONFIG_TASKSTATS is not set
# CONFIG_UTS_NS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_SYSFS_DEPRECATED is not set
# CONFIG_RELAY is not set
# CONFIG_BLK_DEV_INITRD is not set
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
# CONFIG_KALLSYMS_EXTRA_PASS is not ...
From: Rafael J. Wysocki
Date: Saturday, April 14, 2007 - 2:05 am

Can you boot with init=/bin/bash and see if the problem is present in this
configuration?

Rafael
-

From: Tobias Diedrich
Date: Saturday, April 14, 2007 - 3:32 am

Doesn't help.
Maybe interesting:
In the init=/bin/bash run, the first suspend try was without swap
and thus bailed out. After swapon, the second try already hung,
despite not having 'really' suspended at all on the first try.
I tried it once more, with swap on the first try and got the same
'second try doesn't work' result.

git-bisect so far:
git-bisect start
# good: [fa285a3d7924a0e3782926e51f16865c5129a2f7] Linux 2.6.20
git-bisect good fa285a3d7924a0e3782926e51f16865c5129a2f7
# bad: [2eb1ae149a28c1b8ade687c5fbab3c37da4c0fba] Linux 2.6.21-rc1
git-bisect bad 2eb1ae149a28c1b8ade687c5fbab3c37da4c0fba
# bad: [574009c1a895aeeb85eaab29c235d75852b09eb8] Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
git-bisect bad 574009c1a895aeeb85eaab29c235d75852b09eb8
# good: [43187902cbfafe73ede0144166b741fb0f7d04e1] Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6
git-bisect good 43187902cbfafe73ede0144166b741fb0f7d04e1
# good: [beda9f3a13bbb22cde92a45f230a02ef2afef6a9] kbuild: more Makefile cleanups
git-bisect good beda9f3a13bbb22cde92a45f230a02ef2afef6a9
# bad: [7edc136ab688f751037a86e8a051151d7962d33f] Char: isicom, support higher rates
git-bisect bad 7edc136ab688f751037a86e8a051151d7962d33f
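
For readers unfamiliar with the procedure recorded in the log above: bisection repeatedly tests a commit halfway between the last known-good and the first known-bad one, halving the range each time. The following is only a minimal sketch of that search, assuming a linear history indexed by position. The names `bisect_first_bad` and `bad_from_threshold` are hypothetical, and the callback stands in for "build, boot, suspend twice". Real git-bisect works on a commit graph, so this is an illustration rather than a description of git's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical test callback: returns nonzero if the commit at `index`
 * shows the regression (here: the second suspend hangs). */
typedef int (*is_bad_fn)(size_t index, void *ctx);

/*
 * Find the first bad commit in a linear history.
 * Assumed preconditions: commit `good` is known good, commit `bad` is
 * known bad, good < bad, and every commit after the first bad one is
 * also bad.
 */
size_t bisect_first_bad(size_t good, size_t bad, is_bad_fn is_bad, void *ctx)
{
	assert(good < bad);
	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		if (is_bad(mid, ctx))
			bad = mid;	/* regression already present: look earlier */
		else
			good = mid;	/* still fine: look later */
	}
	return bad;
}

/* Illustrative callback: ctx points at the index of the first bad commit. */
int bad_from_threshold(size_t index, void *ctx)
{
	return index >= *(const size_t *)ctx;
}
```

Each iteration corresponds to one "git-bisect good" or "git-bisect bad" line in the log, so a range of N commits needs roughly log2(N) test boots.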

-- 
Tobias						PGP: http://9ac7e0bc.uguu.de
This email is made from 100% recycled bits.
-

From: Adrian Bunk
Date: Saturday, April 14, 2007 - 5:26 am

Yes it is (on i386 you can disable it).

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

-

From: Tobias Diedrich
Date: Saturday, April 14, 2007 - 5:09 am

bisect results:

git-bisect start
# good: [fa285a3d7924a0e3782926e51f16865c5129a2f7] Linux 2.6.20
git-bisect good fa285a3d7924a0e3782926e51f16865c5129a2f7
# bad: [2eb1ae149a28c1b8ade687c5fbab3c37da4c0fba] Linux 2.6.21-rc1
git-bisect bad 2eb1ae149a28c1b8ade687c5fbab3c37da4c0fba
# bad: [574009c1a895aeeb85eaab29c235d75852b09eb8] Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
git-bisect bad 574009c1a895aeeb85eaab29c235d75852b09eb8
# good: [43187902cbfafe73ede0144166b741fb0f7d04e1] Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6
git-bisect good 43187902cbfafe73ede0144166b741fb0f7d04e1
# good: [beda9f3a13bbb22cde92a45f230a02ef2afef6a9] kbuild: more Makefile cleanups
git-bisect good beda9f3a13bbb22cde92a45f230a02ef2afef6a9
# bad: [7edc136ab688f751037a86e8a051151d7962d33f] Char: isicom, support higher rates
git-bisect bad 7edc136ab688f751037a86e8a051151d7962d33f
# good: [6267276f3fdda9ad0d5ca451bdcbdf42b802d64b] optional ZONE_DMA: deal with cases of ZONE_DMA meaning the first zone
git-bisect good 6267276f3fdda9ad0d5ca451bdcbdf42b802d64b
# bad: [b4ac91a0eac36f347a509afda07e4305e931de61] uml: chan_user.h formatting fixes
git-bisect bad b4ac91a0eac36f347a509afda07e4305e931de61
# bad: [bf0059b23fd2f0b304f647d87fad0aa626ecf0c0] M68KNOMMU: user ARRAY_SIZE macro when appropriate
git-bisect bad bf0059b23fd2f0b304f647d87fad0aa626ecf0c0
# good: [c1725f2af89f1eda3cb9007290971b55084569a4] ARM26: Use ARRAY_SIZE macro when appropriate
git-bisect good c1725f2af89f1eda3cb9007290971b55084569a4
# bad: [9b87ed790714bd3a8d492feb24f6c48f8bb59c3a] m32r: fix do_page_fault and update_mmu_cache
git-bisect bad 9b87ed790714bd3a8d492feb24f6c48f8bb59c3a
# bad: [d12c610e08022a1b84d6bd4412c189214d32e713] swsusp-change-code-ordering-in-userc-sanity
git-bisect bad d12c610e08022a1b84d6bd4412c189214d32e713
# bad: [ed746e3b18f4df18afa3763155972c5835f284c5] swsusp: Change code ordering in disk.c
git-bisect bad ed746e3b18f4df18afa3763155972c5835f284c5
# good: ...
From: Tobias Diedrich
Date: Saturday, April 14, 2007 - 5:24 am

Doesn't apply cleanly against -rc6, but fixes the problem when
reverted from -rc1.

Index: linux-2.6.21-rc1/kernel/power/disk.c
===================================================================
--- linux-2.6.21-rc1.orig/kernel/power/disk.c	2007-04-14 14:16:59.000000000 +0200
+++ linux-2.6.21-rc1/kernel/power/disk.c	2007-04-14 14:17:03.000000000 +0200
@@ -87,24 +87,52 @@
 	}
 }
 
-static void unprepare_processes(void)
-{
-	thaw_processes();
-	pm_restore_console();
-}
-
 static int prepare_processes(void)
 {
 	int error = 0;
 
 	pm_prepare_console();
+
+	error = disable_nonboot_cpus();
+	if (error)
+		goto enable_cpus;
+
 	if (freeze_processes()) {
 		error = -EBUSY;
-		unprepare_processes();
+		goto thaw;
 	}
+
+	if (pm_disk_mode == PM_DISK_TESTPROC) {
+		printk("swsusp debug: Waiting for 5 seconds.\n");
+		mdelay(5000);
+		goto thaw;
+	}
+
+	error = platform_prepare();
+	if (error)
+		goto thaw;
+
+	/* Free memory before shutting down devices. */
+	if (!(error = swsusp_shrink_memory()))
+		return 0;
+
+	platform_finish();
+ thaw:
+	thaw_processes();
+ enable_cpus:
+	enable_nonboot_cpus();
+	pm_restore_console();
 	return error;
 }
 
+static void unprepare_processes(void)
+{
+	platform_finish();
+	thaw_processes();
+	enable_nonboot_cpus();
+	pm_restore_console();
+}
+
 /**
  *	pm_suspend_disk - The granpappy of hibernation power management.
  *
@@ -122,45 +150,29 @@
 	if (error)
 		return error;
 
-	if (pm_disk_mode == PM_DISK_TESTPROC) {
-		printk("swsusp debug: Waiting for 5 seconds.\n");
-		mdelay(5000);
-		goto Thaw;
-	}
-	/* Free memory before shutting down devices. */
-	error = swsusp_shrink_memory();
-	if (error)
-		goto Thaw;
-
-	error = platform_prepare();
-	if (error)
-		goto Thaw;
+	if (pm_disk_mode == PM_DISK_TESTPROC)
+		return 0;
 
 	suspend_console();
 	error = device_suspend(PMSG_FREEZE);
 	if (error) {
-		printk(KERN_ERR "PM: Some devices failed to suspend\n");
-		goto ...
From: Tobias Diedrich
Date: Saturday, April 14, 2007 - 5:31 am

Now, this was already reported in
http://lkml.org/lkml/2007/3/16/126
and I even flagged that message in my local folder, but apparently forgot
to follow up on it... *sigh*

-- 
Tobias						PGP: http://9ac7e0bc.uguu.de
This email is made from 100% recycled bits.
-

From: Adrian Bunk
Date: Saturday, April 14, 2007 - 6:00 am

Unless I misunderstood something, all of the problems Maxim described in 
this email are fixed for him in -rc6.

But it's quite possible that you are running into a different issue.

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

-

From: Rafael J. Wysocki
Date: Saturday, April 14, 2007 - 11:28 am

Yes, it's likely.

Tobias, I'm unable to reproduce the problem with your .config, but my hardware
is certainly different.  Which suspend mode do you use?  If that's "platform",
can you try to use "shutdown" or "reboot" and see if that helps?

Rafael


-- 
If you don't have the time to read,
you don't have the time or the tools to write.
		- Stephen King
-

From: Tobias Diedrich
Date: Saturday, April 14, 2007 - 12:56 pm

Sure.
shutdown/reboot works fine, only platform is broken.

-- 
Tobias						PGP: http://9ac7e0bc.uguu.de
-

From: Rafael J. Wysocki
Date: Saturday, April 14, 2007 - 1:23 pm

Thanks.

Now, I suspect the problem is somehow related to the hardware, so it would help
a lot if we could identify the piece of hardware (or driver) involved.

AFAICT, your system is a non-SMP one, so we can rule out
disable/enable_nonboot_cpus().  To confirm that the problem is related to
platform_finish(), can you please apply the appended debug patch and
see if the suspend in the 'platform' mode works with it?

Also, would it be feasible for you to use 'shutdown' as a workaround in case
the source of the problem is difficult to find and/or fix?

Rafael

---
 kernel/power/disk.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Index: linux-2.6.21-rc6/kernel/power/disk.c
===================================================================
--- linux-2.6.21-rc6.orig/kernel/power/disk.c
+++ linux-2.6.21-rc6/kernel/power/disk.c
@@ -170,8 +170,8 @@ int pm_suspend_disk(void)
 
 	if (in_suspend) {
 		enable_nonboot_cpus();
-		platform_finish();
 		device_resume();
+		platform_finish();
 		resume_console();
 		pr_debug("PM: writing image.\n");
 		error = swsusp_write();
@@ -189,8 +189,8 @@ int pm_suspend_disk(void)
  Enable_cpus:
 	enable_nonboot_cpus();
  Resume_devices:
-	platform_finish();
 	device_resume();
+	platform_finish();
 	resume_console();
  Thaw:
 	unprepare_processes();


-

From: Adrian Bunk
Date: Saturday, April 14, 2007 - 1:25 pm

One person reporting a regression against a -rc kernel can mean
hundreds or thousands of people who will run into the same issue after 

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

-

From: Rafael J. Wysocki
Date: Saturday, April 14, 2007 - 1:38 pm

Well, in this particular case it is not very likely to happen.  I have three
x86_64 machines here with totally different chipsets/devices on which I'm
not seeing anything like that, and I believe we'd have had more reports by now
if that were a common issue.

That said, I'm not going to ignore it.  I'll do my best to debug and fix it, if
Tobias helps me. :-)

Greetings,
Rafael
-

From: Tobias Diedrich
Date: Saturday, April 14, 2007 - 2:35 pm

Yes, it's an Asus M2N-SLI-Deluxe mainboard with an Athlon64 3200+.


-- 
Tobias						PGP: http://9ac7e0bc.uguu.de
This email is made from 100% recycled bits.
-

From: Rafael J. Wysocki
Date: Saturday, April 14, 2007 - 2:58 pm

Well, I thought it would, but it also would break some other people's systems.
That's the _real_ problem.  Let's see if we can learn more.

Can you please revert it for now, apply the appended one and try to
suspend/resume twice in the 'platform' mode (it may or may not work)?

Rafael

---
 kernel/power/disk.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Index: linux-2.6.21-rc6/kernel/power/disk.c
===================================================================
--- linux-2.6.21-rc6.orig/kernel/power/disk.c
+++ linux-2.6.21-rc6/kernel/power/disk.c
@@ -267,12 +267,15 @@ static int software_resume(void)
 	error = swsusp_read();
 	if (error) {
 		swsusp_free();
-		platform_finish();
 		goto Thaw;
 	}
 
 	pr_debug("PM: Preparing devices for restore.\n");
 
+	error = platform_prepare();
+	if (error)
+		goto Thaw;
+
 	suspend_console();
 	error = device_suspend(PMSG_PRETHAW);
 	if (error)
@@ -285,6 +288,7 @@ static int software_resume(void)
 	enable_nonboot_cpus();
  Free:
 	swsusp_free();
+	platform_finish();
 	device_resume();
 	resume_console();
  Thaw:
-

From: Tobias Diedrich
Date: Sunday, April 15, 2007 - 12:38 am

Ok. The patch doesn't apply cleanly to 2.6.21-rc6:
|patching file kernel/power/disk.c
|Hunk #1 FAILED at 267.
|Hunk #2 succeeded at 265 (offset -23 lines).
|1 out of 2 hunks FAILED -- saving rejects to file
|kernel/power/disk.c.rej

wiggle helps; it seems the first part of Hunk #1 is already applied in
2.6.21-rc6.
With CONFIG_PM_DEBUG=y and CONFIG_DISABLE_CONSOLE_SUSPEND=y I see
that the second suspend hangs at "i8042 i8042: EARLY resume".
This is kinda interesting because I'm normally using a USB keyboard
and sure enough, if I hook up a normal keyboard and disable USB
legacy support in the BIOS, then suspend to disk works multiple
times. I'd still rather like to use my USB keyboard though. ;)

-- 
Tobias						PGP: http://9ac7e0bc.uguu.de
This email is made from 100% recycled bits.
-

From: Tobias Diedrich
Date: Sunday, April 15, 2007 - 1:02 am

And I can now confirm that unpatched 2.6.21-rc6 works fine as long
as USB legacy support is disabled (however without legacy support I
can't use the USB keyboard to control grub).

-- 
Tobias						PGP: http://9ac7e0bc.uguu.de
This email is made from 100% recycled bits.
-

From: Rafael J. Wysocki
Date: Sunday, April 15, 2007 - 4:16 am

Well, I think that when you're using the USB keyboard and the USB legacy
support, the i8042 driver thinks it has a keyboard to handle and tries to
handle it during the suspend, which fails.  I don't know why it fails during
the second suspend, though.


I think using the 'shutdown' mode of suspend would be better.  There's little
point in using 'platform' on desktop systems anyway.

Frankly, I don't know what to do about it.  If we move platform_finish() after
device_resume(), some systems may be broken and I think there are more such
systems than there are systems that set USB legacy support in the BIOS and
have no PS/2 keyboards attached.  Pavel, what do you think?

Rafael
-

From: Dmitry Torokhov
Date: Sunday, April 15, 2007 - 7:19 am

This is weird, as i8042 does not use suspend_late/resume_early hooks and
so it is impossible for it to hang there. None of the input drivers use these

I would say that every box that does not use a PS/2 keyboard does this.
IOW, every box with a USB keyboard has legacy emulation turned on, so quite
a few of them...

-- 
Dmitry
-

From: Rafael J. Wysocki
Date: Sunday, April 15, 2007 - 8:52 am

Yes.

Tobias, can you please post the dmesg output from after a successful

Quite some people I know use USB keyboards with notebooks, but in these cases
the PS/2 keyboard is still attached (except for notebooks in which the built-in

I have such a machine nearby, so I'll see if I can reproduce the problem.

Greetings,
Rafael

-

From: Tobias Diedrich
Date: Sunday, April 15, 2007 - 11:50 am

Here you go:

[    0.000000] Linux version 2.6.21-rc6 (ranma@melchior) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #16 PREEMPT Sun Apr 15 09:39:32 CEST 2007
[    0.000000] Command line: root=/dev/sda5 resume=/dev/sda6 vga=6 apic=verbose ro
[    0.000000] BIOS-provided physical RAM map:
[    0.000000]  BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
[    0.000000]  BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
[    0.000000]  BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
[    0.000000]  BIOS-e820: 0000000000100000 - 000000003fee0000 (usable)
[    0.000000]  BIOS-e820: 000000003fee0000 - 000000003fee3000 (ACPI NVS)
[    0.000000]  BIOS-e820: 000000003fee3000 - 000000003fef0000 (ACPI data)
[    0.000000]  BIOS-e820: 000000003fef0000 - 000000003ff00000 (reserved)
[    0.000000]  BIOS-e820: 00000000f0000000 - 00000000f4000000 (reserved)
[    0.000000]  BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
[    0.000000] Entering add_active_range(0, 0, 159) 0 entries of 256 used
[    0.000000] Entering add_active_range(0, 256, 261856) 1 entries of 256 used
[    0.000000] end_pfn_map = 1048576
[    0.000000] DMI 2.4 present.
[    0.000000] ACPI: RSDP 000F7B80, 0024 (r2 Nvidia)
[    0.000000] ACPI: XSDT 3FEE30C0, 004C (r1 Nvidia ASUSACPI 42302E31 AWRD        0)
[    0.000000] ACPI: FACP 3FEEC540, 00F4 (r3 Nvidia ASUSACPI 42302E31 AWRD        0)
[    0.000000] ACPI: DSDT 3FEE3240, 92AD (r1 NVIDIA AWRDACPI     1000 MSFT  3000000)
[    0.000000] ACPI: FACS 3FEE0000, 0040
[    0.000000] ACPI: SSDT 3FEEC740, 00F4 (r1 PTLTD  POWERNOW        1  LTP        1)
[    0.000000] ACPI: HPET 3FEEC880, 0038 (r1 Nvidia ASUSACPI 42302E31 AWRD       98)
[    0.000000] ACPI: MCFG 3FEEC900, 003C (r1 Nvidia ASUSACPI 42302E31 AWRD        0)
[    0.000000] ACPI: APIC 3FEEC680, 007C (r1 Nvidia ASUSACPI 42302E31 AWRD        0)
[    0.000000] Entering add_active_range(0, 0, 159) 0 entries of 256 used
[    0.000000] Entering add_active_range(0, 256, 261856) 1 ...
From: Rafael J. Wysocki
Date: Sunday, April 15, 2007 - 12:37 pm

Thanks.

[--snip--]

Hmm, it looks like i8042 is the last thing on the dpm_off_irq list.  Still,
if the ACPI resume fails, the next messages may not make it to the console
(it's not very probable, though).

I've tried to reproduce your problem on another box on which I have no PS/2
keyboard (USB keyboard/mouse only) and the USB legacy support set, but I can't.
There must be something very special in your configuration.

Have you tried the patch that I posted some time ago (appended again for
convenience)?

Rafael


 drivers/input/serio/i8042.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Index: linux-2.6.21-rc6/drivers/input/serio/i8042.c
===================================================================
--- linux-2.6.21-rc6.orig/drivers/input/serio/i8042.c	2007-04-07 12:15:19.000000000 +0200
+++ linux-2.6.21-rc6/drivers/input/serio/i8042.c	2007-04-15 18:30:01.000000000 +0200
@@ -846,7 +846,8 @@ static long i8042_panic_blink(long count
 static int i8042_suspend(struct platform_device *dev, pm_message_t state)
 {
 	if (dev->dev.power.power_state.event != state.event) {
-		if (state.event == PM_EVENT_SUSPEND)
+		if (state.event == PM_EVENT_SUSPEND
+		    || state.event == PM_EVENT_PRETHAW)
 			i8042_controller_reset();
 
 		dev->dev.power.power_state = state;
-

From: David Brownell
Date: Sunday, April 15, 2007 - 8:14 am

And NVidia southbridge, so OHCI not UHCI (plus EHCI) ... one experiment
would be to disable the EHCI (high speed USB) support in BIOS, to make
for a simpler hardware configuration, and see if that makes BIOS happier.
(Or better, just take EHCI out of your Linux config.)  Likewise, taking
the 8042 drivers out of Linux.

I wouldn't be surprised if those factors didn't matter, but it'd be good

The "legacy" support in at least some cases involves BIOS having a
small USB stack -- enough to handle a keyboard or mouse in "boot mode"
(plus sometimes a USB disk or CDROM) -- and poking the i8042 chip to
act as if *IT* received the data bytes that really came over USB.

I sure don't know the ins-and-outs of such schemes (ISTR there are
others), but my guess is that either the 8042 or OHCI got confused,
at least in conjunction with the lowlevel magic ACPI was doing.


What I'm curious about is exactly why the patch matters.  What ACPI
magic is being invoked to confuse, or unconfuse, those controllers?

- Dave



-

From: Rafael J. Wysocki
Date: Sunday, April 15, 2007 - 9:37 am

Well, my theory is the following:

Without the patch, platform_finish() runs before the i8042's .resume() which is
done as though a real keyboard were present, but the ACPI magic is not done
and this confuses the heck out of the controller.  Still, it doesn't go mad at
this point just yet (it probably isn't fully functional either, although we
don't see that, because it's not really used), but next, during the subsequent
suspend, it gets poked while device_power_up() is running and goes belly up.

I think the patch helps, because it makes the ACPI magic be done while the
i8042's .resume() is being executed.

Which makes me think the following patch might help:

 drivers/input/serio/i8042.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Index: linux-2.6.21-rc6/drivers/input/serio/i8042.c
===================================================================
--- linux-2.6.21-rc6.orig/drivers/input/serio/i8042.c	2007-04-07 12:15:19.000000000 +0200
+++ linux-2.6.21-rc6/drivers/input/serio/i8042.c	2007-04-15 18:30:01.000000000 +0200
@@ -846,7 +846,8 @@ static long i8042_panic_blink(long count
 static int i8042_suspend(struct platform_device *dev, pm_message_t state)
 {
 	if (dev->dev.power.power_state.event != state.event) {
-		if (state.event == PM_EVENT_SUSPEND)
+		if (state.event == PM_EVENT_SUSPEND
+		    || state.event == PM_EVENT_PRETHAW)
 			i8042_controller_reset();
 
 		dev->dev.power.power_state = state;
-

From: David Brownell
Date: Sunday, April 15, 2007 - 10:53 am

Yeah, lack of PRETHAW support could be an issue.  As you may recall,
it was added because otherwise statically linked USB host controllers
came up under the mistaken belief that they were getting a real resume
event rather than a restart-after-power-off ... and there needed to be
a way to force a hard reset.  Seems like a similar issue here.
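
To make the PRETHAW point concrete, here is a small user-space model of the condition changed by the i8042 patch quoted earlier in the thread. It is a sketch only: `model_event`, `model_dev` and `model_i8042_suspend` are hypothetical names, the enum values are not the kernel's PM_EVENT_* values, and the reset is represented by a counter instead of a call to i8042_controller_reset().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's PM_EVENT_* codes; the numeric
 * values here are illustrative, not the kernel's. */
enum model_event { MODEL_ON, MODEL_FREEZE, MODEL_SUSPEND, MODEL_PRETHAW };

struct model_dev {
	enum model_event power_state;	/* last state the driver recorded */
	int resets;			/* times the controller was reset */
};

/*
 * Model of the i8042_suspend() logic shown in the patch.
 * With handle_prethaw == 0 it behaves like the unpatched driver
 * (reset only on SUSPEND); with handle_prethaw != 0 it also resets
 * on PRETHAW, as the patch proposes.
 */
void model_i8042_suspend(struct model_dev *dev, enum model_event state,
			 int handle_prethaw)
{
	assert(dev != NULL);
	if (dev->power_state != state) {
		if (state == MODEL_SUSPEND ||
		    (handle_prethaw && state == MODEL_PRETHAW))
			dev->resets++;	/* stands in for i8042_controller_reset() */

		dev->power_state = state;
	}
}
```

In this model an unpatched driver that receives PRETHAW only records the new state, while the patched variant also resets the controller once, which is the hard reset the paragraph above refers to.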

-

From: Tobias Diedrich
Date: Sunday, April 15, 2007 - 12:40 pm

-- 
Tobias						PGP: http://9ac7e0bc.uguu.de
This email is made from 100% recycled bits.
-

From: Rafael J. Wysocki
Date: Sunday, April 15, 2007 - 12:54 pm

Well, this means i8042 can be ruled out, so the problem probably is related
to the ACPI resume which makes it _much_ more difficult to debug.

Can you compile the ACPI drivers: processor, thermal, fan, battery, etc. as
modules, boot the kernel with init=/bin/bash and see if the problem is still
present (please keep CONFIG_SERIO_I8042 unset just in case)?

Rafael
-

From: Tobias Diedrich
Date: Wednesday, April 25, 2007 - 10:14 am

I first tried it with acpi+cpufreq completely disabled (works).
Then I tried it with acpi enabled, but everything as modules and
those not loaded (init=/bin/bash, hangs at second suspend).

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.21-rc7
# Sun Apr 22 09:26:07 2007
#
CONFIG_X86_64=y
CONFIG_64BIT=y
CONFIG_X86=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_ZONE_DMA32=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_CMPXCHG=y
CONFIG_EARLY_PRINTK=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_DMI=y
CONFIG_AUDIT_ARCH=y
CONFIG_GENERIC_BUG=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
# CONFIG_IPC_NS is not set
CONFIG_SYSVIPC_SYSCTL=y
# CONFIG_POSIX_MQUEUE is not set
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
# CONFIG_TASKSTATS is not set
# CONFIG_UTS_NS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_SYSFS_DEPRECATED is not set
# CONFIG_RELAY is not set
# CONFIG_BLK_DEV_INITRD is not set
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
CONFIG_EMBEDDED=y
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SHMEM=y
CONFIG_SLAB=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_RT_MUTEXES=y
# ...
From: Rafael J. Wysocki
Date: Wednesday, April 25, 2007 - 12:36 pm

Have you tried with ACPI and without cpufreq?

Rafael
-

From: Tobias Diedrich
Date: Wednesday, April 25, 2007 - 1:09 pm

Yes, the second one was with ACPI enabled and cpufreq disabled
(CONFIG_X86_ACPI_CPUFREQ is not set).

-- 
Tobias						PGP: http://9ac7e0bc.uguu.de
This email is made from 100% recycled bits.
-

From: Adrian Bunk
Date: Friday, April 13, 2007 - 5:36 pm

This email lists some known regressions in Linus' tree compared to 2.6.20.

If you find your name in the Cc header, you are either submitter of one
of the bugs, maintainer of an affected subsystem or driver, a patch
of yours caused a breakage or I'm considering you in any other way
possibly involved with one or more of these issues.

Due to the huge number of recipients, please trim the Cc when answering.


Subject    : ali_pata: boot from CD fails
References : http://lkml.org/lkml/2007/3/31/160
Submitter  : Stephen Clark <Stephen.Clark@seclark.us>
Status     : unknown


Subject    : kernels fail to boot with drives on ATIIXP controller
             (ACPI/IRQ related)
References : https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=229621
             http://lkml.org/lkml/2007/3/4/257
Submitter  : Michal Jaegermann <michal@ellpspace.math.ualberta.ca>
Status     : unknown


Subject    : boot failure: rtl8139: exception in interrupt routine
References : http://lkml.org/lkml/2007/3/31/160
Submitter  : Stephen Clark <Stephen.Clark@seclark.us>
Status     : unknown


Subject    : laptops with e1000: lockups
References : https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=229603
Submitter  : Dave Jones <davej@redhat.com>
Handled-By : Jesse Brandeburg <jesse.brandeburg@intel.com>
Status     : problem is being debugged


Subject    : forcedeth: interface hangs under load
References : http://lkml.org/lkml/2007/4/3/39
Submitter  : Ingo Molnar <mingo@elte.hu>
Handled-By : Ingo Molnar <mingo@elte.hu>
             Ayaz Abdulla <aabdulla@nvidia.com>
Status     : problem is being debugged


-

From: Adrian Bunk
Date: Friday, April 13, 2007 - 5:38 pm

This email lists some known regressions in Linus' tree compared to 2.6.20.

If you find your name in the Cc header, you are either submitter of one
of the bugs, maintainer of an affected subsystem or driver, a patch
of yours caused a breakage or I'm considering you in any other way
possibly involved with one or more of these issues.

Due to the huge number of recipients, please trim the Cc when answering.


Subject    : suspend to disk works only once
References : http://lkml.org/lkml/2007/4/13/240
Submitter  : Tobias Diedrich <ranma+kernel@tdiedrich.de>
Status     : unknown


Subject    : ThinkPad X60: resume no longer works  (PCI related?)
             workaround: booting with "hpet=disable"
References : http://lkml.org/lkml/2007/3/13/3
Submitter  : Dave Jones <davej@redhat.com>
             Jeremy Fitzhardinge <jeremy@goop.org>
Caused-By  : PCI merge
             commit 78149df6d565c36675463352d0bfe0000b02b7a7
Handled-By : Eric W. Biederman <ebiederm@xmission.com>
             Rafael J. Wysocki <rjw@sisk.pl>
Status     : problem is being debugged


Subject    : Suspend to RAM doesn't work anymore  (ACPI?)
References : http://lkml.org/lkml/2007/3/19/128
             http://bugzilla.kernel.org/show_bug.cgi?id=8247
Submitter  : Tobias Doerffel <tobias.doerffel@gmail.com>
Handled-By : Rafael J. Wysocki <rjw@sisk.pl>
             Len Brown <len.brown@intel.com>
Status     : problem is being debugged


Subject    : resume from RAM corrupts vesafb console
References : http://lkml.org/lkml/2007/3/26/76
Submitter  : Marcus Better <marcus@better.se>
Handled-By : Pavel Machek <pavel@ucw.cz>
Status     : problem is being debugged


Subject    : suspend to disk hangs  (CONFIG_NO_HZ)
References : http://lkml.org/lkml/2007/3/25/217
Submitter  : Jeff Chua <jeff.chua.linux@gmail.com>
Status     : unknown

-

From: Antonino A. Daplas
Date: Friday, April 13, 2007 - 6:57 pm

Hi Marcus,

A screen with blinking green blocks implies that your display is in text
mode, not in graphics mode.  I don't know what options you are using,
but have you tried using:

acpi_sleep=s3_mode

If the above does not work, also try

acpi_sleep=s3_bios,s3_mode

If it is still not working, you can add this to your suspend script:

vbetool vbemode set <VESA mode ID>

where VESA mode ID = "vga=" value - 512 (0x200)
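
As a worked example of the subtraction above, a small helper might look like the following. This is a sketch under stated assumptions: `vesa_mode_from_vga` is a hypothetical name, and the function only applies the "minus 0x200" rule, treating anything below 0x200 as not a VESA mode. It makes no attempt to validate the upper end of the range.

```c
#include <assert.h>

/*
 * Convert a kernel "vga=" mode number to a VESA mode ID using the
 * rule ID = vga - 0x200. Returns -1 for values below 0x200, which
 * this sketch treats as non-VESA settings (for example vga=6).
 */
int vesa_mode_from_vga(int vga)
{
	assert(vga >= 0);
	if (vga < 0x200)
		return -1;
	return vga - 0x200;
}
```

For instance, a system booted with vga=791 (0x317) would give 0x317 - 0x200 = 0x117, that is 279 in decimal.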

Tony

PS: If your BIOS setup has an option to re-POST the graphics card on
resume, that is a big help.

Tony


-

From: Marcus Better
Date: Sunday, April 15, 2007 - 9:26 am

Will try, but I'm using "s2ram -f -a3" which should mean precisely the above
IIUC.

Marcus
From: Antonino A. Daplas
Date: Sunday, April 15, 2007 - 4:08 pm

Just for clarification, do you suspend from VESA framebuffer console or
from VGA text console? If from the latter, that's actually worse from
the user's point of view, but I can modify vgacon so that it saves its

Okay.

Tony


-

From: Marcus Better
Date: Sunday, April 15, 2007 - 11:23 pm

From VESA console.

Marcus
From: Antonino A. Daplas
Date: Sunday, April 15, 2007 - 11:45 pm

Have you tried other combinations?

s2ram -m -p -f
s2ram -s -p -f

Tony


-

From: Marcus Better
Date: Tuesday, April 17, 2007 - 1:17 am

Yes, I tried these slightly different combinations:

s2ram -f -a3 -s: Works! The screen becomes green but is restored quickly. It
prints the following messages:
Allocated buffer at 0x11000 (base is 0x0)
ES: 0x1100 EBX: 0x0000
Save video state failed
Calling restore_state_from
Function not supported?
Restore video state failed
Switching back to vt1

s2ram -f -a3 -p: Screen goes green and then blank. Everything hangs, doesn't
react to keyboard input.

s2ram -f -a3 -m: Works!

(Tested with 2.6.21-rc7.)

Thanks,

Marcus
From: Antonino A. Daplas
Date: Tuesday, April 17, 2007 - 2:27 am

Thanks.  Should we consider this regression resolved?  There is really
nothing much vesafb can do to restore its previous state, except through
the use of userland tools.

Tony


-

From: Marcus Better
Date: Tuesday, April 17, 2007 - 4:54 am

Yes, as far as I am concerned...

Marcus
From: Pavel Machek
Date: Tuesday, April 24, 2007 - 8:33 am

Uhuh. This is the second report of this strangeness. On a ThinkPad R60, -a3
used to work, and now it needs more options. Can you locate the patch
causing this?
							Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-

From: Tobias Doerffel
Date: Saturday, April 14, 2007 - 12:24 am

Already fixed in rc5-git9, see http://bugzilla.kernel.org/show_bug.cgi?id=8247

Tobias
From: Dave Jones
Date: Saturday, April 14, 2007 - 12:40 am

On Sat, Apr 14, 2007 at 02:38:08AM +0200, Adrian Bunk wrote:

 > Subject    : ThinkPad X60: resume no longer works  (PCI related?)
 >              workaround: booting with "hpet=disable"
 > References : http://lkml.org/lkml/2007/3/13/3
 > Submitter  : Dave Jones <davej@redhat.com>
 >              Jeremy Fitzhardinge <jeremy@goop.org>
 > Caused-By  : PCI merge
 >              commit 78149df6d565c36675463352d0bfe0000b02b7a7
 > Handled-By : Eric W. Biederman <ebiederm@xmission.com>
 >              Rafael J. Wysocki <rjw@sisk.pl>
 > Status     : problem is being debugged

I'm at a loss on this one. git bisect was non-conclusive.
I even tried beating up on Eric's console-over-usb to try
and get more useful info, but I failed miserably.

	Dave

-- 
http://www.codemonkey.org.uk
-

From: Jeff Chua
Date: Sunday, April 15, 2007 - 10:15 am

Still hangs on -rc6.

Thanks,
Jeff.
-

From: Adrian Bunk
Date: Friday, April 13, 2007 - 5:38 pm

This email lists some known regressions in Linus' tree compared to 2.6.20.

If you find your name in the Cc header, you are either submitter of one
of the bugs, maintainer of an affected subsystem or driver, a patch
of yours caused a breakage or I'm considering you in any other way
possibly involved with one or more of these issues.

Due to the huge number of recipients, please trim the Cc when answering.


Subject    : snd_hda_intel doesn't work with ASUS M2V mainboard
References : http://bugzilla.kernel.org/show_bug.cgi?id=8273
Submitter  : Hans-Georg Rist <hg.rist@web.de>
Status     : unknown


Subject    : snd_intel8x0: divide error: 0000
References : http://lkml.org/lkml/2007/3/5/252
Submitter  : Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Status     : unknown


Subject    : hal daemon crashes after pulling a USB serial device
References : http://www.opensubscriber.com/message/linux-usb-devel@lists.sourceforge.net/6369800.html
Submitter  : Andi Kleen <ak@suse.de>
Handled-By : Oliver Neukum <oneukum@suse.de>
Status     : problem is being debugged


Subject    : USB: iPod doesn't work  (CONFIG_USB_SUSPEND)
References : http://lkml.org/lkml/2007/3/21/320
Submitter  : Tino Keitel <tino.keitel@gmx.de>
Caused-By  : Marcelo Tosatti <marcelo@kvack.org>
             commit 1d619f128ba911cd3e6d6ad3475f146eb92f5c27
Handled-By : Oliver Neukum <oneukum@suse.de>
Status     : problem is being debugged


Subject    : USB: Oops when changing DVB-T adapter
References : http://lkml.org/lkml/2007/3/9/212
Submitter  : CIJOML <cijoml@volny.cz>
Handled-By : Markus Rechberger <markus.rechberger@amd.com>
Patch      : http://lkml.org/lkml/2007/4/5/154
Status     : patches available

-
