The never ending BEEEEP/__smp_call_function_mask with 2.6.25-rc7

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 3:49 pm

[This time I'm going to do something new: I'll send a series of messages with
individual regression entries CCed to the people involved in handling them in
replies to this message.  Let's see how this works, fingers crossed.  Thx. R.]

This message contains a list of some regressions from 2.6.24, for which there
are no fixes in the mainline I know of.  If any of them have been fixed already,
please let me know.

If you know of any other unresolved regressions from 2.6.24, please let me know
as well and I'll add them to the list.  Also, please let me know if any of the
entries below are invalid.


Listed regressions statistics:

  Date          Total  Pending  Unresolved
  ----------------------------------------
  2008-04-04      183       32          28
  2008-03-31      177       34          31
  2008-03-27      171       38          30
  2008-03-22      159       35          31
  2008-03-17      148       38          30
  2008-03-16      146       42          35
  2008-03-14      145       45          39
  2008-03-12      143       51          41
  2008-03-11      141       58          43
  2008-03-10      138       66          47
  2008-03-03      115       65          49
  2008-02-25       90       51          39
  2008-02-17       61       45          37


Unresolved regressions
----------------------

Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10391
Subject		: 2.6.25-rc7/8: Another resume regression
Submitter	: Mark Lord <lkml@rtr.ca>
Date		: 2008-04-03 15:06 (1 days old)
References	: http://lkml.org/lkml/2008/4/3/283


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10390
Subject		: Oops while reading /proc/ioports or /proc/iomem
Submitter	: Jan Kara <jack@suse.cz>
Date		: 2008-04-03 15:25 (1 days old)
References	: http://lkml.org/lkml/2008/4/3/149


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10387
Subject		: rc6+ regression - backlight reset to 0 on boot after 7c0ea45be4f114d85ee35caeead8e1660699c46f
Submitter	: Andrey ...
From: Linus Torvalds
Date: Thursday, April 3, 2008 - 4:59 pm

This sounds very much like some module registered IO ports/memory and was 
then unloaded without unregistering them.

It's a bit hard to guess which module it is, though. The oops says "[last 
unloaded: parport]", so that's likely to be the area.

So I *suspect* this patch might be relevant. Bug apparently introduced in 
f63fd7e299ee13da071ecfce2b90b58c5e1562b1 ("parport_pc: detection for 
SuperIO IT87XX POST") by Petr Cvek.

Petr?

		Linus

---
 drivers/parport/parport_pc.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/drivers/parport/parport_pc.c b/drivers/parport/parport_pc.c
index d76d37b..a858089 100644
--- a/drivers/parport/parport_pc.c
+++ b/drivers/parport/parport_pc.c
@@ -1568,9 +1568,8 @@ static void __devinit detect_and_report_it87(void)
 		outb(r | 8, 0x2F);
 		outb(0x02, 0x2E);	/* Lock */
 		outb(0x02, 0x2F);
-
-		release_region(0x2e, 1);
 	}
+	release_region(0x2e, 1);
 }
 #endif /* CONFIG_PARPORT_PC_SUPERIO */
 
--
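
The control-flow problem the patch addresses can be illustrated outside the
kernel. The sketch below is a userspace mock, not kernel code:
mock_request_region(), mock_release_region() and the `claimed` counter are
hypothetical stand-ins for request_region()/release_region(), and
probe_buggy()/probe_fixed() only mimic the shape of the diff above, where the
release used to sit inside a conditional block.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's resource tree: counts
 * regions that are currently claimed. */
static int claimed;

static bool mock_request_region(void) { claimed++; return true; }
static void mock_release_region(void) { claimed--; }

/* Pre-patch shape: the release lives inside the conditional block,
 * so the region stays claimed whenever the branch is skipped. */
static void probe_buggy(bool chip_found)
{
	if (!mock_request_region())
		return;
	if (chip_found) {
		/* ... configure the chip ... */
		mock_release_region();
	}
}

/* Post-patch shape: the release runs on every path that follows a
 * successful request. */
static void probe_fixed(bool chip_found)
{
	if (!mock_request_region())
		return;
	if (chip_found) {
		/* ... configure the chip ... */
	}
	mock_release_region();
}

/* Number of regions still claimed (0 means nothing leaked). */
static int regions_claimed(void) { return claimed; }
```

In the mock, probe_buggy(false) leaves one region claimed while
probe_fixed(false) leaves none, which matches Linus' description of a module
being unloaded with a region still registered.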

From: Andrew Morton
Date: Thursday, April 3, 2008 - 9:39 pm

Looks very correct to me.
--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:20 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10067
Subject		: TUNER_TDA8290=y, VIDEO_DEV=n build error
Submitter	: Toralf Foerster <toralf.foerster@gmx.de>
Date		: 2008-02-22 10:36 (42 days old)
References	: http://lkml.org/lkml/2008/2/19/262


--

From: Adrian Bunk
Date: Friday, April 4, 2008 - 1:47 am

When you asked me just 2 days ago exactly the same question in the 
Bugzilla entry I immediately confirmed it's still present.

Was anything wrong with my answer (and the subsequent discussions) that 

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

--

From: Rafael J. Wysocki
Date: Friday, April 4, 2008 - 2:37 am

This text is automatically added to the messages sent in replies to the main
report.


Thanks,
Rafael
--

From: Adrian Bunk
Date: Sunday, April 6, 2008 - 2:43 pm

First of all you try to get the tracking of all bugs into Bugzilla, and 
now you've sent a batch of emails where the answers will not get into 
Bugzilla automatically.

As an example, for the bug we are talking about the main value of you 
asking me in Bugzilla was not that I confirmed it's still present, the 
main value was that this caused Mauro to make a fix. What would have 
happened if Toralf had answered in an email that the bug is still 
present?

And especially for trickier stuff like suspend/resume problems it might 
also make sense to send the email to all people who might possibly be 
involved with the bug - even if this means putting all email addresses 
from 5 MAINTAINERS entries into one email. It takes a bit more time when 

cu
Adrian


--

From: Rafael J. Wysocki
Date: Sunday, April 6, 2008 - 2:59 pm

Yes, I'm going to do that in the future.

Thanks,
Rafael
--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10082
Subject		: 2.6.25-rc2-git4 - Kernel oops while running kernbench and tbench on powerpc
Submitter	: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Date		: 2008-02-20 16:01 (44 days old)
References	: http://lkml.org/lkml/2008/2/20/218
		  http://lkml.org/lkml/2008/1/18/71


--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10117
Subject		: 2.6.25-current-git hangs on boot (pci=nommconf helps)
Submitter	: Soeren Sonnenburg <kernel@nn7.de>
Date		: 2008-02-23 18:55 (41 days old)
References	: http://lkml.org/lkml/2008/2/23/263


--

From: Soeren Sonnenburg
Date: Friday, April 4, 2008 - 12:24 am

I rebooted >10 times and couldn't trigger this hang with current
mainstream anymore.... so I guess it is gone.

Soeren
--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10156
Subject		: KVM & Qemu crashed with infinite recursive kernel loop in the guest
Submitter	: Zdenek Kabelac <zdenek.kabelac@gmail.com>
Date		: 2008-02-28 11:25 (36 days old)
References	: http://lkml.org/lkml/2008/2/28/106


--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10172
Subject		: kvm: INFO: inconsistent lock state
Submitter	: Zdenek Kabelac <zdenek.kabelac@gmail.com>
Date		: 2008-03-05 03:26 (30 days old)


--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10133
Subject		: INFO: possible circular locking in the resume
Submitter	: Zdenek Kabelac <zdenek.kabelac@gmail.com>
Date		: 2008-02-27 (37 days old)
References	: http://lkml.org/lkml/2008/2/26/479
Handled-By	: Gautham R Shenoy <ego@in.ibm.com>


--

From: Gautham R Shenoy
Date: Thursday, April 3, 2008 - 10:20 pm

Yes it still is present in the mainline.

Sorry, didn't have time to fix it, since I have been busy with other stuff. 
Will have a look at it over the weekend.

-- 
Thanks and Regards
gautham
--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10093
Subject		: 2.6.25-current-git hangs on boot unless CONFIG_CPU_IDLE=n - Apple
Submitter	: Soeren Sonnenburg <kernel@nn7.de>
Date		: 2008-02-23 18:55 (41 days old)
References	: http://lkml.org/lkml/2008/2/23/263
		  http://marc.info/?l=linux-acpi&m=120387537018467&w=4
Handled-By	: Pallipadi, Venkatesh <venkatesh.pallipadi@intel.com>


--

From: Pallipadi, Venkatesh
Date: Thursday, April 3, 2008 - 5:14 pm

Last comment I saw from Soeren says it does not really hang. It waits
for 5-10 seconds sometimes before continuing. We are still trying to
narrow this down with max_cstate etc..

Thanks,
Venki
--

From: Carlos R. Mafra
Date: Thursday, April 3, 2008 - 6:10 pm

I should say that my laptop (Vaio) _hangs_ at boot (less than 10% of boots)
and this is the last message I see:

"ACPI: Processor [CPU1] (supports 8 throttling states)"

and it stays there "forever" (more than 1 or 2 minutes at least).

I am not using my laptop much these days, and I thought this
bug had been solved because it didn't hang for the last week (but
I only boot it once per day). However, it hung last night (using some
post-2.6.25-rc7 kernel).

I have pictures of the screen showing this last message at boot.
Should I post them somewhere, or is that not necessary?
--

From: Ray Lee
Date: Thursday, April 3, 2008 - 8:15 pm

If the sysrq key still works when that happens, try to do a sysrq-t,
s, b to generate a trace, sync the filesystem, and reboot. If anything
made it to the logs, please post it.
--

From: Soeren Sonnenburg
Date: Thursday, April 3, 2008 - 11:05 pm

How could I potentially do that on this #^%^! apple keyboard?
Soeren
--

From: Carlos R. Mafra
Date: Friday, April 4, 2008 - 4:47 am

Sysrq keys don't work when it hangs, I have to push the power button.

Another thing to notice is that when it happens, it usually happens
more times in a row (like 3 times). And then it can take many boots to
happen again.

I've uploaded the picture where it hangs, it is in the very beginning
of the boot process:

http://www.ift.unesp.br/users/crmafra/dsc04673.jpg
--

From: Soeren Sonnenburg
Date: Friday, April 4, 2008 - 5:10 am

Actually this is 
http://bugzilla.kernel.org/show_bug.cgi?id=10117 ... and what you
describe is exactly what I was seeing. 

However I thought that we are talking about the hang with 
ladder governor as the last message on the screen. Anyway I am not sure
if both bugs are related or not but I couldn't reproduce the throttling
states bug with current mainstream...

Soeren
--

From: Carlos R. Mafra
Date: Friday, April 4, 2008 - 5:46 am

Oops, I am sorry. But in the bug description which Rafael wrote above
there is a link to your original post

References: http://lkml.org/lkml/2008/2/23/263


I thought that the bug had vanished, but two days ago it happened
again on my Vaio. A couple of messages were different compared
to the picture in the link above, but the last message was exactly the
same.

And I see that Rafael closed the bug 
http://bugzilla.kernel.org/show_bug.cgi?id=10117

The hang two days ago was with 2.6.25-rc7-00150-g(can't remember),
do you think the bug was fixed since that kernel?

I will try to test with the latest git, but I don't have internet
access at my house anymore, so I will have to clone the git
repository into my pendrive and take it there...
--

From: Pallipadi, Venkatesh
Date: Friday, April 4, 2008 - 7:25 am

Can you try the latest git and see whether this still is a problem?

Thanks,
Venki
--

From: Carlos R. Mafra
Date: Friday, April 4, 2008 - 8:51 am

I've just updated to latest git, but it may take some time for it 
to happen again (the first 3 boots were okay). Last time it happened
with 2.6.25-rc7-00149-gaf8be4e (I checked now). 

Do you have reasons to believe this issue was fixed since then?
--

From: Pallipadi, Venkatesh
Date: Friday, April 4, 2008 - 10:37 am

Yes. I think the patch here should have fixed the problem....
Patch : http://marc.info/?l=linux-kernel&m=120674502201007&w=4

Thanks,
--

From: Soeren Sonnenburg
Date: Friday, April 4, 2008 - 10:42 am

On Fri, 2008-04-04 at 10:37 -0700, Pallipadi, Venkatesh wrote:

But this patch is not in the current mainstream... (yet?)
Soeren

--

From: Soeren Sonnenburg
Date: Thursday, April 3, 2008 - 11:32 pm

Yes, still there, but it is no longer a hang-forever hang, only a hang
of (this time) 15 seconds.

Soeren
--

From: Pallipadi, Venkatesh
Date: Thursday, April 3, 2008 - 11:38 pm

Can you please try the max_cstate experiments that I mentioned in the
bugzilla.

Thanks,
Venki
--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10302
Subject		: 2.6.25-git regression with snd-hda-intel on Dell XPS M1330, no analog sound
Submitter	: Andre Tomt <andre@tomt.net>
Date		: 2008-03-21 20:03 (14 days old)
References	: http://lkml.org/lkml/2008/3/21/295
Handled-By	: Matthew Ranostay <mranostay@embeddedalley.com>


--

From: Bill Davidsen
Date: Saturday, April 5, 2008 - 12:49 pm

Is this by any chance related to the RTC old/new vs. sound problem noted 
in another thread?

-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot
--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10290
Subject		: [BUG] Linux 2.6.25-rc6 - kernel BUG at fs/mpage.c:476! on powerpc
Submitter	: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Date		: 2008-03-20 13:13 (15 days old)
References	: http://lkml.org/lkml/2008/3/20/39


--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10300
Subject		: volume wheel does not work in 2.6.25-rc6
Submitter	: Romano Giannetti <romano.giannetti@gmail.com>
Date		: 2008-03-21 11:42 (14 days old)


--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10319
Subject		: 2.6.25-rc6 regression - hang on resume
Submitter	: Soeren Sonnenburg <kernel@nn7.de>
Date		: 2008-03-25 04:44 (10 days old)


--

From: Soeren Sonnenburg
Date: Thursday, April 3, 2008 - 11:31 pm

Yes. The machine resumes and display stays black using s2ram -f -p
(blindly typing reboot etc on keyboard does what is expected). However
display comes back on 2.6.24.

Soeren
--

From: Romano Giannetti
Date: Monday, April 7, 2008 - 12:16 am

I can add that on my laptop (Toshiba U305, Intel 945GM chipset), s2ram
-f -p -m, which used to work OK in X and on the console with 2.6.24, has
stopped working (I tested -rc8, but I think it has been like that for a long
time). The machine resumes but the screen stays off after that, although the
machine is working (exactly as Soeren said).

On the other hand, the plain "echo mem > /sys/power/state" now works
perfectly, from X and the console. So some of the magic VBE save/restore in
s2ram messes something up for this card.

Is it a regression for my laptop (it evidently is one for Soeren)? On one
side, a "used to work" setup is broken, but it worked with userspace
hacks; now it works the plain way, and that's clearly better.

I added on Cc: Jesse, to whom I confirmed that Intel suspend/resume was
OK on Feb 21 [1], and the suspend-devel list, because now I do not know
what to do with the whitelist I sent for s2ram for this machine...

Romano 

[1] http://marc.info/?l=linux-kernel&m=120362475121590&w=2

-- 
Sorry for the disclaimer --- ¡I cannot stop it!



--
La presente comunicación tiene carácter confidencial y es para el exclusivo uso del destinatario indicado en la misma. Si Ud. no es el destinatario indicado, le informamos que cualquier forma de distribución, reproducción o uso de esta comunicación y/o de la información contenida en la misma están estrictamente prohibidos por la ley. Si Ud. ha recibido esta comunicación por error, por favor, notifíquelo inmediatamente al remitente contestando a este mensaje y proceda a continuación a destruirlo. Gracias por su colaboración.

This communication contains confidential information. It is for the exclusive use of the intended addressee. If you are not the intended addressee, please note that any form of distribution, copying or use of this communication or the information in it is strictly prohibited by law. If you have received this communication in error, please immediately notify the sender by reply e-mail and destroy this message. Thank you for your ...
From: Rafael J. Wysocki
Date: Monday, April 7, 2008 - 2:10 am

Your graphics adapter is different from Soeren's, and the fact that
"echo mem > /sys/power/state" works for you now is probably a result of the
recent changes in the i915 driver, which is now supposed to handle suspend
and resume.

That said, there had to be a change that affected both of your systems
between 2.6.24 and .25-rc8.  Moreover, I'm suspecting ACPI or something
generic in the DRM core.

As far as ACPI is concerned, one commit related to backlight has just been
reverted, so Soeren, can you please test Linus' current tree?
--

From: Tino Keitel
Date: Tuesday, April 8, 2008 - 1:58 am

I tried to use vbetool to get my text console back with 2.6.24 after a
suspend to RAM. As a result, I got a working text console, but the X
server sometimes crashed after resume, and the only way to get X
back was to reboot.

With 2.6.25-rc, the text console is fine after resume, due to the
recent changes to the Intel i915 DRM driver. Maybe those people who
used to use vbetool prior to 2.6.25 should check if this is still
needed.

Regards,
Tino
--

From: Romano Giannetti
Date: Tuesday, April 8, 2008 - 5:35 am

Yep, the problem is just that: in 2.6.24 the vbetool hacks (in my case,
using the s2ram -p -m options) were needed to resume graphics. Now they are
not needed anymore, and using them _breaks_ resume.

So, people with a working suspend-to-ram setup (even using the whitelist
in s2ram, I mean, simply calling s2ram) will discover that 2.6.25 breaks
their machine. Removing the s2ram package and suspending with
kernel-only machinery will work, but... well, it's not user-friendly.
 
Romano 

-- 
Sorry for the disclaimer --- ¡I cannot stop it!



--

From: Soeren Sonnenburg
Date: Tuesday, April 8, 2008 - 5:39 am

The above holds only if you have an intel graphics adapter. For me
(radeon) it really breaks (as in echo mem >/sys/power/state leaves me a
black screen on resume and manually typing vbetool post or vbetool
vgamode simply hangs the machine).

Soeren
--

From: Fabio Comolli
Date: Tuesday, April 8, 2008 - 5:52 am

For me (radeon X700) using binary fglrx xorg driver without the binary
kernel module (which does not compile under .25rc for me)
resume-from-ram works using s2ram.

Lightly tested with -rc8.


Fabio
--

From: Soeren Sonnenburg
Date: Tuesday, April 8, 2008 - 6:32 am

I am talking about console only. X is a completely different issue
especially with fglrx (which will very likely do the re-init).

Soeren
--

From: Matthew Garrett
Date: Tuesday, April 8, 2008 - 7:41 am

The assumption that a static table of VBE-based quirks can be used 
without paying attention to the capabilities of the video driver has 
always been broken, though I'll freely admit that it's all my fault in 
the first place...

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Jesse Barnes
Date: Tuesday, April 8, 2008 - 8:07 am

That said, running vbetool from the console after resuming into it with a new 
i915 driver shouldn't kill your machine...

Romano, you say this was working for you before with the suspend/resume 
enabled i915 driver, right?  So something else must have broken?

You can run the upstream DRM modules against 2.6.24 to insulate yourself from 
anything in 2.6.25 that might have broken things...  That would be a good 
data point.

Jesse
--

From: Stefan Seyfried
Date: Thursday, April 17, 2008 - 11:20 am

can you try if this patch to libx86 (make sure it gets installed and used...)
fixes the problem?

Index: libx86-0.99/x86-common.c
===================================================================
--- libx86-0.99.orig/x86-common.c
+++ libx86-0.99/x86-common.c
@@ -232,7 +232,7 @@ int LRMI_common_init(void)
 	}

 	m = mmap((void *)0xa0000, 0x100000 - 0xa0000,
-	 PROT_READ | PROT_WRITE,
+	 PROT_READ | PROT_WRITE | PROT_EXEC,
 	 MAP_FIXED | MAP_SHARED, fd_mem, 0xa0000);
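
To make the one-line change concrete: the patch asks for the legacy
0xa0000-0xfffff window to be mapped executable as well as readable and
writable, presumably so that BIOS code located there can still be run when the
kernel enforces NX. The sketch below only illustrates the protection flags
under that assumption. lrmi_prot() and map_scratch() are hypothetical helpers
that do not exist in libx86, and the example maps anonymous memory rather than
/dev/mem so that it can run unprivileged.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: protection bits for the low-memory window.
 * With need_exec set, the result matches the patched mmap() call. */
static int lrmi_prot(int need_exec)
{
	int prot = PROT_READ | PROT_WRITE;

	if (need_exec)
		prot |= PROT_EXEC;
	return prot;
}

/* Map an anonymous scratch area with the given protection.
 * Returns NULL on failure. The real code maps /dev/mem at a
 * fixed address instead. */
static void *map_scratch(size_t len, int prot)
{
	void *m = mmap(NULL, len, prot,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return m == MAP_FAILED ? NULL : m;
}
```

Note that a writable and executable mapping may be refused on systems with
stricter memory policies, so a caller would want to check the return value.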


Especially the userspace stuff (VBE_*) should not interfere with the restoring
-- 
Stefan Seyfried
R&D Team Mobile Devices            |              "Any ideas, John?"
SUSE LINUX Products GmbH, Nürnberg | "Well, surrounding them's out."

This footer brought to you by insane German lawmakers:
SUSE Linux Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
--

From: Soeren Sonnenburg
Date: Thursday, April 17, 2008 - 12:49 pm

for me it won't (core 1 duo), not clear whether his toshiba is a core 2
duo though supporting that flag...

anyway I tried again with the patch and 2.6.25 and same thing, display
stays black on console. then i s2ram'd again and - the display came
back!!

all my further attempts never made it come back... however from inside X

Hmmhh maybe when the radeon(hd) fb driver enters the kernel it will work
for me nicely again...

Soeren
--

From: Romano Giannetti
Date: Friday, April 18, 2008 - 12:34 am

Sorry to be a bit dense, but _what_ am I supposed to patch? Is it in the
s2ram sources? Thanks!

Romano

-- 
Sorry for the disclaimer --- ¡I cannot stop it!



--

From: Soeren Sonnenburg
Date: Friday, April 18, 2008 - 1:32 am

Patch the sources of libx86... (s2ram makes use of that lib to do the
POSTing)

Soeren
--

From: Pavel Machek
Date: Friday, April 11, 2008 - 2:04 pm

Could you get us any debugging output from s2ram? Or maybe even strace
it in both the working and the broken case and compare the results? (You may
want to disable randomization so that the results are comparable.)

									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Rafael J. Wysocki
Date: Friday, April 11, 2008 - 2:08 pm

Please also test the kernel with the patch from
http://bugzilla.kernel.org/attachment.cgi?id=15736&action=view
applied.

Thanks,
Rafael
--

From: Soeren Sonnenburg
Date: Saturday, April 12, 2008 - 12:27 am

I did on 2.6.24

strace -ff s2ram >s2ram24.trace 2>&1

and .25

strace -ff s2ram >s2ram25.trace 2>&1

with the .24 bringing the display back and .25 not. Files are here

http://nn7.de/debugging/s2ram24.trace.bz2
http://nn7.de/debugging/s2ram25.trace.bz2

I have no idea how to 'disable randomization' and also not which
debugging options I should have set ... tell me what things/options you
need to be en/disabled and I redo the experiment.
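
If disabling address-space randomization is inconvenient, one alternative
(a substitute technique, not what was proposed above) is to mask out
hexadecimal values before diffing the two traces, so that lines which differ
only in an address compare equal. A minimal sketch, assuming plain strace
text output; mask_hex() is a hypothetical helper and not part of strace or
s2ram:

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy src to dst, replacing every "0x<hexdigits>" run with "0xADDR".
 * dst must hold at least dstlen bytes (dstlen > 0); the output is
 * truncated if it does not fit and is always NUL-terminated. */
static void mask_hex(const char *src, char *dst, size_t dstlen)
{
	static const char tag[] = "0xADDR";
	size_t o = 0;

	while (*src && o + 1 < dstlen) {
		if (src[0] == '0' && src[1] == 'x' &&
		    isxdigit((unsigned char)src[2])) {
			size_t n = strlen(tag);

			if (o + n >= dstlen)
				break;
			memcpy(dst + o, tag, n);
			o += n;
			/* Skip "0x" and the hex digits that follow. */
			src += 2;
			while (isxdigit((unsigned char)*src))
				src++;
		} else {
			dst[o++] = *src++;
		}
	}
	dst[o] = '\0';
}
```

This also masks non-address constants written in hex, which is harmless when
looking for the first diverging line but does lose some detail.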

Soeren
--

From: Pavel Machek
Date: Sunday, April 13, 2008 - 1:53 am

Hmm: 

/sys/bus/pci/devices/0000:00:1b.0/irq

contains 21 in one case and 22 in another... as do other
interrupts. Is that expected? Can you post /proc/interrupts for both
versions?

Hmm, a big part of the trace is:

vm86old(0xb7f76c8c)                     = -1 ENOSYS (Function not implemented)
vm86old(0xb7f76c8c)                     = -1 ENOSYS (Function not implemented)

...I wonder why we do it so many times?

And here's the difference. .25 says:

vm86old(0xb809ac8c)                     = -1 ENOSYS (Function not implemented)
vm86old(0xb809ac8c)                     = -1 ENOSYS (Function not implemented)
Error: something went wrong performing real mode call
open("/sys/class/graphics", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|0x80000) = -1 ENOENT (No such file or directory)
open("/dev/tty", O_RDWR|O_LARGEFILE)    = 6
ioctl(6, KDGKBTYPE, 0xbfae8887)         = 0

...can you perhaps add printf-s to s2ram to find out what changed?
									Pavel
--

From: Soeren Sonnenburg
Date: Sunday, April 13, 2008 - 5:05 am

It might be that configs are slightly different - if you think this

OK, I searched for "something went wrong performing real mode call" in
the s2ram source and found this function:

int do_real_post(unsigned pci_device)
{
    int error = 0;
    struct LRMI_regs r;
    memset(&r, 0, sizeof(r));

    /* Several machines seem to want the device that they're POSTing in
       here */
    r.eax = pci_device;

    /* 0xc000 is the video option ROM.  The init code for each
       option ROM is at 0x0003 - so jump to c000:0003 and start
       running */
    r.cs = 0xc000;
    r.ip = 0x0003;

    /* This is all heavily cargo culted but seems to work */
    r.edx = 0x80;
    r.ds = 0x0040;

    if (!LRMI_call(&r)) {
        fprintf(stderr,
            "Error: something went wrong performing real mode call\n");
        error = 1;
    }

    return error;
}

which is obviously called from

int do_post(void)
{
    struct pci_dev *p;
    unsigned int c;
    unsigned int pci_id;
    int error;

    pci_scan_bus(pacc);

    for (p = pacc->devices; p; p = p->next) {
        c = pci_read_word(p, PCI_CLASS_DEVICE);
        if (c == 0x300) {
            pci_id =
                (p->bus << 8) + (p->dev << 3) +
                (p->func & 0x7);
            error = do_real_post(pci_id);
            if (error != 0) {
                return error;
            }
        }
    }
    return 0;
}

so either the graphics adapter is somehow not ready yet or a wrong
address is used for posting?
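
For reference, the value that do_post() hands to do_real_post() packs the PCI
location into 16 bits: bus in bits 15-8, device in bits 7-3 and function in
bits 2-0. A small worked sketch of that arithmetic follows; pci_bdf() is a
hypothetical name for the expression in the loop above, and the sample
addresses below are made up for illustration.

```c
/* Same arithmetic as the pci_id expression in do_post():
 * bus << 8, device << 3, function in the low three bits. */
static unsigned int pci_bdf(unsigned int bus, unsigned int dev,
			    unsigned int func)
{
	return (bus << 8) + (dev << 3) + (func & 0x7);
}
```

A card at 01:00.0 encodes to 0x0100 and one at 00:02.0 to 0x0010, so printing
pci_id just before the LRMI_call() would be one way to check whether the same
device is being POSTed on both kernels.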

Do you have an idea already? Or which things should I print out?

Soeren
--

From: Pavel Machek
Date: Sunday, April 13, 2008 - 2:33 pm

Sweet, so the problem is inside X86EMU_exec(), right? ...and that's a
bytecode interpreter. You could add printfs() all over it, to see


It starts to look like git bisect is the easier way to get some
results...
									Pavel
--

From: Rafael J. Wysocki
Date: Sunday, April 13, 2008 - 6:53 am

Well, that looks suspiciously similar to
http://bugzilla.kernel.org/show_bug.cgi?id=10155 .

Thanks,
Rafael
--

From: Soeren Sonnenburg
Date: Sunday, April 13, 2008 - 9:18 am

Hmmhh the only difference is that I don't have a core 2 duo but only a
core 1 duo but hmmmhh the flag down there looks like it does nx, or
should I see something like nx enabled in dmesg?

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 14
model name	: Genuine Intel(R) CPU            1600  @ 2.16GHz
stepping	: 8
cpu MHz		: 1000.000
cache size	: 2048 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 6
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc bts
pni monitor vmx est tm2 xtpr
bogomips	: 4333.82
clflush size	: 64


Anyway I will try noexec=off ...
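
Whether the CPU advertises NX can be read off the "flags" line directly. As a
sketch, the hypothetical helper below does a whole-word search in such a line,
which avoids false hits such as "ss" inside "sse" or "tm" inside "tm2":

```c
#include <string.h>

/* Return 1 if `flag` appears as a whole whitespace-separated word in
 * `flags` (the text of a /proc/cpuinfo "flags" line), else 0. */
static int has_cpu_flag(const char *flags, const char *flag)
{
	size_t n = strlen(flag);
	const char *p = flags;

	if (n == 0)
		return 0;
	while ((p = strstr(p, flag)) != NULL) {
		int starts = (p == flags) || p[-1] == ' ' ||
			     p[-1] == '\t' || p[-1] == '\n';
		int ends = p[n] == '\0' || p[n] == ' ' ||
			   p[n] == '\t' || p[n] == '\n';

		if (starts && ends)
			return 1;
		p++;	/* keep searching after this partial match */
	}
	return 0;
}
```

Applied to the flags quoted above, the helper finds "nx", so the CPU does
appear to advertise it. On a 32-bit kernel NX additionally requires PAE
paging, so the flag alone does not show that NX is actually in use.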

Soeren
--

From: Soeren Sonnenburg
Date: Sunday, April 13, 2008 - 9:30 am

OK I tried noexec=off now... same error in s2ram ... so I guess my CPU
does not support NX ...

Soeren
--

From: Rafael J. Wysocki
Date: Sunday, April 13, 2008 - 9:38 am

s2ram segfaulted in that bug, but yours doesn't.  Still, we could have done
something else that prevents s2ram from doing its job on your system.

Thanks,
Rafael
--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10323
Subject		: panic using bridging on linus kernel 2.6.25-rc6
Submitter	: Andy Gospodarek <andy@greyhouse.net>
Date		: 2008-03-25 11:40 (10 days old)


--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10320
Subject		: rt2x00 does not associate or give scan results
Submitter	: Marcus Better <marcus@better.se>
Date		: 2008-03-25 06:04 (10 days old)


--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10324
Subject		: kernel panic ip_route_input
Submitter	: Denys Fedoryshchenko <nuclearcat@nuclearcat.com>
Date		: 2008-03-25 12:48 (10 days old)


--

From: David Miller
Date: Thursday, April 3, 2008 - 6:06 pm

From: "Rafael J. Wysocki" <rjw@sisk.pl>

This has been confirmed by the reporter to be fixed in the current
tree.
--

From: Rafael J. Wysocki
Date: Friday, April 4, 2008 - 2:41 am

I've closed the bug.

Thanks,
Rafael
--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10326
Subject		: inconsistent lock state in net_rx_action
Submitter	: Marcus Better <marcus@better.se>
Date		: 2008-03-25 13:21 (10 days old)
Patch		: http://bugzilla.kernel.org/show_bug.cgi?id=10326#c20


--

From: David Miller
Date: Thursday, April 3, 2008 - 6:06 pm

From: "Rafael J. Wysocki" <rjw@sisk.pl>

The patch in the report has been added to the tree
and it fixes this problem.
--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10346
Subject		: Linux 2.6.25-rc6: WARNING: at net/ipv4/tcp_input.c:2510
Submitter	: Georgi Chorbadzhiyski <gf@unixsol.org>
Date		: 2008-03-27 17:29 (8 days old)
References	: http://lkml.org/lkml/2008/3/27/246


--

From: Georgi Chorbadzhiyski
Date: Friday, April 4, 2008 - 8:14 am

Upgraded to 2.6.25-rc8-00139-ge315c12, the server has been running for 8 hours
already and it seems to be fixed.

Anyway, should such warnings be treated as regressions?

-- 
Georgi Chorbadzhiyski
http://georgi.unixsol.org/
--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10349
Subject		: regression: am-utils stopped working in 2.6.25-rc*
Submitter	: Meelis Roos <mroos@linux.ee>
Date		: 2008-03-28 15:20 (7 days old)
References	: http://lkml.org/lkml/2008/3/28/174


--

From: Meelis Roos
Date: Thursday, April 3, 2008 - 11:15 pm

Still present, active discussion in bugzilla. At least partly an am-utils
problem (actually there are two of them: an am-utils mount structure
version problem, and an NFS locking problem that was also uncovered by
stricter checking in 2.6.25).

-- 
Meelis Roos (mroos@linux.ee)
--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10365
Subject		: usb-storage, error reading the last 8 sectors, regression in 2.6.25-rc7
Submitter	: Sergey Dolgov <solkaa@gmail.com>
Date		: 2008-03-30 11:49 (5 days old)
References	: http://lkml.org/lkml/2008/3/30/11


--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10373
Subject		: slub compile error
Submitter	: Toralf Foerster <toralf.foerster@gmx.de>
Date		: 2008-03-31 14:46 (4 days old)
References	: http://lkml.org/lkml/2008/3/31/120
Handled-By	: Christoph Lameter <clameter@sgi.com>
Patch		: http://lkml.org/lkml/2008/3/31/261


--

From: Christoph Lameter
Date: Thursday, April 3, 2008 - 6:37 pm

The bug was not in 2.6.24. It was introduced a couple of days 
before 2.6.25-rc8 and the fix was the last commit added before rc8 was 
released.
--

From: Rafael J. Wysocki
Date: Friday, April 4, 2008 - 2:45 am

Bug closed.

Thanks,
Rafael
--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10366
Subject		: 2.6.25-rc7: warn_on_slowpath triggered
Submitter	: Bob Tracy <rct@frus.com>
Date		: 2008-03-29 17:29 (6 days old)
References	: http://lkml.org/lkml/2008/3/29/125
Handled-By	: Bjoern Steinbrink <B.Steinbrink@gmx.de>
Patch		: http://lkml.org/lkml/2008/3/30/245


--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10344
Subject		: [2.6.25-rc6] possible regression: X server dying
Submitter	: Tilman Schmidt <tilman@imap.cc>
Date		: 2008-03-24 23:38 (11 days old)
References	: http://lkml.org/lkml/2008/3/24/260
Handled-By	: Dave Airlie <airlied@gmail.com>


--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10377
Subject		: Kernel freezes during boot when AC is unplugged
Submitter	: Roman Jarosz <kedgedev@centrum.cz>
Date		: 2008-04-01 16:23 (3 days old)


--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10369
Subject		: The never ending BEEEEP/__smp_call_function_mask with 2.6.25-rc7
Submitter	: Chr <chunkeey@web.de>
Date		: 2008-03-30 21:09 (5 days old)
References	: http://lkml.org/lkml/2008/3/30/87


--

From: Chr
Date: Thursday, April 3, 2008 - 6:49 pm

Yep, it's still present... but I don't have time to debug it. :(

The proposed workaround ("noapictimer" and "hpet=force") works so far...
Maybe it's buggy/bad hardware after all and 2.6.24.4 just doesn't trigger it?! 

Regards,
	Chr.
--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10382
Subject		: 2.6.25-rc5.git4 regression PS/2 mouse not detected/working
Submitter	: Yanko Kaneti <yaneti@declera.com>
Date		: 2008-04-02 10:59 (2 days old)
References	: http://lkml.org/lkml/2008/4/2/210
Handled-By	: Dmitry Torokhov <dmitry.torokhov@gmail.com>
		  Balaji Rao <balajirrao@gmail.com>


--

From: Thomas Gleixner
Date: Friday, April 4, 2008 - 6:33 am

Yes, it is. The revert of the patch which caused that is queued for
today's push to Linus.

Thanks,

--

From: Balaji Rao
Date: Friday, April 4, 2008 - 8:32 am

Hi tglx,

I think the following commit should also be reverted. 

commit 37a47db8d7f0f38dac5acf5a13abbc8f401707fa
Author: Balaji Rao <balajirrao@gmail.com>
Date:   Wed Jan 30 13:30:03 2008 +0100

    x86: assign IRQs to HPET timers, fix

-- 
regards,
Balaji Rao
--

From: Thomas Gleixner
Date: Friday, April 4, 2008 - 11:18 am

Yup, I have both.

Anyway, thanks for the reminder 

	tglx
--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10387
Subject		: rc6+ regression - backlight reset to 0 on boot after 7c0ea45be4f114d85ee35caeead8e1660699c46f
Submitter	: Andrey Borzenkov <arvidjaar@mail.ru>
Date		: 2008-04-02 22:53 (2 days old)
References	: http://lkml.org/lkml/2008/4/2/366


--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10391
Subject		: 2.6.25-rc7/8: Another resume regression
Submitter	: Mark Lord <lkml@rtr.ca>
Date		: 2008-04-03 15:06 (1 days old)
References	: http://lkml.org/lkml/2008/4/3/283


--

From: Mark Lord
Date: Friday, April 4, 2008 - 7:11 am

..

Probably still there, but it's not easily reproducible
and doesn't happen more than once every couple of days or so.

-ml
 

--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:22 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10390
Subject		: Oops while reading /proc/ioports or /proc/iomem
Submitter	: Jan Kara <jack@suse.cz>
Date		: 2008-04-03 15:25 (1 days old)
References	: http://lkml.org/lkml/2008/4/3/149


--

From: Rafael J. Wysocki
Date: Thursday, April 3, 2008 - 4:30 pm

The following report is on the current list of known regressions
from 2.6.24.  Please verify if the issue is still present in the
mainline.


Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=10384
Subject		: 2.6.25-rc6-git2: warn_on_slowpath for tcp_simple_retransmit
Submitter	: Alessandro Suardi <alessandro.suardi@gmail.com>
Date		: 2008-04-02 00:28 (2 days old)
References	: http://lkml.org/lkml/2008/4/1/408
Handled-By	: Ilpo Jarvinen <ilpo.jarvinen@helsinki.fi>


--

From: Linus Torvalds
Date: Saturday, April 5, 2008 - 11:57 am

This seems to be a bug in am-utils:

	https://bugzilla.am-utils.org/show_bug.cgi?id=612

and while we try very hard to not break existing binaries that assume some 
old broken kernel behaviour, things like system setup code that is 
outright buggy is kind of exempt from that rule. So I think in this case 

I wouldn't call this a regression. It's a hard-to-trigger warning that is 
being debugged. 

			Linus
--

From: Rafael J. Wysocki
Date: Sunday, April 6, 2008 - 2:10 pm

Dropped from the list.

Thanks,
Rafael
--
