2.6.25 regression/oops on boot (ACPI related?)

Previous thread: Re: SO_REUSEADDR not allowing server and client to use same port by Nebojsa Miljanovic on Thursday, February 28, 2008 - 4:44 pm. (1 message)

Next thread: [RFC] Prefixing cgroup generic control filenames with "cgroup." by Paul Menage on Thursday, February 28, 2008 - 5:14 pm. (25 messages)
To: <linux-kernel@...>
Cc: <linux-acpi@...>
Date: Thursday, February 28, 2008 - 4:08 pm

I'm getting a "general protection fault" when trying to boot 2.6.25-rc3
on my AMD64 box; 2.6.24 boots fine. The machine hangs at the end of
boot, but still responds to ctrl-alt-del and shuts down cleanly.
The GPF is as follows:

-----
general protection fault: 0000 [1] PREEMPT SMP 
CPU 1 
Modules linked in: thermal(+) processor fan
Pid: 598, comm: modprobe Not tainted 2.6.25-rc3 #1
RIP: 0010:[<ffffffff803590a8>]  [<ffffffff803590a8>] acpi_ns_map_handle_to_node+0x19/0x23
RSP: 0018:ffff81011de5fc68  EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000001001 RCX: 0000000000000000
RDX: 0000000000005067 RSI: 0000000000000001 RDI: 4d52454854584e4c
RBP: ffff81011de5fc68 R08: 0000000000000000 R09: ffff81011de5fc78
R10: ffff81011dcc0648 R11: ffffffff802d566a R12: 4d52454854584e4c
R13: ffff81011de5fcf8 R14: ffffffff80362bd3 R15: 0000000000000003
FS:  00007f04840336e0(0000) GS:ffff81011fab1bc0(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f0484032000 CR3: 000000011defc000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process modprobe (pid: 598, threadinfo ffff81011de5e000, task ffff81011de38640)
Stack:  ffff81011de5fc98 ffffffff803584ab ffff81011de5fcf8 ffff81011df1a800
 0000000000000000 ffff81011de5fcf8 ffff81011de5fcb8 ffffffff80362382
 ffff81011de52230 0000000000000001 ffff81011de5fd28 ffffffff880113a9
Call Trace:
 [<ffffffff803584ab>] acpi_get_data+0x3f/0x70
 [<ffffffff80362382>] acpi_bus_get_device+0x25/0x39
 [<ffffffff880113a9>] :thermal:acpi_thermal_cooling_device_cb+0x6b/0x166
 [<ffffffff80405ee8>] ? thermal_zone_bind_cooling_device+0x0/0x26e
 [<ffffffff880114c6>] :thermal:acpi_thermal_bind_cooling_device+0x10/0x12
 [<ffffffff80405e68>] thermal_zone_device_register+0x252/0x2d2
 [<ffffffff88011626>] :thermal:acpi_thermal_add+0x15e/0x42b
 [<ffffffff80364138>...
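
A note on the register dump above: RDI and R12 both hold
4d52454854584e4c, which looks like ASCII text rather than a kernel
pointer. The C sketch below decodes such a value in little-endian byte
order and checks whether it is a canonical x86-64 address, assuming
48-bit virtual addressing. Dereferencing a non-canonical address raises
a general protection fault rather than a page fault. The helper names
reg_to_ascii and is_canonical_48 are illustrative only and do not come
from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative helper (not kernel code): write the 8 bytes of a 64-bit
 * register value into out[] in little-endian (x86-64 memory) order,
 * replacing non-printable bytes with '.'. out must hold 9 bytes. */
static void reg_to_ascii(uint64_t value, char out[9])
{
    assert(out != 0);
    for (int i = 0; i < 8; i++) {
        unsigned char byte = (unsigned char)((value >> (8 * i)) & 0xff);
        out[i] = (byte >= 0x20 && byte < 0x7f) ? (char)byte : '.';
    }
    out[8] = '\0';
}

/* Illustrative helper (not kernel code): with 48-bit virtual addresses,
 * an address is canonical when bits 63..47 are all zero or all one. */
static int is_canonical_48(uint64_t addr)
{
    uint64_t top = addr >> 47;      /* the 17 most significant bits */
    return top == 0 || top == 0x1ffff;
}
```

For the value in the trace, reg_to_ascii yields "LNXTHERM" and
is_canonical_48 returns 0. That suggests, but does not prove, that
acpi_ns_map_handle_to_node was handed string data where it expected a
namespace handle.
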
To: Jonathan McDowell <noodles@...>
Cc: <linux-kernel@...>, <linux-acpi@...>
Date: Thursday, February 28, 2008 - 1:20 pm

Hi, Jonathan,

Please attach the acpidump output using the latest pmtools at:
http://www.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/
Please attach the result of "cat /proc/acpi/thermal_zone/*/*" as well.

thanks,
rui


--
To: Zhang, Rui <rui.zhang@...>
Cc: <linux-kernel@...>, <linux-acpi@...>
Date: Friday, February 29, 2008 - 3:54 am

I've attached the output of acpidump. The cat results in this output:

[noodles@meepok /proc/acpi/thermal_zone/THRM]$ cat *
0 - Active; 1 - Passive
<polling disabled>
state:                   ok
temperature:             40 C
Segmentation fault

It also causes a general protection fault, which I've attached as well.

This is a stock Debian kernel:

Linux meepok 2.6.24-1-amd64 #1 SMP Mon Feb 11 13:47:43 UTC 2008 x86_64 GNU/Linux

I have a patch from Ming Lin to try out but it'll have to wait until
tomorrow before I can do so.

J.

-- 
Are you out of my mind?
This .sig brought to you by the letter R and the number  6
Product of the Republic of HuggieTag
--
To: Jonathan McDowell <noodles@...>
Cc: <linux-kernel@...>, <linux-acpi@...>, <ming.m.lin@...>
Date: Thursday, February 28, 2008 - 7:38 pm

We've root caused the problem and Lin Ming's patch should work for you.
Please give it a try. :)

From: Lin Ming <ming.m.lin@intel.com>

Fix a memory overflow bug when copying
NULL internal package element object to external.

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
---
 drivers/acpi/utilities/utobject.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6/drivers/acpi/utilities/utobject.c
===================================================================
--- linux-2.6.orig/drivers/acpi/utilities/utobject.c
+++ linux-2.6/drivers/acpi/utilities/utobject.c
@@ -432,7 +432,7 @@ acpi_ut_get_simple_object_size(union acp
 	 * element -- which is legal)
 	 */
 	if (!internal_object) {
-		*obj_length = 0;
+		*obj_length = sizeof(union acpi_object);
 		return_ACPI_STATUS(AE_OK);
 	}
 

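The one-line change above can be read as fixing a mismatch between a
sizing pass and a copy pass. The C sketch below is a simplified model of
that idea, not the ACPICA code: it assumes the copy pass writes one
fixed-size external slot for every package element, including NULL
ones, while the sizing pass decides how many bytes to allocate. The
types ext_obj and int_obj, the functions, and the null_len parameter
are all hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the external object (not union acpi_object). */
struct ext_obj { int type; long value; };

/* Hypothetical internal element; a NULL pointer models an
 * uninitialized package element. */
struct int_obj { long value; };

/* Sizing pass for one element. null_len is what gets reported for a
 * NULL element: 0 models the old behaviour, sizeof(struct ext_obj)
 * models the patched behaviour. */
static size_t element_size(const struct int_obj *elem, size_t null_len)
{
    if (!elem)
        return null_len;
    return sizeof(struct ext_obj);
}

/* Sizing pass for the whole package: the allocation is based on this. */
static size_t package_size(const struct int_obj *const *elems,
                           size_t count, size_t null_len)
{
    size_t total = 0;

    assert(elems != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        total += element_size(elems[i], null_len);
    return total;
}

/* Copy pass (assumed): one external slot is written per element,
 * whether or not the internal element is NULL. */
static size_t bytes_written_by_copy(size_t count)
{
    return count * sizeof(struct ext_obj);
}
```

With a three-element package whose middle element is NULL,
package_size(elems, 3, 0) comes out one slot smaller than
bytes_written_by_copy(3), so the copy would run past the end of the
buffer. Passing sizeof(struct ext_obj) as null_len makes the two agree.
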

--
To: Zhang, Rui <rui.zhang@...>
Cc: <linux-kernel@...>, <linux-acpi@...>, <ming.m.lin@...>
Date: Saturday, March 1, 2008 - 11:27 am

I've now done so; it fixes my problem and 2.6.25-rc3 boots fine with it
applied. Thanks.

J.

-- 
Programmer, |   101 things you can't have too    |       Tel/SMS:
sysadmin &  |        much of : 51 - News.        |   +423-663-212343
BHMF.       |                                    |  Made by HuggieTag
--
To: Zhang, Rui <rui.zhang@...>
Cc: <linux-kernel@...>, <linux-acpi@...>, <ming.m.lin@...>
Date: Monday, March 10, 2008 - 10:37 am

Is there a reason this still hasn't made it into 2.6.25-rc5?

J.

-- 
Covered in paint and high as a kite.
This .sig brought to you by the letter I and the number 13
Product of the Republic of HuggieTag
--
To: Jonathan McDowell <noodles@...>
Cc: Zhang, Rui <rui.zhang@...>, <linux-kernel@...>, <linux-acpi@...>, <ming.m.lin@...>, Len Brown <lenb@...>
Date: Monday, March 10, 2008 - 11:01 am

Well, I guess Len was unaware of it (CCed now).

Thanks,
Rafael
--