x86_64 2.6.35.* kernels and Intel Xeon X5550

From: Marc Aurele La France
Date: Thursday, October 7, 2010 - 10:17 am

Greetings.

I administer a cluster composed of a mixture of various Opteron models and
Intel Xeon X5550s.  The 2.6.34.* and prior kernels run fine on all of
them.  The 2.6.35 series also runs fine on the Opterons, but not on the
Xeons.  All of these are CONFIG_GENERIC_CPU kernels.

On the Xeons, 2.6.35 hangs early on, upon the first test of trace events
(in kernel/trace/trace_events.c:event_trace_self_tests()).  With all
tracing, debugging, etc. disabled, it still hangs, but slightly later.  The
megaraid_sas module is loaded and detects the adapter, but never gets around
to registering it with the SCSI layer.

Core2-specific kernels also hang the same way, as do UP kernels.  I've 
tried backing out certain commits that seemed likely candidates, but have 
yet to stumble upon the one (or more) that is causing this.

Does anyone have any ideas?

Thanks.

Marc.

+----------------------------------+----------------------------------+
|  Marc Aurele La France           |  work:   1-780-492-9310          |
|  Academic Information and        |  fax:    1-780-492-1729          |
|    Communications Technologies   |  email:  tsi@ualberta.ca         |
|  352 General Services Building   +----------------------------------+
|  University of Alberta           |                                  |
|  Edmonton, Alberta               |    Standard disclaimers apply    |
|  T6G 2H1                         |                                  |
|  CANADA                          |                                  |
+----------------------------------+----------------------------------+
--

From: Marc Aurele La France
Date: Friday, October 15, 2010 - 7:42 pm

This is due to "CONFIG_INTEL_IDLE=y".  With "m" or "n", the hang doesn't occur.

Of the kernels I've tested, INTEL_IDLE first appears in 2.6.34-git15.  So, 
technically, this is not a regression against 2.6.34.
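
To check which of these settings a given kernel was built with, a minimal
sketch along the following lines may help.  It assumes the build
configuration is available as plain text (many distributions install it as
/boot/config-<release>, but that location is a convention, not a
guarantee); the function name kconfig_value is purely illustrative.

```python
import re


def kconfig_value(config_text, symbol):
    """Look up one symbol in the text of a kernel .config file.

    Returns the assigned value (for example 'y' or 'm'), 'n' if the
    symbol is explicitly marked "is not set", or None if the symbol
    does not appear in the text at all.
    """
    set_re = re.compile(r"^%s=(.*)$" % re.escape(symbol))
    unset_line = "# %s is not set" % symbol
    for line in config_text.splitlines():
        line = line.strip()
        match = set_re.match(line)
        if match:
            return match.group(1)
        if line == unset_line:
            return "n"
    return None
```

For example, kconfig_value(open(path).read(), "CONFIG_INTEL_IDLE") on a
config file at a hypothetical path would be expected to give "y" on the
kernels that hang here.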

This does, however, amount to a vote of non-confidence against intel_idle.c.

Marc.

--

From: Len Brown
Date: Saturday, October 16, 2010 - 12:46 am

Please file a bug report at bugzilla.kernel.org and assign it to me.

Please reproduce using an upstream 2.6.36-rc8 kernel.

Boot a CONFIG_INTEL_IDLE=n kernel and attach to the bug report the
output from:

'acpidump'
'cat /proc/cpuinfo'
'grep . /sys/devices/system/cpu/cpu*/cpuidle/*/*'
'lspci'

Then boot a CONFIG_INTEL_IDLE=y kernel and find the highest N that
boots when you boot with "intel_idle.max_cstate=N"  (0 will disable
the driver completely).  If any of them boot, then for the highest N,
attach to the bug report the complete dmesg and the output from
'grep . /sys/devices/system/cpu/cpu*/cpuidle/*/*'
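
For saving that grep output to a file on each boot, a rough Python
equivalent is sketched below.  It is only an approximation of
'grep . <files>' (one "path:line" entry per non-empty line), and the
function name dump_cpuidle and its root parameter are illustrative
assumptions, not part of any kernel tooling.

```python
import glob
import os


def dump_cpuidle(root="/sys/devices/system/cpu"):
    """Return 'path:line' strings for every non-empty line of every
    regular file matching <root>/cpu*/cpuidle/*/*, in sorted path order."""
    pattern = os.path.join(root, "cpu*", "cpuidle", "*", "*")
    lines = []
    for path in sorted(glob.glob(pattern)):
        if not os.path.isfile(path):
            continue
        try:
            with open(path, "r", errors="replace") as handle:
                for line in handle:
                    line = line.rstrip("\n")
                    if line:
                        lines.append("%s:%s" % (path, line))
        except OSError:
            # Some sysfs attributes may not be readable; skip them.
            continue
    return lines


if __name__ == "__main__":
    print("\n".join(dump_cpuidle()))
```

Redirecting the output to a file named after the max_cstate value in use
keeps the results of the individual boots apart.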

thanks,
-Len Brown, Intel Open Source Technology Center
--

From: Marc Aurele La France
Date: Monday, October 18, 2010 - 9:46 am

I've gathered all the data, but the kernel's BZ is borked at the moment, 
so this'll have to wait.

Marc.

--

From: Marc Aurele La France
Date: Monday, October 18, 2010 - 1:48 pm

Done.  However, due to finger-checks BZ won't let me correct, you'll need 
to ignore the initial descriptions of the attachments.  Look at 
"[details]" instead.

Marc.

--

From: Henrique de Moraes Holschuh
Date: Tuesday, October 19, 2010 - 5:22 pm

What's the bug number, please?

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh
--

From: Marc Aurele La France
Date: Tuesday, October 19, 2010 - 6:04 pm

20722.

Marc.

--
