Greetings.

I administer a cluster composed of a mixture of various Opteron models and Intel Xeon X5550s. The 2.6.34.* and prior kernels run fine on all of them. The 2.6.35 series also runs fine on the Opterons, but not on the Xeons. All of these are CONFIG_GENERIC_CPU kernels.

On the Xeons, 2.6.35 hangs early on, upon the first test of trace events (in kernel/trace/trace_events.c:event_trace_self_tests()). With all tracing, debugging, etc. disabled, it still hangs, but slightly later: the megaraid_sas module is loaded and detects the adapter, but never gets around to registering it with the SCSI layer. Core2-specific kernels also hang the same way, as do UP kernels.

I've tried backing out certain commits that seemed likely candidates, but have yet to stumble upon the one (or more) that is causing this.

Does anyone have any ideas?

Thanks.

Marc.

+----------------------------------+----------------------------------+
| Marc Aurele La France            | work: 1-780-492-9310             |
| Academic Information and         | fax: 1-780-492-1729              |
| Communications Technologies      | email: tsi@ualberta.ca           |
| 352 General Services Building    +----------------------------------+
| University of Alberta            |                                  |
| Edmonton, Alberta                | Standard disclaimers apply       |
| T6G 2H1                          |                                  |
| CANADA                           |                                  |
+----------------------------------+----------------------------------+
--
This is due to "CONFIG_INTEL_IDLE=y". With "m" or "n", the hang doesn't occur.

Of the kernels I've tested, INTEL_IDLE first appears in 2.6.34-git15. So, technically, this is not a regression against 2.6.34. This does, however, amount to a vote of no confidence in intel_idle.c.

Marc.
--
Please file a bug report at bugzilla.kernel.org and assign it to me. Please reproduce using an upstream 2.6.36-rc8 kernel.

Boot a CONFIG_INTEL_IDLE=n kernel and attach to the bug report the output from:

    acpidump
    cat /proc/cpuinfo
    grep . /sys/devices/system/cpu/cpu*/cpuidle/*/*
    lspci

Then boot a CONFIG_INTEL_IDLE=y kernel and find the highest N that boots when you boot with "intel_idle.max_cstate=N" (0 will disable the driver completely). If any of them boot, then for the highest N, attach to the bug report the complete dmesg and the output from:

    grep . /sys/devices/system/cpu/cpu*/cpuidle/*/*

thanks,
-Len Brown, Intel Open Source Technology Center
--
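For anyone who wants to capture the cpuidle state data requested above from a script rather than a shell, the following is a minimal sketch that approximates `grep . /sys/devices/system/cpu/cpu*/cpuidle/*/*`. The function name `dump_cpuidle` and its `root` parameter are made up for illustration and are not part of any kernel tool; the only thing taken from the thread is the sysfs path pattern.

```python
import glob
import os


def dump_cpuidle(root="/sys/devices/system/cpu"):
    """Return 'path:line' strings for every non-empty line of each
    cpuidle state attribute under root, similar to what 'grep .' prints
    when given several files. (Hypothetical helper, not a kernel tool.)"""
    pattern = os.path.join(root, "cpu*", "cpuidle", "*", "*")
    lines = []
    for path in sorted(glob.glob(pattern)):
        if not os.path.isfile(path):
            continue
        try:
            with open(path, errors="replace") as f:
                text = f.read()
        except OSError:
            continue  # assumption: silently skip attributes that cannot be read
        for line in text.splitlines():
            if line:  # 'grep .' only matches lines with at least one character
                lines.append(f"{path}:{line}")
    return lines
```

Calling `print("\n".join(dump_cpuidle()))` on a machine with a cpuidle driver loaded should list each state's attributes one per line, which can then be saved and attached to the report. On a kernel with no cpuidle driver active the list is simply empty.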
I've gathered all the data, but the kernel's BZ is borked at the moment, so this'll have to wait.

Marc.
--
Done. However, due to finger-checks that BZ won't let me correct, you'll need to ignore the initial descriptions of the attachments. Look at "[details]" instead.

Marc.
--
What's the bug number, please?

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh
--
20722.

Marc.
--
