2.6.23 hang, unstable clocksource?

To: <linux-kernel@...>
Date: Tuesday, October 16, 2007 - 4:31 pm

Hello,

I am running a vanilla 2.6.23 kernel and am experiencing (seemingly)
random hangs. Below is a piece of dmesg and my kernel config, along
with a few snippets from /var/log/messages showing a 5 minute hang
(and another hang, but I wasn't at the computer at that time).

This has now happened probably more than 10 times in all. The first few
times I just hard-rebooted the machine; today I decided to be patient,
and after 5 minutes it became usable again.

I've run memtest on it, so I don't think it's the memory... Any help
is greatly appreciated!

As this is my first post, I apologize in advance if I have done
something stupid..!

Thanks,

Joshua Roys

dmesg:
Linux version 2.6.23 (root@orannis) (gcc version 4.2.1 (Gentoo 4.2.1 p1.0)) #1 Fri Oct 12 16:55:58 EDT 2007
ACPI: RSDP 000FEC00, 0014 (r0 DELL )
ACPI: RSDT 000FCC05, 0040 (r1 DELL GX280 7 ASL 61)
ACPI: FACP 000FCC45, 0074 (r1 DELL GX280 7 ASL 61)
ACPI: DSDT FFFD060C, 30BF (r1 DELL dt_ex 1000 MSFT 100000D)
ACPI: FACS 1F686C00, 0040
ACPI: SSDT FFFD3808, 00BA (r1 DELL st_ex 1000 MSFT 100000D)
ACPI: APIC 000FCCB9, 0072 (r1 DELL GX280 7 ASL 61)
ACPI: BOOT 000FCD2B, 0028 (r1 DELL GX280 7 ASL 61)
ACPI: ASF! 000FCD53, 0067 (r16 DELL GX280 7 ASL 61)
ACPI: MCFG 000FCDBA, 003E (r1 DELL GX280 7 ASL 61)
ACPI: HPET 000FCDF8, 0038 (r1 DELL GX280 7 ASL 61)
ACPI: PM-Timer IO Port: 0x808
ACPI: HPET id: 0x8086a201 base: 0xfed00000
Kernel command line: root=/dev/sda2
Initializing CPU#0
Detected 2660.302 MHz processor.
hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
hpet0: 3 64-bit timers, 14318180 Hz
Calibrating delay using timer specific routine.. 5324.67 BogoMIPS (lpj=10649351)
CPU: After generic identify, caps: bfebfbff 00100000 00000000 00000000 0000451d 00000000 00000000 00000000
monitor/mwait feature present.
using mwait in idle threads.
CPU: Trace cache: 12K uops, L1 D cache: ...

To: Joshua Roys <roysjosh@...>
Cc: <linux-kernel@...>
Date: Thursday, October 25, 2007 - 7:10 pm

What was the last kernel version you were using that didn't show the issue?

Could you disable CONFIG_HANGCHECK_TIMER in your .config? Just to

Another thing to try: Boot with "clocksource=hpet" and check whether that
avoids or triggers the issue. If it triggers the issue, does it go away
with "clocksource=acpi_pm"?

thanks
-john
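
When testing the "clocksource=" boot options suggested above, it helps to confirm which clocksource the kernel actually ended up using. A minimal sketch, assuming the kernel exposes the usual sysfs clocksource files; the helper name read_clocksources is hypothetical and not part of any kernel tool:

```python
from pathlib import Path

# Usual sysfs location of the clocksource controls (assumed to be present
# on the kernel under test).
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=CLOCKSOURCE_DIR):
    """Return (current, available) clocksources read from the directory `base`.

    `current` is the name in current_clocksource; `available` is the
    whitespace-separated list in available_clocksource.
    """
    base = Path(base)
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available
```

Calling read_clocksources() with no argument reads the live system. If "clocksource=hpet" took effect, the first element of the result should be "hpet".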

To: john stultz <johnstul@...>
Cc: Joshua Roys <roysjosh@...>, <linux-kernel@...>
Date: Thursday, October 25, 2007 - 7:28 pm

We're seeing reports of this in Fedora 8, kernel 2.6.23:

https://bugzilla.redhat.com/show_bug.cgi?id=319441

To: Chuck Ebbert <cebbert@...>
Cc: john stultz <johnstul@...>, <linux-kernel@...>
Date: Thursday, October 25, 2007 - 11:01 pm

Hello,

I tried 2.6.21 and still had the same problem. I was going to go back
farther, but then I decided to try a few things with my 2.6.23 config.

$ diff config-2.6.23 newconfig-2.6.23-gentoo
140,151d139
< CONFIG_CPU_FREQ=y
< CONFIG_CPU_FREQ_TABLE=y
< CONFIG_CPU_FREQ_DEBUG=y
< CONFIG_CPU_FREQ_STAT=y
< CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
< CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
< CONFIG_CPU_FREQ_GOV_USERSPACE=y
< CONFIG_CPU_FREQ_GOV_ONDEMAND=y
< CONFIG_X86_ACPI_CPUFREQ=y
< CONFIG_X86_P4_CLOCKMOD=y
< CONFIG_X86_ACPI_CPUFREQ_PROC_INTF=y
< CONFIG_X86_SPEEDSTEP_LIB=y
488,490d477
< CONFIG_DEBUG_KERNEL=y
< CONFIG_DETECT_SOFTLOCKUP=y
< CONFIG_SCHED_DEBUG=y
492d478
< CONFIG_FORCED_INLINING=y

config-2.6.23 is the vanilla kernel's config that was experiencing the
hangs. 2.6.23-gentoo was also experiencing the hangs. Then I
disabled CPU frequency scaling (and a few other things, I guess,
looking at the diff), recompiled, and have had no hangs yet. Uptime is
3 and a half days.

If you wish I can use a vanilla kernel.

Joshua Roys
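
Because the diff above removes several unrelated options at once, it can be useful to list exactly which options were enabled in the old config but not in the new one before re-enabling them one group at a time. A minimal sketch, assuming the standard .config line format ("CONFIG_FOO=y" or "# CONFIG_FOO is not set"); the function names are hypothetical helpers, not an existing tool:

```python
def enabled_options(config_text):
    """Return the set of option names set to y or m in kernel .config text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            # Skips blank lines and "# CONFIG_FOO is not set" comments.
            continue
        name, sep, value = line.partition("=")
        if sep and name.startswith("CONFIG_") and value in ("y", "m"):
            enabled.add(name)
    return enabled


def removed_options(old_text, new_text):
    """Options enabled in the old config but not in the new one, sorted."""
    return sorted(enabled_options(old_text) - enabled_options(new_text))
```

Passing the contents of config-2.6.23 and newconfig-2.6.23-gentoo to removed_options() would give the list of candidates to bisect, for example restoring only the CONFIG_CPU_FREQ group first to see whether the hangs return.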
