On Wed, 17 Sep 2008 22:48:55 +0100

    Percentage
    Scheduler: waiting for cpu    208 msec    59.4 %

you're rather CPU bound, and your process was woken up but didn't run for over 200 milliseconds. That sounds like a scheduler fairness issue! --
I'm at a bit of a loss to explain this. I'm not using the group scheduler and the load average is less than 0.4. CPU usage does seem to spike a fair bit (I'm not sure why rhythmbox needs 50%+ for decoding oggs from time to time) but there always seems to be 20% CPU free... I don't know where to start debugging this one and I'm suspicious of the way the problem would happen every 30 seconds too... (this laptop is using ath5k for its wifi) --
Really hard subject. Perfect fairness requires 0 latency, which is impossible with a CPU that can only run one thing at a time. So what latency ends up being is a measure of the convergence towards fairness.

Anyway, 200ms isn't too weird depending on the circumstances. We start out with a 20ms latency for UP; we then multiply by 1+log2(nr_cpus), which on, say, a quad core machine ends up as 60ms. That ought to mean that under light load the max latency should not exceed twice that (basically a consequence of the Nyquist-Shannon sampling theorem IIRC).

Now, if you get under some load (by default: nr_running > 5) the [...] doing much (and after having looked up the original email I see it's a eeeeeeeee atom, which is dual cpu iirc, so that yields 40ms default), so 200 is definitely on the high side.

What you can do to investigate this is use the sched_wakeup tracer from ftrace; that should give a function trace of the highest wakeup latency, showing what the kernel is doing. --
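To make the arithmetic above concrete, here is a small sketch of the scaling rule as described in this message (20ms base, multiplied by 1+log2(nr_cpus), with twice that as the light-load ceiling). The function names are made up for illustration and this is not the kernel's own code.

```python
import math

BASE_LATENCY_MS = 20.0  # uniprocessor starting value quoted in the message


def scaled_latency_ms(nr_cpus, base_ms=BASE_LATENCY_MS):
    """Base latency multiplied by 1 + log2(nr_cpus)."""
    return base_ms * (1 + math.log2(nr_cpus))


def light_load_bound_ms(nr_cpus, base_ms=BASE_LATENCY_MS):
    """Twice the scaled latency: the light-load ceiling suggested above."""
    return 2 * scaled_latency_ms(nr_cpus, base_ms)


for cpus in (1, 2, 4):
    # Prints: cpu count, scaled latency, light-load ceiling (all in ms)
    print(cpus, scaled_latency_ms(cpus), light_load_bound_ms(cpus))
```

For two CPUs this gives 40ms and an 80ms ceiling, so the reported 208 msec is well above what the rule of thumb would predict under light load.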
I struggled to find documentation for ftrace because it's quite new. I have come across http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_MRG/1.0/html/Realtime_Tuning_Guide... and http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentatio... .

Thanks to those I mounted the debugfs filesystem and went to the tracing directory, but the only tracers listed in available_tracers are:

    ftrace sched_switch none

I can't see anything in the code that would disable wakeup... Any ideas on what might be wrong? I'm using a 2.6.27-rc6 kernel.

Additionally, I think I found a trigger: unplugging the power cable from the EeePC and having it run on battery seems to set off this periodic stall every 30 seconds... There's no CPU frequency scaling enabled either (Celeron Ms seemingly don't have P states, and support for cpufreq is configured out). --
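The check described above could be scripted roughly as follows. This is only a sketch: it assumes debugfs is mounted at /sys/kernel/debug (pass a different directory if yours is mounted elsewhere), and the helper names are made up for illustration.

```python
from pathlib import Path

# Assumption: debugfs mount point; adjust if it is mounted somewhere else.
TRACING_DIR = Path("/sys/kernel/debug/tracing")


def available_tracers(tracing_dir=TRACING_DIR):
    """Return the tracer names listed in <tracing_dir>/available_tracers."""
    text = (Path(tracing_dir) / "available_tracers").read_text()
    return text.split()


def has_wakeup_tracer(tracers):
    """True if the wakeup latency tracer is among the listed tracers."""
    return "wakeup" in tracers
```

With the list reported in this message (ftrace, sched_switch, none), `has_wakeup_tracer` returns False, which matches the wakeup tracer not having been built into the kernel.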
here are two quick howtos:

    http://redhat.com/~mingo/sched-devel.git/readme-tracer.txt
    http://redhat.com/~mingo/sched-devel.git/howto-trace-latencies.txt

you need to enable:

    CONFIG_SCHED_TRACER=y
    CONFIG_CONTEXT_SWITCH_TRACER=y

it's not particularly well named though. Why doesn't it say [...]

sounds like potential SMM triggered latencies.

Ingo --
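One way to confirm that both options made it into a build is to scan the kernel config text for them. The sketch below assumes you have the config available as plain text (for example a saved .config); the function name is hypothetical.

```python
# The two options named in the message above.
REQUIRED = ("CONFIG_SCHED_TRACER", "CONFIG_CONTEXT_SWITCH_TRACER")


def missing_tracer_options(config_text):
    """Return the required options that are not set to 'y' in config_text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Lines such as "# CONFIG_FOO is not set" are comments, not settings.
        if line.startswith("#") or not line.endswith("=y"):
            continue
        enabled.add(line[: -len("=y")])
    return [opt for opt in REQUIRED if opt not in enabled]
```

An empty list means both options are enabled; anything returned needs switching on before rebuilding, after which wakeup should appear in available_tracers.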
