This commit makes the ltp cpuctl latency test #2 hang indefinitely:
commit b5d9d734a53e0204aab0089079cbde2a1285a38f
Author: Mike Galbraith <efault@gmx.de>
Date: Tue Sep 8 11:12:28 2009 +0200
sched: Ensure that a child can't gain time over it's parent after fork()
When I revert this commit the test progresses as it did in 2.6.31. I
have seen this issue on 2.6.32 and 2.6.32.19. The hang goes away in
2.6.33 starting with this commit:
commit 88ec22d3edb72b261f8628226cd543589a6d5e1b
Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
Date: Wed Dec 16 18:04:41 2009 +0100
sched: Remove the cfs_rq dependency from set_task_cpu()
Even though this appears to be resolved in 2.6.33, I am reporting it
because 2.6.32 is the "long-term stable release".
My test system is a single-socket, dual-core AMD:
model name : Dual Core AMD Opteron(tm) Processor 180
with 4GB of RAM.
Kernel config file attached.
The issue is easily reproducible for me by downloading and building ltp,
then running
testcases/kernel/controllers/cpuctl/run_cpuctl_latency_test.sh 2
Please let me know if you need any other information to help reproduce
this issue.
Thanks
Josh
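For anyone reproducing the report above, a time limit makes the hang easier to spot than waiting at a silent terminal. The following is only a minimal sketch: the function name `run_with_hang_check` is made up for illustration, and it assumes GNU coreutils `timeout` is installed. It also only helps if the stuck process still reacts to the signal that `timeout` sends.

```shell
# Hypothetical helper (not part of ltp): run a command with a time limit
# and report whether it finished or had to be stopped.
# Assumes GNU coreutils `timeout`, which exits with 124 on a timeout.
run_with_hang_check() {
    limit=$1            # time limit in seconds
    shift
    timeout "$limit" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "HUNG: no result after ${limit}s"
    else
        echo "finished with exit status $status"
    fi
    return "$status"
}
```

From the ltp source directory this could be used as `run_with_hang_check 600 testcases/kernel/controllers/cpuctl/run_cpuctl_latency_test.sh 2`, where the 600 second limit is an arbitrary choice and not a value from the report.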
Ouch. Yeah, that commit is buggy, and never got fixed up in stable. Reverting it will restore a slightly less buggy, but not very good situation. Getting the fork problems all fixed up took a while.
Excellent timing you have. I have a tree of backports, but I wasn't counting this commit as a must have, merely highly desirable. This
No, the testcase works well. Thanks.
-Mike
(sent quilt stack offline, anybody else wants it, holler, and you'll receive one [absolutely free!] 50k tarball)
Hi, Mike
On Wed, 25 Aug 2010 07:56:01 +0200
I'm interested in this problem, because I hit the same problem in RHEL6 beta2.
(It is based on 2.6.32.)
Are you writing a patch to solve this problem?
If you are, I can test it on RHEL6 beta2 (or the latest).
Appendix.
I could reproduce this problem without ltp. See below (case 1).
But if the CPUs are not completely busy, it does not occur (case 2).
[case1]
1) Run as many busy-loop processes as there are CPUs, all in the same cpu cgroup.
2) Attach a process to the cpu cgroup used in 1).
-> The attach never finishes.
Ex)
# mkdir /cgroup/cpu/test
# echo $$ > /cgroup/cpu/test/tasks
# ./loop 8 &
[1] 27202
# mpstat -P ALL 1
Linux 2.6.32-37.el6.x86_64 (StingerG.localdomain) 08/31/2010 _x86_64_ (8 CPU)
03:08:45 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
03:08:46 PM all 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:08:46 PM 0 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:08:46 PM 1 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:08:46 PM 2 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:08:46 PM 3 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:08:46 PM 4 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:08:46 PM 5 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:08:46 PM 6 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
03:08:46 PM 7 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
# echo $$ > /cgroup/cpu/tasks
# time echo $$ > /cgroup/cpu/test/tasks <- this operation never finishes
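The `./loop` program used in this example is not shown in the thread. As a rough stand-in, and purely as an assumption about what it does (start N processes that spin on the CPU), something like the following shell function could be used. The name `spawn_busy_loops` is hypothetical.

```shell
# Hypothetical stand-in for "./loop N"; the real program is not shown
# in this thread. Starts N background busy loops and prints their PIDs.
spawn_busy_loops() {
    n=$1
    pids=""
    i=0
    while [ "$i" -lt "$n" ]; do
        # Each child spins forever in user space. stdout is redirected so
        # that a caller using $(...) is not blocked by the open pipe.
        ( while :; do :; done ) >/dev/null &
        pids="$pids $!"
        i=$((i + 1))
    done
    # Unquoted on purpose: drops the leading space.
    echo $pids
}
```

Running `pids=$(spawn_busy_loops 8)` from a shell that is already in the test cgroup keeps the children in that cgroup, and `kill $pids` stops them afterwards.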
[case2]
# echo $$ > /cgroup/cpu/test/tasks
# ./loop 7 &
[1] 27259
# mpstat -P ALL 1
Linux 2.6.32-37.el6.x86_64 ...
No, the necessary patches were already written. I just needed to
I just sent a 50 patch series, ever so lovingly git am applied, git format-patch exported, then imported into evolution one darn patch at a time, to stable to either apply or bin as maintainers see fit.
To test, all you should need to do is test mainline. If you'd like a quilt tarball against 32.21 anyway, just holler.
The series has all the fork/exec/wakeup/hotplug yada yada fixes I think are manna from heaven for our long-term stable kernel. I may well get "are you outta yer ever lovin' mind?" back, but _I_ think it's needed, so...
-Mike
