When do performance testing on 37-rc1 kernel on Core2 machines, we find the volanomark loopback performance drop about 30%, that due to commit:fab476228ba37907ad7 Volanomark link: http://www.volano.com/benchmarks.html Our volanomark testing parameters as following: "-count 25000 -rooms 10 " JVM is jrockit-R27.3.1-jre1.5.0_11 java_options is "-Xmx1500m -Xms1500m -Xns750m -XXaggressive -Xlargepages -XXlazyUnlocking -Xgc:genpar -XXtlasize:min=16k,preferred=64k" and we set /proc/sys/kernel/sched_compat_yield as "1". We find if with the following patch, the regression can be recovered. Signed-off-by:Alex Shi <alex.shi@intel.com> diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c index f4f6a83..5dca678 100644 --- a/kernel/sched_fair.c +++ b/kernel/sched_fair.c @@ -1766,7 +1766,6 @@ static void pull_task(struct rq *src_rq, struct task_struct *p, check_preempt_curr(this_rq, p, 0); /* re-arm NEWIDLE balancing when moving tasks */ - src_rq->avg_idle = this_rq->avg_idle = 2*sysctl_sched_migration_cost; this_rq->idle_stamp = 0; } It seems some of load_balance() is not necessary that caused by avg_idle setting. But do not know more details of the volano running. Anyone like to give a comments for this issue? Ncrao, I have no idea of your benchmarks, but just guess removing the avg_idle setting won't bring much wakeup delay for tasks. Could you like to show some data of this? The vmstat output for .36 and .37-rc1 kernel as below: 2.6.36 procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------ r b swpd free buff cache si so bi bo in cs us sy id wa st 394 0 0 4680728 3168 118972 0 0 0 3 5314 1348711 38 62 0 0 0 396 0 0 4680500 3184 118976 0 0 0 8 5345 1303237 38 62 0 0 0 413 0 0 4680252 3200 118976 0 0 0 3 5082 1326851 38 61 0 0 0 2.6.37-rc1 procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------ r b ...
Was that test was made before and after applying above commit? Would love to know, how did you find that commit (I mean was it a git Does VolanoMark is used for scheduler benchmarking? If I'm not wrong, You are showing the output of .36 and .37-rc1. If Ncrao's commit is guilty for this performance regression, then what are the results of before and after applied Ncrao's commit. Then, what are the result after applying your patch. You are showing vmstat output of .36 and .37-rc1, which really doesn't prove the point of your patch. It needs to be more clearer. thanks, --
It's not generally considered to be a wonderful benchmark, but it is a good indicator, and worth keeping an eye on IMHO. I don't recall whether that patch works with the idle testcase without resetting the throttle, or if it's only a bit less effective. If it's only a little less effective, I'd be inclined to just whack the reset as Alex did. Whatever is done has to prevent high frequency balancing. -Mike --
From what I recall, I think removing the reset makes the original patch a little less effective. I agree that we can remove the reset if it hurts high frequency balancing. -Thanks, Nikhil --
Ok, let's do that. I added your ack, OK?
From: Alex Shi <alex.shi@intel.com>
Date: Tue, 16 Nov 2010 17:34:02 +0800
sched: volanomark regression fix
Commit fab4762 triggers excessive idle balancing, causing a ~30% loss in
volanomark throughput. Remove idle balancing throttle reset.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Nikhil Rao <ncrao@google.com>
Reported-by: Alex Shi <alex.shi@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
---
kernel/sched_fair.c | 4 ----
1 file changed, 4 deletions(-)
Index: linux-2.6/kernel/sched_fair.c
===================================================================
--- linux-2.6.orig/kernel/sched_fair.c
+++ linux-2.6/kernel/sched_fair.c
@@ -1758,10 +1758,6 @@ static void pull_task(struct rq *src_rq,
set_task_cpu(p, this_cpu);
activate_task(this_rq, p, 0);
check_preempt_curr(this_rq, p, 0);
-
- /* re-arm NEWIDLE balancing when moving tasks */
- src_rq->avg_idle = this_rq->avg_idle = 2*sysctl_sched_migration_cost;
- this_rq->idle_stamp = 0;
}
/*
--
In the original source (.36 kernel) the rq->idle_stamp is set as zero
after task was pulled to this cpu in load_balance(). Nikhil move this
setting to pull_task(), that has same effect.
I don't know what the details effect of removing idle_stamp setting
instead of recovered it on idle_balance(). :)
My machines are doing rc2 performance testing. I may try this patch
after testing finish.
The following is part of Nikhil's old patch.
===
@@ -3162,10 +3186,8 @@ static void idle_balance(int this_cpu, struct rq
*this_rq)
interval = msecs_to_jiffies(sd->balance_interval);
if (time_after(next_balance, sd->last_balance +
interval))
next_balance = sd->last_balance + interval;
- if (pulled_task) {
- this_rq->idle_stamp = 0;
+ if (pulled_task)
break;
- }
}
Regards
Alex
--
forgot to add cc-list in previous mail. --
Commit-ID: b5482cfa1c95a188b3054fa33274806add91bbe5 Gitweb: http://git.kernel.org/tip/b5482cfa1c95a188b3054fa33274806add91bbe5 Author: Alex Shi <alex.shi@intel.com> AuthorDate: Tue, 16 Nov 2010 17:34:02 +0800 Committer: Ingo Molnar <mingo@elte.hu> CommitDate: Thu, 18 Nov 2010 13:11:43 +0100 sched: Fix volanomark performance regression Commit fab4762 triggers excessive idle balancing, causing a ~30% loss in volanomark throughput. Remove idle balancing throttle reset. Originally-by: Alex Shi <alex.shi@intel.com> Signed-off-by: Mike Galbraith <efault@gmx.de> Acked-by: Nikhil Rao <ncrao@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1289928732.5169.211.camel@maggy.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu> --- kernel/sched_fair.c | 4 ---- 1 files changed, 0 insertions(+), 4 deletions(-) diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c index 52ab113..ba0556d 100644 --- a/kernel/sched_fair.c +++ b/kernel/sched_fair.c @@ -1758,10 +1758,6 @@ static void pull_task(struct rq *src_rq, struct task_struct *p, set_task_cpu(p, this_cpu); activate_task(this_rq, p, 0); check_preempt_curr(this_rq, p, 0); - - /* re-arm NEWIDLE balancing when moving tasks */ - src_rq->avg_idle = this_rq->avg_idle = 2*sysctl_sched_migration_cost; - this_rq->idle_stamp = 0; } /* --
Yes, but lots of benchmarks often find other part kernel issues. like hackbench/netperf often find VM performance issues. And an our cache The vmstat for .36 represents original kernel, .37-rc1 represent with --
I'm not very familiar with the Volanomark benchmark, but from the patch you posted it looks like resetting the idle throttle (i.e. making newidle balance more likely) hurts this benchmark. This is also evident in the sharp increase in ctx rate in 2.6.37-rc1. Resetting throttling was a heuristic added to induce more frequent idle balancing after a migration, and it isn't strictly required to make the other parts of that patch work. In your patch above, can you The benchmark was pretty simple; run nr_cpu while-1 loops, one of them niced to -15. I also experimented with some payloads with random sleep/wakeup times, but nothing with high frequency balancing. I think removing the reset should be OK for the idle test cases. -Thanks, Nikhil --
