Re: [performance bug] volanomark regression on 37-rc1

Previous thread: Are you travel savvy? by Travel Community Geeks on Tuesday, November 16, 2010 - 2:11 am. (1 message)

Next thread: [PATCH] perf: add no-aggregation mode to perf stat -a (v2) by Stephane Eranian on Tuesday, November 16, 2010 - 1:50 am. (1 message)
From: Alex,Shi
Date: Tuesday, November 16, 2010 - 2:34 am

When do performance testing on 37-rc1 kernel on Core2 machines, we find
the volanomark loopback performance drop about 30%, that due to
commit:fab476228ba37907ad7 

Volanomark link: http://www.volano.com/benchmarks.html 
Our volanomark testing parameters as following:
"-count 25000 -rooms 10 "
JVM is jrockit-R27.3.1-jre1.5.0_11
java_options is "-Xmx1500m -Xms1500m -Xns750m -XXaggressive -Xlargepages
-XXlazyUnlocking -Xgc:genpar -XXtlasize:min=16k,preferred=64k"
and we set /proc/sys/kernel/sched_compat_yield as "1". 

We find if with the following patch, the regression can be recovered. 

Signed-off-by:Alex Shi <alex.shi@intel.com>

diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index f4f6a83..5dca678 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -1766,7 +1766,6 @@ static void pull_task(struct rq *src_rq, struct task_struct *p,
 	check_preempt_curr(this_rq, p, 0);
 
 	/* re-arm NEWIDLE balancing when moving tasks */
-	src_rq->avg_idle = this_rq->avg_idle = 2*sysctl_sched_migration_cost;
 	this_rq->idle_stamp = 0;
 }
 

It seems some of load_balance() is not necessary that caused by avg_idle
setting. But do not know more details of the volano running. Anyone like
to give a comments for this issue? 

Ncrao, I have no idea of your benchmarks, but just guess removing the
avg_idle setting won't bring much wakeup delay for tasks. Could you like
to show some data of this?

The vmstat output for .36 and .37-rc1 kernel as below: 
2.6.36 
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
394  0      0 4680728   3168 118972    0    0     0     3 5314 1348711 38 62  0  0  0
396  0      0 4680500   3184 118976    0    0     0     8 5345 1303237 38 62  0  0  0
413  0      0 4680252   3200 118976    0    0     0     3 5082 1326851 38 61  0  0  0

2.6.37-rc1
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b  ...
From: Rakib Mullick
Date: Tuesday, November 16, 2010 - 7:38 am

Was that test was made before and after applying above commit? Would
love to know, how did you find that commit (I mean was it a git

Does VolanoMark is used for scheduler benchmarking? If I'm not wrong,

You are showing the output of .36 and .37-rc1. If Ncrao's commit is
guilty for this performance regression, then what are the results of
before and after applied Ncrao's commit. Then, what are the result
after applying your patch. You are showing vmstat output of .36 and
.37-rc1, which really doesn't prove the point of your patch. It needs
to be more clearer.


thanks,
--

From: Mike Galbraith
Date: Tuesday, November 16, 2010 - 8:26 am

It's not generally considered to be a wonderful benchmark, but it is a
good indicator, and worth keeping an eye on IMHO.

I don't recall whether that patch works with the idle testcase without
resetting the throttle, or if it's only a bit less effective.  If it's
only a little less effective, I'd be inclined to just whack the reset as
Alex did.  Whatever is done has to prevent high frequency balancing.

	-Mike

--

From: Nikhil Rao
Date: Tuesday, November 16, 2010 - 9:31 am

From what I recall, I think removing the reset makes the original
patch a little less effective. I agree that we can remove the reset if
it hurts high frequency balancing.

-Thanks,
Nikhil
--

From: Mike Galbraith
Date: Tuesday, November 16, 2010 - 10:32 am

Ok, let's do that.  I added your ack, OK?

From: Alex Shi <alex.shi@intel.com>
Date: Tue, 16 Nov 2010 17:34:02 +0800

    sched: volanomark regression fix

    Commit fab4762 triggers excessive idle balancing, causing a ~30% loss in
    volanomark throughput.  Remove idle balancing throttle reset.

    Signed-off-by: Mike Galbraith <efault@gmx.de>
    Acked-by: Nikhil Rao <ncrao@google.com>
    Reported-by: Alex Shi <alex.shi@intel.com>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    LKML-Reference: <new-submission>

---
 kernel/sched_fair.c |    4 ----
 1 file changed, 4 deletions(-)

Index: linux-2.6/kernel/sched_fair.c
===================================================================
--- linux-2.6.orig/kernel/sched_fair.c
+++ linux-2.6/kernel/sched_fair.c
@@ -1758,10 +1758,6 @@ static void pull_task(struct rq *src_rq,
 	set_task_cpu(p, this_cpu);
 	activate_task(this_rq, p, 0);
 	check_preempt_curr(this_rq, p, 0);
-
-	/* re-arm NEWIDLE balancing when moving tasks */
-	src_rq->avg_idle = this_rq->avg_idle = 2*sysctl_sched_migration_cost;
-	this_rq->idle_stamp = 0;
 }
 
 /*


--

From: Nikhil Rao
Date: Tuesday, November 16, 2010 - 12:27 pm

From: Peter Zijlstra
Date: Tuesday, November 16, 2010 - 12:37 pm

Applied, thanks!
--

From: Alex,Shi
Date: Tuesday, November 16, 2010 - 6:17 pm

In the original source (.36 kernel) the rq->idle_stamp is set as zero
after task was pulled to this cpu in load_balance(). Nikhil move this
setting to pull_task(), that has same effect.
I don't know what the details effect of removing idle_stamp setting
instead of recovered it on idle_balance(). :) 

My machines are doing rc2 performance testing. I may try this patch
after testing finish. 

The following is part of Nikhil's old patch. 
===
@@ -3162,10 +3186,8 @@ static void idle_balance(int this_cpu, struct rq
*this_rq)
                interval = msecs_to_jiffies(sd->balance_interval);
                if (time_after(next_balance, sd->last_balance +
interval))
                        next_balance = sd->last_balance + interval;
-               if (pulled_task) {
-                       this_rq->idle_stamp = 0;
+               if (pulled_task)
                        break;
-               }
        }

Regards
Alex



--

From: Nikhil Rao
Date: Wednesday, November 17, 2010 - 12:45 pm

forgot to add cc-list in previous mail.

--

From: tip-bot for Alex Shi
Date: Thursday, November 18, 2010 - 7:08 am

Commit-ID:  b5482cfa1c95a188b3054fa33274806add91bbe5
Gitweb:     http://git.kernel.org/tip/b5482cfa1c95a188b3054fa33274806add91bbe5
Author:     Alex Shi <alex.shi@intel.com>
AuthorDate: Tue, 16 Nov 2010 17:34:02 +0800
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Thu, 18 Nov 2010 13:11:43 +0100

sched: Fix volanomark performance regression

Commit fab4762 triggers excessive idle balancing, causing a ~30% loss in
volanomark throughput. Remove idle balancing throttle reset.

Originally-by: Alex Shi <alex.shi@intel.com>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Nikhil Rao <ncrao@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1289928732.5169.211.camel@maggy.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/sched_fair.c |    4 ----
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index 52ab113..ba0556d 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -1758,10 +1758,6 @@ static void pull_task(struct rq *src_rq, struct task_struct *p,
 	set_task_cpu(p, this_cpu);
 	activate_task(this_rq, p, 0);
 	check_preempt_curr(this_rq, p, 0);
-
-	/* re-arm NEWIDLE balancing when moving tasks */
-	src_rq->avg_idle = this_rq->avg_idle = 2*sysctl_sched_migration_cost;
-	this_rq->idle_stamp = 0;
 }
 
 /*
--

From: Alex,Shi
Date: Tuesday, November 16, 2010 - 5:23 pm

Yes, but lots of benchmarks often find other part kernel issues. like
hackbench/netperf often find VM performance issues. And an our cache
The vmstat for .36 represents original kernel, .37-rc1 represent with


--

From: Nikhil Rao
Date: Tuesday, November 16, 2010 - 9:21 am

I'm not very familiar with the Volanomark benchmark, but from the
patch you posted it looks like resetting the idle throttle (i.e.
making newidle balance more likely) hurts this benchmark. This is also
evident in the sharp increase in ctx rate in 2.6.37-rc1. Resetting
throttling was a heuristic added to induce more frequent idle
balancing after a migration, and it isn't strictly required to make
the other parts of that patch work. In your patch above, can you

The benchmark was pretty simple; run nr_cpu while-1 loops, one of them
niced to -15. I also experimented with some payloads with random
sleep/wakeup times, but nothing with high frequency balancing. I think
removing the reset should be OK for the idle test cases.

-Thanks,
Nikhil
--

Previous thread: Are you travel savvy? by Travel Community Geeks on Tuesday, November 16, 2010 - 2:11 am. (1 message)

Next thread: [PATCH] perf: add no-aggregation mode to perf stat -a (v2) by Stephane Eranian on Tuesday, November 16, 2010 - 1:50 am. (1 message)