Re: [patch] sched: optimize siblings status check logic in wake_idle()

From: Siddha, Suresh B
Date: Friday, March 2, 2007 - 9:23 pm

When a logical cpu 'x' already has more than one process running, the siblings
of 'x' are most likely busy as well. Otherwise the idle siblings would, in most
scenarios, already have picked up the extra load, leaving at most one process
on 'x'.

Use this reasoning to skip the sibling status check in that case and reduce the
cache misses encountered on a heavily loaded system.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
---

diff --git a/kernel/sched.c b/kernel/sched.c
index 0dc7572..d1ecc56 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -1368,7 +1368,16 @@ static int wake_idle(int cpu, struct task_struct *p)
 	struct sched_domain *sd;
 	int i;
 
-	if (idle_cpu(cpu))
+	/*
+	 * If it is idle, then it is the best cpu to run this task.
+	 *
+	 * This cpu is also the best, if it has more than one task already.
+	 * Siblings must also be busy (in most cases) as they didn't already
+	 * pick up the extra load from this cpu and hence we need not check
+	 * sibling runqueue info. This will avoid the checks and cache miss
+	 * penalties associated with that.
+	 */
+	if (idle_cpu(cpu) || cpu_rq(cpu)->nr_running > 1)
 		return cpu;
 
 	for_each_domain(cpu, sd) {
-
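
To make the fast path in the patch concrete, here is a minimal user-space sketch of the decision it adds. It is not the kernel code: `rq_model`, `runqueues`, `sibling_of()`, `cpu_is_idle()`, `sibling_checks` and `wake_idle_model()` are hypothetical names, the sched-domain walk is reduced to a single sibling, and "idle" is approximated as `nr_running == 0`.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* assumption: two cores with two hardware threads each */

/* Simplified stand-in for a per-CPU runqueue; only the field the patch reads. */
struct rq_model {
	unsigned int nr_running;
};

static struct rq_model runqueues[NR_CPUS];

/* Counts how often a sibling's runqueue had to be inspected. */
static unsigned int sibling_checks;

/* Hypothetical topology: CPUs 0/1 are siblings, as are CPUs 2/3. */
static int sibling_of(int cpu)
{
	return cpu ^ 1;
}

/* Approximation of idle_cpu(): no runnable tasks on this CPU. */
static bool cpu_is_idle(int cpu)
{
	return runqueues[cpu].nr_running == 0;
}

/* Model of the patched check: pick a CPU for a task waking on 'cpu'. */
static int wake_idle_model(int cpu)
{
	int sib;

	assert(cpu >= 0 && cpu < NR_CPUS);

	/* Fast path from the patch: idle, or already running more than one task. */
	if (cpu_is_idle(cpu) || runqueues[cpu].nr_running > 1)
		return cpu;

	/* Slow path: read the sibling's runqueue, which may miss the cache. */
	sibling_checks++;
	sib = sibling_of(cpu);
	return cpu_is_idle(sib) ? sib : cpu;
}
```

With `runqueues[0].nr_running = 2`, `wake_idle_model(0)` returns 0 without touching the sibling, even if CPU 1 happens to be idle. That is the uncommon case the patch accepts in exchange for skipping the sibling check.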

From: Nick Piggin
Date: Sunday, March 4, 2007 - 7:35 pm

Well it does increase the cacheline footprint a bit, but all cachelines
should be local to our L1 cache, presuming you don't have any CPUs where
threads have separate caches.

-

From: Siddha, Suresh B
Date: Sunday, March 4, 2007 - 9:13 pm

It's more of a theory. There will be some conditions where this won't be true but

These wakeups can happen across SMP and NUMA domains. In those cases, the sibling
runqueue lines most likely won't be in the caches. This has nothing to do with

On a 16-node system, we have seen a ~1.25% perf improvement on a database workload
when we completely short-circuited wake_idle(). This patch is trying to come up
with the best compromise: avoid the cache misses while also minimizing the latency
and perf impact.

thanks,
suresh
-

From: Nick Piggin
Date: Sunday, March 4, 2007 - 9:58 pm

Hmm, I wonder what if we only wake_idle if the wakeup comes from this
CPU or a sibling? That's probably going to have downsides in some
workloads as well, though.
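
One possible reading of this suggestion, sketched in user space under stated assumptions: the idle-sibling search runs only when the waker is the target CPU or its sibling, and remote wakeups keep the task where it is. The names `nr_running_model`, `sibling_of()`, `wakeup_is_local()` and `wake_idle_local_only()` are hypothetical, and the two-thread topology is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* assumption: two cores with two hardware threads each */

/* Simplified per-CPU load: number of runnable tasks. */
static unsigned int nr_running_model[NR_CPUS];

/* Hypothetical topology: CPUs 0/1 are siblings, as are CPUs 2/3. */
static int sibling_of(int cpu)
{
	return cpu ^ 1;
}

/* True when the wakeup originates on the target CPU or on its sibling. */
static bool wakeup_is_local(int waker_cpu, int target_cpu)
{
	return waker_cpu == target_cpu || waker_cpu == sibling_of(target_cpu);
}

/* Search for an idle sibling only on local wakeups. */
static int wake_idle_local_only(int waker_cpu, int target_cpu)
{
	int sib;

	assert(waker_cpu >= 0 && waker_cpu < NR_CPUS);
	assert(target_cpu >= 0 && target_cpu < NR_CPUS);

	/* Idle target, or a remote waker: stay put and skip the sibling read. */
	if (nr_running_model[target_cpu] == 0 ||
	    !wakeup_is_local(waker_cpu, target_cpu))
		return target_cpu;

	sib = sibling_of(target_cpu);
	return nr_running_model[sib] == 0 ? sib : target_cpu;
}
```

In this sketch a remote waker never moves the task to an idle sibling, which is one place the downsides mentioned above could show up.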

-

From: Siddha, Suresh B
Date: Sunday, March 4, 2007 - 9:24 pm

Yep. I thought about it, and I think this patch is a decent solution.

thanks,
suresh
-
