Re: clock drift in set_task_cpu()

Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
From: Peter Zijlstra
Date: Thursday, August 5, 2010 - 2:58 am

On Wed, 2010-07-21 at 17:10 +0530, Jack Daniel wrote:

What values get incorrect, do you observe vruntime funnies or only the
schedstat values?


Ah, so the problem is that task_fork_fair() does the task placement
without updated rq clocks? In which case I think we should at least do
an update_rq_clock(rq) in there (see the below patch).


sched_clock_stable being true implies the clock is stable across cores
and thus it shouldn't matter. Or are you saying you're seeing it being
set and still have issues?


The problem here is that your patch introduces the exact race I was
trying to close. We can only access rq->clock when also holding the
appropriate rq->lock, disabling IRQs, while also required is not
sufficient.

[ Also, that rmb() looks just plain wrong ]

Anyway, does the below cure your trouble? (Also, could you describe your
actually observed trouble in more detail?)

---

Subject: sched: ensure rq->clock get sync'ed when migrating tasks

sched_fork() -- we do task placement in ->task_fork_fair() ensure we
  update_rq_clock() so we work with current time. We leave the vruntime
  in relative state, so the time delay until wake_up_new_task() doesn't
  matter.

wake_up_new_task() -- Since task_fork_fair() left p->vruntime in
  relative state we can safely migrate, the activate_task() on the
  remote rq will call update_rq_clock() and causes the clock to be
  synced (enough).

try_to_wake_up() -- In case we'll migrate, we need to update the old
  rq->clock, the activate_task() in ttwu_activate() will already update
  the new rq->clock, and thus the clocks will get sync'ed.

load-balance -- Migrating running tasks always happens with both rq's
  locked, either through double_rq_lock() or double_lock_balance(). So
  sync the rq->clock there.

The problem seems to be that this all could result in too many rq->clock
updates, which are expensive.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 kernel/sched.c      |   20 ++++++++++++++++++--
 kernel/sched_fair.c |    2 ++
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index d3c0262..69584b4 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -1756,13 +1756,22 @@ static int _double_lock_balance(struct rq *this_rq, struct rq *busiest)
  */
 static int double_lock_balance(struct rq *this_rq, struct rq *busiest)
 {
+	int ret;
+
+#ifdef CONFIG_SCHED_DEBUG
 	if (unlikely(!irqs_disabled())) {
 		/* printk() doesn't work good under rq->lock */
 		raw_spin_unlock(&this_rq->lock);
 		BUG_ON(1);
 	}
+#endif
+
+	ret = _double_lock_balance(this_rq, busiest);
+
+	update_rq_clock(this_rq);
+	update_rq_clock(busiest);
 
-	return _double_lock_balance(this_rq, busiest);
+	return ret;
 }
 
 static inline void double_unlock_balance(struct rq *this_rq, struct rq *busiest)
@@ -1782,7 +1791,9 @@ static void double_rq_lock(struct rq *rq1, struct rq *rq2)
 	__acquires(rq1->lock)
 	__acquires(rq2->lock)
 {
+#ifdef CONFIG_SCHED_DEBUG
 	BUG_ON(!irqs_disabled());
+#endif
 	if (rq1 == rq2) {
 		raw_spin_lock(&rq1->lock);
 		__acquire(rq2->lock);	/* Fake it out ;) */
@@ -1795,6 +1806,9 @@ static void double_rq_lock(struct rq *rq1, struct rq *rq2)
 			raw_spin_lock_nested(&rq1->lock, SINGLE_DEPTH_NESTING);
 		}
 	}
+
+	update_rq_clock(rq1);
+	update_rq_clock(rq2);
 }
 
 /*
@@ -2395,8 +2409,10 @@ static int try_to_wake_up(struct task_struct *p, unsigned int state,
 	}
 
 	cpu = select_task_rq(rq, p, SD_BALANCE_WAKE, wake_flags);
-	if (cpu != orig_cpu)
+	if (cpu != orig_cpu) {
 		set_task_cpu(p, cpu);
+		update_rq_clock(rq);
+	}
 	__task_rq_unlock(rq);
 
 	rq = cpu_rq(cpu);
diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index 9910e1b..f816e74 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -3751,6 +3751,8 @@ static void task_fork_fair(struct task_struct *p)
 
 	raw_spin_lock_irqsave(&rq->lock, flags);
 
+	update_rq_clock(rq);
+
 	if (unlikely(task_cpu(p) != this_cpu))
 		__set_task_cpu(p, this_cpu);
 

--
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
clock drift in set_task_cpu(), Jack Daniel, (Wed Jul 21, 4:40 am)
Re: clock drift in set_task_cpu(), Jack Daniel, (Wed Jul 21, 10:34 pm)
Re: clock drift in set_task_cpu(), Jack Daniel, (Fri Jul 30, 4:55 am)
Re: clock drift in set_task_cpu(), Peter Zijlstra, (Thu Aug 5, 2:58 am)
Re: clock drift in set_task_cpu(), Jack Daniel, (Mon Aug 9, 6:17 am)
Re: clock drift in set_task_cpu(), Philby John, (Mon Aug 9, 7:56 am)
[tip:sched/urgent] sched: Fix rq-&gt;clock synchronization wh ..., tip-bot for Peter Zi ..., (Fri Aug 20, 7:16 am)
Re: clock drift in set_task_cpu(), Jack Daniel, (Sun Sep 5, 11:34 pm)
Re: clock drift in set_task_cpu(), Ingo Molnar, (Sun Sep 5, 11:52 pm)