Re: [BUG] hotplug cpus on ia64

To: Peter Zijlstra <a.p.zijlstra@...>
Cc: <sivanich@...>, <linux-kernel@...>
Date: Tuesday, June 3, 2008 - 6:17 pm

Yes! It does.

Dimitri Sivanich has run into what looks like a similar problem.
Hope the above workaround is a good clue to its solution.

--
Cliff Wickman
Silicon Graphics, Inc.
cpw@sgi.com
(651) 683-3824
--

To: Cliff Wickman <cpw@...>
Cc: <sivanich@...>, <linux-kernel@...>
Date: Thursday, June 5, 2008 - 8:49 am

Does the below fix it?

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
kernel/sched.c | 15 +++++--
kernel/sched_rt.c | 109 +++++++++++++++++++++++++++++++++++++++++++++++++++---
2 files changed, 115 insertions(+), 9 deletions(-)

Index: linux-2.6/kernel/sched_rt.c
===================================================================
--- linux-2.6.orig/kernel/sched_rt.c
+++ linux-2.6/kernel/sched_rt.c
@@ -280,6 +280,9 @@ static int balance_runtime(struct rt_rq
 			continue;

 		spin_lock(&iter->rt_runtime_lock);
+		if (iter->rt_runtime == RUNTIME_INF)
+			goto next;
+
 		diff = iter->rt_runtime - iter->rt_time;
 		if (diff > 0) {
 			do_div(diff, weight);
@@ -293,12 +296,105 @@ static int balance_runtime(struct rt_rq
 				break;
 			}
 		}
+next:
 		spin_unlock(&iter->rt_runtime_lock);
 	}
 	spin_unlock(&rt_b->rt_runtime_lock);

 	return more;
 }
+
+static void __disable_runtime(struct rq *rq)
+{
+	struct root_domain *rd = rq->rd;
+	struct rt_rq *rt_rq;
+
+	if (unlikely(!scheduler_running))
+		return;
+
+	for_each_leaf_rt_rq(rt_rq, rq) {
+		struct rt_bandwidth *rt_b = sched_rt_bandwidth(rt_rq);
+		s64 want;
+		int i;
+
+		spin_lock(&rt_b->rt_runtime_lock);
+		spin_lock(&rt_rq->rt_runtime_lock);
+		if (rt_rq->rt_runtime == RUNTIME_INF ||
+				rt_rq->rt_runtime == rt_b->rt_runtime)
+			goto balanced;
+		spin_unlock(&rt_rq->rt_runtime_lock);
+
+		want = rt_b->rt_runtime - rt_rq->rt_runtime;
+
+		for_each_cpu_mask(i, rd->span) {
+			struct rt_rq *iter = sched_rt_period_rt_rq(rt_b, i);
+			s64 diff;
+
+			if (iter == rt_rq)
+				continue;
+
+			spin_lock(&iter->rt_runtime_lock);
+			if (want > 0) {
+				diff = min_t(s64, iter->rt_runtime, want);
+				iter->rt_runtime -= diff;
+				want -= diff;
+			} else {
+				iter->rt_runtime -= want;
+				want -= want;
+			}
+			spin_unlock(&iter->rt_runtime_lock);
+
+			if (...
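
(Editor's illustration, not part of the posted patch.) The visible part of the inner loop in __disable_runtime() can be read as settling the difference between one CPU's RT runtime and the group-wide budget against its peers. Below is a minimal user-space sketch of that arithmetic only, assuming a plain array of per-CPU budgets and no locking. The name settle_runtime(), its parameters, and the early exit once nothing is left to settle are hypothetical; the rest of the real function is truncated above and is not reproduced here.

```c
#include <assert.h>

/* Hypothetical helper for this sketch: smaller of two signed 64-bit values. */
static long long min_ll(long long a, long long b)
{
	return a < b ? a : b;
}

/*
 * User-space model of the visible loop body (assumption: one budget per CPU
 * in a plain array, no locks, no RUNTIME_INF handling).
 *
 * want > 0: CPU `self` holds less than `full`, so take runtime from peers,
 *           at most what each peer currently holds.
 * want < 0: CPU `self` holds more than `full`, so hand the surplus to the
 *           first peer visited.
 *
 * runtime[self] itself is left untouched, because what the patch does with
 * it afterwards is not visible in the truncated posting.
 * Returns the part of the difference that could not be settled.
 */
static long long settle_runtime(long long *runtime, int ncpus, int self,
				long long full)
{
	long long want;
	int i;

	assert(self >= 0 && self < ncpus);
	want = full - runtime[self];

	/* Stopping once want reaches zero is a choice made for this sketch. */
	for (i = 0; i < ncpus && want != 0; i++) {
		long long diff;

		if (i == self)
			continue;

		if (want > 0) {
			diff = min_ll(runtime[i], want);
			runtime[i] -= diff;
			want -= diff;
		} else {
			runtime[i] -= want;	/* want is negative: peer gains */
			want = 0;
		}
	}
	return want;
}
```

With budgets {950, 600, 1100, 1150}, self = 1 and full = 950, the call starts with want = 350, takes all 350 from CPU 0, and returns 0. As in the patch, the loop takes from whichever peer it visits first, whether or not that peer had borrowed anything.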

To: Peter Zijlstra <a.p.zijlstra@...>
Cc: Cliff Wickman <cpw@...>, <sivanich@...>, <linux-kernel@...>
Date: Tuesday, June 10, 2008 - 6:19 am

while it's not the full fix i've applied it to tip/sched-devel for more
testing. Thanks,

Ingo
--

To: Peter Zijlstra <a.p.zijlstra@...>
Cc: Cliff Wickman <cpw@...>, <linux-kernel@...>
Date: Thursday, June 5, 2008 - 9:51 am

pid 4502's current affinity list: 3
pid 4502's current affinity list: 0,1,3
(above command now hangs)

(ps output)
0xe0000060b5650000 4502 4349 0 2 S 0xe0000060b5650390 bash
0xe0000060b8da0000 4843 4502 0 2 D 0xe0000060b8da0390 bash

Stack traceback for pid 4843
0xe0000060b8da0000 4843 4502 0 2 D 0xe0000060b8da0390 bash
0xa0000001007d44b0 schedule+0x1210
args (0xe0000060ba470ce4, 0xa000000100dae190, 0xe000006003129200, 0xa000000100084b70, 0x48c, 0xe0000060b8dafda8, 0xe000006003129200, 0x200, 0xe0000060f780fe80)
0xa0000001007d4ac0 schedule_timeout+0x40
args (0x7fffffffffffffff, 0x0, 0x0, 0xa0000001007d2f00, 0x309, 0xe000006003129200)
0xa0000001007d2f00 wait_for_common+0x240
args (0xe0000060b8dafe08, 0x7fffffffffffffff, 0x2, 0xa0000001007d3280, 0x207, 0xe0000060ba470070)
0xa0000001007d3280 wait_for_completion+0x40
args (0xe0000060b8dafe08, 0xa00000010008d990, 0x38a, 0xffffffffffff9200)
0xa00000010008d990 sched_exec+0x1b0
args (0x2, 0xe0000060ba470000, 0xe0000060ba470010, 0xe000006003129200, 0xa00000010017e980, 0x58e, 0xa00000010017dce0)
0xa00000010017e980 do_execve+0xa0
args (0xe0000060f39e5000, 0x60000000000394b0, 0x6000000000056150, 0xe0000060b8dafe40, 0xe0000060f799f100, 0xe0000060f799bb00, 0xe0000060f799bbd8, 0x60000000000620b1, 0xa000000100013940)
0xa000000100013940 sys_execve+0x60
args (0xe0000060f39e5000, 0xe0000060f39e5000, 0x6000000000056150, 0xe0000060b8dafe40, 0xa00000010000a270, 0x50e, 0x2000000000028490)
0xa00000010000a270 ia64_execve+0x30
args (0x60000000000620a0, 0x60000000000394b0, 0x6000000000056150, 0x0, 0xc00000000000058e, 0x400000000003d020, 0x60000000000394b0, 0x0, 0xa00000010000aba0)
0xa00000010000aba0 ia64_ret_from_syscall
args (0x60000000000620a0, 0x60000000000394b0, 0x6000000000056150, 0x0, 0xc00000000000058e, 0x400000000003d020, 0x60000000000394b0, 0x0)
0xa000000000010720 __kernel_syscall_via_break
args (0x60000000000620a0, 0x600...

To: Dimitri Sivanich <sivanich@...>
Cc: Cliff Wickman <cpw@...>, <linux-kernel@...>
Date: Thursday, June 5, 2008 - 10:18 am

Humpfh :-( I'll continue looking then...

Thanks for testing.

--

To: Cliff Wickman <cpw@...>
Cc: Peter Zijlstra <a.p.zijlstra@...>, <linux-kernel@...>
Date: Wednesday, June 4, 2008 - 9:50 am

This fixes the problem I was seeing as well.
--
