On Mon, 2008-04-07 at 13:08 -0700, Roland McGrath wrote:
Okay. One of the paths to the update code is through update_curr() in
sched_fair.c, which (in my tree) calls account_group_exec_runtime() to
update the sum_exec_runtime field:

	delta_exec = (unsigned long)(now - curr->exec_start);

	__update_curr(cfs_rq, curr, delta_exec);
	curr->exec_start = now;

	if (entity_is_task(curr)) {
		struct task_struct *curtask = task_of(curr);

		cpuacct_charge(curtask, delta_exec);
		account_group_exec_runtime(curtask, delta_exec);
	}

To make sure that I understand what's going on, I put an invariant at
the beginning of account_group_exec_runtime():

static inline void account_group_exec_runtime(struct task_struct *tsk,
					      unsigned long long ns)
{
	struct signal_struct *sig = tsk->signal;
	struct task_cputime *times;

	BUG_ON(tsk != current);

	if (unlikely(tsk->exit_state))
		return;
	if (!sig->cputime.totals)
		return;

	times = per_cpu_ptr(sig->cputime.totals, get_cpu());
	times->sum_exec_runtime += ns;
	put_cpu_no_resched();
}

And, you guessed it, the invariant gets violated. Apparently the passed
task_struct isn't the same as "current" at this point.
Any ideas? Am I checking the wrong thing? If we're really not updating
current then the task we are updating could very easily be running
through __exit_signal() on another CPU. (And while I wait for your
response I will of course continue to try to figure this out.)
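
For anyone following along without the tree handy, here is a rough
user-space sketch of the per-CPU accumulation that
account_group_exec_runtime() performs. It is only an illustration, not
the kernel code: NR_CPUS_SKETCH, struct cputime_sketch,
struct group_totals_sketch, account_sketch() and sum_sketch() are all
names made up for this example, the CPU number is passed in explicitly
instead of coming from get_cpu(), and the idea that a reader sums the
slots to get the group total is an assumption about how the totals are
consumed.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for the sketch */

/* Stand-in for struct task_cputime; only the field used above. */
struct cputime_sketch {
	uint64_t sum_exec_runtime;
};

/* Stand-in for the per-CPU allocation behind sig->cputime.totals. */
struct group_totals_sketch {
	struct cputime_sketch percpu[NR_CPUS_SKETCH];
};

/* Charge ns to the slot of the CPU doing the accounting. */
static inline void account_sketch(struct group_totals_sketch *g,
				  unsigned int cpu, uint64_t ns)
{
	assert(cpu < NR_CPUS_SKETCH);
	g->percpu[cpu].sum_exec_runtime += ns;
}

/* Assumed reader side: the group total is the sum over all slots. */
static inline uint64_t sum_sketch(const struct group_totals_sketch *g)
{
	uint64_t total = 0;
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		total += g->percpu[cpu].sum_exec_runtime;
	return total;
}
```

The point of the layout is that each CPU only ever writes its own slot,
so the update itself needs no lock. That only protects the counters,
though; it says nothing about the lifetime of the structure holding
them, which is exactly the worry with a task that is not current.
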
--
Frank Mayhar <fmayhar@google.com>
Google, Inc.