Add an rcu_read_lock() / rcu_read_unlock() pair to protect a fork-time cgroup access. This seems likely to be a false positive. Located by: Alessio Igor Bogani <abogani@texware.it> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> --- sched.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/kernel/sched.c b/kernel/sched.c index 9ab3cd7..d4bb5e0 100644 --- a/kernel/sched.c +++ b/kernel/sched.c @@ -2621,7 +2621,9 @@ void sched_fork(struct task_struct *p, int clone_flags) if (p->sched_class->task_fork) p->sched_class->task_fork(p); + rcu_read_lock(); set_task_cpu(p, cpu); + rcu_read_unlock(); #if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT) if (likely(sched_info_on())) --
What got accessed? This patch just looks wrong. --
So the only cgroup thing I can find is set_task_rq()'s task_group() usage, which does indeed look like it wants RCU, but then, why would only the sched_fork() usage of set_task_cpu() need this. If it's needed (possible) then set_task_rq() needs it unconditionally. That said, the whole task_group stuff is tied to the cgroup muck, so it shouldn't be possible for the task_group to disappear on us, but then, I always sorta glaze over when I get near the cgroup core. --
Hey, if it was easy, I might have gotten it right on the first try! ;-) Thanx, Paul --
It might well be wrong. The lockdep RCU splat triggered in
task_subsys_state():
static inline struct cgroup_subsys_state *task_subsys_state(
struct task_struct *task, int subsys_id)
{
return rcu_dereference_check(task->cgroups->subsys[subsys_id],
rcu_read_lock_held() ||
cgroup_lock_is_held());
}
My thought was that this access was safe due to the fact that we were
in the middle of creating a task, so that no other CPU could access it.
But I am not certain of this, given the fact that this is digging through
cgroup state.
The lockdep splat is here: http://pastebin.ubuntu.com/406131/
Suggestions?
Thanx, Paul
--
And it appears that my patch is at best insufficient: http://paste.ubuntu.com/406189/ Left to myself, I would wrap copy_process() with rcu_read_lock(), but I would rather hear your thoughts before doing too much more semi-random hacking. ;-) Thanx, Paul --
Well, I don't think you can get away with that, copy_process() wants to sleep on quite a few places ;-) Also, locks should be taken at the smallest possible scope, unless we want to go back to BKL style locking :-) As to that freezer splat, you'd have to chase down the cgroup folks, I'm fully ignorant on that. For the set_task_rq() one, I'm afraid someone (which again I'm afraid will be me) will have to look into how the task_group muck ties into the cgroup muck as I think the original authors of that ran off :/ --
K, adding them to CC. The two splats are: http://pastebin.ubuntu.com/406131/ http://paste.ubuntu.com/406189/ Some additional RCU protection is required, or perhaps some suppression Sigh! Thanx, Paul --
On Mon, Mar 29, 2010 at 2:15 PM, Paul E. McKenney I think you're right that this is a false positive - it would only be a problem if it were possible for the task to be moved to a different cgroup, and I think that shouldn't be the case at this point in the fork path since the new process isn't visible on the tasklist yet, right? Paul --
You are correct, it is not yet on the tasklist. So I have to ask... What happens if the underlying cgroup is removed between the time sched_fork() calls set_task_cpu() and the time that copy_process() puts the new task on the tasklist? Or is the initial cgroup guaranteed to be immortal? Thanx, Paul --
On Mon, Mar 29, 2010 at 4:05 PM, Paul E. McKenney As long as your code is running after cgroup_fork() - which sched_fork() is - then it should be OK. cgroup_fork() takes a reference count on the parent's cgroups set, which implicitly keeps all of those cgroups alive until the task either exits or is moved. But it can't be moved until it's visible on the task list. Possibly dup_task_struct should do tsk->cgroups = NULL so that (currently) unsafe references to the un-refcounted tsk->cgroups before cgroup_fork() get caught. Paul --
Well the thing is, this fork time invocation of set_task_cpu()->set_task_rq() is in no way special, there's multiple places like that. --
Certainly the lack of clarity about why this access is safe indicates that recording some of the locking design in rcu_dereferenceI() would be quite helpful. ;-) Thanx, Paul --
