On 09/23, Paul E. McKenney wrote:

Even if rcu lock/unlock happen on different CPUs, rcu_flipctr is not "shared":
we remember the index, but not the CPU, and we always modify the "local" data.

Thanks a lot! Actually, this explains most of my questions. I was greatly
confused even before I started to read the code. Looking at rcu_flipctr[2],
I wrongly assumed that these 2 counters "belong" to different GPs. When I
started to understand the reality, I was confused by GP_STAGES == 4 ;)

Thanks! Yes, this is clear.

Yes, yes, I see now. We really need these barriers, except I think
rcu_try_flip_idle() can use wmb.

However, I have a slightly off-topic question,

	// rcu_try_flip_waitzero()
	if (A == 0) {
		mb();
		B == 0;
	}

Do we really need the mb() in this case? How is it possible that the STORE
goes before the LOAD? "Obviously", the LOAD should be completed first, no?

Oooh... well... I need to think more about your explanation :)

Hmm. Still can't understand. Suppose that we are doing call_rcu(), and
__rcu_advance_callbacks() sees rdp->completed == rcu_ctrlblk.completed
but rcu_flip_flag == rcu_flipped (say, another CPU does rcu_try_flip_idle()
in between). We ack the flip, call_rcu() enables irqs, the timer interrupt
calls __rcu_advance_callbacks() again and moves the callbacks.

So, it is still possible that "move callbacks" and "ack the flip" happen
out of order. But why is this bad? This can't "speed up" the moving of our
callbacks from the next to the done lists. Yes, the RCU state machine can
switch to the next state/stage, but this looks safe to me.

Help!

The last question: rcu_check_callbacks does

	if (rcu_ctrlblk.completed == rdp->completed)
		rcu_try_flip();

Could you clarify the check above? Afaics this is just an optimization;
technically it is correct to call rcu_try_flip() at any time, even if the
->completed values are not synchronized. Most probably in that case
rcu_try_flip_waitack() will fail, but nothing bad can happen, yes?

oldmask could be obsolete now.
Suppose that the admin moves that task to some cpuset or just changes its
->cpus_allowed while it does synchronize_sched().

I think there is another problem. It would be nice to eliminate taking the
global sched_hotcpu_mutex in sched_setaffinity() (I think without
CONFIG_HOTPLUG_CPU it is not needed right now). In that case
sched_setaffinity(0, cpumask_of_cpu(cpu)) can in fact return on the "wrong"
CPU != cpu if another thread changes our affinity in parallel, and this
breaks synchronize_sched().

Can't we do something different? Suppose that we changed migration_thread(),
something like (roughly)

	-	__migrate_task(req->task, cpu, req->dest_cpu);
	+	if (req->task)
	+		__migrate_task(req->task, cpu, req->dest_cpu);
	+	else
	+		schedule(); // unneeded, mb() is enough?
		complete(&req->done);

Now,

	void synchronize_sched(void)
	{
		struct migration_req req;
		int cpu;

		req.task = NULL;
		init_completion(&req.done);

		for_each_online_cpu(cpu) {
			struct rq *rq = cpu_rq(cpu);
			int online;

			spin_lock_irq(&rq->lock);
			online = cpu_online(cpu); // HOTPLUG_CPU
			if (online) {
				list_add(&req.list, &rq->migration_queue);
				req.done.done = 0;
			}
			spin_unlock_irq(&rq->lock);

			if (online) {
				wake_up_process(rq->migration_thread);
				wait_for_completion(&req.done);
			}
		}
	}

Alternatively, we can use schedule_on_each_cpu(), but it has other
disadvantages.

Thoughts?

Oleg.
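The rcu_flipctr point at the top of this message (remember the index, not the CPU, and always modify the local counter) can be illustrated with a minimal userspace sketch. This is only a single-threaded model under stated assumptions: it ignores preemption, interrupts and memory barriers entirely, and the names SKETCH_NR_CPUS, struct reader, flipctr and the sketch_* functions are hypothetical, not the kernel's code.

```c
#include <assert.h>

#define SKETCH_NR_CPUS 4	/* hypothetical CPU count for this model */

/* One pair of counters per CPU, modelled on rcu_flipctr[2]. */
static long flipctr[SKETCH_NR_CPUS][2];

/* A reader remembers only the index it used, never the CPU. */
struct reader {
	int idx;
};

/* Enter a read-side section on 'cpu' using the current flip index. */
static void sketch_read_lock(struct reader *r, int cpu, int cur_idx)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	r->idx = cur_idx & 1;
	flipctr[cpu][r->idx]++;		/* counter local to this CPU */
}

/* Leave the section, possibly on a different CPU than the lock. */
static void sketch_read_unlock(const struct reader *r, int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	flipctr[cpu][r->idx]--;		/* same index, local counter again */
}

/* Only the sum over all CPUs for a given index is meaningful. */
static long sketch_sum(int idx)
{
	long sum = 0;
	int cpu;

	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		sum += flipctr[cpu][idx & 1];
	return sum;
}
```

If a reader locks on CPU 0 and unlocks on CPU 2, the individual counters end up at 1 and -1, but the sum for that index returns to zero, which is the only thing a waiter for that index would need to look at in this model.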
