On Fri, Mar 28, 2008 at 1:23 AM, Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
I'm not convinced that we need the spinlock. Just use the simple rule
that you can only modify mm->owner if:
- mm->owner points to current
- the new owner is a user of mm
- you hold task_lock() for the new owner (which is necessary anyway to
ensure that the new owner's mm doesn't change while you're updating
mm->owner)
and I think everything is fine without an additional lock.
Do we need this? p->mm can't go away if we're in the middle of forking it.
I suspect that you meant this to be spin_lock(&mm->owner_lock).
I'm not sure I understand what this is doing.
I read it as "if p has its own mm and p is a child thread, set
p->mm->owner to p->group_leader". But by definition if p has its own
mm, then p->group_leader->mm will be different to p->mm, therefore
we'd end up with mm->owner->mm != mm, which seems very bad.
What's the intention of this bit of code?
Is this stale?
And this?
I think it would be better to make this static inline in the header
file - it's just two indexed dereferences, so hardly worth the
function call overhead.
No, controller code shouldn't be changing mm->owner.
And surely we don't need mm_init_cgroup() and mm_free_cgroup() any longer?
Why is it OK to take away the rcu_read_lock() here? We're still doing
an rcu_dereference().
This shouldn't be in here - it should be in the core code that sets up init_mm.
The only way that rcu_read_lock() helps here is if mm freeing is
protected by RCU, which I don't think is the case.
But as long as p==current, there's no race, since no other process
will re-point mm->owner at themselves, so mm can't go away anyway
since we have a reference to it that we're going to be dropping soon.
Is there ever a case where we'd want to call this on anything other
than current? It would simplify the code to just refer to current
rather than tsk.
I'd be inclined to make this BUG_ON(p != current), or just have p as a
local variable initialized from current. (If you're trying to save
multiple calls to current on arches where it's not just a simple
register).
We need to keep checking mm_need_new_owner() since it can become false
if the only other user of the mm exits at the same time that we do.
(In which case there's nothing to do).
Is there a reason to not code this as for_each_thread?
This can break if c is also exiting and has passed the call to
mm_update_next_owner() by the time we assign mm->owner. That's why my
original suggested version had a function like:
static inline void try_give_mm_ownership(struct task_struct *task,
struct mm_struct *mm) {
if (task->mm != mm) return;
task_lock(task);
if (task->mm == mm) {
mm->owner = task;
}
task_unlock(task);
}
i.e. determining that a task is a valid candidate and updating the
owner pointer has to be done in the same critical section.
Also, looking forward to when we have the virtual AS limits
controller, in the (unlikely?) event that the new owner is in a
different virtual AS limit control group, this code will need to be
able to handle shifting the mm->total_mm from the old AS cgroup to the
new one. That's the "fiddly layer violation" that I mentioned earlier.
It might be cleaner to be able to specify on a per-subsystem basis
whether we require that all users of an mm be in the same cgroup.
Maybe this should select MM_OWNER rather than depending on it?
Paul
--