Re: [PATCH] cgroups: fix API thinko

From: Michael S. Tsirkin
Date: Thursday, August 5, 2010 - 3:59 pm

The cgroup_attach_task_current_cg API that we have upstream is backwards: we
really need an API to attach the current task to the cgroups of another
process A.

In our case (vhost), a privileged user wants to attach its task to the cgroups
of a less privileged one. The API makes us run it in the other
task's context, and this fails.

So let's make the API generic and just pass in 'from' and 'to' tasks.
Add an inline wrapper for cgroup_attach_task_current_cg to avoid
breaking bisect.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---

Paul, Li, Sridhar, could you please review the following
patch?

I only compile-tested it due to travel, but it looks
straightforward to me.
Alex Williamson volunteered to test and report the results.
Sending out now for review as I might be offline for a bit.
Will only try to merge when done, obviously.

If OK, I would like to merge this through the -net tree,
together with the patch fixing vhost-net.
Let me know if that sounds ok.

Thanks!

This patch is on top of net-next. It is needed to fix a
vhost-net regression in net-next, where a non-privileged
process can't enable the device anymore:

when qemu uses vhost, inside the ioctl call it
creates a thread and tries to add
this thread to the cgroups of current, and it fails.
But we control the thread, so to solve the problem,
we really should tell it 'connect to our cgroups'.

What this patch does is add an API for that.
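
To make the direction of the call concrete, here is a rough user-space
model of the idea, not kernel code. The struct task type, the
attach_task_all() and attach_task_current_cg() helpers, the current_task
pointer and NR_ROOTS are all hypothetical stand-ins for task_struct,
cgroup_attach_task_all(), cgroup_attach_task_current_cg(), 'current' and
the set of active hierarchies. It only illustrates that the generic form
lets the caller name both the source and the task to be moved, while the
old form hard-wires the source to the caller.

```c
/* User-space model only: every type and helper here is a hypothetical
 * stand-in, not the kernel's task_struct or cgroup implementation. */
#include <assert.h>
#include <stddef.h>

#define NR_ROOTS 2	/* pretend two cgroup hierarchies are mounted */

struct task {
	int cgroup_id[NR_ROOTS];	/* group membership, one per hierarchy */
};

/* Generic form: move 'tsk' into every cgroup that 'from' belongs to.
 * Neither task has to be the caller. */
static int attach_task_all(const struct task *from, struct task *tsk)
{
	assert(from != NULL && tsk != NULL);
	for (size_t i = 0; i < NR_ROOTS; i++)
		tsk->cgroup_id[i] = from->cgroup_id[i];
	return 0;
}

/* Stand-in for the kernel's 'current': the task making the call. */
static struct task *current_task;

/* Old form kept as a thin wrapper: the source is always the caller. */
static int attach_task_current_cg(struct task *tsk)
{
	return attach_task_all(current_task, tsk);
}
```

In this model the vhost case would be the worker thread calling
attach_task_all(&qemu, &worker) from its own context, which the wrapper
alone cannot express because it always copies from the caller.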

 include/linux/cgroup.h |   11 ++++++++++-
 kernel/cgroup.c        |    9 +++++----
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 43b2072..b38ec60 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -525,7 +525,11 @@ struct task_struct *cgroup_iter_next(struct cgroup *cgrp,
 void cgroup_iter_end(struct cgroup *cgrp, struct cgroup_iter *it);
 int cgroup_scan_tasks(struct cgroup_scanner *scan);
 int cgroup_attach_task(struct cgroup *, struct task_struct *);
-int ...
From: Alex Williamson
Date: Friday, August 6, 2010 - 8:09 am

This does seem to be working here, so please review and let us know if
this looks like a suitable interface.  Thanks,




--

From: Sridhar Samudrala
Date: Friday, August 6, 2010 - 9:34 am

So an unprivileged qemu cannot attach the vhost thread to its own cgroups.
I guess you are planning to make the cgroup_attach_task_all() call in
vhost_worker() to attach itself to the cgroups of qemu. The new API looks
fine, but the name is a little confusing. How about
Now that we are not operating on current, cur_cg should be renamed as


--

From: Alex Williamson
Date: Friday, August 6, 2010 - 9:38 am

Yes, exactly.




--

From: Andrew Morton
Date: Wednesday, August 25, 2010 - 2:35 pm

On Fri, 06 Aug 2010 10:38:24 -0600

So am I correct to assume that this change is now needed in 2.6.36, and
unneeded in 2.6.35?

Can it affect the userspace<->kernel API in any manner?  If so, it
should be backported into earlier kernels to reduce the number of
incompatible kernels out there.

Paul, did you have any comments?

I didn't see any update in response to the minor review comments, so...


 include/linux/cgroup.h |    1 +
 kernel/cgroup.c        |    6 +++---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff -puN include/linux/cgroup.h~cgroups-fix-api-thinko-fix include/linux/cgroup.h
--- a/include/linux/cgroup.h~cgroups-fix-api-thinko-fix
+++ a/include/linux/cgroup.h
@@ -579,6 +579,7 @@ void cgroup_iter_end(struct cgroup *cgrp
 int cgroup_scan_tasks(struct cgroup_scanner *scan);
 int cgroup_attach_task(struct cgroup *, struct task_struct *);
 int cgroup_attach_task_all(struct task_struct *from, struct task_struct *);
+
 static inline int cgroup_attach_task_current_cg(struct task_struct *tsk)
 {
 	return cgroup_attach_task_all(current, tsk);
diff -puN kernel/cgroup.c~cgroups-fix-api-thinko-fix kernel/cgroup.c
--- a/kernel/cgroup.c~cgroups-fix-api-thinko-fix
+++ a/kernel/cgroup.c
@@ -1798,13 +1798,13 @@ out:
 int cgroup_attach_task_all(struct task_struct *from, struct task_struct *tsk)
 {
 	struct cgroupfs_root *root;
-	struct cgroup *cur_cg;
 	int retval = 0;
 
 	cgroup_lock();
 	for_each_active_root(root) {
-		cur_cg = task_cgroup_from_root(from, root);
-		retval = cgroup_attach_task(cur_cg, tsk);
+		struct cgroup *from_cg = task_cgroup_from_root(from, root);
+
+		retval = cgroup_attach_task(from_cg, tsk);
 		if (retval)
 			break;
 	}
_
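
For readers following the loop in the hunk above, a small user-space
sketch of its control flow may help. It is a model under assumptions, not
the kernel function: struct task, attach_one(), attach_all(), deny_root
and NR_ROOTS are hypothetical names, and the locking is left out. As far
as the visible hunk shows, the loop stops at the first hierarchy whose
attach fails and returns that error, and nothing in the hunk undoes the
hierarchies that were already attached; the model mirrors just that.

```c
/* User-space model only: hypothetical names, no locking, no real cgroups. */
#include <assert.h>
#include <stddef.h>

#define NR_ROOTS 3	/* pretend three hierarchies are active */

struct task {
	int cgroup_id[NR_ROOTS];	/* group membership, one per hierarchy */
};

/* Hierarchy index whose attach is made to fail; -1 means none fails. */
static int deny_root = -1;

/* Stand-in for a per-hierarchy attach that may be refused. */
static int attach_one(size_t root, int cg, struct task *tsk)
{
	if ((int)root == deny_root)
		return -1;	/* placeholder for a negative errno value */
	tsk->cgroup_id[root] = cg;
	return 0;
}

/* Same shape as the loop in the hunk: walk every hierarchy, look up the
 * source task's group there, attach, and break on the first error. */
static int attach_all(const struct task *from, struct task *tsk)
{
	int retval = 0;

	assert(from != NULL && tsk != NULL);
	for (size_t root = 0; root < NR_ROOTS; root++) {
		int from_cg = from->cgroup_id[root];

		retval = attach_one(root, from_cg, tsk);
		if (retval)
			break;
	}
	return retval;
}
```

With deny_root set to 1 in this model, the task ends up moved in
hierarchy 0 only and the caller sees the error from hierarchy 1.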

--

From: Paul Menage
Date: Wednesday, August 25, 2010 - 7:08 pm

On Wed, Aug 25, 2010 at 2:35 PM, Andrew Morton

AFAICS it shouldn't affect any existing APIs, either in-kernel or to
userspace - it just makes the existing function
cgroup_attach_task_current_cg() a specialization of a more generic new

Other than the language being a bit confusing, it seems fine. I'd
probably word the patch description as:

Add cgroup_attach_task_all()

The existing cgroup_attach_task_current_cg() API is called by a thread
to attach another thread to all of its cgroups; this is unsuitable for
cases where a privileged task wants to attach itself to the cgroups
of a less privileged one, since the call must be made from the context
of the target task.

This patch adds a more generic cgroup_attach_task_all() API that
allows both the source task and to-be-moved task to be specified.
cgroup_attach_task_current_cg() becomes a specialization of the more
generic new function.

Acked-by: Paul Menage <menage@google.com>
--

From: Michael S. Tsirkin
Date: Tuesday, August 31, 2010 - 7:57 am

Yes, I think so. Unless there are objections, I intend to merge this
(with the review fixes) through net-2.6 together with a vhost-net patch

I think it doesn't affect anything except 2.6.36-rcX,
--

From: Li Zefan
Date: Tuesday, August 17, 2010 - 12:19 am

(Just came back from vacation)


Acked-by: Li Zefan <lizf@cn.fujitsu.com>


That's Ok.


a nitpick:

--
