The cgroup_attach_task_current_cg API that we have upstream is backwards: we really need an API to attach to the cgroups from another process A to the current one.

In our case (vhost), a privileged user wants to attach its task to cgroups from a less privileged one; the API makes us run it in the other task's context, and this fails.

So let's make the API generic and just pass in 'from' and 'to' tasks.

Add an inline wrapper for cgroup_attach_task_current_cg to avoid breaking bisect.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---

Paul, Li, Sridhar, could you please review the following patch?

I only compile-tested it due to travel, but it looks straightforward to me. Alex Williamson volunteered to test and report the results. Sending out now for review as I might be offline for a bit. Will only try to merge when done, obviously.

If OK, I would like to merge this through -net tree, together with the patch fixing vhost-net. Let me know if that sounds ok.

Thanks!

This patch is on top of net-next; it is needed to fix a vhost-net regression in net-next, where a non-privileged process can't enable the device anymore:

when qemu uses vhost, inside the ioctl call it creates a thread, and tries to add this thread to the groups of current, and it fails. But we control the thread, so to solve the problem, we really should tell it 'connect to our cgroups'.

What this patch does is add an API for that.

 include/linux/cgroup.h | 11 ++++++++++-
 kernel/cgroup.c | 9 +++++----
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 43b2072..b38ec60 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -525,7 +525,11 @@ struct task_struct *cgroup_iter_next(struct cgroup *cgrp,
 void cgroup_iter_end(struct cgroup *cgrp, struct cgroup_iter *it);
 int cgroup_scan_tasks(struct cgroup_scanner *scan);
 int cgroup_attach_task(struct cgroup *, struct task_struct *);
-int ...
This does seem to be working here, so please review and let us know if this looks like a suitable interface. Thanks, --
So an unprivileged qemu cannot attach the vhost thread to its own cgroups. I guess you are planning to make the cgroup_attach_task_all() call in vhost_worker() to attach itself to the cgroups of qemu.

The new API looks fine, but the name is a little confusing. How about

Now that we are not operating on current, cur_cg should be renamed as
--
Yes, exactly. --
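The direction of the call discussed above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct model_task, model_current, and every model_* function are hypothetical stand-ins invented for illustration, not the kernel's task_struct or cgroup code, and the model ignores locking and permission checks entirely.

```c
/* Userspace model only: these types and functions are hypothetical
 * stand-ins, not the kernel's struct task_struct or cgroup code. */
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_ROOTS 2	/* assumed number of active hierarchies */

struct model_task {
	const char *comm;
	int cgroup_id[MODEL_NR_ROOTS];	/* one cgroup per hierarchy */
};

/* Stand-in for the kernel's "current" task pointer. */
static struct model_task *model_current;

/* Generic form: move 'tsk' into every cgroup that 'from' belongs to. */
static int model_attach_task_all(const struct model_task *from,
				 struct model_task *tsk)
{
	int i;

	assert(from != NULL && tsk != NULL);
	for (i = 0; i < MODEL_NR_ROOTS; i++)
		tsk->cgroup_id[i] = from->cgroup_id[i];
	return 0;
}

/* Old form: the source is always the caller, so the call has to run
 * in the context of the task whose cgroups are being copied. */
static int model_attach_task_current_cg(struct model_task *tsk)
{
	return model_attach_task_all(model_current, tsk);
}

/* What a worker thread could do instead: run as itself and join the
 * cgroups of its owner. */
static int model_worker_join_owner(const struct model_task *owner)
{
	return model_attach_task_all(owner, model_current);
}
```

In this model the worker is the caller and the target at the same time, which the old wrapper cannot express because its source is fixed to the calling task.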
On Fri, 06 Aug 2010 10:38:24 -0600
So am I correct to assume that this change is now needed in 2.6.36, and
unneeded in 2.6.35?
Can it affect the userspace<->kernel API in any manner? If so, it
should be backported into earlier kernels to reduce the number of
incompatible kernels out there.
Paul, did you have any comments?
I didn't see any update in response to the minor review comments, so...
include/linux/cgroup.h | 1 +
kernel/cgroup.c | 6 +++---
2 files changed, 4 insertions(+), 3 deletions(-)
diff -puN include/linux/cgroup.h~cgroups-fix-api-thinko-fix include/linux/cgroup.h
--- a/include/linux/cgroup.h~cgroups-fix-api-thinko-fix
+++ a/include/linux/cgroup.h
@@ -579,6 +579,7 @@ void cgroup_iter_end(struct cgroup *cgrp
 int cgroup_scan_tasks(struct cgroup_scanner *scan);
 int cgroup_attach_task(struct cgroup *, struct task_struct *);
 int cgroup_attach_task_all(struct task_struct *from, struct task_struct *);
+
 static inline int cgroup_attach_task_current_cg(struct task_struct *tsk)
 {
 	return cgroup_attach_task_all(current, tsk);
diff -puN kernel/cgroup.c~cgroups-fix-api-thinko-fix kernel/cgroup.c
--- a/kernel/cgroup.c~cgroups-fix-api-thinko-fix
+++ a/kernel/cgroup.c
@@ -1798,13 +1798,13 @@ out:
 int cgroup_attach_task_all(struct task_struct *from, struct task_struct *tsk)
 {
 	struct cgroupfs_root *root;
-	struct cgroup *cur_cg;
 	int retval = 0;
 
 	cgroup_lock();
 	for_each_active_root(root) {
-		cur_cg = task_cgroup_from_root(from, root);
-		retval = cgroup_attach_task(cur_cg, tsk);
+		struct cgroup *from_cg = task_cgroup_from_root(from, root);
+
+		retval = cgroup_attach_task(from_cg, tsk);
 		if (retval)
 			break;
 	}
_
--
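The loop in the hunk above stops at the first hierarchy whose attach fails and hands that error back. A minimal userspace sketch of that control flow follows, assuming a fixed number of hierarchies and an artificial failure switch; model_task, model_fail_root, and the model_* functions are hypothetical names made up for this illustration and do not exist in the kernel.

```c
/* Userspace model only: hypothetical names, not kernel code. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_NR_ROOTS 3	/* assumed number of active hierarchies */

struct model_task {
	int cgroup_id[MODEL_NR_ROOTS];	/* one cgroup per hierarchy */
};

/* Hierarchy index that should refuse the attach, or -1 for none. */
static int model_fail_root = -1;

/* Hypothetical per-hierarchy attach with an injectable failure. */
static int model_attach_task(int root, int cgroup_id, struct model_task *tsk)
{
	if (root == model_fail_root)
		return -EACCES;
	tsk->cgroup_id[root] = cgroup_id;
	return 0;
}

/* Same shape as the patched loop: look up the source cgroup for each
 * hierarchy, attach, and stop at the first non-zero return value. */
static int model_attach_task_all(const struct model_task *from,
				 struct model_task *tsk)
{
	int retval = 0;
	int root;

	assert(from != NULL && tsk != NULL);
	for (root = 0; root < MODEL_NR_ROOTS; root++) {
		int from_cg = from->cgroup_id[root];

		retval = model_attach_task(root, from_cg, tsk);
		if (retval)
			break;	/* first failure ends the walk; no rollback */
	}
	return retval;
}
```

In this model, a failure on the second hierarchy leaves the task attached in the first one and untouched in the rest, so a caller that sees a non-zero return should not assume the task is still where it started.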
On Wed, Aug 25, 2010 at 2:35 PM, Andrew Morton wrote:

AFAICS it shouldn't affect any existing APIs, either in-kernel or to userspace - it just makes the existing function cgroup_attach_task_current_cg() a specialization of a more generic new

Other than the language being a bit confusing, it seems fine. I'd probably word the patch description as:

Add cgroup_attach_task_all()

The existing cgroup_attach_task_current_cg() API is called by a thread to attach another thread to all of its cgroups; this is unsuitable for cases where a privileged task wants to attach itself to the cgroups of a less privileged one, since the call must be made from the context of the target task.

This patch adds a more generic cgroup_attach_task_all() API that allows both the source task and to-be-moved task to be specified. cgroup_attach_task_current_cg() becomes a specialization of the more generic new function.

Acked-by: Paul Menage <menage@google.com>
--
Yes, I think so. Unless there are objections, I intend to merge this (with the review fixes) through net-2.6 together with a vhost-net patch

I think it doesn't affect anything except 2.6.36-rcX,
--
(Just came back from vacation)

Acked-by: Li Zefan <lizf@cn.fujitsu.com>

That's OK.

A nitpick:
--
