> On Fri, Apr 11, 2008 at 8:18 PM, Serge E. Hallyn <serue@us.ibm.com> wrote:
> >
> > Quoting Paul Menage (menage@google.com):
> > > This is a list of some of the sub-projects that I'm planning for
> > > Control Groups, or that I know others are planning on or working on.
> > > Any comments or suggestions are welcome.
> > >
> > >
> > > 1) Stateless subsystems
> > > -----
> > >
> > > This was motivated by the recent "freezer" subsystem proposal, which
> > > included a facility for sending signals to all members of a cgroup.
> > > This wasn't specifically freezer-related, and wasn't even something
> > > that needed particular per-cgroup state - its only state is that set
> > > of processes, which is already tracked by cgroups. So it could
> > > theoretically be mounted on multiple hierarchies at once, and wouldn't
> > > need an entry in the css_set array.
> > >
> > > This would require a few internal plumbing changes in cgroups, in particular:
> > >
> > > - hashing css_set objects based on their cgroups rather than their css pointers
> > > - allowing stateless subsystems to be in multiple hierarchies
> > > - changing the way hierarchy ids are calculated - simply ORing
> > > together the subsystem bits would no longer work since that could
> > > result in duplicates
> > >
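A rough user-space model of the duplicate-id problem, in Python rather than kernel C. It assumes a hierarchy id is formed by ORing one bit per bound subsystem; the bit values and the "signal" subsystem name are made up for illustration and are not the kernel's actual ids.

```python
# Hypothetical bit positions; not the kernel's real subsystem ids.
SUBSYS_BITS = {"cpu": 1 << 0, "memory": 1 << 1, "signal": 1 << 2}


def or_based_id(subsystems):
    """OR together one bit per subsystem bound to a hierarchy."""
    mask = 0
    for name in subsystems:
        mask |= SUBSYS_BITS[name]
    return mask


def find_id_collisions(hierarchies):
    """Return groups of hierarchy names whose OR-based ids are equal."""
    by_id = {}
    for name, subsystems in hierarchies.items():
        by_id.setdefault(or_based_id(subsystems), []).append(name)
    return sorted(names for names in by_id.values() if len(names) > 1)


# "signal" stands in for a stateless subsystem mounted on two hierarchies.
hierarchies = {
    "h1": ["signal"],
    "h2": ["signal"],
    "h3": ["cpu", "memory"],
}
collisions = find_id_collisions(hierarchies)
```

Here h1 and h2 get the same id, which cannot happen today because a subsystem can be bound to only one hierarchy at a time.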
> > > 2) More flexible binding/unbinding/rebinding
> > > -----
> > >
> > > Currently you can only add/remove subsystems to a hierarchy when it
> > > has just a single (root) cgroup. This is a bit inflexible, so I'm
> > > planning to support:
> > >
> > > - adding a subsystem to an existing hierarchy by automatically
> > > creating a subsys state object for the new subsystem for each existing
> > > cgroup in the hierarchy and doing the appropriate
> > > can_attach()/attach_tasks() callbacks for all tasks in the system
> > >
> > > - removing a subsystem from an existing hierarchy by moving all tasks
> > > to that subsystem's root cgroup and destroying the child subsystem
> > > state objects
> > >
> > > - merging two existing hierarchies that have identical cgroup trees
> > >
> > > - (maybe) splitting one hierarchy into two separate hierarchies
> > >
> > > Whether all these operations should be forced through the mount()
> > > system call, or whether they should be done via operations on cgroup
> > > control files, is something I've not figured out yet.
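The add and remove cases above could be modelled roughly as follows. This is only a user-space sketch in Python: the class, the function names and the shape of the state objects are invented here, and the two callbacks merely stand in for the can_attach()/attach_tasks() hooks mentioned above.

```python
class Cgroup:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.tasks = set()
        self.state = {}  # subsystem name -> per-cgroup state object
        if parent is not None:
            parent.children.append(self)


def walk(cgroup):
    """Yield a cgroup and all of its descendants, parents first."""
    yield cgroup
    for child in cgroup.children:
        yield from walk(child)


def bind_subsystem(root, subsys, can_attach, attach):
    """Add `subsys` to a populated hierarchy.

    Nothing is changed if can_attach() vetoes any task.
    """
    groups = list(walk(root))
    for cg in groups:
        for task in cg.tasks:
            if not can_attach(cg, task):
                return False
    for cg in groups:
        cg.state[subsys] = {"owner": cg.name}  # new state object per cgroup
        for task in sorted(cg.tasks):
            attach(cg, task)
    return True


def unbind_subsystem(root, subsys):
    """Remove `subsys`: child state is destroyed, the root state survives.

    Returns the surviving root state and the tasks that now all share it.
    """
    root_state = root.state.pop(subsys)
    moved = set()
    for cg in walk(root):
        moved |= cg.tasks
        cg.state.pop(subsys, None)
    return root_state, moved


root = Cgroup("/")
a = Cgroup("a", root)
b = Cgroup("b", a)
root.tasks, a.tasks, b.tasks = {1}, {2, 3}, {4}

attached = []
ok = bind_subsystem(root, "cpu",
                    lambda cg, task: True,
                    lambda cg, task: attached.append((cg.name, task)))
```

Checking every task before creating any state keeps a late veto from leaving the hierarchy half-bound; whether the real implementation would need that ordering is a guess.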
> >
> > I'm tempted to ask what the use case is for this (I assume you have one;
> > you don't generally introduce features for no good reason), but it
> > doesn't sound like this would have any performance effect on the general
> > case, so it sounds good.
> >
> > I'd stick with mount semantics. Just
> >     mount -t cgroup -o remount,devices,cpu none /devwh
> > should handle all cases, no?
> >
> >
> >
> > > 3) Subsystem dependencies
> > > -----
> > >
> > > This would be a fairly simple change, essentially allowing one
> > > subsystem to require that it only be mounted on a hierarchy when some
> > > other subsystem was also present. The implementation would probably be
> > > a callback that allows a subsystem to confirm whether it's prepared to
> > > be included in a proposed hierarchy containing a specified subsystem
> > > bitmask; it would be able to prevent the hierarchy from being created
> > > by giving an error return. An example of a use for this would be a
> > > swap subsystem that is mostly independent of the memory controller,
> > > but uses the page-ownership tracking of the memory controller to
> > > determine which cgroup to charge swap pages to. Hence it would require
> > > that it only be mounted on a hierarchy that also included a memory
> > > controller. The memory controller would make no such requirement by
> > > itself, so could be used on its own without the swap controller.
> > >
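The callback described above might look something like this, modelled in Python rather than kernel C. The bit assignments, the function names and the choice of EINVAL as the error are all assumptions made for the sketch.

```python
import errno

# Hypothetical bit assignments, for illustration only.
MEMORY = 1 << 0
SWAP = 1 << 1
CPU = 1 << 2


def memory_can_bind(proposed_mask):
    """The memory controller makes no requirement of its own."""
    return 0


def swap_can_bind(proposed_mask):
    """Swap relies on the memory controller's page-ownership tracking."""
    if not proposed_mask & MEMORY:
        return -errno.EINVAL
    return 0


def cpu_can_bind(proposed_mask):
    return 0


CAN_BIND = {MEMORY: memory_can_bind, SWAP: swap_can_bind, CPU: cpu_can_bind}


def check_hierarchy(proposed_mask):
    """Ask each subsystem in the mask to approve it; the first error wins."""
    for bit, callback in CAN_BIND.items():
        if proposed_mask & bit:
            err = callback(proposed_mask)
            if err:
                return err
    return 0
```

With this, check_hierarchy(MEMORY) and check_hierarchy(MEMORY | SWAP) succeed, while check_hierarchy(SWAP) returns an error and the hierarchy would not be created.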
> > >
> > > 4) Subsystem Inheritance
> > > ------
> > >
> > > This is an idea that I've been kicking around for a while trying to
> > > figure out whether its usefulness is worth the in-kernel complexity,
> > > versus doing it in userspace. It comes from the idea that although
> > > cgroups supports multiple hierarchies so that different subsystems can
> > > see different task groupings, one of the more common uses of this is
> > > (I believe) to support a setup where say we have separate groups A, B
> > > and C for one resource X, but for resource Y we want a group
> > > consisting of A+B+C. E.g. we want individual CPU limits for A, B and
> > > C, but for disk I/O we want them all to share a common limit. This can
> > > be done from userspace by mounting two hierarchies, one for CPU and
> > > one for disk I/O, and creating appropriate groupings, but it could
> > > also be done in the kernel as follows:
> > >
> > > - each subsystem "foo" would have a "foo.inherit" file provided by
> > > (and handled by) cgroups in each group directory
> > >
> > > - setting the foo.inherit flag (i.e. writing 1 to it) would cause
> > > tasks in that cgroup to share the "foo" subsystem state with the
> > > parent cgroup
> > >
> > > - from the subsystem's point of view, it would only need to worry
> > > about its own foo_cgroup objects and which task was associated with
> > > each object; the subsystem wouldn't need to care about which tasks
> > > were part of each cgroup, and which cgroups were sharing state; that
> > > would all be taken care of by the cgroup framework
> > >
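One way to picture the foo.inherit flag is as a per-subsystem "look at my parent instead" bit that the framework resolves before the subsystem sees anything. The Python below is a user-space sketch under that assumption; the class and method names are invented here and nothing in it is existing cgroup code.

```python
class Cgroup:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.inherit = {}    # subsystem name -> bool, models "foo.inherit"
        self.own_state = {}  # subsystem name -> this cgroup's own state object

    def set_state(self, subsys, state):
        self.own_state[subsys] = state

    def set_inherit(self, subsys, flag):
        """Model writing 0 or 1 to the subsystem's inherit file."""
        if flag and self.parent is None:
            raise ValueError("root cgroup has no parent to inherit from")
        self.inherit[subsys] = bool(flag)

    def effective_state(self, subsys):
        """Walk up while the inherit flag is set, then return that state."""
        cg = self
        while cg.inherit.get(subsys, False):
            cg = cg.parent
        return cg.own_state[subsys]


# Groups A, B and C get individual CPU limits but share one disk I/O limit.
parent = Cgroup("P")
groups = {name: Cgroup(name, parent) for name in "ABC"}
parent.set_state("io", {"limit": "shared"})
for cg in groups.values():
    cg.set_state("cpu", {"limit": cg.name})
    cg.set_inherit("io", 1)
```

The subsystem only ever sees the object returned by effective_state(), so it does not need to know which cgroups are sharing it.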
> > > I've mentioned this a couple of times on the containers list as part
> > > of other random discussions; at one point Serge Hallyn expressed some
> > > interest but there's not been much noise about it either way. I
> > > figured I'd include it on this list anyway to see what people think of
> > > it.
> >
> > I guess I'm hoping that if libcg goes well then a userspace daemon can
> > do all we need. Of course the use case I envision is having a container
> > which is locked to some amount of ram, wherein the container admin wants
> > to lock some daemon to a subset of that ram. If the host admin lets the
> > container admin edit a config file (or talk to a daemon through some
> > sock designated for the container) that will only create a child of the
> > container's cgroup, that's probably great.
> >
>
> I thought of doing something like this in libcg (having a daemon and a
> client socket interface), but dropped the idea later. When all
> controllers support multi-levels well, the plan is to create a
> sub-directory in the cgroup hierarchy and give subtree ownership to
> the application administrator.
>
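As a sketch of what handing over a subtree could amount to from userspace: create the child directory and change its owner. This is not libcg code. The function name is made up, a temporary directory stands in for a real cgroup mount, and chowning the control files inside the new directory is an assumption about what delegation would need.

```python
import os
import tempfile


def delegate_subtree(hierarchy_root, name, uid, gid):
    """Create a child cgroup directory and hand it to (uid, gid)."""
    path = os.path.join(hierarchy_root, name)
    os.mkdir(path)
    os.chown(path, uid, gid)
    # On a real cgroup mount the kernel populates control files in the
    # new directory; hand those over as well (assumption, see above).
    for entry in os.listdir(path):
        os.chown(os.path.join(path, entry), uid, gid)
    return path


# Stand-in for a mounted hierarchy; the current user plays the app admin.
with tempfile.TemporaryDirectory() as fake_mount:
    delegated = delegate_subtree(fake_mount, "app", os.getuid(), os.getgid())
    owner_uid = os.stat(delegated).st_uid
```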
> > So I'm basically being quiet until I see whether libcg will suffice.
> >
>
> If you do have any specific requirements, we can cater to them right
> now. Please do let us know. The biggest challenge right now is getting
> a stable API.