V1->V2
- Fixed a possible race in cpu_cgroup_read_stat. Thank you Paul for pointing this out.
- A few other naming changes.

This patchset is a first step towards implementing stats for cgroup subsystems. Only a few trivial stats for cpu and memory resource controller have been implemented for now. Please provide comments on the general direction and any suggestions on how you would like the cgroupstats framework to be implemented.

roadmap
--------
- implement a generic statistics framework for cgroups
- unification with taskstats/netlink interface
- add more statistics

--
regards,
Balaji Rao
Dept. of Mechanical Engineering,
National Institute of Technology Karnataka, India
--
This is sort of heading in the same way as the cgroup binary stats API that I mentioned a couple of months ago (when I proposed the "cgroup.api" file).

Since the cgroup file API encourages subsystems to export values via abstract methods such as read_s64() or read_map() rather than having them handle the file I/O themselves, this gives the basis for a binary stats API - the same methods can be used to retrieve the information in a binary form rather than from regular ASCII-based file reads, and the subsystem doesn't have to care which is being used.

I was originally thinking along the lines of having a special mode in which you could obtain a cgroupfs binary file for a cgroup directory that would report a requested set of binary stats each time it was read, but using the netlink/taskstats API might be a good approach too.

One of the important API choices would be whether the stats API was fixed in header files shared with userspace, or whether it would be possible for stats to be added and dynamically discovered/used by userspace without needing fixed header file descriptions. The difference would be a bit like the old sysctl API (where each sysctl entry had to be enumerated in a header file) versus the newer /proc/sys approach where numerical values aren't used and userspace can determine which entries are supported at runtime, and even access new previously-unknown entries.

Here's one possible way to do it:

With the taskstats interface, we could have operations to:

- describe the API exported by a given subsystem (automatically generated, based on its registered control files and their access methods)
- retrieve a specified set of stats in a binary format

So as a concrete example, with the memory, cpuacct and cpu subsystems configured, the reported API might look something like (in pseudo-code form)

    0 : memory.usage_in_bytes : u64
    1 : memory.limit_in_bytes : u64
    2 : memory.failcnt : u64
    3 : memory.stat : map
    4 : cpuacct.usage : u64
    5 : cpu.shares : u64
    ...
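To make the describe/retrieve split above a little more concrete, here is a minimal userspace-style sketch in Python. Everything in it is hypothetical: the `StatEntry` record, the `describe()`, `retrieve()` and `decode()` helpers, the little-endian packed layout and the sample values are assumptions made for illustration, not an existing cgroup or taskstats interface. Only the entry names, indices and u64 types are taken from the pseudo-code listing; `memory.stat` (a map) is left out to keep the encoding trivial.

```python
import struct
from typing import Dict, List, NamedTuple


class StatEntry(NamedTuple):
    index: int  # numeric id reported by the describe operation
    name: str   # control file name, e.g. "cpu.shares"
    kind: str   # value type; only "u64" is handled in this sketch


# Hypothetical registry. In the proposal this would be generated
# automatically from each subsystem's registered control files.
REGISTRY: List[StatEntry] = [
    StatEntry(0, "memory.usage_in_bytes", "u64"),
    StatEntry(1, "memory.limit_in_bytes", "u64"),
    StatEntry(2, "memory.failcnt", "u64"),
    StatEntry(4, "cpuacct.usage", "u64"),
    StatEntry(5, "cpu.shares", "u64"),
]

# Made-up sample values standing in for what the read methods would return.
VALUES: Dict[str, int] = {
    "memory.usage_in_bytes": 4096,
    "memory.limit_in_bytes": 1 << 30,
    "memory.failcnt": 0,
    "cpuacct.usage": 123456789,
    "cpu.shares": 1024,
}


def describe() -> List[StatEntry]:
    """Operation 1: report which stats exist, so userspace needs no fixed header."""
    return list(REGISTRY)


def retrieve(indices: List[int]) -> bytes:
    """Operation 2: return the requested stats packed as little-endian u64s."""
    by_index = {entry.index: entry for entry in REGISTRY}
    return b"".join(
        struct.pack("<Q", VALUES[by_index[i].name]) for i in indices
    )


def decode(indices: List[int], payload: bytes) -> Dict[str, int]:
    """Userspace side: map the binary payload back to names via describe()."""
    names = {entry.index: entry.name for entry in describe()}
    values = struct.unpack("<%dQ" % len(indices), payload)
    return {names[i]: v for i, v in zip(indices, values)}
```

A caller would first call `describe()` to learn the indices, then request only the ones it cares about, for example `decode([0, 5], retrieve([0, 5]))`. The point of the sketch is that a newly added stat only needs a new registry entry; the caller discovers it at runtime, much like the /proc/sys comparison above.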
Overall the idea looks good to me. We are also looking at a generic framework in cgroups that would delegate the job of handling statistics to the cgroup framework itself. This would avoid code duplication across various controllers. -- regards, Balaji Rao --
Yes, avoiding code duplication is good. One thing: when you say "statistics", do you mean all statistics (i.e. all values that can be read from control files) or specifically arrays/maps of values in specific control files? I'm using it to mean the former, but you appear to be mostly referring to stats maps such as "memory.stat" or "cpu.stat". Paul --
Sorry for the confusion. You're right. I'm speaking about the "cpu.stat" and "memory.stat" idea. What we want is a layer above the generic file presentation layer that would collect and manage the statistics the controllers provide. -- regards, Balaji Rao Dept. of Mechanical Engineering, National Institute of Technology Karnataka, India --
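As a rough illustration of what such a layer could look like, the Python sketch below keeps the counters in one shared class, so a controller only declares its stat names and bumps them. The `CgroupStats` class, its methods and the `utime`/`stime` names are assumptions made up for this example, not the interface from the patchset; the "name value" per-line output simply mirrors how a file like "memory.stat" reads.

```python
from collections import OrderedDict
from typing import Dict, Iterable


class CgroupStats:
    """Hypothetical shared stats layer: controllers register names, the layer
    stores the counters and handles presentation."""

    def __init__(self, controller: str, names: Iterable[str]) -> None:
        self.controller = controller
        # Keep registration order so the rendered file is stable.
        self._counters = OrderedDict((name, 0) for name in names)

    def add(self, name: str, delta: int = 1) -> None:
        """Called by the controller whenever an event is accounted."""
        self._counters[name] += delta

    def read_map(self) -> Dict[str, int]:
        """Map view, analogous in spirit to a read_map()-style accessor."""
        return dict(self._counters)

    def render(self) -> str:
        """ASCII view: one 'name value' pair per line."""
        return "".join(
            "%s %d\n" % (name, value) for name, value in self._counters.items()
        )


# Usage: a cpu controller with two illustrative counters.
cpu = CgroupStats("cpu", ["utime", "stime"])
cpu.add("utime", 3)
cpu.add("stime")
```

With this split, the same counters can back either the ASCII file or a binary interface, and no controller has to duplicate the storage or formatting code.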
I like the overall approach, do you have a prototype implementation? -- Warm Regards, Balbir Singh Linux Technology Center IBM, ISTL --
No, nothing yet. Well, I posted a version of the API description file a couple of months ago but people didn't seem to like that. Paul --
I like the roadmap and the patches. We need these statistics quite urgently. We have a lot of control and some statistics, but we need more statistics to make decisions (resizing, moving tasks, etc.) easier. -- Warm Regards, Balbir Singh Linux Technology Center IBM, ISTL --
