On Fri, 5 Oct 2007, Balbir Singh wrote:

It would be wrong to ignore the unuse_pte() case: what it's intending to do is correct, it's just being prevented by the swapcache issue from doing what it intends at present. (Though I'm not thrilled with the idea of it causing an admin's swapoff to fail because of a cgroup reaching mem limit there, I do agree with your earlier argument that that's the right thing to happen, and it's up to the admin to fix things up - my original objection came from not realizing that normally the cgroup will reclaim from itself to free its mem. Hmm, would the charge fail or the mm get OOM'ed?)

Ignoring add_to/remove_from swap cache is what I've tried before, and again today. It's not enough: if you try running a memhog (something that allocates and touches more memory than the cgroup is allowed, relying on pushing out to swap to complete), then that works well with the present accounting in add_to/remove_from swap cache, but it OOMs once I remove the memcontrol mods from mm/swap_state.c. I keep going back to investigate why, keep on thinking I understand it, then later realize I don't. Please give it a try, I hope you've got better mental models than I have.

And I don't think it will be enough to handle shmem/tmpfs either; but won't worry about that until we've properly understood why exempting swapcache leads to those OOMs, and fixed that up.

It checks MEM_CGROUP_TYPE_ALL there, yes; but I can't find anything checking for either MEM_CGROUP_TYPE_MAPPED or MEM_CGROUP_TYPE_CACHED. (Or is it hidden in one of those preprocessor ## things which frustrate both my greps and me!?) I sent a patch to linux-mm last night, to remove that confusion.

Those indeed are strange behaviours (if the swapoff really has succeeded, rather than lying); I've not seen such and don't have an explanation. tmpfs doesn't add any weirdness there: when there's no swap, there can be no swap cache. Or is the swapoff still in progress? While it's busy, we keep /proc/meminfo looking sensible, but <Alt><SysRq>m can show negative free swap (IIRC).

I'll be interested to hear what your investigation shows.

Hugh
