On Sun, 28 Oct 2007, Paul Jackson wrote:

If we can't identify any applications that would be broken by this, what's the harm in simply implementing Choice B and then, if we hear complaints, adding your hack to revert back to Choice A behavior based on the get_mempolicy() call that you specified is always part of libnuma?

The problem that I see with immediately offering both choices is that we don't know if anybody is actually reverting back to Choice A behavior because libnuma, by default, would use it. That's going to make it very painful to remove later because we've supported both options and have made libnuma and {get,set}_mempolicy() arguments ambiguous. We should only support both choices if they will both be used, and there's no hard evidence to suggest that at this point.

You earlier insisted on ease of documentation for the MPOL_INTERLEAVE case, and now this dual support that you're proposing is going to make the documentation very difficult to understand for anyone who simply wants to use mempolicies. Others even in this thread have had a hard enough time understanding the difference between the two choices, and you explained them very thoroughly. It's going to be much more trouble than it's worth, I predict.

And that application would need to be implemented to know the nodes that it has access to before it issues its set_mempolicy(MPOL_PREFERRED) command anyway if it truly uses Choice A behavior. So unless these tasks are looking in /proc/pid/status, parsing Mems_allowed, and then specifying one of those nodes as their preferred node, or are always guaranteed a certain set of nodes in the cpuset they are attached to so that they have such foresight of what node to prefer, Choice A can't possibly be what they want.

I appreciate that very much.

The need I was addressing with my initial patchset was that when a cpuset is expanded, any MPOL_INTERLEAVE memory policy of attached tasks automatically gets expanded as well.
This discussion has somewhat diverged from that, but I hope you still support what we earlier talked about in terms of adding a field to struct mempolicy to remember the intended nodemask the application asked to interleave over.

You don't actually need to choose between the two choices for adapting MPOL_INTERLEAVE over _all_ allowed cpuset nodes. I thought what we agreed upon and what you were going to implement was adding a nodemask_t to struct mempolicy for the intended nodemask of the memory policy and then AND it with pol->cpuset_mems_allowed. That completely satisfies my needs and my applications that want to allocate over all available nodes (by simply passing numa_all_nodes to set_mempolicy(MPOL_INTERLEAVE)). If I wanted to interleave only over a subset, the choices would matter.

David
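
The remembered-nodemask idea described above can be illustrated with a small user-space sketch. This is not kernel code: the type nodemask_sketch_t, the struct mempolicy_sketch, its intended_nodes and effective_nodes fields, and both helper functions are hypothetical names made up for illustration, and a plain unsigned long stands in for nodemask_t. Only cpuset_mems_allowed is a field name taken from the message. The sketch assumes that the effective interleave set is simply the intended mask ANDed with the currently allowed nodes, and that it is recomputed whenever the cpuset changes.

```c
/* Hypothetical stand-in for the kernel's nodemask_t: one bit per node. */
typedef unsigned long nodemask_sketch_t;

/* Hypothetical, simplified model of a memory policy (not the real struct). */
struct mempolicy_sketch {
	nodemask_sketch_t intended_nodes;      /* what the application asked for */
	nodemask_sketch_t cpuset_mems_allowed; /* nodes the cpuset allows right now */
	nodemask_sketch_t effective_nodes;     /* intended AND allowed */
};

/* Record the requested mask and derive the mask actually interleaved over. */
static inline void sketch_set_interleave(struct mempolicy_sketch *pol,
					  nodemask_sketch_t requested,
					  nodemask_sketch_t allowed)
{
	pol->intended_nodes = requested;
	pol->cpuset_mems_allowed = allowed;
	pol->effective_nodes = requested & allowed;
}

/* When the cpuset grows or shrinks, recompute from the remembered intent. */
static inline void sketch_cpuset_changed(struct mempolicy_sketch *pol,
					  nodemask_sketch_t new_allowed)
{
	pol->cpuset_mems_allowed = new_allowed;
	pol->effective_nodes = pol->intended_nodes & new_allowed;
}
```

Under these assumptions, a task that asked for all nodes while confined to nodes 0-1 would interleave over nodes 0-1, and would pick up nodes 2-3 once the cpuset is expanded to 0-3, because the original request is kept rather than overwritten. A task that asked for a subset keeps that subset, which is the case where the two choices discussed in this thread would matter.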
