On Sat, 2008-02-02 at 18:37 +0900, KOSAKI Motohiro wrote:

The memoryless nodes patch series changed a lot of things, so just reverting this one area [mpol_check_policy()] probably won't restore the prior behavior. A fully populated node mask is not necessarily a proper subset of node_online_map(). And contextualize_policy() also requires the mask to be a subset of mems_allowed, which also defaults to nodes with memory.

I don't know how Mel Gorman's "two zonelist" series, which is still awaiting a window into the -mm tree, affects this behavior. Those patches will certainly be affected by whatever we decide here.

I don't know the current state of Paul's rework of cpusets and mems_allowed. That probably resolves this issue, if he still plans on allowing a fully populated mask to indicate interleaving over all allowed nodes.

I have a patch that takes a different approach to "interleave=all" that doesn't solve Paul's and David's requirements. I also have patches to libnuma and numactl that work with my patches, but I saw no sense in posting them unless my kernel patches got some traction. If interested, you can find them at:

http://free.linux.hp.com/~lts/Patches/Numactl/

In addition to Andi's answer about simplicity, libnuma and numactl predate the sysfs node masks. There was no way to query what the valid set of nodes would be, but the kernel allowed a fully populated map. We broke that with the memoryless nodes rework.

Regarding the patch itself: If others have no problems with displaying a "has_high_memory" node mask for systems without HIGH_MEM configured, I can live with it.

The current upstream kernel [2.6.24] supports an MPOL_F_MEMS_ALLOWED flag to get_mempolicy() to return the nodes allowed in the caller's cpuset. My numactl patches, mentioned above, support this.

However, as Andi says, we really can't break application behavior. Not all applications that use mempolicy use the libnuma APIs.
So, a fully populated interleave node mask should be allowed and should probably mean "all allowed nodes with memory".

I think we'd still need to reduce the interleave policy mask to nodes with memory when it's installed, or find another way to skip memoryless nodes when interleaving; else we don't get even distribution of interleaved pages over the nodes that do have memory. This is one of the memoryless nodes fixes. I THINK this is one of the areas that Paul and David are investigating.

Christoph, Mel, Paul: any suggestions for a [relatively quick] fix that doesn't break the memoryless nodes work and doesn't violate cpuset constraints?
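
To make the "all allowed nodes with memory" semantics concrete, here is a rough userspace sketch, not kernel code. It assumes a hypothetical 64-bit mask type (node_bits_t) with one bit per node, and the helper names effective_interleave_mask() and next_interleave_node() are invented for illustration only. The idea is simply to intersect the requested mask with mems_allowed and with the set of nodes that have memory when the policy is installed, then round-robin over what remains.

```c
#include <stdint.h>

/* Hypothetical userspace model: one bit per node, at most 64 nodes.
 * This is an illustration, not the kernel's nodemask_t. */
typedef uint64_t node_bits_t;

/* Reduce a requested interleave mask to nodes that are both allowed
 * (the cpuset's mems_allowed) and actually have memory. A fully
 * populated request then means "all allowed nodes with memory". */
static node_bits_t effective_interleave_mask(node_bits_t requested,
                                             node_bits_t mems_allowed,
                                             node_bits_t nodes_with_memory)
{
    return requested & mems_allowed & nodes_with_memory;
}

/* Return the next node after 'prev' that is set in 'mask', wrapping
 * around at node 63. Pass prev = -1 to start the search at node 0.
 * Returns -1 if the mask is empty. */
static int next_interleave_node(node_bits_t mask, int prev)
{
    if (mask == 0)
        return -1;

    for (int i = 1; i <= 64; i++) {
        int node = (prev + i) % 64;

        if (mask & ((node_bits_t)1 << node))
            return node;
    }
    return -1;
}
```

With the mask reduced up front, a memoryless node never appears in the rotation, so pages spread evenly over the nodes that can actually satisfy the allocation. Skipping memoryless nodes at allocation time instead would be the other option mentioned above.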
