Hacking and requiring an updated version of libnuma to allow empty
nodemasks to be passed is a poor solution; if mempolicies are supposed to
be independent from cpusets, then what semantics does an empty nodemask
actually imply when using MPOL_INTERLEAVE? To me, it means the entire
set_mempolicy() should be a no-op, and that's exactly how mainline
currently treats it _as_well_ as libnuma. So justifying this change in
the man page is respectable, but passing an empty nodemask just doesn't
make sense.

Is passing an empty nodemask with MPOL_INTERLEAVE to set_mempolicy()
really the only reasonable way of specifying that you want, at all times,
to interleave over all available nodes? I doubt it.

I personally prefer an approach where cpusets take responsibility for
determining how policies change (they use set_mempolicy() anyway to effect
their mems boundaries) because it's the cpuset that has changed the
available nodemask out from beneath the application. So instead of trying
to create a solution where cpusets impact mempolicies and mempolicies
impact cpusets, the dependency should run in a single direction only.
Cpusets change the set of available nodes and should update the attached
tasks' mempolicies at the same time. That's the same as saying that
cpusets should be built on top of mempolicies, which they are, and
shouldn't have any reverse dependency.
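
To make the one-directional idea concrete, here is a minimal userspace
sketch. It is only a model, not kernel code: nodemask_bits, struct
task_policy and cpuset_rebind_policy() are hypothetical names invented for
illustration, and the fallback rule (interleave over all newly allowed
nodes when none of the old ones survive) is an assumption, not necessarily
what mainline does.

```c
#include <assert.h>

/* Hypothetical model: one bit per NUMA node; not the kernel's nodemask_t. */
typedef unsigned long nodemask_bits;

/* Hypothetical stand-in for a task's MPOL_INTERLEAVE policy. */
struct task_policy {
	nodemask_bits interleave_nodes;
};

/*
 * Called by the (modelled) cpuset whenever its mems change. The cpuset
 * pushes the new boundary down into the policy; the policy never reaches
 * back up into the cpuset, so the dependency runs in one direction only.
 */
static void cpuset_rebind_policy(struct task_policy *pol,
				 nodemask_bits new_mems)
{
	nodemask_bits kept;

	/* Assumed precondition: the cpuset always allows at least one node. */
	assert(new_mems != 0);

	/* Keep the nodes the task asked for that are still allowed. */
	kept = pol->interleave_nodes & new_mems;

	/* Assumption: if none survive, interleave over all allowed nodes. */
	pol->interleave_nodes = kept ? kept : new_mems;
}
```

With a policy over nodes 0-3 (0x0f) and the cpuset shrunk to nodes 1-2
(0x06), the policy becomes 0x06; if the cpuset is then moved to nodes 4-5
(0x30), nothing overlaps and the policy falls back to 0x30.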

Completely irrelevant; I care about the interaction between cpusets and
mempolicies in mainline Linux.

David