Cc: Lee Schermerhorn <Lee.Schermerhorn@...>, Andrew Morton <akpm@...>, Christoph Lameter <clameter@...>, Andi Kleen <ak@...>, <linux-kernel@...>, <mel@...>
Ok, so this truly is a new feature that isn't addressed either by the
current implementation or my patchset. Fair enough.
You're specifically trying to avoid having the application know about its
cpuset placement with regard to mems at the time it sets up its mempolicy,
right? Otherwise it could already set up this relative nodemask by
selecting node 2, from your example above, in its mems_allowed and it
would always be remapped appropriately.
So, for example, if the task is bound by mems 1-3, and it asks for
MPOL_INTERLEAVE over 2-4, then initially the mempolicy is only effected
over node 3 and if it's later expanded to mems 1-8, then the mempolicy is
effected over nodes 3-5, right?
And if the mems change to 3-8, the mempolicy is remapped to 5-7 even
though 3-5 (which it already was interleaving over) is still accessible?
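To make sure we're reading the remap the same way, here's a minimal
user-space sketch of the semantics I'm assuming in the questions above:
bit n of the relative nodemask selects the n-th node (counting from 0) of
the task's current mems, and relative bits past the end of the mems are
dropped. remap_relative() and node_range() are hypothetical names for
illustration only, not existing kernel functions, and the 64-bit mask is
just a stand-in for nodemask_t.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel API: map a relative nodemask onto the
 * set of allowed nodes.  Bit n of 'relative' selects the n-th set bit
 * (counting from 0) of 'allowed'.  Relative bits beyond the number of
 * allowed nodes select nothing.
 */
static uint64_t remap_relative(uint64_t relative, uint64_t allowed)
{
	uint64_t result = 0;
	int ordinal = 0;	/* index of 'node' among the allowed nodes */

	for (int node = 0; node < 64; node++) {
		if (!(allowed & (UINT64_C(1) << node)))
			continue;
		if (relative & (UINT64_C(1) << ordinal))
			result |= UINT64_C(1) << node;
		ordinal++;
	}
	return result;
}

/* Build a mask with nodes lo..hi set (inclusive), 0 <= lo <= hi < 64. */
static uint64_t node_range(int lo, int hi)
{
	uint64_t mask = 0;

	for (int n = lo; n <= hi; n++)
		mask |= UINT64_C(1) << n;
	return mask;
}
```

Under that assumption, relative 2-4 gives node 3 with mems 1-3, nodes 3-5
with mems 1-8, and nodes 5-7 with mems 3-8, which is the behavior I'm
asking about.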
Does MPOL_INTERLEAVE | MPOL_F_STATIC_NODES | MPOL_F_PAULS_NEW_FLAG make
any logical sense? If it does, I think we're going to be writing some
very complex remap code in our future.
I think it will work very nicely and the benefit is immediately obvious
for systems that have large nodemasks.
Well, I didn't cave on anything, I said that we can reconsider it in the
hopes that other people would add their feedback. I think continuing to
discuss this matter with yourself and Lee (and whoever else is
interested) will lead us to the correct solution. Since this is an
internal implementation detail, I think it's important to hear other
people's opinions since we're the ones who will be hacking the code in the
future so it's really our opinions that matter.
Yes.
And it does in my latest series, which I sent to you last night.
David