Re: [patch 3/4] mempolicy: add MPOL_F_STATIC_NODES flag

To: David Rientjes <rientjes@...>
Cc: <Lee.Schermerhorn@...>, <akpm@...>, <clameter@...>, <ak@...>, <linux-kernel@...>, <mel@...>
Date: Wednesday, February 13, 2008 - 1:04 pm

David, responding to pj, writes:

No -- MPOL_F_STATIC_NODES does not handle my second case.  Notice the
phrase 'cpuset relative.'

In my second case, nodes are numbered relative to the cpuset.  If you
say "node 2" then you mean whatever is the third (since numbering is
zero based) node in your current cpuset.

This cpuset relative mode that I'm proposing is an entirely different
mode of numbering nodes than anything we've seen so far (except for our
side discussions with just yourself, Lee and Christoph, last December.)

In this mode, "node 2" doesn't mean what the system calls "node 2"; it
means the third node in whatever is one's current cpuset placement (if
your cpuset even has that many nodes), and mempolicies using this mode
are automatically remapped by the kernel, anytime the cpuset placement
changes.
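
To make the numbering concrete, here is a minimal sketch of the mapping
I have in mind.  It is not kernel code: it uses a plain 64-bit mask in
place of nodemask_t, the name relative_to_physical() is made up for
this example, and dropping relative nodes that the cpuset is too small
to hold is just one possible choice.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel API: bit n of `relative` selects the
 * n-th node (counting from zero) that is set in `allowed`.  Relative bits
 * beyond the number of nodes in `allowed` are dropped here; that choice is
 * an assumption of this sketch.
 */
uint64_t relative_to_physical(uint64_t relative, uint64_t allowed)
{
	uint64_t physical = 0;
	int ordinal = 0;	/* position of the next allowed node in the cpuset */

	for (int node = 0; node < 64; node++) {
		if (!(allowed & (UINT64_C(1) << node)))
			continue;
		if (relative & (UINT64_C(1) << ordinal))
			physical |= UINT64_C(1) << node;
		ordinal++;
	}
	return physical;
}

/* Usage checks: "node 2" follows the cpuset wherever it is placed. */
void check_relative_to_physical(void)
{
	/* cpuset on physical nodes 4-7: relative node 2 is physical node 6 */
	assert(relative_to_physical(0x4, 0xF0) == 0x40);
	/* cpuset moved to physical nodes 8-11: relative node 2 is now node 10 */
	assert(relative_to_physical(0x4, 0xF00) == 0x400);
}
```

The application only ever names relative node 2; which physical node
that lands on is decided by whoever holds the current cpuset placement.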

This second, cpuset relative, mode is required:

1) to provide a race-free means for an application to place its memory
   when the application considers all physical nodes essentially
   equivalent, and just wants to lay things out relative to whatever
   cpuset it happens to be running in, and

2) to provide a practical means, without the need for constantly
   reprobing one's current cpuset placement, for an application to
   specify cpuset-relative placement to be applied whenever the
   application is placed in a larger cpuset, even if the application
   is presently in a smaller cpuset.

Without it, the application has to first query its current physical
cpuset placement, then translate its desired relative placement into
terms of the physical nodes it currently finds itself on, and then issue
various mbind or set_mempolicy calls using those physical node numbers,
praying that it wasn't migrated to other physical nodes in the meantime,
which would have invalidated the placement assumptions behind its
relative-to-physical node number calculations.
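
The window for that race can be shown with a small user-space
simulation.  This is only a sketch: the masks are plain 64-bit values
standing in for cpuset placements, no real mbind or set_mempolicy call
is made, and nth_allowed_node() and placement_is_stale() are
hypothetical names invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: physical number of the n-th allowed node, or -1 if none. */
int nth_allowed_node(uint64_t allowed, int n)
{
	for (int node = 0; node < 64; node++) {
		if (!(allowed & (UINT64_C(1) << node)))
			continue;
		if (n-- == 0)
			return node;
	}
	return -1;
}

/* Hypothetical: nonzero if `node` is not inside `allowed` any more. */
int placement_is_stale(int node, uint64_t allowed)
{
	return node < 0 || !(allowed & (UINT64_C(1) << node));
}

void check_stale_placement(void)
{
	uint64_t snapshot = 0xF0;	/* step 1: query placement, nodes 4-7 */
	int node = nth_allowed_node(snapshot, 2);	/* step 2: translate */

	assert(node == 6);
	/* the task is migrated to nodes 8-11 before step 3 (mbind) runs */
	assert(placement_is_stale(node, 0xF00));
}
```

Nothing the application can do between steps 1 and 3 closes that
window, since the migration is driven from outside the task.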

And without it, if an application is currently placed in a smaller cpuset
than it would prefer, it cannot specify how it would want its mempolicies
laid out should it ever subsequently be moved to a larger cpuset.  This leaves such
an application with little option other than to constantly reprobe its
cpuset placement, in order to know if and when it needs to reissue its
mbind and set_mempolicy calls because it gained access to a larger cpuset.
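
A rough sketch of what avoids the reprobing: keep the application's
relative request untouched and recompute the physical nodes from it
each time the placement changes.  The struct, its field names and
rel_policy_rebind() are all hypothetical, and 64-bit masks again stand
in for nodemasks.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical policy record: field names are illustrative only. */
struct rel_policy {
	uint64_t user_relative;	/* what the application asked for, never altered */
	uint64_t effective;	/* physical nodes currently in use */
};

/* Recompute the physical nodes from the saved relative request. */
void rel_policy_rebind(struct rel_policy *pol, uint64_t allowed)
{
	int ordinal = 0;	/* position of the next allowed node in the cpuset */

	pol->effective = 0;
	for (int node = 0; node < 64; node++) {
		if (!(allowed & (UINT64_C(1) << node)))
			continue;
		if (pol->user_relative & (UINT64_C(1) << ordinal))
			pol->effective |= UINT64_C(1) << node;
		ordinal++;
	}
}

void check_rel_policy_growth(void)
{
	/* application asks for relative nodes 0-3 while confined to 2 nodes */
	struct rel_policy pol = { .user_relative = 0xF, .effective = 0 };

	rel_policy_rebind(&pol, 0x30);	/* cpuset = physical nodes 4-5 */
	assert(pol.effective == 0x30);
	rel_policy_rebind(&pol, 0xF00);	/* cpuset grows to nodes 8-11 */
	assert(pol.effective == 0xF00);
}
```

Because the original request survives the small placement, the larger
cpuset is used in full as soon as it becomes available, with no further
calls from the application.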

I agree, David, that this present MPOL_F_STATIC_NODES patch handles the
case of a growing cpuset (or hotplug added nodes) for the static mapped
case (node "2" means physical system node "2", always.)  But this
present patch, by design, does not address the case of a growing cpuset
for the case where an application actually wants its mempolicies remapped.

My original code remapping mempolicies when cpuset placement changes,
that has been in the kernel for a couple of years now, was -supposed-
to handle this relative case.  But it is flawed, as it fails to meet
the requirements I list above.  We can't just change the way that mode
works now, for compatibility reasons.  So we need to add a new cpuset
relative mode, just as you're adding a STATIC mode, that addresses the
above requirements for a cpuset relative, remapped, numbering of
mempolicy nodes.



Ok - I tried to elaborate, above.  You (David), Lee and Christoph will
perhaps recognize this elaboration, as it essentially repeats things
I said in our earlier discussion in December and early January.

Yes - hotplug presents the same problems as cpusets growing larger.

If this makes sense to you, David, and you'd like to include this
second, cpuset relative mode, in your patchset, that would be excellent.

Given that I have not been very good at explaining this second mode
you might choose not to do that; in that case I'll have to follow up,
after your second patch shapes up, with a patch of my own, adding this
cpuset relative mode.



I'd suggest we let future expansions deal with their own needs.  We
don't usually pad internal (not externally visible) data structures
in anticipation that someone else might need the space in the future.

At least earlier, Andi Kleen, when he was the primary author and sole
active maintainer of this mempolicy code, was always keen to avoid
expanding the size of 'struct mempolicy' by another nodemask.

I have not done the calculations myself to know how vital it is to
keep the size of struct mempolicy to a minimum.  It certainly seems
worth a bit of effort, however, if adding this union of these two
nodemasks doesn't complicate the code too horribly much.
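
For a feel of what is at stake, here is a sketch comparing the two
layouts.  The 1024-node limit, the type sketch_nodemask_t and the
field name remap_base are assumptions made for the example; only
user_nodemask is a name taken from the patch under discussion, and the
real struct mempolicy has other members not shown here.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_NODES 1024	/* assumed node limit, for illustration */
#define SKETCH_LONG_BITS (8 * sizeof(unsigned long))

typedef struct {
	unsigned long bits[SKETCH_MAX_NODES / SKETCH_LONG_BITS];
} sketch_nodemask_t;

/* Two separate masks: every policy pays for both. */
struct policy_two_masks {
	int mode;
	sketch_nodemask_t remap_base;	/* hypothetical field */
	sketch_nodemask_t user_nodemask;
};

/* One union: only valid if a policy never needs both masks at once. */
struct policy_union_mask {
	int mode;
	union {
		sketch_nodemask_t remap_base;	/* hypothetical field */
		sketch_nodemask_t user_nodemask;
	} w;
};

void check_policy_sizes(void)
{
	/* the union saves exactly one nodemask per policy */
	assert(sizeof(struct policy_two_masks) -
	       sizeof(struct policy_union_mask) == sizeof(sketch_nodemask_t));
}
```

The saving is one nodemask per policy, paid for by code that has to
know which member of the union is live for a given mode.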


Cool.  Thanks.  (I'm glad you caved ... ;).  Looking forward in my inbox, I see
that Lee has some suggestions on where to handle the conversion between the
packed mode and the separate fields.  I'm too lazy to think about that more,
and will likely acquiesce to whatever you and Lee agree to.



Ah - good.  I missed that.  Just to be sure I'm reading the code right,
I take it that it is the following line, at the end of the mpol_new()
routine, that stores the unaltered user nodemask ... right?

	policy->user_nodemask = *nodes;



I would be inclined toward having the classic compatibility mode (no
such mode as MPOL_F_STATIC_NODES specified) continue to do as it always
has done, which apparently includes failing EINVAL in some cases due to
an empty nodemask intersection with the current cpuset.

But I would also expect that this new MPOL_F_STATIC_NODES would allow
specifying any nodes, whether or not any of them were in the task's
current cpuset.
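
In rough terms, the behavior I would expect looks like the sketch
below.  It is not the patch's code: sketch_check_nodes() is a made-up
name, the static_nodes argument stands in for MPOL_F_STATIC_NODES being
set, and reducing the classic mode to a plain intersection is a
simplification for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical check, not the patch's code.  `static_nodes` stands in for
 * MPOL_F_STATIC_NODES being set by the caller.
 */
int sketch_check_nodes(uint64_t requested, uint64_t allowed, int static_nodes,
		       uint64_t *result)
{
	if (static_nodes) {
		*result = requested;	/* kept as given, even outside the cpuset */
		return 0;
	}
	if (!(requested & allowed))
		return -EINVAL;		/* classic mode: empty intersection fails */
	*result = requested & allowed;	/* simplification of the classic mode */
	return 0;
}

void check_sketch_check_nodes(void)
{
	uint64_t out = 0;

	/* nodes 0-1 requested, cpuset is nodes 4-7 */
	assert(sketch_check_nodes(0x3, 0xF0, 0, &out) == -EINVAL);
	assert(sketch_check_nodes(0x3, 0xF0, 1, &out) == 0);
	assert(out == 0x3);
}
```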

Looking ahead, I think Lee is saying pretty much the same thing.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214