Re: [patch 2/2] cpusets: add interleave_over_allowed option

Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
From: Paul Jackson
Date: Sunday, October 28, 2007 - 4:46 pm

David wrote:

You state this point clearly, but I have to disagree.

The Linux documentation is not a legal contract.  Anytime we change the
actual behaviour of the code, we have to ask ourselves what will be the
impact of that change on existing users and usages.  The burden is on
us to minimize breaking things (by that I mean, what users would
consider breakage, even if we think it is all for the better and that
their code was the real problem.)  I didn't say no breakage, but
minimum breakage, doing our best to guide users through changes with
minimum disruption to their work.

Linux is gaining market share rapidly because we co-operate with our
users to give us both the best chance of succeeding.

We don't just play gotcha games with the documentation -- ha ha --
we didn't document that detail, so it's your fault for ever depending
on it.  And besides your code sucks.  So there!  Let's leave that game
for others.


Yes, if there is such an application, I'm trying to protect it.


Perhaps.  Perhaps not.  I don't know.


If that were so, then yes much of your subsequent reasoning would follow.

However, let me repeat the following from my previous message:


The above is the hack that allows us to support existing libnuma based
applications (the most significant users of memory policy historically)
with a default of Choice A, while other code and future code defaults
to Choice B.


That's not the only sort of application I'm trying to protect.

I'm trying to protect almost any application that uses both
set_mempolicy or mbind, while in a cpuset.

    If a task is in a cpuset on say nodes 16-23, and it wants to issue
    any mbind, or any MPOL_PREFERRED, MPOL_BIND, or MPOL_INTERLEAVE
    mempolicy call, then under Choice A it must issue nodemasks offset
    by 16, relative to what it would issue under Choice B.

Almost any task using memory policies on a system making active use of
cpusets will be affected, even well written ones doing simple things.

I am more concerned that the above hack for libnuma isn't enough,
rather than it is unnecessary.

I think the above hack covers existing libnuma users rather well,
though I could be wrong even here, as I don't actually work with
most of the existing libnuma users.

And I can cover those using both memory policies and cpusets via
libcpuset, as there I am the expert and know that I can guide
libcpuset and its users through this change.  There will be some
breakage, but I know how to manage it.

However, if anyone is deploying product or has important (to them) but
not easy to change software using both memory policies and cpusets that
are not doing so via libnuma or libcpuset, then any change that
changes the default for their memory policy calls from Choice A to
Choice B will probably break them.

Perhaps there are no significant cases like this, using memory policies
on cpuset managed systems, but not via libnuma or libcpuset.  It might
be that the only way to smoke out such cases is to ship the change (to
Choice B default) and see what squawks.  That kinda sucks.

I'm tempted to think I need to go a bit further:
 1) Add a per-system or per-cpuset runtime mode, to enable a system
    administrator to revert to the Choice A default.
 2) Make sure that both the new libnuma and libcpuset versions dynamically
    probe the state of this default, and dynamically adapt to running
    on a kernel, or in a cpuset, of either default.  If this mode is
    per-cpuset, then so long as there are not two applications that
    must both run in the same cpuset, both using memory policies,
    neither using libnuma or libcpuset, requiring conflicting defaults
    (one Choice A, the other Choice B) then this would provide an
    administrator sufficient flexibility to adapt.

There would be one key advantage to a per-cpuset mode flag here.  It
exposes in a fairly visible way this change in the default numbering of
memory policy node masks, from cpuset relative to system relative (with
the system now automatically making them cpuset relative across any
changes in the number of nodes in the cpuset.)  This is a classic way
to guide users to understanding new alternatives and changes; expose it
as a 'button on the dashboard', a feature, not a hidden gotcha.

It's the old "it's a feature, not a bug" approach.  It can be a
useful mechanism to educate and enpower customers.

Users end up seeing it, learning about it, enjoying some sense of
control, and usually being able to make it work for them, one way
or the other.  That works out better than just hitting some random
subset of users with subtle bugs (from their perspective) over which
they have little or no control and fewer clues.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401
-
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
[patch 2/2] cpusets: add interleave_over_allowed option, David Rientjes, (Thu Oct 25, 3:54 pm)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Christoph Lameter, (Thu Oct 25, 4:37 pm)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Christoph Lameter, (Thu Oct 25, 5:28 pm)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Lee Schermerhorn, (Fri Oct 26, 8:18 am)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Lee Schermerhorn, (Fri Oct 26, 8:30 am)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Lee Schermerhorn, (Fri Oct 26, 8:37 am)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Lee Schermerhorn, (Fri Oct 26, 10:28 am)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Christoph Lameter, (Fri Oct 26, 10:36 am)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, David Rientjes, (Fri Oct 26, 11:45 am)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, David Rientjes, (Fri Oct 26, 11:46 am)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Michael Kerrisk, (Fri Oct 26, 1:21 pm)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Michael Kerrisk, (Fri Oct 26, 1:33 pm)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Lee Schermerhorn, (Fri Oct 26, 1:43 pm)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Christoph Lameter, (Fri Oct 26, 2:05 pm)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Christoph Lameter, (Fri Oct 26, 2:12 pm)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Lee Schermerhorn, (Fri Oct 26, 2:13 pm)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Christoph Lameter, (Fri Oct 26, 2:17 pm)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Lee Schermerhorn, (Fri Oct 26, 2:26 pm)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Lee Schermerhorn, (Fri Oct 26, 2:31 pm)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Christoph Lameter, (Fri Oct 26, 2:37 pm)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Christoph Lameter, (Fri Oct 26, 6:26 pm)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Christoph Lameter, (Fri Oct 26, 7:50 pm)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Christoph Lameter, (Fri Oct 26, 11:07 pm)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, David Rientjes, (Sat Oct 27, 10:45 am)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Christoph Lameter, (Sat Oct 27, 10:47 am)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, David Rientjes, (Sat Oct 27, 10:50 am)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, David Rientjes, (Sat Oct 27, 12:16 pm)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, David Rientjes, (Sun Oct 28, 11:19 am)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Paul Jackson, (Sun Oct 28, 4:46 pm)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, David Rientjes, (Mon Oct 29, 12:00 am)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Lee Schermerhorn, (Mon Oct 29, 8:00 am)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Lee Schermerhorn, (Mon Oct 29, 8:10 am)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Lee Schermerhorn, (Mon Oct 29, 9:23 am)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Lee Schermerhorn, (Mon Oct 29, 9:54 am)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Lee Schermerhorn, (Mon Oct 29, 10:46 am)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Lee Schermerhorn, (Mon Oct 29, 12:01 pm)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Christoph Lameter, (Mon Oct 29, 1:35 pm)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Christoph Lameter, (Mon Oct 29, 1:36 pm)
Re: [patch 2/2] cpusets: add interleave_over_allowed option, Lee Schermerhorn, (Tue Oct 30, 1:20 pm)