Re: [PATCH for 2.6.24][regression fix] Mempolicy: silently restrict nodemask to allowed nodes V3

To: <linux-mm@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Christoph Lameter <clameter@...>, Andi Kleen <andi@...>
Cc: <kosaki.motohiro@...>
Date: Saturday, February 2, 2008 - 4:12 am

Hi

I tested numactl on 2.6.24-rc8-mm1
and found strange behavior.

Test method and result:

$ numactl --interleave=all ls
set_mempolicy: Invalid argument
setting interleave mask: Invalid argument

The numactl command was downloaded from
ftp://ftp.suse.com/pub/people/ak/numa/
(I chose numactl-1.0.2).

Of course, an older kernel (RHEL5.1) works fine.

More detail:

1. Nodes and memory of my machine.

$ numactl --hardware
available: 16 nodes (0-15)
node 0 size: 0 MB
node 0 free: 0 MB
node 1 size: 0 MB
node 1 free: 0 MB
node 2 size: 3872 MB
node 2 free: 1487 MB
node 3 size: 4032 MB
node 3 free: 3671 MB
node 4 size: 0 MB
node 4 free: 0 MB
node 5 size: 0 MB
node 5 free: 0 MB
node 6 size: 0 MB
node 6 free: 0 MB
node 7 size: 0 MB
node 7 free: 0 MB
node 8 size: 0 MB
node 8 free: 0 MB
node 9 size: 0 MB
node 9 free: 0 MB
node 10 size: 0 MB
node 10 free: 0 MB
node 11 size: 0 MB
node 11 free: 0 MB
node 12 size: 0 MB
node 12 free: 0 MB
node 13 size: 0 MB
node 13 free: 0 MB
node 14 size: 0 MB
node 14 free: 0 MB
node 15 size: 0 MB
node 15 free: 0 MB

2. numactl behavior for --interleave=all
2.1 scan the "/sys/devices/system/node" directory
2.2 calculate the maximum node number
2.3 turn on the bits of all existing nodes
(i.e. 0xFF is generated in my environment)
2.4 call set_mempolicy()

3. 2.6.24-rc8-mm1 set_mempolicy(2) behavior
3.1 check nodes_subset(nodemask argument, node_states[N_HIGH_MEMORY])
in mpol_check_policy()

-> the check fails when a memoryless node exists.
(i.e. node_states[N_HIGH_MEMORY] on my machine is 0xc)

4. RHEL5.1 set_mempolicy(2) behavior
4.1 check nodes_subset(nodemask argument, node_online_map)
in mpol_check_policy().

-> the check succeeds.
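
The difference between steps 3 and 4 can be sketched in plain C. This is only an illustration, not kernel code: the 64-bit `nodemask` type and the helper names are hypothetical, and the value used for node_online_map (all 16 nodes online) is an assumption based on the "available: 16 nodes" line, since the report does not state it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* One bit per node: bit n set means node n is in the mask (up to 64 nodes). */
typedef uint64_t nodemask;

/* Mask covering nodes first..last inclusive. */
static nodemask node_range(int first, int last)
{
    assert(0 <= first && first <= last && last < 64);
    nodemask upto_last = (last == 63) ? ~0ULL : ((1ULL << (last + 1)) - 1);
    return upto_last & ~((1ULL << first) - 1);
}

/* True when every node in req is also in allowed, in the spirit of nodes_subset(). */
static bool mask_is_subset(nodemask req, nodemask allowed)
{
    return (req & ~allowed) == 0;
}

/* Strict check: a single node outside allowed rejects the whole request. */
static int strict_check(nodemask req, nodemask allowed)
{
    return mask_is_subset(req, allowed) ? 0 : -EINVAL;
}
```

With the numbers from the report, `strict_check(0xFF, node_range(2, 3))` yields -EINVAL because nodes 0, 1 and 4-7 have no memory, while `strict_check(0xFF, node_range(0, 15))` passes under the assumed online map.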

I don't know whether the kernel or libnuma is wrong.
Any comments are welcome!

- kosaki

--

To: KOSAKI Motohiro <kosaki.motohiro@...>
Cc: <linux-mm@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Christoph Lameter <clameter@...>, Andi Kleen <andi@...>, <Lee.Schermerhorn@...>
Date: Saturday, February 2, 2008 - 5:09 am

[intentional full quote]

When the kernel behaviour changes and breaks user space then the kernel
is usually wrong. Cc'ed Lee S. who maintains the kernel code now.

-Andi
--

To: Andi Kleen <andi@...>
Cc: <kosaki.motohiro@...>, <linux-mm@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Christoph Lameter <clameter@...>, <Lee.Schermerhorn@...>
Date: Saturday, February 2, 2008 - 5:37 am

Maybe yes, maybe no.

I have one simple question.
Why does libnuma generate a bit pattern with all bits set instead of
checking /sys/devices/system/node/has_high_memory or
/sys/devices/system/node/online?

Do you know?

I also made a simple patch that exposes has_high_memory even when CONFIG_HIGHMEM is disabled.
If CONFIG_HIGHMEM is disabled, the has_high_memory file shows
the same as has_normal_memory.

Maybe userland processes should check the has_high_memory file.

But I am not confident.
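
As a rough sketch of what such a userland check might involve, the C function below turns a node list string into a bitmask. It assumes the sysfs file holds a list such as "2-3" or "0,2-3" followed by a newline; `parse_nodelist` is a hypothetical helper, not part of libnuma, and it handles at most 64 nodes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a node list such as "2-3" or "0,2-3\n" into a bitmask.
 * Returns 0 on success, -1 on malformed input or a node number above 63.
 */
static int parse_nodelist(const char *s, uint64_t *mask)
{
    assert(s != NULL && mask != NULL);
    *mask = 0;
    while (*s != '\0' && *s != '\n') {
        char *end;
        long first = strtol(s, &end, 10);
        if (end == s)
            return -1;              /* no digits where a number was expected */
        long last = first;
        if (*end == '-') {          /* a range "first-last" */
            s = end + 1;
            last = strtol(s, &end, 10);
            if (end == s)
                return -1;
        }
        if (first < 0 || last < first || last > 63)
            return -1;
        for (long n = first; n <= last; n++)
            *mask |= 1ULL << n;
        if (*end == ',')            /* skip the separator between entries */
            end++;
        s = end;
    }
    return 0;
}
```

A caller could read the file contents into a buffer, pass them to `parse_nodelist()`, and AND the result with the mask it intends to hand to set_mempolicy().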
Thanks.

- kosaki

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

---
drivers/base/node.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)

Index: b/drivers/base/node.c
===================================================================
--- a/drivers/base/node.c 2008-02-02 17:52:32.000000000 +0900
+++ b/drivers/base/node.c 2008-02-02 18:32:38.000000000 +0900
@@ -276,7 +276,6 @@ static SYSDEV_CLASS_ATTR(has_normal_memo
NULL);
static SYSDEV_CLASS_ATTR(has_cpu, 0444, print_nodes_has_cpu, NULL);

-#ifdef CONFIG_HIGHMEM
static ssize_t print_nodes_has_high_memory(struct sysdev_class *class,
char *buf)
{
@@ -285,15 +284,11 @@ static ssize_t print_nodes_has_high_memo

static SYSDEV_CLASS_ATTR(has_high_memory, 0444, print_nodes_has_high_memory,
NULL);
-#endif
-
struct sysdev_class_attribute *node_state_attr[] = {
&attr_possible,
&attr_online,
&attr_has_normal_memory,
-#ifdef CONFIG_HIGHMEM
&attr_has_high_memory,
-#endif
&attr_has_cpu,
};

@@ -302,7 +297,7 @@ static int node_states_init(void)
int i;
int err = 0;

- for (i = 0; i < NR_NODE_STATES; i++) {
+ for (i = 0; i < ARRAY_SIZE(node_state_attr); i++) {
int ret;
...

To: KOSAKI Motohiro <kosaki.motohiro@...>
Cc: Andi Kleen <andi@...>, <linux-mm@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Christoph Lameter <clameter@...>, Paul Jackson <pj@...>, David Rientjes <rientjes@...>, Mel Gorman <mel@...>
Date: Monday, February 4, 2008 - 2:20 pm

The memoryless nodes patch series changed a lot of things, so just
reverting this one area [mpol_check_policy()] probably won't restore the
prior behavior. A fully populated node mask is not necessarily a proper
subset of node_online_map(). And contextualize_policy() also requires
the mask to be a subset of mems_allowed which also defaults to nodes
with memory.

I don't know how Mel Gorman's "two zonelist" series, which is still
awaiting a window into the -mm tree, affects this behavior. Those
patches will certainly be affected by whatever we decide here.

I don't know the current state of Paul's rework of cpusets and
mems_allowed. That probably resolves this issue, if he still plans on
allowing a fully populated mask to indicate interleaving over all
allowed nodes.

I have a patch that takes a different approach to "interleave=all" that
doesn't solve Paul's and David's requirements. I also have patches to
libnuma and numactl that work with my patches, but I saw no sense in
posting them unless my kernel patches got some traction. If interested,
you can find them at:

In addition to Andi's answer about simplicity, libnuma and numactl
predate the sysfs node masks. There was no way to query what the valid
set of nodes would be, but the kernel allowed a fully populated map. We

Regarding the patch itself: If others have no problems with displaying
a "has_high_memory" node mask for systems w/o HIGH_MEM configured, I can
live with it.

The current upstream kernel [2.6.24] supports a MPOL_F_MEMS_ALLOWED flag
to get_mempolicy() to return the nodes allowed in the caller's cpuset.
My numactl patches, mentioned above, support this.

However, as Andi says, we really can't break application behavior. All
applications that use mempolicy don't necessarily use libnuma APIs. So,
a fully populated interleave node mask should be allowed and should
probably mean "all allowed nodes with memory".

I think we'd still need to reduce the interleave policy mask to nodes
with memory...

To: Lee Schermerhorn <Lee.Schermerhorn@...>
Cc: KOSAKI Motohiro <kosaki.motohiro@...>, Andi Kleen <andi@...>, <linux-mm@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Christoph Lameter <clameter@...>, Paul Jackson <pj@...>, David Rientjes <rientjes@...>
Date: Tuesday, February 5, 2008 - 10:31 am

I doubt they'd make a difference to this particular problem.

--
Mel Gorman
Part-time PhD Student, University of Limerick
Linux Technology Center, IBM Dublin Software Lab
--

To: Mel Gorman <mel@...>
Cc: KOSAKI Motohiro <kosaki.motohiro@...>, Andi Kleen <andi@...>, <linux-mm@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Christoph Lameter <clameter@...>, Paul Jackson <pj@...>, David Rientjes <rientjes@...>
Date: Tuesday, February 5, 2008 - 11:23 am

I didn't really think so, but I wanted to give you a heads up regarding
this, as I think it will affect your patches. I'm hoping we'll see them
in -mm soon after the .25 merge window closes. If you get there before
a fix to this issue, so much the better, IMO :-).

--

To: Lee Schermerhorn <Lee.Schermerhorn@...>
Cc: Mel Gorman <mel@...>, KOSAKI Motohiro <kosaki.motohiro@...>, Andi Kleen <andi@...>, <linux-mm@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Paul Jackson <pj@...>, David Rientjes <rientjes@...>
Date: Tuesday, February 5, 2008 - 2:12 pm

Could we focus on the problem instead of discussion of new patches under
development? Can we confirm that what Kosaki sees is a bug?
--

To: Christoph Lameter <clameter@...>
Cc: Mel Gorman <mel@...>, KOSAKI Motohiro <kosaki.motohiro@...>, Andi Kleen <andi@...>, <linux-mm@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Paul Jackson <pj@...>, David Rientjes <rientjes@...>
Date: Tuesday, February 5, 2008 - 2:27 pm

Christoph: you are free to ignore any part of this discussion that you

by definition, right? we broke user space.

Lee

--

To: Lee Schermerhorn <Lee.Schermerhorn@...>
Cc: Mel Gorman <mel@...>, KOSAKI Motohiro <kosaki.motohiro@...>, Andi Kleen <andi@...>, <linux-mm@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Paul Jackson <pj@...>, David Rientjes <rientjes@...>
Date: Tuesday, February 5, 2008 - 3:04 pm

Had the impression that we are ignoring Kosaki's fix to the problem. Can
we fix up his patch to address the immediate issue?

--

To: Christoph Lameter <clameter@...>
Cc: <Lee.Schermerhorn@...>, <mel@...>, <kosaki.motohiro@...>, <andi@...>, <linux-mm@...>, <linux-kernel@...>, <akpm@...>, <rientjes@...>
Date: Tuesday, February 5, 2008 - 3:15 pm

Since any of those future patches only add optional modes
with new flags, while preserving current behaviour if you
don't use one of the new flags, therefore the current behavior
has to work as best it can.

Therefore fixes such as this to address immediate issues
are probably needed. Yup.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.940.382.4214
--

To: Paul Jackson <pj@...>
Cc: Christoph Lameter <clameter@...>, <Lee.Schermerhorn@...>, <mel@...>, <kosaki.motohiro@...>, <andi@...>, <linux-mm@...>, <linux-kernel@...>, <akpm@...>
Date: Tuesday, February 5, 2008 - 4:06 pm

There's a subtlety to this issue that allows it to be fixed and easily
extended for two upcoming changes:

- Paul Jackson's mempolicy and cpuset interactions change that will
probably allow set_mempolicy() callers to specify with a MPOL_*
flag whether they are referring to "dynamic" or "static" nodemasks[*],
and

- node hotplug (both add and remove) that will change the state of a
node with an identical id.

Paul, with his patch, will need to preserve the "intent" of the mempolicy
as the nodemask that was passed by the user and attempt on all successive
rebinds to accommodate that intent as much as possible.

So at the time of rebind it is quite simple to intersect the set of system
nodes that have memory with the intent of the mempolicy to yield the
effected nodemask. This nodemask is saved in the mempolicy (pol->v.nodes
in this case for interleave) and only steps through the set of nodes that
can allow interleaved allocations.

When the set of available nodes changes, either by cpuset change or node hotplug,
the rebind is quite simple when the intent is preserved. So we're going
to need an additional nodemask_t added to struct mempolicy that saves this
intent and modify contextualize_policy() to allow it. This will basically
make any set_mempolicy() call succeed even if the application does not
have access to any of the mempolicy nodes because it is possible that they
will become accessible in the future. In that case the mempolicy is
effectively MPOL_DEFAULT until the desired nodes become available and it
is effected.

David
--

To: Lee Schermerhorn <Lee.Schermerhorn@...>
Cc: <kosaki.motohiro@...>, <andi@...>, <linux-mm@...>, <linux-kernel@...>, <akpm@...>, <clameter@...>, <rientjes@...>, <mel@...>
Date: Tuesday, February 5, 2008 - 6:17 am

It got a bit stalled out for the last month (my employer had other
designs on my time.) But I'd really like to drive it home.

What happened so far, in December 2007 and earlier, is that a few of us:

David Rientjes <rientjes@google.com>
Lee.Schermerhorn@hp.com
Christoph Lameter <clameter@sgi.com>
Andi Kleen <ak@suse.de>

had a discussion, motivated in good part by the need to allow a
mempolicy of MPOL_INTERLEAVE over all nodes currently available in
the cpuset, where that interleave policy was robustly preserved if
the cpuset changed (without requiring the application to somehow
"know" its cpuset had changed and reissuing the set_mempolicy call.)

But that discussion touched on some other long standing deficiencies
in the way that I had originally glued cpusets and memory policies
together. The current mechanism doesn't handle changing cpusets very
well, especially if the number of nodes in the cpuset increases.

Obviously, I can't change the current behaviour, especially of the
mempolicy system calls. I can only add new options that provide new
alternatives.

The patchset I'd like to drive home addresses these issues with a
couple of additional MPOL_* flags, upward compatible, that alter the
way that nodemasks are mapped into cpusets, and remapped if the cpuset
subsequently changes.

The next two steps I need to take are:
1) propose this patch, with careful explanation (it's easy to lose
one's bearings in the mappings and remappings of node numberings)
to a wider audience, such as linux-mm or linux-kernel, and
2) carefully test this, especially on each code path I touched in
mm/mempolicy.c, where the changes were delicate, to ensure I
didn't break any existing code.

There were also some other, smaller patches proposed, by myself and
others. I was preferring to address a wider set of the long standing
issues in this area, but the others above mostly preferred the smaller
patches. This needs to be discussed in a wider for...

To: Paul Jackson <pj@...>
Cc: Lee Schermerhorn <Lee.Schermerhorn@...>, <kosaki.motohiro@...>, <andi@...>, <linux-mm@...>, <linux-kernel@...>, <akpm@...>, <clameter@...>, <mel@...>
Date: Tuesday, February 5, 2008 - 3:56 pm

That's because of the nodemask remaps that are done for the various
mempolicy cases when rebinding the policy. I agree we cannot change that
implementation now even though it is undocumented.

The more alarming result of these remaps is in the MPOL_BIND case, as
we've talked about before. The language in set_mempolicy(2):

The MPOL_BIND policy is a strict policy that restricts memory
allocation to the nodes specified in nodemask. There won't be
allocations on other nodes.

makes it pretty clear that allocations will not be done on other nodes not
provided in the set_mempolicy() nodemask if the task is not swapped out.

But the current implementation allows that if the task is either moved to
a different cpuset or its cpuset's mems change. For example, consider a
task that is allowed nodes 1-3 by its cpuset and asks for a MPOL_BIND
mempolicy of node 2. If that cpuset's mems change to 4-6, the mempolicy

I think if these MPOL_* flags that you're proposing are made as generic as
possible for all possible mempolicies (current and future), it would be
the optimal change. It would prevent us from having to add new flags for
corner-cases in the future and would allow us to keep the flag set as
small as possible. My suggestion of MPOL_F_STATIC_NODEMASK goes a long
way to solve these issues both for MPOL_INTERLEAVE (in conjunction with
storing the set_mempolicy() intent) and the MPOL_BIND discrepancy I
mentioned above.
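
The kind of nodemask remap discussed for the MPOL_BIND case can be illustrated with an ordinal mapping: the n-th node of the old allowed set maps to the n-th node of the new one, and nodes outside the old set keep their number. The sketch below assumes that semantics, in the manner of the kernel's bitmap_remap() helper; the function names are hypothetical and whether this matches the exact rebind code path of a given kernel version is an assumption.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t nodemask;   /* bit n set means node n is in the set */

/* Number of nodes in the mask. */
static int mask_weight(nodemask m)
{
    int w = 0;
    for (; m != 0; m &= m - 1)
        w++;
    return w;
}

/* 0-based position of node among the set bits of m, or -1 if not a member. */
static int node_to_ord(nodemask m, int node)
{
    assert(node >= 0 && node < 64);
    if (!(m & (1ULL << node)))
        return -1;
    return mask_weight(m & ((1ULL << node) - 1));
}

/* Node number of the ord-th set bit of m (0-based). */
static int ord_to_node(nodemask m, int ord)
{
    assert(ord >= 0 && ord < mask_weight(m));
    for (int node = 0; node < 64; node++) {
        if ((m & (1ULL << node)) && ord-- == 0)
            return node;
    }
    return -1;   /* not reached when the precondition holds */
}

/* Map each node of src from its position in old_set to the same position in new_set. */
static nodemask remap_nodes(nodemask src, nodemask old_set, nodemask new_set)
{
    int w = mask_weight(new_set);
    nodemask dst = 0;
    for (int node = 0; node < 64; node++) {
        if (!(src & (1ULL << node)))
            continue;
        int ord = node_to_ord(old_set, node);
        if (ord < 0 || w == 0)
            dst |= 1ULL << node;     /* unmapped nodes keep their number */
        else
            dst |= 1ULL << ord_to_node(new_set, ord % w);
    }
    return dst;
}
```

Under this assumed mapping, a policy on node 2 with allowed nodes 1-3 would move to node 5 once the allowed nodes become 4-6, which is a node the caller never named.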

David
--

To: David Rientjes <rientjes@...>
Cc: <Lee.Schermerhorn@...>, <kosaki.motohiro@...>, <andi@...>, <linux-mm@...>, <linux-kernel@...>, <akpm@...>, <clameter@...>, <mel@...>
Date: Tuesday, February 5, 2008 - 4:51 pm

You're diving into the middle of a rather involved discussion
we had on the other various patches proposed to extend the
interaction of mempolicies with cpusets and hotplug.

I choose not to hijack this current thread with my rebuttal,
which you've seen before, of your points here.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.940.382.4214
--

To: Paul Jackson <pj@...>
Cc: <Lee.Schermerhorn@...>, <kosaki.motohiro@...>, <andi@...>, <linux-mm@...>, <linux-kernel@...>, <akpm@...>, <clameter@...>, <mel@...>
Date: Tuesday, February 5, 2008 - 5:03 pm

I've simply identified that MPOL_BIND mempolicy interactions with a task's
changing mems_allowed as a result of a cpuset move or mems change is also
an issue that can be addressed at the same time as the interleave problem.

The issues of mempolicies working over memoryless nodes and supporting
changing cpusets are very closely related and can be addressed in the same
way. It would be disappointing to see a lot of work done to fix the
memoryless node issue or the changing cpuset mems issue and then realize
both could have been fixed quite simply with a relatively small set of
changes.

David
--

To: David Rientjes <rientjes@...>
Cc: <Lee.Schermerhorn@...>, <kosaki.motohiro@...>, <andi@...>, <linux-mm@...>, <linux-kernel@...>, <akpm@...>, <clameter@...>, <mel@...>
Date: Tuesday, February 5, 2008 - 5:33 pm

The suggested patch of KOSAKI Motohiro didn't look like a lot of work to me.

I continue to prefer not to hijack this thread for that other discussion.
Just presenting your position and calling it "simple" is misleading.
The discussion so far has involved over a hundred messages over months,
and certainly neither your position nor mine, for that matter, obtained consensus.

How does the patch of KOSAKI Motohiro, earlier in this thread, look to you?

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.940.382.4214
--

To: Paul Jackson <pj@...>
Cc: David Rientjes <rientjes@...>, <kosaki.motohiro@...>, <andi@...>, <linux-mm@...>, <linux-kernel@...>, <akpm@...>, <clameter@...>, <mel@...>
Date: Tuesday, February 5, 2008 - 6:04 pm

Paul,

It wasn't clear to me whether Kosaki-san's patch required a modified
numactl/libnuma or not. I think so, because that patch doesn't change
the error return in contextualize_policy() and in mpol_check_policy().
My modified numactl/libnuma avoids this by only passing in allowed mems
fetched via get_mempolicy() with the new MEMS_ALLOWED flag.

The patch I just posted doesn't depend on the numactl changes and seems
quite minimal to me. I think it cleans up the differences between
set_mempolicy() and mbind(), as well. However, some may take exception
to the change in behavior--silently ignoring dis-allowed nodes in
set_mempolicy().

Also, your cpuset/mempolicy work will probably need to undo the
unconditional masking in contextualize_policy() and/or save the original
node mask somewhere...

Lee

--

To: Lee Schermerhorn <Lee.Schermerhorn@...>
Cc: <rientjes@...>, <kosaki.motohiro@...>, <andi@...>, <linux-mm@...>, <linux-kernel@...>, <akpm@...>, <clameter@...>, <mel@...>
Date: Tuesday, February 5, 2008 - 6:50 pm

Yeah, something like that ... just a small matter of code.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.940.382.4214
--

To: Lee Schermerhorn <Lee.Schermerhorn@...>
Cc: Paul Jackson <pj@...>, <kosaki.motohiro@...>, <andi@...>, <linux-mm@...>, <linux-kernel@...>, <akpm@...>, <clameter@...>, <mel@...>
Date: Tuesday, February 5, 2008 - 6:44 pm

If the intent of the set_mempolicy() call is going to be preserved in the
struct mempolicy with Paul's change, then we're going to allow disallowed
nodes anyway. So the only nodemask errors that we should return are ones
that are empty; nodemasks that include offlined nodes should be allowed to
support node hotplug. Likewise, memoryless nodes should still be saved as
the intent of the syscall.

The change to save the intent or silently ignore disallowed nodes would
also require applications to issue a successive get_mempolicy() call to
know what their current mempolicy is, since it will be able to change with
a cpusets change or node hotplug event. There is no longer this assurance
that if set_mempolicy() returns without an error that the memory policy is
effected. But in the presence of subsystems such as cpusets that allow
those mempolicies to change from beneath the application, there is no way
around that: the nodemask that the mempolicy acts on can dynamically
change at any time.

So I don't see any problem with silently ignoring disallowed nodes and
encourage it so that the kernel accommodates the intent of the mempolicy in
the future if and when it can be effected.

David
--

To: Paul Jackson <pj@...>
Cc: <kosaki.motohiro@...>, Lee Schermerhorn <Lee.Schermerhorn@...>, <andi@...>, <linux-mm@...>, <linux-kernel@...>, <akpm@...>, <clameter@...>, <rientjes@...>, <mel@...>
Date: Tuesday, February 5, 2008 - 7:14 am

Great.

At that time, I will join the review of the patch with pleasure :)

- kosaki

--

To: Lee Schermerhorn <Lee.Schermerhorn@...>
Cc: <kosaki.motohiro@...>, Andi Kleen <andi@...>, <linux-mm@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Christoph Lameter <clameter@...>, Paul Jackson <pj@...>, David Rientjes <rientjes@...>, Mel Gorman <mel@...>, <torvalds@...>
Date: Tuesday, February 5, 2008 - 5:26 am

Hi Lee-san

Unfortunately, it doesn't work in my test environment ;-)

                  numactl-orig    numactl-with-lee-patch
2.6.24            failed          failed
2.6.24-rc8-mm1    failed          failed

I got the error messages below in all cases.

$ numactl --interleave=all ls
set_mempolicy: Invalid argument
setting interleave mask: Invalid argument

I think the kernel needs to be changed too.
I attached a patch below.

OK, I withdraw my previous has_high_memory patch.

This is good news for me :)
I'll wait for his patch to be posted.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

---
mm/mempolicy.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)

Index: b/mm/mempolicy.c
===================================================================
--- a/mm/mempolicy.c 2008-02-02 17:54:33.000000000 +0900
+++ b/mm/mempolicy.c 2008-02-05 17:49:47.000000000 +0900
@@ -187,9 +187,12 @@ static struct mempolicy *mpol_new(int mo
atomic_set(&policy->refcnt, 1);
switch (mode) {
case MPOL_INTERLEAVE:
- policy->v.nodes = *nodes;
- nodes_and(policy->v.nodes, policy->v.nodes,
- node_states[N_HIGH_MEMORY]);
+ if (nodes) {
+ policy->v.nodes = *nodes;
+ nodes_and(policy->v.nodes, policy->v.nodes,
+ node_states[N_HIGH_MEMORY]);
+ } else
+ policy->v.nodes = node_states[N_HIGH_MEMORY];
if (nodes_weight(policy->v.nodes) == 0) {
kmem_cache_free(policy_cache, policy);
return ERR_PTR(-EINVAL);
@@ -934,7 +937,7 @@ asmlinkage long sys_set_mempolicy(int mo
err = get_nodes(&nodes, nmask, maxnode);
if (err)
return err;
- return do_set_mempolicy(mode, &nodes);
+ return do_set_mempolicy(mode, nodes_empty(nodes) ? NULL : &nodes);
}
...

To: KOSAKI Motohiro <kosaki.motohiro@...>, <linux-kernel@...>, Andrew Morton <akpm@...>
Cc: Christoph Lameter <clameter@...>, Paul Jackson <pj@...>, David Rientjes <rientjes@...>, Mel Gorman <mel@...>, linux-mm <linux-mm@...>, <torvalds@...>, Eric Whitney <eric.whitney@...>
Date: Friday, February 8, 2008 - 3:45 pm

Was "Re: [2.6.24 regression][BUGFIX] numactl --interleave=all doesn't
works on memoryless node."

[Aside: I noticed there were two slightly different distributions for
this topic. I've unified the distribution lists w/o dropping anyone, I
think. Apologies if you'd rather have been dropped...]

Here's V3 of the patch, accommodating Kosaki Motohiro's suggestion for
folding contextualize_policy() into mpol_check_policy() [because my
"was_empty" argument "was ugly" ;-)]. It does seem to clean up the
code.

I'm still deferring David Rientjes' suggestion to fold
mpol_check_policy() into mpol_new(). We need to sort out whether
mempolicies specified for tmpfs and hugetlbfs mounts always need the
same "contextualization" as user/application installed policies. I
don't want to hold up this bug fix for that discussion. This is
something Paul J will need to address with his cpuset/mempolicy rework,
so we can sort it out in that context.

Again, tested with "numactl --interleave=all" and memtoy on ia64 using
mem= command line argument to simulate memoryless node.

Lee

============================
[PATCH] 2.6.24-mm1 - mempolicy: silently restrict nodemask to allowed nodes

V2 -> V3:
+ As suggested by Kosaki Motohiro, fold the "contextualization"
of policy nodemask into mpol_check_policy(). Looks a little
cleaner.

V1 -> V2:
+ Communicate whether or not incoming node mask was empty to
mpol_check_policy() for better error checking.
+ As suggested by David Rientjes, remove the now unused
cpuset_nodes_subset_current_mems_allowed() from cpuset.h

Kosaki Motohiro noted that "numactl --interleave=all ..." failed in the
presence of memoryless nodes. This patch attempts to fix that problem.

Some background:

numactl --interleave=all calls set_mempolicy(2) with a fully
populated [out to MAXNUMNODES] nodemask. set_mempolicy()
[in do_set_mempolicy()] calls contextualize_policy() which
requires that the nodemask be a subset of the current task's
mems_allowed; e...

To: Lee Schermerhorn <Lee.Schermerhorn@...>
Cc: <kosaki.motohiro@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Christoph Lameter <clameter@...>, Paul Jackson <pj@...>, David Rientjes <rientjes@...>, Mel Gorman <mel@...>, linux-mm <linux-mm@...>, <torvalds@...>, Eric Whitney <eric.whitney@...>, Greg KH <greg@...>
Date: Sunday, February 10, 2008 - 1:29 am

CC'd Greg KH <greg@kroah.com>

I tested this patch on a Fujitsu machine with memoryless nodes
(2.6.24 + silently-restrict-nodemask-to-allowed-nodes-V3 instead of 2.6.24-mm1).
It seems to work well.

Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

Greg, I hope this patch can be merged into the 2.6.24.x stable tree because
it is a regression fix.
Please tell me what I should do for that.

--

To: KOSAKI Motohiro <kosaki.motohiro@...>
Cc: Lee Schermerhorn <Lee.Schermerhorn@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Christoph Lameter <clameter@...>, Paul Jackson <pj@...>, David Rientjes <rientjes@...>, Mel Gorman <mel@...>, linux-mm <linux-mm@...>, <torvalds@...>, Eric Whitney <eric.whitney@...>
Date: Sunday, February 10, 2008 - 1:49 am

Once the patch goes into Linus's tree, feel free to send it to the
stable@kernel.org address so that we can include it in the 2.6.24.x
tree.

thanks,

greg k-h
--

To: Greg KH <greg@...>
Cc: KOSAKI Motohiro <kosaki.motohiro@...>, Lee Schermerhorn <Lee.Schermerhorn@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Christoph Lameter <clameter@...>, Paul Jackson <pj@...>, David Rientjes <rientjes@...>, Mel Gorman <mel@...>, linux-mm <linux-mm@...>, Eric Whitney <eric.whitney@...>
Date: Sunday, February 10, 2008 - 3:42 am

I've been ignoring the patches because they say "PATCH 2.6.24-mm1", and so
I simply don't know whether it's supposed to go into *my* kernel or just
-mm.

There's also been several versions and discussions, so I'd really like to
have somebody send me a final patch with all the acks etc.. One that is
clearly for me, not for -mm.

Linus
--

To: KOSAKI Motohiro <kosaki.motohiro@...>
Cc: Linus Torvalds <torvalds@...>, Greg KH <greg@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Christoph Lameter <clameter@...>, Paul Jackson <pj@...>, David Rientjes <rientjes@...>, Mel Gorman <mel@...>, linux-mm <linux-mm@...>, Eric Whitney <eric.whitney@...>
Date: Monday, February 11, 2008 - 12:47 pm

Kosaki-san: You've tested V3 on '.24. Do you want to repost the patch
refreshed against .24, adding your "Tested-by:" [and "Signed-off-by:",
as the folding of the contextualization into mpol_check_policy() is
based on your code--apologies for not adding it myself]? I'm tied up
with something else for most of this week and won't get to it until
Friday, earliest.

Regards,
Lee

P.S., As Andrew pointed out, I forgot to run checkpatch and the patch
does include a violation thereof.

--

To: Lee Schermerhorn <Lee.Schermerhorn@...>, Andrew Morton <akpm@...>
Cc: <kosaki.motohiro@...>, Linus Torvalds <torvalds@...>, Greg KH <greg@...>, <linux-kernel@...>, Christoph Lameter <clameter@...>, Paul Jackson <pj@...>, David Rientjes <rientjes@...>, Mel Gorman <mel@...>, linux-mm <linux-mm@...>, Eric Whitney <eric.whitney@...>
Date: Tuesday, February 12, 2008 - 12:30 am

Hi Andrew

# This is the second post of the same patch.

This is a backport from -mm to mainline.
The original patch is http://marc.info/?l=linux-kernel&m=120250000001182&w=2

My only changes are updated line numbers and the removal of an extra space.
Please ack.

============================
[PATCH] 2.6.24 - mempolicy: silently restrict nodemask to allowed nodes

V2 -> V3:
+ As suggested by Kosaki Motohiro, fold the "contextualization"
of policy nodemask into mpol_check_policy(). Looks a little
cleaner.

V1 -> V2:
+ Communicate whether or not incoming node mask was empty to
mpol_check_policy() for better error checking.
+ As suggested by David Rientjes, remove the now unused
cpuset_nodes_subset_current_mems_allowed() from cpuset.h

Kosaki Motohiro noted that "numactl --interleave=all ..." failed in the
presence of memoryless nodes. This patch attempts to fix that problem.

Some background:

numactl --interleave=all calls set_mempolicy(2) with a fully
populated [out to MAXNUMNODES] nodemask. set_mempolicy()
[in do_set_mempolicy()] calls contextualize_policy() which
requires that the nodemask be a subset of the current task's
mems_allowed; else EINVAL will be returned. A task's
mems_allowed will always be a subset of node_states[N_HIGH_MEMORY]--
i.e., nodes with memory. So, a fully populated nodemask will
be declared invalid if it includes memoryless nodes.

NOTE: the same thing will occur when running in a cpuset
with restricted mems_allowed--for the same reason:
node mask contains dis-allowed nodes.

mbind(2), on the other hand, just masks off any nodes in the
nodemask that are not included in the caller's mems_allowed.

In each case [mbind() and set_mempolicy()], mpol_check_policy()
will complain [again, resulting in EINVAL] if the nodemask contains
any memoryless nodes. This is somewhat redundant as mpol_new()
will remove memoryless nodes for interleave policy, as will
bind_zonelist()--called by mpol_new() for BIND policy.

Pr...

To: KOSAKI Motohiro <kosaki.motohiro@...>
Cc: Lee Schermerhorn <Lee.Schermerhorn@...>, Andrew Morton <akpm@...>, Linus Torvalds <torvalds@...>, Greg KH <greg@...>, <linux-kernel@...>, Christoph Lameter <clameter@...>, Paul Jackson <pj@...>, Mel Gorman <mel@...>, linux-mm <linux-mm@...>, Eric Whitney <eric.whitney@...>
Date: Tuesday, February 12, 2008 - 1:06 am

Linus has already merged this patch into his tree, but next time you
pass along a contribution to a maintainer the first line should read:

From: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

so the person who actually wrote the patch is listed as the author in the
git commit.
--

To: KOSAKI Motohiro <kosaki.motohiro@...>
Cc: Lee Schermerhorn <Lee.Schermerhorn@...>, Linus Torvalds <torvalds@...>, Greg KH <greg@...>, <linux-kernel@...>, Christoph Lameter <clameter@...>, Paul Jackson <pj@...>, David Rientjes <rientjes@...>, Mel Gorman <mel@...>, linux-mm <linux-mm@...>, Eric Whitney <eric.whitney@...>
Date: Tuesday, February 12, 2008 - 1:07 am

As it's now post -rc1 and not a 100% obvious thing, I tend to hang onto
such patches for a week or so before sending up to Linus

Should this be backported to 2.6.24.x? If so, the reasons for such a
relatively stern step should be spelled out in the changelog for the
-stable maintainers to evaluate.

Thanks.
--

To: Andrew Morton <akpm@...>
Cc: <kosaki.motohiro@...>, Lee Schermerhorn <Lee.Schermerhorn@...>, Linus Torvalds <torvalds@...>, Greg KH <greg@...>, <linux-kernel@...>, Christoph Lameter <clameter@...>, Paul Jackson <pj@...>, David Rientjes <rientjes@...>, Mel Gorman <mel@...>, linux-mm <linux-mm@...>, Eric Whitney <eric.whitney@...>
Date: Tuesday, February 12, 2008 - 9:18 am

Oh,
do you really think the reasons below are not enough?

1. It is a regression.
2. It is very easily reproducible on a machine with memoryless nodes.

If so, I will withdraw my backport request.
I don't want to add to your headache ;-)

thanks.

-kosaki

--

To: Linus Torvalds <torvalds@...>
Cc: Greg KH <greg@...>, KOSAKI Motohiro <kosaki.motohiro@...>, Lee Schermerhorn <Lee.Schermerhorn@...>, <linux-kernel@...>, Christoph Lameter <clameter@...>, Paul Jackson <pj@...>, David Rientjes <rientjes@...>, Mel Gorman <mel@...>, linux-mm <linux-mm@...>, Eric Whitney <eric.whitney@...>
Date: Sunday, February 10, 2008 - 6:31 am

fyi, I won't be able to do much patch-wrangling until Tuesday or Wednesday.
All the big machines are disconnected and mothballed due to domestic
s/carpet/hardwood/g.

--

To: Lee Schermerhorn <Lee.Schermerhorn@...>
Cc: <linux-kernel@...>, Andrew Morton <akpm@...>, Christoph Lameter <clameter@...>, Paul Jackson <pj@...>, David Rientjes <rientjes@...>, Mel Gorman <mel@...>, linux-mm <linux-mm@...>, <torvalds@...>, Eric Whitney <eric.whitney@...>
Date: Saturday, February 9, 2008 - 2:11 pm

Hi Lee-san

Looks good to me.
I'll test it early next week and report in a separate mail.

--

To: KOSAKI Motohiro <kosaki.motohiro@...>, <linux-kernel@...>, Andrew Morton <akpm@...>
Cc: Christoph Lameter <clameter@...>, Paul Jackson <pj@...>, David Rientjes <rientjes@...>, Mel Gorman <mel@...>, <torvalds@...>, Eric Whitney <eric.whitney@...>
Date: Wednesday, February 6, 2008 - 1:38 pm

I've updated the patch to restore some error checking that my previous
version and the memoryless-nodes series lost.

Again, tested with "numactl --interleave=all" and memtoy on ia64 using
mem= command line argument to simulate memoryless node.

Lee
----------------------------------
[PATCH] 2.6.24-mm1 - mempolicy: silently restrict to allowed nodes

V1 -> V2:
+ Communicate whether or not incoming node mask was empty to
mpol_check_policy() for better error checking.
+ As suggested by David Rientjes, remove the now unused
cpuset_nodes_subset_current_mems_allowed() from cpuset.h
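
The first V1 -> V2 item can be pictured with a small user-space model. This is only a sketch, not the kernel code: the nodemask is a plain 32-bit integer, `check_policy_model()` and the `MODE_*` names are hypothetical, and the per-mode rules are assumptions chosen to show why the check needs to know both whether the incoming mask was empty and whether it is empty after restriction.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's policy modes. */
enum mode_model { MODE_DEFAULT, MODE_PREFERRED, MODE_BIND, MODE_INTERLEAVE };

/*
 * Restrict *nodes to `allowed`, then validate using both the original
 * and the restricted emptiness. The per-mode rules are assumptions made
 * for illustration only.
 */
static int check_policy_model(enum mode_model mode, uint32_t *nodes,
                              uint32_t allowed)
{
    int was_empty = (*nodes == 0);   /* what the caller passed in */
    int is_empty;

    if (!was_empty)
        *nodes &= allowed;           /* silently drop disallowed nodes */
    is_empty = (*nodes == 0);        /* what is left afterwards */

    switch (mode) {
    case MODE_DEFAULT:
        /* assumed: the default policy takes no nodes at all */
        return was_empty ? 0 : -EINVAL;
    case MODE_BIND:
    case MODE_INTERLEAVE:
        /* assumed: at least one usable node must remain */
        return is_empty ? -EINVAL : 0;
    case MODE_PREFERRED:
        /* assumed: empty is fine, but nodes that all vanished are an error */
        return (!was_empty && is_empty) ? -EINVAL : 0;
    }
    return -EINVAL;
}
```

Without `was_empty`, a caller who deliberately passed no nodes could not be told apart from a caller whose nodes were all masked off.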

Kosaki Motohiro noted that "numactl --interleave=all ..." failed in the
presence of memoryless nodes. This patch attempts to fix that problem.

Some background:

numactl --interleave=all calls set_mempolicy(2) with a fully
populated [out to MAXNUMNODES] nodemask. set_mempolicy()
[in do_set_mempolicy()] calls contextualize_policy() which
requires that the nodemask be a subset of the current task's
mems_allowed; else EINVAL will be returned. A task's
mems_allowed will always be a subset of node_states[N_HIGH_MEMORY]--
i.e., nodes with memory. So, a fully populated nodemask will
be declared invalid if it includes memoryless nodes.

NOTE: the same thing will occur when running in a cpuset
with restricted mems_allowed--for the same reason:
node mask contains dis-allowed nodes.

mbind(2), on the other hand, just masks off any nodes in the
nodemask that are not included in the caller's mems_allowed.

In each case [mbind() and set_mempolicy()], mpol_check_policy()
will complain [again, resulting in EINVAL] if the nodemask contains
any memoryless nodes. This is somewhat redundant as mpol_new()
will remove memoryless nodes for interleave policy, as will
bind_zonelist()--called by mpol_new() for BIND policy.

Proposed fix:

1) modify contextualize_policy to:
a) remember whether the incoming node mask is empty.
b) if not, restrict the nodemask to allowed nod...

To: Lee Schermerhorn <Lee.Schermerhorn@...>
Cc: <kosaki.motohiro@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Christoph Lameter <clameter@...>, Paul Jackson <pj@...>, David Rientjes <rientjes@...>, Mel Gorman <mel@...>, <torvalds@...>, Eric Whitney <eric.whitney@...>
Date: Thursday, February 7, 2008 - 4:31 am

Hi Lee-san

Unfortunately, 2.6.24-mm1 can't boot on my Fujitsu machine.
(hmm, origin.patch causes a regression in PCI initialization ;-)

Instead, I tested 2.6.24 + your patch.
It seems to work well :)

Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

The was_empty argument is a bit ugly.
Could we unify mpol_check_policy and contextualize_policy?

Could we also check against N_POSSIBLE?

I attached a patch to explain my idea.
In my test environment, both your patch and mine work well.

- kosaki

---
mm/mempolicy.c | 47 +++++++++++++++++++++--------------------------
1 file changed, 21 insertions(+), 26 deletions(-)

Index: b/mm/mempolicy.c
===================================================================
--- a/mm/mempolicy.c 2008-02-07 17:19:09.000000000 +0900
+++ b/mm/mempolicy.c 2008-02-07 17:24:28.000000000 +0900
@@ -114,9 +114,25 @@ static void mpol_rebind_policy(struct me
const nodemask_t *newmask);

/* Do sanity checking on a policy */
-static int mpol_check_policy(int mode, nodemask_t *nodes, int was_empty)
+static int mpol_check_policy(int mode, nodemask_t *nodes)
{
- int is_empty = nodes_empty(*nodes);
+ int was_empty;
+ int is_empty;
+
+ if (!nodes)
+ return 0;
+
+ /*
+ * Remember whether the incoming nodemask was empty. If not,
+ * restrict the nodes to the allowed nodes in the cpuset.
+ * This is guaranteed to be a subset of nodes with memory.
+ */
+ cpuset_update_task_memory_state();
+ was_empty = nodes_empty(*nodes);
+ if (!was_empty)
+ nodes_and(*nodes, *nodes, cpuset_current_mems_allowed);
+
+ is_empty = nodes_empty(*nodes);

switch (mode) {
case MPOL_DEFAULT:
@@ -144,7 +160,7 @@ static int mpol_check_policy(int mode, n
return -EINVAL;
break;
}
- return 0;
+ return nodes_subset(*nodes, node_states[N_POSSIBLE]) ? 0 : -EINVAL;
}

/* Generate a custom zonelist for the BIND policy. */
@@ -432,27 +448,6 @@ static int mbind_range(struct vm_area_st
return err;
}

-static...

To: KOSAKI Motohiro <kosaki.motohiro@...>, <linux-kernel@...>, Andrew Morton <akpm@...>
Cc: Christoph Lameter <clameter@...>, Paul Jackson <pj@...>, David Rientjes <rientjes@...>, Mel Gorman <mel@...>, <torvalds@...>, Eric Whitney <eric.whitney@...>
Date: Tuesday, February 5, 2008 - 5:57 pm

Here's a patch that addresses the problem w/o requiring change to
numactl or libnuma. It DOES have side effects, discussed in the
description.

Tested with memoryless nodes and restricted cpusets using the numactl
installed with RHEL5.1.

Although nominally against 24-mm1, it applies cleanly to 2.6.24. Should be
suitable for 'stable' if everyone agrees.

Lee
----------------------------------

[PATCH] 2.6.24-mm1 - mempolicy: silently restrict to allowed nodes

Kosaki-san noted that "numactl --interleave=all ..." failed in the
presence of memoryless nodes. This patch attempts to fix that
problem.

Some background:

numactl --interleave=all calls set_mempolicy(2) with a fully
populated [out to MAXNUMNODES] nodemask. set_mempolicy()
[in do_set_mempolicy()] calls contextualize_policy() which
requires that the nodemask be a subset of the current task's
mems_allowed; else EINVAL will be returned. A task's
mems_allowed will always be a subset of node_states[N_HIGH_MEMORY]--
i.e., nodes with memory. So, a fully populated nodemask will
be declared invalid if it includes memoryless nodes.

NOTE: the same thing will occur when running in a cpuset
with restricted mems_allowed--for the same reason:
node mask contains dis-allowed nodes.

mbind(2), on the other hand, just masks off any nodes in the
nodemask that are not included in the caller's mems_allowed.

In each case [mbind() and set_mempolicy()], mpol_check_policy()
will complain [again, resulting in EINVAL] if the nodemask contains
any memoryless nodes. This is somewhat redundant as mpol_new()
will remove memoryless nodes for interleave policy, as will
bind_zonelist()--called by mpol_new() for BIND policy.
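
A minimal user-space sketch of the difference described above, assuming a 16-node machine on which only nodes 2 and 3 have memory (the layout from the original report). The nodemask is modelled as a plain 32-bit integer, and `strict_contextualize()` and `masking_contextualize()` are hypothetical names standing in for the set_mempolicy() and mbind() behaviours; this is not the kernel code.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical model: one bit per node, bit n set means node n is in the mask. */
typedef uint32_t nodemask_model_t;

/* Assumed layout: nodes 0-15 exist, only nodes 2 and 3 have memory. */
#define ALL_NODES    ((nodemask_model_t)0xFFFFu)
#define MEMS_ALLOWED ((nodemask_model_t)((1u << 2) | (1u << 3)))

/* set_mempolicy()-style check: reject any mask that is not a subset. */
static int strict_contextualize(const nodemask_model_t *nodes,
                                nodemask_model_t allowed)
{
    return (*nodes & ~allowed) ? -EINVAL : 0;
}

/* mbind()-style handling: silently drop nodes that are not allowed. */
static int masking_contextualize(nodemask_model_t *nodes,
                                 nodemask_model_t allowed)
{
    *nodes &= allowed;
    return 0;
}
```

With `ALL_NODES` as input, the strict variant returns -EINVAL because nodes 0, 1 and 4-15 are not allowed, while the masking variant succeeds and leaves only nodes 2 and 3 set.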

Proposed fix:

1) modify contextualize_policy to just remove the non-allowed
nodes, as is currently done in-line for mbind(). This
guarantees that the resulting mask includes only nodes with
memory.

NOTE: this is a [benign, IMO] change in behavior for
set_mempolic...

To: Lee Schermerhorn <Lee.Schermerhorn@...>
Cc: <kosaki.motohiro@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Christoph Lameter <clameter@...>, Paul Jackson <pj@...>, David Rientjes <rientjes@...>, Mel Gorman <mel@...>, <torvalds@...>, Eric Whitney <eric.whitney@...>
Date: Wednesday, February 6, 2008 - 2:49 am

Thank you!

But unfortunately, my machine physically broke today ;-)
I will test it tomorrow or later.

--

To: Lee Schermerhorn <Lee.Schermerhorn@...>
Cc: KOSAKI Motohiro <kosaki.motohiro@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Christoph Lameter <clameter@...>, Paul Jackson <pj@...>, Mel Gorman <mel@...>, <torvalds@...>, Eric Whitney <eric.whitney@...>
Date: Tuesday, February 5, 2008 - 10:17 pm

This change will be necessary when the nodemask passed from the syscall is

I would defer the intersection until later because contextualize_policy()
is called before mpol_new() so we have no struct mempolicy to save the
intent in. It doesn't matter for the sake of this change, I know, but you
could move this intersection to mpol_new() and give us an opportunity to
store the user's nodemask in the mempolicy with a one-line change and get
the same desired result.
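
One way to picture this suggestion is the sketch below. It is a user-space model only: `struct policy_model`, its field names, and both helpers are hypothetical, and the rebind step is an assumption about how a saved user nodemask might be used later, not something spelled out in this thread.

```c
#include <stdint.h>

/*
 * Hypothetical policy object: neither field name comes from the thread.
 * user_nodes keeps what the caller asked for; nodes is what is usable now.
 */
struct policy_model {
    uint32_t user_nodes;
    uint32_t nodes;
};

/* Intersect at construction time, but remember the caller's intent. */
static void policy_model_init(struct policy_model *pol, uint32_t requested,
                              uint32_t allowed)
{
    pol->user_nodes = requested;
    pol->nodes = requested & allowed;
}

/* Assumed use: when the allowed set changes, recompute from the saved intent. */
static void policy_model_rebind(struct policy_model *pol, uint32_t new_allowed)
{
    pol->nodes = pol->user_nodes & new_allowed;
}
```

If the intersection happened before the policy object existed, only the already-restricted mask would be available, and nodes that become allowed later could not be picked up again.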

You can now remove cpuset_nodes_subset_current_mems_allowed() from

Looks good, thanks for doing this.
--

To: David Rientjes <rientjes@...>
Cc: KOSAKI Motohiro <kosaki.motohiro@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Christoph Lameter <clameter@...>, Paul Jackson <pj@...>, Mel Gorman <mel@...>, <torvalds@...>, Eric Whitney <eric.whitney@...>
Date: Wednesday, February 6, 2008 - 12:11 pm

Hi, David:

I wanted to avoid a major restructuring of the code for this patch.
However, now that do_mbind() and do_set_mempolicy() both call
contextualize_policy() [which calls mpol_check_policy()] immediately
before calling mpol_new(), I agree we can push this "contextualization"
down there. I would like to defer this to another patch--perhaps as
part of Paul's rework of mempolicy and cpusets.

Note that there is another caller of mpol_new() --
mpol_shared_policy_init(). We'll need to decide whether that call needs
to be contextualized, as it constructs a policy from the tmpfs or
hugetlbfs superblock, as specified on the mount command [or kernel
command line?]. As this is a privileged operation, one could argue that

As I mentioned to Christoph, I'll post a new version that I think
handles the error conditions better.

Later,
Lee

--

To: Lee Schermerhorn <Lee.Schermerhorn@...>
Cc: <kosaki.motohiro@...>, <linux-kernel@...>, <akpm@...>, <clameter@...>, <rientjes@...>, <mel@...>, <torvalds@...>, <eric.whitney@...>
Date: Tuesday, February 5, 2008 - 6:15 pm

At first glance, I like it. Thanks.

The changes in the exact behaviour of set_mempolicy (and mbind?) seem
to me to be changes for the better -- subtle improvements in the
consistency of handling corner cases.

However I don't have code that depends all that elaborately on the
fine details of these system calls, so I'm easy. If others know of
an existing or likely usage pattern that this patch would break,
that would be interesting input.

Thanks, Lee.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.940.382.4214
--

To: Lee Schermerhorn <Lee.Schermerhorn@...>
Cc: KOSAKI Motohiro <kosaki.motohiro@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Paul Jackson <pj@...>, David Rientjes <rientjes@...>, Mel Gorman <mel@...>, <torvalds@...>, Eric Whitney <eric.whitney@...>
Date: Tuesday, February 5, 2008 - 6:12 pm

Would just removing #ifdef CONFIG_CPUSETS work? mems_allowed falls back to
node_possible_map.... Shouldn't that be node_online_map?

--

To: Christoph Lameter <clameter@...>
Cc: KOSAKI Motohiro <kosaki.motohiro@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Paul Jackson <pj@...>, David Rientjes <rientjes@...>, Mel Gorman <mel@...>, <torvalds@...>, Eric Whitney <eric.whitney@...>
Date: Wednesday, February 6, 2008 - 12:00 pm

Only for mbind(). set_mempolicy(), via contextualize_policy() was just
returning EINVAL for invalid nodes in the mask. I don't know if it
always worked like this, or if we did this in the memoryless nodes

the nodes_subset() would always return true, once we mask it with
cpuset_current_mems_allowed(), right? mems_allowed can now only contain
nodes with memory and if cpusets are not configured,
cpuset_current_mems_allowed() just returns node_states[N_HIGH_MEMORY].
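
The claim that the subset test cannot fail after masking can be checked exhaustively on a small model. The sketch below uses 8-bit masks held in plain integers; `subset_model()` and `count_subset_violations()` are hypothetical helpers, not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when every bit set in `a` is also set in `b`. */
static bool subset_model(uint32_t a, uint32_t b)
{
    return (a & ~b) == 0;
}

/* Count 8-bit (mask, allowed) pairs where masking breaks the subset test. */
static unsigned count_subset_violations(void)
{
    unsigned violations = 0;

    for (uint32_t allowed = 0; allowed < 256; allowed++)
        for (uint32_t mask = 0; mask < 256; mask++)
            if (!subset_model(mask & allowed, allowed))
                violations++;
    return violations;
}
```

Since `(mask & allowed) & ~allowed` is always zero, the count is zero for every pair, which is why a subset check placed after the masking step is redundant.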

Again, with the change to contextualize_policy(), the nodemask is

This is the main change in the patch: masking off the invalid nodes
[like sys_mbind() did inline] rather than complaining about them.

However, after I finish testing, I will post an update to this patch

I removed this because I changed do_mbind() to call the revised
contextualize_policy() that does exactly this masking. I didn't see any
reason to leave the duplicate code there.

I think that mems_allowed now falls back to nodes with memory. Or it
should in the current code. When Paul adds his new magic, that might
change.

--

To: KOSAKI Motohiro <kosaki.motohiro@...>
Cc: Andi Kleen <andi@...>, <linux-mm@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, Christoph Lameter <clameter@...>, <Lee.Schermerhorn@...>
Date: Saturday, February 2, 2008 - 7:30 am

It's far simpler and cheaper (sysfs is expensive) to do this in the kernel
and besides, the kernel can more easily keep up with dynamic topology

To be honest I've never tried seriously to make 32bit NUMA policy
(with highmem) work well; just kept it at a "should not break"
level. That is because with highmem the kernel's choices at
placing memory are seriously limited anyways so I doubt 32bit
NUMA will ever work very well.

-Andi

--

To: Andi Kleen <andi@...>
Cc: KOSAKI Motohiro <kosaki.motohiro@...>, <linux-mm@...>, <linux-kernel@...>, Andrew Morton <akpm@...>, <Lee.Schermerhorn@...>
Date: Monday, February 4, 2008 - 3:03 pm

Memory policies do not work reliably with config highmem (I have never
seen such usage because large memory systems are typically 64 bit
which have no highmem, but there are some 32bit numa uses of HIGHMEM) ....

Memory policies are only applied to the highest zone. So if a system has
highmem on some nodes and not on the others then policies will only be
applied if allocations happen to occur on the highmem nodes.
--
