Re: [PATCH 0/2] NR_CPUS: increase maximum NR_CPUS to 4096

Previous thread: [PATCH 12/12] cpu/node mask: reduce stack usage using MASK_NONE, MASK_ALL by Mike Travis on Tuesday, March 25, 2008 - 9:38 pm. (1 message)

Next thread: [PATCH 2/2] x86: Modify Kconfig to allow up to 4096 cpus by Mike Travis on Tuesday, March 25, 2008 - 9:41 pm. (7 messages)
To: Andrew Morton <akpm@...>
Cc: Ingo Molnar <mingo@...>, <linux-mm@...>, <linux-kernel@...>
Date: Tuesday, March 25, 2008 - 9:41 pm

Increases the limit of NR_CPUS to 4096 and introduces a
boolean called "MAXSMP" which, when set (e.g. by "allyesconfig"),
sets NR_CPUS = 4096 and NODES_SHIFT = 9 (512 nodes).
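
As a rough illustration of what these two settings imply (a sketch, not code
from the patch): a shift of 9 gives 1 << 9 = 512 nodes, and a bitmap with one
bit per CPU needs 512 bytes at 4096 CPUs. The helper names and the 64-bit word
size are assumptions made for the example.

```python
BITS_PER_LONG = 64  # assumption: 64-bit build


def max_nodes(nodes_shift):
    """Number of nodes addressable with the given shift (hypothetical helper)."""
    return 1 << nodes_shift


def bitmap_bytes(nbits, bits_per_long=BITS_PER_LONG):
    """Bytes needed for a bitmap of nbits, rounded up to whole longs."""
    longs = (nbits + bits_per_long - 1) // bits_per_long
    return longs * (bits_per_long // 8)


print("nodes at shift 6:", max_nodes(6))
print("nodes at shift 9:", max_nodes(9))
print("mask bytes at NR_CPUS=255:", bitmap_bytes(255))
print("mask bytes at NR_CPUS=4096:", bitmap_bytes(4096))
```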

I've been running this config (4k NR_CPUS, 512 max nodes)
on an AMD box with two dual-core processors and 4 GB of memory. I've also
successfully booted it in a simulated 2-CPU/1 GB environment.

Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git

Signed-off-by: Mike Travis <travis@sgi.com>
---

Memory usage effects from upping NR_CPUS to 4096 and MAX_NUMNODES to 512.

255-akpm2: akpm2 config with NR_CPUS=255 / NODES_SHIFT=6
4k-akpm2: akpm2 config with NR_CPUS=4096 / NODES_SHIFT=9

====== Data (-l 1000)
1 - 255-akpm2
2 - 4k-akpm2

     .1.        .2.  ..final..
 1114112   +3899392    5013504    +350%  irq_desc(.data.cacheline_aligned)
  313344   +4177920    4491264   +1333%  irq_cfg(.data.read_mostly)
   76800    +537600     614400    +700%  early_node_map(.init.data)
   32640    +491648     524288   +1506%  boot_pageset(.bss)
   32640    +491648     524288   +1506%  boot_cpu_pda(.data.cacheline_aligned)
   23040    +161280     184320    +700%  initkmem_list3(.init.data)
    5632     +39424      45056    +700%  node_devices(.bss)
    4096     +28672      32768    +700%  plat_node_bdata(.bss)
    2656     +34312      36968   +1291%  cache_cache(.data)
    2048     +14336      16384    +700%  rio_devs(.init.data)
    2048    +260096     262144  +12700%  node_to_cpumask_map(.data.read_mostly)
    2040     +30728      32768   +1506%  centrino_model(.bss)
    2040     +30728      32768   +1506%  centrino_cpu(.bss)
    2040     +30728      32768   +1506%  _cpu_pda(.data.read_mostly)
    1024      +1024       2048    +100%  pxm_to_node_map(.data)
    1024      +7168       8192    +700%  nodes_add(.bss)
    1024      +7168       8192    +700%  nodes(.init.data)
    1024      +7168       8192    +700%  hugepage_freelists(.bss)
    1020     +15364      16384   +1506%  x86_cpu_to_node_map_init(.data)...
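
The delta and percentage columns above can be reproduced from the first and
last columns. The sketch below is only an illustration: the `growth` helper is
a made-up name, and the truncating integer division is an assumption that
happens to match the rows shown. The last two checks note that the
node_to_cpumask_map sizes are consistent with 64 masks of 32 bytes growing to
512 masks of 512 bytes, which is an inference from the numbers rather than
something stated in the thread.

```python
def growth(before, after):
    """Return (delta in bytes, percent increase truncated to an integer)."""
    delta = after - before
    return delta, delta * 100 // before


# (size in 255-akpm2, size in 4k-akpm2), copied from the table above
rows = {
    "irq_desc": (1114112, 5013504),
    "irq_cfg": (313344, 4491264),
    "boot_pageset": (32640, 524288),
    "node_to_cpumask_map": (2048, 262144),
}

for name, (before, after) in rows.items():
    delta, pct = growth(before, after)
    print(f"{before:>8} {delta:>+9} {after:>8} {pct:>+6}% {name}")

# Inference, not from the thread: nodes x bytes-per-mask in each config.
print(64 * 32 == 2048, 512 * 512 == 262144)
```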

To: Mike Travis <travis@...>
Cc: Andrew Morton <akpm@...>, <linux-mm@...>, <linux-kernel@...>
Date: Wednesday, March 26, 2008 - 2:19 am

cool!

this depends on the cpumask changes to work correctly (i.e. to boot at
all), right?

Ingo
--

To: Ingo Molnar <mingo@...>
Cc: Andrew Morton <akpm@...>, <linux-mm@...>, <linux-kernel@...>
Date: Wednesday, March 26, 2008 - 11:59 am

Yes, it overflows the stack quite quickly without the cpumask changes.
I didn't do any testing to see what's the minimal set of changes.

Thanks,
Mike
--
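
To give a feel for the stack overflow mentioned above, here is a
back-of-the-envelope sketch of how many full CPU masks fit on a kernel stack
when each mask is held by value. The 8 KiB stack size and the 64-bit word size
are assumptions for illustration, and the function names are hypothetical;
real stack usage depends on the call chain.

```python
def cpumask_bytes(nr_cpus, bits_per_long=64):
    """Bytes for one bit per CPU, rounded up to whole longs (64-bit assumed)."""
    longs = -(-nr_cpus // bits_per_long)  # ceiling division
    return longs * (bits_per_long // 8)


def masks_per_stack(nr_cpus, stack_bytes=8192):
    """How many by-value masks fit in a stack of stack_bytes (assumed 8 KiB)."""
    return stack_bytes // cpumask_bytes(nr_cpus)


for nr_cpus in (255, 4096):
    print(nr_cpus, cpumask_bytes(nr_cpus), masks_per_stack(nr_cpus))
```

Under these assumptions, a handful of nested functions that each keep one or
two masks as local variables would consume a large share of the stack at
NR_CPUS=4096, whereas the same code is harmless at NR_CPUS=255.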
