Believe it or not, 64 might not be enough. The 8-core Nehalem (16 HTs)
has two QPI links. In theory, you could put together a node with 4
cpu sockets and 2 of the new I/O interfaces on a single board. That's
64 cpus (4 sockets x 16 HTs)
and 4 PCIe busses (plus all the legacy stuff). The Intel microarch
could very well support 8 cores in the next gen processors.
Btw, I meant the above to be a struct so node and bitmap are both
present. This causes a contiguous subset of cpu ids to be in the
bitmask. Of course, this would rely on the cpus being "discovered"
in topology order, possibly with holes (not clear if that's really
necessary.)
So on a system with 8 nodes of 32 processors each, node 2's cpus would be
64..95 and the nodecpumask would be { 2, 0xffffffff00000000 } (assuming
max cpus per node == 64.)
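
To make the struct idea a bit more concrete, here is a minimal sketch in
plain C (userspace, not kernel code). The names nodecpumask_range() and
nodecpumask_test() are hypothetical, and so is the bit numbering: the
sketch assumes bit i (least significant first) stands for cpu id
first_cpu + i. How a node's first cpu id is derived is not settled above,
so it is passed in as a parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_CPUS_PER_NODE 64	/* width of the per-node bitmap */

/* Hypothetical layout: a node id plus a bitmap of that node's cpus. */
struct nodecpumask {
	unsigned int node;	/* node the bitmap belongs to */
	uint64_t bits;		/* assumed: bit i = cpu (first_cpu + i) */
};

/* Build a mask with `count` contiguous bits set, starting at `first_bit`. */
static struct nodecpumask nodecpumask_range(unsigned int node,
					    unsigned int first_bit,
					    unsigned int count)
{
	struct nodecpumask m = { node, 0 };

	assert(first_bit < MAX_CPUS_PER_NODE);
	assert(count <= MAX_CPUS_PER_NODE - first_bit);

	if (count == MAX_CPUS_PER_NODE)
		m.bits = ~UINT64_C(0);	/* a 64-bit shift by 64 is undefined */
	else
		m.bits = ((UINT64_C(1) << count) - 1) << first_bit;
	return m;
}

/* Test a global cpu id, given the first cpu id of the mask's node. */
static bool nodecpumask_test(const struct nodecpumask *m,
			     unsigned int first_cpu, unsigned int cpu)
{
	if (cpu < first_cpu || cpu - first_cpu >= MAX_CPUS_PER_NODE)
		return false;	/* cpu lies outside this node's window */
	return (m->bits >> (cpu - first_cpu)) & 1;
}
```

With this numbering, nodecpumask_range(2, 0, 32) sets bit positions 0..31
and nodecpumask_range(2, 32, 32) gives { 2, 0xffffffff00000000 }, i.e.
bit positions 32..63. Which of those a given cpu id lands on depends
entirely on the first_cpu convention chosen.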
Another angle thrown around was using a 128 bit cpu mask struct,
with some number of upper bits defining the remainder, which could be a
bit mask field, a pointer to a bitmask, a bitmask subset (as above),
etc. Then all the cpus_* ops would be modified to accept the alternate
types of cpu mask sets, compiling out (optimizing) those not present on
a particular arch.
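
A rough sketch of what such a tagged 128 bit mask might look like, again
in plain C and only as an illustration. Everything here is an assumption:
the 2-bit tag in the top of the upper word, the three kinds, the field
packing and all the cpumask128_* names are made up for this example and
are not an existing kernel interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical kinds, stored in the top 2 bits of the upper word. */
enum cpumask_kind { CPUMASK_INLINE, CPUMASK_POINTER, CPUMASK_SUBSET };

#define KIND_SHIFT	62
#define PAYLOAD_MASK	((UINT64_C(1) << KIND_SHIFT) - 1)

struct cpumask128 {
	uint64_t hi;	/* tag in bits 62..63, payload in bits 0..61 */
	uint64_t lo;	/* bitmap word, or a pointer to an external bitmap */
};

static unsigned int cpumask128_kind(const struct cpumask128 *m)
{
	return (unsigned int)(m->hi >> KIND_SHIFT);
}

/* Subset kind: hi payload = first cpu id, lo = 64-bit window bitmap. */
static struct cpumask128 cpumask128_subset(uint64_t first_cpu, uint64_t bits)
{
	struct cpumask128 m;

	m.hi = ((uint64_t)CPUMASK_SUBSET << KIND_SHIFT) |
	       (first_cpu & PAYLOAD_MASK);
	m.lo = bits;
	return m;
}

/* Pointer kind: hi payload = number of words, lo = address of the words. */
static struct cpumask128 cpumask128_pointer(const uint64_t *words,
					    uint64_t nwords)
{
	struct cpumask128 m;

	m.hi = ((uint64_t)CPUMASK_POINTER << KIND_SHIFT) |
	       (nwords & PAYLOAD_MASK);
	m.lo = (uint64_t)(uintptr_t)words;
	return m;
}

/* One test routine that accepts any of the alternate representations. */
static bool cpumask128_test(const struct cpumask128 *m, unsigned int cpu)
{
	uint64_t payload = m->hi & PAYLOAD_MASK;

	switch (cpumask128_kind(m)) {
	case CPUMASK_INLINE:	/* cpus 0..63 in lo, 64..125 in hi payload */
		if (cpu < 64)
			return (m->lo >> cpu) & 1;
		if (cpu < 64 + KIND_SHIFT)
			return (payload >> (cpu - 64)) & 1;
		return false;
	case CPUMASK_POINTER: {
		const uint64_t *words = (const uint64_t *)(uintptr_t)m->lo;

		if (cpu / 64 >= payload)
			return false;
		return (words[cpu / 64] >> (cpu % 64)) & 1;
	}
	case CPUMASK_SUBSET:
		if (cpu < payload || cpu - payload >= 64)
			return false;
		return (m->lo >> (cpu - payload)) & 1;
	}
	return false;	/* unknown tag value */
}
```

An arch that never uses one of the kinds could drop the matching case
at build time, which is the "compiling out" part mentioned above.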
[One last point, we (SGI) are counting on _this_ release to have
NR_CPUS=4096 in the default distro config. Suffice it to say, some
of our customers will not accept "special" built kernels, but instead
require standard, certified, licensable kernels built by the distros.
(This is for the "Enterprise" editions; desktop distros of course probably
won't go as high.)]
Thanks,
Mike
--