There is no point in making absolute demands like "no limits". There are
always limits to everything.
A new implementation avoids the need to allocate per cpu arrays and also
avoids the 32 bytes per object times cpus that are mostly wasted for small
allocations today. So it is potentially going to allow more per cpu objects
than are available today.
A reasonable implementation for 64 bit is likely going to depend on
reserving some virtual memory space for the per cpu mappings so that they
can be dynamically grown up to what the reserved virtual space allows.
For example, if we reserve 256G of virtual space and support a maximum of
16k cpus then the per cpu space available is limited to 16MB for each cpu
(256G / 16k).
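
To make the arithmetic explicit, here is a minimal sketch in C. It assumes
the reserved range is simply divided evenly among the maximum number of
cpus. The helper name percpu_space_limit() is hypothetical and only for
illustration; it is not an existing kernel function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel API: the per cpu share of a reserved
 * virtual address range, assuming the range is split evenly across the
 * maximum number of supported cpus.
 */
static uint64_t percpu_space_limit(uint64_t reserved_bytes, uint64_t max_cpus)
{
	assert(max_cpus > 0);	/* avoid dividing by zero */
	return reserved_bytes / max_cpus;
}
```

Calling percpu_space_limit(256ULL << 30, 16ULL << 10) gives 16ULL << 20,
i.e. 2^38 / 2^14 = 2^24 bytes. Reserving more virtual space or supporting
fewer cpus raises the limit proportionally.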