actually, it's not that remote: it happens whenever
NR_CPUS > num_possible_cpus(). i ran into this myself
on a dual-core box with NR_CPUS=4. due to my rewrite
of the i386 per-cpu segment handling, i actually got
a NULL deref where the vanilla kernel would be accessing
the area [__per_cpu_start, __per_cpu_end] for each
non-possible CPU (which doesn't crash per se, but still
isn't correct, i think).
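
to illustrate, here is a rough userspace model of the situation, not
kernel code. the names possible_mask, per_cpu_area and the two count_*
helpers are made up for this sketch; the only assumption about the real
kernel is that walking 0..NR_CPUS-1 visits slots that a
for_each_possible_cpu()-style walk would skip.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4

/* hypothetical stand-in for the possible-cpu mask: bits 0 and 1 set,
 * i.e. a dual-core box built with NR_CPUS=4 */
static const unsigned long possible_mask = 0x3;

/* hypothetical stand-in for the per-cpu areas; in this model the
 * non-possible CPUs never get an area assigned */
static int area0, area1;
static int *per_cpu_area[NR_CPUS] = { &area0, &area1, NULL, NULL };

static int cpu_is_possible(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return (int)((possible_mask >> cpu) & 1UL);
}

/* naive walk over every slot up to NR_CPUS: counts the slots that
 * would be dereferenced without having an area behind them */
static int count_bad_accesses_naive(void)
{
	int cpu, bad = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (per_cpu_area[cpu] == NULL)
			bad++;
	return bad;
}

/* same walk, restricted to possible CPUs */
static int count_bad_accesses_possible(void)
{
	int cpu, bad = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!cpu_is_possible(cpu))
			continue;
		if (per_cpu_area[cpu] == NULL)
			bad++;
	}
	return bad;
}
```

in this model the naive walk touches two slots with no area (CPUs 2
and 3), while the walk restricted to possible CPUs touches none.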