local_t requires preemption to be disabled in order to work correctly.
The real solution here is cpu_alloc / cpu_ops. Per cpu operations work on
an offset relative to the start of the current processor's per cpu data
area, whose base address is held in some register. An increment can then
be atomic vs. interrupts because it does the calculation of the address
and the inc in one instruction. For example:
gs: inc [percpu_offset]
The processor may change before or after the instruction without ill
effects. So preemption does not need to be disabled, since the instruction
will always reference the %gs register, which points to the percpu area
of the currently executing processor.
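
To make the difference concrete, here is a rough userspace sketch in plain
C. It is only a model, not kernel code: struct percpu_area, cpu_base,
migrate_to(), two_step_address(), percpu_inc() and demo() are all
hypothetical names invented for illustration, and cpu_base merely stands
in for the segment register. The C function percpu_inc() is of course not
a single instruction; it only models the property that the base lookup
and the increment cannot be separated by a migration.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 2

/* Hypothetical per cpu data area; one copy exists per processor. */
struct percpu_area {
	long counter;
};

static struct percpu_area areas[NR_CPUS];

/* Stands in for the segment register: base of the per cpu area of the
 * processor we are currently "running" on. */
static struct percpu_area *cpu_base = &areas[0];

/* Simulate the scheduler moving the task to another processor. */
static void migrate_to(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpu_base = &areas[cpu];
}

/* Two-step pattern: the address is computed now, the increment comes
 * later. Without preemption disabled, a migration can happen in between. */
static long *two_step_address(void)
{
	return &cpu_base->counter;
}

/* Models "gs: inc [percpu_offset]": the base is looked up at the moment
 * of the increment, so the current processor's area is always the target. */
static void percpu_inc(size_t offset)
{
	long *p = (long *)((char *)cpu_base + offset);

	++*p;
}

/* Walk through one migration between address calculation and increment. */
static void demo(void)
{
	long *stale = two_step_address();	/* computed while on cpu 0 */

	migrate_to(1);				/* task moves to cpu 1 */
	++*stale;				/* hits cpu 0's area from cpu 1 */
	percpu_inc(offsetof(struct percpu_area, counter)); /* cpu 1's area */
}
```

After demo() the stale pointer has modified cpu 0's counter while running
on cpu 1, which is the case that otherwise forces preemption to be
disabled. The segment-relative style increment landed in cpu 1's area
without any such protection.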
--