You need at least the zero basing to enable the use of the segment
register on x86_64.
The argument does not make any sense. First you want to use atomic_t, then
you do not?
I tried it and it did not give any benefit. It first failed due to bugs
because local_t did not disable preempt. This led to Andi fixing
local_t.
But with the preempt disabling I could not discern what the benefit
would be.
CPU_INC does not require disabling of preempt and the cpu alloc patches
shorten the code sequence to increment a VM counter significantly.
Here is the header from the patch. How would cpu_local_inc be able to do
better unless you adopt this patchset and add a shim layer?
Subject: VM statistics: Use CPU ops

The use of CPU ops here avoids the offset calculations that we used to
have to do with per cpu operations. The result of this patch is that event
counters are coded with a single instruction the following way:

	incq   %gs:offset(%rip)

Without these patches this was:

	mov    %gs:0x8,%rdx
	mov    %eax,0x38(%rsp)
	mov    xxx(%rip),%eax
	mov    %eax,0x48(%rsp)
	mov    varoffset,%rax
	incq   0x110(%rax,%rdx,1)
--
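To illustrate the difference between the two sequences above, here is a
minimal userspace sketch in C. It is an analogy only, not code from the
patchset: count_event_indexed, count_event_segment, NR_CPUS and struct
vm_event_state are names made up for illustration, and C11 thread-local
storage stands in for the per cpu area. On x86_64 Linux a compiler will
typically address a _Thread_local variable relative to %fs, so the second
function can compile down to a single segment-prefixed increment, while the
first has to compute the base of the cpu's area before it can increment.

```c
#include <assert.h>

#define NR_CPUS 4	/* hypothetical cpu count, for illustration only */

enum vm_event { PGFAULT, PGALLOC, NR_VM_EVENTS };

struct vm_event_state {
	unsigned long event[NR_VM_EVENTS];
};

/* Old style: one area per cpu, located by an explicit base calculation. */
static struct vm_event_state areas[NR_CPUS];

static void count_event_indexed(int cpu, enum vm_event e)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	struct vm_event_state *base = &areas[cpu];	/* base lookup */
	base->event[e]++;				/* offset + increment */
}

static unsigned long read_indexed(int cpu, enum vm_event e)
{
	return areas[cpu].event[e];
}

/*
 * Segment style (assumption: thread-local storage used as a stand-in for
 * the zero based per cpu area). The address is formed by the segment
 * register plus a constant offset, so no base lookup appears in the code.
 */
static _Thread_local struct vm_event_state local_state;

static void count_event_segment(enum vm_event e)
{
	local_state.event[e]++;
}

static unsigned long read_segment(enum vm_event e)
{
	return local_state.event[e];
}
```

Compiling this with gcc -O2 -S and comparing the bodies of the two count
functions should show the difference in addressing. The exact instructions
depend on compiler version and flags.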