> - no normal processor "read-modify-write" instructions that by design
Take a look at sparc 32bit. That only has a single meaningful atomic
instruction (swap byte with 0xFF). It provides all the kernel atomic_t
operations via this: arch/sparc/lib/atomic32.c. The bitops are done in a
similar way, which leaves spinlocks and the like.
More importantly, if your true locks in the FPGA are really fast in CPU
terms then you can think of every other atomic instruction as being
implemented using
	lock(cpu_atomic_instruction_lock)
	do bits
	unlock(cpu_atomic_instruction_lock)
(it's just that this is normally done in hardware/microcode)
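
To make that concrete, here is a rough userspace sketch in C11 of two
"atomic" operations built on nothing but one test-and-set lock. The
atomic_flag stands in for the single hardware primitive (ldstub on
sparc32, or your FPGA lock), and the emu_* names are made up for
illustration, they are not kernel APIs. A kernel version would also have
to keep interrupts out while the lock is held.

```c
#include <stdatomic.h>

/* Stand-in for the one real hardware primitive. Hypothetical name,
 * taken from the pseudocode above. */
static atomic_flag cpu_atomic_instruction_lock = ATOMIC_FLAG_INIT;

static void emu_lock(void)
{
	/* Spin until we are the caller that flipped the flag from clear to set. */
	while (atomic_flag_test_and_set_explicit(&cpu_atomic_instruction_lock,
						 memory_order_acquire))
		;
}

static void emu_unlock(void)
{
	atomic_flag_clear_explicit(&cpu_atomic_instruction_lock,
				   memory_order_release);
}

/* "Atomic" add: returns the new value. */
static int emu_atomic_add_return(int i, int *v)
{
	int ret;

	emu_lock();
	*v += i;		/* the "do bits" step */
	ret = *v;
	emu_unlock();
	return ret;
}

/* "Atomic" compare-and-exchange: returns the value seen before the call. */
static int emu_atomic_cmpxchg(int *v, int oldval, int newval)
{
	int ret;

	emu_lock();
	ret = *v;
	if (ret == oldval)
		*v = newval;
	emu_unlock();
	return ret;
}
```

Every read-modify-write ends up as plain loads and stores bracketed by
the lock, so the only thing the hardware has to get right is the lock
itself.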
Doing it per instruction might be a bit naïve, but I think you can
reasonably do it so that things like spinlocks use a single non-kernel
lock (or a hashed set of them) to implement "atomic" instructions, and
as sparc32 shows you only need a tiny subset of those instructions to
implement the rest in terms of them.
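
A hashed set could look something like the sketch below, again as
userspace C11 rather than kernel code. The table size, the address shift
and the emu_* names are arbitrary choices of mine for illustration. The
address of the word being operated on picks the lock, so unrelated words
mostly don't contend with each other.

```c
#include <stdatomic.h>
#include <stdint.h>

#define EMU_HASH_SIZE 4		/* power of two; arbitrary for this sketch */

/* One test-and-set lock per hash bucket (hypothetical names). */
static atomic_flag emu_hash_locks[EMU_HASH_SIZE] = {
	ATOMIC_FLAG_INIT, ATOMIC_FLAG_INIT, ATOMIC_FLAG_INIT, ATOMIC_FLAG_INIT
};

/* Map the address of the target word to a bucket index. */
static unsigned int emu_hash(const void *addr)
{
	return (unsigned int)(((uintptr_t)addr >> 8) & (EMU_HASH_SIZE - 1));
}

/* "Atomic" test-and-set of bit nr in *word: returns the old bit (0 or 1). */
static int emu_test_and_set_bit(unsigned int nr, unsigned long *word)
{
	atomic_flag *lock = &emu_hash_locks[emu_hash(word)];
	unsigned long mask = 1UL << nr;
	int old;

	while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
		;		/* spin on this bucket's lock only */
	old = (*word & mask) != 0;
	*word |= mask;
	atomic_flag_clear_explicit(lock, memory_order_release);
	return old;
}
```

The thing to watch is that every operation on a given word must hash to
the same lock, otherwise two "atomic" operations on that word can still
race.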
So I don't actually think you need any kernel core changes to get going,
and given the kernel dynamically allocates a lot of locks I suspect
trying to dynamically manage atomic RAM allocations is going to cost more
than executing a few instructions here and there under a single very fast
hardware assisted lock.
Alan
--