The spinlock array idea also doesn't work in userspace, because
any signal handler that tries to do an atomic operation on the
same object will deadlock on the spinlock: the interrupted thread
still holds it and can't release it until the handler returns.
You'll have to do this entirely in the kernel, and your
FUTEX implementation will have to always make the futex()
system call.
It's the only way to do this and have it work completely.
We have to do something similar on sparc32, and the signal handler
deadlocks are very real. Many glibc testcases that use threads
and pthread mutexes will deadlock because of this very issue if
you try to do the spinlock trick in userspace.
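
To make the failure mode concrete, here is a rough sketch of the
userspace spinlock trick and how a signal handler collides with it.
Everything in it is made up for illustration (NLOCKS, lock_for(),
emulated_fetch_add(), the hash on the address); it is not glibc's
or any port's actual code. The handler only probes the lock with a
single test-and-set instead of spinning, so the demo reports the
deadlock rather than hanging. SIGINT is used only because it is in
ISO C.

```c
#include <assert.h>
#include <signal.h>
#include <stdatomic.h>
#include <stdint.h>

#define NLOCKS 16 /* hypothetical table size */

/* One spinlock per hash bucket; an object's address selects its lock. */
static atomic_flag locks[NLOCKS];

static atomic_flag *lock_for(const volatile void *addr)
{
    return &locks[((uintptr_t)addr >> 4) % NLOCKS];
}

/* Emulated atomic fetch-and-add for a CPU without a native one. */
static int emulated_fetch_add(volatile int *p, int v)
{
    atomic_flag *l = lock_for(p);
    int old;

    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ; /* spin until the bucket lock is free */
    old = *p;
    *p = old + v;
    atomic_flag_clear_explicit(l, memory_order_release);
    return old;
}

static volatile int counter;
static volatile sig_atomic_t handler_saw_lock_held;

/*
 * A real handler would just call emulated_fetch_add(&counter, 1) and
 * spin forever.  This one tries the lock once and records the result.
 */
static void handler(int sig)
{
    atomic_flag *l = lock_for(&counter);

    (void)sig;
    if (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        handler_saw_lock_held = 1; /* owner is the thread we interrupted */
    else
        atomic_flag_clear_explicit(l, memory_order_release);
}

/* Returns 1 if the handler found the lock held by the interrupted code. */
static int demo_handler_would_deadlock(void)
{
    atomic_flag *l = lock_for(&counter);
    int i;

    for (i = 0; i < NLOCKS; i++)
        atomic_flag_clear(&locks[i]);
    handler_saw_lock_held = 0;
    signal(SIGINT, handler);

    emulated_fetch_add(&counter, 1); /* normal path: works fine */

    /* Now take the lock by hand and get "interrupted" while holding it. */
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;
    raise(SIGINT); /* handler runs here, inside the critical section */
    counter += 1;
    atomic_flag_clear_explicit(l, memory_order_release);

    signal(SIGINT, SIG_DFL);
    return handler_saw_lock_held;
}
```

If the handler called emulated_fetch_add() instead of probing, it
would spin on a lock that can only be released after the handler
returns. Doing the operation in the kernel avoids this.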
--