While this might be of less interest after today's discussion, I
promised to share the results of a run with 8 threads with a wider
selection of lower duty-cycles. The results are very poor for adaptive,
and worse still for aas (multiple spinners), compared to normal FUTEX_LOCK. As
Thomas and Peter have pointed out, the implementation is sub-optimal.
Before abandoning this approach I will see if I can find the bottlenecks
and simplify the kernel side of things. My impression is that I am doing
a lot more work in the kernel, especially in the adaptive loop, than is
really necessary.

Both the 8- and 256-thread plots can be viewed here:
http://www.kernel.org/pub/linux/kernel/people/dvhart/adaptive_futex/v4/
--
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team