OK, I have repeated the benchmarking in two additional cases:
1) NetBSD with 8 CPUs and some kind of experimental kernel that Andrew
gave me (based on the vmlocking branch). This is using the new scheduler.
2) As above, but with an experimental libc and libpthread, also given to
me by Andrew. I don't know what changes these contain either :)
I was only able to run in the 8-CPU configuration, because when I tried
to disable CPUs with cpuctl, processes would hang under load. This is
probably a scheduler issue.
http://people.freebsd.org/~kris/scaling/netbsd.png
This shows some improvement, but not much, relatively speaking. In
particular, performance at 4 threads is still significantly below
FreeBSD's, which (given what I measured previously) suggests that there
is still a performance deficit with 4 CPUs on NetBSD. It would be nice
to be able to test this directly, though; maybe Andrew can give me a
kernel that has MAXCPU=4, or whatever the NetBSD equivalent is.
Kris