Andrew Doran wrote:

Well you have to be careful there, tcmalloc apparently defers frees, and is not really a general purpose malloc. The Linux performance problems are (were? I haven't tried recent kernels) real though.

I am somewhat surprised by this, because on FreeBSD it is really not spending much time in the kernel (only ~20% system time), so there does not seem to be much scope for a 10% performance difference. Also it took quite a lot of work to optimize locking of various kernel subsystems that are used by this workload, and until that point there was significant kernel lock contention which reduced performance by tens of percent. I would have expected this to matter on NetBSD - even with the vmlocking work there is still more to go.

I will try to reproduce this on my own hardware (see below).

Here is the initial run with CVS HEAD sources (I took out the obvious things from GENERIC.MP like I386_CPU support, etc, and removed the default datasize and stack size limits). Same benchmark config that Andrew is using, etc.

http://people.freebsd.org/~kris/scaling/netbsd.png

There are a couple of things to note:

* The drop-off above 8 threads on FreeBSD is due to non-scalability of mysql itself, i.e. it comes from pthread mutex contention in userland. This is the only relevant lock contention point in the FreeBSD kernel on this workload. There are some things we can do in libpthread to mitigate the performance loss in the over-contended pthread situation, but we haven't done them yet.

* The tail end of the graph is somewhat noisy, which is the reason for the jump at 19 threads (I only graphed a single run).
The distribution at 20 clients looks like:

+------------------------------------------------------------+
| x x |
|x x x xxx x x xx x x xxx x xx|
| |_______________A_M_____________| |
+------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  20       2326.01       2758.86       2586.47      2572.856     116.69937

Next, to try to reproduce Andrew's result, I disabled 4 CPUs (using cpuctl in NetBSD) and compared FreeBSD and NetBSD again. I didn't do a full graph yet, but the results are consistent with what I saw on 8 CPUs.

NetBSD:
  4 threads   1137.83 1135.49 1138.80 1138.06
  20 threads  1101.84 1068.56 1075.32 998.49

Note that these are lower but not too different from the NetBSD values when all 8 CPUs are in use.

FreeBSD:
  4 threads   1985.48 1997.13 1997.43
  20 threads  1813.02 1817.73 1824.59

The 4 thread performance is basically identical to the 8 CPU case, showing that the FreeBSD scaling graphed on 8 CPUs is the same as on 4 CPUs (but without the tail since mysql contention is now rate-limited), i.e. FreeBSD is continuing to scale linearly.

This measurement shows that FreeBSD is performing 70-80% better than NetBSD in this 4 CPU configuration. This is in contrast to Andrew's findings, which seem to show NetBSD performing 10% better than FreeBSD on a 4 CPU system (a very old one though).

I will try later with the experimental kernel Andrew sent me (which includes the new scheduler). If it indeed gives a 100% performance improvement that would be a significant result :-)

Kris
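
P.S. For anyone who wants to check the 70-80% figure, here is a minimal sketch that averages the 4-CPU runs listed above and compares the two systems. The numbers are copied from this mail; the `runs` table and the `advantage_percent` helper are just names made up for the illustration, and a plain mean over three or four runs is a rough comparison, not a significance test.

```python
from statistics import mean

# Benchmark results from the 4-CPU runs quoted in this mail,
# keyed by (system, number of client threads).
runs = {
    ("NetBSD", 4): [1137.83, 1135.49, 1138.80, 1138.06],
    ("NetBSD", 20): [1101.84, 1068.56, 1075.32, 998.49],
    ("FreeBSD", 4): [1985.48, 1997.13, 1997.43],
    ("FreeBSD", 20): [1813.02, 1817.73, 1824.59],
}


def advantage_percent(threads):
    """Percentage by which the mean FreeBSD result exceeds the mean NetBSD result."""
    freebsd = mean(runs[("FreeBSD", threads)])
    netbsd = mean(runs[("NetBSD", threads)])
    return (freebsd / netbsd - 1.0) * 100.0


for threads in (4, 20):
    print(f"{threads} threads: FreeBSD ahead by {advantage_percent(threads):.1f}%")
```

Both thread counts land inside the 70-80% range quoted above.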
