Indeed, that is a worst-case scenario caused by the finer-grained
locking. The flip side is that fast concurrent freeing of objects from
two processors will perform better in SLUB, since there is significantly
less global lock contention and less work spent expiring objects and
moving them around. (Once you hit the queue limits, SLAB merges objects
into slabs synchronously; it is then no longer able to hide the object
handling overhead in cache_reap().)
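
To illustrate the queue-limit point, here is a rough toy model, not
kernel code. It assumes a bounded per-CPU free queue with a `limit` and
a `batchcount`, and a periodic `reap()` standing in for cache_reap();
the class and function names are made up for this sketch. It only counts
how often the free path itself has to flush.

```python
class ToyPerCpuQueue:
    """Toy model of a bounded per-CPU free queue (hypothetical, not kernel code)."""

    def __init__(self, limit, batchcount):
        self.limit = limit            # max objects the queue may hold
        self.batchcount = batchcount  # objects moved out per synchronous flush
        self.entries = []
        self.sync_flushes = 0         # flushes paid for inside the free path

    def _flush(self, n):
        # Hand the n oldest queued objects back to their slabs.
        del self.entries[:n]

    def free(self, obj):
        # Queue full: the free path must flush synchronously before queuing.
        if len(self.entries) >= self.limit:
            self._flush(self.batchcount)
            self.sync_flushes += 1
        self.entries.append(obj)

    def reap(self):
        # Deferred drain, standing in for the periodic reaper.
        self._flush(len(self.entries))


def simulate(n_frees, limit, batchcount, reap_every=None):
    """Free n_frees objects; optionally run the reaper every reap_every frees."""
    q = ToyPerCpuQueue(limit, batchcount)
    for i in range(n_frees):
        q.free(i)
        if reap_every and (i + 1) % reap_every == 0:
            q.reap()
    return q.sync_flushes
```

In this model, as long as the reaper runs before the queue fills,
`simulate()` reports no synchronous flushes; a burst of frees that
outruns it pays for a flush every `batchcount` frees once the limit has
been reached.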