Re: [RFC/PATCH] SLUB: dynamic per-cache MIN_PARTIAL

From: Pekka J Enberg
Date: Monday, August 4, 2008 - 2:39 pm

From: Pekka Enberg <penberg@cs.helsinki.fi>

This patch changes the static MIN_PARTIAL to a dynamic per-cache ->min_partial
value that is calculated from object size. The bigger the object size, the more
pages we keep on the partial list.
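The diff itself is not reproduced in this message, so the following is only a minimal user-space sketch of one way a per-cache value could be derived from object size. It assumes a floor-log2 of the size clamped between two bounds. The names `sketch_ilog2`, `sketch_min_partial`, `SKETCH_MIN_PARTIAL`, and `SKETCH_MAX_PARTIAL`, and the bounds 5 and 10, are hypothetical and chosen for illustration.

```c
#include <stddef.h>

/* Illustrative bounds, assumed for this sketch; not taken from the patch. */
#define SKETCH_MIN_PARTIAL 5
#define SKETCH_MAX_PARTIAL 10

/* Floor of log2(n) for n > 0; returns 0 for n == 0. */
static unsigned int sketch_ilog2(size_t n)
{
	unsigned int log = 0;

	while (n > 1) {
		n >>= 1;
		log++;
	}
	return log;
}

/*
 * Hypothetical helper: the larger the object, the more partial slabs
 * are kept, clamped to the assumed bounds above.
 */
static unsigned long sketch_min_partial(size_t object_size)
{
	unsigned long min = sketch_ilog2(object_size);

	if (min < SKETCH_MIN_PARTIAL)
		min = SKETCH_MIN_PARTIAL;
	else if (min > SKETCH_MAX_PARTIAL)
		min = SKETCH_MAX_PARTIAL;
	return min;
}
```

Under these assumptions a 64-byte object maps to 6 and a 4096-byte object saturates at the upper bound of 10, so caches with near-page-sized objects keep the most partial slabs.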

I tested SLAB, SLUB, and SLUB with this patch on Jens Axboe's 'netio' example
script of the fio benchmarking tool. The script stresses the networking
subsystem which should also give a fairly good beating of kmalloc() et al.

To run the test yourself, first clone the fio repository:

  git clone git://git.kernel.dk/fio.git

and then run the following command n times on your machine:

  time ./fio examples/netio

The results on my 2-way 64-bit x86 machine are as follows:

  [ the minimum, maximum, average, and standard deviation are captured from 50 individual runs ]

                 real time (seconds)
                 min      max      avg      sd
  SLAB           22.76    23.38    22.98    0.17
  SLUB           22.80    25.78    23.46    0.72
  SLUB (dynamic) 22.74    23.54    23.00    0.20    

                 sys time (seconds)
                 min      max      avg      sd
  SLAB           6.90     8.28     7.70     0.28
  SLUB           7.42     16.95    8.89     2.28
  SLUB (dynamic) 7.17     8.64     7.73     0.29    

                 user time (seconds)
                 min      max      avg      sd
  SLAB           36.89    38.11    37.50    0.29
  SLUB           30.85    37.99    37.06    1.67
  SLUB (dynamic) 36.75    38.07    37.59    0.32    

As you can see from the above numbers, this patch brings SLUB to the same level
as SLAB for this particular workload, fixing a ~2% regression in average real
time. I'd expect this change to help similar workloads that allocate a lot of
objects that are close to the size of a page.
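The ~2% figure can be checked against the averages in the tables above. This is a small sketch; the helper name `percent_over` is hypothetical and is not part of the patch or of fio.

```c
/* Relative difference of `value` against `baseline`, in percent. */
static double percent_over(double value, double baseline)
{
	return (value - baseline) / baseline * 100.0;
}
```

With the average real times, `percent_over(23.46, 22.98)` comes to roughly 2.1% for unpatched SLUB against SLAB, while `percent_over(23.00, 22.98)` is under 0.1% for the patched version. The same calculation on average sys time, `percent_over(8.89, 7.70)`, gives roughly 15%.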

Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
---
 include/linux/slub_def.h |    1 +
 ...
From: Christoph Lameter
Date: Monday, August 4, 2008 - 2:43 pm

Well, looks okay. Sigh. I sure wish we would deal with the page allocator
performance instead of adding more buffering.

--

From: Matthew Wilcox
Date: Tuesday, August 12, 2008 - 5:27 am

Hi Pekka,

We tested this patch and it was performance-neutral on TPC-C.  I was
hoping it would give a nice improvement ... so I'm disappointed.  But at
least there's no regression!

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--

From: Pekka Enberg
Date: Tuesday, August 12, 2008 - 5:33 am

Hi Matthew,

OK, so your regression is something else then. Well, thanks for testing!

--
