Re: kswapd min order, slub max order [was Re: -mm merge plans for 2.6.24]

To: Mel Gorman <mel@...>
Cc: Hugh Dickins <hugh@...>, Andrew Morton <akpm@...>, <linux-kernel@...>, <linux-mm@...>
Date: Tuesday, October 2, 2007 - 8:37 pm

On Tue, 2 Oct 2007, Christoph Lameter wrote:


A patch like this? This is based on the number of page structs on the 
system. Maybe it needs to be based on the number of MAX_ORDER blocks
for antifrag?


SLUB: Determine slub_max_order depending on the number of pages available

Determine the maximum order to be used for slabs and the minimum
desired number of objects in a slab from the number of pages that
a system has available (like SLAB does for the order 1/0 distinction).

For systems with less than 128M only use order 0 allocations (SLAB does
that for <32M only). The order 0 config is useful for small systems to
minimize the memory used. Memory easily fragments since we have fewer than
32k pages to play with. Order 0 ensures that higher order allocations are
minimized (larger orders must still be used for objects that do not fit
into order 0 pages).

Then step up to order 1 for systems with < 256000 pages (1G with 4k pages).

Order 2 for systems with < 1000000 pages (4G with 4k pages).

Order 3 for systems larger than that.

Signed-off-by: Christoph Lameter <clameter@sgi.com>

---
 mm/slub.c |   49 +++++++++++++++++++++++++------------------------
 1 file changed, 25 insertions(+), 24 deletions(-)

Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c	2007-10-02 09:26:16.000000000 -0700
+++ linux-2.6/mm/slub.c	2007-10-02 16:40:22.000000000 -0700
@@ -153,25 +153,6 @@ static inline void ClearSlabDebug(struct
 /* Enable to test recovery from slab corruption on boot */
 #undef SLUB_RESILIENCY_TEST
 
-#if PAGE_SHIFT <= 12
-
-/*
- * Small page size. Make sure that we do not fragment memory
- */
-#define DEFAULT_MAX_ORDER 1
-#define DEFAULT_MIN_OBJECTS 4
-
-#else
-
-/*
- * Large page machines are customarily able to handle larger
- * page orders.
- */
-#define DEFAULT_MAX_ORDER 2
-#define DEFAULT_MIN_OBJECTS 8
-
-#endif
-
 /*
  * Mininum number of partial slabs. These will be left on the partial
  * lists even if they are empty. kmem_cache_shrink may reclaim them.
@@ -1718,8 +1699,9 @@ static struct page *get_object_page(cons
  * take the list_lock.
  */
 static int slub_min_order;
-static int slub_max_order = DEFAULT_MAX_ORDER;
-static int slub_min_objects = DEFAULT_MIN_OBJECTS;
+static int slub_max_order;
+static int slub_min_objects = 4;
+static int manual;
 
 /*
  * Merge control. If this is set then no merging of slab caches will occur.
@@ -2237,7 +2219,7 @@ static struct kmem_cache *kmalloc_caches
 static int __init setup_slub_min_order(char *str)
 {
 	get_option (&str, &slub_min_order);
-
+	manual = 1;
 	return 1;
 }
 
@@ -2246,7 +2228,7 @@ __setup("slub_min_order=", setup_slub_mi
 static int __init setup_slub_max_order(char *str)
 {
 	get_option (&str, &slub_max_order);
-
+	manual = 1;
 	return 1;
 }
 
@@ -2255,7 +2237,7 @@ __setup("slub_max_order=", setup_slub_ma
 static int __init setup_slub_min_objects(char *str)
 {
 	get_option (&str, &slub_min_objects);
-
+	manual = 1;
 	return 1;
 }
 
@@ -2566,6 +2548,16 @@ int kmem_cache_shrink(struct kmem_cache 
 }
 EXPORT_SYMBOL(kmem_cache_shrink);
 
+/*
+ * Table to autotune the maximum slab order based on the number of pages
+ * that the system has available.
+ */
+static unsigned long __initdata phys_pages_for_order[PAGE_ALLOC_COSTLY_ORDER] = {
+	32768,		/* >128M if using 4K pages, >512M (16k), >2G (64k) */
+	256000,		/* >1G if using 4k pages, >4G (16k), >16G (64k) */
+	1000000		/* >4G if using 4k pages, >16G (16k), >64G (64k) */
+};
+
 /********************************************************************
  *			Basic setup of slabs
  *******************************************************************/
@@ -2575,6 +2567,15 @@ void __init kmem_cache_init(void)
 	int i;
 	int caches = 0;
 
+	if (!manual) {
+		/* No manual parameters. Autotune for system */
+		for (i = 0; i < PAGE_ALLOC_COSTLY_ORDER; i++)
+			if (num_physpages > phys_pages_for_order[i]) {
+				slub_max_order++;
+				slub_min_objects <<= 1;
+			}
+	}
+
 #ifdef CONFIG_NUMA
 	/*
 	 * Must first have the slab cache available for the allocations of the