Re: [patch 3/4] cpu alloc: The allocator

Previous thread: [patch 0/4] Cpu alloc V6: Replace percpu allocator in modules.c by Christoph Lameter on Monday, September 29, 2008 - 12:35 pm. (4 messages)

Next thread: [patch 1/4] Make the per cpu reserve configurable by Christoph Lameter on Monday, September 29, 2008 - 12:35 pm. (1 message)
From: Christoph Lameter
Date: Monday, September 29, 2008 - 12:35 pm

The per cpu allocator allows dynamic allocation of memory on all
processors simultaneously. A bitmap is used to track used areas.
The allocator implements tight packing to reduce the cache footprint
and increase speed since cacheline contention is typically not a concern
for memory mainly used by a single cpu. Small objects will fill up gaps
left by larger allocations that required alignments.
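
The packing behaviour described above can be illustrated with a small
user-space sketch. It is not the code from mm/cpu_alloc.c: every name in it
(unit_map, sketch_alloc, sketch_free, NR_UNITS) is hypothetical, the first-fit
search is an assumption, and the area base is assumed to be suitably aligned.
Only the idea of tracking int-sized units in a bitmap is taken from the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical names throughout; only the int-sized unit mirrors the patch. */
#define UNIT_SIZE	sizeof(int)	/* allocation granularity */
#define NR_UNITS	64		/* assumed size of the area, in units */
#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)

/* One bit per unit: set means the unit is in use. */
static unsigned long unit_map[(NR_UNITS + BITS_PER_WORD - 1) / BITS_PER_WORD];

/* How many units are needed for an object of a given size */
static size_t size_to_units(size_t size)
{
	return (size + UNIT_SIZE - 1) / UNIT_SIZE;
}

static int unit_used(size_t unit)
{
	return (int)((unit_map[unit / BITS_PER_WORD] >> (unit % BITS_PER_WORD)) & 1UL);
}

static void mark_units(size_t start, size_t count, int used)
{
	for (size_t unit = start; unit < start + count; unit++) {
		unsigned long bit = 1UL << (unit % BITS_PER_WORD);

		if (used)
			unit_map[unit / BITS_PER_WORD] |= bit;
		else
			unit_map[unit / BITS_PER_WORD] &= ~bit;
	}
}

/*
 * First-fit search for a free run of units at an aligned position.
 * Returns the byte offset into the area, or -1 if nothing fits.
 */
static long sketch_alloc(size_t size, size_t align)
{
	size_t units = size_to_units(size);
	size_t step = align > UNIT_SIZE ? align / UNIT_SIZE : 1;

	assert(size > 0 && align > 0 && (align & (align - 1)) == 0);

	for (size_t start = 0; start + units <= NR_UNITS; start += step) {
		size_t i = 0;

		while (i < units && !unit_used(start + i))
			i++;
		if (i == units) {
			mark_units(start, units, 1);
			return (long)(start * UNIT_SIZE);
		}
	}
	return -1;
}

static void sketch_free(long offset, size_t size)
{
	assert(offset >= 0);
	mark_units((size_t)offset / UNIT_SIZE, size_to_units(size), 0);
}
```

In this sketch, allocating one unit and then a four-unit object with four-unit
alignment leaves units 1 to 3 free; a later two-unit allocation with unit
alignment lands in that gap rather than after the larger object.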

The size of the cpu_alloc area can be changed via the percpu=xxx
kernel parameter.

Signed-off-by: Christoph Lameter <cl@linux-foundation.org>

---
 include/linux/percpu.h |   46 ++++++++++++
 include/linux/vmstat.h |    2 
 mm/Makefile            |    2 
 mm/cpu_alloc.c         |  181 +++++++++++++++++++++++++++++++++++++++++++++++++
 mm/vmstat.c            |    1 
 5 files changed, 230 insertions(+), 2 deletions(-)
 create mode 100644 include/linux/cpu_alloc.h
 create mode 100644 mm/cpu_alloc.c

Index: linux-2.6/include/linux/vmstat.h
===================================================================
--- linux-2.6.orig/include/linux/vmstat.h	2008-09-29 13:08:23.000000000 -0500
+++ linux-2.6/include/linux/vmstat.h	2008-09-29 13:09:33.000000000 -0500
@@ -37,7 +37,7 @@
 		FOR_ALL_ZONES(PGSCAN_KSWAPD),
 		FOR_ALL_ZONES(PGSCAN_DIRECT),
 		PGINODESTEAL, SLABS_SCANNED, KSWAPD_STEAL, KSWAPD_INODESTEAL,
-		PAGEOUTRUN, ALLOCSTALL, PGROTATED,
+		PAGEOUTRUN, ALLOCSTALL, PGROTATED, CPU_BYTES,
 #ifdef CONFIG_HUGETLB_PAGE
 		HTLB_BUDDY_PGALLOC, HTLB_BUDDY_PGALLOC_FAIL,
 #endif
Index: linux-2.6/mm/Makefile
===================================================================
--- linux-2.6.orig/mm/Makefile	2008-09-29 13:08:23.000000000 -0500
+++ linux-2.6/mm/Makefile	2008-09-29 13:09:33.000000000 -0500
@@ -11,7 +11,7 @@
 			   maccess.o page_alloc.o page-writeback.o pdflush.o \
 			   readahead.o swap.o truncate.o vmscan.o \
 			   prio_tree.o util.o mmzone.o vmstat.o backing-dev.o \
-			   page_isolation.o mm_init.o $(mmu-y)
+			   page_isolation.o mm_init.o cpu_alloc.o $(mmu-y)
 
 ...
From: Pekka Enberg
Date: Monday, September 29, 2008 - 11:35 pm

What is this thing? Otherwise looks good to me.

--

From: Christoph Lameter
Date: Tuesday, September 30, 2008 - 4:38 am

This is the number of units available from the cpu allocator. It's determined
at bootup and the bitmap is sized accordingly.




--

From: Pekka Enberg
Date: Tuesday, September 30, 2008 - 4:48 am

I think you're confusing it with "nr_units" or, alternatively, I need new
glasses.

--

From: Christoph Lameter
Date: Tuesday, September 30, 2008 - 5:12 am

You are right. "units" is debris from earlier revs and has no function today.


Subject: cpu_alloc: Remove useless variable

The "units" variable is a leftover and has no function at this point.

Signed-off-by: Christoph Lameter <cl@linux-foundation.org>

Index: linux-2.6/mm/cpu_alloc.c
===================================================================
--- linux-2.6.orig/mm/cpu_alloc.c	2008-09-30 07:09:09.000000000 -0500
+++ linux-2.6/mm/cpu_alloc.c	2008-09-30 07:10:20.000000000 -0500
@@ -27,8 +27,6 @@
 #define UNIT_TYPE int
 #define UNIT_SIZE sizeof(UNIT_TYPE)

-int units;	/* Actual available units */
-
 /*
  * How many units are needed for an object of a given size
  */
--

From: Andrew Morton
Date: Friday, October 3, 2008 - 12:33 am

Might be able to use bitmap_find_free_region() here, if we try hard enough.

But as a general thing, it would be better to add any missing
functionality to the bitmap API and then to use the bitmap API.

Apart from that the interface, intent and implementation seem reasonable.

But I'd have thought that it would be possible to only allocate the
storage for online CPUs.  That would be a pretty significant win for
some system configurations?

--

From: Pekka Enberg
Date: Friday, October 3, 2008 - 12:43 am

Hi Andrew,


But bitmap_fill() assumes that the starting offset is aligned to
unsigned long (which is not the case here), doesn't it?


Maybe, but then you'd have to deal with CPU hotplug... eek.

		Pekka

--

From: Andrew Morton
Date: Friday, October 3, 2008 - 1:20 am

Of course.
--

From: Christoph Lameter
Date: Friday, October 3, 2008 - 7:15 am

Yup, I cannot find equivalent bitmap operations for cpu_alloc.

Also the search operations already use find_next_zero_bit() and
find_next_bit(). So this should be okay.

We could define new bitops:

bitmap_set_range(dst, start, end)
bitmap_clear_range(dst, start, end)

int find_zero_bits(dst, start, end, nr_of_zero_bits)

but then there are additional alignment requirements that such a generic
function would not be able to check for.
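
For illustration, the three proposed helpers might look roughly like the
user-space sketch below. The names carry a _sketch suffix because nothing here
is an existing kernel interface. Treating "end" as exclusive, and returning
"end" when no run is found, are assumptions; the proposal above does not
specify either and declares an int return. The bit-at-a-time loops accept an
arbitrary start offset, and, as noted above, the search does no alignment
checking.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)

static int test_bit_sketch(const unsigned long *map, size_t bit)
{
	return (int)((map[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL);
}

/* Set bits [start, end); "end" is assumed to be exclusive. */
static void bitmap_set_range_sketch(unsigned long *dst, size_t start, size_t end)
{
	assert(start <= end);
	for (size_t bit = start; bit < end; bit++)
		dst[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);
}

/* Clear bits [start, end); "end" is assumed to be exclusive. */
static void bitmap_clear_range_sketch(unsigned long *dst, size_t start, size_t end)
{
	assert(start <= end);
	for (size_t bit = start; bit < end; bit++)
		dst[bit / BITS_PER_WORD] &= ~(1UL << (bit % BITS_PER_WORD));
}

/*
 * Find the first run of nr consecutive zero bits inside [start, end).
 * Returns the index of the first bit of the run, or end if there is none.
 * No alignment constraint is applied to the returned index.
 */
static size_t find_zero_bits_sketch(const unsigned long *map, size_t start,
				    size_t end, size_t nr)
{
	size_t run = 0;

	assert(start <= end && nr > 0);
	for (size_t bit = start; bit < end; bit++) {
		run = test_bit_sketch(map, bit) ? 0 : run + 1;
		if (run == nr)
			return bit + 1 - nr;
	}
	return end;
}
```

A caller that needs aligned objects would still have to step through the
candidate positions itself, which is the limitation of a generic helper
mentioned above.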


--

From: Christoph Lameter
Date: Friday, October 3, 2008 - 5:48 am

OK, will use bitops here.


We have tried that, but currently the kernel (core and in particular arch
code) keeps state for all possible cpus in percpu segments. Changing that
would require a more extensive cleanup of numerous arches.


--

From: Rusty Russell
Date: Sunday, October 5, 2008 - 2:10 pm

It shouldn't be a big win, since possible ~= online for most systems.  And 
having all the per-cpu users register online and offline cpu callbacks is 
error prone and a PITA.

Rusty.
--

From: Christoph Lameter
Date: Tuesday, October 7, 2008 - 6:27 am

That also has the nice consequence that moving the allocators (page allocator
/ slub) to the use of cpu_alloc will avoid the online and offline callbacks
(the main focus of these is getting rid of large pointer arrays there and
simplifying bootstrap etc).



--
