Re: [PATCH -mm] relayfs: support larger relay buffer

From: Masami Hiramatsu
Date: Tuesday, April 15, 2008 - 8:27 am

Use vmalloc() and memset() instead of kcalloc() to allocate the page* array
when the array is larger than one page. This lets relayfs support relay
buffers larger than 64MB on 4k-page systems and 512MB on 16k-page systems.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
---
This is useful for a 64-bit system which has plenty of memory (tens of
gigabytes) and a large kernel memory space.

I tested it on x86-64 and ia64.

 kernel/relay.c |   22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

Index: 2.6.25-rc8-mm2/kernel/relay.c
===================================================================
--- 2.6.25-rc8-mm2.orig/kernel/relay.c
+++ 2.6.25-rc8-mm2/kernel/relay.c
@@ -104,12 +104,20 @@ static int relay_mmap_buf(struct rchan_b
 static void *relay_alloc_buf(struct rchan_buf *buf, size_t *size)
 {
 	void *mem;
-	unsigned int i, j, n_pages;
+	unsigned int i, j, n_pages, pa_size;

 	*size = PAGE_ALIGN(*size);
 	n_pages = *size >> PAGE_SHIFT;
+	pa_size = n_pages * sizeof(struct page *);

-	buf->page_array = kcalloc(n_pages, sizeof(struct page *), GFP_KERNEL);
+	if (pa_size > PAGE_SIZE) {
+		buf->page_array = vmalloc(pa_size);
+		if (buf->page_array)
+			memset(buf->page_array, 0, pa_size);
+	} else {
+		buf->page_array = kcalloc(n_pages, sizeof(struct page *),
+					  GFP_KERNEL);
+	}
 	if (!buf->page_array)
 		return NULL;

@@ -130,7 +138,10 @@ static void *relay_alloc_buf(struct rcha
 depopulate:
 	for (j = 0; j < i; j++)
 		__free_page(buf->page_array[j]);
-	kfree(buf->page_array);
+	if (pa_size > PAGE_SIZE)
+		vfree(buf->page_array);
+	else
+		kfree(buf->page_array);
 	return NULL;
 }

@@ -189,7 +200,10 @@ static void relay_destroy_buf(struct rch
 		vunmap(buf->start);
 		for (i = 0; i < buf->page_count; i++)
 			__free_page(buf->page_array[i]);
-		kfree(buf->page_array);
+		if (buf->page_count * sizeof(struct page *) > PAGE_SIZE)
+			vfree(buf->page_array);
+		else
+			kfree(buf->page_array);
 	}
 ...
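
A worked check of the buffer limits quoted in the patch description, as a
minimal user-space sketch. It assumes 8-byte pointers (a 64-bit kernel) and
1MB = 2^20 bytes; the helper names are hypothetical and are not part of
kernel/relay.c. Under those assumptions a 64MB buffer of 4k pages needs a
128KB page array, and a 512MB buffer of 16k pages needs a 256KB one. The
thread does not say which allocator limit these sizes correspond to.

```c
#include <assert.h>
#include <stdint.h>

/* Pages needed for a buffer, rounding up (mirrors PAGE_ALIGN() plus shift). */
static uint64_t pages_for(uint64_t buf_bytes, uint64_t page_size)
{
	assert(page_size != 0);
	return (buf_bytes + page_size - 1) / page_size;
}

/* Bytes needed for the array holding one pointer per page. */
static uint64_t page_array_bytes(uint64_t buf_bytes, uint64_t page_size,
				 uint64_t ptr_size)
{
	return pages_for(buf_bytes, page_size) * ptr_size;
}
```

For example, page_array_bytes(64ULL << 20, 4096, 8) evaluates to 131072 and
page_array_bytes(512ULL << 20, 16384, 8) evaluates to 262144.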
From: Tom Zanussi
Date: Tuesday, April 15, 2008 - 9:22 pm

Hi,

It looks ok to me, but it might be a little cleaner and avoid some
duplication if you add the new code as a couple of functions instead.
Just a suggestion...


--

From: Masami Hiramatsu
Date: Wednesday, April 16, 2008 - 9:19 am

Hi Tom,


Sure, that is a good idea. I'll update my patch.
Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com

--

From: Masami Hiramatsu
Date: Wednesday, April 16, 2008 - 11:03 am

Use vmalloc() and memset() instead of kcalloc() to allocate the page* array
when the array is larger than one page. This lets relayfs support relay
buffers larger than 64MB on 4k-page systems and 512MB on 16k-page systems.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
---
Changes from take1 to take2:
- add relay_alloc_page_array() and relay_free_page_array()
- use is_vmalloc_addr() instead of checking array size.

This is useful for a 64-bit system which has plenty of memory (tens of
gigabytes) and a large kernel memory space.

I tested it on x86-64 and ia64.

 kernel/relay.c |   29 ++++++++++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)

Index: 2.6.25-rc8-mm2/kernel/relay.c
===================================================================
--- 2.6.25-rc8-mm2.orig/kernel/relay.c
+++ 2.6.25-rc8-mm2/kernel/relay.c
@@ -27,6 +27,29 @@
 static DEFINE_MUTEX(relay_channels_mutex);
 static LIST_HEAD(relay_channels);

+static struct page *relay_alloc_page_array(unsigned int n_pages)
+{
+	struct page *array;
+	unsigned int pa_size = n_pages * sizeof(struct page *);
+
+	if (pa_size > PAGE_SIZE) {
+		array = vmalloc(pa_size);
+		if (array)
+			memset(array, 0, pa_size);
+	} else {
+		array = kcalloc(n_pages, sizeof(struct page *), GFP_KERNEL);
+	}
+	return array;
+}
+
+static void relay_free_page_array(struct page *array)
+{
+	if (is_vmalloc_addr(array))
+		vfree(array);
+	else
+		kfree(array);
+}
+
 /*
  * close() vm_op implementation for relay file mapping.
  */
@@ -109,7 +132,7 @@ static void *relay_alloc_buf(struct rcha
 	*size = PAGE_ALIGN(*size);
 	n_pages = *size >> PAGE_SHIFT;

-	buf->page_array = kcalloc(n_pages, sizeof(struct page *), GFP_KERNEL);
+	buf->page_array = relay_alloc_page_array(n_pages);
 	if (!buf->page_array)
 		return NULL;

@@ -130,7 +153,7 @@ static void *relay_alloc_buf(struct rcha
 depopulate:
 	for (j = 0; j < i; j++)
 ...
From: Pekka J Enberg
Date: Wednesday, April 16, 2008 - 11:21 am

Hi Masami,


I think it's a bit confusing to have relay_alloc_page_array() return a
pointer to struct page as it's really allocating an _array_ of pointers to 
struct page. So why not just use void * here as the kernel memory 

Here as well.
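
The type point can be illustrated in user space. This is only a sketch: the
struct page below is a hypothetical stand-in (the real one is defined by the
kernel), and the two helper names are made up for the illustration. It shows
that indexing a `struct page *` steps over whole structures, while indexing a
`struct page **` steps over pointers, which is what an array sized as
n_pages * sizeof(struct page *) actually holds.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct page; only its size matters. */
struct page {
	unsigned long flags;
	void *fields[6];
};

/* Element size when the array is declared as in take2. */
static size_t elem_size_take2(void)
{
	struct page *array = NULL;

	return sizeof(array[0]);	/* a whole struct page */
}

/* Element size when the array is declared as a pointer to pointers. */
static size_t elem_size_take3(void)
{
	struct page **array = NULL;

	return sizeof(array[0]);	/* one pointer */
}
```

With the stand-in above, elem_size_take2() is several times larger than
elem_size_take3(), so only the second declaration matches how the memory is
sized and used.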
--

From: Masami Hiramatsu
Date: Wednesday, April 16, 2008 - 11:34 am

Hi,


Thank you very much, it was my mistake.

Thanks,


-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com

--

From: Masami Hiramatsu
Date: Wednesday, April 16, 2008 - 12:51 pm

Use vmalloc() and memset() instead of kcalloc() to allocate the page* array
when the array is larger than one page. This lets relayfs support relay
buffers larger than 64MB on 4k-page systems and 512MB on 16k-page systems.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
---
Changes from take2 to take3:
 - Use struct page ** instead of struct page *.
 - move functions to the place before relay_mmap_buf.
 - add comments.

This is useful for a 64-bit system which has plenty of memory (tens of
gigabytes) and a large kernel memory space.

I tested it on x86-64 and ia64.

 kernel/relay.c |   35 ++++++++++++++++++++++++++++++++---
 1 file changed, 32 insertions(+), 3 deletions(-)

Index: 2.6.25-rc8-mm2/kernel/relay.c
===================================================================
--- 2.6.25-rc8-mm2.orig/kernel/relay.c
+++ 2.6.25-rc8-mm2/kernel/relay.c
@@ -65,6 +65,35 @@ static struct vm_operations_struct relay
 	.close = relay_file_mmap_close,
 };

+/*
+ * allocate an array of pointers of struct page
+ */
+static struct page **relay_alloc_page_array(unsigned int n_pages)
+{
+	struct page **array;
+	unsigned int pa_size = n_pages * sizeof(struct page *);
+
+	if (pa_size > PAGE_SIZE) {
+		array = vmalloc(pa_size);
+		if (array)
+			memset(array, 0, pa_size);
+	} else {
+		array = kcalloc(n_pages, sizeof(struct page *), GFP_KERNEL);
+	}
+	return array;
+}
+
+/*
+ * free an array of pointers of struct page
+ */
+static void relay_free_page_array(struct page **array)
+{
+	if (is_vmalloc_addr(array))
+		vfree(array);
+	else
+		kfree(array);
+}
+
 /**
  *	relay_mmap_buf: - mmap channel buffer to process address space
  *	@buf: relay channel buffer
@@ -109,7 +138,7 @@ static void *relay_alloc_buf(struct rcha
 	*size = PAGE_ALIGN(*size);
 	n_pages = *size >> PAGE_SHIFT;

-	buf->page_array = kcalloc(n_pages, sizeof(struct page *), GFP_KERNEL);
+	buf->page_array = relay_alloc_page_array(n_pages);
 	if (!buf->page_array)
 		return NULL;

@@ ...
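
The allocate/free pairing in this patch can be mimicked in user space. The
sketch below is an analogy, not kernel code: a static arena stands in for the
vmalloc area, calloc()/free() stand in for kcalloc()/kfree(), and in_arena()
plays the role of is_vmalloc_addr() as a pure address-range test. All names,
the 4096-byte threshold and the arena size are assumptions made for the
illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define FAKE_PAGE_SIZE 4096u		/* assumption: 4k pages */
#define ARENA_SLOTS (64u * 1024u)	/* hypothetical "vmalloc area" */

static void *arena[ARENA_SLOTS];
static size_t arena_used;

/* Stand-in for is_vmalloc_addr(): does the address lie inside the arena? */
static int in_arena(const void *p)
{
	uintptr_t a = (uintptr_t)p;

	return a >= (uintptr_t)&arena[0] && a < (uintptr_t)&arena[ARENA_SLOTS];
}

/* Stand-in for vmalloc(): a bump allocator that never reuses memory. */
static void **arena_alloc(size_t n_slots)
{
	void **p;

	assert(arena_used <= ARENA_SLOTS);
	if (n_slots > ARENA_SLOTS - arena_used)
		return NULL;
	p = &arena[arena_used];
	arena_used += n_slots;
	return p;
}

/* Pick the allocator by array size, and zero the result either way. */
static void **alloc_ptr_array(size_t n)
{
	size_t bytes = n * sizeof(void *);
	void **array;

	if (bytes > FAKE_PAGE_SIZE) {
		array = arena_alloc(n);
		if (array)
			memset(array, 0, bytes);
	} else {
		array = calloc(n, sizeof(void *));
	}
	return array;
}

/* Pick the release path from the address alone; no size is needed. */
static void free_ptr_array(void **array)
{
	if (in_arena(array))
		return;		/* bump allocator: nothing to release */
	free(array);
}
```

The point of the address test is that the caller freeing the array does not
have to remember or recompute its size, which is what take1 did in two
separate places.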
From: Pekka Enberg
Date: Wednesday, April 16, 2008 - 12:54 pm

Looks good to me!

Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
--

From: Andrew Morton
Date: Wednesday, April 16, 2008 - 1:48 pm

On Wed, 16 Apr 2008 15:51:56 -0400

It's a bit odd to multiply n_pages*sizeof() and to then call kcalloc(),
which needs to do the same multiplication.

The compiler will presumably optimise that away, but still, how about this?

--- a/kernel/relay.c~relayfs-support-larger-relay-buffer-take-3-cleanup
+++ a/kernel/relay.c
@@ -71,14 +71,14 @@ static struct vm_operations_struct relay
 static struct page **relay_alloc_page_array(unsigned int n_pages)
 {
 	struct page **array;
-	unsigned int pa_size = n_pages * sizeof(struct page *);
+	size_t pa_size = n_pages * sizeof(struct page *);
 
 	if (pa_size > PAGE_SIZE) {
 		array = vmalloc(pa_size);
 		if (array)
 			memset(array, 0, pa_size);
 	} else {
-		array = kcalloc(n_pages, sizeof(struct page *), GFP_KERNEL);
+		array = kzalloc(pa_size, GFP_KERNEL);
 	}
 	return array;
 }
_


size_t is strictly the correct type for pa_size here.  Even though
vmalloc() takes a ulong.
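
One way to see why the type matters, as a hedged user-space sketch (the
message above does not spell out the reasoning, so this is an illustration
rather than Andrew's argument). It models a 64-bit kernel with fixed-width
types: uint32_t for unsigned int, uint64_t for size_t, and 8-byte pointers.
The helper names are hypothetical. The multiplication itself is done in 64
bits in both cases; only the type used to store the result differs.

```c
#include <stdint.h>

/* Assumption: 64-bit kernel, so sizeof(struct page *) is 8. */
#define PTR_SIZE UINT64_C(8)

/* pa_size stored in an unsigned int, as in take3: high bits are dropped. */
static uint32_t pa_size_as_uint(uint32_t n_pages)
{
	return (uint32_t)(n_pages * PTR_SIZE);
}

/* pa_size stored in a size_t, as in the cleanup: the full product is kept. */
static uint64_t pa_size_as_size_t(uint32_t n_pages)
{
	return n_pages * PTR_SIZE;
}
```

For realistic page counts the two agree, for instance both give 131072 for
16384 pages. They diverge only at extreme values: for 2^29 pages the
unsigned int version wraps to 0 while the size_t version yields 2^32.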

--

From: Masami Hiramatsu
Date: Wednesday, April 16, 2008 - 2:00 pm

Hi Andrew,


Sure, it looks good to me.
Thank you so much,


Thanks for the advice,


-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com

--

From: Tom Zanussi
Date: Wednesday, April 16, 2008 - 9:05 pm

Hi,

Looks fine to me.

Reviewed-by: Tom Zanussi <tzanussi@gmail.com>



--

From: Pekka Enberg
Date: Wednesday, April 16, 2008 - 1:33 am

From: Masami Hiramatsu
Date: Wednesday, April 16, 2008 - 7:36 am

Hi Pekka,


Thank you for your good advice!

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com

--
