Re: [PATCH 1/1] handle initialising compound pages at orders greater than MAX_ORDER

From: Andy Whitcroft
Date: Thursday, October 2, 2008 - 9:19 am

When we initialise a compound page we initialise the page flags and head
page pointer for all base pages spanned by that page.  When we initialise a
gigantic page (a page of order greater than or equal to MAX_ORDER) we have
to initialise more than MAX_ORDER_NR_PAGES pages.  Currently we assume
that all elements of the mem_map in this page are contiguous in memory.
However, this is only guaranteed out to MAX_ORDER_NR_PAGES pages, and with
SPARSEMEM enabled they will not be contiguous.  This leads us to walk off
the end of the first section and scribble on everything which follows, BAD.

When we reach a MAX_ORDER_NR_PAGES boundary we must locate the next section
of the mem_map.  As gigantic pages can only be maximally aligned we know
this will occur at an exact multiple of MAX_ORDER_NR_PAGES pages from the
start of the page.
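
The walk described above can be modelled in userspace. The following is only a
sketch under loud assumptions: every toy_* name, the block size of 4 pages and
the scrambled section layout are invented for illustration and are not kernel
code. Each block of the toy mem_map lives in its own array with one guard
entry at the end, so a naive `page + i` walk across a block boundary would
land on a guard, while re-deriving the pointer at each boundary does not.

```c
#include <assert.h>
#include <string.h>

/* Toy values, not the kernel's: 4 pages per block, 4 blocks in total. */
#define TOY_MAX_ORDER_NR_PAGES 4
#define TOY_NR_SECTIONS 4

struct toy_page {
	unsigned long pfn;           /* hypothetical: makes page_to_pfn trivial */
	int tail;                    /* stands in for the PageTail flag */
	struct toy_page *first_page; /* head page pointer */
};

/* One extra guard entry per block; blocks are stored in scrambled order
 * so consecutive pfn ranges are not adjacent in memory. */
static struct toy_page backing[TOY_NR_SECTIONS][TOY_MAX_ORDER_NR_PAGES + 1];
static const int section_slot[TOY_NR_SECTIONS] = { 2, 0, 3, 1 };

struct toy_page *toy_pfn_to_page(unsigned long pfn)
{
	unsigned long sec = pfn / TOY_MAX_ORDER_NR_PAGES;

	assert(sec < TOY_NR_SECTIONS);
	return &backing[section_slot[sec]][pfn % TOY_MAX_ORDER_NR_PAGES];
}

unsigned long toy_page_to_pfn(const struct toy_page *page)
{
	return page->pfn;
}

void toy_init_mem_map(void)
{
	unsigned long pfn;

	memset(backing, 0, sizeof(backing));
	for (pfn = 0; pfn < TOY_NR_SECTIONS * TOY_MAX_ORDER_NR_PAGES; pfn++)
		toy_pfn_to_page(pfn)->pfn = pfn;
}

/* Same shape as the patched loop: step the pointer, but look it up again
 * whenever i is a multiple of the block size. Assumes the head page is
 * aligned to TOY_MAX_ORDER_NR_PAGES. */
void toy_prep_compound_page(struct toy_page *page, unsigned long order)
{
	unsigned long i;
	unsigned long nr_pages = 1UL << order;
	struct toy_page *p = page + 1;

	for (i = 1; i < nr_pages; i++, p++) {
		if ((i & (TOY_MAX_ORDER_NR_PAGES - 1)) == 0)
			p = toy_pfn_to_page(toy_page_to_pfn(page) + i);
		p->tail = 1;
		p->first_page = page;
	}
}

/* Number of pages after the head that are marked tail and point at it. */
unsigned long toy_count_tails(const struct toy_page *head,
			      unsigned long nr_pages)
{
	unsigned long i, n = 0;

	for (i = 1; i < nr_pages; i++) {
		const struct toy_page *p = toy_pfn_to_page(head->pfn + i);

		if (p->tail && p->first_page == head)
			n++;
	}
	return n;
}

/* Returns 1 if no guard entry was written, i.e. nothing was scribbled on. */
int toy_guards_untouched(void)
{
	int s;

	for (s = 0; s < TOY_NR_SECTIONS; s++) {
		const struct toy_page *g = &backing[s][TOY_MAX_ORDER_NR_PAGES];

		if (g->tail || g->first_page)
			return 0;
	}
	return 1;
}
```

Calling toy_init_mem_map() and then toy_prep_compound_page() on the page for
pfn 0 with order 4 crosses three block boundaries in this model; the guard
check is how the sketch stands in for "scribbling on everything which follows".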

This is a bug fix for the gigantic page support in hugetlbfs, please
consider for merging before 2.6.27.

Credit to Mel Gorman for spotting the issue.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
---
 mm/page_alloc.c |   13 ++++++++-----
 1 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e293c58..27b8681 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -268,13 +268,14 @@ void prep_compound_page(struct page *page, unsigned long order)
 {
 	int i;
 	int nr_pages = 1 << order;
+	struct page *p = page + 1;
 
 	set_compound_page_dtor(page, free_compound_page);
 	set_compound_order(page, order);
 	__SetPageHead(page);
-	for (i = 1; i < nr_pages; i++) {
-		struct page *p = page + i;
-
+	for (i = 1; i < nr_pages; i++, p++) {
+		if (unlikely((i & (MAX_ORDER_NR_PAGES - 1)) == 0))
+			p = pfn_to_page(page_to_pfn(page) + i);
 		__SetPageTail(p);
 		p->first_page = page;
 	}
@@ -284,6 +285,7 @@ static void destroy_compound_page(struct page *page, unsigned long order)
 {
 	int i;
 	int nr_pages = 1 << order;
+	struct page *p = page + 1;
 
 	if (unlikely(compound_order(page) != order))
 ...
From: Andrew Morton
Date: Thursday, October 2, 2008 - 2:30 pm

On Thu,  2 Oct 2008 17:19:56 +0100

gad.  Wouldn't it be clearer to do

	for (i = 1; i < nr_pages; i++) {
		struct page *p = pfn_to_page(page_to_pfn(page) + i);
		__SetPageTail(p);
		p->first_page = page;
	}

Oh well, I guess we can go with the obfuscated, uncommented version for
now :(

This patch applies to 2.6.26 (and possibly earlier) but I don't think
those kernels can trigger the bug?

--

From: Nick Piggin
Date: Thursday, October 2, 2008 - 11:43 pm

I think the problem is that pfn_to_page isn't always trivial. I would
prefer to have seen a new function for hugetlb to use, and keep the
branch-less version for the page allocator itself.
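
A rough userspace sketch of that split, to make the trade-off concrete. All
demo_* names are hypothetical and the flat 16-entry map is an assumption made
to keep the example short; it only counts lookups and does not model
SPARSEMEM. The ordinary variant uses plain pointer arithmetic with no branch
and no lookup in the loop, while the gigantic variant pays for one
pfn-to-page lookup per base page.

```c
#include <assert.h>

#define DEMO_NR_PAGES 16

struct demo_page {
	int tail;                     /* stands in for the PageTail flag */
	struct demo_page *first_page; /* head page pointer */
};

static struct demo_page demo_map[DEMO_NR_PAGES];

/* Counts how often the (potentially non-trivial) lookup is performed. */
unsigned long demo_lookups;

struct demo_page *demo_pfn_to_page(unsigned long pfn)
{
	assert(pfn < DEMO_NR_PAGES);
	demo_lookups++;
	return &demo_map[pfn];
}

/* Ordinary case: the caller guarantees the span is contiguous, so plain
 * pointer arithmetic is enough and the loop body has no extra branch. */
void demo_prep_compound_page(struct demo_page *page, unsigned long order)
{
	unsigned long i;
	unsigned long nr_pages = 1UL << order;

	for (i = 1; i < nr_pages; i++) {
		struct demo_page *p = page + i;

		p->tail = 1;
		p->first_page = page;
	}
}

/* Gigantic case: look every base page up by pfn, which is safe when the
 * map is not contiguous, at the cost of one lookup per page. */
void demo_prep_compound_gigantic_page(unsigned long head_pfn,
				      unsigned long order)
{
	unsigned long i;
	unsigned long nr_pages = 1UL << order;
	struct demo_page *page = demo_pfn_to_page(head_pfn);

	for (i = 1; i < nr_pages; i++) {
		struct demo_page *p = demo_pfn_to_page(head_pfn + i);

		p->tail = 1;
		p->first_page = page;
	}
}
```

In this model only the caller that knows it is building a gigantic page takes
the slower path; the ordinary path is untouched.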
--

From: Andy Whitcroft
Date: Friday, October 3, 2008 - 11:11 am

Yes that would probably be a better way forward overall.  I see that
the current one has gone upstream which at least plugs the hole we have
right now.  We are still testing and when that is done we will know if
there are any other issues.  As part of that I will look at pulling out
a gigantic page specific version of the destructor on top of this one.

-apw
--
