From: Suresh Siddha <suresh.b.siddha@intel.com>
Subject: [patch 1/2] x86: track memtype for RAM in page struct
Track the memtype for RAM pages in page struct instead of using the memtype
list. This avoids the explosion in the number of entries in memtype list
(of the order of 20,000 with AGP) and makes the PAT tracking simpler. We are
using PG_arch_1 bit in page->flags.
We still use the memtype list for non RAM pages.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
---
arch/x86/mm/ioremap.c | 19 +++++++++++
arch/x86/mm/pat.c | 83 +++++++++++++++++++++++++++++++++++++++++++++++++
include/asm-x86/page.h | 1
3 files changed, 103 insertions(+)
Index: tip/arch/x86/mm/ioremap.c
===================================================================
--- tip.orig/arch/x86/mm/ioremap.c 2008-09-12 16:03:33.000000000 -0700
+++ tip/arch/x86/mm/ioremap.c 2008-09-12 16:23:33.000000000 -0700
@@ -102,6 +102,25 @@ int page_is_ram(unsigned long pagenr)
return 0;
}
+int pagerange_is_ram(unsigned long start, unsigned long end)
+{
+ int ram_page = 0, not_rampage = 0;
+ unsigned long page_nr;
+
+ for (page_nr = (start >> PAGE_SHIFT); page_nr < (end >> PAGE_SHIFT);
+ ++page_nr) {
+ if (page_is_ram(page_nr))
+ ram_page = 1;
+ else
+ not_rampage = 1;
+
+ if (ram_page == not_rampage)
+ return -1;
+ }
+
+ return ram_page;
+}
+
/*
* Fix up the linear direct mapping of the kernel to avoid cache attribute
* conflicts.
Index: tip/arch/x86/mm/pat.c
===================================================================
--- tip.orig/arch/x86/mm/pat.c 2008-09-12 16:03:33.000000000 -0700
+++ tip/arch/x86/mm/pat.c 2008-09-12 16:23:33.000000000 -0700
@@ -211,6 +211,75 @@ static struct memtype *cached_entry;
static u64 cached_start;
/*
+ * RED-PEN: TODO: Add PageReserved() check aswell here,
+ * once we add SetPageReserved() to all the drivers using
+ * set_memory_* or ...Please define PG_arch_1 a proper name so that its easy to tell its being
used just by looking at page-flags.h.
J
--
it should be defined in include/asm-x86/page.h though, not in page-flags.h - other architectures are using this flag for other purposes. Ingo --
No, other shared-use flags are all defined in page-flags.h:
enum pageflags {
[...]
__NR_PAGEFLAGS,
/* Filesystems */
PG_checked = PG_owner_priv_1,
/* XEN */
PG_pinned = PG_owner_priv_1,
PG_savepinned = PG_dirty,
/* SLOB */
PG_slob_page = PG_active,
PG_slob_free = PG_private,
/* SLUB */
PG_slub_frozen = PG_active,
PG_slub_debug = PG_error,
};
We could #ifdef CONFIG_X86 just to make it clear we're talking about a
specific X86 usage. But page-flags.h does seem to have become the
central authority on all struct page flags usage.
J
--
well, in case of the PG_arch_* flags, they are all defined in architecture files: ./include/asm-parisc/pgtable.h:#define PG_dcache_dirty PG_arch_1 ./include/asm-mips/cacheflush.h:#define PG_dcache_dirty PG_arch_1 ./arch/sparc64/mm/init.c:#define PG_dcache_dirty PG_arch_1 ./arch/sh/include/cpu-sh4/cpu/cacheflush.h:#define PG_mapped PG_arch_1 ./arch/sh/include/cpu-sh3/cpu/cacheflush.h:#define PG_mapped PG_arch_1 ./arch/arm/include/asm/cacheflush.h:#define PG_dcache_dirty PG_arch_1 they are explicitly reserved for per architecture details. Ingo --
Below is the updated patch.
Thanks,
Venki
x86: track memtype for RAM in page struct
From: Suresh Siddha <suresh.b.siddha@intel.com>
Subject: x86: track memtype for RAM in page struct
Track the memtype for RAM pages in page struct instead of using the memtype
list. This avoids the explosion in the number of entries in memtype list
(of the order of 20,000 with AGP) and makes the PAT tracking simpler. We are
using PG_arch_1 bit in page->flags.
We still use the memtype list for non RAM pages.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
---
arch/x86/mm/ioremap.c | 19 +++++++++
arch/x86/mm/pat.c | 83 +++++++++++++++++++++++++++++++++++++++++++
include/asm-x86/cacheflush.h | 1
include/asm-x86/page.h | 1
4 files changed, 104 insertions(+)
Index: tip/arch/x86/mm/ioremap.c
===================================================================
--- tip.orig/arch/x86/mm/ioremap.c 2008-09-23 13:31:31.000000000 -0700
+++ tip/arch/x86/mm/ioremap.c 2008-09-23 13:32:26.000000000 -0700
@@ -102,6 +102,25 @@ int page_is_ram(unsigned long pagenr)
return 0;
}
+int pagerange_is_ram(unsigned long start, unsigned long end)
+{
+ int ram_page = 0, not_rampage = 0;
+ unsigned long page_nr;
+
+ for (page_nr = (start >> PAGE_SHIFT); page_nr < (end >> PAGE_SHIFT);
+ ++page_nr) {
+ if (page_is_ram(page_nr))
+ ram_page = 1;
+ else
+ not_rampage = 1;
+
+ if (ram_page == not_rampage)
+ return -1;
+ }
+
+ return ram_page;
+}
+
/*
* Fix up the linear direct mapping of the kernel to avoid cache attribute
* conflicts.
Index: tip/arch/x86/mm/pat.c
===================================================================
--- tip.orig/arch/x86/mm/pat.c 2008-09-23 13:31:31.000000000 -0700
+++ tip/arch/x86/mm/pat.c 2008-09-23 13:32:26.000000000 -0700
@@ -211,6 +211,75 @@ static struct memtype *cached_entry;
static u64 cached_start;
/*
+ * ...I'd suggest to avoid the open coding of page flag handling using set bit etc. It may be best to slowly migrate the arch specific definitions into page-flags.h. Otherwise we wont have a central authority on page flags. --
OK. Below is the updated patch that does not use set/clear directly on page
flag.
x86: track memtype for RAM in page struct
From: Suresh Siddha <suresh.b.siddha@intel.com>
Subject: x86: track memtype for RAM in page struct
Track the memtype for RAM pages in page struct instead of using the memtype
list. This avoids the explosion in the number of entries in memtype list
(of the order of 20,000 with AGP) and makes the PAT tracking simpler. We are
using PG_arch_1 bit in page->flags.
We still use the memtype list for non RAM pages.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
---
arch/x86/mm/ioremap.c | 19 +++++++++
arch/x86/mm/pat.c | 83 +++++++++++++++++++++++++++++++++++++++++++
include/asm-x86/cacheflush.h | 2 +
include/asm-x86/page.h | 1
4 files changed, 105 insertions(+)
Index: tip/arch/x86/mm/ioremap.c
===================================================================
--- tip.orig/arch/x86/mm/ioremap.c 2008-09-23 13:31:31.000000000 -0700
+++ tip/arch/x86/mm/ioremap.c 2008-09-24 08:39:29.000000000 -0700
@@ -102,6 +102,25 @@ int page_is_ram(unsigned long pagenr)
return 0;
}
+int pagerange_is_ram(unsigned long start, unsigned long end)
+{
+ int ram_page = 0, not_rampage = 0;
+ unsigned long page_nr;
+
+ for (page_nr = (start >> PAGE_SHIFT); page_nr < (end >> PAGE_SHIFT);
+ ++page_nr) {
+ if (page_is_ram(page_nr))
+ ram_page = 1;
+ else
+ not_rampage = 1;
+
+ if (ram_page == not_rampage)
+ return -1;
+ }
+
+ return ram_page;
+}
+
/*
* Fix up the linear direct mapping of the kernel to avoid cache attribute
* conflicts.
Index: tip/arch/x86/mm/pat.c
===================================================================
--- tip.orig/arch/x86/mm/pat.c 2008-09-23 13:31:31.000000000 -0700
+++ tip/arch/x86/mm/pat.c 2008-09-24 08:39:27.000000000 -0700
@@ -211,6 +211,75 @@ static struct memtype *cached_entry;
...applied to tip/x86/pat, thanks! Nice patch. Ingo --
I'd rather we didn't add any more uses of PageReserved without a really good reason. At this point in time (or at least last time I looked, a year or two ago), it isn't a whole lot of work to remove PG_reserved completely. If the waters get muddied up again, it could require another significant rework to remove it in future... Thanks, Nick --
If a driver by mistake free's a RAM page before changing its memory attribute back to WB, we want the generic -mm to catch it. Today, free_pages_check() prevents freeing the page with PageReserved set. We want to use this and make sure that either set_page_uc/wc() or the driver calling these API's set the PageReserved bit. There are already some drivers which do SetPageReserved() before changing the attribute. I don't know the history behind PageReserved. But is there a recommended way to achieve what we want? Either we need to use PageReserved bit or add some arch specific checks (in the x86 case, check arch_1) in free_pages_check(). Right? thanks, suresh --
no, the generic -mm does not 'catch' PageReserved, it simply _ignores_ it. So i agree about having a debug check there, it's just that what you propose does not achieve that. PageReserved is a legacy thing that should not be used in new code. the best way i think would be to add arch_1 to PAGE_FLAGS_CHECK_AT_FREE. That way the mm becomes very noisy if this is ever freed. Nick, Andrew, any preferences? Ingo --
It would be better if another flag is used, yes. Setting PageReserved used to prevent put_page from decrementing the page's refcount. Most drivers still do it just because we kept that code in there to catch any bugs caused by the removal of PageReserved check from the refcounting. --
Corrections for a few typos and some suggestions for minor improvements in the comment. Cheers, FJP "one (driver)" --
setting it PageReserved is a hack. We should set it PG_arch_1 and extend the page allocator to emit a kernel warning if a PG_arch_1 page is freed. Ingo --
Marking PageReserved will take care of all the places where Reserved is being checked, like swap etc. And we don't have to add checks for extra bit in architecture independent code. Thanks, Venki --
