Follow-up on the earlier PAT patch series here:
http://lkml.org/lkml/2008/1/10/312

This patch series adds Page Attribute Table (PAT) support on x86. There have been a few changes based on comments on the earlier patches, and on issues that were seen while the earlier patchset was in mm.

The main changes include:
* Unlike the earlier patchset, there are no changes to the identity mapping of reserved regions.
* Unlike the earlier patches, there are no changes to early ioremap.
* We look at the MTRR setting and the PAT request and track the resultant type to avoid aliasing.
* UC_MINUS in PAT to provide backward compatibility to /dev/mem mmap users.

In general, we have tried to make the patches simpler and cleaner. The hope is to cause less disruption along the way. The changes/cleanups that went into x86/mm (specifically pageattr.c) have helped us along the way.

The patchset is against x86 testing from a couple of days back. The last patch in the series is meant for the test tree only and adds some useful printks that can help us debug any potential issues.

There are two issues that we are leaving out at the moment to keep the patch simple. We will be addressing them with incremental patches soon:
* FB/DRM drivers using pgprot_val and changing protection on their own without using any proper APIs like ioremap. There are a few such usages and each one will be addressed separately.
* To change attributes from WC to WB in a "perfect way", one has to follow a certain sequence, like making the page non-present etc.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
--
thanks Venki, i've queued this up so that we can see how well it goes. It now looks a lot less dangerous and more compatible than it did before.

hm, until this is done correctly i guess we should disallow WC to WB transitions? A good number of errata apply i suspect :-/

Ingo
--
no big issues so far, just a simple build fix for the !MTRR case below.
Ingo
----------------->
Subject: x86: PAT fix
From: Ingo Molnar <mingo@elte.hu>
Date: Fri Mar 21 15:42:28 CET 2008
build fix for !CONFIG_MTRR.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
include/asm-x86/mtrr.h | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
Index: linux-x86.q/include/asm-x86/mtrr.h
===================================================================
--- linux-x86.q.orig/include/asm-x86/mtrr.h
+++ linux-x86.q/include/asm-x86/mtrr.h
@@ -84,10 +84,9 @@ struct mtrr_gentry
#ifdef __KERNEL__
-extern u8 mtrr_type_lookup(u64 addr, u64 end);
-
/* The following functions are for use by other drivers */
# ifdef CONFIG_MTRR
+extern u8 mtrr_type_lookup(u64 addr, u64 end);
extern void mtrr_save_fixed_ranges(void *);
extern void mtrr_save_state(void);
extern int mtrr_add (unsigned long base, unsigned long size,
@@ -101,6 +100,13 @@ extern void mtrr_ap_init(void);
extern void mtrr_bp_init(void);
extern int mtrr_trim_uncached_memory(unsigned long end_pfn);
# else
+static inline u8 mtrr_type_lookup(u64 addr, u64 end)
+{
+ /*
+ * Return no-MTRRs:
+ */
+ return 0xff;
+}
#define mtrr_save_fixed_ranges(arg) do {} while (0)
#define mtrr_save_state() do {} while (0)
static __inline__ int mtrr_add (unsigned long base, unsigned long size,
--
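For readers following the build fix above: with !CONFIG_MTRR the stub returns 0xff, so any caller has to treat that value as "no MTRR information" rather than as a real memory type. Below is a minimal user-space sketch of that idea. Only the 0xff return value comes from the patch; the names MTRR_TYPE_NONE, stub_mtrr_type_lookup() and type_or_default(), and the fallback policy itself, are assumptions made for illustration. The two MTRR_TYPE_* encodings are the usual x86 MTRR values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name for the "no MTRRs" sentinel returned by the stub. */
#define MTRR_TYPE_NONE        0xff
/* Standard x86 MTRR type encodings. */
#define MTRR_TYPE_UNCACHABLE  0
#define MTRR_TYPE_WRBACK      6

/* User-space stand-in for the !CONFIG_MTRR version of mtrr_type_lookup(). */
static inline uint8_t stub_mtrr_type_lookup(uint64_t start, uint64_t end)
{
	(void)start;
	(void)end;
	return MTRR_TYPE_NONE;
}

/*
 * Assumed policy, not taken from the patch: when no MTRR information
 * is available, use a fallback type supplied by the caller.
 */
static uint8_t type_or_default(uint8_t mtrr_type, uint8_t fallback)
{
	assert(fallback != MTRR_TYPE_NONE);
	return mtrr_type == MTRR_TYPE_NONE ? fallback : mtrr_type;
}
```

The point of the sentinel is that 0xff cannot collide with any real MTRR type, so the same calling code can be built with or without CONFIG_MTRR.
--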
I think we can support WC in the meantime. Currently we follow the existing TLB and cache flushing logic, which is an OK solution. I mean, I don't think there will be nasty hangs etc. because of this. We do keep track of the usage of these mappings, and there should not be anything other than speculative accesses from other CPUs at the time we change the attribute. So, we are trying to double check whether the SDM approach is really needed for our usage model.

BTW, the suggested solution in the SDM says we should make the page "Not Present" first, flush the TLBs, and then change the attribute and make it present. This will potentially involve some changes in the page fault handler as well.

Thanks,
Venki
--
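The ordering Venki describes for the SDM approach (not present, flush, change attribute, present) can be sketched in user space as follows. This is a toy model under stated assumptions, not kernel code: the PTE is a plain 64-bit integer, sim_flush_tlb_all() is a hypothetical stand-in that only counts calls, and sim_change_cache_attr() is an illustrative name. The bit positions are the standard x86 4K-page PTE bits. Which PWT/PCD/PAT combination selects which memory type depends on how the PAT MSR is programmed, so the sketch takes the new bits as a parameter.

```c
#include <assert.h>
#include <stdint.h>

/* Standard x86 PTE bits for 4K pages. */
#define PTE_PRESENT     (1ULL << 0)
#define PTE_PWT         (1ULL << 3)
#define PTE_PCD         (1ULL << 4)
#define PTE_PAT         (1ULL << 7)
#define PTE_CACHE_MASK  (PTE_PWT | PTE_PCD | PTE_PAT)

/* Counts simulated flushes so the ordering can be observed. */
static unsigned int tlb_flushes;

/* Hypothetical stand-in for a cross-CPU TLB flush. */
static void sim_flush_tlb_all(void)
{
	tlb_flushes++;
}

/* Change the caching bits of one simulated PTE in the described order. */
static void sim_change_cache_attr(uint64_t *pte, uint64_t new_cache_bits)
{
	assert(pte && (new_cache_bits & ~PTE_CACHE_MASK) == 0);

	*pte &= ~PTE_PRESENT;                              /* 1. make the page not present */
	sim_flush_tlb_all();                               /* 2. flush stale translations  */
	*pte = (*pte & ~PTE_CACHE_MASK) | new_cache_bits;  /* 3. change the attribute      */
	*pte |= PTE_PRESENT;                               /* 4. make it present again     */
}
```

Between steps 1 and 4 any access from another CPU would fault instead of using the old attribute, which is why the mail mentions possible changes to the page fault handler.
--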
I have to say I think this looks like a good patchset. However, I'd like a bit more clarification with regards to the above point? -hpa --
X seems to use (in that order):
- mmap the range through /dev/mem
- Set MTRR for the range to WC

I see this happening on one of my test systems with a relatively new xorg. In this case, when mmap does the reserve for this range, if we give a UC mapping then we will effectively negate the MTRR WC setting, with the range being mapped UC.

To accommodate this special use case, we give /dev/mem mmap (only when there are no other already existing mappings) a UC_MINUS attribute. With that, if and when X sets the MTRR the range will become WC, and until that time it will be UC. We ensure that all page table mappings use UC_MINUS for that range.

Long term, we want X to switch to /proc/ or /sys interfaces. But we can also provide backward compatibility for existing X usage like the above.

Thanks,
Venki
--
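The difference between UC and UC_MINUS that the explanation above relies on can be written down as a small lookup. This is a sketch based on the architectural behaviour (a UC page attribute always wins, while UC_MINUS can be overridden by a WC MTRR), not code from the patchset; the enum and the function name effective_type() are made up for illustration.

```c
#include <assert.h>

/* Illustrative memory type names, not the kernel's constants. */
enum mem_type { MT_UC, MT_UC_MINUS, MT_WC, MT_WT, MT_WP, MT_WB };

/*
 * Effective type for a page mapped UC or UC_MINUS, given the MTRR
 * type that covers the same physical range.
 */
static enum mem_type effective_type(enum mem_type pat, enum mem_type mtrr)
{
	assert(pat == MT_UC || pat == MT_UC_MINUS);
	assert(mtrr != MT_UC_MINUS);	/* MTRRs have no UC_MINUS type */

	/* UC_MINUS lets a WC MTRR take effect ... */
	if (pat == MT_UC_MINUS && mtrr == MT_WC)
		return MT_WC;

	/* ... in every other case the access is uncached. */
	return MT_UC;
}
```

So a /dev/mem mapping handed out as UC_MINUS behaves as UC until X programs a WC MTRR over the range, and as WC afterwards. A plain UC mapping would stay UC in both cases.
--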
Makes total sense. Eventually I think we want to do /proc/mtrr emulation, but for now, this is probably the best option. -hpa --
