Re: [patch 00/13] x86: PAT support updated - v3

Previous thread: none

Next thread: [git pull] async_tx and a fsldma fix for 2.6.25-rc by Dan Williams on Tuesday, March 18, 2008 - 5:45 pm. (1 message)
From: venkatesh.pallipadi
Date: Tuesday, March 18, 2008 - 5:00 pm

Follow up on earlier PAT patch series here:
http://lkml.org/lkml/2008/1/10/312

This patch series adds Page Attribute Table (PAT) support on x86. There have
been a few changes based on comments on the earlier patches and on issues that
were seen while the earlier patchset was in mm. The main changes include:

* Unlike earlier patchset, there are no changes to identity mapping of
  reserved regions.
* Unlike earlier patches, there are no changes to early ioremap.
* We look at MTRR setting and PAT request and track the resultant type
  to avoid aliasing.
* UC_MINUS in PAT to provide backward compatibility to /dev/mem mmap users.
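
To illustrate the "track the resultant type to avoid aliasing" point, here is a
minimal userspace sketch, not the code from the patches. It assumes a small
fixed-size table of reserved physical ranges and rejects a request that
overlaps an existing range with a different type. All names
(sketch_reserve_memtype, struct sketch_range, the SK_* constants) are
hypothetical and exist only for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory-type codes, used only by this sketch. */
enum sketch_memtype { SK_UC, SK_UC_MINUS, SK_WC, SK_WB };

struct sketch_range {
	uint64_t start;			/* inclusive */
	uint64_t end;			/* exclusive */
	enum sketch_memtype type;
};

#define SKETCH_MAX_RANGES 16

struct sketch_memtype_list {
	struct sketch_range ranges[SKETCH_MAX_RANGES];
	size_t count;
};

/*
 * Record [start, end) with the requested type.
 *
 * Returns 0 on success. Returns -1 if the range is empty, the table is
 * full, or the range overlaps an existing entry that has a different
 * type (accepting it would create an alias with conflicting types).
 */
static int sketch_reserve_memtype(struct sketch_memtype_list *list,
				  uint64_t start, uint64_t end,
				  enum sketch_memtype type)
{
	size_t i;

	if (start >= end || list->count >= SKETCH_MAX_RANGES)
		return -1;

	for (i = 0; i < list->count; i++) {
		const struct sketch_range *r = &list->ranges[i];
		int overlaps = start < r->end && r->start < end;

		if (overlaps && r->type != type)
			return -1;
	}

	list->ranges[list->count].start = start;
	list->ranges[list->count].end = end;
	list->ranges[list->count].type = type;
	list->count++;
	return 0;
}
```

With this sketch, a second WC request over a range already reserved as WC is
accepted, while a UC request over the same range is refused. A real
implementation would also need locking and a way to release ranges.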

In general, we have tried to make the patches simpler and cleaner. The hope is
to cause less disruption along the way. The changes/cleanups that went into
x86/mm (specifically pageattr.c) have helped us along the way.

The patchset is against x86 testing from a couple of days back.

The last patch in the series is meant for test-tree only and adds some useful
printks that can help us debug any potential issues.

There are two issues that we are leaving out at the moment to make the patch
simple. We will be addressing them with incremental patches soon:
* FB/DRM drivers using pgprot_val and changing protection on their own
  without using any proper APIs like ioremap. There are a few such usages and
  each one will be addressed separately.
* To change attributes from WC to WB in a "perfect way", one has to follow
  a certain sequence, such as making the page non-present first.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>

--

From: Ingo Molnar
Date: Friday, March 21, 2008 - 6:24 am

thanks Venki, i've queued this up so that we can see how well it goes. 
It now looks a lot less dangerous and more compatible than it did before.

hm, until this is done correctly i guess we should disallow WC to WB 
transitions? A good number of errata apply i suspect :-/

	Ingo
--

From: Ingo Molnar
Date: Friday, March 21, 2008 - 7:55 am

no big issues so far, just a simple build fix for the !MTRR case below.

	Ingo

----------------->
Subject: x86: PAT fix
From: Ingo Molnar <mingo@elte.hu>
Date: Fri Mar 21 15:42:28 CET 2008

build fix for !CONFIG_MTRR.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 include/asm-x86/mtrr.h |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

Index: linux-x86.q/include/asm-x86/mtrr.h
===================================================================
--- linux-x86.q.orig/include/asm-x86/mtrr.h
+++ linux-x86.q/include/asm-x86/mtrr.h
@@ -84,10 +84,9 @@ struct mtrr_gentry
 
 #ifdef __KERNEL__
 
-extern u8 mtrr_type_lookup(u64 addr, u64 end);
-
 /*  The following functions are for use by other drivers  */
 # ifdef CONFIG_MTRR
+extern u8 mtrr_type_lookup(u64 addr, u64 end);
 extern void mtrr_save_fixed_ranges(void *);
 extern void mtrr_save_state(void);
 extern int mtrr_add (unsigned long base, unsigned long size,
@@ -101,6 +100,13 @@ extern void mtrr_ap_init(void);
 extern void mtrr_bp_init(void);
 extern int mtrr_trim_uncached_memory(unsigned long end_pfn);
 #  else
+static inline u8 mtrr_type_lookup(u64 addr, u64 end)
+{
+	/*
+	 * Return no-MTRRs:
+	 */
+	return 0xff;
+}
 #define mtrr_save_fixed_ranges(arg) do {} while (0)
 #define mtrr_save_state() do {} while (0)
 static __inline__ int mtrr_add (unsigned long base, unsigned long size,
--

From: Venki Pallipadi
Date: Friday, March 21, 2008 - 12:26 pm

I think we can support WC in the meantime. Currently we follow the existing TLB
and cache flushing logic, which is an OK solution. I mean, I don't think
there will be nasty hangs etc. because of this. We do keep track of the usage
of these mappings, and there should not be anything other than speculative
accesses from other CPUs at the time we change the attribute. So, we are trying
to double check whether the SDM approach is really needed for our usage model.

BTW, the suggested solution in the SDM says we should make the page "Not
Present" first, flush the TLBs, and then change the attribute and make it
present. This will potentially involve some changes in the page fault handler
as well.
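
The ordering of that sequence can be sketched in userspace C on a toy PTE
value. This is only an illustration under stated assumptions, not kernel code:
the sk_* names are hypothetical, the TLB flush is a stand-in that merely counts
calls, and the WC encoding (PWT set, PCD clear) is an assumption that depends
on how the PAT MSR is programmed. The Present, PWT and PCD bit positions are
the standard x86 PTE bits 0, 3 and 4.

```c
#include <stdint.h>

#define SK_PTE_PRESENT		(1ULL << 0)
#define SK_PTE_PWT		(1ULL << 3)
#define SK_PTE_PCD		(1ULL << 4)
#define SK_PTE_CACHE_MASK	(SK_PTE_PWT | SK_PTE_PCD)

/* Assumed encodings for this sketch; they depend on the PAT MSR layout. */
#define SK_CACHE_WB		0ULL
#define SK_CACHE_WC		SK_PTE_PWT

/* Counts calls so the ordering can be checked; not a real flush. */
static unsigned int sk_tlb_flushes;

static void sk_flush_tlb_all(void)
{
	sk_tlb_flushes++;
}

/* Change the cache attribute of one toy PTE using the four-step order. */
static void sk_change_cache_attr(uint64_t *pte, uint64_t new_cache)
{
	/* 1. Make the page not present. */
	*pte &= ~SK_PTE_PRESENT;

	/* 2. Flush the TLBs so no CPU keeps the old translation. */
	sk_flush_tlb_all();

	/* 3. Change the cache attribute bits. */
	*pte = (*pte & ~SK_PTE_CACHE_MASK) | (new_cache & SK_PTE_CACHE_MASK);

	/* 4. Make the page present again. */
	*pte |= SK_PTE_PRESENT;
}
```

Between steps 1 and 4 any access to the page would fault, which is why the page
fault handler would need to be aware of such a transition.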

Thanks,
Venki
--

From: H. Peter Anvin
Date: Friday, March 21, 2008 - 6:29 am

I have to say I think this looks like a good patchset.  However, I'd 
like a bit more clarification with regard to the above point?

	-hpa
--

From: Venki Pallipadi
Date: Friday, March 21, 2008 - 12:19 pm

X seems to use (in that order)
- mmap the range through /dev/mem
- Set MTRR for the range to WC

I see this happening on one of my test systems with relatively new xorg.

In this case, when mmap does the reserve for this range, if we give a UC mapping
then we will effectively negate the MTRR WC setting, with the range being mapped
UC. To accommodate this special use case, we give /dev/mem mmap (only when there
are no other already existing mappings) a UC_MINUS attribute. With that,
if and when X sets the MTRR, the range will become WC, and until that time it
will be UC. We ensure that all page table mappings use UC_MINUS for that range.
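
The difference between UC and UC_MINUS here can be sketched as a small lookup
of the effective memory type. This is a simplified illustration of the
MTRR/PAT combination rules as I understand them from the SDM, restricted to
the types discussed in this thread; the enum and function names are
hypothetical and are not from the patches.

```c
/* Hypothetical type codes, used only by this sketch. */
enum sk_mtrr_type { SK_MTRR_UC, SK_MTRR_WC, SK_MTRR_WB };
enum sk_pat_type { SK_PAT_UC, SK_PAT_UC_MINUS, SK_PAT_WC, SK_PAT_WB };
enum sk_eff_type { SK_EFF_UC, SK_EFF_WC, SK_EFF_WB };

/* Combine the MTRR type for a range with the PAT type of a mapping. */
static enum sk_eff_type sk_effective_type(enum sk_mtrr_type mtrr,
					  enum sk_pat_type pat)
{
	switch (pat) {
	case SK_PAT_UC:
		/* Strong UC: overrides the MTRR, including MTRR WC. */
		return SK_EFF_UC;
	case SK_PAT_UC_MINUS:
		/* UC unless the MTRR says WC, in which case WC wins. */
		return mtrr == SK_MTRR_WC ? SK_EFF_WC : SK_EFF_UC;
	case SK_PAT_WC:
		/* PAT WC is WC regardless of the MTRR type. */
		return SK_EFF_WC;
	case SK_PAT_WB:
	default:
		/* PAT WB defers to the MTRR type. */
		switch (mtrr) {
		case SK_MTRR_UC:
			return SK_EFF_UC;
		case SK_MTRR_WC:
			return SK_EFF_WC;
		default:
			return SK_EFF_WB;
		}
	}
}
```

In this sketch a UC_MINUS mapping over a range whose MTRR is UC stays UC, and
the same mapping becomes WC once the MTRR for that range is set to WC, whereas
a UC mapping stays UC in both cases.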

Long term, we want X to switch to /proc or /sys interfaces. But, we can also
provide backward compatibility for existing X usage like above.

Thanks,
Venki
--

From: H. Peter Anvin
Date: Friday, March 21, 2008 - 12:59 pm

Makes total sense.  Eventually I think we want to do /proc/mtrr 
emulation, but for now, this is probably the best option.

	-hpa
--
