[PATCH 2/2] x86: mtrr_cleanup try gran_size to less than 1M

Previous thread: Strange mtrrs in Aspire One by J.A. on Monday, September 29, 2008 - 4:57 pm. (8 messages)

Next thread: Re: [PATCH v3 02/14] PCI: prevent duplicate slot names by Kenji Kaneshige on Monday, September 29, 2008 - 7:06 pm. (5 messages)
From: Yinghai Lu
Date: Monday, September 29, 2008 - 6:54 pm

make the print out right with size < 1M

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>

---
 arch/x86/kernel/cpu/mtrr/main.c |  109 ++++++++++++++++++++++++++++++----------
 1 file changed, 82 insertions(+), 27 deletions(-)

Index: linux-2.6/arch/x86/kernel/cpu/mtrr/main.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/cpu/mtrr/main.c
+++ linux-2.6/arch/x86/kernel/cpu/mtrr/main.c
@@ -905,6 +905,27 @@ set_var_mtrr_all(unsigned int address_bi
 	}
 }
 
+static unsigned long to_size_factor(unsigned long sizek, char *factorp)
+{
+	char factor;
+	unsigned long base = sizek;
+
+	if (base & ((1<<10) - 1)) {
+		/* not MB alignment */
+		factor = 'K';
+	} else if (base & ((1<<20) - 1)){
+		factor = 'M';
+		base >>= 10;
+	} else {
+		factor = 'G';
+		base >>= 20;
+	}
+
+	*factorp = factor;
+
+	return base;
+}
+
 static unsigned int __init
 range_to_mtrr(unsigned int reg, unsigned long range_startk,
 	      unsigned long range_sizek, unsigned char type)
@@ -926,13 +947,21 @@ range_to_mtrr(unsigned int reg, unsigned
 			align = max_align;
 
 		sizek = 1 << align;
-		if (debug_print)
+		if (debug_print) {
+			char start_factor = 'K', size_factor = 'K';
+			unsigned long start_base, size_base;
+
+			start_base = to_size_factor(range_startk, &start_factor),
+			size_base = to_size_factor(sizek, &size_factor),
+
 			printk(KERN_DEBUG "Setting variable MTRR %d, "
-				"base: %ldMB, range: %ldMB, type %s\n",
-				reg, range_startk >> 10, sizek >> 10,
+				"base: %ld%cB, range: %ld%cB, type %s\n",
+				reg, start_base, start_factor,
+				size_base, size_factor,
 				(type == MTRR_TYPE_UNCACHABLE)?"UC":
 				    ((type == MTRR_TYPE_WRBACK)?"WB":"Other")
 				);
+		}
 		save_var_mtrr(reg++, range_startk, sizek, type);
 		range_startk += sizek;
 		range_sizek -= sizek;
@@ -1244,6 +1273,8 @@ static int __init mtrr_cleanup(unsigned
 
 	if (mtrr_chunk_size && mtrr_gran_size) {
 		int num_reg;
+		char ...
From: Yinghai Lu
Date: Monday, September 29, 2008 - 6:54 pm

one have gran < 1M

reg00: base=0xd8000000 (3456MB), size= 128MB: uncachable, count=1
reg01: base=0xe0000000 (3584MB), size= 512MB: uncachable, count=1
reg02: base=0x00000000 (   0MB), size=4096MB: write-back, count=1
reg03: base=0x100000000 (4096MB), size= 512MB: write-back, count=1
reg04: base=0x120000000 (4608MB), size= 128MB: write-back, count=1
reg05: base=0xd7f80000 (3455MB), size= 512KB: uncachable, count=1

will get

Found optimal setting for mtrr clean up
gran_size: 512K         chunk_size: 2M  num_reg: 7      lose RAM: 0G
range0: 0000000000000000 - 00000000d8000000
Setting variable MTRR 0, base: 0GB, range: 2GB, type WB
Setting variable MTRR 1, base: 2GB, range: 1GB, type WB
Setting variable MTRR 2, base: 3GB, range: 256MB, type WB
Setting variable MTRR 3, base: 3328MB, range: 128MB, type WB
hole: 00000000d7f00000 - 00000000d7f80000
Setting variable MTRR 4, base: 3455MB, range: 512KB, type UC
rangeX: 0000000100000000 - 0000000128000000
Setting variable MTRR 5, base: 4GB, range: 512MB, type WB
Setting variable MTRR 6, base: 4608MB, range: 128MB, type WB

so start from 64k instead of 1M

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
---
 arch/x86/kernel/cpu/mtrr/main.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Index: linux-2.6/arch/x86/kernel/cpu/mtrr/main.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/cpu/mtrr/main.c
+++ linux-2.6/arch/x86/kernel/cpu/mtrr/main.c
@@ -1197,11 +1197,11 @@ struct mtrr_cleanup_result {
 };
 
 /*
- * gran_size: 1M, 2M, ..., 2G
+ * gran_size: 64K, 128K, 256K, 512K, 1M, 2M, ..., 2G
  * chunk size: gran_size, ..., 2G
- * so we need (1+12)*6
+ * so we need (1+16)*8
  */
-#define NUM_RESULT	78
+#define NUM_RESULT	136
 #define PSHIFT		(PAGE_SHIFT - 10)
 
 static struct mtrr_cleanup_result __initdata result[NUM_RESULT];
@@ -1322,7 +1322,7 @@ static int __init mtrr_cleanup(unsigned
 	i = 0;
 	memset(min_loss_pfn, 0xff, sizeof(min_loss_pfn));
 ...
From: J.A.
Date: Thursday, October 2, 2008 - 3:03 pm

Or there is something I don't catch about mtrrs, or it still does silly
things.

I have an ASUS PCDL, dual xeon, 2Gb of memory. Mtrrs after cleanup are:

werewolf:/proc> cat mtrr
reg00: base=0x00000000 (   0MB), size=1024MB: write-back, count=1
reg01: base=0x40000000 (1024MB), size= 512MB: write-back, count=1
reg02: base=0x60000000 (1536MB), size= 256MB: write-back, count=1
reg03: base=0x70000000 (1792MB), size= 128MB: write-back, count=1
reg04: base=0x78000000 (1920MB), size=  64MB: write-back, count=1
reg05: base=0x7c000000 (1984MB), size=  64MB: write-back, count=1
reg06: base=0x7ff00000 (2047MB), size=   1MB: uncachable, count=1

So it adds a last WB zone, but substracts the last 1Mb. (Why do 
I have that stupid uncacheable mb ? probably a bios issue...)
But those two 64 mb zones could be add to a 128Mb, that new one
with previous to 256Mb and so on, giving something like:

reg00: base=0x00000000 (   0MB), size=2048MB: write-back, count=1
reg01: base=0x7ff00000 (2047MB), size=   1MB: uncachable, count=1

Is this incorrect ?

dmesg is below

Linux version 2.6.26-jam14 (root@werewolf.home) (gcc version 4.3.2 (GCC) ) #2 SMP PREEMPT Thu Oct 2 23:48:10 CEST 2008
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009f400 (usable)
 BIOS-e820: 000000000009f400 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000007fee0000 (usable)
 BIOS-e820: 000000007fee0000 - 000000007fee3000 (ACPI NVS)
 BIOS-e820: 000000007fee3000 - 000000007fef0000 (ACPI data)
 BIOS-e820: 000000007fef0000 - 000000007ff00000 (reserved)
 BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
last_pfn = 0x7fee0 max_arch_pfn = 0x100000
x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
total RAM coverred: 2047M
 gran_size: 64K 	chunk_size: 64K 	num_reg: 8  	lose cover RAM: 7M
 gran_size: 64K 	chunk_size: 128K 	num_reg: 8  	lose cover RAM: 7M
 gran_size: 64K 	chunk_size: 256K 	num_reg: 8  	lose ...
From: Yinghai Lu
Date: Thursday, October 2, 2008 - 3:33 pm

can you boot with mtrr_cleanup_debug?

also what is /proc/mtrr with disable_mtrr_cleanup?

YH
--

From: Yinghai Lu
Date: Thursday, October 2, 2008 - 3:52 pm

are you on latest tip/master?

YH
--

From: J.A.
Date: Thursday, October 2, 2008 - 4:00 pm

No, I use rc8-git3 with your patches (the small gran_size series and the
parameter rename, except the debug one) manually applied.

-- 
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2009.0 (Cooker) for i586
Linux 2.6.25-jam18 (gcc 4.3.1 20080626 (GCC) #1 SMP
--

From: Yinghai Lu
Date: Thursday, October 2, 2008 - 4:20 pm

ah..
please pick up all of them from tip/master...

      x86: don't need to go to chunksize to 4G
      x86: mtrr_cleanup optimization, v2
      x86: add mtrr_cleanup_debug command line
      x86: cleanup, remove extra ifdef
      x86: mtrr_cleanup hole size should be less than half of chunk_size, v2
      x86: mtrr_cleanup safe to get more spare regs now
      x86: mtrr_cleanup prepare to make gran_size to less 1M
      x86: mtrr_cleanup try gran_size to less than 1M
      x86: change MTRR_SANITIZER to def_bool y


YH
--

From: J.A.
Date: Thursday, October 2, 2008 - 3:58 pm

Yeah, it goes right on the Xeon:

reg00: base=0x00000000 (   0MB), size=2048MB: write-back, count=1
reg01: base=0x7ff00000 (2047MB), size=   1MB: uncachable, count=1

Will try your patch now.

-- 
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2009.0 (Cooker) for i586
Linux 2.6.25-jam18 (gcc 4.3.1 20080626 (GCC) #1 SMP
--

From: D. Hugh Redelmeier
Date: Thursday, October 2, 2008 - 4:55 pm

| From: Yinghai Lu <yinghai@kernel.org>

| can you boot with mtrr_cleanup_debug?
| 
| also what is /proc/mtrr with disable_mtrr_cleanup?

Shouldn't mtrr_cleanup_debug show the pre-cleanup MTRR settings?
--

From: Yinghai Lu
Date: Thursday, October 2, 2008 - 6:27 pm

yeah, will add some line about that.

YH
--

From: J.A.
Date: Thursday, October 2, 2008 - 3:32 pm

Also, on a 64 bit box with 4Gb, it gives this:

cicely:~# cat /proc/mtrr
reg00: base=0x00000000 (   0MB), size=4096MB: write-back, count=1
reg01: base=0x100000000 (4096MB), size=1024MB: write-back, count=1
reg02: base=0x140000000 (5120MB), size= 512MB: write-back, count=1
reg03: base=0x160000000 (5632MB), size= 256MB: write-back, count=1
reg04: base=0x80000000 (2048MB), size=2048MB: uncachable, count=1

Patch below cleans up formatting, with space for big bases and sizes (64 Gb).

diff -p -up arch/x86/kernel/cpu/mtrr/if.c.orig arch/x86/kernel/cpu/mtrr/if.c
--- arch/x86/kernel/cpu/mtrr/if.c.orig	2008-10-03 00:16:37.000000000 +0200
+++ arch/x86/kernel/cpu/mtrr/if.c	2008-10-03 00:22:34.000000000 +0200
@@ -16,7 +16,7 @@
 static const char *const mtrr_strings[MTRR_NUM_TYPES] =
 {
     "uncachable",               /* 0 */
-    "write-combining",          /* 1 */
+    "write-combine",            /* 1 */
     "?",                        /* 2 */
     "?",                        /* 3 */
     "write-through",            /* 4 */
@@ -405,9 +405,9 @@ static int mtrr_seq_show(struct seq_file
 			}
 			/* RED-PEN: base can be > 32bit */ 
 			len += seq_printf(seq, 
-				   "reg%02i: base=0x%05lx000 (%4luMB), size=%4lu%cB: %s, count=%d\n",
+				   "reg%02i: base=0x%06lx000 (%5luMB), size=%5lu%cB, count=%d: %s\n",
 			     i, base, base >> (20 - PAGE_SHIFT), size, factor,
-			     mtrr_attrib_to_str(type), mtrr_usage_table[i]);
+			     mtrr_usage_table[i], mtrr_attrib_to_str(type));
 		}
 	}
 	return 0;

-- 
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2009.0 (Cooker) for i586
Linux 2.6.25-jam18 (gcc 4.3.1 20080626 (GCC) #1 SMP
--

From: Yinghai Lu
Date: Thursday, October 2, 2008 - 3:39 pm

boundary handling may have problem...


can you post /proc/mtrr with disable_mtrr_cleanup?

YH
--

From: Yinghai Lu
Date: Thursday, October 2, 2008 - 3:46 pm

can you try

diff --git a/arch/x86/kernel/cpu/mtrr/main.c b/arch/x86/kernel/cpu/mtrr/main.c
index ef64128..70beb13 100644
--- a/arch/x86/kernel/cpu/mtrr/main.c
+++ b/arch/x86/kernel/cpu/mtrr/main.c
@@ -1044,7 +1044,7 @@ second_try:
                hole_sizek = range0_sizek - state->range_sizek - second_sizek;

                /* hole size should be less than half of range0 size */
-               if (hole_sizek > (range0_sizek >> 1) &&
+               if (hole_sizek >= (range0_sizek >> 1) &&
                    range0_sizek >= chunk_sizek) {
                        range0_sizek -= chunk_sizek;
                        second_sizek = 0;

YH
--

From: Ingo Molnar
Date: Friday, October 3, 2008 - 12:38 am

applied to tip/x86/mtrr, thanks Yinghai!

	Ingo
--

From: J.A.
Date: Thursday, October 2, 2008 - 4:20 pm

Oops, sorry, this is without cleanup. This is a distro kernel and is built
but not enable by deafult. As it is rc7, I will use 'enble_mtrr_cleanup' ;):

cicely:~# cat /proc/mtrr
reg00: base=0x00000000 (   0MB), size=2048MB: write-back, count=1
reg01: base=0x100000000 (4096MB), size=1024MB: write-back, count=1
reg02: base=0x140000000 (5120MB), size= 512MB: write-back, count=1
reg03: base=0x160000000 (5632MB), size= 256MB: write-back, count=1

I have lost 2Gb ?

cicely:~# free
             total       used       free     shared    buffers     cached
Mem:       3755568     182348    3573220          0      14024      72716
-/+ buffers/cache:      95608    3659960

I can't easily try your patch, this is a distro kernel.
I will get the src.rpm...

Ahhhhh....

This is a dual opteron board. dmidecode says:

Handle 0x0026, DMI type 16, 15 bytes
Physical Memory Array
    Location: System Board Or Motherboard
    Use: System Memory
    Error Correction Type: Single-bit ECC
    Maximum Capacity: 8 GB
    Error Information Handle: Not Provided
    Number Of Devices: 8

So it maps one Opteron memory in first 4Gb and the other on the second 4Gb.
So I should have 2Gb@0 and 2Gb@4Gb.
What I don't know is why the bios eats up 256Mb.

-- 
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2009.0 (Cooker) for i586
Linux 2.6.25-jam18 (gcc 4.3.1 20080626 (GCC) #1 SMP
--

From: Yinghai Lu
Date: Thursday, October 2, 2008 - 4:45 pm

check if you enable memhole remapping in BIOS.

YH
--

From: Ingo Molnar
Date: Friday, October 3, 2008 - 12:37 am

hm, maybe this bit could confuse versions of Xorg that modifies 
/proc/mtrr - i.e. this could be part of the ABI towards user-space. 
We'll see.

	Ingo
--

From: H. Peter Anvin
Date: Friday, October 3, 2008 - 12:38 am

This *IS* part of the ABI toward userspace, although I think Xorg uses 
the ioctl() interface.

	-hpa
--

From: Ingo Molnar
Date: Friday, October 3, 2008 - 12:42 am

yeah, but aspects of the ABI that applications do not rely on can 
generally be changed. OTOH, i've undone this aspect of the patch - 
/proc/mtrr is a legacy interface and changes to it are unnecessary. I 
kept the boot printout cleanups.

	Ingo
--

From: H. Peter Anvin
Date: Friday, October 3, 2008 - 12:44 am

Great.  It's worth noting that these strings aren't just used for 
readout, but also for setting.

	-hpa
--

From: D. Hugh Redelmeier
Date: Friday, October 3, 2008 - 7:57 am

| From: H. Peter Anvin <hpa@zytor.com>

| Ingo Molnar wrote:
| > * J.A. Magall
From: H. Peter Anvin
Date: Friday, October 3, 2008 - 12:40 am

By the way, this is just plain wrong; the cachability name is "Write 
Combining (WC)" not "write-combine".

	-hpa
--

From: Ingo Molnar
Date: Friday, October 3, 2008 - 12:44 am

and "uncachable" should be "uncacheable" i guess.

anyway, agreed, lets not touch this.

	Ingo
--

From: J.A.
Date: Friday, October 3, 2008 - 12:41 am

Oops, it's not only used for messages, also for writes in /proc !!!

Forget this part.

-- 
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2009.0 (Cooker) for i586
Linux 2.6.25-jam18 (gcc 4.3.1 20080626 (GCC) #1 SMP
--

From: H. Peter Anvin
Date: Monday, September 29, 2008 - 9:07 pm

Applied series to tip:x86/mtrr, thanks!

	-hpa
--

From: Ingo Molnar
Date: Tuesday, September 30, 2008 - 12:45 am

tested them and pushed it out into tip/master.

	Ingo
--

From: Ingo Molnar
Date: Tuesday, September 30, 2008 - 1:11 am

btw., now that we've got wide enough exposure (a kernel release), could 
we please flip around the default of CONFIG_MTRR_SANITIZER and make it 
default-y? This feature should only produce a worse result if there's a 
bug in that code, right?

	Ingo
--

From: Yinghai Lu
Date: Tuesday, September 30, 2008 - 9:06 am

it will compare the exact mem range between the updated and original,
so it should be safe to set it to default-y.

YH
--

From: H. Peter Anvin
Date: Tuesday, September 30, 2008 - 11:50 am

Yes.  Furthermore, right now the code has to be enabled *and* a kernel 
command line option has to be passed.  I suspect we eventually want to 
runtime-enable it by default, too.

	-hpa
--

Previous thread: Strange mtrrs in Aspire One by J.A. on Monday, September 29, 2008 - 4:57 pm. (8 messages)

Next thread: Re: [PATCH v3 02/14] PCI: prevent duplicate slot names by Kenji Kaneshige on Monday, September 29, 2008 - 7:06 pm. (5 messages)