[PATCH 3/10] x86 boot: add header comment to dmi.h stating what it is

Previous thread: v2.6.25: SKB BUG: Invalid truesize (260) len=134, sizeof(sk_buff)=164 by Marco Berizzi on Wednesday, May 14, 2008 - 8:04 am. (2 messages)

Next thread: 2.6.26-rc2-mm1: oops on ARM while registering GPIO Keys by Byron Bradley on Wednesday, May 14, 2008 - 8:34 am. (2 messages)
From: Paul Jackson
Date: Wednesday, May 14, 2008 - 8:15 am

From: Paul Jackson <pj@sgi.com>

The patch:
    x86: convert cpu_to_apicid to be a per cpu variable
introduced a dependency of ipi.h on smp.h in x86
builds with an allnoconfig.  Including smp.h in ipi.h
fixes the build error:
    In file included from arch/x86/kernel/traps_64.c:48:
    include/asm/ipi.h: In function 'send_IPI_mask_sequence':
    include/asm/ipi.h:114: error: 'per_cpu__x86_cpu_to_apicid' undeclared (first use in this function)

Signed-off-by: Paul Jackson <pj@sgi.com>
Cc: Mike Travis <travis@sgi.com>

---
 include/asm-x86/ipi.h |    1 +
 1 file changed, 1 insertion(+)

--- linux.orig/include/asm-x86/ipi.h	2008-05-13 10:44:39.000000000 -0700
+++ linux/include/asm-x86/ipi.h	2008-05-13 10:56:37.431506774 -0700
@@ -20,6 +20,7 @@
 
 #include <asm/hw_irq.h>
 #include <asm/apic.h>
+#include <asm/smp.h>
 
 /*
  * the following functions deal with sending IPIs between CPUs.

-- 
                          I won't rest till it's the best ...
                          Programmer, Linux Scalability
                          Paul Jackson <pj@sgi.com> 1.650.933.1373
--

From: Paul Jackson
Date: Wednesday, May 14, 2008 - 8:15 am

From: Paul Jackson <pj@sgi.com

Remove three extern declarations for routines
that don't exist.  Fix a typo in a comment.

Signed-off-by: Paul Jackson <pj@sgi.com>

---
 arch/x86/mm/numa_64.c |    2 +-
 include/linux/efi.h   |    4 ----
 2 files changed, 1 insertion(+), 5 deletions(-)

--- 2.6.26-rc2-mm1.orig/include/linux/efi.h	2008-05-14 03:07:03.231239814 -0700
+++ 2.6.26-rc2-mm1/include/linux/efi.h	2008-05-14 03:07:07.039302049 -0700
@@ -287,7 +287,6 @@ efi_guid_unparse(efi_guid_t *guid, char 
 extern void efi_init (void);
 extern void *efi_get_pal_addr (void);
 extern void efi_map_pal_code (void);
-extern void efi_map_memmap(void);
 extern void efi_memmap_walk (efi_freemem_callback_t callback, void *arg);
 extern void efi_gettimeofday (struct timespec *ts);
 extern void efi_enter_virtual_mode (void);	/* switch EFI to virtual mode, if possible */
@@ -295,14 +294,11 @@ extern u64 efi_get_iobase (void);
 extern u32 efi_mem_type (unsigned long phys_addr);
 extern u64 efi_mem_attributes (unsigned long phys_addr);
 extern u64 efi_mem_attribute (unsigned long phys_addr, unsigned long size);
-extern int efi_mem_attribute_range (unsigned long phys_addr, unsigned long size,
-				    u64 attr);
 extern int __init efi_uart_console_only (void);
 extern void efi_initialize_iomem_resources(struct resource *code_resource,
 		struct resource *data_resource, struct resource *bss_resource);
 extern unsigned long efi_get_time(void);
 extern int efi_set_rtc_mmss(unsigned long nowtime);
-extern int is_available_memory(efi_memory_desc_t * md);
 extern struct efi_memory_map memmap;
 
 /**
--- 2.6.26-rc2-mm1.orig/arch/x86/mm/numa_64.c	2008-05-14 03:07:03.231239814 -0700
+++ 2.6.26-rc2-mm1/arch/x86/mm/numa_64.c	2008-05-14 03:07:07.047302180 -0700
@@ -231,7 +231,7 @@ void __init setup_node_bootmem(int nodei
 	else
 		bootmap_start = round_up(start, PAGE_SIZE);
 	/*
-	 * SMP_CAHCE_BYTES could be enough, but init_bootmem_node like
+	 * SMP_CACHE_BYTES could be enough, but init_bootmem_node ...
From: Paul Jackson
Date: Wednesday, May 14, 2008 - 8:15 am

From: Paul Jackson <pj@sgi.com>

The "dmi.h" file did not state anywhere in the file what "DMI" was.
For those who know, it's obvious.  For the rest of us, I added a
brief opening comment.

Signed-off-by: Paul Jackson <pj@sgi.com>

---
 drivers/firmware/dmi_scan.c |    5 +++++
 1 file changed, 5 insertions(+)

--- linux.orig/drivers/firmware/dmi_scan.c	2008-04-16 19:49:44.000000000 -0700
+++ linux/drivers/firmware/dmi_scan.c	2008-05-05 03:41:45.921911888 -0700
@@ -8,6 +8,11 @@
 #include <linux/slab.h>
 #include <asm/dmi.h>
 
+/*
+ * DMI stands for "Desktop Management Interface".  It is part
+ * of and an antecedent to, SMBIOS, which stands for System
+ * Management BIOS.  See further: http://www.dmtf.org/standards
+ */
 static char dmi_empty_string[] = "        ";
 
 static const char * __init dmi_string_nosave(const struct dmi_header *dm, u8 s)

-- 
                          I won't rest till it's the best ...
                          Programmer, Linux Scalability
                          Paul Jackson <pj@sgi.com> 1.650.933.1373
--

From: Paul Jackson
Date: Wednesday, May 14, 2008 - 8:15 am

From: Paul Jackson <pj@sgi.com>

The use of #defines with '##' pre-processor concatenation is a useful
way to form several symbol names with a common pattern.  But when there
is just a single name obtained from that #define, it's just obfuscation.
Better to just write the plain symbol name, as is.

The following patch is a result of my wasting ten minutes looking through
the kernel to figure out what 'PB_migrate_end' meant, and forgetting what
I came to do, by the time I figured out that the #define PB_range macro
defined it.

Signed-off-by: Paul Jackson <pj@sgi.com>

---
 include/linux/pageblock-flags.h |    8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

--- linux.orig/include/linux/pageblock-flags.h	2008-05-13 05:42:39.869811856 -0700
+++ linux/include/linux/pageblock-flags.h	2008-05-13 05:51:47.994829315 -0700
@@ -25,13 +25,11 @@
 
 #include <linux/types.h>
 
-/* Macro to aid the definition of ranges of bits */
-#define PB_range(name, required_bits) \
-	name, name ## _end = (name + required_bits) - 1
-
 /* Bit indices that affect a whole block of pages */
 enum pageblock_bits {
-	PB_range(PB_migrate, 3), /* 3 bits required for migrate types */
+	PB_migrate,
+	PB_migrate_end = PB_migrate + 3 - 1,
+			/* 3 bits required for migrate types */
 	NR_PAGEBLOCK_BITS
 };
 

-- 
                          I won't rest till it's the best ...
                          Programmer, Linux Scalability
                          Paul Jackson <pj@sgi.com> 1.650.933.1373
--

From: Paul Jackson
Date: Wednesday, May 14, 2008 - 8:15 am

From: Paul Jackson <pj@sgi.com>

Standardize a few pointer declarations to not have the
extra space after the '*' character.

Signed-off-by: Paul Jackson <pj@sgi.com>

---
 arch/x86/kernel/e820_32.c |    2 +-
 arch/x86/kernel/e820_64.c |    2 +-
 arch/x86/kernel/efi_64.c  |    2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

--- 2.6.26-rc2-mm1.orig/arch/x86/kernel/e820_32.c	2008-05-14 02:27:19.659222093 -0700
+++ 2.6.26-rc2-mm1/arch/x86/kernel/e820_32.c	2008-05-14 02:50:11.450652182 -0700
@@ -280,7 +280,7 @@ void __init add_memory_region(unsigned l
  * replaces the original e820 map with a new one, removing overlaps.
  *
  */
-int __init sanitize_e820_map(struct e820entry * biosmap, char * pnr_map)
+int __init sanitize_e820_map(struct e820entry *biosmap, char *pnr_map)
 {
 	struct change_member *change_tmp;
 	unsigned long current_type, last_type;
--- 2.6.26-rc2-mm1.orig/arch/x86/kernel/e820_64.c	2008-05-14 02:27:19.687222570 -0700
+++ 2.6.26-rc2-mm1/arch/x86/kernel/e820_64.c	2008-05-14 03:04:29.540729012 -0700
@@ -740,7 +740,7 @@ static void early_panic(char *msg)
 }
 
 /* We're not void only for x86 32-bit compat */
-char * __init machine_specific_memory_setup(void)
+char *__init machine_specific_memory_setup(void)
 {
 	char *who = "BIOS-e820";
 	/*
--- 2.6.26-rc2-mm1.orig/arch/x86/kernel/efi_64.c	2008-05-14 02:27:19.783224204 -0700
+++ 2.6.26-rc2-mm1/arch/x86/kernel/efi_64.c	2008-05-14 02:50:11.474652591 -0700
@@ -103,7 +103,7 @@ void __init efi_reserve_bootmem(void)
 				memmap.nr_map * memmap.desc_size);
 }
 
-void __iomem * __init efi_ioremap(unsigned long phys_addr, unsigned long size)
+void __iomem *__init efi_ioremap(unsigned long phys_addr, unsigned long size)
 {
 	static unsigned pages_mapped __initdata;
 	unsigned i, pages;

-- 
                          I won't rest till it's the best ...
                          Programmer, Linux Scalability
                          Paul Jackson <pj@sgi.com> 1.650.933.1373
--

From: Paul Jackson
Date: Wednesday, May 14, 2008 - 8:15 am

From: Paul Jackson <pj@sgi.com>

This patch is motivated by a subsequent patch which will allow for more
memory map entries on EFI supported systems than can be passed via the x86
legacy BIOS E820 interface.  The legacy interface is limited to E820MAX ==
128 memory entries, and that "E820MAX" manifest constant was used as the
size for several arrays and loops over those arrays.

The primary change in this patch is to change code loop sizes over those
arrays from using the constant E820MAX, to using the ARRAY_SIZE() macro
evaluated for the array being looped.  That way, a subsequent patch can
change the size of some of these arrays, without breaking this code.

This patch also adds a parameter to the sanitize_e820_map() routine,
which had an implicit size for the array passed it of E820MAX entries.
This new parameter explicitly passes the size of said array.  Once again,
this will allow a subsequent patch to change that array size for some
calls to sanitize_e820_map() without breaking the code.

As part of enhancing the sanitize_e820_map() interface this way, I further
combined the unnecessarily distinct x86_32 and x86_64 declarations for
this routine into a single, commonly used, declaration.

This patch in itself should make no difference to the resulting kernel
binary.

Signed-off-by: Paul Jackson <pj@sgi.com>

---
 arch/x86/boot/memory.c        |    3 ++-
 arch/x86/kernel/e820_32.c     |    9 +++++----
 arch/x86/kernel/e820_64.c     |   15 +++++++++------
 arch/x86/mach-default/setup.c |    4 +++-
 arch/x86/mach-voyager/setup.c |    4 +++-
 include/asm-x86/setup.h       |    4 +++-
 6 files changed, 25 insertions(+), 14 deletions(-)

--- 2.6.26-rc2-mm1.orig/arch/x86/boot/memory.c	2008-05-14 03:07:02.923234780 -0700
+++ 2.6.26-rc2-mm1/arch/x86/boot/memory.c	2008-05-14 03:07:20.607523804 -0700
@@ -13,6 +13,7 @@
  */
 
 #include "boot.h"
+#include <linux/kernel.h>
 
 #define SMAP	0x534d4150	/* ASCII "SMAP" */
 
@@ -53,7 +54,7 @@ static int ...
From: Paul Jackson
Date: Wednesday, May 14, 2008 - 8:15 am

From: Paul Jackson <pj@sgi.com>

Extend internal boot time memory tables to allow for up to
three entries per node, which may be larger than the 128 E820MAX
entries handled by the legacy BIOS E820 interface.  The EFI
interface, if present, is capable of passing memory map
entries for these larger node counts.

This patch requires an earlier patch that rewrote code depending
on these array sizes from using E820MAX explicitly to size loops,
to instead using ARRAY_SIZE() of the applicable array.

Another patch following this one will provide the code to pick
up additional memory entries passed via the EFI interface from
the BIOS and insert them in the following, now enlarged, arrays.

Signed-off-by: Paul Jackson <pj@sgi.com>

---
 arch/x86/kernel/e820_32.c |    8 ++++----
 arch/x86/kernel/e820_64.c |    8 ++++----
 include/asm-x86/e820.h    |   37 ++++++++++++++++++++++++++++++++++++-
 3 files changed, 44 insertions(+), 9 deletions(-)

--- 2.6.26-rc2-mm1.orig/include/asm-x86/e820.h	2008-05-14 03:07:02.719231446 -0700
+++ 2.6.26-rc2-mm1/include/asm-x86/e820.h	2008-05-14 08:03:45.138599021 -0700
@@ -2,6 +2,41 @@
 #define __ASM_E820_H
 #define E820MAP	0x2d0		/* our map */
 #define E820MAX	128		/* number of entries in E820MAP */
+
+/*
+ * Legacy E820 BIOS limits us to 128 (E820MAX) nodes due to the
+ * constrained space in the zeropage.  If we have more nodes than
+ * that, and if we've booted off EFI firmware, then the EFI tables
+ * passed us from the EFI firmware can list more nodes.  Size our
+ * internal memory map tables to have room for these additional
+ * nodes, based on up to three entries per node for which the
+ * kernel was built: MAX_NUMNODES == (1 << CONFIG_NODES_SHIFT),
+ * plus E820MAX, allowing space for the possible duplicate E820
+ * entries that might need room in the same arrays, prior to the
+ * call to sanitize_e820_map() to remove duplicates.  The allowance
+ * of three memory map entries per node is "enough" entries for
+ * the initial hardware ...
From: Paul Jackson
Date: Wednesday, May 14, 2008 - 8:15 am

From: Paul Jackson <pj@sgi.com>

The map size counter passed into, and back out of, sanitize_e820_map(),
was an eight bit type (char or u8), as derived from its origins in
legacy BIOS E820 structures.  This patch changes that type to an 'int',
to allow this sanitize routine to also be used on larger maps (larger
than the 256 count that fits in a char).  The legacy BIOS E820 interface
of course does not change; that remains at 8 bits for this count, holding
up to E820MAX == 128 entries.  But the kernel internals can handle more
when those additional memory map entries are passed from the BIOS via
EFI interfaces.

Signed-off-by: Paul Jackson <pj@sgi.com>

---
 arch/x86/kernel/e820_32.c     |    4 ++--
 arch/x86/kernel/e820_64.c     |   12 ++++++++----
 arch/x86/mach-default/setup.c |    5 ++++-
 arch/x86/mach-voyager/setup.c |    5 ++++-
 include/asm-x86/setup.h       |    2 +-
 5 files changed, 19 insertions(+), 9 deletions(-)

--- 2.6.26-rc2-mm1.orig/arch/x86/kernel/e820_32.c	2008-05-14 03:07:25.795608600 -0700
+++ 2.6.26-rc2-mm1/arch/x86/kernel/e820_32.c	2008-05-14 03:07:30.395683787 -0700
@@ -281,7 +281,7 @@ void __init add_memory_region(unsigned l
  *
  */
 int __init sanitize_e820_map(struct e820entry *biosmap, int max_nr_map,
-				char *pnr_map)
+				int *pnr_map)
 {
 	struct change_member *change_tmp;
 	unsigned long current_type, last_type;
@@ -765,7 +765,7 @@ void __init update_memory_range(u64 star
 }
 void __init update_e820(void)
 {
-	u8 nr_map;
+	int nr_map;
 
 	nr_map = e820.nr_map;
 	if (sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &nr_map))
--- 2.6.26-rc2-mm1.orig/arch/x86/kernel/e820_64.c	2008-05-14 03:07:25.811608861 -0700
+++ 2.6.26-rc2-mm1/arch/x86/kernel/e820_64.c	2008-05-14 03:07:30.399683852 -0700
@@ -510,7 +510,7 @@ static void __init e820_print_map(char *
  *
  */
 int __init sanitize_e820_map(struct e820entry *biosmap, int max_nr_map,
-				    char *pnr_map)
+				    int *pnr_map)
 {
 	struct change_member {
 		struct e820entry ...
From: Paul Jackson
Date: Wednesday, May 14, 2008 - 8:15 am

From: Paul Jackson <pj@sgi.com>

Elaborate on the comment for sanitize_e820_map(), epxlaining more what
it does, what it inputs, and what it returns.  Rearrange the placement of
this comment to fit kernel conventions, before the routine's code rather
than buried inside it.

Signed-off-by: Paul Jackson <pj@sgi.com>

---
 arch/x86/kernel/e820_64.c |   94 +++++++++++++++++++++++++++-------------------
 1 file changed, 56 insertions(+), 38 deletions(-)

--- 2.6.26-rc2-mm1.orig/arch/x86/kernel/e820_64.c	2008-05-14 03:07:30.399683852 -0700
+++ 2.6.26-rc2-mm1/arch/x86/kernel/e820_64.c	2008-05-14 03:07:34.755755053 -0700
@@ -506,9 +506,64 @@ static void __init e820_print_map(char *
  * Sanitize the BIOS e820 map.
  *
  * Some e820 responses include overlapping entries. The following
- * replaces the original e820 map with a new one, removing overlaps.
+ * replaces the original e820 map with a new one, removing overlaps,
+ * and resolving conflicting memory types in favor of highest
+ * numbered type.
  *
+ * The input parameter biosmap points to an array of 'struct
+ * e820entry' which on entry has elements in the range [0, *pnr_map)
+ * valid, and which has space for up to max_nr_map entries.
+ * On return, the resulting sanitized e820 map entries will be in
+ * overwritten in the same location, starting at biosmap.
+ *
+ * The integer pointed to by pnr_map must be valid on entry (the
+ * current number of valid entries located at biosmap) and will
+ * be updated on return, with the new number of valid entries
+ * (something no more than max_nr_map.)
+ *
+ * The return value from sanitize_e820_map() is zero if it
+ * successfully 'sanitized' the map entries passed in, and is -1
+ * if it did nothing, which can happen if either of (1) it was
+ * only passed one map entry, or (2) any of the input map entries
+ * were invalid (start + size < start, meaning that the size was
+ * so big the described memory range wrapped around through zero.)
+ *
+ *	Visually we're performing the ...
From: Paul Jackson
Date: Wednesday, May 14, 2008 - 8:15 am

From: Paul Jackson <pj@sgi.com>

Add to the kernels boot memory map 'memmap' entries found in
the EFI memory descriptors passed in from the BIOS.

On EFI systems, up to E820MAX == 128 memory map entries can
be passed via the legacy E820 interface (limited by the size
of the 'zeropage').  These entries can be duplicated in the
EFI descriptors also passed from the BIOS, and possibly more
entries passed by the EFI interface, which does not have the
E820MAX limit on number of memory map entries.

This code doesn't worry about the likely duplicate, overlapping
or (unlikely) conflicting entries between the EFI map and the
E820 map.  It just dumps all the EFI entries into the memmap[]
array (which already has the E820 entries) and lets the existing
routine sanitize_e820_map() sort the mess out.

Signed-off-by: Paul Jackson <pj@sgi.com>

---
 arch/x86/kernel/efi.c |   26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

--- linux.orig/arch/x86/kernel/efi.c	2008-05-06 06:44:32.332774433 -0700
+++ linux/arch/x86/kernel/efi.c	2008-05-06 06:55:20.767816417 -0700
@@ -213,6 +213,31 @@ unsigned long efi_get_time(void)
 		      eft.minute, eft.second);
 }
 
+/*
+ * Tell the kernel about the EFI memory map.  This might include
+ * more than the max 128 entries that can fit in the e820 legacy
+ * (zeropage) memory map.
+ */
+
+static void __init add_efi_memmap(void)
+{
+	void *p;
+
+	for (p = memmap.map; p < memmap.map_end; p += memmap.desc_size) {
+		efi_memory_desc_t *md = p;
+		unsigned long long start = md->phys_addr;
+		unsigned long long size = md->num_pages << EFI_PAGE_SHIFT;
+		int e820_type;
+
+		if (md->attribute & EFI_MEMORY_WB)
+			e820_type = E820_RAM;
+		else
+			e820_type = E820_RESERVED;
+		add_memory_region(start, size, e820_type);
+	}
+	sanitize_e820_map(e820.map, ARRAY_SIZE(e820.map), &e820.nr_map);
+}
+
 #if EFI_DEBUG
 static void __init print_efi_memmap(void)
 {
@@ -370,6 +395,7 @@ void __init efi_init(void)
 	if (memmap.desc_size != ...
From: Paul Jackson
Date: Thursday, May 15, 2008 - 5:03 pm

Ying Huang, H. Peter Anvin, Andi Kleen, Ingo Molnar (and anyone else
interested):

I invite your review of these patches, especially 7, 9 and 10 of the
set of 10, that I posted yesterday.

In combination, I believe that they allow passing more than E820MAX
(128) memory map entries from EFI firmware to a booting kernel, beyond
the E820 constraints of non-EFI legacy BIOS's, without changing the
EFI interface interface.  My understanding of the EFI interface is
that it already allows for this, and that we just needed to add the
ability, as done in these patches, for the kernel to make use of this.

The separate work of Ying Huang, to leverage Andi's reserve_early()
and to revise the EFI interface, to pass multiple pages of information
from the EFI firmware to the kernel at boot is, I presume, still needed
for other extensions, such as for passing more hard disk drive entries.

But the particular extension of interest to me, passing more memory map
entries (for future x86_64 systems with more than 128 memory nodes)
can I believe be done fairly easily, with these patches, with no need
of revising the EFI interface or reserving early memory, and with code
that is compatible across both x86_32 and x86_64 arch's.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214
--

From: Ingo Molnar
Date: Friday, May 16, 2008 - 5:38 am

thanks Paul, i've queued up your changes to the -tip tree - they look 
nice and finegrained.

I merged them ontop of Yinghai's e820 unification work. The merge was 
far from trivial (functions you were patching moved to other files, 
interface definitions changed, etc.) so double-checking it would be 
nice. This will all show up in the x86/mtrr branch in -tip, probably 
later today.

	Ingo
--

From: Paul Jackson
Date: Friday, May 16, 2008 - 9:30 am

Ok - will do.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214
--

From: Paul Jackson
Date: Saturday, May 17, 2008 - 11:49 pm

I only see one detail worth mentioning on reading the resulting merge
in ingo.tip.

Shouldn't the static array declarations in arch/x86/kernel/e820.c
be indented?  It looks like the -file- static declarations
from arch/x86/kernel/e820_32.c (which were not indented) took
precedence in the merge over the -function- static declarations
from arch/x86/kernel/e820_64.c.  The code, from the perspective
of what ends up in the compiled kernel, seems fine -- just the
indentation off.

int __init sanitize_e820_map(struct e820entry *biosmap, int max_nr_map,
				int *pnr_map)
{
	struct change_member {
		struct e820entry *pbios; /* pointer to original bios entry */
		unsigned long long addr; /* address for this change point */
	};
static struct change_member change_point_list[2*E820_X_MAX] __initdata;
static struct change_member *change_point[2*E820_X_MAX] __initdata;
static struct e820entry *overlap_list[E820_X_MAX] __initdata;
static struct e820entry new_bios[E820_X_MAX] __initdata;
	struct change_member *change_tmp;
	unsigned long current_type, last_type;
	unsigned long long last_addr;


-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214
--

From: Ingo Molnar
Date: Monday, May 19, 2008 - 9:10 am

indeed that's quite off.

	Ingo
--

From: Huang, Ying
Date: Monday, May 26, 2008 - 8:12 pm

Hi, Paul,

Sorry for my late. I am busy on something other these days.

Because EFI memory map are converted to E820 memory map in boot loader
if the converted E820 entry number is below 128 now, I think it is
better to do all the conversion in boot loader. The limitation lies in
the size of struct boot_params (zero page), which is 4k. But there is an
extension to struct boot_params named linked list of struct setup_data
in a previous patch:

http://lkml.org/lkml/2008/3/27/26

With the help of linked list of struct setup_data, the E820 memory map
with more than 128 entries in EFI system can be passed to kernel as
follow.

1. In boot loader, EFI memory map is converted to E820 memory map A
2. The low 128 map entries of map A is copied to e820_map of struct
boot_params.
3. A setup_data struct is constructed to hold the other E820 entries.

4. In Linux kernel, the E820 map entries from both struct boot_params
and setup_data are used.

Best Regards,
Huang Ying

--

From: Paul Jackson
Date: Tuesday, May 27, 2008 - 12:08 pm

That previous patch of yours that you reference required modifying
the long standing E820 interface to the kernel, including bumping the
boot_params.hdr.version from 0x0208 to0x0209, and required reserving
additional early boot memory, in order to manage new linked list
data structures to be passed to the kernel and to allocate space for
these structures in early boot.

Why do you think that patch is better?  You say "because entries below
128 are already converted to the E820 map."

But I reply mine is better, because we can already provide, with the
current EFI interfaces, all the entries from the boot loader to the
kernel, so we do not need to provide them twice, a second time via an
E820 map mechanism that must be revised and extended to handle the
additional memory map entries past the first 128 of them.

Why should we do the extra work of passing something in a more difficult
manner, revising a key interface, when we can already pass it with
existing interface?


Resolving this need to pass more than 128 memory map entries has been
in development now for over six months.  I see related developments in
this area dating back to early October 2007.  I have been quite
explicit since January of 2008 that we (SGI) needed a solution to this
fairly soon.  I am running out of time, and need to have a solution to
this accepted.

If at some point you have other reasons to revision the EFI and E820
interfaces, for example to pass the EDD BIOS Enhanced Disk Drive
Services past what the classic E820 interface handles, that would be
fine my me.

I am not opposed to your patch mentioned above.  It's just that I don't
think it is needed for this particular purpose, of extending the number
of memory map entries past 128.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.940.382.4214
--

From: H. Peter Anvin
Date: Tuesday, May 27, 2008 - 12:16 pm

This was discussed over a year ago, and it was declared the boot 
loaders' responsibility.  Please go back and look at the archives for 
the discussion rather than rehashing it at this time.

	-hpa
--

From: Paul Jackson
Date: Wednesday, May 28, 2008 - 2:59 am

Good grief - you're right.  I had only looked back nine months in my
archives, as you can tell, from my last message, in which I recite
some of that history.  This saga has dragged on for two years now.

Summary of what's transpired in the last two years:

  A] In the summer of 2006, Edgar Hucek submitted a patch that
     after some back and forth ended up doing pretty much what my
     latest patch does -- enabling the kernel to extract any EFI
     memory map entries and add them to the E820 map.
     
     At the time, there seemed to be agreement from both Linus and
     yourself (hpa) that this was a good approach.
     
     However, Edgar Hucek, perhaps not used to the various details
     and difficulties needed to gain acceptance of a kernel patch,
     gave up and went away before the patch gained final acceptance.

  B] Beginning in May of 2007, with the work of Chandramouli
     "Mouli" Narayanan, and continuing in July 2007 with the
     work of Ying Huang, when Mouli went on sabbatical, an
     extended series of patches has added EFI support to x86_32
     and x86_64.

  C] Beginning in August 2007, H. Peter Anvin has been advocating
     that instead of the approach of [A] above, rather we extend the
     e820 interface between the bootloader and the kernel to pass
     more than a single 4K page (the legacy 'zeropage'), using a
     linked of setup_data structs.
     
     Ying Huang has been fully supportive of this direction,  and,
     along with the reserve_memory() apparatus by Andi Kleen,
     providing the bulk of the code implementing it.
     
     The essential motivation for this extension of the e820 interface
     seems to be allowing for more than six drives in the EDD BIOS
     Enhanced Disk Drive Services interface.

  D] Beginning in January 2008, I have been repeatedly urging
     the portion of this work along that would enable more
     memory nodes than the 128 (E820MAX) allowed by the legacy
     e820 bios interface.  It is ...
From: Mike Travis
Date: Friday, May 16, 2008 - 6:20 am

Acked-By: Mike Travis <travis@sgi.com>


--

Previous thread: v2.6.25: SKB BUG: Invalid truesize (260) len=134, sizeof(sk_buff)=164 by Marco Berizzi on Wednesday, May 14, 2008 - 8:04 am. (2 messages)

Next thread: 2.6.26-rc2-mm1: oops on ARM while registering GPIO Keys by Byron Bradley on Wednesday, May 14, 2008 - 8:34 am. (2 messages)