Now that 2.6.26-rc1 boots on my Ultra5, I noticed that it
reports having only 128MB RAM, while earlier kernels reported
the correct amount: 256MB.

A diff of the dmesg output from 2.6.25 and 2.6.26-rc1 shows:

--- dmesg-2.6.25  2008-05-07 19:41:26.000000000 +0200
+++ dmesg-2.6.26-rc1  2008-05-07 19:41:26.000000000 +0200
@@ -1,24 +1,36 @@
 PROMLIB: Sun IEEE Boot Prom 'OBP 3.25.3 2000/06/29 14:12'
 PROMLIB: Root node compatible:
-Linux version 2.6.25 (mikpe@sparge) (gcc version 4.2.3) #1 Thu Apr 17 19:53:16 CEST 2008
+Linux version 2.6.26-rc1 (mikpe@sparge) (gcc version 4.2.3) #1 Wed May 7 17:41:33 CEST 2008
 console [earlyprom0] enabled
 ARCH: SUN4U
 Ethernet address: 08:00:20:fd:ec:1f
 Kernel: Using 1 locked TLB entries for main kernel image.
 Remapping the kernel... done.
-[0000000200000000-fffff80000400000] page_structs=262144 node=0 entry=0/0
-[0000000200000000-fffff80000800000] page_structs=262144 node=0 entry=1/0
-[0000000200000000-fffff80000c00000] page_structs=262144 node=0 entry=2/0
-[0000000200000000-fffff80001000000] page_structs=262144 node=0 entry=3/0
 OF stdout device is: /pci@1f,0/pci@1,1/SUNW,m64B@2
-PROM: Built device tree with 46848 bytes of memory.
-On node 0 totalpages: 32298
+PROM: Built device tree with 46641 bytes of memory.
+Top of RAM: 0x17f46000, Total RAM: 0xff40000
+Memory hole size: 128MB
+Entering add_active_range(0, 0, 16384) 0 entries of 256 used
+Entering add_active_range(0, 32768, 49023) 1 entries of 256 used
+Entering add_active_range(0, 49024, 49053) 2 entries of 256 used
+Entering add_active_range(0, 49055, 49059) 3 entries of 256 used
+[0000000200000000-fffff80000400000] page_structs=131072 node=0 entry=0/0
+[0000000200000000-fffff80000800000] page_structs=131072 node=0 entry=1/0
+Zone PFN ranges:
+ Normal 0 -> 49059
+Movable zone start PFN for each node
+early_node_map[4] active PFN ranges
+ 0: 0 -> 16384
+ 0: 32768 -> 49023
+ 0: 49024 -> 49053
+ 0: 49055 -> 49059
+On node 0 ...
From: Mikael Pettersson <mikpe@it.uu.se>
Thanks for the report, I'll try to track this one down.
I have a machine configured similarly to yours, so this ought not
to be too difficult to fix... I hope. :-)
--
From: Mikael Pettersson <mikpe@it.uu.se>
Try as I might I couldn't reproduce this, although I did find another
bug along the way.
But that's OK, we'll add some debugging and fetch the necessary
information from your machine.
The good news is that the early bootup does see all 256MB of your
memory:
Top of RAM: 0x17f46000, Total RAM: 0xff40000
Memory hole size: 128MB
Entering add_active_range(0, 0, 16384) 0 entries of 256 used
Entering add_active_range(0, 32768, 49023) 1 entries of 256 used
Entering add_active_range(0, 49024, 49053) 2 entries of 256 used
Entering add_active_range(0, 49055, 49059) 3 entries of 256 used
That "0xff40000" value is 267649024 decimal, and the sizes of the page
ranges registered next match up.
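As a quick cross-check of that claim, the four add_active_range() page ranges can be summed and converted to bytes. The following is a standalone sketch, not kernel code. It assumes the 8KB base page size of sparc64 (a page shift of 13) and treats each range's end PFN as exclusive; the names pfn_range, ranges_to_bytes() and boot_total_bytes() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: sparc64 uses 8KB base pages, i.e. a page shift of 13. */
#define SKETCH_PAGE_SHIFT 13

/* Hypothetical helper type: one [start_pfn, end_pfn) range from the log. */
struct pfn_range {
	unsigned long start_pfn;
	unsigned long end_pfn;
};

/* The four ranges passed to add_active_range() in the boot log. */
static const struct pfn_range boot_ranges[] = {
	{     0, 16384 },
	{ 32768, 49023 },
	{ 49024, 49053 },
	{ 49055, 49059 },
};

/* Sum the pages covered by n ranges and convert the total to bytes. */
unsigned long long ranges_to_bytes(const struct pfn_range *r, size_t n)
{
	unsigned long long pages = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		assert(r[i].end_pfn >= r[i].start_pfn);
		pages += r[i].end_pfn - r[i].start_pfn;
	}
	return pages << SKETCH_PAGE_SHIFT;
}

/* Total for the ranges in the log: 32672 pages of 8KB each. */
unsigned long long boot_total_bytes(void)
{
	return ranges_to_bytes(boot_ranges,
			       sizeof(boot_ranges) / sizeof(boot_ranges[0]));
}
```

Under those assumptions boot_total_bytes() comes out to 0xff40000, the "Total RAM" value in the log.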
And yet we get:
Memory: 127016k available (1920k kernel code, 744k data, 152k init) [fffff80000000000,0000000017f46000]
which is strange.
Between these two events there is only a handful of bootmem
allocations, which together should not total 128MB on your
machine. :-)
We have some existing debugging, which I'd like you to enable on the
boot command line. Simply add "numa=debug" and that'll get some more
verbose information.
Please also add the debugging patch below.
Thanks!
diff --git a/lib/lmb.c b/lib/lmb.c
index 83287d3..3f55973 100644
--- a/lib/lmb.c
+++ b/lib/lmb.c
@@ -19,6 +19,8 @@
struct lmb lmb;
+#define DEBUG
+
void lmb_dump_all(void)
{
#ifdef DEBUG
@@ -29,7 +31,7 @@ void lmb_dump_all(void)
pr_debug(" memory.size = 0x%llx\n",
(unsigned long long)lmb.memory.size);
for (i=0; i < lmb.memory.cnt ;i++) {
- pr_debug(" memory.region[0x%x].base = 0x%llx\n",
+ pr_debug(" memory.region[0x%lx].base = 0x%llx\n",
i, (unsigned long long)lmb.memory.region[i].base);
pr_debug(" .size = 0x%llx\n",
(unsigned long long)lmb.memory.region[i].size);
@@ -38,7 +40,7 @@ void lmb_dump_all(void)
pr_debug(" reserved.cnt = 0x%lx\n", lmb.reserved.cnt);
pr_debug(" ...
David Miller writes:
> From: Mikael Pettersson <mikpe@it.uu.se>
> Date: Wed, 7 May 2008 20:36:55 +0200
>
> > Now that 2.6.26-rc1 boots on my Ultra5, I noticed that it
> > reports having only 128MB RAM, while earlier kernels reported
> > the correct amount: 256MB.
> >
> > A diff of the dmesg output from 2.6.25 and 2.6.26-rc1 shows:
>
> Try as I might I couldn't reproduce this, although I did find another
> bug along the way.
>
> But that's OK, we'll add some debugging and fetch the necessary
> information from your machine.
>
> The good news is that the early bootup does see all 256MB of your
> memory:
>
> Top of RAM: 0x17f46000, Total RAM: 0xff40000
> Memory hole size: 128MB
> Entering add_active_range(0, 0, 16384) 0 entries of 256 used
> Entering add_active_range(0, 32768, 49023) 1 entries of 256 used
> Entering add_active_range(0, 49024, 49053) 2 entries of 256 used
> Entering add_active_range(0, 49055, 49059) 3 entries of 256 used
>
> That "0xff40000" value is 267649024 decimal, and the sizes of the page
> ranges registered next match up.
>
> And yet we get:
>
> Memory: 127016k available (1920k kernel code, 744k data, 152k init) [fffff80000000000,0000000017f46000]
>
> which is strange.
>
> Between these two events there is only a handful of bootmem
> allocations, which together should not total 128MB on your
> machine. :-)
>
> We have some existing debugging, which I'd like you to enable on the
> boot command line. Simply add "numa=debug" and that'll get some more
> verbose information.
>
> Please also add the debugging patch below.

Right, 2.6.26-rc2 plus your debugging patch and booted with numa=debug
prints the following:

[ 0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 3.25.3 2000/06/29 14:12'
[ 0.000000] PROMLIB: Root node compatible:
[ 0.000000] Linux version 2.6.26-rc2-test (mikpe@sparge) (gcc version 4.2.3) #1 Mon May 12 19:22:10 CEST 2008
[ 0.000000] console [earlyprom0] enabled
[ ...
From: Mikael Pettersson <mikpe@it.uu.se>
It's completely new and harmless.
You have memory at 0->128MB and at 256MB->384MB. That gap from
128MB->256MB is the hole it is talking about.
It's computed very simply: it is the difference between the highest
physical address of available memory and the total amount of
available memory.
Don't let it scare you, it's 2GB on one of my machines :-)
--
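The hole computation described above can be reproduced from the two values in the "Top of RAM" log line. This is a minimal sketch of that arithmetic only. The function names are hypothetical, and rounding down to whole megabytes is an assumption made here for display, not something the thread states about the kernel.

```c
#include <assert.h>

/* Hole size as described: highest address of available RAM minus total RAM. */
unsigned long long memory_hole_bytes(unsigned long long top_of_ram,
				     unsigned long long total_ram)
{
	assert(top_of_ram >= total_ram);
	return top_of_ram - total_ram;
}

/* Round a byte count down to whole megabytes (assumption for display). */
unsigned long long bytes_to_mb(unsigned long long bytes)
{
	return bytes >> 20;
}
```

With the logged values, memory_hole_bytes(0x17f46000, 0xff40000) is 0x8006000 bytes, which bytes_to_mb() reports as 128, matching the "Memory hole size: 128MB" line.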
From: Mikael Pettersson <mikpe@it.uu.se>
Hmmm... my debugging patch had this:
diff --git a/lib/lmb.c b/lib/lmb.c
index 83287d3..3f55973 100644
--- a/lib/lmb.c
+++ b/lib/lmb.c
...
+#define DEBUG
+
which should have resulted in lmb_dump_all() printing some useful
debugging messages. I validated that it did so on my machine with the
patch applied, but they appear nowhere in your logs :(
paging_init() in arch/sparc64/mm/init.c calls lmb_analyze() then lmb_dump_all().
Those messages go out with KERN_DEBUG log level, maybe messages at
that level were trimmed by your log capture for some reason?
In any event I think I know the area that's causing some kind of problem.
It looks like lmb_alloc() has a case where it will reserve the wrong
amount of memory, or something like that.
You can remove the debugging patch I sent you, and try this one instead.
Please make sure KERN_DEBUG messages make it into the log :-)
diff --git a/arch/sparc64/mm/init.c b/arch/sparc64/mm/init.c
index a9828d7..a628a99 100644
--- a/arch/sparc64/mm/init.c
+++ b/arch/sparc64/mm/init.c
@@ -1353,6 +1353,8 @@ static void __init bootmem_init_one_node(int nid)
numadbg("bootmem_init_one_node(%d)\n", nid);
+ lmb_dump_all();
+
p = NODE_DATA(nid);
if (p->node_spanned_pages) {
diff --git a/lib/lmb.c b/lib/lmb.c
index 83287d3..d8c84f3 100644
--- a/lib/lmb.c
+++ b/lib/lmb.c
@@ -17,6 +17,8 @@
#define LMB_ALLOC_ANYWHERE 0
+#define DEBUG
+
struct lmb lmb;
void lmb_dump_all(void)
@@ -29,7 +31,7 @@ void lmb_dump_all(void)
pr_debug(" memory.size = 0x%llx\n",
(unsigned long long)lmb.memory.size);
for (i=0; i < lmb.memory.cnt ;i++) {
- pr_debug(" memory.region[0x%x].base = 0x%llx\n",
+ pr_debug(" memory.region[0x%lx].base = 0x%llx\n",
i, (unsigned long long)lmb.memory.region[i].base);
pr_debug(" .size = 0x%llx\n",
(unsigned long long)lmb.memory.region[i].size);
@@ -38,7 +40,7 @@ void lmb_dump_all(void)
...
From: David Miller <davem@davemloft.net>
Nevermind I see why the messages don't show up. Hold on for a second
and I'll send an updated debugging patch.
Thanks.
--
From: David Miller <davem@davemloft.net>
Try this patch instead, thanks!
diff --git a/arch/sparc64/mm/init.c b/arch/sparc64/mm/init.c
index a9828d7..a628a99 100644
--- a/arch/sparc64/mm/init.c
+++ b/arch/sparc64/mm/init.c
@@ -1353,6 +1353,8 @@ static void __init bootmem_init_one_node(int nid)
numadbg("bootmem_init_one_node(%d)\n", nid);
+ lmb_dump_all();
+
p = NODE_DATA(nid);
if (p->node_spanned_pages) {
diff --git a/lib/lmb.c b/lib/lmb.c
index 83287d3..f294bbc 100644
--- a/lib/lmb.c
+++ b/lib/lmb.c
@@ -10,6 +10,8 @@
* 2 of the License, or (at your option) any later version.
*/
+#define DEBUG
+
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/bitops.h>
@@ -29,7 +31,7 @@ void lmb_dump_all(void)
pr_debug(" memory.size = 0x%llx\n",
(unsigned long long)lmb.memory.size);
for (i=0; i < lmb.memory.cnt ;i++) {
- pr_debug(" memory.region[0x%x].base = 0x%llx\n",
+ pr_debug(" memory.region[0x%lx].base = 0x%llx\n",
i, (unsigned long long)lmb.memory.region[i].base);
pr_debug(" .size = 0x%llx\n",
(unsigned long long)lmb.memory.region[i].size);
@@ -38,7 +40,7 @@ void lmb_dump_all(void)
pr_debug(" reserved.cnt = 0x%lx\n", lmb.reserved.cnt);
pr_debug(" reserved.size = 0x%lx\n", lmb.reserved.size);
for (i=0; i < lmb.reserved.cnt ;i++) {
- pr_debug(" reserved.region[0x%x].base = 0x%llx\n",
+ pr_debug(" reserved.region[0x%lx].base = 0x%llx\n",
i, (unsigned long long)lmb.reserved.region[i].base);
pr_debug(" .size = 0x%llx\n",
(unsigned long long)lmb.reserved.region[i].size);
--
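The thread does not spell out why the earlier patches stayed silent. One plausible reading, given that this version moves "#define DEBUG" above the #include lines, is that pr_debug() becomes a no-op when the header defining it is processed before DEBUG is set. The sketch below shows only that preprocessor ordering effect in a single standalone file. The macros dbg_before() and dbg_after() are made-up stand-ins, not the kernel's pr_debug().

```c
#include <stdio.h>

static int emitted;	/* counts messages that were actually printed */

/* Stand-in for a header processed BEFORE DEBUG is defined. */
#ifdef DEBUG
#define dbg_before(...) (emitted++, printf(__VA_ARGS__))
#else
#define dbg_before(...) ((void)0)
#endif

#define DEBUG

/* The same conditional processed AFTER DEBUG is defined. */
#ifdef DEBUG
#define dbg_after(...) (emitted++, printf(__VA_ARGS__))
#else
#define dbg_after(...) ((void)0)
#endif

/* Number of messages printed through the macro defined too early. */
int count_before(void)
{
	emitted = 0;
	dbg_before("memory.cnt = 0x%lx\n", 4UL);
	return emitted;
}

/* Number of messages printed through the macro defined after DEBUG. */
int count_after(void)
{
	emitted = 0;
	dbg_after("memory.cnt = 0x%lx\n", 4UL);
	return emitted;
}
```

When built without DEBUG predefined on the compiler command line, only the call through dbg_after() prints anything, which is consistent with the define needing to come before the includes.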
David Miller writes:
> From: David Miller <davem@davemloft.net>
> Date: Mon, 12 May 2008 15:36:28 -0700 (PDT)
>
> > Nevermind I see why the messages don't show up. Hold on for a second
> > and I'll send an updated debugging patch.
>
> Try this patch instead, thanks!
>
> diff --git a/arch/sparc64/mm/init.c b/arch/sparc64/mm/init.c
> index a9828d7..a628a99 100644
> --- a/arch/sparc64/mm/init.c
> +++ b/arch/sparc64/mm/init.c
> @@ -1353,6 +1353,8 @@ static void __init bootmem_init_one_node(int nid)
>
> numadbg("bootmem_init_one_node(%d)\n", nid);
>
> + lmb_dump_all();
> +
> p = NODE_DATA(nid);
>
> if (p->node_spanned_pages) {
> diff --git a/lib/lmb.c b/lib/lmb.c
> index 83287d3..f294bbc 100644
> --- a/lib/lmb.c
> +++ b/lib/lmb.c
> @@ -10,6 +10,8 @@
> * 2 of the License, or (at your option) any later version.
> */
>
> +#define DEBUG
> +
> #include <linux/kernel.h>
> #include <linux/init.h>
> #include <linux/bitops.h>
> @@ -29,7 +31,7 @@ void lmb_dump_all(void)
> pr_debug(" memory.size = 0x%llx\n",
> (unsigned long long)lmb.memory.size);
> for (i=0; i < lmb.memory.cnt ;i++) {
> - pr_debug(" memory.region[0x%x].base = 0x%llx\n",
> + pr_debug(" memory.region[0x%lx].base = 0x%llx\n",
> i, (unsigned long long)lmb.memory.region[i].base);
> pr_debug(" .size = 0x%llx\n",
> (unsigned long long)lmb.memory.region[i].size);
> @@ -38,7 +40,7 @@ void lmb_dump_all(void)
> pr_debug(" reserved.cnt = 0x%lx\n", lmb.reserved.cnt);
> pr_debug(" reserved.size = 0x%lx\n", lmb.reserved.size);
> for (i=0; i < lmb.reserved.cnt ;i++) {
> - pr_debug(" reserved.region[0x%x].base = 0x%llx\n",
> + pr_debug(" reserved.region[0x%lx].base = 0x%llx\n",
> i, (unsigned long long)lmb.reserved.region[i].base);
> pr_debug(" .size = 0x%llx\n",
> (unsigned long long)lmb.reserved.region[i].size);
...
From: Mikael Pettersson <mikpe@it.uu.se>
Yeah, those last two reserved regions are where your 128MB went to,
as I suspected.
Thanks, I'll try to figure out where to go from here.
--
From: Mikael Pettersson <mikpe@it.uu.se>
Date: Tue, 13 May 2008 21:31:19 +0200

From the very beginning your higher RAM is gone.
It's correct that some memory should be reserved,
for the ramdisk, but not 128MB :-)

The ramdisk is just under 4MB in size, so something is fishy here.

And indeed, I'm reserving the wrong length. Please try this
patch:

sparc64: Fix lmb_reserve() args in find_ramdisk().

This fixes the missing ram regression reported by
Mikael Pettersson <mikpe@it.uu.se>, much thanks for
all of this help in diagnosing this.

The second argument to lmb_reserve() is a size,
not an end address bounds.

Signed-off-by: David S. Miller <davem@davemloft.net>
---
 arch/sparc64/mm/init.c | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/sparc64/mm/init.c b/arch/sparc64/mm/init.c
index a9828d7..3c7b947 100644
--- a/arch/sparc64/mm/init.c
+++ b/arch/sparc64/mm/init.c
@@ -768,7 +768,7 @@ static void __init find_ramdisk(unsigned long phys_base)
 initrd_start = ramdisk_image;
 initrd_end = ramdisk_image + sparc_ramdisk_size;

- lmb_reserve(initrd_start, initrd_end);
+ lmb_reserve(initrd_start, sparc_ramdisk_size);

 initrd_start += PAGE_OFFSET;
 initrd_end += PAGE_OFFSET;
--
1.5.5.1.57.g5909c
--
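To make the effect of the wrong argument concrete, here is a small standalone sketch using the ramdisk address and size reported in the lmb_dump_all output from this thread (0x10800000 and 3683665 bytes). The struct region type and the reserve() helper are hypothetical stand-ins that only mirror the (base, size) calling convention described in the commit message. They are not the lmb implementation.

```c
#include <assert.h>

/* Hypothetical stand-in for a reserved region: base address plus size. */
struct region {
	unsigned long long base;
	unsigned long long size;
};

/* Mirrors the (base, size) convention described for lmb_reserve(). */
struct region reserve(unsigned long long base, unsigned long long size)
{
	struct region r = { base, size };
	return r;
}

/* First address past the end of a region. */
unsigned long long region_end(struct region r)
{
	assert(r.base + r.size >= r.base);	/* no wrap-around */
	return r.base + r.size;
}

/* Values from the log: ramdisk at 0x10800000, size 3683665 bytes. */
#define RAMDISK_BASE 0x10800000ULL
#define RAMDISK_SIZE 3683665ULL

/* Buggy call: the end address is passed where a size is expected. */
struct region reserve_buggy(void)
{
	return reserve(RAMDISK_BASE, RAMDISK_BASE + RAMDISK_SIZE);
}

/* Fixed call: pass the ramdisk size itself. */
struct region reserve_fixed(void)
{
	return reserve(RAMDISK_BASE, RAMDISK_SIZE);
}
```

In this sketch the buggy call yields a region of size 0x10b83551, the same value the log shows for reserved.region[0x1], and its end lies beyond the logged top of RAM at 0x17f46000. The fixed call reserves just under 4MB.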
David Miller writes:
> From: Mikael Pettersson <mikpe@it.uu.se>
> Date: Tue, 13 May 2008 21:31:19 +0200
>
> Ok, Mikael, I think I figured out this bug:
>
> > Found ramdisk at physical address 0x10800000, size 3683665
> > lmb_dump_all:
> >   memory.cnt = 0x4
> >   memory.size = 0xff40000
> >   memory.region[0x0].base = 0x0
> >     .size = 0x8000000
> >   memory.region[0x1].base = 0x10000000
> >     .size = 0x7efe000
> >   memory.region[0x2].base = 0x17f00000
> >     .size = 0x3a000
> >   memory.region[0x3].base = 0x17f3e000
> >     .size = 0x8000
> >   reserved.cnt = 0x2
> >   reserved.size = 0x0
> >   reserved.region[0x0].base = 0x10000000
> >     .size = 0x35bf60
> >   reserved.region[0x1].base = 0x10800000
> >     .size = 0x10b83551
>
> From the very beginning your higher RAM is gone.
> It's correct that some memory should be reserved,
> for the ramdisk, but not 128MB :-)
>
> The ramdisk is just under 4MB in size, so something is fishy here.
>
> And indeed, I'm reserving the wrong length. Please try this
> patch:
>
> sparc64: Fix lmb_reserve() args in find_ramdisk().
>
> This fixes the missing ram regression reported by
> Mikael Pettersson <mikpe@it.uu.se>, much thanks for
> all of this help in diagnosing this.
>
> The second argument to lmb_reserve() is a size,
> not an end address bounds.
>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> ---
> arch/sparc64/mm/init.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/arch/sparc64/mm/init.c b/arch/sparc64/mm/init.c
> index a9828d7..3c7b947 100644
> --- a/arch/sparc64/mm/init.c
> +++ b/arch/sparc64/mm/init.c
> @@ -768,7 +768,7 @@ static void __init find_ramdisk(unsigned long phys_base)
> initrd_start = ramdisk_image;
> initrd_end = ramdisk_image + sparc_ramdisk_size;
>
> ...
Mikael Pettersson writes:
> Thanks Dave. As the dmesg diff below shows, this patch fixed
> my lost RAM issue. I actually have about 8KB more available

Doh! s/KB/MB/ of course.

> now than I had with 2.6.25.
--
From: Mikael Pettersson <mikpe@it.uu.se>
Thanks a lot for all of your help.
--
