Re: [PATCH 0/1] x86: fix remove cpu_pda table patch

From: Mike Travis
Date: Monday, April 28, 2008 - 2:09 pm

[Ingo - please replace "PATCH 07/11" with this one.]

  * Remove 544k bytes from the kernel by removing the boot_cpu_pda
    array from the data section and allocating it during startup.

  * Fixed panic in setup_per_cpu_areas when HOTPLUG_CPU is not set.

For inclusion into sched-devel/latest tree.

Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+   sched-devel/latest  .../mingo/linux-2.6-sched-devel.git


Signed-off-by: Mike Travis <travis@sgi.com>
---

--

From: Ingo Molnar
Date: Monday, April 28, 2008 - 2:45 pm

sched-devel.git randconfig testing found another crash with your queue:

[    0.111060] Brought up 1 CPUs
[    0.111986] Total of 1 processors activated (4022.73 BogoMIPS).
[    0.112987] Testing NMI watchdog ... <1>BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
[    0.114982] IP: [<ffffffff8180d4a0>] check_nmi_watchdog+0xb0/0x210
[    0.114982] PGD 0
[    0.114982] Oops: 0000 [1] SMP
[    0.114982] CPU 0
[............]

 http://redhat.com/~mingo/misc/config-Mon_Apr_28_23_25_25_CEST_2008.bad
 http://redhat.com/~mingo/misc/log-Mon_Apr_28_23_25_25_CEST_2008.bad

	Ingo
--

From: Mike Travis
Date: Tuesday, April 29, 2008 - 9:07 am

Ouch! ;-) Ok, I'll check it out.

Thanks,
Mike
--

From: Mike Travis
Date: Tuesday, April 29, 2008 - 10:38 am

Hi Ingo,

I need a bit more information on your hardware configuration.  Building a
kernel with the above config file started up fine on both the Intel and AMD
boxes.

Based on the above output it looks like it might be a UP machine?  

Thanks,
Mike 
--

From: Mike Travis
Date: Tuesday, April 29, 2008 - 11:53 am

...

Ok, I think I found it.  check_nmi_watchdog() still walks all NR_CPUS
entries:

        for (cpu = 0; cpu < NR_CPUS; cpu++)
                prev_nmi_count[cpu] = cpu_pda(cpu)->__nmi_count;

As I mentioned, it works fine on both of my systems, so could you try out
the patch below?

Thanks!
Mike
--
Subject: [PATCH 1/1] x86: change check_nmi_watchdog to use nr_cpu_ids

  * Change function check_nmi_watchdog() to use nr_cpu_ids instead of NR_CPUS.

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
    +   sched-devel/latest  .../mingo/linux-2.6-sched-devel.git


Signed-off-by: Mike Travis <travis@sgi.com>
---
 arch/x86/kernel/nmi_64.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- linux-2.6.sched.orig/arch/x86/kernel/nmi_64.c
+++ linux-2.6.sched/arch/x86/kernel/nmi_64.c
@@ -88,7 +88,7 @@ int __init check_nmi_watchdog(void)
 	if (!atomic_read(&nmi_active))
 		return 0;
 
-	prev_nmi_count = kmalloc(NR_CPUS * sizeof(int), GFP_KERNEL);
+	prev_nmi_count = kmalloc(nr_cpu_ids * sizeof(int), GFP_KERNEL);
 	if (!prev_nmi_count)
 		return -1;
 
@@ -99,7 +99,7 @@ int __init check_nmi_watchdog(void)
 		smp_call_function(nmi_cpu_busy, (void *)&endflag, 0, 0);
 #endif
 
-	for (cpu = 0; cpu < NR_CPUS; cpu++)
+	for (cpu = 0; cpu < nr_cpu_ids; cpu++)
 		prev_nmi_count[cpu] = cpu_pda(cpu)->__nmi_count;
 	local_irq_enable();
 	mdelay((20*1000)/nmi_hz); // wait 20 ticks
--
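
The fix above rests on the difference between NR_CPUS, the compile-time
maximum, and nr_cpu_ids, the run-time bound on possible CPU ids.  A minimal
sketch in plain user-space C, assuming that only the first nr_cpu_ids pda
pointers are valid once the static array is gone.  The names pda_table,
setup_pdas and snapshot_nmi_counts are hypothetical stand-ins, not kernel
code, and the value of NR_CPUS is arbitrary:

```c
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS 8	/* compile-time maximum (arbitrary value for the sketch) */

struct pda {
	int nmi_count;
};

/* Hypothetical stand-ins: only the first nr_cpu_ids slots get populated. */
static struct pda *pda_table[NR_CPUS];
static unsigned int nr_cpu_ids;

/* Allocate a pda for each possible cpu; the remaining slots stay NULL. */
static int setup_pdas(unsigned int possible)
{
	unsigned int cpu;

	if (possible > NR_CPUS)
		return -1;
	for (cpu = 0; cpu < possible; cpu++) {
		pda_table[cpu] = calloc(1, sizeof(struct pda));
		if (!pda_table[cpu])
			return -1;
		pda_table[cpu]->nmi_count = (int)cpu + 1;	/* dummy counter */
	}
	nr_cpu_ids = possible;
	return 0;
}

/* Snapshot the counters, bounded by nr_cpu_ids rather than NR_CPUS. */
static int *snapshot_nmi_counts(void)
{
	unsigned int cpu;
	int *prev = malloc(nr_cpu_ids * sizeof(int));

	if (!prev)
		return NULL;
	for (cpu = 0; cpu < nr_cpu_ids; cpu++)
		prev[cpu] = pda_table[cpu]->nmi_count;
	return prev;
}
```

After setup_pdas(2), snapshot_nmi_counts() reads only slots 0 and 1.  A loop
bounded by NR_CPUS would go on to dereference the NULL pointer in slot 2,
which is the kind of failure the oops above suggests.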

From: Ingo Molnar
Date: Tuesday, April 29, 2008 - 2:44 pm

yeah, that makes sense ... i'll reinstate your patches and check.

	Ingo
--

From: Ingo Molnar
Date: Tuesday, April 29, 2008 - 11:04 pm

they crashed after about 3 randconfig iterations with:

  early res: 4 [8000-afff] PGTABLE
  early res: 5 [b000-b87f] MEMNODEMAP
PANIC: early exception 0e rip 10:ffffffff8077a150 error 2 cr2 37
Pid: 0, comm: swapper Not tainted 2.6.25-sched-devel.git-x86-latest.git #14

Call Trace:
 [<ffffffff81466196>] early_idt_handler+0x56/0x6a
 [<ffffffff8077a150>] ? numa_set_node+0x30/0x60
 [<ffffffff8077a129>] ? numa_set_node+0x9/0x60
 [<ffffffff8147a543>] numa_init_array+0x93/0xf0
 [<ffffffff8147b039>] acpi_scan_nodes+0x3b9/0x3f0
 [<ffffffff8147a496>] numa_initmem_init+0x136/0x150
 [<ffffffff8146da5f>] setup_arch+0x48f/0x700
 [<ffffffff802566ea>] ? clockevents_register_notifier+0x3a/0x50
 [<ffffffff81466a87>] start_kernel+0xd7/0x440
 [<ffffffff81466422>] x86_64_start_kernel+0x222/0x280

RIP 0x10


 http://redhat.com/~mingo/misc/log-Wed_Apr_30_00_28_09_CEST_2008.bad
 http://redhat.com/~mingo/misc/config-Wed_Apr_30_00_28_09_CEST_2008.bad

	Ingo
--

From: Mike Travis
Date: Wednesday, April 30, 2008 - 7:15 am

Thanks, I'll check it out asap.

I might have to change the approach of the "remove pdas" patch a bit to
make it more "fail safe".

Thanks,
Mike
--

From: Mike Travis
Date: Wednesday, April 30, 2008 - 8:02 am

Ingo Molnar wrote:
...
Here's the fixup...  This one should follow the previous patches.

Thanks,
Mike
---
Subject: [PATCH 1/1] x86: leave initial __cpu_pda array in place until cpus are booted

  * The __cpu_pda table must be a set of NULL entries (except for the boot cpu)
    to indicate to numa_set_node during early system initialization that the
    cpu pdas are not yet set up.  The pda nodenumber is set later in pda_init().

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
    +   sched-devel/latest  .../mingo/linux-2.6-sched-devel.git


Signed-off-by: Mike Travis <travis@sgi.com>
---
 arch/x86/kernel/head64.c |   10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

--- linux-2.6.sched.orig/arch/x86/kernel/head64.c
+++ linux-2.6.sched/arch/x86/kernel/head64.c
@@ -29,17 +29,13 @@
 static struct x8664_pda _boot_cpu_pda __read_mostly;
 
 #ifdef CONFIG_SMP
-#ifdef CONFIG_DEBUG_PER_CPU_MAPS
 /*
- * We install an empty cpu_pda pointer table to trap references before
- * the actual cpu_pda pointer table is created in setup_cpu_pda_map().
+ * We install an empty cpu_pda pointer table to indicate to early users
+ * (numa_set_node) that the cpu_pda pointer table for cpus other than
+ * the boot cpu is not yet setup.
  */
 static struct x8664_pda *__cpu_pda[NR_CPUS] __initdata;
 #else
-static struct x8664_pda *__cpu_pda[1] __read_mostly;
-#endif
-
-#else /* !CONFIG_SMP (NR_CPUS will be 1) */
 static struct x8664_pda *__cpu_pda[NR_CPUS] __read_mostly;
 #endif
 
--
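
The convention this patch relies on, NULL table entries meaning "pda not yet
set up", can be sketched in user-space C.  This is an illustration under
stated assumptions, not the kernel's numa_set_node(): set_node, pda_table and
early_node_map are hypothetical names, and the fallback map is an assumed
place to keep the node number until the pda exists.

```c
#include <stddef.h>

#define NR_CPUS 4	/* arbitrary value for the sketch */

struct pda {
	int nodenumber;
};

static struct pda boot_pda;

/* Only the boot cpu has a pda early on; every other entry is NULL. */
static struct pda *pda_table[NR_CPUS] = { &boot_pda };

/* Assumed fallback: remember the node until the pda is allocated. */
static int early_node_map[NR_CPUS];

/*
 * Record the node for a cpu.  Returns 1 if the pda was updated, 0 if the
 * pda is not set up yet (NULL entry), and -1 for an out-of-range cpu.
 */
static int set_node(unsigned int cpu, int node)
{
	if (cpu >= NR_CPUS)
		return -1;

	early_node_map[cpu] = node;

	if (pda_table[cpu] == NULL)
		return 0;	/* too early: leave the pda alone */

	pda_table[cpu]->nodenumber = node;
	return 1;
}
```

With this layout, set_node(0, 1) updates the boot cpu's pda, while
set_node(2, 3) only records the node in the fallback map.  If the table held
stale or uninitialized pointers instead of NULL, the early caller could not
tell the two cases apart.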
