[Ingo - please replace "PATCH 07/11" with this one.]
* Remove 544k bytes from the kernel by removing the boot_cpu_pda
array from the data section and allocating it during startup.
Fixed panic in setup_per_cpu_areas when HOTPLUG_CPU not set.
For inclusion into sched-devel/latest tree.
Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+ sched-devel/latest .../mingo/linux-2.6-sched-devel.git
Signed-off-by: Mike Travis <travis@sgi.com>
---
--
sched-devel.git randconfig testing found another crash with your queue:

[ 0.111060] Brought up 1 CPUs
[ 0.111986] Total of 1 processors activated (4022.73 BogoMIPS).
[ 0.112987] Testing NMI watchdog ... <1>BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
[ 0.114982] IP: [<ffffffff8180d4a0>] check_nmi_watchdog+0xb0/0x210
[ 0.114982] PGD 0
[ 0.114982] Oops: 0000 [1] SMP
[ 0.114982] CPU 0
[............]

http://redhat.com/~mingo/misc/config-Mon_Apr_28_23_25_25_CEST_2008.bad
http://redhat.com/~mingo/misc/log-Mon_Apr_28_23_25_25_CEST_2008.bad

Ingo
--
Ouch! ;-) Ok, I'll check it out.

Thanks,
Mike
--
Hi Ingo,

I need a bit more information on your hardware configuration. Building a kernel with the above config file started up fine on both the Intel and AMD boxes. Based on the above output it looks like it might be a UP machine?

Thanks,
Mike
--
...
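Ok, I think I found it. In check_nmi_watchdog():

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		prev_nmi_count[cpu] = cpu_pda(cpu)->__nmi_count;

The loop walks all NR_CPUS slots, but with the pda array now allocated
during startup, cpu_pda(cpu) is presumably NULL for cpus at or beyond
nr_cpu_ids, which would match the NULL pointer dereference in the oops.

As I mentioned, the kernel boots fine on both of my systems, so could you
try out the patch below?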
Thanks!
Mike
--
Subject: [PATCH 1/1] x86: change check_nmi_watchdog to use nr_cpu_ids
* Change function check_nmi_watchdog() to use nr_cpu_ids instead of NR_CPUS.
Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+ sched-devel/latest .../mingo/linux-2.6-sched-devel.git
Signed-off-by: Mike Travis <travis@sgi.com>
---
arch/x86/kernel/nmi_64.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- linux-2.6.sched.orig/arch/x86/kernel/nmi_64.c
+++ linux-2.6.sched/arch/x86/kernel/nmi_64.c
@@ -88,7 +88,7 @@ int __init check_nmi_watchdog(void)
 	if (!atomic_read(&nmi_active))
 		return 0;
 
-	prev_nmi_count = kmalloc(NR_CPUS * sizeof(int), GFP_KERNEL);
+	prev_nmi_count = kmalloc(nr_cpu_ids * sizeof(int), GFP_KERNEL);
 	if (!prev_nmi_count)
 		return -1;
 
@@ -99,7 +99,7 @@ int __init check_nmi_watchdog(void)
 	smp_call_function(nmi_cpu_busy, (void *)&endflag, 0, 0);
 #endif
 
-	for (cpu = 0; cpu < NR_CPUS; cpu++)
+	for (cpu = 0; cpu < nr_cpu_ids; cpu++)
 		prev_nmi_count[cpu] = cpu_pda(cpu)->__nmi_count;
 	local_irq_enable();
 	mdelay((20*1000)/nmi_hz); // wait 20 ticks
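
For anyone following along, the failure mode can be modelled in userspace. This is only a sketch under assumptions: struct pda, pda_table, setup_pdas() and snapshot_counts() are hypothetical stand-ins for struct x8664_pda, the cpu_pda pointer table and the loop in check_nmi_watchdog(), and the NR_CPUS value of 8 is arbitrary. It assumes that table slots at or beyond nr_cpu_ids are never populated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS 8			/* compile-time maximum (arbitrary here) */

struct pda { int nmi_count; };		/* hypothetical stand-in for struct x8664_pda */

static struct pda *pda_table[NR_CPUS];	/* slots stay NULL until set up */
static int nr_cpu_ids = NR_CPUS;	/* lowered once the possible cpus are known */

/* Allocate pdas only for cpus that can exist. Returns 0 on success. */
static int setup_pdas(int possible)
{
	assert(possible > 0 && possible <= NR_CPUS);
	for (int cpu = 0; cpu < possible; cpu++) {
		pda_table[cpu] = calloc(1, sizeof(struct pda));
		if (!pda_table[cpu])
			return -1;
	}
	nr_cpu_ids = possible;
	return 0;
}

/* Snapshot the per-cpu counters; returns the number of entries read. */
static int snapshot_counts(int *out)
{
	int cpu;

	/* Bounding this loop by NR_CPUS would dereference the NULL slots. */
	for (cpu = 0; cpu < nr_cpu_ids; cpu++)
		out[cpu] = pda_table[cpu]->nmi_count;
	return cpu;
}
```

After setup_pdas(2), snapshot_counts() touches only slots 0 and 1; the remaining six slots are still NULL and are never read.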
--
yeah, that makes sense ... i'll reinstate your patches and check.

Ingo
--
they crashed after about 3 randconfig iterations with:

early res: 4 [8000-afff] PGTABLE
early res: 5 [b000-b87f] MEMNODEMAP
PANIC: early exception 0e rip 10:ffffffff8077a150 error 2 cr2 37
Pid: 0, comm: swapper Not tainted 2.6.25-sched-devel.git-x86-latest.git #14

Call Trace:
 [<ffffffff81466196>] early_idt_handler+0x56/0x6a
 [<ffffffff8077a150>] ? numa_set_node+0x30/0x60
 [<ffffffff8077a129>] ? numa_set_node+0x9/0x60
 [<ffffffff8147a543>] numa_init_array+0x93/0xf0
 [<ffffffff8147b039>] acpi_scan_nodes+0x3b9/0x3f0
 [<ffffffff8147a496>] numa_initmem_init+0x136/0x150
 [<ffffffff8146da5f>] setup_arch+0x48f/0x700
 [<ffffffff802566ea>] ? clockevents_register_notifier+0x3a/0x50
 [<ffffffff81466a87>] start_kernel+0xd7/0x440
 [<ffffffff81466422>] x86_64_start_kernel+0x222/0x280
RIP 0x10

http://redhat.com/~mingo/misc/log-Wed_Apr_30_00_28_09_CEST_2008.bad
http://redhat.com/~mingo/misc/config-Wed_Apr_30_00_28_09_CEST_2008.bad

Ingo
--
Thanks, I'll check it out asap. I might have to change the approach of the "remove pdas" patch a bit to make it more "fail safe".

Thanks,
Mike
--
Ingo Molnar wrote:
...
Here's the fixup... This one should follow the previous patches.
Thanks,
Mike
---
Subject: [PATCH 1/1] x86: leave initial __cpu_pda array in place until cpus are booted
* The __cpu_pda table must be a set of NULL entries (except for the boot cpu)
to indicate to numa_set_node() during early system initialization that the
cpu pdas are not yet set up. The pda nodenumber is set later in pda_init().
Based on:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+ sched-devel/latest .../mingo/linux-2.6-sched-devel.git
Signed-off-by: Mike Travis <travis@sgi.com>
---
arch/x86/kernel/head64.c | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)
--- linux-2.6.sched.orig/arch/x86/kernel/head64.c
+++ linux-2.6.sched/arch/x86/kernel/head64.c
@@ -29,17 +29,13 @@
 static struct x8664_pda _boot_cpu_pda __read_mostly;
 #ifdef CONFIG_SMP
-#ifdef CONFIG_DEBUG_PER_CPU_MAPS
 /*
- * We install an empty cpu_pda pointer table to trap references before
- * the actual cpu_pda pointer table is created in setup_cpu_pda_map().
+ * We install an empty cpu_pda pointer table to indicate to early users
+ * (numa_set_node) that the cpu_pda pointer table for cpus other than
+ * the boot cpu is not yet setup.
  */
 static struct x8664_pda *__cpu_pda[NR_CPUS] __initdata;
 #else
-static struct x8664_pda *__cpu_pda[1] __read_mostly;
-#endif
-
-#else /* !CONFIG_SMP (NR_CPUS will be 1) */
 static struct x8664_pda *__cpu_pda[NR_CPUS] __read_mostly;
 #endif
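
The convention this fixup relies on (a NULL table entry means "pda not set up yet") can be sketched in userspace as follows. This is a simplified model, not the kernel code: struct pda, pda_table, early_node_map, set_node() and pda_install() are hypothetical names standing in for struct x8664_pda, __cpu_pda, the early cpu-to-node map, numa_set_node() and pda_init(), and it assumes the early caller records the node in a side table when no pda exists yet.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS    4		/* arbitrary for the sketch */
#define NODE_UNSET (-1)

struct pda { int nodenumber; };	/* hypothetical stand-in for struct x8664_pda */

static struct pda boot_pda = { NODE_UNSET };

/* Only the boot cpu has a pda early on; all other slots are NULL. */
static struct pda *pda_table[NR_CPUS] = { &boot_pda };

/* Side table that is always safe to write during early init. */
static int early_node_map[NR_CPUS] = {
	NODE_UNSET, NODE_UNSET, NODE_UNSET, NODE_UNSET
};

/* Early user: touch the pda only if the slot has been populated. */
static void set_node(int cpu, int node)
{
	early_node_map[cpu] = node;
	if (pda_table[cpu])		/* NULL means "not set up yet" */
		pda_table[cpu]->nodenumber = node;
}

/* Later, when a cpu's pda is created, pick up the recorded node. */
static void pda_install(int cpu, struct pda *pda)
{
	pda->nodenumber = early_node_map[cpu];
	pda_table[cpu] = pda;
}
```

If the table held non-NULL garbage for the secondary cpus instead, set_node() would write through an invalid pointer, which is roughly the kind of early fault the panic above shows.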
--
