Hi,
linux-next fails to boot on OLPC's XO-1 laptop based on Geode LX processor.
This is because of
commit f49aa448561fe9215f43405cac6f31eb86317792
Author: Jason Baron <jbaron@redhat.com>
Date: Fri Sep 17 11:08:51 2010 -0400
jump label: Make dynamic no-op selection available outside of ftrace
http://git.kernel.org/?p=linux/kernel/git/mingo/linux-2.6-x86.git;a=commitdiff;h=f49aa44
The Geode hangs inside the asm() statement inside arch_init_ideal_nop5.
Any thoughts?
I'm a bit out of my depth debugging assembly code, but will happily
take pointers and patches to help diagnose.
cheers
Daniel
--
From: Daniel Drake <dsd@laptop.org> maybe because the Geode doesn't have both the P6_NOP5 and the NOP with 4 0x66 prefixes: http://kerneltrap.org/mailarchive/linux-kernel/2010/8/27/4612336 and for some reason the trap can't find the fixup address. You say "hangs" so you don't even get an "invalid opcode" OOPS? Maybe on a Geode jmp label should use the 2-byte jmp + 3 1-byte nops unconditionally after adding a synthetic CPUID flag in init_amd_k6(), in the Geode part. -- Regards/Gruss, Boris. --
From: Borislav Petkov <bp@alien8.de>
Maybe something like the following? (only tested in kvm)
--
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 220e2ea..e2ef56c 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -89,7 +89,7 @@
#define X86_FEATURE_LFENCE_RDTSC (3*32+18) /* "" Lfence synchronizes RDTSC */
#define X86_FEATURE_11AP (3*32+19) /* "" Bad local APIC aka 11AP */
#define X86_FEATURE_NOPL (3*32+20) /* The NOPL (0F 1F) instructions */
- /* 21 available, was AMD_C1E */
+#define X86_FEATURE_NO_NOPL (3*32+21) /* "" Missing NOPL (e.g. Geode LX) */
#define X86_FEATURE_XTOPOLOGY (3*32+22) /* cpu topology enum extensions */
#define X86_FEATURE_TSC_RELIABLE (3*32+23) /* TSC is known to be reliable */
#define X86_FEATURE_NONSTOP_TSC (3*32+24) /* TSC does not stop in C states */
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index cb0e6d3..3e1fa13 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -666,6 +666,9 @@ void __init arch_init_ideal_nop5(void)
*
* TODO: check the cpuid to determine the best nop.
*/
+ if (boot_cpu_has(X86_FEATURE_NO_NOPL))
+ goto early_done;
+
asm volatile (
"ftrace_test_jmp:"
"jmp ftrace_test_p6nop\n"
@@ -688,6 +691,7 @@ void __init arch_init_ideal_nop5(void)
_ASM_EXTABLE(ftrace_test_nop5, 3b)
: "=r"(faulted) : "0" (faulted));
+early_done:
switch (faulted) {
case 0:
pr_info("converting mcount calls to 0f 1f 44 00 00\n");
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 7b875fd..d1e5cf4 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -139,7 +139,7 @@ static void __cpuinit init_amd_k6(struct cpuinfo_x86 *c)
if (c->x86_model == 10) {
/* AMD Geode LX is model 10 */
- /* placeholder for any needed mods */
+ set_cpu_cap(c, X86_FEATURE_NO_NOPL);
return;
}
}
--
Regards/Gruss,
Boris.
--
From: Borislav Petkov <bp@alien8.de>
Gaah, forgot the actual fallback to jmp + nop, test this one instead:
--
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 220e2ea..e2ef56c 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -89,7 +89,7 @@
#define X86_FEATURE_LFENCE_RDTSC (3*32+18) /* "" Lfence synchronizes RDTSC */
#define X86_FEATURE_11AP (3*32+19) /* "" Bad local APIC aka 11AP */
#define X86_FEATURE_NOPL (3*32+20) /* The NOPL (0F 1F) instructions */
- /* 21 available, was AMD_C1E */
+#define X86_FEATURE_NO_NOPL (3*32+21) /* "" Missing NOPL (e.g. Geode LX) */
#define X86_FEATURE_XTOPOLOGY (3*32+22) /* cpu topology enum extensions */
#define X86_FEATURE_TSC_RELIABLE (3*32+23) /* TSC is known to be reliable */
#define X86_FEATURE_NONSTOP_TSC (3*32+24) /* TSC does not stop in C states */
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index cb0e6d3..d5fad2b 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -666,6 +666,11 @@ void __init arch_init_ideal_nop5(void)
*
* TODO: check the cpuid to determine the best nop.
*/
+ if (boot_cpu_has(X86_FEATURE_NO_NOPL)) {
+ faulted = 2;
+ goto early_done;
+ }
+
asm volatile (
"ftrace_test_jmp:"
"jmp ftrace_test_p6nop\n"
@@ -688,6 +693,7 @@ void __init arch_init_ideal_nop5(void)
_ASM_EXTABLE(ftrace_test_nop5, 3b)
: "=r"(faulted) : "0" (faulted));
+early_done:
switch (faulted) {
case 0:
pr_info("converting mcount calls to 0f 1f 44 00 00\n");
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 7b875fd..d1e5cf4 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -139,7 +139,7 @@ static void __cpuinit init_amd_k6(struct cpuinfo_x86 *c)
if (c->x86_model == 10) {
/* AMD Geode LX is model 10 */
- /* placeholder for any needed mods */
+ set_cpu_cap(c, X86_FEATURE_NO_NOPL);
return;
}
...The XO doesn't have standard VGA, so it is difficult to debug such
early crashes. This is crashing so early that kernel messages don't
even start to get sent over serial.
To debug these things, I checkpoint the code with various calls which
send individual characters over serial:
static void log_serial(char c)
{
while ((inb(0x3fd) & 0x20) == 0) ;
outb(c, 0x3f8);
while ((inb(0x3fd) & 0x40) == 0) ;
}
So, I'm not really sure if/how it crashed or oops'd. However, I can
confirm that panic() does not get reached, since I put a character log
in there and it doesn't get sent. Let me know if you want me to put
character logging in other places.
I applied your two patches by hand and it doesn't solve the issue,
because the init_amd_k6() code is called long after
arch_init_ideal_nop5()
Daniel
--
From: Daniel Drake <dsd@laptop.org> Nice! Debugging a machine like that looks like a lot of jumping through hoops. But let me ask you this: did you bisect linux-next to come up Dang, we'll have to push the CPUID feature check into the early-routine which runs before than arch_init_ideal_nop5() in setup_arch(), updated patch is below. Just to make sure we're matching the right model, does /proc/cpuinfo say family 5, model 10 on the machine? Thanks. -- From: Borislav Petkov <bp@alien8.de> Date: Sun, 3 Oct 2010 11:11:54 +0200 Subject: [PATCH] jmp label: Fix boot hang on Geode LX Since Geode LX doesn't implement the NOPL versions of NOP, fallback to using JMP as a NOP on it. Signed-off-by: Borislav Petkov <bp@alien8.de> --- arch/x86/include/asm/cpufeature.h | 2 +- arch/x86/kernel/alternative.c | 6 ++++++ arch/x86/kernel/cpu/amd.c | 12 ++++++------ 3 files changed, 13 insertions(+), 7 deletions(-) diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h index 220e2ea..e2ef56c 100644 --- a/arch/x86/include/asm/cpufeature.h +++ b/arch/x86/include/asm/cpufeature.h @@ -89,7 +89,7 @@ #define X86_FEATURE_LFENCE_RDTSC (3*32+18) /* "" Lfence synchronizes RDTSC */ #define X86_FEATURE_11AP (3*32+19) /* "" Bad local APIC aka 11AP */ #define X86_FEATURE_NOPL (3*32+20) /* The NOPL (0F 1F) instructions */ - /* 21 available, was AMD_C1E */ +#define X86_FEATURE_NO_NOPL (3*32+21) /* "" Missing NOPL (e.g. Geode LX) */ #define X86_FEATURE_XTOPOLOGY (3*32+22) /* cpu topology enum extensions */ #define X86_FEATURE_TSC_RELIABLE (3*32+23) /* TSC is known to be reliable */ #define X86_FEATURE_NONSTOP_TSC (3*32+24) /* TSC does not stop in C states */ diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index cb0e6d3..d5fad2b 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -666,6 +666,11 @@ void __init arch_init_ideal_nop5(void) * * TODO: check the cpuid to ...
Did you try earlyprintk? That is, on the kernel command line add: earlyprintk=ttyS0,115200 or whatever the tty and baud rate is. This will start printing out the serial right at kernel startup. Even before printk is configured. --
Hi Daniel,
looks like I moved dynamic no-op selection way too early in the boot -
before the exception tables required to select invalid opcodes were
even set up. The patch belows moves them later and should resolve this
issue for you. thanks for narrowing it down!
thanks,
-Jason
move arch_init_ideal_nop5 later
arch_init_ideal_nop5() was being called from setup_arch() before
the exception table was setup. Move it later into
alternative_instructions().
Fixes a boot hang on OLPC's XO-1 laptop based on Geode LX
processor.
Reported-by: Daniel Drake <dsd@laptop.org>
Signed-off-by: Jason Baron <jbaron@redhat.com>
---
arch/x86/kernel/alternative.c | 5 +++++
arch/x86/kernel/setup.c | 6 ------
2 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index cb0e6d3..d8b5b21 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -454,6 +454,7 @@ extern struct paravirt_patch_site __start_parainstructions[],
void __init alternative_instructions(void)
{
+ unsigned long flags;
/* The patching is not fully atomic, so try to avoid local interruptions
that might execute the to be patched code.
Other CPUs are not running. */
@@ -508,6 +509,10 @@ void __init alternative_instructions(void)
(unsigned long)__smp_locks_end);
restart_nmi();
+
+ local_irq_save(flags);
+ arch_init_ideal_nop5();
+ local_irq_restore(flags);
}
/**
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 468ccd2..85b02b6 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -112,7 +112,6 @@
#include <asm/numa_64.h>
#endif
#include <asm/mce.h>
-#include <asm/alternative.h>
/*
* end_pfn only includes RAM, while max_pfn_mapped includes all e820 entries.
@@ -690,7 +689,6 @@ void __init setup_arch(char **cmdline_p)
{
int acpi = 0;
int k8 = 0;
- unsigned long flags;
#ifdef CONFIG_X86_32
...Applied this without the earlier patches from Borislav. Thanks, that fixes it. Daniel --
BTW, why the irq disabling around the call? -- Steve --
Only b/c the selection of the nop was originally done by 'ftrace_dyn_arch_init()' which had irq's disabled around it. So maybe that can be dropped now? thanks, -Jason --
Yeah, that disable is there for legacy reasons. It can be scrapped. I'll remove it. Thanks! -- Steve --
This code is fundamentally toxic and needs to be scrapped completely -- it is simply broken beyond repair. We tried exactly this type of dynamic selection before, and it doesn't work on broken virtualizers; in particular Microsoft VirtualPC can pass the exception test and yet fail later. The end result is very simple: you can always use NOPL on 64 bits, you can never use NOPL on 32 bits. 66 66 66 66 90 will always *work* (as in, it will never fail) but it's pretty slow on older CPUs which took a hit on handle prefixes -- but it might still be faster than a jump on those. Thus, in your code the JMP case will never be reached anyway. There isn't, of course, a classic 5-byte sequence, although the sequence: 2E 8D 75 26 00 ... should work (leal %ds:0(,%esi,1),%esi). However, 66 ... 90 is likely to work better on modern processors (although I haven't measured it.) -hpa --
The jmp was there because of paranoia, and I never expected it to be The point is, this nop will be at _every_ function call (it replaces the mcount call). Not just scattered throughout the kernel. It is imperative that we have the best nop available. So what would you recommend? -- Steve --
Yup. Fun, isn't it? :( Unfortunately, broken virtualizers appear as broken CPUs to us. We used to do the #UD probe for NOPL, but it didn't NOPL is special, because it's the only NOP sequence that isn't actually *supported* on all processors (and we have found that we can't even use it on 32 bits, even though the vast majority of all real-life 32-bit processors do support it.) Borislav is just checking to see if we can just use NOPL unconditionally on 64 bits; as far as 32 bits is concerned the only option for picking what is "best" is probably to benchmark some set of sequences on the set of processors we care about. However, I suspect that on any modern processors either 66 66 66 66 90 or 2E 8D 75 26 00 will work equally well. With a bit of benchmarking I think we could adopt the policy of using NOPL on 64 bits and one of the above sequences on 32 bits. -hpa --
As far as I'm aware, the check is just broken in VMs that trace their lineage to Connectix's VirtualPC that Microsoft acquired back in 2006. Is this correct? On balance, is this really to be cared about? As far as 'common' processors that don't play nicely with NOPL... VIA Edens based on the 'Samuel 2' design do not support CMOV or NOPL. All VIA Edens based on the 'Nehemiah' design support CMOV but not NOPL. (Introduced in 2003.) Via C3s based on the 'Samuel 2'or 'Ezra'/'Ezra-T' design do not support CMOV or NOPL. All C3s based on the 'Nehemiah' design support CMOV but not NOPL. (Introduced in 2003) National Semi's GXm, GXLV and GX1 do not support CMOV or NOPL. All Geodes since and including National Semi's GX2 support CMOV but not NOPL. (Introduced in 2002) The AMD branded Geodes (GX and LX) support CMOV but not NOPL. The Cyrix 6x86 processors do not support CMOV or NOPL. The Cyrix 8x86MX / Cyrix MII do support CMOV but not NOPL. The AMD K6 and K6-2 do not support CMOV or NOPL. There are probably a few more I've forgotten about... Nick --
If it was a huge win from using NOPL, then it would perhaps not be -- or we'd go through more effort at detecting the broken cases. However, since it's not, it's easier to just avoid it. -hpa --
Hi, Bump... This patch has not been merged. I see that there is still some lack of consensus in this area but please don't forget about it. The offending commit (f49aa448561fe) has ended up in Linus' tree, so --
yeah, sorry...I've been busy chasing down a compiler issue...I haven't forgotten, and will post this shortly. thanks, -Jason --
| Greg KH | Og dreams of kernels |
| Jens Axboe | [PATCH 31/33] Fusion: sg chaining support |
| Arnd Bergmann | Re: finding your own dead "CONFIG_" variables |
| Mark Brown | [PATCH 2/2] Subject: natsemi: Allow users to disable workaround for DspCfg reset |
| Tony Breeds | [LGUEST] Look in object dir for .config |
git: | |
| Brian Downing | Re: Git in a Nutshell guide |
| John Benes | Re: master has some toys |
| Matthias Lederhofer | [PATCH 4/7] introduce GIT_WORK_TREE to specify the work tree |
| Alexander Sulfrian | [RFC/PATCH] RE: git calls SSH_ASKPASS even if DISPLAY is not set |
| Junio C Hamano | Re: Rss produced by git is not valid xml? |
| Linux Kernel Mailing List | iSeries: fix section mismatch in iseries_veth |
| Linux Kernel Mailing List | ixbge: remove TX lock and redo TX accounting. |
| Linux Kernel Mailing List | ixgbe: fix several counter register errata |
| Linux Kernel Mailing List |
