Re: Dynamic nop selection breaks boot on Geode LX

Previous thread: Urgent Write Back by Dr. Raymond Kuo Fung CH'IEN. on Saturday, October 2, 2010 - 10:02 am. (1 message)

Next thread: INVESTMENT PROJECT IN YOUR COUNTRY. (HOW CAN I INVEST IN YOUR COUNTRY)? by kunihira barbara on Friday, October 1, 2010 - 12:59 am. (1 message)
From: Daniel Drake
Date: Saturday, October 2, 2010 - 12:16 pm

Hi,

linux-next fails to boot on OLPC's XO-1 laptop based on Geode LX processor.

This is because of

commit f49aa448561fe9215f43405cac6f31eb86317792
Author: Jason Baron <jbaron@redhat.com>
Date:   Fri Sep 17 11:08:51 2010 -0400

    jump label: Make dynamic no-op selection available outside of ftrace

http://git.kernel.org/?p=linux/kernel/git/mingo/linux-2.6-x86.git;a=commitdiff;h=f49aa44

The Geode hangs inside the asm() statement inside arch_init_ideal_nop5.

Any thoughts?

I'm a bit out of my depth debugging assembly code, but will happily
take pointers and patches to help diagnose.

cheers
Daniel
--

From: Borislav Petkov
Date: Saturday, October 2, 2010 - 10:50 pm

From: Daniel Drake <dsd@laptop.org>

maybe because the Geode doesn't have both the
P6_NOP5 and the NOP with 4 0x66 prefixes:
http://kerneltrap.org/mailarchive/linux-kernel/2010/8/27/4612336

and for some reason the trap can't find the fixup address. You say
"hangs" so you don't even get an "invalid opcode" OOPS?

Maybe on a Geode jmp label should use the 2-byte jmp + 3 1-byte nops
unconditionally after adding a synthetic CPUID flag in init_amd_k6(), in
the Geode part.

-- 
Regards/Gruss,
    Boris.
--

From: Borislav Petkov
Date: Sunday, October 3, 2010 - 2:26 am

From: Borislav Petkov <bp@alien8.de>

Maybe something like the following? (only tested in kvm)

--
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 220e2ea..e2ef56c 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -89,7 +89,7 @@
 #define X86_FEATURE_LFENCE_RDTSC (3*32+18) /* "" Lfence synchronizes RDTSC */
 #define X86_FEATURE_11AP	(3*32+19) /* "" Bad local APIC aka 11AP */
 #define X86_FEATURE_NOPL	(3*32+20) /* The NOPL (0F 1F) instructions */
-					  /* 21 available, was AMD_C1E */
+#define X86_FEATURE_NO_NOPL	(3*32+21) /* "" Missing NOPL (e.g. Geode LX) */
 #define X86_FEATURE_XTOPOLOGY	(3*32+22) /* cpu topology enum extensions */
 #define X86_FEATURE_TSC_RELIABLE (3*32+23) /* TSC is known to be reliable */
 #define X86_FEATURE_NONSTOP_TSC	(3*32+24) /* TSC does not stop in C states */
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index cb0e6d3..3e1fa13 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -666,6 +666,9 @@ void __init arch_init_ideal_nop5(void)
 	 *
 	 * TODO: check the cpuid to determine the best nop.
 	 */
+	if (boot_cpu_has(X86_FEATURE_NO_NOPL))
+		goto early_done;
+
 	asm volatile (
 		"ftrace_test_jmp:"
 		"jmp ftrace_test_p6nop\n"
@@ -688,6 +691,7 @@ void __init arch_init_ideal_nop5(void)
 		_ASM_EXTABLE(ftrace_test_nop5, 3b)
 		: "=r"(faulted) : "0" (faulted));
 
+early_done:
 	switch (faulted) {
 	case 0:
 		pr_info("converting mcount calls to 0f 1f 44 00 00\n");
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 7b875fd..d1e5cf4 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -139,7 +139,7 @@ static void __cpuinit init_amd_k6(struct cpuinfo_x86 *c)
 
 	if (c->x86_model == 10) {
 		/* AMD Geode LX is model 10 */
-		/* placeholder for any needed mods */
+		set_cpu_cap(c, X86_FEATURE_NO_NOPL);
 		return;
 	}
 }

-- 
Regards/Gruss,
    Boris.
--

From: Borislav Petkov
Date: Sunday, October 3, 2010 - 7:47 am

From: Borislav Petkov <bp@alien8.de>

Gaah, forgot the actual fallback to jmp + nop, test this one instead:

--
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 220e2ea..e2ef56c 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -89,7 +89,7 @@
 #define X86_FEATURE_LFENCE_RDTSC (3*32+18) /* "" Lfence synchronizes RDTSC */
 #define X86_FEATURE_11AP	(3*32+19) /* "" Bad local APIC aka 11AP */
 #define X86_FEATURE_NOPL	(3*32+20) /* The NOPL (0F 1F) instructions */
-					  /* 21 available, was AMD_C1E */
+#define X86_FEATURE_NO_NOPL	(3*32+21) /* "" Missing NOPL (e.g. Geode LX) */
 #define X86_FEATURE_XTOPOLOGY	(3*32+22) /* cpu topology enum extensions */
 #define X86_FEATURE_TSC_RELIABLE (3*32+23) /* TSC is known to be reliable */
 #define X86_FEATURE_NONSTOP_TSC	(3*32+24) /* TSC does not stop in C states */
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index cb0e6d3..d5fad2b 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -666,6 +666,11 @@ void __init arch_init_ideal_nop5(void)
 	 *
 	 * TODO: check the cpuid to determine the best nop.
 	 */
+	if (boot_cpu_has(X86_FEATURE_NO_NOPL)) {
+		faulted = 2;
+		goto early_done;
+	}
+
 	asm volatile (
 		"ftrace_test_jmp:"
 		"jmp ftrace_test_p6nop\n"
@@ -688,6 +693,7 @@ void __init arch_init_ideal_nop5(void)
 		_ASM_EXTABLE(ftrace_test_nop5, 3b)
 		: "=r"(faulted) : "0" (faulted));
 
+early_done:
 	switch (faulted) {
 	case 0:
 		pr_info("converting mcount calls to 0f 1f 44 00 00\n");
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 7b875fd..d1e5cf4 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -139,7 +139,7 @@ static void __cpuinit init_amd_k6(struct cpuinfo_x86 *c)
 
 	if (c->x86_model == 10) {
 		/* AMD Geode LX is model 10 */
-		/* placeholder for any needed mods */
+		set_cpu_cap(c, X86_FEATURE_NO_NOPL);
 		return;
 	}
 ...
From: Daniel Drake
Date: Sunday, October 3, 2010 - 9:32 am

The XO doesn't have standard VGA, so it is difficult to debug such
early crashes. This is crashing so early that kernel messages don't
even start to get sent over serial.

To debug these things, I checkpoint the code with various calls which
send individual characters over serial:

static void log_serial(char c)
{
   while ((inb(0x3fd) & 0x20) == 0) ;
   outb(c, 0x3f8);
   while ((inb(0x3fd) & 0x40) == 0) ;
}

So, I'm not really sure if/how it crashed or oops'd. However, I can
confirm that panic() does not get reached, since I put a character log
in there and it doesn't get sent. Let me know if you want me to put
character logging in other places.

I applied your two patches by hand and it doesn't solve the issue,
because the init_amd_k6() code is called long after
arch_init_ideal_nop5()

Daniel
--

From: Borislav Petkov
Date: Sunday, October 3, 2010 - 10:12 am

From: Daniel Drake <dsd@laptop.org>

Nice! Debugging a machine like that looks like a lot of jumping through
hoops. But let me ask you this: did you bisect linux-next to come up

Dang, we'll have to push the CPUID feature check into the early-routine
which runs before than arch_init_ideal_nop5() in setup_arch(), updated
patch is below. Just to make sure we're matching the right model, does
/proc/cpuinfo say family 5, model 10 on the machine?

Thanks.

--
From: Borislav Petkov <bp@alien8.de>
Date: Sun, 3 Oct 2010 11:11:54 +0200
Subject: [PATCH] jmp label: Fix boot hang on Geode LX

Since Geode LX doesn't implement the NOPL versions of NOP, fallback to
using JMP as a NOP on it.

Signed-off-by: Borislav Petkov <bp@alien8.de>
---
 arch/x86/include/asm/cpufeature.h |    2 +-
 arch/x86/kernel/alternative.c     |    6 ++++++
 arch/x86/kernel/cpu/amd.c         |   12 ++++++------
 3 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 220e2ea..e2ef56c 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -89,7 +89,7 @@
 #define X86_FEATURE_LFENCE_RDTSC (3*32+18) /* "" Lfence synchronizes RDTSC */
 #define X86_FEATURE_11AP	(3*32+19) /* "" Bad local APIC aka 11AP */
 #define X86_FEATURE_NOPL	(3*32+20) /* The NOPL (0F 1F) instructions */
-					  /* 21 available, was AMD_C1E */
+#define X86_FEATURE_NO_NOPL	(3*32+21) /* "" Missing NOPL (e.g. Geode LX) */
 #define X86_FEATURE_XTOPOLOGY	(3*32+22) /* cpu topology enum extensions */
 #define X86_FEATURE_TSC_RELIABLE (3*32+23) /* TSC is known to be reliable */
 #define X86_FEATURE_NONSTOP_TSC	(3*32+24) /* TSC does not stop in C states */
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index cb0e6d3..d5fad2b 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -666,6 +666,11 @@ void __init arch_init_ideal_nop5(void)
 	 *
 	 * TODO: check the cpuid to ...
From: Steven Rostedt
Date: Monday, October 4, 2010 - 11:06 am

Did you try earlyprintk? That is, on the kernel command line add:

	earlyprintk=ttyS0,115200

or whatever the tty and baud rate is. This will start printing out the
serial right at kernel startup. Even before printk is configured.



--

From: Jason Baron
Date: Monday, October 4, 2010 - 8:46 am

Hi Daniel,

looks like I moved dynamic no-op selection way too early in the boot -
before the exception tables required to select invalid opcodes were
even set up. The patch belows moves them later and should resolve this
issue for you. thanks for narrowing it down!

thanks,

-Jason


move arch_init_ideal_nop5 later

arch_init_ideal_nop5() was being called from setup_arch() before
the exception table was setup. Move it later into
alternative_instructions().

Fixes a boot hang on OLPC's XO-1 laptop based on Geode LX
processor.


Reported-by: Daniel Drake <dsd@laptop.org>
Signed-off-by: Jason Baron <jbaron@redhat.com>
---
 arch/x86/kernel/alternative.c |    5 +++++
 arch/x86/kernel/setup.c       |    6 ------
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index cb0e6d3..d8b5b21 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -454,6 +454,7 @@ extern struct paravirt_patch_site __start_parainstructions[],
 
 void __init alternative_instructions(void)
 {
+	unsigned long flags;
 	/* The patching is not fully atomic, so try to avoid local interruptions
 	   that might execute the to be patched code.
 	   Other CPUs are not running. */
@@ -508,6 +509,10 @@ void __init alternative_instructions(void)
 				(unsigned long)__smp_locks_end);
 
 	restart_nmi();
+
+	local_irq_save(flags);
+	arch_init_ideal_nop5();
+	local_irq_restore(flags);
 }
 
 /**
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 468ccd2..85b02b6 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -112,7 +112,6 @@
 #include <asm/numa_64.h>
 #endif
 #include <asm/mce.h>
-#include <asm/alternative.h>
 
 /*
  * end_pfn only includes RAM, while max_pfn_mapped includes all e820 entries.
@@ -690,7 +689,6 @@ void __init setup_arch(char **cmdline_p)
 {
 	int acpi = 0;
 	int k8 = 0;
-	unsigned long flags;
 
 #ifdef CONFIG_X86_32
 ...
From: Daniel Drake
Date: Monday, October 4, 2010 - 9:49 am

Applied this without the earlier patches from Borislav.

Thanks, that fixes it.

Daniel
--

From: Steven Rostedt
Date: Monday, October 4, 2010 - 1:31 pm

BTW, why the irq disabling around the call?

-- Steve


--

From: Jason Baron
Date: Monday, October 4, 2010 - 1:39 pm

Only b/c the selection of the nop was originally done by
'ftrace_dyn_arch_init()' which had irq's disabled around it. So maybe
that can be dropped now?

thanks,

-Jason
--

From: Steven Rostedt
Date: Monday, October 4, 2010 - 3:11 pm

Yeah, that disable is there for legacy reasons. It can be scrapped.

I'll remove it.

Thanks!

-- Steve


--

From: H. Peter Anvin
Date: Monday, October 4, 2010 - 2:51 pm

This code is fundamentally toxic and needs to be scrapped completely --
it is simply broken beyond repair.

We tried exactly this type of dynamic selection before, and it doesn't
work on broken virtualizers; in particular Microsoft VirtualPC can pass
the exception test and yet fail later.

The end result is very simple: you can always use NOPL on 64 bits, you
can never use NOPL on 32 bits.

66 66 66 66 90 will always *work* (as in, it will never fail) but it's
pretty slow on older CPUs which took a hit on handle prefixes -- but it
might still be faster than a jump on those.  Thus, in your code the JMP
case will never be reached anyway.

There isn't, of course, a classic 5-byte sequence, although the sequence:

	2E 8D 75 26 00

... should work (leal %ds:0(,%esi,1),%esi).  However, 66 ... 90 is
likely to work better on modern processors (although I haven't measured it.)

	-hpa
--

From: Steven Rostedt
Date: Monday, October 4, 2010 - 3:15 pm

The jmp was there because of paranoia, and I never expected it to be

The point is, this nop will be at _every_ function call (it replaces the
mcount call). Not just scattered throughout the kernel. It is imperative
that we have the best nop available.

So what would you recommend?

-- Steve


--

From: H. Peter Anvin
Date: Monday, October 4, 2010 - 3:22 pm

Yup.  Fun, isn't it?  :(  Unfortunately, broken virtualizers appear as
broken CPUs to us.  We used to do the #UD probe for NOPL, but it didn't

NOPL is special, because it's the only NOP sequence that isn't actually
*supported* on all processors (and we have found that we can't even use
it on 32 bits, even though the vast majority of all real-life 32-bit
processors do support it.)

Borislav is just checking to see if we can just use NOPL unconditionally
on 64 bits; as far as 32 bits is concerned the only option for picking
what is "best" is probably to benchmark some set of sequences on the set
of processors we care about.  However, I suspect that on any modern
processors either 66 66 66 66 90 or 2E 8D 75 26 00 will work equally well.

With a bit of benchmarking I think we could adopt the policy of using
NOPL on 64 bits and one of the above sequences on 32 bits.

	-hpa


--

From: Nick Lowe
Date: Monday, October 4, 2010 - 3:27 pm

As far as I'm aware, the check is just broken in VMs that trace their
lineage to Connectix's VirtualPC that Microsoft acquired back in 2006.
Is this correct? On balance, is this really to be cared about?
As far as 'common' processors that don't play nicely with NOPL...

VIA Edens based on the 'Samuel 2' design do not support CMOV or NOPL.
All VIA Edens based on the 'Nehemiah' design support CMOV but not
NOPL. (Introduced in
2003.)

Via C3s based on the 'Samuel 2'or 'Ezra'/'Ezra-T' design do not
support CMOV or NOPL.
All C3s based on the 'Nehemiah' design support CMOV but not NOPL.
(Introduced in 2003)

National Semi's GXm, GXLV and GX1 do not support CMOV or NOPL.
All Geodes since and including National Semi's GX2 support CMOV but
not NOPL. (Introduced in
2002)
The AMD branded Geodes (GX and LX) support CMOV but not NOPL.

The Cyrix 6x86 processors do not support CMOV or NOPL.
The Cyrix 8x86MX / Cyrix MII do support CMOV but not NOPL.

The AMD K6 and K6-2 do not support CMOV or NOPL.

There are probably a few more I've forgotten about...
Nick

--

From: H. Peter Anvin
Date: Monday, October 4, 2010 - 3:32 pm

If it was a huge win from using NOPL, then it would perhaps not be -- or
we'd go through more effort at detecting the broken cases.  However,
since it's not, it's easier to just avoid it.

	-hpa
--

From: Daniel Drake
Date: Tuesday, October 26, 2010 - 1:08 pm

Hi,


Bump... This patch has not been merged. I see that there is still some
lack of consensus in this area but please don't forget about it. The
offending commit (f49aa448561fe) has ended up in Linus' tree, so
--

From: Jason Baron
Date: Tuesday, October 26, 2010 - 1:12 pm

yeah, sorry...I've been busy chasing down a compiler issue...I haven't
forgotten, and will post this shortly.

thanks,

-Jason

--

Previous thread: Urgent Write Back by Dr. Raymond Kuo Fung CH'IEN. on Saturday, October 2, 2010 - 10:02 am. (1 message)

Next thread: INVESTMENT PROJECT IN YOUR COUNTRY. (HOW CAN I INVEST IN YOUR COUNTRY)? by kunihira barbara on Friday, October 1, 2010 - 12:59 am. (1 message)