Hi,
it looks like we currently clear X86_FEATURE_NOPL in detect_nopl()
unconditionally and not only on 32-bit, as it should be according to
ba0593bf553c450a03dbc5f8c1f0ff58b778a0c8.
I'm thinking maybe something like the following might be in order
--
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index caa967d..9a2cfeb 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -713,7 +713,9 @@ void __init early_cpu_init(void)
*/
static void __cpuinit detect_nopl(struct cpuinfo_x86 *c)
{
+#ifdef CONFIG_X86_32
clear_cpu_cap(c, X86_FEATURE_NOPL);
+#endif
}
static void __cpuinit generic_identify(struct cpuinfo_x86 *c)
--
or maybe get even more radical and rip out the whole X86_FEATURE_NOPL
thing?
--
Regards/Gruss,
Boris.
--
Yes, that's a bug. It should be disabled only on 32 bits. -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. --
ba0593bf553c450a03dbc5f8c1f0ff58b778a0c8 cleared the aforementioned
cpuid bit only on 32-bit due to various problems with Virtual PC. This
somehow got lost during the 32- + 64-bit merge so restore the feature
bit on 64-bit.
Signed-off-by: Borislav Petkov <bp@alien8.de>
---
arch/x86/kernel/cpu/common.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index caa967d..9a2cfeb 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -713,7 +713,9 @@ void __init early_cpu_init(void)
*/
static void __cpuinit detect_nopl(struct cpuinfo_x86 *c)
{
+#ifdef CONFIG_X86_32
clear_cpu_cap(c, X86_FEATURE_NOPL);
+#endif
}
static void __cpuinit generic_identify(struct cpuinfo_x86 *c)
--
1.7.2.3
--
Regards/Gruss,
Boris.
--
I think we actually need to set it on 64 bits. -- Sent from my mobile phone. Please pardon any lack of formatting. --
From: "H. Peter Anvin" <hpa@zytor.com>
Ok, correct me if I'm wrong but I think it is already "indirectly" set
on 64-bits through REQUIRED_MASK3:
#define REQUIRED_MASK3 (NEED_NOPL)
which is defined as
#define NEED_NOPL (1<<(X86_FEATURE_NOPL & 31))
when either CONFIG_X86_P6_NOP or CONFIG_X86_64 are selected.
and cpu_has() does
#define cpu_has(c, bit) \
(__builtin_constant_p(bit) && \
...
(((bit)>>5)==3 && (1UL<<((bit)&31) & REQUIRED_MASK3)) || \
...
? 1 : test_cpu_cap(c, bit))
which returns 1 when testing for X86_FEATURE_NOPL, no?
--
Regards/Gruss,
Boris.
--
Sort of... only if the input is a constant. -- Sent from my mobile phone. Please pardon any lack of formatting. --
ba0593bf553c450a03dbc5f8c1f0ff58b778a0c8 cleared the aforementioned
cpuid bit only on 32-bit due to various problems with Virtual PC. This
somehow got lost during the 32- + 64-bit merge so restore the feature
bit on 64-bit. For that, set it explicitly for non-constant arguments of
cpu_has(). Update comment for future reference.
Signed-off-by: Borislav Petkov <bp@alien8.de>
---
arch/x86/kernel/cpu/common.c | 15 ++++++++++-----
1 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index caa967d..4b68bda 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -704,16 +704,21 @@ void __init early_cpu_init(void)
}
/*
- * The NOPL instruction is supposed to exist on all CPUs with
- * family >= 6; unfortunately, that's not true in practice because
- * of early VIA chips and (more importantly) broken virtualizers that
- * are not easy to detect. In the latter case it doesn't even *fail*
- * reliably, so probing for it doesn't even work. Disable it completely
+ * The NOPL instruction is supposed to exist on all CPUs of family >= 6;
+ * unfortunately, that's not true in practice because of early VIA
+ * chips and (more importantly) broken virtualizers that are not easy
+ * to detect. In the latter case it doesn't even *fail* reliably, so
+ * probing for it doesn't even work. Disable it completely on 32-bit
* unless we can find a reliable way to detect all the broken cases.
+ * Enable it explicitly on 64-bit for non-constant inputs of cpu_has().
*/
static void __cpuinit detect_nopl(struct cpuinfo_x86 *c)
{
+#ifdef CONFIG_X86_32
clear_cpu_cap(c, X86_FEATURE_NOPL);
+#else
+ set_cpu_cap(c, X86_FEATURE_NOPL);
+#endif
}
static void __cpuinit generic_identify(struct cpuinfo_x86 *c)
--
1.7.2.3
--
Regards/Gruss,
Boris.
--
Commit-ID: 366d4a43b1a7a861c356a0e407c4f03f454d42ea Gitweb: http://git.kernel.org/tip/366d4a43b1a7a861c356a0e407c4f03f454d42ea Author: Borislav Petkov <bp@alien8.de> AuthorDate: Mon, 4 Oct 2010 09:31:27 +0200 Committer: H. Peter Anvin <hpa@linux.intel.com> CommitDate: Mon, 4 Oct 2010 11:22:24 -0700 x86, cpu: Fix X86_FEATURE_NOPL ba0593bf553c450a03dbc5f8c1f0ff58b778a0c8 cleared the aforementioned cpuid bit only on 32-bit due to various problems with Virtual PC. This somehow got lost during the 32- + 64-bit merge so restore the feature bit on 64-bit. For that, set it explicitly for non-constant arguments of cpu_has(). Update comment for future reference. Signed-off-by: Borislav Petkov <bp@alien8.de> LKML-Reference: <20101004073127.GA20305@liondog.tnic> Cc: Ryan O'Neill <ryan@innosecc.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> --- arch/x86/kernel/cpu/common.c | 15 ++++++++++----- 1 files changed, 10 insertions(+), 5 deletions(-) diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index f2f9ac7..927e630 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -704,16 +704,21 @@ void __init early_cpu_init(void) } /* - * The NOPL instruction is supposed to exist on all CPUs with - * family >= 6; unfortunately, that's not true in practice because - * of early VIA chips and (more importantly) broken virtualizers that - * are not easy to detect. In the latter case it doesn't even *fail* - * reliably, so probing for it doesn't even work. Disable it completely + * The NOPL instruction is supposed to exist on all CPUs of family >= 6; + * unfortunately, that's not true in practice because of early VIA + * chips and (more importantly) broken virtualizers that are not easy + * to detect. In the latter case it doesn't even *fail* reliably, so + * probing for it doesn't even work. Disable it completely on 32-bit * unless we can find a reliable way to ...
From: tip-bot for Borislav Petkov <bp@alien8.de> tag it for -stable too? -- Regards/Gruss, Boris. --
What is the flaw that justifies it for stable, or even .36? It seems relatively harmless, with at most a very minor performance issue, unless I'm missing something? -hpa --
From: "H. Peter Anvin" <hpa@linux.intel.com> I was thinking more along the lines of this being partially broken when supplying non-constant arguments to cpu_has(), something like <arch/x86/kernel/cpu/proc.c:show_cpuinfo()>, for example. But all this could cause is /proc/cpuinfo not to report the "nopl" feature bit. I guess this is too minor an issue to even to be considered for stable. -- Regards/Gruss, Boris. --
I don't think this is right.
The cpu_has() logic depends not on x86-64, but on X86_P6_NOP.
Which has
depends on X86_64
depends on (MCORE2 || MPENTIUM4 || MPSC)
as its config rules, not just X86_64.
So I think your patch is bogus. It makes the current situation even
_more_ confusing than it is.
So what is it? Should we get rid of that odd X86_P6_NOP thing
entirely? Or should we use it consistently for the "this machine has
NOPL"? Should we always use P6_NOP for x86-64 and just remove the
"MCORE2 || MPENTIUM4 || MPSC" thing?
Whatever we do, I don't think this patch is the right one.
Linus
--
Actually, cpu_has() depends on:
#if defined(CONFIG_X86_P6_NOP) || defined(CONFIG_X86_64)
Obviously, if we *use* P6 NOPs they better be available on the
processor, but we also are pretty sure that every 64-bit processor
Right; the top clause, of course, was added later, as we found out that
it was unsafe to ever use NOPL on 32 bits, because of Microsoft f*ckups.
CONFIG_X86_P6_NOP was intended to indicate that using NOPL is
*preferred*, whereas the CPUID bit -- cpu_has() -- was (and is) intended
to indicate that NOPL is *supported*, not necessarily preferred.
As such, the code I believe is technically correct for the current
situation (NOPL is always supported on 64 bits, never on 32 bits), but
as you quite correctly point out it is definitely more confusing than is
desirable; this is probably also reflected by the following code in
alternative.c:
static const unsigned char *const *__init_or_module find_nop_table(void)
{
if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
boot_cpu_has(X86_FEATURE_NOPL))
return p6_nops;
else
return k8_nops;
}
The vendor check here is really ugly.
The only case where we need CONFIG_X86_P6_NOP as a compile-time check is
for the ASM_NOP* macros; for dynamic code we're of course better off
with a runtime check, but it would be better if that was part of the CPU
routines. Perhaps X86_FEATURE_FAST_NOPL or something like that.
-hpa
--
Ahh. Right you are. The place that depends on just P6_NOP is the
default NOP choice logic in <asm/nops.h>
But the end result ends up being the same: can we please clean this
all up so that it isn't so confusing? Rather than add to the
confusion?
Linus
--
Agreed that this should be cleaned up. However, in the meantime I'd like to keep Borislav's patch in the tree since it makes the code technically correct at least. OK? -hpa --
From: "H. Peter Anvin" <hpa@zytor.com>
I think the cleanup should be easy: on 64-bit unconditionally return
p6_nops in find_nop_table() since every 64-bit processor should support
them and on 32-bit never return p6_nops since X86_FEATURE_NOPL is not
set there and fall back to intel_nops on non-AMD. Which means we can get
rid of X86_FEATURE_NOPL altogether. Too radical?
Hmmm...
--
Regards/Gruss,
Boris.
--
This is fine with me, but you may want to check with your employer if the Intel NOPs will perform slower than the 66 ... 90 NOPs on AMD processors. -hpa --
FWIW, if P6_NOPs work as well as 66...90 on AMD, I'd much rather switch to the fixed sequence for all 64-bit processors. -hpa --
From: "H. Peter Anvin" <hpa@zytor.com>
--
Regards/Gruss,
Boris.
--
Another piece of the confusion I noticed a couple of days ago: X86_MINIMUM_CPU_FAMILY defaults to "6" if X86_32 && X86_P6_NOP; whereas X86_P6_NOP depends on X86_64. Hugh --
Again, it's completely consistent -- if you keep in mind that CONFIG_X86_P6_NOP depending on X86_64 is a policy decision; that policy can theoretically be changed. However, it really doesn't seem worth it to ever contemplate at this point. -hpa --
