I recently discovered that x86 kernels won't boot under Virtual PC. In this
case paravirt was not built into the kernel. The kernel just "hangs" on
attempted boot with no error messages. Git-bisect pinpointed the following
commit as the problem:
commit 32c464f5d9701db45bc1673288594e664065388e
Author: Jan Beulich <jbeulich@novell.com>
Date: Wed Oct 17 18:04:41 2007 +0200
x86: multi-byte single instruction NOPs
Add support for and use the multi-byte NOPs recently documented to be
available on all PentiumPro and later processors.
This patch only applies cleanly on top of the "x86: misc.
constifications" patch sent earlier.
[ tglx: arch/x86 adaptation ]
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
arch/x86/kernel/alternative.c | 23 ++++++++++++++++++++++-
include/asm-x86/processor_32.h | 22 ++++++++++++++++++++++
include/asm-x86/processor_64.h | 22 ++++++++++++++++++++++
eb662fea6ed7904074c87dd004b9f6d23964c844 M include
Any suggestion on where to go from here would be appreciated.
--
What CPU does Virtual PC emulate? As far as Wikipedia is concerned (not that I'd take it on complete faith) it emulates a 32-bit Intel Pentium II. And that commit makes the kernel use the "P6 nops" for such hardware. Maybe Virtual PC doesn't support the newer intel nop things? Intel docs say that it should be available on any intel CPU that has CPUID.01H.EAX[11:8] = 0110B or 1111B. That's the "family ID", and Pentium II should have a family ID of 6 (ie that 0110B case). So it sounds like a Virtual PC bug, but I dunno. And maybe we should just use the legcay nops for anything that isn't modern (ie P4+ or Core)? Linus --
On Sun, 31 Aug 2008 11:47:04 -0700 (PDT) it's probably even a security bug in that I don't see what would be stopping a ring 3 user process from executing these instructions... --
Well, it could be that Virtual PC raises a #UD exception in the virtual machine. In user space, that would just cause the kernel to kill the poor innocent victim. But when the kernel gets a #UD exception on what it expects to be a nop, it just won't work. Linus --
31 Aug 2008, David Sanders wrote:
Sorry about the confusion. I was just learning git and applied a couple
patches wrong that lead me to believe I had not found the answer, when in
fact I had. The problem is the multibyte NOP instructions introduced into
the kernel in 2007. For 2.6.24-rc1 reverting the patch (32c464f5) corrects
the problem. But then of course we cannot use these instructions as we would
like. I propose the following patch which adds a boot-time parameter to not
use these multibyte NOPs in systems that are unprepared to deal with them.
The patch disables a optimization used in one place by commenting out some
lines in nop.h. Please comment.
x86: Fix to multibyte NOPs breaking Virtual PC (win)
To be applied to linux-2.6.27-rc5.
Signed-off-by: David Sanders <linux@sandersweb.net>
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 2763cb3..6a63e78 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -58,6 +58,15 @@ static int __init setup_noreplace_paravirt(char *str)
__setup("noreplace-paravirt", setup_noreplace_paravirt);
#endif
+static int intel_legacy_nops = 0;
+
+static int __init setup_intel_legacy_nops(char *str)
+{
+ intel_legacy_nops = 1;
+ return 1;
+}
+__setup("intel-legacy-nops", setup_intel_legacy_nops);
+
#define DPRINTK(fmt, args...) if (debug_alternative) \
printk(KERN_DEBUG fmt, args)
@@ -165,13 +174,14 @@ static const struct nop {
const unsigned char *const *find_nop_table(void)
{
const unsigned char *const *noptable = intel_nops;
- int i;
+ int i = 0;
- for (i = 0; noptypes[i].cpuid >= 0; i++) {
+ while (noptypes[i].cpuid >= 0 && !intel_legacy_nops) {
if (boot_cpu_has(noptypes[i].cpuid)) {
noptable = noptypes[i].noptable;
break;
}
+ i++;
}
return noptable;
}
diff --git a/include/asm-x86/nops.h b/include/asm-x86/nops.h
index ad0bedd..a6f0571 100644
--- a/include/asm-x86/nops.h
+++ ...I disagree here: If I configure a 686+ kernel, I expect these NOPs to be that way (and to work). If you want to run on something that's not compliant, you just shouldn't configure your kernel that way. Jan --
Well, if you actually do a git grep 'ASM_NOP[0-9]' you'll find that just the _definitions_ of those things are the bulk of it BY FAR, and that there doesn't seem to be a single user that cares even remotely about performance. So I actually think that the whole thing is a waste of time. We should probably - pick a single set of NOP's per 32-bit/64-bit (since the good nops in 32-bit aren't 64-bit instructions at all, so we do want different nops depending on _that_) The whole static choice by microarchitecture is pure garbage. - Probably also just declare that those default nops are single instructions, just so that we never even have to think about it from a dynamic replacement angle. Look at the uses again, and realize that it really is just pure garbage to have this kind of complex and subtle stuff going on. - Move the optimized nop definitions (K7_NOPx etc) to the only place that cares - asm/x86/kernel/alternative.c. When we do things _dynamically_, it can actually make sense to pick a nop more precisely, but for this whole static thing it's just a pain. IOW, if it actually _worked_ reasonably, I wouldn't care. But clearly it doesn't. And once it's not working reasonably, it should be fixed. Linus --
yes - we had 3-4 tries already and while it looked worthwile initially it's clearly not showing signs of getting more robust and whatever benefits there might be slightly better NOPs is dwarved by all these robustness problems. Peter? Ingo --
There is already a fix for this based on a first-principles test in tip. -hpa --
Well, the paravirt_ops patching uses multibyte nops to pad out the
unused space in a patch site, and they're generally on hot paths
(otherwise we wouldn't bother with patching). But even then I don't
think the particular nop chosen matters all that much, and even if it
did using the dumb redundant prefix long nops seems to be as good as or
better than the p6 nops. The "call mcount" patching ftrace wants to do
Yep. We could run the p6 nops with an exception handler to see if the
cpu actually supports them or not. And if not, just fall back to
something simple and good enough.
J
--
Okay, we do not generate P6 NOPs statically if CONFIG_X86_GENERIC_CPU is enabled -- unless, of course gcc/binutils generates them, which they now do for certain CPU types. If you are selecting a specific CPU *and* you don't select GENERIC, you should get valid code for the selected CPU only. As far as using them dynamically, there is code in -tip that determines dynamically and explicitly if the long NOPs are available, and only uses them if they are. -hpa --
Hmm.. I'm not a huge fan of the ASM_NOP mess, but you also disable it for 64-bit x86 too. On 32-bit, at least the generic nops are fairly reasonable, but the default nops for 64-bit really look pretty sad, and the P6 nops really do look better. So I would suggest perhaps moving the static P6 nop selection into the CONFIG_X86_64 thing. The alternative is to just get rid of that static nop selection, and just have two cases: 32-bit and 64-bit, and just pick obviously safe cases for them. So I think that particular part would be better off with changing the Kconfig language instead. Ie something like this.. (I removed the 32-bit CPU's from the choices, except for MPENTIUM4 that really should be merged with MPSC - the difference between MPENTIUM4 and MPSC seems to be just a totally bogus 32-bit vs 64-bit thing. Yes, yes, there are PENTIUM4's without 64-bit, but there are also Prescott chips that run 32-bit kernels, so the thing is a bit confused. Linus --- arch/x86/Kconfig.cpu | 13 ++++++++----- 1 files changed, 8 insertions(+), 5 deletions(-) diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu index 2c518fb..b225219 100644 --- a/arch/x86/Kconfig.cpu +++ b/arch/x86/Kconfig.cpu @@ -382,14 +382,17 @@ config X86_OOSTORE # P6_NOPs are a relatively minor optimization that require a family >= # 6 processor, except that it is broken on certain VIA chips. # Furthermore, AMD chips prefer a totally different sequence of NOPs -# (which work on all CPUs). As a result, disallow these if we're -# compiling X86_GENERIC but not X86_64 (these NOPs do work on all -# x86-64 capable chips); the list of processors in the right-hand clause -# are the cores that benefit from this optimization. +# (which work on all CPUs). In addition, it looks like Virtual PC +# does not understand them. +# +# As a result, disallow these if we're not compiling for X86_64 (these +# NOPs do work on all x86-64 capable chips); the list of processors in +# the right-hand ...
Heh -- I originally only added it because you're asked for it ... Yes going back to some simple generic nops and avoid all these problems would be a great thing. -Andi --
Virtual PC does not emulate a processor like it does with the motherboard, video, NIC ,etc. What you see in the virtual machine is the actual processor, except that Virtual PC looks at all of your instructions and modifies ring 0 code and the like. It may be that Virtual PC was not designed with an awareness of these nops that the commit added. I would suggest an configuration option to select legacy-nops or newer-nops and a kernel boot-time parameter so it can be disabled to allow installation of a distribution for example. I would be happy to submit such a patch if you agree (or I'll try to anyway). --
OK, below is my patch. The problem is I just built with it and the boot
problem still exists. Either the patch is flawed or git-bisect gave me the
wrong commit. Any help would be appreciated.
David
Signed-off-by: David Sanders <linux@sandersweb.net>
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index ed92864..a38f362 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -445,6 +445,13 @@ config PARAVIRT_DEBUG
Enable to debug paravirt_ops internals. Specifically, BUG if
a paravirt_op is missing when it is called.
+config INTEL_NOPS
+ bool "Use legacy x86 NOPS"
+ depends on X86_32
+ help
+ This option tells kernel to use legacy x86 nops
+
+
config MEMTEST
bool "Memtest"
help
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 2763cb3..10713e4 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -58,6 +58,16 @@ static int __init setup_noreplace_paravirt(char *str)
__setup("noreplace-paravirt", setup_noreplace_paravirt);
#endif
+static int intel_legacy_nops = 0;
+
+static int __init setup_intel_legacy_nops(char *str)
+{
+ intel_legacy_nops = 1;
+ return 1;
+}
+
+__setup("intel-legacy-nops", setup_intel_legacy_nops);
+
#define DPRINTK(fmt, args...) if (debug_alternative) \
printk(KERN_DEBUG fmt, args)
@@ -167,12 +177,17 @@ const unsigned char *const *find_nop_table(void)
const unsigned char *const *noptable = intel_nops;
int i;
- for (i = 0; noptypes[i].cpuid >= 0; i++) {
- if (boot_cpu_has(noptypes[i].cpuid)) {
- noptable = noptypes[i].noptable;
- break;
+ if (!intel_legacy_nops) {
+#ifndef CONFIG_INTEL_NOPS
+ for (i = 0; noptypes[i].cpuid >= 0; i++) {
+ if (boot_cpu_has(noptypes[i].cpuid)) {
+ noptable = noptypes[i].noptable;
+ break;
+ }
}
+#endif /* CONFIG_INTEL_NOPS */
}
+
return noptable;
}
--
Try just reverting the commit that you bisected to, and double-check. If your bisect was flawed, there's no point even staring at the nop code ;) Linus --
Thanks Linus. I reverted the commit and the problem persits, so nops are definitely not the problem. I started an new bisection but am now in a state that won't compile due to errors. What do I do in that case? David --
You can try "git bisect skip", but in general the better choice is to just do git bisect visualize to open up gitk with the current set of possible targets, and then just pick a likely point by hand, and do git reset --hard <sha1-of-the-thing-you-picked> and try that one instead. There's some talk about this in "man git-bisect" but maybe it's not very good. Linus --
OK thanks, I had to build a new version of git because the one installed with my distribution was too old to have git bisect skip. I am trying to use the gitk method you mentioned but it is going slow and on my screen gitk's fonts are very hard to read. David --
Yeah - gitk's default fonts are a nightmare, I get a semi readable
interface by adding this to ~/.gitk
set mainfont {{dejavu sans mono} 10}
set textfont {{dejavu sans mono} 9}
set uifont {{dejavu sans mono} 10 bold}
But it seems the tcl/tk font rendering is pretty horrible all-round. I
generally prefer to use qgit for this reason.
--
Thanks that is a big improvement. --
