Re: [BUG] x86 kenel won't boot under Virtual PC

Previous thread: [RFC] x86: io-apic - convert DO_ACTION macro into function by Cyrill Gorcunov on Sunday, September 7, 2008 - 12:26 pm. (3 messages)

Next thread: [GIT PULL] kmemcheck updates for tip/kmemcheck by Vegard Nossum on Sunday, September 7, 2008 - 1:09 pm. (5 messages)
From: David Sanders
Date: Sunday, September 7, 2008 - 1:07 pm

Peter, sorry about delay in replying I was out of town for weekend.  I will 
apply your patch and git back to you on results.  A quick look a the patches 
indicates that nop.h and the users of nop.h are not addressed so I may have 
to combine your patches with Linus's Kconfig patch to get a working kernel.
David


--

From: David Sanders
Date: Sunday, September 7, 2008 - 4:22 pm

Yes the patch you supplied (and now in Linus's tree) fixes (part) of the problem.
In order to get a working kernel I also need a patch for the nops.h issue.  
I have tested and confirmed that the patch posted by Linus works and finally 
solves the problem with virtual PC.

Linus please apply your patch (which I quote below).
---
 arch/x86/Kconfig.cpu |   13 ++++++++-----
 1 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu
index 2c518fb..b225219 100644
--- a/arch/x86/Kconfig.cpu
+++ b/arch/x86/Kconfig.cpu
@@ -382,14 +382,17 @@ config X86_OOSTORE
 # P6_NOPs are a relatively minor optimization that require a family >=
 # 6 processor, except that it is broken on certain VIA chips.
 # Furthermore, AMD chips prefer a totally different sequence of NOPs
-# (which work on all CPUs).  As a result, disallow these if we're
-# compiling X86_GENERIC but not X86_64 (these NOPs do work on all
-# x86-64 capable chips); the list of processors in the right-hand clause
-# are the cores that benefit from this optimization.
+# (which work on all CPUs).  In addition, it looks like Virtual PC
+# does not understand them.
+#
+# As a result, disallow these if we're not compiling for X86_64 (these
+# NOPs do work on all x86-64 capable chips); the list of processors in
+# the right-hand clause are the cores that benefit from this optimization.
 #
 config X86_P6_NOP
 	def_bool y
-	depends on (X86_64 || !X86_GENERIC) && (M686 || MPENTIUMII || MPENTIUMIII || MPENTIUMM || MCORE2 || MPENTIUM4 || MPSC)
+	depends on X86_64
+	depends on (MCORE2 || MPENTIUM4 || MPSC)
 
 config X86_TSC
 	def_bool y



--

From: H. Peter Anvin
Date: Sunday, September 7, 2008 - 6:48 pm

No, please don't.  Instead, David, please enable CONFIG_X86_GENERIC.

	-hpa
--

From: David Sanders
Date: Sunday, September 7, 2008 - 7:49 pm

I checked the distribution I'm using (debian) and the kernel shipped with it 
does not have  CONFIG_X86_GENERIC set.  This means _when_ they ship a 2.6.27 
kernel it won't work with Virtual PC so I won't be able to even install it.
However, with Linus's patch I am guaranteed to work for all kernels built for 
X86_32 and even for X86_64 because in that case the virtual environment will 
support the multibyte NOPs.  I need Linus's fix in order to support the 
Linux-using virtual pc community.  I don't have the resources to lobby each 
individual distribution about their kernel config.  I want it to just work.  
        David
--

From: H. Peter Anvin
Date: Sunday, September 7, 2008 - 9:04 pm

Under that logic we shouldn't even have CPU configurables, since you 
want it to "just work" whatever crap you're running on.  That is EXACTLY 
what CONFIG_X86_GENERIC means, and the fact that any particular 
distribution is broken with respect to not enabling it is a bug in that 
distribution, and not grounds for breaking the upstream kernel.

	-hpa
--

From: Andi Kleen
Date: Monday, September 8, 2008 - 2:42 am

> Under that logic we shouldn't even have CPU configurables, since you 


At least when I introduced X86_GENERIC it was just an optimization,
not a requirement.

That is the kernel pretty much always did "just work"
(with only a very few exceptions like PAE vs non PAE) on all 
CPUs. The CPU configs also just specified optimizations, not correctness.
The code for all CPUs used to be always there.
X86_GENERIC was mostly just to do things like always use 
the largest cache alignment.

I think someone changed that recently, but imho that wasn't
an improvement. As you can see it just causes endless support

Well it's more like that you guys changed the semantics 
without warnings the distributions. I'm not sure you can blame
Debian for that.

-Andi
-- 
ak@linux.intel.com
--

From: David Sanders
Date: Monday, September 8, 2008 - 6:25 am

Well, OK I could do that.  But I can't bug every distribution in existence for 
I don't see that Linus's patch breaks the upstream kernel.  Just the opposite, 
you go and determine in alternative.c that the processor doesn't support NOPL 
and then go ahead and use it anyway in nops.h.  That makes no sense to me.

Could we use the result of find_nop_table() instead of nops.h?

David
--

From: Linus Torvalds
Date: Monday, September 8, 2008 - 8:09 am

I dunno.. Event he help-text doesn't actually agree with that:

  config X86_GENERIC
        bool "Generic x86 support"
        depends on X86_32
        help
          Instead of just including optimizations for the selected
          x86 variant (e.g. PII, Crusoe or Athlon), include some more
          generic optimizations as well. This will make the kernel
          perform better on x86 CPUs other than that selected.
          
          This is really intended for distributors who need more
          generic optimizations.

Also, quite frankly, while the CPU processor type message says

          The kernel will not necessarily run on earlier architectures than
          the one you have chosen, e.g. a Pentium optimized kernel will run on
          a PPro, but not necessarily on a i486.

I thought you agreed that CPU virtualization can be a problem? That was 
the whole excuse for why the dynamic code was changed. Why would it not be 
true for the static code?

The fact is, if you want to run on a Core2 or other modern CPU, then 
"Virtual PC" is apparently buggy in this respect. You worked around it for 
the dynamic choice - but that's totally _pointless_ if you then don't want 
to work around it for the static one.

			Linus
--

From: Ingo Molnar
Date: Monday, September 8, 2008 - 8:23 am

yes. X86_P6_NOPS is a totally insignificant optimization and if it makes 
_any_ CPU not boot (be that virtual or real), then it's frankly not 
worth it.

David, exactly how does the kernel fail to boot with latest -git? 
(v2.6.27-rc5-313-g64f996f or later) Does detect_nopl() run? It really 
should, and it should detect the non-working instructions.

	Ingo
--

From: H. Peter Anvin
Date: Monday, September 8, 2008 - 8:36 am

Okay, a few things here...

1. First, I wrote up a patch yesterday to update the CONFIG_X86_GENERIC 
description and to make it "default y".  It is currently on 
x86/defconfig, but I think it should be promoted to mainline immediately.

2. X86_P6_NOPS is not the only source of static NOPLs.  If gcc is set to 
optimize for specific architectures, then gcc/binutils will generate 
static NOPLs.  The only way we can prevent that is by not using specific 
-march options, as far as I can tell.

3. I'm not positive that CONFIG_X86_GENERIC currently avoid all cases of 
(2), but it obviously should.  I will verify that today and add a 
followup patch to the Makefiles if necessary.

Given all of this, I really think that putting this on 
CONFIG_X86_GENERIC, *AND* making CONFIG_X86_GENERIC the default is the 
right choice.

	-hpa

--

From: David Sanders
Date: Monday, September 8, 2008 - 8:38 am

Ingo,
With CONFIG_X86_GENERIC=y, the latest v2.6.27 in Linus's tree boots fine.  But 
if you don't select that option (and some distributions don't) it won't boot 
at all.  It just hangs (blank screen) with no error messages and nothing in 
dmesg.  I assume it is hitting one of the ASM_NOP? instructions.
David
--

From: H. Peter Anvin
Date: Monday, September 8, 2008 - 8:38 am

It almost certainly never gets as far as detect_nopl().

	-hpa
--

From: Andi Kleen
Date: Monday, September 8, 2008 - 8:45 am

The help text matches in how I wrote this option originally.
The original use case was the 128 byte P4 cache lines.

-Andi
--

From: H. Peter Anvin
Date: Monday, September 8, 2008 - 8:43 am

The help text is indeed out of date.  I did a patch yesterday to, among 
other things, update it; I also want to verify that we are disabling all 
options that can cause gcc or binutils to generate nopl's; I plan to 
push it today.

	-hpa
--

From: Andi Kleen
Date: Monday, September 8, 2008 - 8:50 am

You (or whoever did those changes) likely broke a lot of distribution setups 
subtly then. Hopefully the changes were worth that.

-Andi
-- 
ak@linux.intel.com
--

From: H. Peter Anvin
Date: Monday, September 8, 2008 - 8:50 am

Likely broke a lot of distribution setups how?

	-hpa
--

From: Andi Kleen
Date: Monday, September 8, 2008 - 8:57 am

Previously they could set CPU x and it would still run on other CPUs
even if that option was not set. If that's not the case anymore then
they will have some unhappy users once they update their kernels.

-Andi
-- 
ak@linux.intel.com
--

From: H. Peter Anvin
Date: Monday, September 8, 2008 - 8:54 am

That was broken long before by the gcc and binutils authors.

	-hpa
--

From: H. Peter Anvin
Date: Monday, September 8, 2008 - 8:50 am

OK, turns out that is already happending, by virtue of:

# add at the end to overwrite eventual tuning options from earlier
# cpu entries
cflags-$(CONFIG_X86_GENERIC)    += $(call tune,generic,$(call tune,i686))

-mtune=generic on 32-bit CPUs disable NOPL generation, and any 
gcc/binutils combo too old to have -mtune=generic won't generate them at 
all.

I think I verified this some time in the past, but I just had to refresh 
my memory.  This was a major reason to put this functionality on 
CONFIG_X86_GENERIC.

	-hpa
--

From: Linus Torvalds
Date: Monday, September 8, 2008 - 9:07 am

Peter.

The help text may be out of date because of changes to NOPL usage, but you 
should ask yourself whether the change is actually a _good_ change.

IOW, I really don't see why you are pushing changing the help-text, 
instead of just making the kernel work better.

The fact that some broken gcc/binutils versions may screw us over _anyway_ 
may well mean that we should just push back on _that_ change instead.

Quite frankly, from a user perspective, even a very _technical_ one, 
please tell me what the advantage of not being fairly generic by default 
is. Really.

Yes, there are some _big_ ISA issues where it is worth doing real static 
code selection (as opposed to just instruction selection and scheduling 
etc that still _works_ for everybody, but optimizes for certain 
archtiectures).

So things like cmpxchg/xadd (for atomics) and cmov (for compiler-generated 
code), and bswap (for networking) can really make a big difference, and 
are not really realistic to do dynamically. 

But NOPL? That's simply not _worth_ it being painful over.

And the fact is, the current help text describes

 (a) the historical meaning (optimize for a specific architecture, but 
     don't make extreme choices that are bad for others)
 (b) what people would generally _want_.

and I really don't think that changing the help text is the right solution 
here. It may be "technically correct", but it is simply not user-friendly 
or smart.

			Linus
--

From: H. Peter Anvin
Date: Monday, September 8, 2008 - 9:13 am

The issue at hand is that at least with the current toolchain, we need 
to pass -march=generic in order for these instructions to be generated. 
  We have an option for this, CONFIG_X86_GENERIC, which distributions 
really *should* be using anyway.

And yes, it should be the default.  The patch I have makes it
"default y" as well as change the help text.

Would it make you happier if this option was forced enabled unless 
CONFIG_EMBEDDED was on?

	-hpa

--

From: H. Peter Anvin
Date: Monday, September 8, 2008 - 9:15 am

I guess the other option is to create a new option for selecting the 
dangerous -mtune= variants, and possibly lock *that* option to 
CONFIG_EMBEDDED.

	-hpa
--

From: David Sanders
Date: Monday, September 8, 2008 - 9:26 am

The option that works best for users is for the kernel to not use NOPLs on 
32-bit cpu's.  That is what Linus's patch accomplishes.  I am not having any 
issues with gcc or gas generating NOPLs.  The kernel should work out of the 
box for PentiumPro or later cpu's running in 32-bit mode.  I think you are 
trying to over-engineer the issue.  Just apply the patch and move on.
David
--

From: Linus Torvalds
Date: Monday, September 8, 2008 - 9:20 am

It sounds like it shouldn't be a default at all, it should just _always_ 
be on, if there really are gcc's that care that much. Most of our 
optimizations have historically really been about _optimizing_, not about 
"it won't work", even if we have had exceptions (but as mentioned, I think 

Yes, putting it behind EMBEDDED will certainly fix the issue. Anybody who 
actually enables EMBEDDED and does all his choices by hand should no 
longer expect to not have to know _exactly_ what he is doing.

So if it's behind EMBEDDED, and defaults to "on", then I have no problem 
with changing the help text to say "If you do this, we'll statically do 
things that really _require_ you to have a CPU that looks _exactly_ like 
the CPU you claimed".

		Linus
--

From: H. Peter Anvin
Date: Monday, September 8, 2008 - 9:32 am

Sounds like a plan.  I'll have a patch shortly.

	-hpa
--

From: Ingo Molnar
Date: Monday, September 8, 2008 - 10:02 am

yep. I thought about changing the name and adding a default y but 
keeping the name and also putting it behind EMBEDDED would be perfect.

	Ingo
--

From: david
Date: Monday, September 8, 2008 - 9:34 am

I always understood the CPU selection to be "this CPU and ones compatible 
with it will work, others won't" unless generic was enabled. the fact that 
only a few CPU's wouldn't work and the rest was optimization is true, but 
the details of what chips would and wouldn't work were never that clear. 
The difference between a kernel compiled for generic and once compiled for 
a specific CPU can be very significant. (I ran into 30% differences back 
in the 2.4 days between generic and Athlon) This is why all the distros 
don't enable the generic cpu option on their kernels nowdays. I'd hate to 
see all the distros enabling embedded just to get this performance boost

David Lang
--

From: H. Peter Anvin
Date: Monday, September 8, 2008 - 9:42 am

It's about producing code that works generically, not a performance boost.

	-hpa
--

From: david
Date: Monday, September 8, 2008 - 11:51 am

I was arguing against the proposal to enable generic unless someone went 
under embedded to disable it.

if that's not what was being proposed, my apologies for misunderstanding.

David Lang
--

From: Linus Torvalds
Date: Monday, September 8, 2008 - 9:59 am

No.

Read the help text.. 

Yes, we care about features that MATTER. But if compiles start using 
features that don't really matter, and make a specific kernel _too_ 
specific, then we need to reign in the madness.

IOW, it's a balance. On one hand, yes, the uarch makes sense. On the 
other, it's just stupid to have to worry about details that don't 
realistically make any difference at all - except whether the machine 
works or not.

And yes, we could just put this up as a Virtual PC bug. It clearly is. But 
in the end, it _still_ all boils down to a balance between "do we actually 
win anything by using NOPL statically" vs "do we lose anything by being 
too damn inconvenient".

		Linus
--

From: Andi Kleen
Date: Monday, September 8, 2008 - 10:00 am

Originally it was an option because the 128 byte cache line it selects
caused bloat in several important data structures. That was back
then when cache line padded NR_CPUs arrays were still pretty common.
I don't know if it's still a problem, but before making it default y
it would be good to check the dynamic+static memory cost at least.

-Andi
--

From: Linus Torvalds
Date: Monday, September 8, 2008 - 10:04 am

Btw, I do think that the whole NOPL issue is separate from all the other 
issues. There can be _other_ cases where it really is worth doing some 
"generic" optimizations or being more "specific", and my argument really 
is that NOPL is _not_ one of those cases.

So I'm still not sure that X86_GENERIC is necessarily the answer. The 
answer may be:

 - never use NOPL statically unless we _know_ it works (eg x86-64)

 - never allow such a stupid decision by gcc as to use NOPL on x86-32.

..and then leave X86_GENERIC alone wrt everything else.

Peter - does gcc actually use NOPL in _32-bit_ code too? It really seems 
to be a stupid decision to make a binary not run on other CPU's over 
something as trivial as that. That's something I'd expect out of an Intel 
compiler just to mess with AMD, not out of gcc.

			Linus
--

From: H. Peter Anvin
Date: Monday, September 8, 2008 - 10:08 am

Yes, it does:

   /* We need to decide which NOP sequence to use for 32bit and
      64bit. When -mtune= is used:

      1. For PROCESSOR_I386, PROCESSOR_I486, PROCESSOR_PENTIUM and
      PROCESSOR_GENERIC32, f32_patt will be used.
      2. For PROCESSOR_PENTIUMPRO, PROCESSOR_PENTIUM4, PROCESSOR_NOCONA,
      PROCESSOR_CORE, PROCESSOR_CORE2, and PROCESSOR_GENERIC64,
      alt_long_patt will be used.
      3. For PROCESSOR_ATHLON, PROCESSOR_K6, PROCESSOR_K8 and
      PROCESSOR_AMDFAM10, alt_short_patt will be used.

      When -mtune= isn't used, alt_long_patt will be used if
      cpu_arch_isa_flags has Cpu686. Otherwise, f32_patt will
      be used.

"alt_long_patt" uses NOPL and its variants.

	-hpa
--

From: H. Peter Anvin
Date: Monday, September 8, 2008 - 10:12 am

I should point out that the code above is from binutils (gas), not from 
gcc.  It is entirely possible that doing something like
"-mtune=core2 -Wa,-mtune=generic" might actually work, but I have not 
dug through the gcc code proper to figure out if that (a) would work, 
and (b) will remain the case...

	-hpa
--

From: Linus Torvalds
Date: Monday, September 8, 2008 - 10:41 am

Doesn't matter. If gcc passes "-mtune=i686" down to gas and thus the end 
result doesn't match gcc's own documentation (and whole point of -mtune), 
then it's still a gcc bug, even if it may not be a "cc1" bug. The fact 
that "cc1" and "gas" are two different phases is pretty much irrelevant. 
gcc calls them both.

But sure, if there are separate bugzillas, post it in both. 

			Linus
--

From: Linus Torvalds
Date: Monday, September 8, 2008 - 10:38 am

That is A TOTAL PIECE OF SH*T, and against gcc's own documentation.

"-mtune=x" is very much defined to be a performance _tuning_ option, not 
an "architectural" option. 

Quite frankly, this is a gcc bug. Plain and simple. 

Look at the gcc man-pages. It very much says:

       -mtune=cpu_type
           Set only the instruction scheduling parameters for machine type
           cpu_type.  The instruction set is not changed.

(in various different formats - it says the same things with different 
words for different architectures. For example, for 386/x86-64 it 
_specifically_ says:

       -mtune=cpu-type
           Tune to cpu-type everything applicable about the generated code,
           except for the ABI and the set of available instructions.  The
           choices for cpu-type are:

note the "except for the ABI and the set of available instructions".

Qutie frankly, I think this is a *much* bigger bug than any possible 
VirtualPC idiocy. It's expressly against the whole idea of "-mtune".

Of course, we do set "-march=i686" too, so maybe the gcc people will claim 
that that means "Intel CPU's only". But quite frankly, if they really 
claim that, then they are lying pond-scum:

           i686
               Same as "generic", but when used as "march" option, PentiumPro
               instruction set will be used, so the code will run on all i686
               family chips.

which any sane x86 person would claim includes all the various clone 
manufacturers too. Saying that NOPL is so important that "i686" should 
suddenly be Intel-only is dishonest and totally stupid.

IOW, somebody who has a gcc bugzilla login should just create a bug-report 
on this. There is no question but that this is a bug, plain and simple.

			Linus
--

From: H. Peter Anvin
Date: Monday, September 8, 2008 - 10:59 am

No argument there.

	-hpa
--

From: H. Peter Anvin
Date: Monday, September 8, 2008 - 12:09 pm

OK, digging some more into this garbage...

Apparently the situation isn't quite as dire as it first seems.
At least gcc 4.3.0 doesn't actually pass -mtune= to gas; it just drops 
the option on the floor.  This means this bug isn't manifest when 
calling from gcc (as opposed to invoking as directly.)

However, I would still like to push the following patch to be on the 
safe side, ok?

	-hpa

From: Linus Torvalds
Date: Monday, September 8, 2008 - 3:42 pm

Ok, goodie. And we don't pass the -mtune=xyz option when we use gas 

I have no problem with it, with the fix to actually assemble things. But I 
also don't think it's a huge deal (ie not 2.6.27 stuff - we don't know 
if some odd version of as does something odd with -mtune=generic, or 
whether this can interact oddly with some version of gcc perhaps passing 
the _correct_ -mtune to the assembler?).

		Linus
--

From: Andi Kleen
Date: Monday, September 8, 2008 - 10:18 am

FWIW I personally think Linux should always use very conservative nops. 
Unconditionally. This issue already cost far too much developer time

gcc just uses .p2align, so it comes down to binutils

Yes there seems to be a pretty scary ISA selection code in there.

Line 565 of

http://www.google.com/codesearch?hl=en&q=i386_align_code+show:pxieOi43ptA:OXGjJbna...

I haven't decoded it completely, but I suspect it does the wrong thing.

-Andi
--

Previous thread: [RFC] x86: io-apic - convert DO_ACTION macro into function by Cyrill Gorcunov on Sunday, September 7, 2008 - 12:26 pm. (3 messages)

Next thread: [GIT PULL] kmemcheck updates for tip/kmemcheck by Vegard Nossum on Sunday, September 7, 2008 - 1:09 pm. (5 messages)