Re: i387/FPU init issues...

From: jamal
Date: Saturday, May 3, 2008 - 3:32 am

Peoplez,

I've narrowed down a problem i am having with an old P2 to commit
61c4628b538608c1a85211ed8438136adfeb9a95 with subject "x86, fpu: split
FPU state from task struct - v5" (Authored by Suresh and committed by
Ingo on Apr/19).

In the process i learnt how painfully time consuming and boring a blind
git bisect feast could be (the last time a kernel worked on the P2 was
back in 2.6.23). I literally spent no less than 10 hours tracking this
(OK, I was chewing tobacco in between running git bisect bad/good,
compile, copy over kernel, spit here, reboot, test).
Also this patch is so huge that given my lack of knowledge in the area,
i couldn't do better bisecting to be more exact on what is causing this.
i.e. the patch is not bisect-friendly.
So the best i can do is have other people take it from here.

I am able to reproduce the issue consistently on my laptop using qemu
(which helped speed debugging a bit). I have also narrowed it down to
include/asm-x86/i387.h::__save_init_fpu in (32 bit version) - it dies
somewhere in calling the following line:

----
        alternative_input(
                "fnsave %[fx] ;fwait;" GENERIC_NOP8 GENERIC_NOP4,
                "fxsave %[fx]\n"
                "bt $7,%[fsw] ; jnc 1f ; fnclex\n1:",
                X86_FEATURE_FXSR,
                [fx] "m" (tsk->thread.xstate->fxsave),
                [fsw] "m" (tsk->thread.xstate->fxsave.swd) : "memory");
----------

The only thing that has changed there compared to good version is the
last two lines. But that looks sane to me given the struct naming has
changed. So i am suspecting the calling path perhaps not setting
something or other.
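
For readers unfamiliar with the FXSR branch of the snippet above: after `fxsave` it tests bit 7 of the saved x87 status word and runs `fnclex` only when that bit is set. A minimal C sketch of that decision follows. The struct and function names are made up for illustration, only the first two fields of the save area are modelled, and the meaning of bit 7 (exception summary) comes from the x87 architecture rather than from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cut-down view of the start of the FXSAVE area. */
struct fx_head {
	uint16_t cwd;	/* x87 control word, offset 0 */
	uint16_t swd;	/* x87 status word, offset 2 */
};

/* The status word is expected at offset 2 of the save area. */
static_assert(offsetof(struct fx_head, swd) == 2, "swd must sit at offset 2");

#define FSW_ES	(1u << 7)	/* the bit tested by "bt $7,%[fsw]" */

/* Mirrors "bt $7 ; jnc 1f ; fnclex": returns 1 when fnclex would run. */
int would_fnclex(uint16_t swd)
{
	return (swd & FSW_ES) != 0;
}
```

A status word of 0x0080 gives 1 (pending exception, so `fnclex` runs); 0x0000 gives 0 and the `jnc` skips it.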

------------ boot output paste ----------------------
[....]
Compat vDSO mapped to ffffe000.
CPU: Intel Pentium II (Klamath) stepping 03
Checking 'hlt' instruction... OK.
Freeing SMP alternatives: 0k freed
invalid opcode: 0000 [#1]
Modules linked in:

Pid: 0, comm: swapper Not tainted (2.6.25-00000-g61c4628 #22)
EIP: 0060:[<c01012d0>] EFLAGS: ...
--

From: James Courtier-Dutton
Date: Saturday, May 3, 2008 - 3:57 am

My guess would be that the jnc 1f is now wrong.

--

From: jamal
Date: Saturday, May 3, 2008 - 6:53 am

Do elucidate. The instruction with jnc 1f has always been there.
Is there something as a result of the compiler or the codepath that
makes it now wrong?

cheers,
jamal

--

From: Thomas Gleixner
Date: Saturday, May 3, 2008 - 8:31 am

This looks bad.

  0f ae 00		fxsave (%eax)
  0f ba 60 02 07       	btl    $0x7,0x2(%eax)
  73 02                	jae    (skip fnclex)
  db e2                	fnclex 
  0f 1f 00             	nopl   (%eax)

 ^^^^ This is a P4+ instruction. So it's not surprising that the P2
 chokes. The question is where this comes from.

we have:
#define P6_NOP3 ".byte 0x0f,0x1f,0x00\n"

So the alternatives code applies the wrong nop padding for your
CPU. This was probably introduced with commit
32c464f5d9701db45bc1673288594e664065388e. 
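
To make the mechanism concrete, here is a user-space sketch of a first-match lookup over a feature-ordered NOP table of the kind alternative.c keeps. It is not the kernel code: the enum, the `nop_kind` values and the generic fallback are stand-ins chosen for illustration, and it assumes a family-6 CPU such as a PII carries only the P3 feature flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the X86_FEATURE_* bits named in the thread. */
enum feat { FEAT_K8, FEAT_K7, FEAT_P4, FEAT_P3, FEAT_MAX };

/* Which family of NOP encodings gets used; NOPS_GENERIC is the assumed default. */
enum nop_kind { NOPS_GENERIC, NOPS_K8, NOPS_K7, NOPS_P6 };

struct nop_entry {
	int feat;		/* -1 terminates the table */
	enum nop_kind kind;
};

/* Table with a P3 -> p6_nops entry. */
const struct nop_entry with_p3[] = {
	{ FEAT_K8, NOPS_K8 },
	{ FEAT_K7, NOPS_K7 },
	{ FEAT_P4, NOPS_P6 },
	{ FEAT_P3, NOPS_P6 },
	{ -1, NOPS_GENERIC }
};

/* Same table with the P3 entry dropped. */
const struct nop_entry without_p3[] = {
	{ FEAT_K8, NOPS_K8 },
	{ FEAT_K7, NOPS_K7 },
	{ FEAT_P4, NOPS_P6 },
	{ -1, NOPS_GENERIC }
};

/* Example feature set: only the P3 flag is set (assumption for a PII). */
const int pii_like[FEAT_MAX] = { [FEAT_P3] = 1 };

/* First entry whose feature the CPU has wins; otherwise fall back to generic. */
enum nop_kind pick_nops(const struct nop_entry *tbl, const int has[FEAT_MAX])
{
	assert(tbl != NULL && has != NULL);
	for (; tbl->feat >= 0; tbl++)
		if (has[tbl->feat])
			return tbl->kind;
	return NOPS_GENERIC;
}
```

With `pii_like`, the first table selects the P6 NOPs and the second falls through to the generic ones.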

Jan, are you sure that P3 knows the P6 NOPs ? AFAICT its P4, but I
have to dig up the manuals.

Jamal, does the following patch solve your problem ? Please provide
also output of /proc/cpuinfo.

Thanks,
	tglx
---

--- linux-2.6.orig/arch/x86/kernel/alternative.c
+++ linux-2.6/arch/x86/kernel/alternative.c
@@ -158,7 +158,6 @@ static const struct nop {
 	{ X86_FEATURE_K8, k8_nops },
 	{ X86_FEATURE_K7, k7_nops },
 	{ X86_FEATURE_P4, p6_nops },
-	{ X86_FEATURE_P3, p6_nops },
 	{ -1, NULL }
 };
 
--

From: jamal
Date: Saturday, May 3, 2008 - 10:02 am

Dang - I feel i should have saved myself all that git bisecting



mambo:~# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 3
model name      : Pentium II (Klamath)
stepping        : 3
cpu MHz         : 1063.771
cache size      : 128 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu de pse tsc msr pae mce cx8 sep pge cmov mmx fxsr sse sse2
bogomips        : 2160.92
clflush size    : 32
power management:

cheers,
jamal

--

From: Ingo Molnar
Date: Saturday, May 3, 2008 - 10:34 am

your bisection was still very useful - it pinpointed the NOPs - the 
assembly code around the NOP changed so a different length NOP was 
patched in => which did not work on your CPU. So thanks for that! The 

great. So this NOP is indeed not generally known to all "P6 and later" 
CPUs. (the PII)
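
A rough sketch of the length-dependent padding described here: the gap left after the replacement instructions is filled with the longest NOP that fits, so a change in the surrounding code changes which NOP gets picked. In the disassembly earlier in the thread the replacement is 12 bytes (3 + 5 + 2 + 2) followed by a single 3-byte NOP. The table below is deliberately cut down to three entries and the function name is invented; only the 3-byte encoding 0f 1f 00 is taken from the thread, while 90 and 66 90 are the usual 1- and 2-byte forms.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NOP_MAX 3	/* truncated for the sketch; an assumption, not the kernel's limit */

static const unsigned char nop1[] = { 0x90 };
static const unsigned char nop2[] = { 0x66, 0x90 };
static const unsigned char nop3[] = { 0x0f, 0x1f, 0x00 };	/* P6_NOP3 */

/* Indexed by NOP length in bytes; index 0 is unused. */
static const unsigned char *const p6_nops[NOP_MAX + 1] = {
	NULL, nop1, nop2, nop3
};

/* Fill buf[0..len) with NOPs, always taking the longest one that fits. */
void pad_with_nops(unsigned char *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	while (len > 0) {
		size_t n = len > NOP_MAX ? NOP_MAX : len;

		memcpy(buf, p6_nops[n], n);
		buf += n;
		len -= n;
	}
}
```

A 2-byte gap never touches the 0f 1f opcode, while a 3-byte gap does, which fits a failure that only shows up once the code size around the alternative shifts.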

	Ingo
--

From: Thomas Gleixner
Date: Saturday, May 3, 2008 - 10:39 am

Looks like. My analysis was wrong, as I got the P6 vs. PII/PIII
confused :) Damn unintuitive numbering, I thought ARM is worse but I'm
not so sure anymore.

But the oops clearly identified that instruction sequence. So for now
we remove the X86_FEATURE_P3 -> P6_NOPS to be on the safe side.

Thanks,
	tglx
--

From: Jan Engelhardt
Date: Sunday, May 4, 2008 - 2:31 pm

Guess that Intel named it Pentium II either because Hexium
("5"86:Pentium, "6"86:Hexium) would have been a strange name, or the
successor to the Pentium/586 was not that great an improvement.
Or something else? Always kept me wondering.
--

From: H. Peter Anvin
Date: Sunday, May 4, 2008 - 2:37 pm

Yeah, "Hexium" didn't quite work, and they thought they'd already gotten 
a working brand with "Pentium".  That it clashed with their previous 
public prerelease naming scheme of P+number ("P", I believe, for 
"project" or "processor") didn't matter.

The Pentium 4 is properly called the P7, but almost no one calls it that.

"Pentium" is also a highly unstable isotope of hydrogen (Hydrogen-5), 
with a half-life under a zeptosecond.

	-hpa
--

From: Lennart Sorensen
Date: Monday, May 5, 2008 - 6:00 am

Well the brand was well recognized, so they called the P6 "Pentium Pro".
Then when they went to make a cheaper version for regular consumers,
they made the Pentium II based on the PPro, by making the L2 cache half
speed (the PPro ran L2 cache at full speed) and put the L2 cache on a
PCB with the core in a cartridge.  After a while they added SSE and
called it the Pentium !!! (check the logo, they really used exclamation
marks) and a while later they managed to shrink their process enough
that they could integrate the L2 cache on die, so they made the cache
full speed again but on the same die as the cpu core, and soon after
started offering plain CPUs again rather than cartridges.  The Pentium 4
is of course a totally unrelated design.  The Pentium M followed up on
the P3 but aiming for lower power consumption rather than maximum
performance, and later evolved into the Core and then Core 2 (where
again they focused on maximum performance).  I think the Pentium 4 may
have tarnished the Pentium brand enough that they needed a new name, and
hence "Core" came about.

-- 
Len Sorensen
--

From: H. Peter Anvin
Date: Saturday, May 3, 2008 - 11:48 am

This is very odd.

Could you try running the attached C program on this processor and 
report the result?  (Binary included for convenience.)

Arjan: this seems to directly contradict the Intel documentation.  Do 
you have any way to find out what the deal is with this?

	-hpa
--

From: Mikael Pettersson
Date: Saturday, May 3, 2008 - 1:07 pm

H. Peter Anvin writes:
 > jamal wrote:
 > > 
 > > Indeed it does - thanks.
 > > 
 > >> Please provide
 > >> also output of /proc/cpuinfo.
 > > 
 > > mambo:~# cat /proc/cpuinfo
 > > processor       : 0
 > > vendor_id       : GenuineIntel
 > > cpu family      : 6
 > > model           : 3
 > > model name      : Pentium II (Klamath)
 > > stepping        : 3
 > > cpu MHz         : 1063.771
 > > cache size      : 128 KB
 > > fdiv_bug        : no
 > > hlt_bug         : no
 > > f00f_bug        : no
 > > coma_bug        : no
 > > fpu             : yes
 > > fpu_exception   : yes
 > > cpuid level     : 2
 > > wp              : yes
 > > flags           : fpu de pse tsc msr pae mce cx8 sep pge cmov mmx fxsr
 > > sse sse2
 > > bogomips        : 2160.92
 > > clflush size    : 32
 > > power management:
 > > 
 > 
 > This is very odd.
 > 
 > Could you try running the attached C program on this processor and 
 > report the result?  (Binary included for convenience.)
 > 
 > Arjan: this seems to directly contradict the Intel documentation.  Do 
 > you have any way to find out what the deal is with this?

hpa's p6nops test program works fine here on both a PII (family 6 model 3)
and a Pentium Pro (family 6 model 1 stepping 9).

I'm noticing another anomaly in jamal's /proc/cpuinfo above: since when
can a PII have sse and sse2? As far as I recall, it was the PIII that
added sse, and sse2 came with the P4.

/Mikael
--

From: H. Peter Anvin
Date: Saturday, May 3, 2008 - 1:03 pm

Jamal, is this the /proc/cpuinfo from Qemu or from your hardware platform?

	-hpa
--

From: H. Peter Anvin
Date: Saturday, May 3, 2008 - 10:42 am

Pentium III is the P6 core, so it will.

Intel explicitly documents "all processors with family 6 or F."

	-hpa
--

From: James Courtier-Dutton
Date: Saturday, May 3, 2008 - 10:50 am

From the Intel manual:

0F 1F /0 NOP
The multi-byte form of NOP is available on processors with model encoding:
• CPUID.01H.EAX[Bits 11:8] = 0110B or 1111B
The multi-byte NOP instruction does not alter the content of a register
and will not issue a memory operation. The instruction's operation is the
same in non-64-bit modes and 64-bit mode.
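
As a worked example of the rule quoted above, the family field can be pulled out of a CPUID leaf 1 EAX value with a shift and a mask. The helper names are invented, and the sample signature 0x633 is simply family 6, model 3, stepping 3 from the /proc/cpuinfo posted in this thread, packed in the standard layout (stepping in bits 3:0, model in bits 7:4, family in bits 11:8).

```c
#include <stdint.h>

/* Base family: CPUID.01H:EAX bits 11:8. */
unsigned int cpuid_family(uint32_t eax)
{
	return (eax >> 8) & 0xf;
}

/* 1 when the manual's rule (family 0110B or 1111B) covers this signature. */
int manual_says_long_nop(uint32_t eax)
{
	unsigned int fam = cpuid_family(eax);

	return fam == 0x6 || fam == 0xf;
}
```

For 0x633 the family is 6, so by the documentation the multi-byte NOP should be available; a family-5 signature such as 0x543 is not covered.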


--

From: H. Peter Anvin
Date: Saturday, May 3, 2008 - 10:51 am

I believe that's what I just said...

	-hpa
--

From: Thomas Gleixner
Date: Saturday, May 3, 2008 - 11:18 am

Yeah, I looked up myself and noticed that I confused those stupid
numbers again.

Nevertheless reality seems to tell a different story :)

Thanks,

	tglx
--

From: Mikael Pettersson
Date: Saturday, May 3, 2008 - 11:58 am

Thomas Gleixner writes:
 > On Sat, 3 May 2008, H. Peter Anvin wrote:
 > 
 > > Thomas Gleixner wrote:
 > > > 
 > > > Jan, are you sure that P3 knows the P6 NOPs ? AFAICT its P4, but I
 > > > have to dig up the manuals.
 > > > 
 > > 
 > > Pentium III is the P6 core, so it will.
 > > 
 > > Intel explicitly documents "all processors with family 6 or F."
 > 
 > Yeah, I looked up myself and noticed that I confused those stupid
 > numbers again.
 > 
 > Nevertheless reality seems to tell a different story :)

I've tested the "0f 1f 00" 3-byte NOP on two PIIs,
a Klamath (family 6 model 3 stepping 4) and a Deschutes
(family 6 model 5 stepping 0), and it worked fine on both.

Are you absolutely sure it's this 3-byte NOP that oopses?

 > cpu MHz         : 1063.771

which is either wrong or indicates serious overclocking.
The PIIs maxed out at about 400/450MHz.

/Mikael
--

From: H. Peter Anvin
Date: Saturday, May 3, 2008 - 12:03 pm

Good spotting, and yes, the roof for Pentium II was 450 MHz.  Pentium II 
had a TSC, so it shouldn't be wrong unless there was a bigger 
timekeeping error on this platform.

Jamal, can you clarify?

	-hpa
--

From: H. Peter Anvin
Date: Saturday, May 3, 2008 - 12:08 pm

It's worse than that... it's claiming to be a *Klamath*, which topped 
out at 300 MHz.

	-hpa
--

From: Thomas Gleixner
Date: Saturday, May 3, 2008 - 12:17 pm

This might also be caused by a calibration error. We've seen enough of
them already.

Thanks,
	tglx

--

From: H. Peter Anvin
Date: Saturday, May 3, 2008 - 12:24 pm

No doubt.  However, there is *something* ugly going on.

	-hpa
--

From: Ingo Molnar
Date: Saturday, May 3, 2008 - 12:54 pm

i think a vital clue can be found in the original report:

|  I am able to reproduce the issue consistently on my laptop using qemu 
|  (which helped speed debugging a bit). I have also narrowed it down to 
|  include/asm-x86/i387.h::__save_init_fpu in (32 bit version) - it dies 
|  somewhere in calling the following line:

so it might just be incorrect Qemu emulation of a PII's NOP instruction?

(which btw. probably proves that Linux is the first OS to make 
intelligent use of those instructions?)

	Ingo
--

From: H. Peter Anvin
Date: Saturday, May 3, 2008 - 12:56 pm

Ah, yes, this is a Qemu bug, then.

And yes, it probably does mean exactly that :)

	-hpa
--

From: Maciej W. Rozycki
Date: Saturday, May 3, 2008 - 12:49 pm

It depends (as usual with Intel) on what document you are looking at.  
Based on my short research, the instruction has been retroactively added
to the list of supported opcodes.  Even my somewhat dated P4 manual does
not list it, never mind its predecessors.  It could have been accidentally
omitted or even buggy in some early members of the P6 family and this
could have been the reason for not documenting it from the beginning (the
case of FFREEP comes to mind).

  Maciej
--

From: H. Peter Anvin
Date: Saturday, May 3, 2008 - 1:06 pm

It has retroactively been added to the documented list for all P6 core 
chips - that should mean it works on all of them.  The most common 
reason for not documenting something (other than various Pure Evil NDA 
schemes) is that it hasn't been properly verified.  However, 
verification can be done a posteriori.

	-hpa
--

From: Maciej W. Rozycki
Date: Saturday, May 3, 2008 - 2:17 pm

True, but people do make mistakes from time to time. :)  If I had a
choice between a piece of silicon and a piece of documentation to trust, I
would choose the former.

 Obviously this specific case has turned out to be an issue with Qemu, so
it is somewhat irrelevant and given how the opcodes were added to the
architecture I would consider the emulator excused.

  Maciej
--

From: jamal
Date: Saturday, May 3, 2008 - 2:46 pm

Sorry folks - had to run some errands. Wow, thanks for all the
responses.

Yes, the posting i just did (as pointed out in my first email) was on
qemu (so was the /proc/cpuinfo). I moved to qemu because it was less
painful to do the git bisecting on my laptop; i used the same .config.
I should also note that qemu seems to have worked fine in the past.
In any case I will try to rerun on the older hardware which i can access
next week again. I am beginning to doubt whether it is the same issue;
i know even there it was pointing to FPU.
So is the correct fix to go patch qemu then? I should point out i am
running a slightly older version of qemu that has a few patches (nothing
to do with x86 emulation).

hpa, results from running on qemu (not the hardware) are:
-----
mambo:~# ./hpa
Test 0: ok
Test 1: ok
Test 2: ok
Test 3: err
Test 4: err
Test 5: err
Test 6: err
Test 7: err
Test 8: err
----------------
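
The test program itself is not part of this archive. For illustration only, a probe along the following lines can tell whether a multi-byte NOP executes or raises SIGILL on the machine it runs on. This is not hpa's program: the function name and return convention are invented here, and it covers only the 3-byte form 0f 1f 00.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for sigaction and sigsetjmp */
#endif
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf recover;

/* SIGILL handler: jump back to the probe instead of dying. */
static void on_sigill(int sig)
{
	(void)sig;
	siglongjmp(recover, 1);
}

/* Returns 1 if "0f 1f 00" executed, 0 on SIGILL, -1 if the probe cannot run. */
int probe_nop3(void)
{
#if defined(__i386__) || defined(__x86_64__)
	struct sigaction sa, old;
	int ok;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigill;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGILL, &sa, &old) != 0)
		return -1;

	if (sigsetjmp(recover, 1) == 0) {
		/* nopl (%eax) in 32-bit mode, nopl (%rax) in 64-bit mode */
		__asm__ volatile(".byte 0x0f,0x1f,0x00");
		ok = 1;
	} else {
		ok = 0;	/* came back via the SIGILL handler */
	}

	sigaction(SIGILL, &old, NULL);	/* restore the previous handler */
	return ok;
#else
	return -1;	/* not an x86 build */
#endif
}
```

On hardware that implements the instruction this should return 1. Whether "Test 3" in the output above corresponds to this 3-byte form is a guess; if it does, this probe would presumably return 0 under the same qemu setup.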

cheers,
jamal

--

From: Arjan van de Ven
Date: Sunday, May 4, 2008 - 1:24 pm

the other reason is that certain groups of "unknown" opcodes will act as NOP.

--

From: H. Peter Anvin
Date: Sunday, May 4, 2008 - 2:07 pm

Specifically, I believe the P6 added a whole class of instructions which 
would act as NOP's if used on older chips.  It has been used for 
prefetch instructions, etc.  At some point, the 0F 1F /0 group was 
declared to be NOP, and nothing else, for all future.

	-hpa

--
