Re: [Lguest] 2.6.33 guest crash (was: Re: 2.6.32-rc5 guest crash)

Previous thread: [PATCH] 8250_pci: Remove duplicate struct pciserial_board by Shawn Bohrer on Sunday, October 25, 2009 - 8:55 pm. (1 message)

Next thread: 2.6.32-rc5-git3 -- System lockup running "cat /sys/kernel/debug/dri/0/i915_regs" by Miles Lane on Sunday, October 25, 2009 - 8:57 pm. (2 messages)

From: Rusty Russell
Date: Sunday, October 25, 2009 - 8:56 pm

OK, it's the non-paravirt "cli" we're breaking on (the pushfl is bad too).

This is because alternatives haven't been subbed yet.  The Right Thing is to
make this asm code paravirt aware.  The Easy Thing is to fix this code to
only get included when we're actually compiling for a 386 or 486 (the
cmpxchg8b instruction was introduced with the original Pentium AFAICT).

How's this Ingo, Arjan?

Subject: x86: sidestep lguest problem by only building cmpxchg8b_emu for pre-Pentium

Arjan's 79e1dd05d1a22 "x86: Provide an alternative() based cmpxchg64()" broke
lguest, even on systems which have cmpxchg8b support.  The emulation code
gets used until alternatives get run, but it contains native instructions,
not their paravirt alternatives.

The simplest fix is to turn this code off except for 386 and 486 builds.

Reported-by: Johannes Stezenbach <js@sig21.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu
--- a/arch/x86/Kconfig.cpu
+++ b/arch/x86/Kconfig.cpu
@@ -400,7 +400,7 @@ config X86_TSC
 
 config X86_CMPXCHG64
 	def_bool y
-	depends on X86_PAE || X86_64 || MCORE2 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MATOM
+	depends on !M386 && !M486
 
 # this should be set for all -march=.. options where the compiler
 # generates cmov.
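
For readers following along: the emulation routine discussed here stands in
for a single cmpxchg8b by doing a compare-and-store bracketed by pushfl/cli
and popfl, and those bracketing instructions are what the lguest guest trips
over. The following is only a user-space model of that behaviour, not the
kernel's assembly. The names irqs_enabled, irq_save_and_disable, irq_restore
and cmpxchg8b_emu_model are hypothetical and exist purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the CPU interrupt flag (not a kernel API). */
static bool irqs_enabled = true;

/* Models "pushfl; cli": remember the old state, then disable interrupts. */
static bool irq_save_and_disable(void)
{
	bool was_enabled = irqs_enabled;

	irqs_enabled = false;
	return was_enabled;
}

/* Models "popfl": put the interrupt flag back the way it was. */
static void irq_restore(bool was_enabled)
{
	irqs_enabled = was_enabled;
}

/*
 * Models the semantics of cmpxchg8b: if *ptr equals expected, store desired.
 * Returns the value found in *ptr before the operation, so the caller can
 * tell success (return == expected) from failure.
 */
static uint64_t cmpxchg8b_emu_model(uint64_t *ptr, uint64_t expected,
				    uint64_t desired)
{
	bool flags;
	uint64_t found;

	assert(ptr != NULL);

	flags = irq_save_and_disable();	/* native instructions in the real code */
	found = *ptr;
	if (found == expected)
		*ptr = desired;
	irq_restore(flags);

	return found;
}
```

In this model the interrupt helpers only flip a variable. In a guest that
does not run at the most privileged level, the real cli faults instead, which
is the crash reported in this thread.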



--

From: tip-bot for Rusty Russell
Date: Monday, October 26, 2009 - 4:38 am

Commit-ID:  ae1b22f6e46c03cede7cea234d0bf2253b4261cf
Gitweb:     http://git.kernel.org/tip/ae1b22f6e46c03cede7cea234d0bf2253b4261cf
Author:     Rusty Russell <rusty@rustcorp.com.au>
AuthorDate: Mon, 26 Oct 2009 14:26:04 +1030
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Mon, 26 Oct 2009 12:33:02 +0100

x86: Side-step lguest problem by only building cmpxchg8b_emu for pre-Pentium

Commit 79e1dd05d1a22 "x86: Provide an alternative() based
cmpxchg64()" broke lguest, even on systems which have cmpxchg8b
support.  The emulation code gets used until alternatives get
run, but it contains native instructions, not their paravirt
alternatives.

The simplest fix is to turn this code off except for 386 and 486
builds.

Reported-by: Johannes Stezenbach <js@sig21.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: lguest@ozlabs.org
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <200910261426.05769.rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 arch/x86/Kconfig.cpu |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu
index f2824fb..2649840 100644
--- a/arch/x86/Kconfig.cpu
+++ b/arch/x86/Kconfig.cpu
@@ -400,7 +400,7 @@ config X86_TSC
 
 config X86_CMPXCHG64
 	def_bool y
-	depends on X86_PAE || X86_64 || MCORE2 || MPENTIUM4 || MPENTIUMM || MPENTIUMIII || MPENTIUMII || M686 || MATOM
+	depends on !M386 && !M486
 
 # this should be set for all -march=.. options where the compiler
 # generates cmov.
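
To make the effect of the one-line Kconfig change concrete, the two
"depends on" expressions can be written as plain predicates and compared.
This is just a sketch: struct cpu_config and the two function names are
hypothetical, and the fields only mirror the symbols named in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the Kconfig symbols that appear in the diff. */
struct cpu_config {
	bool X86_PAE, X86_64, MCORE2, MPENTIUM4, MPENTIUMM;
	bool MPENTIUMIII, MPENTIUMII, M686, MATOM;
	bool M386, M486;
};

/* Old rule: X86_CMPXCHG64 only for an explicit list of configurations. */
static bool cmpxchg64_before(const struct cpu_config *c)
{
	return c->X86_PAE || c->X86_64 || c->MCORE2 || c->MPENTIUM4 ||
	       c->MPENTIUMM || c->MPENTIUMIII || c->MPENTIUMII ||
	       c->M686 || c->MATOM;
}

/* New rule: X86_CMPXCHG64 for everything except 386 and 486 builds. */
static bool cmpxchg64_after(const struct cpu_config *c)
{
	return !c->M386 && !c->M486;
}
```

A configuration that sets none of the listed symbols evaluates to false
under the old rule and true under the new one, so it stops building the
emulation path. A 386 or 486 build evaluates to false under both.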
--

From: Johannes Stezenbach
Date: Monday, October 26, 2009 - 12:11 pm

FWIW, I've tested it both with the original host kernel (using the recompiled
kernel only as guest) and after rebooting into the patched kernel as host.
The guest works in both cases, and the host kernel still boots with the
patch applied.


Thanks,
Johannes
--

From: Johannes Stezenbach
Date: Sunday, March 14, 2010 - 10:34 am

I recently installed 2.6.33 and now the error which was
fixed in 2.6.32 is back since the fix got reverted in
db677ffa5f5a4f15b9dad4d132b3477b80766d82

What now?

Am I correct to assume that I can avoid the issue

Thanks
Johannes
--

From: Johannes Stezenbach
Date: Sunday, March 14, 2010 - 2:23 pm

Silly question  ;-/

So what would be the real fix?


Johannes
--

From: Rusty Russell
Date: Monday, March 29, 2010 - 9:27 pm

That patch broke Real Machines.  The real answer is actually to do some
more emulation in the host; I like lguest but I can't really justify many
lguest-specific hacks outside the lguest dirs.

There are a few patches needed to make Linus' latest work, I'll post them
soon.  But for this specific issue, how's this?

Subject: lguest: workaround cmpxchg8b_emu by ignoring cli in the guest.

It's only used by cmpxchg8b_emu (see db677ffa5f5a for the gory
details), and fixing that to be paravirt aware would be more work than
simply ignoring it (and AFAICT only help lguest).

(We can't emulate it properly: the popf which expects to restore interrupts
does not trap).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: virtualization@lists.osdl.org

diff --git a/drivers/lguest/x86/core.c b/drivers/lguest/x86/core.c
--- a/drivers/lguest/x86/core.c
+++ b/drivers/lguest/x86/core.c
@@ -288,6 +288,18 @@ static int emulate_insn(struct lg_cpu *c
 	insn = lgread(cpu, physaddr, u8);
 
 	/*
+	 * Around 2.6.33, the kernel started using an emulation for the
+	 * cmpxchg8b instruction in early boot on many configurations.  This
+	 * code isn't paravirtualized, and it tries to disable interrupts.
+	 * Ignore it, which will Mostly Work.
+	 */
+	if (insn == 0xfa) {
+		/* "cli", or Clear Interrupt Enable instruction.  Skip it. */ 
+		cpu->regs->eip++;
+		return 1;
+	}
+
+	/*
 	 * 0x66 is an "operand prefix".  It means it's using the upper 16 bits
 	 * of the eax register.
 	 */
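
The trade-off described in the changelog (skip the cli rather than emulate
it, because the matching popf never traps) can be illustrated with a toy
model. This is a sketch under stated assumptions, not lguest code: struct
guest_model, enum cli_policy, emulate_trapped_insn and run_guest are all
hypothetical, and the model assumes that only cli reaches the host while
pushf and popf run in the guest unseen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Single-byte x86 opcodes used by the pushfl/cli ... popfl pattern. */
enum { OP_PUSHF = 0x9c, OP_POPF = 0x9d, OP_CLI = 0xfa };

/* Hypothetical guest state; the real lguest structures differ. */
struct guest_model {
	const uint8_t *code;	/* guest instruction bytes */
	size_t eip;		/* index of the next instruction in code */
	bool virt_irqs_on;	/* host's view of the guest interrupt state */
};

enum cli_policy { CLI_SKIP, CLI_DISABLE_VIRT_IRQS };

/*
 * Host-side handler. Returns true if the instruction at eip was a cli and
 * has been dealt with, false for anything else (modelling "did not trap").
 */
static bool emulate_trapped_insn(struct guest_model *g, enum cli_policy policy)
{
	if (g->code[g->eip] != OP_CLI)
		return false;

	if (policy == CLI_DISABLE_VIRT_IRQS)
		g->virt_irqs_on = false;

	g->eip++;	/* cli is one byte long, so step over it */
	return true;
}

/* Runs len bytes of single-byte instructions; only cli involves the host. */
static void run_guest(struct guest_model *g, size_t len, enum cli_policy policy)
{
	while (g->eip < len) {
		if (!emulate_trapped_insn(g, policy))
			g->eip++;	/* pushf/popf execute without trapping */
	}
}
```

Running the byte sequence pushf, cli, popf through this model with
CLI_DISABLE_VIRT_IRQS leaves virt_irqs_on false at the end, since the host
never sees the popf that was meant to re-enable interrupts. With CLI_SKIP the
flag is left alone, which is the "Mostly Work" behaviour the patch opts for.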
--

From: Jeremy Fitzhardinge
Date: Monday, March 29, 2010 - 9:51 pm

Why isn't the cli getting paravirtualized?


--

From: Johannes Stezenbach
Date: Tuesday, April 13, 2010 - 8:29 am

Hi Rusty,


I just tested this patch with 2.6.34-rc4 (as both host and guest),
it seems to work fine.


Thanks,
Johannes
--
