Re: [PATCH 1/2] x86: eliminate TS_XSAVE

Previous thread: [PATCH 2/2] x86: Introduce 'struct fpu' and related API by Avi Kivity on Sunday, May 2, 2010 - 7:53 am. (2 messages)

Next thread: Re: s2ram slow (radeon) / failing (usb) by Alan Stern on Sunday, May 2, 2010 - 8:07 am. (3 messages)
From: Avi Kivity
Date: Sunday, May 2, 2010 - 7:53 am

Currently all fpu accessors are wedded to task_struct.  However kvm also uses
the fpu in a different context.  Introduce an FPU API, and replace the
current uses with the new API.

While this patchset is oriented towards deeper changes, as a first step it
simlifies xsave for kvm.

Avi Kivity (2):
  x86: eliminate TS_XSAVE
  x86: Introduce 'struct fpu' and related API

 arch/x86/include/asm/i387.h        |  135 +++++++++++++++++++++++++++---------
 arch/x86/include/asm/processor.h   |    6 ++-
 arch/x86/include/asm/thread_info.h |    1 -
 arch/x86/include/asm/xsave.h       |    7 +-
 arch/x86/kernel/cpu/common.c       |    5 +-
 arch/x86/kernel/i387.c             |  107 ++++++++++++++---------------
 arch/x86/kernel/process.c          |   20 +++---
 arch/x86/kernel/process_32.c       |    2 +-
 arch/x86/kernel/process_64.c       |    2 +-
 arch/x86/kernel/xsave.c            |    8 +-
 arch/x86/math-emu/fpu_aux.c        |    6 +-
 11 files changed, 181 insertions(+), 118 deletions(-)

--

From: Avi Kivity
Date: Sunday, May 2, 2010 - 7:53 am

The fpu code currently uses current->thread_info->status & TS_XSAVE as
a way to distinguish between XSAVE capable processors and older processors.
The decision is not really task specific; instead we use the task status to
avoid a global memory reference - the value should be the same across all
threads.

Eliminate this tie-in into the task structure by using an alternative
instruction keyed off the XSAVE cpu feature; this results in shorter and
faster code, without introducing a global memory reference.

Signed-off-by: Avi Kivity <avi@redhat.com>
---
 arch/x86/include/asm/i387.h        |   20 ++++++++++++++++----
 arch/x86/include/asm/thread_info.h |    1 -
 arch/x86/kernel/cpu/common.c       |    5 +----
 arch/x86/kernel/i387.c             |    5 +----
 arch/x86/kernel/xsave.c            |    6 +++---
 5 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/i387.h b/arch/x86/include/asm/i387.h
index da29309..52f95c0 100644
--- a/arch/x86/include/asm/i387.h
+++ b/arch/x86/include/asm/i387.h
@@ -56,6 +56,18 @@ extern int restore_i387_xstate_ia32(void __user *buf);
 
 #define X87_FSW_ES (1 << 7)	/* Exception Summary */
 
+static inline bool use_xsave(void)
+{
+	bool has_xsave;
+
+	alternative_io("xor %0, %0",
+		       "mov $1, %0",
+		       X86_FEATURE_XSAVE,
+		       "=g"(has_xsave));
+
+	return has_xsave;
+}
+
 #ifdef CONFIG_X86_64
 
 /* Ignore delayed exceptions from user space */
@@ -99,7 +111,7 @@ static inline void clear_fpu_state(struct task_struct *tsk)
 	/*
 	 * xsave header may indicate the init state of the FP.
 	 */
-	if ((task_thread_info(tsk)->status & TS_XSAVE) &&
+	if (use_xsave() &&
 	    !(xstate->xsave_hdr.xstate_bv & XSTATE_FP))
 		return;
 
@@ -164,7 +176,7 @@ static inline void fxsave(struct task_struct *tsk)
 
 static inline void __save_init_fpu(struct task_struct *tsk)
 {
-	if (task_thread_info(tsk)->status & TS_XSAVE)
+	if (use_xsave())
 		xsave(tsk);
 	else
 		fxsave(tsk);
@@ -218,7 +230,7 @@ ...
From: Brian Gerst
Date: Sunday, May 2, 2010 - 10:38 am

I think you should either just use cpu_has_xsave, or extend this use
of alternatives to all cpu features.  It doesn't make sense to only do
it for xsave.

--
Brian Gerst
--

From: H. Peter Anvin
Date: Monday, May 3, 2010 - 2:45 pm

I asked Suresh to comment on this, since he wrote the original code.  He
did confirm that the intent was to avoid a global memory reference.

	-hpa

--

From: Avi Kivity
Date: Tuesday, May 4, 2010 - 12:41 am

Ok, so you're happy with the patch as is?

-- 
error compiling committee.c: too many arguments to function

--

From: Suresh Siddha
Date: Tuesday, May 4, 2010 - 11:15 am

As use_xsave() is in the hot context switch path, I would like to go
with Avi's proposal.

thanks,
suresh

--

From: Suresh Siddha
Date: Tuesday, May 4, 2010 - 11:03 am

Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>

--

Previous thread: [PATCH 2/2] x86: Introduce 'struct fpu' and related API by Avi Kivity on Sunday, May 2, 2010 - 7:53 am. (2 messages)

Next thread: Re: s2ram slow (radeon) / failing (usb) by Alan Stern on Sunday, May 2, 2010 - 8:07 am. (3 messages)