Re: [patch 2/2] x86, fpu: lazy allocation of FPU area - v3

Previous thread: [patch 1/2] x86, fpu: split FPU state from task struct - v3 by Suresh Siddha on Monday, March 3, 2008 - 4:02 pm. (8 messages)

Next thread: [PATCH] sound: replace remaining __FUNCTION__ occurences by Harvey Harrison on Monday, March 3, 2008 - 4:32 pm. (1 message)
From: Suresh Siddha
Date: Monday, March 3, 2008 - 4:02 pm

Only allocate the FPU area when the application actually uses FPU, i.e., in the
first lazy FPU trap. This could save memory for non-fpu using apps.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
---
v3: Fixed the non-atomic calling sequence in atomic context.
v2: Ported to x86.git#testing with some name changes.
---

Index: linux-2.6-x86/arch/x86/kernel/i387.c
===================================================================
--- linux-2.6-x86.orig/arch/x86/kernel/i387.c	2008-03-03 14:15:17.000000000 -0800
+++ linux-2.6-x86/arch/x86/kernel/i387.c	2008-03-03 14:15:17.000000000 -0800
@@ -9,7 +9,6 @@
 #include <linux/sched.h>
 #include <linux/module.h>
 #include <linux/regset.h>
-#include <linux/bootmem.h>
 #include <asm/processor.h>
 #include <asm/i387.h>
 #include <asm/math_emu.h>
@@ -67,7 +66,6 @@
 	else
 		xstate_size = sizeof(struct i387_fsave_struct);
 #endif
-	init_task.thread.xstate = alloc_bootmem(xstate_size);
 }
 
 #ifdef CONFIG_X86_64
@@ -105,6 +103,12 @@
 		return;
 	}
 
+	/*
+	 * Memory allocation at the first usage of the FPU and other state.
+	 */
+	if (!tsk->thread.xstate)
+		tsk->thread.xstate = kmem_cache_alloc(task_xstate_cachep, GFP_KERNEL);
+
 	if (cpu_has_fxsr) {
 		struct i387_fxsave_struct *fx = &tsk->thread.xstate->fxsave;
 
Index: linux-2.6-x86/arch/x86/kernel/process.c
===================================================================
--- linux-2.6-x86.orig/arch/x86/kernel/process.c	2008-03-03 14:15:17.000000000 -0800
+++ linux-2.6-x86/arch/x86/kernel/process.c	2008-03-03 14:15:17.000000000 -0800
@@ -5,24 +5,33 @@
 #include <linux/slab.h>
 #include <linux/sched.h>
 
-static struct kmem_cache *task_xstate_cachep;
+struct kmem_cache *task_xstate_cachep;
 
 int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 {
 	*dst = *src;
-	dst->thread.xstate = kmem_cache_alloc(task_xstate_cachep, GFP_KERNEL);
-	if (!dst->thread.xstate)
-		return ...
From: Christoph Hellwig
Date: Monday, March 3, 2008 - 6:20 pm

Please don't do over 80 char lines.  Also don't we need some kind of
error handling here?

--

From: Suresh Siddha
Date: Monday, March 3, 2008 - 6:43 pm

Currently it uses SLAB_PANIC.

thanks,
suresh
--

From: Ingo Molnar
Date: Tuesday, March 4, 2008 - 3:32 am

but SLAB_PANIC only covers kmem_cache_create() failures. 

kmem_cache_alloc() can fail (return NULL) and not handling it is a bug.

	Ingo
--

From: Suresh Siddha
Date: Tuesday, March 4, 2008 - 10:55 am

oops. you are correct. Will send a sigsegv in the failure case then. Thanks.
--

From: Pavel Machek
Date: Wednesday, March 5, 2008 - 12:47 pm

You are introducing possibility of hard to debug error, where previous
code just worked... Does not look like good idea to me.
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Ingo Molnar
Date: Thursday, March 6, 2008 - 8:51 am

hm, how does it differ from any other allocation failure? We could fail 
to allocate a pagetable page. We could fail to allocate the task_struct 
to begin with.

	Ingo
--

From: Suresh Siddha
Date: Thursday, March 6, 2008 - 12:10 pm

Yes. This happens under out of memory conditions. And we are also using
GFP_KERNEL here, which can block.
--

From: Pavel Machek
Date: Thursday, March 6, 2008 - 1:24 pm

Well, we should not be sending SIGSEGV...? SIGBUS would be cleaner, or
SIGKILL... what happens when userland tries to catch this one?

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Andi Kleen
Date: Thursday, March 6, 2008 - 1:52 pm

When this happens the kernel is already in a severe out of memory
situation and no matter what you do user land will not be able
to handle this well.

-Andi
--

From: H. Peter Anvin
Date: Friday, March 7, 2008 - 5:29 am

I'm confused...

Normally when we need memory for userspace and can't get it, we put the 
process to sleep until memory is available.

Why is this different in any way?

	-hpa
--

From: Arjan van de Ven
Date: Friday, March 7, 2008 - 6:06 am

this is just for handling the case where that fails
(basically near/totally OOM or the case where you get a fatal signal)

maybe we need a GFP_KILLABLE now that we have a TASK_KILLABLE...

--

From: Andi Kleen
Date: Friday, March 7, 2008 - 6:18 am

I didn't think GFP_KERNEL was interruptible by signals...
(although sometimes under oom thrashing I think it would be great if it was...) 

-Andi
--

From: Arjan van de Ven
Date: Friday, March 7, 2008 - 6:20 am

we need to make it (or with GFP_KILLABLE); would make total sense...
(so yeah it was more wishful thinking than reality)
--

From: Andi Kleen
Date: Friday, March 7, 2008 - 6:27 am

I think it wouldn't be that difficult for the normal anonymous user allocations
(standard page fault path), but doing it for everything would be pretty
hard because you would need to add signal-bail-out paths everywhere. 

But doing it for some simple cases like page fault only would be a nice 
project for someone, shouldn't be too difficult.

-Andi
--

From: Pavel Machek
Date: Wednesday, March 5, 2008 - 12:48 pm

How many such apps are on your system, and how much does this
'optimalization' cost?

ISTR glibc always using FPU...?
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Suresh Siddha
Date: Thursday, March 6, 2008 - 12:26 pm

On a normal kernel boot, where there are 200 or so tasks running, only 20

Apparently not, alteast not on my system. And also going forward, as we extend
this state, thought is nice to have.

thanks,
suresh
--

From: Pavel Machek
Date: Thursday, March 6, 2008 - 2:21 pm

Aha, now I see it is useful ;-).
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

Previous thread: [patch 1/2] x86, fpu: split FPU state from task struct - v3 by Suresh Siddha on Monday, March 3, 2008 - 4:02 pm. (8 messages)

Next thread: [PATCH] sound: replace remaining __FUNCTION__ occurences by Harvey Harrison on Monday, March 3, 2008 - 4:32 pm. (1 message)