Only allocate the FPU area when the application actually uses FPU, i.e., in the
first lazy FPU trap. This could save memory for non-fpu using apps.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
---
v3: Fixed the non-atomic calling sequence in atomic context.
v2: Ported to x86.git#testing with some name changes.
---
Index: linux-2.6-x86/arch/x86/kernel/i387.c
===================================================================
--- linux-2.6-x86.orig/arch/x86/kernel/i387.c 2008-03-03 14:15:17.000000000 -0800
+++ linux-2.6-x86/arch/x86/kernel/i387.c 2008-03-03 14:15:17.000000000 -0800
@@ -9,7 +9,6 @@
#include <linux/sched.h>
#include <linux/module.h>
#include <linux/regset.h>
-#include <linux/bootmem.h>
#include <asm/processor.h>
#include <asm/i387.h>
#include <asm/math_emu.h>
@@ -67,7 +66,6 @@
else
xstate_size = sizeof(struct i387_fsave_struct);
#endif
- init_task.thread.xstate = alloc_bootmem(xstate_size);
}
#ifdef CONFIG_X86_64
@@ -105,6 +103,12 @@
return;
}
+ /*
+ * Memory allocation at the first usage of the FPU and other state.
+ */
+ if (!tsk->thread.xstate)
+ tsk->thread.xstate = kmem_cache_alloc(task_xstate_cachep, GFP_KERNEL);
+
if (cpu_has_fxsr) {
struct i387_fxsave_struct *fx = &tsk->thread.xstate->fxsave;
Index: linux-2.6-x86/arch/x86/kernel/process.c
===================================================================
--- linux-2.6-x86.orig/arch/x86/kernel/process.c 2008-03-03 14:15:17.000000000 -0800
+++ linux-2.6-x86/arch/x86/kernel/process.c 2008-03-03 14:15:17.000000000 -0800
@@ -5,24 +5,33 @@
#include <linux/slab.h>
#include <linux/sched.h>
-static struct kmem_cache *task_xstate_cachep;
+struct kmem_cache *task_xstate_cachep;
int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
{
*dst = *src;
- dst->thread.xstate = kmem_cache_alloc(task_xstate_cachep, GFP_KERNEL);
- if (!dst->thread.xstate)
- return ...Please don't do over 80 char lines. Also don't we need some kind of error handling here? --
Currently it uses SLAB_PANIC. thanks, suresh --
but SLAB_PANIC only covers kmem_cache_create() failures. kmem_cache_alloc() can fail (return NULL) and not handling it is a bug. Ingo --
oops. you are correct. Will send a sigsegv in the failure case then. Thanks. --
You are introducing possibility of hard to debug error, where previous code just worked... Does not look like good idea to me. -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html --
hm, how does it differ from any other allocation failure? We could fail to allocate a pagetable page. We could fail to allocate the task_struct to begin with. Ingo --
Yes. This happens under out of memory conditions. And we are also using GFP_KERNEL here, which can block. --
Well, we should not be sending SIGSEGV...? SIGBUS would be cleaner, or SIGKILL... what happens when userland tries to catch this one? -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html --
When this happens the kernel is already in a severe out of memory situation and no matter what you do user land will not be able to handle this well. -Andi --
I'm confused... Normally when we need memory for userspace and can't get it, we put the process to sleep until memory is available. Why is this different in any way? -hpa --
this is just for handling the case where that fails (basically near/totally OOM or the case where you get a fatal signal) maybe we need a GFP_KILLABLE now that we have a TASK_KILLABLE... --
I didn't think GFP_KERNEL was interruptible by signals... (although sometimes under oom thrashing I think it would be great if it was...) -Andi --
we need to make it (or with GFP_KILLABLE); would make total sense... (so yeah it was more wishful thinking than reality) --
I think it wouldn't be that difficult for the normal anonymous user allocations (standard page fault path), but doing it for everything would be pretty hard because you would need to add signal-bail-out paths everywhere. But doing it for some simple cases like page fault only would be a nice project for someone, shouldn't be too difficult. -Andi --
How many such apps are on your system, and how much does this 'optimalization' cost? ISTR glibc always using FPU...? -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html --
On a normal kernel boot, where there are 200 or so tasks running, only 20 Apparently not, alteast not on my system. And also going forward, as we extend this state, thought is nice to have. thanks, suresh --
Aha, now I see it is useful ;-). Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html --
