On Tue, 2008-09-30 at 12:37 -0500, Matt Mackall wrote:
Ok, on closer inspection, this is part of the x86_64 calling convention.
When calling a varargs function, the caller passes an upper bound on the
number of SSE regs used for floating point args in al (the low byte of
rax). The callee then has to save these
away for va_list use. The GCC prologue apparently sets aside space for
xmm0-xmm7 (16 bytes each) all the time (plus rdi, rsi, rdx, rcx, r8, and
r9).
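
For illustration, here is a minimal sketch of the kind of function that
triggers this prologue. The name sum_ints is hypothetical and not taken
from the kernel. Every argument is an integer, so a caller following the
x86_64 System V convention would set al (the low byte of rax) to 0, yet
the callee still has to be prepared to spill argument registers so that
va_arg can find them.

```c
#include <stdarg.h>

/*
 * Hypothetical example, not kernel code: sums `count` int arguments.
 * Only integer arguments are passed, so no SSE register carries data,
 * but the function is still variadic and needs a register save area.
 */
long sum_ints(int count, ...)
{
    va_list ap;
    long total = 0;

    va_start(ap, count);            /* start reading after `count` */
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);   /* fetch the next int argument */
    va_end(ap);

    return total;
}
```

Calling sum_ints(3, 1, 2, 3) returns 6. Compiling with gcc -O2 -S on
x86_64 and reading the prologue is one way to see how much save-area
setup a given compiler version emits.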
Obviously, we're never passing floating point args in the kernel, so
we're taking a 40+ byte hit in code size and a 128-byte hit in stack
size for every varargs call.
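
The 128-byte figure follows from the register counts above. A small
sketch of the arithmetic, with constant and function names that are
made up for this example:

```c
#include <stddef.h>

/* Hypothetical names; the counts and sizes come from the text above. */
enum {
    GP_ARG_REGS   = 6,   /* rdi, rsi, rdx, rcx, r8, r9 */
    GP_REG_BYTES  = 8,   /* each is a 64-bit register */
    SSE_ARG_REGS  = 8,   /* xmm0 through xmm7 */
    SSE_REG_BYTES = 16   /* each is a 128-bit register */
};

/* Stack bytes needed to save the integer argument registers. */
size_t gp_save_bytes(void)
{
    return (size_t)GP_ARG_REGS * GP_REG_BYTES;
}

/* Stack bytes needed to save the SSE argument registers. */
size_t sse_save_bytes(void)
{
    return (size_t)SSE_ARG_REGS * SSE_REG_BYTES;
}

/* Size of the whole save area when both sets are reserved. */
size_t total_save_bytes(void)
{
    return gp_save_bytes() + sse_save_bytes();
}
```

Here sse_save_bytes() gives 128, the part that is wasted when no
floating point args are ever passed, and gp_save_bytes() gives 48, for a
total of 176 bytes.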
Looks like the gcc people have a patch in progress:
http://gcc.gnu.org/ml/gcc-patches/2008-08/msg02165.html
So I think we should assume that x86_64 will sort this out eventually.
--
Mathematics is the supreme nostalgia of our time.
