>>> And you know what? This is likely not the end yet! It's possible
Sparc64 has register windows: it passes arguments in registers, but it
must still allocate stack space for those registers. If the call stack
gets too deep (8 levels), the CPU runs out of registers and starts
spilling the registers of the function 8 levels deep to the stack.
The stack usage could be reduced to 176 bytes with a little work from gcc
developers and to 128 bytes with more work (an ABI change). If you wanted
to go below 128 bytes, you could use one register to indicate the number
of used registers and modify the spill/fill handlers to save and restore
only that many registers, reducing the stack usage even more. That would
be a big code change in both gcc and Linux.
Mikulas
--