I inlined three of them, and I think I can inline another two. So hopefully
I'll be able to shrink the 8-call depth to a 3-call depth.
Tail call optimization is not done at all if you compile the kernel with
stack checking. This contributes to the stack overflow too.
I fixed that too.
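
For anyone not familiar with the term, this is the shape of a tail call. The
function names are made up for illustration (they are not kernel functions),
and whether the compiler really turns the call into a jump depends on the
compiler and the options used:

```c
/* Hypothetical helper, for illustration only. */
static int do_work(int x)
{
    return x + 1;
}

/*
 * The call to do_work() is the last thing wrapper() does, and its result
 * is returned unchanged. A compiler doing tail call optimization may
 * reuse wrapper()'s frame and jump to do_work() instead of calling it.
 * Without the optimization, wrapper()'s frame stays on the stack for the
 * whole duration of do_work().
 */
static int wrapper(int x)
{
    return do_work(x * 2);
}
```
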
BTW, what's the purpose of having a 192-byte stack frame? There are 16
8-byte registers being saved per function call, so a 128-byte frame should
be sufficient, shouldn't it? The ABI specifies that some additional entries
must be present even if unused, but I don't see the reason for them. Would
something bad happen if GCC started to generate 128-byte stack frames?
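
To put numbers on it, here is a rough sketch of the arithmetic. It uses only
the figures from this mail; the constant and helper names are made up, and it
assumes every frame in the chain is the same size, which real code will not
guarantee:

```c
#include <stddef.h>

/* Figures taken from this mail. */
enum {
    SAVED_REGS  = 16,   /* registers saved per function call */
    REG_BYTES   = 8,    /* each register is 8 bytes */
    FRAME_BYTES = 192   /* frame size generated today */
};

/* Bytes needed just for the saved registers. */
static inline size_t reg_save_bytes(void)
{
    return (size_t)SAVED_REGS * REG_BYTES;
}

/* Bytes per frame beyond the register save area. */
static inline size_t extra_bytes(void)
{
    return (size_t)FRAME_BYTES - reg_save_bytes();
}

/* Stack used by a chain of `depth` nested calls, `frame` bytes each. */
static inline size_t chain_bytes(size_t depth, size_t frame)
{
    return depth * frame;
}
```

So each frame carries 64 bytes beyond the 128 needed for the registers. With
192-byte frames, the 8-call chain takes 1536 bytes and the 3-call chain takes
576 bytes; with 128-byte frames the 3-call chain would take 384 bytes.
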
Mikulas
--