"Some system calls are magic, and don't just take the arguments in registers: they also care about the actual stack pointer and the whole pt_regs struct when returning to user mode."
Ingo Molnar posted a set of 11 patches introducing "the first release of the 'Syslet' kernel feature and kernel subsystem, which provides generic asynchronous system call support". Ingo explains:
"Syslets are small, simple, lightweight programs (consisting of system-calls, 'atoms') that the kernel can execute autonomously (and, not the least, asynchronously), without having to exit back into user-space. Syslets can be freely constructed and submitted by any unprivileged user-space context - and they have access to all the resources (and only those resources) that the original context has access to."
Ingo goes on to explain in greater detail how syslets work, then adds, "as it might be obvious to some of you, the syslet subsystem takes many ideas and experience from my Tux in-kernel webserver :) The syslet code originates from a heavy rewrite of the Tux-atom and the Tux-cachemiss infrastructure." He also offered benchmark results showing a 33.9% speedup for syslets over synchronous I/O in the uncached case and a 19.2% speedup in the cached case: "so syslets, in this particular workload, are a nice speedup /both/ in the uncached and in the cached case. (note that i used only a single disk, so the level of parallelism in the hardware is quite limited.)"
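To make the idea of a syslet as a chain of system-call "atoms" more concrete, here is a rough user-space model in Python. It is only an illustration and not the interface from Ingo's patches: the names `Atom`, `run_chain`, `demo`, and the `stop_on_negative` flag are hypothetical, and the chain runs synchronously in the calling process, whereas real syslets are executed by the kernel and may complete asynchronously. What it does show is the core pattern: several calls are submitted as one unit, run back to back without the submitter stepping in between them, and the chain stops early if an atom fails.

```python
import os
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Atom:
    """One step of a chain (hypothetical model, not the kernel structure)."""
    call: Callable[[Dict], int]    # stands in for a single system call
    stop_on_negative: bool = True  # assumed flag: abort the chain on error


def run_chain(atoms: List[Atom], ctx: Dict) -> Tuple[List[int], bool]:
    """Run atoms in order; return (return values, whether all atoms ran)."""
    results = []
    for atom in atoms:
        try:
            ret = atom.call(ctx)
        except OSError as exc:
            # Mimic the kernel convention of returning a negative errno.
            ret = -(exc.errno or 1)
        results.append(ret)
        if atom.stop_on_negative and ret < 0:
            return results, False
    return results, True


def demo(path: str) -> Tuple[List[int], bool]:
    """Chain open -> write -> close, passing the fd through shared context."""
    def do_open(ctx: Dict) -> int:
        flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
        ctx["fd"] = os.open(path, flags, 0o600)
        return ctx["fd"]

    def do_write(ctx: Dict) -> int:
        return os.write(ctx["fd"], b"hello\n")

    def do_close(ctx: Dict) -> int:
        os.close(ctx["fd"])
        return 0

    return run_chain([Atom(do_open), Atom(do_write), Atom(do_close)], {})
```

Calling `demo()` with a writable path runs all three atoms and reports completion. If the open step fails, for example because the directory does not exist, the chain stops after the first atom and the write and close steps never run.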