Re: [PATCH 2/4] x86&x86-64 support for sys_indirect

Previous thread: [PATCH 4/4] first use of sys_indirect system call by Ulrich Drepper on Thursday, November 15, 2007 - 12:41 pm. (1 message)

Next thread: [PATCH 1/4] actual sys_indirect code by Ulrich Drepper on Thursday, November 15, 2007 - 12:41 pm. (1 message)
To: <linux-kernel@...>
Cc: <akpm@...>, <torvalds@...>
Date: Thursday, November 15, 2007 - 12:41 pm

This part adds support for sys_indirect on x86 and x86-64.

b/arch/x86/ia32/ia32entry.S | 1 +
b/arch/x86/kernel/syscall_table_32.S | 1 +
b/include/asm-x86/indirect.h | 5 +++++
b/include/asm-x86/indirect_32.h | 27 +++++++++++++++++++++++++++
b/include/asm-x86/indirect_64.h | 30 ++++++++++++++++++++++++++++++
b/include/asm-x86/unistd_32.h | 3 ++-
b/include/asm-x86/unistd_64.h | 2 ++
7 files changed, 68 insertions(+), 1 deletion(-)

--- a/arch/x86/ia32/ia32entry.S
+++ b/arch/x86/ia32/ia32entry.S
@@ -726,4 +726,5 @@ ia32_sys_call_table:
.quad compat_sys_timerfd
.quad sys_eventfd
.quad sys32_fallocate
+ .quad sys_indirect /* 325 */
ia32_syscall_end:
diff --git a/arch/x86/kernel/syscall_table_32.S b/arch/x86/kernel/syscall_table_32.S
index 8344c70..92095b2 100644
--- a/arch/x86/kernel/syscall_table_32.S
+++ b/arch/x86/kernel/syscall_table_32.S
@@ -324,3 +324,4 @@ ENTRY(sys_call_table)
.long sys_timerfd
.long sys_eventfd
.long sys_fallocate
+ .long sys_indirect /* 325 */
diff --git a/include/asm-x86/unistd_32.h b/include/asm-x86/unistd_32.h
index 9b15545..8ee0b20 100644
--- a/include/asm-x86/unistd_32.h
+++ b/include/asm-x86/unistd_32.h
@@ -330,10 +330,11 @@
#define __NR_timerfd 322
#define __NR_eventfd 323
#define __NR_fallocate 324
+#define __NR_indirect 325

#ifdef __KERNEL__

-#define NR_syscalls 325
+#define NR_syscalls 326

#define __ARCH_WANT_IPC_PARSE_VERSION
#define __ARCH_WANT_OLD_READDIR
diff --git a/include/asm-x86/unistd_64.h b/include/asm-x86/unistd_64.h
index 5ff4d3e..66eab33 100644
--- a/include/asm-x86/unistd_64.h
+++ b/include/asm-x86/unistd_64.h
@@ -635,6 +635,8 @@ __SYSCALL(__NR_timerfd, sys_timerfd)
__SYSCALL(__NR_eventfd, sys_eventfd)
#define __NR_fallocate 285
__SYSCALL(__NR_fallocate, sys_fallocate)
+#define __NR_indirect 286
+__SYSCALL(__NR_indirect, sys_indirect)

#ifndef __NO_STUBS
#define __ARCH_WANT_OLD_READDIR
diff --git a/i...

To: Ulrich Drepper <drepper@...>
Cc: Linux Kernel Mailing List <linux-kernel@...>, Andrew Morton <akpm@...>, Ingo Molnar <mingo@...>, Thomas Gleixner <tglx@...>
Date: Thursday, November 15, 2007 - 1:02 pm

[ Ingo, Thomas - see the whole series on linux-kernel ]

The thing is, not all system calls can do this.

Some system calls are magic, and don't just take the arguments in
registers: they also care about the actual stack pointer and the whole
pt_regs struct when returning to user mode.

So this does need more infrastructure: some way of marking which system
calls cannot be executed indirectly.

The magic system calls are things like:

- sys_iopl() - this one changes the eflags value restored on iret
- execve/clone/vfork() - need direct access to pt_regs
- vm86() - does magic with the stack, cares about pt_regs
- sigreturn - magic pt_regs accesses again

and there may be others I have forgotten about.

Calling these system calls from C code will just corrupt the kernel stack,
and is a big big no-no.

Linus
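
[ Illustration, not part of the original thread. ] One way to picture the
"marking" asked for above is a per-syscall flag table that the indirect entry
point consults before dispatching. The following is a minimal userspace sketch
under stated assumptions, not the mechanism the series ended up using: the
demo_* names, the small demo numbering, and the choice of -EINVAL for a refused
call are all hypothetical. Only the list of pt_regs-dependent calls
(iopl, execve, clone, vfork, vm86, sigreturn) comes from the message above.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical syscall numbering for the sketch only. Real numbers are
 * per-architecture and come from the unistd headers.
 */
enum demo_nr {
	DEMO_NR_READ,
	DEMO_NR_IOPL,
	DEMO_NR_EXECVE,
	DEMO_NR_CLONE,
	DEMO_NR_VFORK,
	DEMO_NR_VM86,
	DEMO_NR_SIGRETURN,
	DEMO_NR_EVENTFD,
	DEMO_NR_SYSCALLS	/* one past the last valid number */
};

/*
 * Nonzero entries mark calls that depend on the real user-mode pt_regs or
 * stack layout and therefore must not be reached through a C-level call.
 * Unlisted entries default to zero (allowed).
 */
static const unsigned char demo_no_indirect[DEMO_NR_SYSCALLS] = {
	[DEMO_NR_IOPL]      = 1,
	[DEMO_NR_EXECVE]    = 1,
	[DEMO_NR_CLONE]     = 1,
	[DEMO_NR_VFORK]     = 1,
	[DEMO_NR_VM86]      = 1,
	[DEMO_NR_SIGRETURN] = 1,
};

/* True if the (hypothetical) syscall number may be invoked indirectly. */
bool demo_indirect_allowed(unsigned int nr)
{
	if (nr >= DEMO_NR_SYSCALLS)
		return false;
	return !demo_no_indirect[nr];
}

/*
 * Check an indirect dispatch would perform before calling the handler:
 * 0 if the call may proceed, -ENOSYS for an out-of-range number,
 * -EINVAL (an assumed choice) for a call marked as not indirectable.
 */
int demo_indirect_check(unsigned int nr)
{
	if (nr >= DEMO_NR_SYSCALLS)
		return -ENOSYS;
	if (demo_no_indirect[nr])
		return -EINVAL;
	return 0;
}
```

With this table, demo_indirect_check(DEMO_NR_EVENTFD) returns 0 while
demo_indirect_check(DEMO_NR_EXECVE) is refused before any handler runs. A
deny-list keeps the table short, but an allow-list would fail safe when a new
"magic" call is added and nobody remembers to mark it.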