Intercepting System Calls via a Virtual Machine

Submitted by Anonymous
on December 10, 2008 - 2:30pm

All code is running on an Ubuntu 2.6.24-19-generic kernel.

Here's the setup:

I'm currently working on a project that requires a bit of system call foo. We're using QEMU and hooking system calls as they come across the CPU. Everything works great on the front end: I'm catching int 0x80 and SYSENTER just fine. The problem comes on the back end, when I try to match system calls up with their return values. What I'm doing now is keeping a per-process table. When I see a system call, I store the syscall number and the EIP of the instruction after the call in that table. Then, when I get to an IRET or SYSEXIT, I look up the PID in the table and check whether the stored EIP matches the EIP we're returning to. If it matches, I grab EAX and pair it with the pending call.
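For readers following along, the bookkeeping described above can be modelled roughly as below. This is only a sketch of the scheme as stated in the post, not the poster's actual QEMU code. The class name `SyscallTracker` and the two hook methods are made up for illustration; in a real setup they would be driven from wherever the emulator sees the entry and return instructions.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class Pending(NamedTuple):
    return_eip: int  # EIP of the instruction after the syscall instruction
    sysno: int       # syscall number (EAX at entry)


class SyscallTracker:
    """Hypothetical model of the one-pending-syscall-per-process table."""

    def __init__(self) -> None:
        self.pending: Dict[int, Pending] = {}

    def on_syscall_entry(self, pid: int, eip_after_call: int, sysno: int) -> None:
        # Called when int 0x80 or SYSENTER is seen for this process.
        self.pending[pid] = Pending(eip_after_call, sysno)

    def on_return_to_user(self, pid: int, target_eip: int,
                          eax: int) -> Optional[Tuple[int, int]]:
        # Called on IRET or SYSEXIT. Only a return to the stored EIP
        # counts as the syscall's return; anything else is ignored.
        entry = self.pending.get(pid)
        if entry is None or entry.return_eip != target_eip:
            return None
        del self.pending[pid]
        return (entry.sysno, eax)
```

A return to any other EIP (for example, an IRET out of an unrelated interrupt) leaves the pending entry in place, which is exactly where the scheme gets stuck if the matching return never shows up.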

Here's the problem:

In some cases a process makes a system call and then, at the end of the syscall while switching from ring 0 to ring 3, the kernel checks whether another process needs to be scheduled and switches to that process. When it later returns to the original process (somehow?), it never seems to return to the EIP after the call, at least not via an IRET or SYSEXIT. So I never see the return, and the process then makes another system call, violating our assumption of one pending system call per process.

My question is:

What is the mechanism that the kernel uses to return to the original process, and is there a good reason why it wouldn't return to the instruction after the syscall?

Thanks a ton!

interrupts can do it

Ferdinand (not verified)
on December 16, 2008 - 8:22pm

A device interrupt handler can cause the kernel to switch to a different process. You may have to hook more than just int 0x80.

preemption

on December 16, 2008 - 8:49pm

Unix is a preemptive multitasking system. It doesn't only switch processes in syscalls; it also schedules another process when the current time slice has run out. One of the device interrupts to watch is the timer interrupt, and there are multiple time sources.
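Since interrupt handlers also finish with IRET, it may help to separate IRETs that go back to user mode from ones that stay inside the kernel. On x86 the low two bits of the CS selector popped by IRET give the privilege level being returned to. A minimal sketch, where the function names are made up for illustration:

```python
def iret_target_ring(cs_selector: int) -> int:
    """Return the privilege ring encoded in the low two bits of a CS selector."""
    return cs_selector & 0x3


def is_return_to_user(cs_selector: int) -> bool:
    """True if an IRET loading this CS selector lands in ring 3."""
    return iret_target_ring(cs_selector) == 3
```

On 32-bit Linux kernels of this era the user code selector is 0x73 and the kernel code selector is 0x60, so `is_return_to_user(0x73)` is true and `is_return_to_user(0x60)` is false. An IRET to ring 3 at an EIP other than the stored one is then most likely a preempted process resuming after an interrupt, not a syscall return.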

good reason

on December 16, 2008 - 9:15pm

"is there a good reason why it wouldn't return to the instruction after the syscall?"

  • fork returns to the same EIP in two processes.
  • The process can receive a signal while waiting in the syscall, and the signal handler can longjmp somewhere else in the program and continue, or just exit. The syscall may or may not return during the lifetime of the process.
  • execve 'returns' to a new process image under the same PID; the EIP will be different in most cases.
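Given cases like these, one option is to stop treating a second syscall with one still pending as an error, and record the old entry as abandoned. The sketch below is only one possible policy, not something the kernel or QEMU provides. `TolerantTracker` and its method names are hypothetical; the syscall numbers are the i386 ones for exit, execve and exit_group.

```python
from typing import Dict, List, Optional, Tuple

# i386 syscall numbers for calls that do not come back to the EIP after
# the call when they succeed.
SYS_EXIT, SYS_EXECVE, SYS_EXIT_GROUP = 1, 11, 252
MAY_NOT_RETURN = {SYS_EXIT, SYS_EXECVE, SYS_EXIT_GROUP}


class TolerantTracker:
    """Hypothetical tracker that tolerates syscalls whose return is never seen."""

    def __init__(self) -> None:
        self.pending: Dict[int, Tuple[int, int]] = {}  # pid -> (return_eip, sysno)
        # (pid, sysno, expected): expected is True for calls in MAY_NOT_RETURN.
        self.abandoned: List[Tuple[int, int, bool]] = []

    def on_syscall_entry(self, pid: int, eip_after_call: int, sysno: int) -> None:
        old = self.pending.get(pid)
        if old is not None:
            # The previous call never returned to its stored EIP (execve,
            # a signal handler that longjmp'd away, or a missed return).
            self.abandoned.append((pid, old[1], old[1] in MAY_NOT_RETURN))
        self.pending[pid] = (eip_after_call, sysno)

    def on_return_to_user(self, pid: int, target_eip: int,
                          eax: int) -> Optional[Tuple[int, int]]:
        entry = self.pending.get(pid)
        if entry is None or entry[0] != target_eip:
            # Includes a fork child: its PID has no entry of its own.
            return None
        del self.pending[pid]
        return (entry[1], eax)
```

With this policy an execve followed by the new image's first syscall leaves one entry in `abandoned` marked as expected, while an unexpected entry there points at a return that was missed. The fork child's first return is simply reported as unmatched, because the table has no entry under the child's PID.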
