System call instrumentation

To: Ingo Molnar <mingo@...>
Cc: <linux-kernel@...>, <systemtap@...>, Frank Ch. Eigler <fche@...>
Date: Sunday, May 4, 2008 - 9:48 am

Hi Ingo,

I recently looked at the system call instrumentation present in LTTng. I
tried different solutions, e.g. hooking a kernel-wide syscall trace in
do_syscall_trace, but I ended up re-creating another syscall table,
consisting of specialized functions which extract the string and data
structure parameters from user-space. Since code duplication is not
exactly wanted, I think that the original approach taken in my patchset
is still the best way to go: instrument the kernel code at the sys_*
level (e.g. sys_open), which is the earliest level where the parameter
information is made available to the kernel.
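
A minimal user-space sketch of that idea, not the actual patchset: the
probe sits where the filename has already been copied in, so no second
table of per-syscall extraction functions is needed. Every name here
(model_do_sys_open, trace_fs_open, copy_name_from_user, the in-memory
trace_log) is a hypothetical stand-in for illustration, not a kernel
API.

```c
#include <stdio.h>
#include <string.h>

#define TRACE_LOG_MAX 8
#define TRACE_MSG_MAX 128

/* Hypothetical in-memory log standing in for a real trace buffer. */
static char trace_log[TRACE_LOG_MAX][TRACE_MSG_MAX];
static int trace_count;

/* Hypothetical probe: records an fs_open event (fd, filename). */
static void trace_fs_open(int fd, const char *filename)
{
    if (trace_count < TRACE_LOG_MAX)
        snprintf(trace_log[trace_count++], TRACE_MSG_MAX,
                 "fs_open fd=%d filename=%s", fd, filename);
}

/* Stand-in for copying the name from user space; fails on bad input. */
static int copy_name_from_user(char *dst, size_t len, const char *user_src)
{
    if (user_src == NULL || strlen(user_src) >= len)
        return -1;
    strcpy(dst, user_src);
    return 0;
}

static int next_fd = 3; /* hypothetical fd allocator */

/*
 * Model of the shared open path. The probe is placed after the copy,
 * where the filename is a valid kernel-side string, so the tracer does
 * not have to extract it a second time.
 */
static int model_do_sys_open(const char *user_filename)
{
    char name[64];
    int fd;

    if (copy_name_from_user(name, sizeof(name), user_filename) != 0)
        return -1;
    fd = next_fd++;
    trace_fs_open(fd, name);
    return fd;
}

/* Both entry points share the same instrumented path. */
static int model_sys_open(const char *user_filename)
{
    return model_do_sys_open(user_filename);
}

static int model_sys_openat(int dfd, const char *user_filename)
{
    (void)dfd; /* directory fd is ignored in this simplified model */
    return model_do_sys_open(user_filename);
}
```

Calling model_sys_open() and model_sys_openat() each appends one fs_open
record through the same probe; a failed copy returns an error and logs
nothing.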

I would still identify the execution mode changes in the same way I do
currently, which is by instrumenting do_syscall_trace, just to know as
soon as possible when the mode has changed from user-space to
kernel-space so we can do time accounting more accurately. I already
have the patchset which adds the KERNEL_TRACE thread flag to every
architecture. It's tested in assembly in the same way SYSCALL_TRACE is
tested, but is activated globally by iterating over all the threads.
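
The flag mechanism can be sketched in plain C as follows. This is a
simplified model under stated assumptions: struct model_thread, the
MODEL_TIF_* bit values and the helper names are all hypothetical, and
the real check is done in per-architecture assembly on syscall entry,
not in a C helper.

```c
#include <stddef.h>

/* Hypothetical bit positions; the real values are per-architecture. */
#define MODEL_TIF_SYSCALL_TRACE (1u << 0)
#define MODEL_TIF_KERNEL_TRACE  (1u << 1)

/* Hypothetical, minimal stand-in for per-thread state. */
struct model_thread {
    int tid;
    unsigned int flags;
};

/* Global activation: set the flag on every thread in the list. */
static void kernel_trace_enable_all(struct model_thread *threads, size_t n)
{
    for (size_t i = 0; i < n; i++)
        threads[i].flags |= MODEL_TIF_KERNEL_TRACE;
}

/* Global deactivation: clear only the KERNEL_TRACE bit. */
static void kernel_trace_disable_all(struct model_thread *threads, size_t n)
{
    for (size_t i = 0; i < n; i++)
        threads[i].flags &= ~MODEL_TIF_KERNEL_TRACE;
}

/*
 * Model of the entry-path test: the slow path (do_syscall_trace) is
 * taken if either flag is set, so both flags share one mask check.
 */
static int syscall_entry_needs_trace(const struct model_thread *t)
{
    return (t->flags &
            (MODEL_TIF_SYSCALL_TRACE | MODEL_TIF_KERNEL_TRACE)) != 0;
}
```

Disabling clears only the KERNEL_TRACE bit, so a thread that already had
SYSCALL_TRACE set in this model still takes the slow path.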

So, the currently proposed scheme for a system call would be as follows
(using open() as the example), shown as:

kernel stack
  trace: event name (parameters)


do_syscall_trace()
  trace: kernel_arch_syscall_entry (syscall id, instruction pointer)

do_sys_open()
  trace: fs_open (fd, filename)

do_syscall_trace()
  trace: kernel_arch_syscall_exit (return value)

If we take this open() example, filename is ready only in do_sys_open,
which is called by sys_open and sys_openat. So the logical
instrumentation site for this would really be do_sys_open(). The
information about which system call was made is available in the
kernel_arch_syscall_entry event. It is no longer available at the
do_sys_open level because this execution path can be reached from more
than one syscall.
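
On the analysis side, a trace consumer could recover which syscall led
to a given fs_open by looking back to the enclosing entry event. The
sketch below assumes a single thread's event stream; the event layout,
the enum names and enclosing_syscall_id() are hypothetical and do not
describe the LTTng trace format.

```c
#include <stddef.h>

/* Hypothetical event kinds mirroring the three trace points above. */
enum event_type {
    EV_SYSCALL_ENTRY, /* kernel_arch_syscall_entry */
    EV_FS_OPEN,       /* fs_open */
    EV_SYSCALL_EXIT   /* kernel_arch_syscall_exit */
};

/* Hypothetical record: one value is enough for this illustration. */
struct trace_event {
    enum event_type type;
    long value; /* syscall id (entry), fd (fs_open), return value (exit) */
};

/*
 * Return the syscall id of the entry event enclosing events[index],
 * or -1 if the event is not inside a syscall (an exit event, or the
 * start of the stream, is reached first while scanning backwards).
 */
static long enclosing_syscall_id(const struct trace_event *events,
                                 size_t index)
{
    size_t i = index;

    while (i > 0) {
        i--;
        if (events[i].type == EV_SYSCALL_EXIT)
            return -1;
        if (events[i].type == EV_SYSCALL_ENTRY)
            return events[i].value;
    }
    return -1;
}
```

With a stream of entry, fs_open, exit for two calls, the same fs_open
event type resolves to two different syscall ids depending on the entry
event that precedes it.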

What do you think of this approach?


Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
