[PATCH 1/1] system call notification with self_ptrace

Previous thread: by Tarkan Erimer on Monday, September 8, 2008 - 5:04 am. (1 message)

Next thread: Recent copy of libpcap from CVS by Keith Owens on Monday, September 8, 2008 - 5:27 am. (10 messages)
From: Pierre Morel
Date: Monday, September 8, 2008 - 5:02 am

Subject: [PATCH] system call notification with self_ptrace

From: Pierre Morel <pmorel@fr.ibm.com>


PTRACE SELF

This patch adds a new functionality to ptrace: system call notification to
the current process.
When a process requests self ptrace, with the new request PTRACE_SELF_ON:

 1.  the next system call performed by the process will not be executed
 2.  self ptrace will be disabled for the process
 3.  a SIGSYS signal will be sent to the process.

With an appropriate SIGSYS signal handler, the process can access its own
data structures to

 1.  get the system call number from the siginfo structure
 2.  get the system call arguments from the stack
 3.  instrument the system call with other system calls
 4.  emulate the system call with other system calls
 5.  change the arguments of the system call
 6.  perform the system call for good
 7.  change the return value of the system call
 8.  request self ptrace again before returning.

The new request PTRACE_SELF_OFF disables self ptrace.


Signed-off-by: Pierre Morel <pmorel@fr.ibm.com>
Signed-off-by: Volker Sameske <sameske@de.ibm.com>
---

 arch/s390/kernel/ptrace.c     |   16 ++++++++++++++++
 arch/s390/kernel/signal.c     |    5 +++++
 arch/x86/kernel/ptrace.c      |   29 +++++++++++++++++++++++++++++
 arch/x86/kernel/signal_32.c   |    5 +++++
 arch/x86/kernel/signal_64.c   |    5 +++++
 include/asm-generic/siginfo.h |    6 ++++++
 include/linux/ptrace.h        |   18 ++++++++++++++++++
 include/linux/sched.h         |    1 +
 kernel/ptrace.c               |   32 ++++++++++++++++++++++++++++++++
 9 files changed, 117 insertions(+)

Index: linux-2.6.26/arch/s390/kernel/ptrace.c
===================================================================
--- linux-2.6.26.orig/arch/s390/kernel/ptrace.c
+++ linux-2.6.26/arch/s390/kernel/ptrace.c
@@ -583,6 +583,22 @@ syscall_trace(struct pt_regs *regs, int 

 	if (!test_thread_flag(TIF_SYSCALL_TRACE))
 		goto out;
+
+	if (is_self_ptracing(regs->gprs[2])) {
+		if ...
From: Andrew Morton
Date: Monday, September 8, 2008 - 5:04 pm

On Mon, 08 Sep 2008 14:02:01 +0200

It sounds like it might be useful.

Are there any userspace tools available with which people can utilise

Maintainers of the other 30-odd architectures would appreciate a test
application which they can use to develop and test their ports, please.

Michael Kerrisk will no doubt be looking for manpage assistance. 
Please cc him on this material.

It would be good to get suitable testcases integrated into LTP (if LTP
has ptrace tests).

The patch title uses the term "self_ptrace", but the patch itself uses
the term "ptrace_self".  Let's get it consistent everywhere.

The patch adds a

+	u64	instrumentation;

to the task_struct but no explanation is provided as to why this was
added, why it is a 64-bit field, what its locking rules are, etc. 
Please fix this.


--

From: Pierre Morel
Date: Wednesday, September 10, 2008 - 7:17 am

Hi,

Yes, of course I have one for x86 and one for s390.

I used to steal one bit in the ptrace bit-field of the task_struct but 
Oleg pointed out that the ptrace bit-field is used in a lot of places 
without any bit mask, so I chose another way to remember that I (the 
thread) am instrumenting myself.

Alternatively, I could also use the ptrace bit-field and modify every 
reference to use a mask for any test, set or reset of the bit-field.

I provision a 64 bit wide bit-field for future extensions of the 
instrumentation.  I could of course use a smaller bit-field as only 1 
bit is really useful for now. I used 64 bit to be memory aligned with 
most of the architectures.

There is no lock for the instrumentation bit-field because it is used 
for self tracing only, and only current ever accesses the flag.


-- 
=============
Pierre Morel
RTOS and Embedded Linux

--

From: Oleg Nesterov
Date: Tuesday, September 9, 2008 - 5:43 am

I still think this patch shouldn't change handle_signal().

Once again. The signal handler for SIGSYS can first do
sys_ptrace(PTRACE_SELF_OFF) (which is filtered out), and then use any
other syscall, so this change is not needed, afaics.

The overhead of the additional PTRACE_SELF_OFF syscall is very small,
especially compared to signal delivery. I don't think this functionality
will be widely used, but this change adds the unconditional overhead
to handle_signal().

Btw, the check above looks wrong, shouldn't it be

	if (current->instrumentation & PTS_SELF)

?

And. According to the prior discussion, this requires to hook every
signal handler in user space, otherwise we can miss syscall. But every


The code looks strange. How about

	if (request == PTRACE_SELF_ON) {
		ret = -EPERM;
		task_lock(current);
		if (!current->ptrace) {
			set_thread_flag(TIF_SYSCALL_TRACE);
			current->instrumentation |= PTS_INSTRUMENTED|PTS_SELF;
			ret = 0;
		}
		task_unlock(current);
		goto out;
	}

?

I don't understand how task_lock() can help. This code runs under

So. PTRACE_SELF_OFF doesn't clear PTS_INSTRUMENTED? How can the task

So, PTRACE_SELF_XXX disables the "normal" ptrace. Not sure this is good.

Oleg.

--

From: Pierre Morel
Date: Wednesday, September 10, 2008 - 8:11 am

Hello,

Yes it can but what if the application forget to do it?
Yes you are right, in fact I do not need two flags, I will remove

I use task_lock to protect the current->ptrace bit-field which can be 
accessed by another thread, like the one you pointed out previously.
I agree it is not necessary with lock_kernel().
I think that having two tracing system one over the other may be
quite difficult to handle.



-- 
=============
Pierre Morel
RTOS and Embedded Linux

--

From: Oleg Nesterov
Date: Wednesday, September 10, 2008 - 9:20 am

The (buggy) task can be killed, this has nothing to do with security.

From the security pov, this case doesn't differ from, say,

	void sigh(int sig)
	{
		kill(getpid(), sig);
	}

	void main(void)
	{
		signal(SIGSYS, sigh);
		kill(getpid(), SIGSYS);
	}


Yes I see.

But... well, I think we need Roland's opinion. I must admit, I am a bit
sceptical about this patch ;) I mean, I don't really understand why it
is useful. We can do the same with fork() + ptrace(). Yes, in that
case we need an "extra" context switch for any traced syscall. But,
do you have any "real life" example to demonstrate that the user-space
solution sucks? We can even use CLONE_MM to speedup the context switch.

Pierre, don't get me wrong. I never used debuggers for myself, I will
be happy to know I am wrong. I just don't understand.


As for ->instrumentation. If you are going to remove PTS_INSTRUMENTED,
we need only one bit. We could use PF_PTS_SELF, but ->flags is already
"contended". Perhaps you can do something like

	--- include/linux/sched.h
	+++ include/linux/sched.h
	@@ -1088,6 +1088,7 @@ struct task_struct {
		/* ??? */
		unsigned int personality;
		unsigned did_exec:1;
	+	unsigned pts_self:1;
		pid_t pid;
		pid_t tgid;
	 

Both did_exec and pts_self can only be changed by current, so it is
safe to share the same word. This way we don't enlarge task_struct.

Oleg.

--

From: Dave Hansen
Date: Wednesday, September 10, 2008 - 9:23 am

Yeah, I don't think they're completely coming clean on it, yet.  

-- Dave

--

From: Pierre Morel
Date: Friday, September 12, 2008 - 5:22 am

Hi Dave,

be optimistic, why not?
I bet we will ;)

Pierre



-- 
=============
Pierre Morel
RTOS and Embedded Linux

--

From: Pierre Morel
Date: Friday, September 12, 2008 - 5:19 am

Hello Oleg,

You are right, the functionality can be implemented with the system call.
But it means we have the overhead of a system call just to clear two bits,
the TIF_SYSCALL_TRACE and the PTS_SELF.

On the other hand we have an overhead of one single "if" inside
the handle_signal() function.

We can do the same with fork and ptrace, yes, but with a very big 
overhead on each system call and this is why this patch is so usefull: 
because with this patch you sit inside the thread when analysing it and 
have a direct access to all data without the need of IPC, ptrace or any 
task switch.

I will provide a test program and plan to release a tracing tool based 
on it.
I think I can reduce the task struct modification by using just a bit 
like you suggest if nobody seen any problem with this.

best regards,

Pierre



-- 
=============
Pierre Morel
RTOS and Embedded Linux

--

From: Oleg Nesterov
Date: Friday, September 12, 2008 - 7:32 am

Hello Pierre,


Yes.

So you want to optimize the code for the (imho very exotic) functionality.
And again, the overhead of a system call is nothing compared to the signal

What if everyone who wants to add the new functionality will add one
single "if + code" to the core kernel just because he wants to add
a very minor optimization for his needs?

And you forgot about the maintaince overhead. You forgot that this extra
"if" uglifies/complicates the code.

This all is imho of course, and I'm not maintainer. But I promise I

Yes please, this would be very nice. Please do not count me, but I'm
afraid I am not alone who needs to really understand why this patch
is useful.

Oleg.

--

From: Dave Hansen
Date: Wednesday, September 10, 2008 - 9:19 am

I see you didn't like my suggestions for consolidating some of these
repetitive code bits across all the architectures.  Did you give that a
a shot?  Would you like me to produce a patch on top of what you have
here before this gets merged into mm?

-- Dave

--

From: Pierre Morel
Date: Friday, September 12, 2008 - 5:30 am

Hi Dave,

I have read your suggestion, but the actual ptrace() implementation uses 
explicit reference to the different architecture dependent registers and 
I think that this portion of code is more readable if the patch keeps 
No Dave, thank you.

regards,



-- 
=============
Pierre Morel
RTOS and Embedded Linux

--

Previous thread: by Tarkan Erimer on Monday, September 8, 2008 - 5:04 am. (1 message)

Next thread: Recent copy of libpcap from CVS by Keith Owens on Monday, September 8, 2008 - 5:27 am. (10 messages)