Re: [PATCH 09/16] ptrace: make do_signal_stop() use ptrace_stop() if the task is being ptraced

Previous thread: [PATCH 04/16] ptrace: kill tracehook_notify_jctl() by Tejun Heo on Monday, December 6, 2010 - 9:56 am. (3 messages)

Next thread: [PATCH 03/16] signal: remove superflous try_to_freeze() loop in do_signal_stop() by Tejun Heo on Monday, December 6, 2010 - 9:56 am. (2 messages)
From: Tejun Heo
Date: Monday, December 6, 2010 - 9:56 am

A ptraced task would still stop at do_signal_stop() when it's stopping
for stop signals and do_signal_stop() behaves the same whether the
task is ptraced or not.  However, in addition to stopping,
ptrace_stop() also does ptrace specific stuff like calling
architecture specific callbacks, so this behavior makes the code more
fragile and difficult to understand.

This patch makes do_signal_stop() test whether the task is ptraced and
use ptrace_stop() if so.  This renders tracehook_notify_jctl() rather
pointless as the ptrace notification is now handled by ptrace_stop()
regardless of the return value from the tracehook.  It probably is a
good idea to update it.

This doesn't solve the whole problem as tasks already in stopped state
would stay in the regular stop when ptrace attached.  That part will
be handled by the next patch.

Oleg spotted a minor userland visible change.  In some cases, the
ptracee's state would now be TASK_TRACED where it used to be
TASK_STOPPED, which is visible via fs/proc.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
---
 kernel/signal.c |   43 +++++++++++++++++++++++++------------------
 1 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index faf218b..a6bc4cf 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1785,7 +1785,6 @@ void ptrace_notify(int exit_code)
 static int do_signal_stop(int signr)
 {
 	struct signal_struct *sig = current->signal;
-	int notify = 0;
 
 	if (!(current->group_stop & GROUP_STOP_PENDING)) {
 		unsigned int gstop = GROUP_STOP_PENDING | GROUP_STOP_CONSUME;
@@ -1815,29 +1814,37 @@ static int do_signal_stop(int signr)
 			} else
 				task_clear_group_stop(t);
 	}
-	/*
-	 * If there are no other threads in the group, or if there is
-	 * a group stop in progress and we are the last to stop, report
-	 * to the parent.  When ptraced, every thread ...
From: Oleg Nesterov
Date: Thursday, December 23, 2010 - 5:26 am

Jan, Roland, this change needs your review.

As it was already discussed, this is the user-visible change, and
I am starting to worry we can underestimate it.

Again, I am not saying this can break something, I simply do not
know. However, I think there is something non-consistent in the
new behaviour, please see below.


In short: suppose that the tracee recieves a signal, reports it
to debugger, and the debugger does ptrace(PTRACE_CONT, pid, SIGSTOP).

Before the patch the tracee sleeps in TASK_STOPPED, after the patch

I missed this part of the changelog. "visible via fs/proc" is not
the only change. Another change is how the tracee reacts to SIGCONT
after that. With this patch it can't wake the tracee up.

Consider the simplest example. The tracee is single-threaded and
debugger == parent. Something like

	int main(void)
	{
		int child, status;

		child = fork();
		if (!child) {
			ptrace(PTRACE_TRACEME);

			kill(getpid(), SIGSTOP);

			return 0;
		}

		wait(&status)
		// the tracee reports the signal
		assert(WIFSTOPPED() && WSTOPSIG() == SIGSTOP);
		// it should stop after that
		ptrace(PTRACE_CONT, child, SIGSTOP);

		wait(&status);
		// now it is stopped
		assert(WIFSTOPPED() && WSTOPSIG() == SIGSTOP);

		kill(child, SIGCONT);

		wait(&status);
		assert(WIFSTOPPED() && WSTOPSIG() == SIGCONT);

This won't work with this patch. the last do_wait() will hang forever.
Probably this is fine, I do not know. Please take a look and ack/nack
explicitly.


However. There is something I missed previously, and this small
detail doesn't look good to me: the behaviour of SIGCONT becomes
a bit unpredictable. Suppose it races with do_signal_stop() and
clears GROUP_STOP_PENDING or SIGNAL_STOP_DEQUEUED before, in this
case in can "wakeup" the tracee.

IOW. Let's remove the 2nd wait() in the code above, the parent
does

		wait(&status)
		// the tracee reports the signal
		assert(WIFSTOPPED() && WSTOPSIG() == SIGSTOP);
		// it should stop after ...
From: Tejun Heo
Date: Thursday, December 23, 2010 - 6:53 am

Hello,


Yes, before the change, the task would respond to SIGCONT before the
first ptrace request succeeds after attach.  To me, this doesn't seem
to be anything intentional tho.  It seems a lot of ptrace and group
stop interactions is in the grey area with only the current (quirky,
I'm afraid) behavior drawing almost arbitrary lines across different
behaviors.

We can try to preserve all those behaviors but I don't think that will
be very beneficial.  I think the best way to proceed would be
identifying the sensible and depended upon assumptions and then draw
clear lines from there stating what's guaranteed and what's undefined.
We'll have to keep some of quirkiness for sure.

Anyways, pondering and verifying all the possibly visible changes
definitely is necessary, but that said, we fortunately have rather
limited number of ptrace users and their usages don't seem to be too
wild (at least on my cursory investigation), so I think it to be
doable without breaking anything noticeably.  But yeap we definitely

We can change ptrace_stop() to use TASK_STOPPED if it's stopping for
group stop to preserve the original behavior but if it doesn't disturb
current users (and I doubt it would), I think it would be far cleaner
to state that the behavior is undefined.  The current behavior - it
works if there hasn't been whichever ptrace operation inbetween - is
quite unexpected anyway, IMHO.

And, for longer term, I think it would be a good idea to separate
group stop and ptrace trap mechanisms, so that ptrace trap works
properly on per-task level and properly transparent from group stop
handling.  The intertwining between the two across different domains
of threads inhfferently involves a lot of grey areas where there is no

IIUC, it dumps the register window to userland memory.  ia64 has this
stacked windows of registers which gets wound up and unwound according
to function calls and those need to be dumped to userland memory so
that the debugger can PEEK and POKE them.  Not really ...
From: Oleg Nesterov
Date: Thursday, December 23, 2010 - 9:06 am

Not exactly. But perhaps you meant that even without this change,
any ptrace() request after ptrace(PTRACE_CONT, SIGSTOP) will change

Agreed.

However. Strangely, I didn't think about this before. With this
change, it is not possible to trace/debug the application so that
it can properly react to SIGCONT. Yes, currently we have a lot
more problems here, including do_wait, so probably this doesn't
matter.

Still I'd like to know what Jan and Roland think. I am paranoid,

Yes, at least I think it makes sense to document this change in the
changelog. This can simplify the life if we have a bug report blaiming


Yes, that was my question.

Oleg.

--

From: Tejun Heo
Date: Thursday, December 23, 2010 - 9:33 am

Definitely.

Hmm... the mention of SIGCONT behavior reminds me of your mention of
notification behavior on the other message.  I might write a more
detailed reply there but, anyways, I don't think we'll be able to find
one good behavior with the current set of operations.  If we hide the
group stop handling and notification from the ptracer, it would allow,
for example, transparent strace, but it will at the same time restrict
what a debugger can do behind the tracee's back in a regressive way
(ie. the current users would notice).

I think it would be better to introduce a few new ptrace operations
and give them more control in clearly defined and hopefully intutive
way.  For example, a ptracer is notified when a task is stopping for
group stop and it should be able to decide and tell the kernel whether
to participate in the group stop or not.  For SIGCONT, likewise, the
kernel can notify all ptracers in the group and can notify the real
parent if all ptracers decide to participate in the continuation.

That way, debuggers can still work behind the real parent's back (they
often want to) while audit type tools can sniff transparently.  I
think the trickier part would be adapting the existing operations so
that they have clearly defined (mostly) compatible behavior while not
hindering too much with addition of new capabilities.

Thanks.

-- 
tejun
--

Previous thread: [PATCH 04/16] ptrace: kill tracehook_notify_jctl() by Tejun Heo on Monday, December 6, 2010 - 9:56 am. (3 messages)

Next thread: [PATCH 03/16] signal: remove superflous try_to_freeze() loop in do_signal_stop() by Tejun Heo on Monday, December 6, 2010 - 9:56 am. (2 messages)