Re: [RFC PATCH] set TASK_TRACED before arch_ptrace code to fix a race

From: Luming Yu
Date: Wednesday, May 21, 2008 - 7:47 pm

Hello list,

The following patch fixes a race in ptrace_stop handling which
causes "strace" to hang if the target process blocks SIGTRAP, as shown
by the test case filed at
https://bugzilla.redhat.com/show_bug.cgi?id=446200#c16.
Please note this is an IA64-only problem, because only IA64 defines
arch_ptrace_stop_needed and an arch_ptrace_stop that sets the
notify_resume flags for syncing the RBS... but that also opens the
door to invoking ia64_do_signal->get_signal_to_deliver before the
PTRACED flag is set on current. Please help review.

** The patch is enclosed as a text attachment **
** I'm using a web client to send the patch **

Signed-off-by: Yu Luming <luming.yu@intel.com>
--------------------------------------
 signal.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
From: Petr Tesarik
Date: Thursday, May 22, 2008 - 1:47 am

That's probably not what we want. What happens if the task then sleeps
during the user-space access? Unless I forgot something obvious, it will
never get scheduled again...


--

From: Luming Yu
Date: Thursday, May 22, 2008 - 2:16 am

My intention is to disable signal delivery before the TASK_TRACED flag
is set, so that ptrace_stop() is handled correctly with SIGTRAP masked.
Although this patch is totally a hack, it should clearly show
--

From: Roland McGrath
Date: Thursday, May 22, 2008 - 4:18 am

I really cannot figure out from anything you've said what the failure mode
is or how you think it should be affected.


Thanks,
Roland
--

From: Petr Tesarik
Date: Thursday, May 22, 2008 - 5:12 am

At least you should be able to see the related bug report at:

https://bugzilla.redhat.com/show_bug.cgi?id=446200

I'm getting Access Denied. :(

Petr Tesarik

--

From: Roland McGrath
Date: Thursday, May 22, 2008 - 1:39 pm

That really doesn't help (yes, I can see it).  That is a bug for the RHEL4
kernel, which is based on a much older kernel and has none of the code
we're discussing here.  The test case is a simple program that blocks
SIGTRAP using sigprocmask, then forks and exec's a simple shell script that
just runs some random command; run this under strace.  There is no further
information in that bugzilla report that helps us understand the behavior
of that test case in the current kernel, which Luming says has a problem.


Thanks,
Roland
--

From: Luming Yu
Date: Friday, May 23, 2008 - 5:33 am

OK, I just got permission to share the test case here. Please check out
the attachment.
1. tar jxvf testpro.tar.bz2
       testpro/test1.c
               test2.sh
2. cd testpro
3. gcc -o test1 test1.c
4. strace -f -o log.txt ./test1

I hope you can reproduce the problem.

Thanks,
Luming
From: Luming Yu
Date: Thursday, May 22, 2008 - 6:24 am

Sorry for confusion, Let me try to explain it more:

For ia64,the code path is like:
ptrace_notify (to let the debugger run)--> ptrace_stop
-->spin_unlock_irq->arch_ptrace_stop (ia64_ptrace_stop) ->[sync rbs
and set NOTIFY_RESUME.....]-->spin_lock_irq->set TASK_TRACED flag (to
let the debugger run)

For x86, the code path is like:
ptrace_notify (to let the debugger run) ->ptrace_stop->set TASK_TRACED
flag (to let the debugger run)-->spin_unlock_irq

If TASK_TRACED is not set before arch_ptrace_stop on the ia64
ptrace_notify code path, some signals would be delivered without
letting the debugger run (i.e. the PTRACED logic in
get_signal_to_deliver would be ignored totally!). This should cause the
test case to hang on ia64, while x86 just works.

If you have any questions, I will dig further.

Thanks,
Luming
--

From: Roland McGrath
Date: Thursday, May 22, 2008 - 1:34 pm

> Sorry for confusion, Let me try to explain it more:


I do not understand this at all, and it has given no information you did
not give before.  Please describe the scenario you see in fine-grained
terms.


Thanks,
Roland
--

From: Luming Yu
Date: Thursday, May 22, 2008 - 8:42 pm

In the code path mentioned above, I see that only ia64 gives the
ptraced thread a chance to deliver pending signals before TASK_TRACED
is set. The debugger thread would then lose its chance to interfere
with the delivery of those signals, if I correctly understand the
PT_PTRACED logic in get_signal_to_deliver and the relationship between
the two flags, TASK_TRACED and PT_PTRACED.

Since you wrote that code, please clarify: in the ptrace_notify code
path, is the ptraced thread allowed to run a signal handler without
telling the debugger what happened?

This is the only difference I noticed between x86 and IA64, and the
test case works on x86 and fails on IA64. So I made the patch to try
to eliminate the difference. It does seem to solve my problem, although
it is still a hack, and I don't know which of the signals strace
handles have such magic.

As for how the door is only open for ia64, I can explain further if
you want to know.
--

From: Roland McGrath
Date: Thursday, May 22, 2008 - 9:19 pm

This is the key thing that makes no sense to me.  What do you mean by
"deliver" here?  The normal meaning of to "deliver" a signal means that
get_signal_to_deliver() dequeues and processes it.  This can never happen
inside any other kernel code path.  It certainly can never happen inside
ptrace_stop or ptrace_notify.

The difference between ia64 and others is that inside ptrace_stop(), it can
release the siglock and then can block (in page faults, or by preemption).
No signal delivery can happen.  

What perhaps can happen is that TIF_SIGPENDING being set can cause a
TASK_INTERRUPTIBLE wait in the page fault path to return early and
ia64_sync_user_rbs will bail out before doing all its work.  But you
haven't described any problem like that.

Please tell us the exact code path you think is happening in the error case
you can reproduce.  Describe the actual code path, not high-level notions
like "deliver those pending signals".


Thanks,
Roland
--

From: Luming Yu
Date: Thursday, May 22, 2008 - 10:24 pm

"deliver" here I mean get_signal_to_deliver()...
Please clarify: in the ptrace_notify code path, is the ptraced thread
allowed to ignore signals (such as SIGTRAP) without telling the debugger
what happened? This is the key point I'm not quite sure about, because
if the test doesn't block SIGTRAP, it works on IA64 too. I hope I have
made the problem clear.
--

From: Roland McGrath
Date: Sunday, May 25, 2008 - 5:15 pm

> "deliver" here I mean get_signal_to_deliver()...

This can never happen inside ptrace_stop.  

It was already quite clear what elements have to be present to cause a
problem with the test case (ia64 and blocked SIGTRAP).  Nothing else about
it is yet clear to me, sorry.

Please tell us the exact code path you think is happening in the error case
you can reproduce.  


Thanks,
Roland
--

From: Luming Yu
Date: Sunday, May 25, 2008 - 6:30 pm

I will try to customize kernel to capture call trace for a precise code path.

--Luming
--

From: Luming Yu
Date: Monday, May 26, 2008 - 8:31 pm

It does happen!!

Call Trace:
 [<a000000100011bd0>] show_stack+0x50/0xa0
                                sp=e000000146bbfbb0 bsp=e000000146bb0e08
 [<a000000100011c50>] dump_stack+0x30/0x60
                                sp=e000000146bbfd80 bsp=e000000146bb0de8
 [<a0000001000979a0>] get_signal_to_deliver+0x60/0x6e0
                                sp=e000000146bbfd80 bsp=e000000146bb0d80
 [<a0000001000343d0>] ia64_do_signal+0xb0/0xd00
                                sp=e000000146bbfd80 bsp=e000000146bb0cd8
 [<a000000100012650>] do_notify_resume_user+0xf0/0x140
                                sp=e000000146bbfe20 bsp=e000000146bb0ca8
 [<a00000010000aac0>] notify_resume_user+0x40/0x60
                                sp=e000000146bbfe20 bsp=e000000146bb0c58
 [<a00000010000a9f0>] skip_rbs_switch+0xe0/0x110
                                sp=e000000146bbfe30 bsp=e000000146bb0c58
 [<a000000000010740>] __kernel_syscall_via_break+0x0/0x20

I applied the following patch and got the call trace above.
If I apply my RFC patch as an antidote, I don't see the "deliver".
Is the problem clear now?  I will serve you until everything is clear to you.

Thanks,
Luming

Signed-off-by: Yu Luming <luming.yu@intel.com>

diff -Bru 1/kernel/signal.c 0/kernel/signal.c
--- 1/kernel/signal.c   2008-05-27 15:18:48.000000000 +0800
+++ 0/kernel/signal.c   2008-05-27 15:08:51.000000000 +0800
@@ -38,6 +38,7 @@
  */

 static struct kmem_cache *sigqueue_cachep;
+unsigned long global_arch_ptrace_stop_flag =0;

 static int __sig_ignored(struct task_struct *t, int sig)
 {
@@ -1501,9 +1502,12 @@
                 * siglock.  That must prevent us from sleeping in TASK_TRACED.
                 * So after regaining the lock, we must check for SIGKILL.
                 */
+               global_arch_ptrace_stop_flag = 1;
                spin_unlock_irq(&current->sighand->siglock);
                arch_ptrace_stop(exit_code, info);
+
                spin_lock_irq(&current->sighand->siglock);
+               ...
From: Roland McGrath
Date: Monday, May 26, 2008 - 9:04 pm

> > if happens, it should be a bug, right?

It doesn't even make sense that it should be possible.
So if it somehow is possible, that is certainly a bug.


So this here shows a perfectly normal trace that bottoms out at a syscall
entry from user mode.  You seem to be saying that, somehow, inside
ptrace_stop(), we tried to return to user mode--I guess you mean losing the
kernel stack with the call chain leading to ptrace_stop()--and then

With just that diagnostic patch as shown, these might be two different
threads.  But I guess you've ruled that out somehow?  If this does in fact
happen in the thread that is supposed to be in ptrace_stop(), then the


That's quite a commitment!  My full enlightenment may be a long time off.
I won't hold you to it once we've fixed this particular bug, though. ;-)

What should be happening is that ia64_ptrace_stop() should do its work,
possibly blocking, and then return to its caller in ptrace_stop().  At no
point should it be possible for ia64_ptrace_stop() to return directly to
user mode, or to reenter notify_resume_user() in any fashion.

Please focus on the exact code path taken inside the ia64_ptrace_stop()
call.  It should be possible to identify every step of that and see exactly
where it goes astray from what we expect.


Thanks,
Roland
--

From: Luming Yu
Date: Monday, May 26, 2008 - 10:49 pm

I revised patch a bit, and managed to get this:

Call Trace:
 [<a000000100011bd0>] show_stack+0x50/0xa0
                                sp=e000000141c9fbb0 bsp=e000000141c90ea8
 [<a000000100011c50>] dump_stack+0x30/0x60
                                sp=e000000141c9fd80 bsp=e000000141c90e90
 [<a000000100097a20>] get_signal_to_deliver+0xa0/0x720
                                sp=e000000141c9fd80 bsp=e000000141c90e28
 [<a0000001000343d0>] ia64_do_signal+0xb0/0xd00
                                sp=e000000141c9fd80 bsp=e000000141c90d78
 [<a000000100012650>] do_notify_resume_user+0xf0/0x140
                                sp=e000000141c9fe20 bsp=e000000141c90d48
 [<a00000010000aac0>] notify_resume_user+0x40/0x60
                                sp=e000000141c9fe20 bsp=e000000141c90cf8
 [<a00000010000a9f0>] skip_rbs_switch+0xe0/0x110
                                sp=e000000141c9fe30 bsp=e000000141c90cf8
 [<a000000000010740>] __kernel_syscall_via_break+0x0/0x20
                                sp=e000000141ca0000 bsp=e000000141c90cf8
 [<a0000001000971c0>] ptrace_stop+0xa0/0x3e0
                                sp=e00000014716fdb0 bsp=e000000147160ca8
 [<a000000100097650>] ptrace_notify+0x150/0x1c0
                                sp=e00000014716fdb0 bsp=e000000147160c88
 [<a00000010002adb0>] syscall_trace+0x30/0xc0
                                sp=e00000014716fe30 bsp=e000000147160c70
 [<a00000010002aea0>] syscall_trace_enter+0x60/0xa0
                                sp=e00000014716fe30 bsp=e000000147160c18
 [<a00000010000a300>] ia64_trace_syscall+0x40/0x110
                                sp=e00000014716fe30 bsp=e000000147160c18
 [<a000000000010740>] __kernel_syscall_via_break+0x0/0x20
                                sp=e000000147170000 bsp=e000000147160c18
--

From: Roland McGrath
Date: Monday, May 26, 2008 - 11:12 pm

If this is a backtrace of a single thread on the same kernel stack, it has
a stack depth of over 84MB.  You still need to follow the path inside
ia64_ptrace_stop() to see how any of this is possible.
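
One way to check a figure like this is to subtract the sp values printed in
the backtrace above. The sketch below takes the sp of the show_stack frame
and the sp of the ptrace_stop frame as they appear in the trace; the
function and macro names are invented for this example, and the choice of
those two frames is an assumption about how the depth was estimated.

```c
#include <stdint.h>

/* sp values copied from the backtrace above. */
#define SP_SHOW_STACK   UINT64_C(0xe000000141c9fbb0)  /* innermost frame */
#define SP_PTRACE_STOP  UINT64_C(0xe00000014716fdb0)  /* ptrace_stop frame */

/* Absolute distance in bytes between two stack-pointer values. */
static uint64_t stack_span_bytes(uint64_t a, uint64_t b)
{
    return a > b ? a - b : b - a;
}
```

stack_span_bytes(SP_PTRACE_STOP, SP_SHOW_STACK) comes to 0x54d0200 bytes,
i.e. 88932864 bytes or roughly 84.8 MiB, which is far larger than a single
kernel stack.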


Thanks,
Roland
--

From: Petr Tesarik
Date: Monday, May 26, 2008 - 11:25 pm

Indeed, there seems to be a large hole here. So, this is either a bug in
the unwinder, or a bug in the RBS synchronization, which causes
corruption. My test machine currently needs some work to run 2.6.25
again, but I'll try your test case as soon as I re-install it later this
week.

Cheers,
--

From: Luming Yu
Date: Monday, June 2, 2008 - 11:04 pm

Just want to check if the test case works for you?
--

From: Petr Tesarik
Date: Tuesday, June 3, 2008 - 2:01 am

Yes, the test case hangs here too. But the problem seems to be
elsewhere. Did you look into the strace output? This line is pretty
suspicious:

3258  clone2(child_stack=0, stack_size=0,
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
child_tidptr=0x200000000004e290) = 1

Obviously, strace cannot attach PID 1, and since it is not designed to
handle this situation, it hangs. I'm going to investigate why the return
value of the clone2 syscall is seen as 1 by the tracer. Might even turn
out to be a bug in strace...

Petr Tesarik

--

From: Petr Tesarik
Date: Tuesday, June 3, 2008 - 7:32 am

It's definitely a bug in strace. For some reason (I don't care about)
the execve() syscall produces an extra notification. However, this
notification message is suppressed when SIGTRAP is blocked. This
explains why the test case fails only when SIGTRAP is blocked.

Now, you may ask why it only fails on ia64 and not on i386 or x86_64.
Well, I was so good that I even looked into strace sources to make sure.
Whereas for i386 and x86_64, the value of EAX/RAX is checked for -ENOSYS
in syscall_fixup(), for ia64 the first ptrace() after an execve() is
unconditionally ignored, see code in get_scno().

I don't know why Luming's fix helps here, but, please, fix strace, don't
introduce weird behaviour in the kernel.

The only thing I'm willing to talk about is why the extra notification
message is sent, and how userspace (strace) is supposed to recognize it.
FWIW the backtrace (system tap was at __group_send_sig_info):

 0xa0000001000b1a60 : __group_send_sig_info+0x0/0x180 []
 0xa0000001000b1e30 : do_notify_parent_cldstop+0x250/0x2c0 []
 0xa0000001000b2230 : ptrace_stop+0x2b0/0x3c0 []
 0xa0000001000b5200 : get_signal_to_deliver+0x200/0xa40 []
 0xa000000100035920 : ia64_do_signal+0xa0/0xee0 []
 0xa000000100014b60 : do_notify_resume_user+0x100/0x160 []
 0xa00000010000d040 : notify_resume_user+0x40/0x60 []
 0xa00000010000cf40 : skip_rbs_switch+0xf0/0x150 []
 0xa000000000010620 : __kernel_syscall_via_break+0x0/0x20 []

Regards,
Petr Tesarik

--

From: Roland McGrath
Date: Tuesday, June 3, 2008 - 2:01 pm

What do you mean by "extra"?  There is a SIGTRAP sent after execve
completes when ptraced, even when PTRACE_SYSCALL is not being used.
So for an execve that succeeds under PTRACE_SYSCALL, there is a
ptrace_notify at syscall entry, then a SIGTRAP queued (i.e., not seen
by ptrace if blocked), then a ptrace_notify at syscall exit.  If
that's what's happening (including the blocked SIGTRAP not being seen
by the ptracer, i.e. strace), then there is no mystery (and no bug).


Thanks,
Roland
--

From: Luck, Tony
Date: Tuesday, June 3, 2008 - 2:31 pm

This might not be the same bug ... but I do have a definite 100%
reproducible bug (latest git kernel, old version of strace (4.5.15-1.el4.1))

Run:

	$ strace -o logit -f make

in any directory where make is actually going to have to do some
work.  You'll see that the command hangs after make outputs the
first action that it will take.  Looking at the stack traces of
the 3 processes involved it seems that make forked, the child
stopped in ptrace waiting for some action from strace, but strace
isn't woken from its sleep in wait().

Backtrace of pid 6442 (strace)

Call Trace:
 [<a0000001007069b0>] schedule+0x11f0/0x1380
                                sp=e0000001b28cfdb0 bsp=e0000001b28c0e00
 [<a0000001000842d0>] do_wait+0x1110/0x1520
                                sp=e0000001b28cfdd0 bsp=e0000001b28c0d58
 [<a0000001000849c0>] sys_wait4+0x140/0x1a0
                                sp=e0000001b28cfe30 bsp=e0000001b28c0cd8
 [<a00000010000aa60>] ia64_ret_from_syscall+0x0/0x20
                                sp=e0000001b28cfe30 bsp=e0000001b28c0cd8
 [<a000000000010740>] __kernel_syscall_via_break+0x0/0x20
                                sp=e0000001b28d0000 bsp=e0000001b28c0cd8

Backtrace of pid 6443 (make)

Call Trace:
 [<a0000001007069b0>] schedule+0x11f0/0x1380
                                sp=e0000001b768fb40 bsp=e0000001b7680d58
 [<a000000100707800>] schedule_timeout+0x40/0x180
                                sp=e0000001b768fb60 bsp=e0000001b7680d28
 [<a000000100706d60>] wait_for_common+0x220/0x380
                                sp=e0000001b768fb90 bsp=e0000001b7680cd8
 [<a000000100706f00>] wait_for_completion+0x40/0x60
                                sp=e0000001b768fbf0 bsp=e0000001b7680cb8
 [<a0000001000794d0>] do_fork+0x430/0x4a0
                                sp=e0000001b768fbf0 bsp=e0000001b7680c60
 [<a00000010000a340>] sys_clone+0x60/0x80
                                sp=e0000001b768fc20 bsp=e0000001b7680c10
 [<a00000010000a990>] ...
From: Roland McGrath
Date: Tuesday, June 3, 2008 - 3:13 pm

This trace (do_fork->wait_for_completion) tells us this is a vfork call.

This is the normal trace for the child having received a signal and stopped
to tell ptrace about it (not a syscall tracing stop).

I think you need to look into what strace is doing.  There is far too much
going on to know much of anything just from the kernel state where the
processes sit.  In particular, the sequence of ptrace and wait calls strace
made.  If the same strace (identical everything) behaved differently with
an older kernel, then compare the sequence of ptrace and wait calls and see
where it differs.


Thanks,
Roland
--

From: Luming Yu
Date: Tuesday, June 10, 2008 - 1:23 am

For this test case, utrace doesn't work either.
--

From: Luming Yu
Date: Tuesday, June 3, 2008 - 7:16 pm

This is exactly the problem I suspected and was trying to address in my hack.
Since several processes are involved in this pretty complex ptrace
scenario, I need to capture the context of all processes with kdump to
confirm that this is the exact root cause of the problem. But kdump
doesn't work for me; I'm trying to solve that now.

I'm also in doubt about the semantic correctness of the test case.
Since SIGTRAP is so necessary for ptrace to work, is it legitimate to
block it in the test case?

One more thing I need to say:
the same strace works with a utrace-enabled kernel on IA64. If the bug
is in strace, how could that happen?
--

From: Petr Tesarik
Date: Wednesday, June 4, 2008 - 2:16 am

No idea, but send me the strace.log file from running

strace -o strace.log strace -f -o log.txt ./test1

and I may be able to tell.

Petr Tesarik
--

From: Petr Tesarik
Date: Thursday, June 5, 2008 - 4:16 am

Hm, I think without utrace, it gets out-of-sync once, so syscall entries
and exits are swapped from that point on. With utrace, it gets
out-of-sync _twice_, so it eventually looks fine. But the strace output
definitely looks incorrect even with utrace:

5718  execve("./test2.sh", [], [/* 23 vars */]) = 1
5718  execve("", [0x840c001000100003, 0x26230c14203032, 0x8cb0008800140a81, 0xa643100801808402, 0x2400905000040088, 0x11600a0072000001, 0xad814a00402e0, 0x2200012464009344, 0x1180418512c40026, 0x400003081880008, 0x2100010840910404, 0x8045120000800003, 0x6400000c0000600, 0xc20063440501400, 0x1048015002008081, 0xe02226005008c010, ...], [/* 0 vars */]) = 1
5718  access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)

Note that strace missed a brk() syscall, although I can actually see
this in the other trace you sent me:

wait4(-1, [{WIFSTOPPED(s) && WSTOPSIG(s) == SIGTRAP}], __WALL, NULL) = 5704
ptrace(PTRACE_PEEKUSER, 5704, psr, NULL) = 4398046511120
ptrace(PTRACE_PEEKUSER, 5704, r15, NULL) = 1060
ptrace(PTRACE_SYSCALL, 5704, 0x1, SIG_0) = 0

Look at the value of r15, and compare it with unistd.h:
#define __NR_brk                        1060

I _guess_ this is caused by the fact that test2.sh is a shell script, so
the kernel executes the shell, and maybe utrace produces a second execve
notification in this case? Roland, can you shed some light?

Petr Tesarik
--

From: Roland McGrath
Date: Thursday, June 5, 2008 - 5:07 pm

Not really.  The utrace kernels Luming is trying are intended to match the
vanilla ptrace behavior.  I don't think it's very useful to worry about the
difference between some utrace kernel and the current vanilla kernel.
Let's just look at what the current vanilla kernel is doing and compare
that to what an older vanilla kernel did if older versions produced
different results for the test case.


Thanks,
Roland
--

From: Luming Yu
Date: Monday, September 8, 2008 - 8:06 pm

I just had a chance to re-check the problem. I have spotted the place where
strace misbehaves when SIGTRAP is blocked by a task being straced with the
-f flag (follow forks).
The problem is that after the debuggee blocks SIGTRAP, the task's exec
path does not wake up the debugger (strace) with this signal. But strace
doesn't know about that, and expects such a wake-up. Upon receiving a
subsequent wake-up, strace still thinks it is the wake-up from the "SIGTRAP
in the previous exec path". From then on, debugger and debuggee
misunderstand each other.
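
To make the misunderstanding concrete, here is a toy model of it. This is
not strace's code. It assumes a tracer that simply toggles between "entry"
and "exit" on each stop and unconditionally skips the first stop after an
execve exit, on the assumption that it is the exec SIGTRAP; all type and
function names are invented for the illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* What a stop really is, from the tracee's side. */
enum real_stop { REAL_ENTRY, REAL_EXIT, REAL_EXEC_SIGTRAP };

/* What the toy tracer believes the stop is. */
enum belief { SEEN_ENTRY, SEEN_EXIT, SEEN_IGNORED };

struct stop {
    enum real_stop real;
    bool is_execve;         /* stop belongs to an execve syscall */
};

struct toy_tracer {
    bool in_syscall;        /* toggled on every stop that is not skipped */
    bool cur_is_execve;     /* recorded at what the tracer thinks is entry */
    bool skip_next_stop;    /* "the next stop is the exec SIGTRAP" */
};

/* Classify one stop the way the toy tracer would. */
static enum belief toy_handle_stop(struct toy_tracer *t, struct stop s)
{
    if (t->skip_next_stop) {
        t->skip_next_stop = false;
        return SEEN_IGNORED;
    }
    if (!t->in_syscall) {
        t->in_syscall = true;
        t->cur_is_execve = s.is_execve;
        return SEEN_ENTRY;
    }
    t->in_syscall = false;
    if (t->cur_is_execve)
        t->skip_next_stop = true;   /* expect the SIGTRAP from exec */
    return SEEN_EXIT;
}

/* Count the stops where the tracer's belief disagrees with reality. */
static size_t toy_count_mismatches(const struct stop *stops, size_t n)
{
    struct toy_tracer t = { false, false, false };
    size_t bad = 0;

    for (size_t i = 0; i < n; i++) {
        enum belief b = toy_handle_stop(&t, stops[i]);
        bool ok = (stops[i].real == REAL_ENTRY && b == SEEN_ENTRY) ||
                  (stops[i].real == REAL_EXIT && b == SEEN_EXIT) ||
                  (stops[i].real == REAL_EXEC_SIGTRAP && b == SEEN_IGNORED);
        if (!ok)
            bad++;
    }
    return bad;
}
```

Feeding this model the sequence execve entry, execve exit, exec SIGTRAP,
then further syscalls gives no mismatches. Dropping the SIGTRAP stop from
the sequence makes the model swallow the next syscall entry and swap entry
and exit for every stop after it.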

With the test case posted before in this thread and a customized
kernel, I got the following debug info from kernel:

syscall_trace_enter exiting (4717)
report_exec:pid= 4717 parent=4715, real_parent=4716,current->ptrace =00000001,
report_exec:pid= 4717 parent=4715, real_parent=4716,current->ptrace =00000001,
syscall_trace_leave entering (4717, syscall:1033)
 ..sig_ignored=1..
4715, PTRACE_PEEKUSR 4717 @addr 00000830 return 00000000
4715, PTRACE_PEEKUSR 4717 @addr 000008c0 return 00000000
4715, PTRACE_PEEKUSR 4717 @addr 000008d0 return 00000000
4715, PTRACE_PEEKUSR 4717 @addr 000008d0 return 00000000
4715, PTRACE_PEEKUSR 4717 @addr 000008c0 return 00000000
syscall_trace current(4717)->exit_code(0)
syscall_trace_leave exiting (4717)
syscall_trace_enter beginning (4717, syscall: 1060)
 ..sig_ignored=1..
4715, PTRACE_PEEKUSR 4717 @addr 00000830 return 00000010
4715, PTRACE_PEEKUSR 4717 @addr 000008b8 return 00000424
syscall_trace current(4717)->exit_code(0)
syscall_trace_enter exiting (4717)

After deleting the line that blocks SIGTRAP in the test case and rerunning it, I got:

syscall_trace current(4724)->exit_code(0)
syscall_trace_enter exiting (4724)
report_exec:pid= 4724 parent=4722, real_parent=4723,current->ptrace =00000001,
report_exec:pid= 4724 parent=4722, real_parent=4723,current->ptrace =00000001,
syscall_trace_leave entering (4724, syscall:1033)
 ..sig_ignored=1..
4722, PTRACE_PEEKUSR 4724 @addr 00000830 return 00000000
4722, PTRACE_PEEKUSR 4724 @addr 000008c0 ...
From: Roland McGrath
Date: Tuesday, September 9, 2008 - 10:55 pm

This subject line is quite far from what you're now addressing, btw.

I certainly wouldn't recommend that patch.  The interference between ptrace
uses of SIGTRAP and a program that blocks SIGTRAP is just a fact of life
with ptrace.  If you wanted to change the exec SIGTRAP to be more
consistent with other signals induced by debuggers (such as breakpoints and
single-step), you could just change send_sig to force_sig.  That unblocks
SIGTRAP and resets its handler when it's blocked.  Conversely, a tracer
program could use PTRACE_O_TRACEEXEC if it cared to deal with tracing a
process that blocks SIGTRAP.  (That uses ptrace_notify rather than any
actual signal that can be blocked.)
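
As a rough sketch of the PTRACE_O_TRACEEXEC route from the tracer's side,
assuming a glibc system where <sys/ptrace.h> provides PTRACE_SETOPTIONS,
PTRACE_O_TRACEEXEC and PTRACE_EVENT_EXEC (the two helper names are made up
here, and this is not taken from strace):

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Ask the kernel to report a successful exec as a ptrace event stop.
 * The tracee must already be attached and stopped.
 * Returns 0 on success, -1 on error. */
static long enable_exec_events(pid_t tracee)
{
    return ptrace(PTRACE_SETOPTIONS, tracee, NULL,
                  (void *)(long)PTRACE_O_TRACEEXEC);
}

/* True if a wait status describes a PTRACE_EVENT_EXEC stop, as opposed
 * to a plain SIGTRAP stop. */
static int is_exec_event_stop(int status)
{
    return WIFSTOPPED(status) &&
           (status >> 8) == (SIGTRAP | (PTRACE_EVENT_EXEC << 8));
}
```

A tracer would call enable_exec_events() once after the first stop of each
tracee and test every wait status with is_exec_event_stop(), so the exec
notification no longer depends on whether the tracee has SIGTRAP blocked.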

But, none of this goes to your actual problem at all AFAICT.  You've said
your actual problem with strace only arises on ia64.  The behavior of
ptrace'd exec, and the effect of a program blocking SIGTRAP, is exactly the
same across all machines.  Changing it does not explain your situation.
I'm in favor of understanding what is actually going on before changing
things.

If the SIGTRAP from exec is actually seen by strace on a different machine
like it's not on ia64, that suggests that something different happened
between the two.  In the case that doesn't exhibit the problem, either
SIGTRAP got unblocked by something before the exec, or userland (strace)
behaves differently so that it doesn't care about not seeing that SIGTRAP.


Thanks,
Roland
--

From: Luming Yu
Date: Tuesday, September 16, 2008 - 1:50 am

The reason is not as complicated as I thought. It is because x86
strace doesn't test TCB_WAITEXECVE. Please take a look at defs.h
(strace-4.5.16), where the flag is defined as follows:

#ifdef LINUX
# if defined(ALPHA) || defined(SPARC) || defined(SPARC64) ||
defined(POWERPC) || defined(IA64) || defined(HPPA) || defined(SH) ||
defined(SH64) || defined(S390) || defined(S390X) || defined(ARM)
#  define TCB_WAITEXECVE 02000  /* ignore SIGTRAP after exceve */
# endif

I'm not an strace expert, so I have no idea what TCB_WAITEXECVE means,
or why x86 strace can handle the SIGTRAP after execve but all the other
arches cannot.
Could anybody shed some light on the flag?
--

From: Roland McGrath
Date: Wednesday, September 17, 2008 - 10:01 am

So is there or is there not any regression (change) in the kernel's behavior?

If you are now just talking about strace's own code, then the place for
that is strace-devel@lists.sourceforge.net, not any kernel lists.


Thanks,
Roland
--

From: Luming Yu
Date: Wednesday, September 17, 2008 - 10:44 pm

There is another mystery that I completely don't understand and that
probably needs your help.
Why do kernels with the utrace patch applied just work on ia64?
It sounds like utrace delivers an extra SIGTRAP compared with a kernel
without the utrace code.
Does that make sense? Hmm.. Sounds like a bug to me..

--

From: Luming Yu
Date: Monday, May 26, 2008 - 11:34 pm

You are right, I just found that the outputs of two different threads are mixed!
Please just ignore the call trace above.
--

From: Luming Yu
Date: Tuesday, May 27, 2008 - 1:48 am

I tend to believe signal "deliver" in the ptrace_notify path is impossible.
Given that the test case just works on ia64 if SIGTRAP is not blocked,
I still believe some signals have been consumed without letting the
debugger know, but that is just my guess. We still need to investigate
how it happened. I assume you know how the test case failed. ps
shows the test case hanging in process 6553, which seems to be impossible
... Please help me understand what kind of situation could cause this.
That would be very helpful.

root      6516  0.0  0.0 57344 5184 ttyS1    Ss+  19:33   0:00 -bash
root      6551  0.0  0.0  4992 2560 ttyS1    T    19:33   0:00  \_ strace -f ./test1
root      6552  0.0  0.0  4416 1792 ttyS1    T    19:33   0:00      \_ ./test1
root      6553  0.0  0.0 56832 4288 ttyS1    T    19:33   0:01          \_ /bin/bash ./test2.sh

Thanks,
Luming
--

From: Luming Yu
Date: Wednesday, May 28, 2008 - 2:14 am

A correction about the test case hang: the ps output should look
like the following:
13925 S+ strace -f ./test1
13926 S+ ./test1
13927 T+  /bin/bash ./test2

I'm trying upstream kdump to get more detailed data to help analyze
the scenario.
But unfortunately the upstream kernel just hangs when I echo c to
sysrq-trigger.  After downgrading the kernel to 2.6.22, 'echo c' doesn't
hang, but I just get a "zero" dump file...
I will try F9 later. But from the symptom, shouldn't we ask this
question: why can't process 13927 wake up?  Roland, please confirm
whether ptrace_untrace is the only way to wake up a traced process.
Working back from the callers of ptrace_untrace, maybe we can find out
why my RFC patch happens to fix the problem. Any suggestions are welcome!
--Luming
--

From: Luming Yu
Date: Monday, June 2, 2008 - 11:02 pm

Upstream kdump doesn't work! I don't know what caused the regression,
possibly kexec-tools...
This needs investigation!
--

From: Roland McGrath
Date: Friday, May 30, 2008 - 1:05 am

I'd say the first thing to do is understand the sequence of ptrace calls,
wait results seen by the ptracer, and the ptrace_stop() path, in your test
case.  Compare that on x86 and ia64, where the behavior you see differs.
When you pinpoint the difference in that sequence, we will have some sense
of what it is that really matters to your test.


Thanks,
Roland
--
