> > I don't think I follow the scenario you have in mind there. When siglockSo you concur that there is no misbehavior from that lock-drop? (I'm all for getting rid of it, just don't want to have a wrong thought in my head about what the precise issues are.) I see. I certainly didn't have in mind permitting that. But if the only thing that can happen is that if the parent called wait already to reap the child, it never gets the SIGCHLD for it, then technically that is fine. (In fact, we are not POSIX-conforming now because pending SIGCHLDs are supposed to be cleared by a wait call made with SIGCHLD blocked when there are no more unwaited-for children ready.) It seemed relevant to share that nit, but I don't think we want to perturb the behavior here any time soon. The status quo is that the CLD_CONTINUED signal-sending call is made by an unrelated task, the one sending the SIGCONT to the child, rather than with child==current as other SIGCHLD's are. I don't think that should matter, because nothing in the do_notify_parent_cldstop code path consults current for anything. With your new plan, that path will be entered instead by some thread in the group that was woken by generating SIGCONT. If it did matter, that would presumably only be better. I guess the issue you meant was that the task_struct argument to do_notify_parent_cldstop is whichever thread woke up and took the siglock first, rather than the recipient of the SIGCONT (usually the group leader). Now I understand your reference to ptrace--it does tsk = tsk->group_leader unless ptraced. AFAICT, what this affects is only the si_pid, si_uid, si_[us]time values in the siginfo_t for the SIGCHLD with si_code = CLD_CONTINUED. Not that ptracers really care one way or the other, but I think the ptrace special case really only sensibly applies to CLD_STOPPED. There aren't going to be CLD_CONTINUED notifications for each and every thread anyway (and I don't think they would be all that useful to a ptracer). So independent of all this, even as things stand now I would do: if (why == CLD_STOPPED && (tsk->ptrace & PT_PTRACED)) to clear that up. I haven't really reviewed that stuff properly (sorry about that, behind on many things). Let's not tie up this can of worms with that one for now. If SIGSTOP was not dequeued yet, then it was pending and was cleared by generating SIGCONT, so nothing happened: no stop, no continue, just a SIGCONT is now pending. (There is also the example where it's SIGTSTP and it was blocked, so there's a case that does not depend on a race happening.) If the group stop has begun, then at least one thread has already begun stopping, and experienced effects like a syscall returning -EINTR. Those effects shouldn't be seen if "nothing happened". But they are expected if the process stopped and then continued very quickly. If that's what happened, then the CLD_STOPPED SIGCHLD was posted before the continue happened. So that was my original rationale. But there are still races. (This can't come up in the blocked-SIGTSTP case, where the behavior is reliably testable from the application perspective.) An unblocked stop signal might have caused a signal_wake_up on some thread that then lost the race for the siglock. The SIGCONT-sender clears the pending SIGSTOP that was the reason that thread's syscall got interrupted. Then that thread will see -EINTR and such effects, but there will never be any CLD_STOPPED report to justify it. If that same thread is not the one to immediately run the SIGCONT handler, there is no other excuse for the side effects. Since that possibility still exists (and is probably impossible to avoid), maybe I shouldn't worry about the group-stop-in-progress case. Clearly this isn't something anyone else ever really worries about but me. Still, when group_stop_count > 0 that means that we know for sure that every thread has TIF_SIGPENDING and so all those that were in syscalls will for sure see -EINTR and such side effects. So I do lean towards having that situation reported as "stopped and then continued" rather than "nothing happened". I certainly think we should clean up all the callers to __group_send_sig_info from outside signal.c, but I did not want to get into that cleanup in the middle of this change. Once we finish hashing out any changes to the locking requirements for calling into the signals code, we can clean those uses up. I like it too. My inclination is still to do the other cleanup first, and also do enough other cleanups first so that we're confident of using a signal_struct.flags bit for this right off. In practice no ptracer ever wanted a CLD_CONTINUED notification or cares about them now. So even "breaking" it wouldn't get us in much trouble. But I don't see any problem at all, especially if we change the do_notify_parent_cldstop behavior to use tsk = tsk->group_leader as I discussed above. This is only ever needed after we have just been in do_signal_stop and it actually stopped (returned 1). That always gets to a goto relock. So this could be outside the for loop. I'd just inline it rather than have another function that conditionally unlocks, which sparse hates. So: spin_lock_irq(¤t->sighand->siglock); /* * long and helpful comment */ if (unlikely(current->signal->notify_parent)) { int why = current->signal->notify_parent; current->signal->notify_parent = 0; spin_unlock_irq(¤t->sighand->siglock); read_lock(&tasklist_lock); do_notify_parent_cldstop(current, why); read_unlock(&tasklist_lock); goto relock; } for (;;) { Thanks, Roland --
| Greg KH | [GIT PATCH] driver core patches against 2.6.24 |
| Heiko Carstens | Re: -mm merge plans for 2.6.23 -- sys_fallocate |
| Bart Van Assche | Integration of SCST in the mainstream Linux kernel |
| Greg KH | Re: Dual-Licensing Linux Kernel with GPL V2 and GPL V3 |
| Gary Thomas | Marvell 88E609x switch? |
| Jarek Poplawski | Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock(). |
| Jeff Garzik | Re: [PATCH] drivers/net: remove network drivers' last few uses of IRQF_SAMPLE_RANDOM |
| Natalie Protasevich | [BUG] New Kernel Bugs |
git: | |
