In a stunning turn of events, I've actually been able to make another -rc
release despite all the discussion (*cough*flaming*cough*) about other
issues, and we now have a brand-spanking-new Linux 2.6.22-rc5 release
out there!
As usual, you get *five* kernels for the price of one! You can get the
tar-ball, the patches, or the git tree. And the tar-balls and patches come
not just in the old gzipped format, you can have them bzip2'd too! So you
have an incredible selection of kernels to choose from, fresh from the
oven!
By a happy turn of events, we have even succeeded in solving a number of
annoying regressions! Special thanks go to Tejun Heo, who worked night and
day, and finally reproduced and fixed the odd suspend/resume problems in
the ATA layer that got introduced by some previous cleanups.
But that's not all! We have other regressions fixed, and we have a bonus
round of random architecture fixes, and an extra-bonus round of Blackfin
updates.
[ Ok, so I can't really call the Blackfin fixes for regressions, but let's
face it, it's not like anybody is actually going to care. So doing the
Blackfin merge at this point may not be strictly proper -rc policy, but
hey, it's a new architecture. It certainly won't cause any more
regressions, if only because there's not a whole lot to regress there.
The same largely goes for the libertas wireless driver ;^/ ]
On a more serious note, I have to admit that I'm a bit unhappy with the
pure *volume* of changes this late in the game. I was really wanting to
stop some of the merges, but while not all of it really fixed regressions,
there really are a lot of bugfixes in there.
Yeah, the *bulk* of the patch is things like PA-RISC, Blackfin and the
libertas driver, but I really *really* wish people would also more
obviously be working on the regression list.
On that note - I doubly appreciate the people who *did* work on
regressions. Thanks, guys,
Linus
---
Adam Litke (1):
hugetlb: fix ...signalfd still has the broken behavior w.r.t. signal delivery to threads. Is this going to get fixed before 2.6.22 proper is released, or should it just be disabled entirely so no userspace apps grow to depend on current wrong behavior? -- Nicholas Miell <nmiell@comcast.net> -
At the moment, with Ben's patch applied, signalfd can see all group-sent signals, and locally-directed thread signals. Linus, we can leave this as is, or we can use the ququed-signalfd that was implemented in the first versions of signalfd. In such case, since signalfd hooks to the sighand, all signals will be visible to signalfd and they will not compete against dequeue_signal with the tasks. So there will be no races in the queue retrieval. The issue that remained to be solved was a simple way to limit memory allocated by the queue. What do you prefer? - Davide -
But there's still no way for multiple threads to read from a single signalfd and get their own thread-specific signals in addition to process-wide signals, right? I think this was agreed to be the least -- Nicholas Miell <nmiell@comcast.net> -
Multiple threads can wait on the signalfd. Each one will dequeue either its own private signals (tsk->pending) or the process shared ones (tsk->signal->shared_pending). This will be the behaviour once Ben's patch is applied. - Davide -
Ah, ok, that's great. I didn't see anything like that in linux.git, missed Ben's patch to the list, and mixed up your description with the original TIF_SIGPENDING work. Sorry for the confusion. -- Nicholas Miell <nmiell@comcast.net> -
They will still race on the signal queue though. That is, if you create a signalfd and three threads are doing a read(2) over it, if a signal is sent to the thread group (tsk->signal->shared_pending), only one will fetch the signal. I think that's the correct behaviour anyway. Andrew or Linus, did you get Ben's patch? - Davide -
It might have been missed... I can resend later today. Ben. -
I did indeed just miss it. I intended to apply it (and actually thought I had), but I see it's still just an email in that long thread. (It's often a good idea to re-write the subject line and make it be that standard "[PATCH] ..description..", because that just makes it show up much better when I go through my unread emails.. Not that that is any kind of *guarantee* that I won't miss it). Anyway, no need to re-send, it's now *really* in my queue of things to apply. Linus -
Davide Libenzi wrote:
>
> On Sun, 17 Jun 2007, Nicholas Miell wrote:
>
> > But there's still no way for multiple threads to read from a single
> > signalfd and get their own thread-specific signals in addition to
> > process-wide signals, right? I think this was agreed to be the least
> > surprising behavior.
>
> Multiple threads can wait on the signalfd. Each one will dequeue either
> its own private signals (tsk->pending) or the process shared ones
> (tsk->signal->shared_pending). This will be the behaviour once Ben's patch
> is applied.
The commited "Fix signalfd interaction with thread-private signals"
(commit caec4e8dc85e0644ec24aeb36285e1ba02da58cc) doesn't implement
this.
We can do something like
int signalfd_dequeue_signal(struct task_struct *tsk, sigset_t *mask, siginfo_t *info)
{
if (tsk->tgid == current->tgid)
tsk = current;
return dequeue_signal(tsk, mask, info);
}
(still I can't understand why should we change signalfd).
Oleg.
-
Indeed, if you want what Davide described, you need to also change signalfd side. The patch I did merely prevents another thread from dequeuing somebody else private signals. As it stands, signalfd will still try it... and only get the shared ones. It will not however get the private signals of the caller if it is different from the context associated with the signalfd. Ben. -
Yes I see, but why do we need this change? Yes, we can dequeue SIGSEGV from another thread. Just don't do it if you have a handler for SIGSEGV? Oleg. -
Well, for such synchronous signals, it's a fairly stupid idea, especially since you can't predict who will get it. Signals such as SEGV are forced-in, which means they are force-unblocked. Thus, you can't know for sure whome of signalfd or the target thread will get it first, depending on who catches the siglock first I suppose. In one case, you'll manage to steal it, in the other, you'll thread will be killed. Ben. -
Yes. As I said, I think this falls into the "just don't do that" category. But nothing bad happens from the kernel POV. Also, suppose that some thread does for (;;) signal(SIGSEGV, SIG_IGN); Now we have the same situation. do_sigaction() can steal SIGSEGV from another thread. Perhaps, the proposed behaviour > Multiple threads can wait on the signalfd. Each one will dequeue either > its own private signals (tsk->pending) or the process shared ones > (tsk->signal->shared_pending). is more convenient, I can't judge. If we implement this, sys_signalfd(-1) in essence means "attach to the thread group, not current". This also makes sense. But what we have now (with this patch applied) is a bit strange, imho. Oleg. -
Actually, that shouldn't be possible. See "force_sig_info()". It does not allow blocking or ignoring forced signals. We will reset such a signal handler to SIG_DFL, and unlock it. So if you get a SIGSEGV while SIGSEGV's are blocked or ignored, the kernel *will* kill you. No questions asked. Linus -
Yes, and no. Yes, force_sig() unblocks and un-ignores the signal. However, unlike group-wide signals, thread-specific signals do not convert themselves to SIGKILL on delivery. The target thread should dequeue SIGSEGV and then it calls do_group_exit(). Before it does so, another thread doing signal(SIGSEGV, SIG_IGN) can steal the signal. Of course, this is unlikely, and the target thread will take page fault again. The same for signalfd. Oleg. -
No it couldn't. Why? Because the target thread is the one that *caused* the SIGSEGV in the first place. It's not going to dequeue *anything*. It's either going to take the SIGSEGV, or it's going to get another SIGSEGV and now it's no longer masked/handled and it's going to die. Linus -
Hmm, can't understand. Yes, the target thread is the one that caused the SIGSEGV, it sends the signal to itself. entry.S:ret_from_exception should notice this signal and _dequeue_ it, no? This signal could be stealed by signal(SIG_IGN) which runs after it Yes sure. As I said, > and the target thread will take page fault again. My point was that it is _possible_ to steal a thread-local SIGSEGV even without signalfd, nothing bad should happen. Oleg. -
Right. But it will dequeue it by *taking* it. IOW, this has absolutely nothing to do with signalfd. That makes no sense. You don't "steal" it. You take it. It's what SIGSEGV (and _any_ signal) has always been about. You get the signal, enter the signal handler, and handle it. No "stealing". No signalfd, no *nothing*. Just normal signal behaviour. Linus -
_Another_ thread could steal SIGSEGV via read(signalfd) without Ben's patch. This is what Ben and Davide are worried about. I think we should not worry, we have the same situation if this "another" thread does for (;;) signal(SIGSEGV, SIG_IGN); do_sigaction() does rm_from_queue_full(). Oleg. -
Yeah well... I wanted to have the least surprise path... that is, without my patch, signalfd will "sometimes" steal the SIGSEGV depending on who races to the lock first, thus causing the target thread to re-execute the faulting instruction and taking another SIGSEGV, and sometimes not. It's bad from both the faulting thread point of view and the signalfd use who gets signals "sometimes" without any guarantee. I like the current code that at least implement a precise semantic for all thread local signals -> they are only ever delivered to that thread, period. If you really want to do funky things from outside, you can still do ptrace ;-) Ben. -
OK. But in that case I think we should go further, and make signalfd "per process", not "per thread", see http://marc.info/?l=linux-kernel&m=118241815219430 Every thread gets its own local signals plus shared ones. (I promise, this is the last piece of spam from me on this topic, but please-please-please nack this patch explicitly if you don't like it :) Oleg. -
No, I think your patch do make sense. That is, it -does- make sense to be able to create a signal singalfd in a process and have N threads reading from it and getting either shared signals or their local private signals. I just don't like the actual implementation of it by changing the task pointer on the fly... My main issue is a matter of consistency of the signalfd API as a whole... the whole bloody thing is instanciated & attached to a thread in the first place. Maybe we should change that and say that one instanciates a signalfd on a thread group... that is, it always gets attached to the leader. It might well be that signalfd's concept of context is wrong in the first place and it should be attached to processes rather than threads and that made more explicit in the first place... but that leaves the door open to what a write() API should be ... Ben. -
It does exactly so, please note this chunk
@@ -330,7 +339,7 @@ asmlinkage long sys_signalfd(int ufd, si
init_waitqueue_head(&ctx->wqh);
ctx->sigmask = sigmask;
- ctx->tsk = current;
Exactly!
Oleg.
-
Yup, looks like I was looking at a wrong patch... I think it's the right thing to do indeed. Ben. -
Quite frankly, it strikes me that if we want to do this, then we shouldn't save the _process_ information at all, we should save the "sighand" instead. So either we save the process info, or we save the sighand, but saving the "group_leader" seems totally bogus. Especially as the group leader can change (by execve()). One thing that strikes me as I look at that function is that the whole signalfd thing doesn't seem to do any reference counting. Ie it looks totally buggy wrt passing the resulting fd off to somebody else, and then exiting in the original process. What did I miss? Linus -
Probably nothing... doesn't look good. What are the lifetime rules of a struct sighand tho ? Ben. -
Ah got it, signalfd_detach() in include/linux/signalfd.h from exit_signal plus some rcu bits in signalfd lock/unlock. Ben. -
You could just get rid of the process/sighand/whatever reference entirely and just make reads on a signalfd always dequeue signals for the current thread. You'd lose the ability to pass signalfds around to other processes, but I'm not convinced that is even useful. (But I'm sure somebody smarter than me has a valid use case and would love to share :-) -- Nicholas Miell <nmiell@comcast.net> -
Wasn't it you that bitched (just a few days ago) because multiple threads could not use the same signalfd and they (by your initial thought) had to create one per thread? - Davide -
Nevermind, I wasn't entirely clear on the reason why signalfd_ctx had a tsk pointer. (I wrongly thought it was a vestige of the mechanism for the original delivery semantics.) -- Nicholas Miell <nmiell@comcast.net> -
He said multiple process and you say multiple threads... If signalfd isn't attached to any context, it would then be useable by all threads in a process, delivering them their private signals and the process shared signals. Makes sense to me. By removing that context thing, you lose the ability to listen to some other -process- signals, which is probably a bad idea in the first place anyway... if you're going to do that, use ptrace (yuck) :-) Now, you -might- have valid uses for that later ability, but if not, it then makes some sense to only "attach" when an actual read or poll is done and only for the duration of that read/poll and only for that reader/poller (not the whole signalfd instance). I think that's what Nicholas means... and it may even simplify the code. Ben. -
That is what I was suggesting, but I don't understand the internals of Linux signal delivery enough to know if it is possible without unpleasant contortions to make it work. -- Nicholas Miell <nmiell@comcast.net> -
We intercept the sighand going out of business, and we do not access it anymore after that (by the mean of signalfd_lock() returning zero). I'd be OK with Oleg patch, although I really prefer signalfd being more flexible (that is, with sync signals disabled in signalfd, and with Ben's patch reverted). Unless clear point of breakage are shown with such approach. - Davide -
Note that exec() can change ->sighand as well, so in general this can't help. Currently not a problem, exec() detaches process from signalfd anyway. This reminds me, we have a similar problem with !CPUCLOCK_PERTHREAD() cpu-timers, iirc. They can survive after exec(), but only if it was ->group_leader who does exec. Oleg. -
Oh, absolutely. I thought we all agreed that Ben's patch was the right thing. It's been merged for some time now (even if it hasn't been merged for as long as I initially _thought_, since I had forgotten to apply the the first time around ;) So yes, all my emails have been about the _current_ code. Linus -
I believe it can be confusing to have private signals dequeued from another thread. The kernel expect those to be dequeued by the target thread. - Davide -
Well, I think the kernel doesn't make any assumptions on that. It can't guarantee the signal will be actually dequeued, to begin with. (That said, I probably missed something, in that case I'd like to be educated. This is the real reason why I am making the noise :) Oleg. -
What happens if a task gets a page fault that results in a SIGSEGV, and another task steals the SIGSEGV using a signalfd, before the faulted task has the chance to get into do_notify_resume() and notice it? ;) - Davide -
Yes, of course. I was waiting for Ben's patch to go in, before fixing signalfd. I did not know if Linus would have acked that. - Davide -
OK, this means that signalfd becomes "thread group wide". In that case I'd suggest to also change sys_signalfd(-1), - ctx->tsk = current; + ctx->tsk = current->group_leader; Oleg. -
Actually, I think signalfd is fine as is, with Ben's patch applied. Signalfd should only fetch shared signals, not specific ones (in any case). - Davide -
The only advantage of that additional patch is that it will allow any thread to fetch its own private signals via signalfd, regardless of what context is attached to the signalfd in the first place. Without the patch (with what's currently in Linus tree), a thread will only get its own private signals if it's also the thread that created the signalfd (and thus is attached to the signalfd "context"). I don't mind either way, up to you guys to decide what semantics you want. Ben. -
Ok, why instead don't we go for something like the attached patch?
We exclude sync signals from signalfd, but we don't limit signalfd to
shared signals. Ie, we should be able to fetch a signal sent with
sys_tkill() to threads different from "current", that otherwise we would
not be able to fetch.
Ben, sorry but my memory sucks ... the "notifier" thing was fine in that
case, no?
- Davide
---
fs/signalfd.c | 10 +++++++++-
kernel/signal.c | 6 +-----
2 files changed, 10 insertions(+), 6 deletions(-)
Index: linux-2.6.mod/fs/signalfd.c
===================================================================
--- linux-2.6.mod.orig/fs/signalfd.c 2007-06-19 18:58:15.000000000 -0700
+++ linux-2.6.mod/fs/signalfd.c 2007-06-19 19:02:57.000000000 -0700
@@ -26,6 +26,14 @@
#include <linux/anon_inodes.h>
#include <linux/signalfd.h>
+/*
+ * Signals that a signalfd should never fetch.
+ */
+#define SIGNALFD_EXCLUDE_MASK (sigmask(SIGILL) | sigmask(SIGKILL) | \
+ sigmask(SIGSTOP) | sigmask(SIGTRAP) | \
+ sigmask(SIGFPE) | sigmask(SIGSEGV) | \
+ sigmask(SIGBUS))
+
struct signalfd_ctx {
struct list_head lnk;
wait_queue_head_t wqh;
@@ -320,7 +328,7 @@
if (sizemask != sizeof(sigset_t) ||
copy_from_user(&sigmask, user_mask, sizeof(sigmask)))
return error = -EINVAL;
- sigdelsetmask(&sigmask, sigmask(SIGKILL) | sigmask(SIGSTOP));
+ sigdelsetmask(&sigmask, SIGNALFD_EXCLUDE_MASK);
signotset(&sigmask);
if (ufd == -1) {
Index: linux-2.6.mod/kernel/signal.c
===================================================================
--- linux-2.6.mod.orig/kernel/signal.c 2007-06-19 18:55:07.000000000 -0700
+++ linux-2.6.mod/kernel/signal.c 2007-06-19 18:58:43.000000000 -0700
@@ -365,11 +365,7 @@
{
int signr = 0;
- /* We only dequeue private signals from ourselves, we don't let
- * signalfd steal them
- */
- if (tsk == current)
- signr = __dequeue_signal(&tsk->pending, mask, info);
+ signr = ...I'm generally nervous about the idea of letting signalfd dequeue somebody else private signals... even if we filter out SEGV's and friends but I'll let Linus decide there. Regarding the notifier, it's dodgy in most cases I'd say but I suppose it should be allright to only worry about "current" and not the target task there. Ben. -
I believe that once we exclude synchronous signals from being dequeued, we should be fine. Limiting all signals sent with sys_tkill() from being dequeued with a signalfd is a too restricting behaviour IMO. - Davide -
Try to put a "GPLV3" somewhere in the subject. It might have better chances to get through Linus radar these days :D - Davide -
What if we pass a signalfd to another process with unix socket? Which signals should be dequeued in that case? Only shared ones? I tried to follow this discussion, but I can't understans why the current behaviour is bad. Yes, a thread has to create its own signalfd if it wants to dequeue private signals. But this is simple and understandable. May be I missed something else ? Oleg. -
Hello, MAINTAINERS says riscom8 is orphaned so not sure if anybody cares. Spotted this when playing with modprobe walking /lib/modules/`uname -r`/kernel in a loop ;) BUG: sleeping function called from invalid context at kernel/mutex.c:86 in_atomic():0, irqs_disabled():1 [<c0104542>] show_trace_log_lvl+0x1a/0x30 [<c01050d0>] show_trace+0x12/0x14 [<c010515b>] dump_stack+0x15/0x17 [<c0112bd4>] __might_sleep+0xa8/0xae [<c04148ff>] mutex_lock+0x15/0x1f [<c015f82f>] __unregister_chrdev_region+0x1a/0x85 [<c015f8f7>] unregister_chrdev_region+0x2f/0x3e [<c029c710>] tty_unregister_driver+0x30/0x100 [<df082011>] rc_release_drivers+0x11/0x20 [riscom8] [<df08960c>] riscom8_init_module+0x539/0x57a [riscom8] [<c01345c2>] sys_init_module+0x11b/0x167e [<c0103ee6>] sysenter_past_esp+0x5f/0x85 ======================= Regards, Mariusz -
Oh wow.
I wonder why it does that. The code literally does:
save_flags(flags);
cli();
tty_unregister_driver(riscom_driver);
put_tty_driver(riscom_driver);
restore_flags(flags);
and I don't see the point.
(And that's what then causes the warning: tty_unregister_driver will
eventually get a mutex, but the caller had disabled hardware interrupts..
I see absolutely *zero* reason for that whole save_flags/cli/restore_flags
dance at all, so I think the right thing to do is to just remove it.
I'm Cc'ing Alan, in case he cares.
Mariusz, if you actually *use* the dang thing, and really care, you can
send me a tested patch. I can pretty much guarantee that removing those
irq games won't make things any worse, but no way am I going to just
remove it on my own without any users or any testing.
Linus
-
Unfortunately I can test it only the way I did it so far. That means no real hardware :-/ This also implies I do care but not that much ;-) If you want any patches to be tested that way -> feel free to mail me. Mariusz -
