Now that task->signal can't go away and collect_sigign_sigcatch()
is rcu-safe, task_sig() doesn't need ->siglock.
Remove lock_task_sighand() and unnecessary sigemptyset's, move
collect_sigign_sigcatch() under rcu_read_lock().
Of course, this means we read pending/blocked/etc nonatomically,
but I hope this is OK for fs/proc.
Probably we can change do_task_stat() to avod ->siglock too, except
we can't get tty_nr lockless.
Also, remove the "is this correct?" comment. I think it is safe
to dereference __task_cred(p)->user under rcu lock. In any case,
->siglock can't help to protect cred->user.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
fs/proc/array.c | 26 ++++++++++----------------
1 file changed, 10 insertions(+), 16 deletions(-)
--- 34-rc1/fs/proc/array.c~PROC_3_TASK_SIG_DONT_USE_SIGLOCK 2010-03-22 17:39:42.000000000 +0100
+++ 34-rc1/fs/proc/array.c 2010-03-22 18:36:13.000000000 +0100
@@ -257,30 +257,24 @@ static void collect_sigign_sigcatch(stru
 static inline void task_sig(struct seq_file *m, struct task_struct *p)
 {
-	unsigned long flags;
 	sigset_t pending, shpending, blocked, ignored, caught;
 	int num_threads = 0;
 	unsigned long qsize = 0;
 	unsigned long qlim = 0;
-	sigemptyset(&pending);
-	sigemptyset(&shpending);
-	sigemptyset(&blocked);
 	sigemptyset(&ignored);
 	sigemptyset(&caught);
-	if (lock_task_sighand(p, &flags)) {
-		pending = p->pending.signal;
-		shpending = p->signal->shared_pending.signal;
-		blocked = p->blocked;
-		collect_sigign_sigcatch(p, &ignored, &caught);
-		num_threads = get_nr_threads(p);
-		rcu_read_lock(); /* FIXME: is this correct? */
-		qsize = atomic_read(&__task_cred(p)->user->sigpending);
-		rcu_read_unlock();
-		qlim = task_rlimit(p, RLIMIT_SIGPENDING);
-		unlock_task_sighand(p, &flags);
-	}
+	blocked = p->blocked;
+	pending = p->pending.signal;
+	shpending = p->signal->shared_pending.signal;
+	qlim = task_rlimit(p, RLIMIT_SIGPENDING);
+	num_threads = ...

Except that the data returned might then be inconsistent because you don't
hold a lock as you read the various bits of it.

David
--
Yes. From the changelog:

	Of course, this means we read pending/blocked/etc nonatomically,
	but I hope this is OK for fs/proc.

But I don't think the returned data could be "really" inconsistent from
the /bin/ps pov.

Yes, it is possible that, say, some signal is seen as both pending and
ignored without ->siglock. Or we can report user->sigpending != 0 while
pending/shpending are empty. But this looks harmless to me. We never
guaranteed that /proc/pid/status can't report the "intermediate" state,
and I don't think we can confuse user-space.

Do you agree? Or do you think this can cause problems?

Oleg.
--
I'm not so sure. Operations like sigprocmask and sigaction really have
always been entirely atomic from the userland perspective before. Now it
becomes possible to read from /proc e.g. a blocked set that never existed
as such (one word updated by sigprocmask but not yet the next word).

Thanks,
Roland
--
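
To make the "one word updated but not yet the next" case concrete, here is a minimal user-space sketch. It is not kernel code: `struct model_sigset`, `MODEL_WORDS` and the `model_*` helpers are hypothetical names invented for this illustration, and the interleaving is replayed by hand in a single thread rather than produced by a real race. It only shows that an unlocked word-by-word copy can yield a mask equal to neither the old nor the new value.

```c
#include <string.h>

#define MODEL_WORDS 2	/* assumption: a signal set that spans two words */

/* Hypothetical stand-in for a multi-word sigset_t. */
struct model_sigset {
	unsigned long sig[MODEL_WORDS];
};

/* Copy word by word with no lock held, as a plain assignment may do. */
static void model_snapshot(struct model_sigset *dst,
			   const struct model_sigset *src)
{
	for (int i = 0; i < MODEL_WORDS; i++)
		dst->sig[i] = src->sig[i];
}

static int model_equal(const struct model_sigset *a,
		       const struct model_sigset *b)
{
	return memcmp(a->sig, b->sig, sizeof(a->sig)) == 0;
}

/*
 * Replay one bad interleaving deterministically: the writer has stored
 * word 0 of the new mask but not yet word 1 when the reader copies.
 * Returns 1 if the reader's copy matches neither the old nor the new mask.
 */
static int model_torn_read(void)
{
	const struct model_sigset old_mask = { { 0x1UL, 0x1UL } };
	const struct model_sigset new_mask = { { 0x2UL, 0x2UL } };
	struct model_sigset blocked = old_mask;
	struct model_sigset seen;

	blocked.sig[0] = new_mask.sig[0];	/* writer: first word stored */
	model_snapshot(&seen, &blocked);	/* reader: unlocked copy */
	blocked.sig[1] = new_mask.sig[1];	/* writer: second word stored */

	return !model_equal(&seen, &old_mask) &&
	       !model_equal(&seen, &new_mask);
}
```

Calling `model_torn_read()` from a small `main()` reports whether the mixed state was observed. With a lock held across both the two stores and the copy, the reader could only ever see `old_mask` or `new_mask`.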
Yes, /proc/pid/status can report the intermediate state, I even sent the
updated changelog to document this.

But if you are not sure this is OK, I am worried. Do you think we should
drop this patch? If yes, I won't argue.

Oleg.
--
I'm not dead-set against it, but I am hesitant. My inclination is not to
remove any previous userland atomicity guarantees with regard to
observable signal state in any form. At least, don't do that in part of a
whole cleanup flurry where it is intermixed with lots of changes that
really are pure cleanup with absolutely no userland-observable change. If
it really helps to fragment what was atomic before, then we can consider
it. But let's not be in a hurry.

David mentioned that users who do multiple reads due to using tiny buffers
already don't get atomic sampling. That is certainly true but I don't
think it's relevant. It is completely reliable that you can easily
allocate a buffer big enough to get all the Sig* fields on the first read,
and any user program that might care about the coherence of the data, by
definition, is already doing that.

Thanks,
Roland
--
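
The "big enough buffer, single read" pattern described above might look like the following user-space sketch. The helper names `status_read_once()` and `status_hex_field()` are hypothetical, chosen for this illustration. The sketch assumes the caller passes a buffer larger than the whole status file (a few kilobytes is normally plenty) and that the hexadecimal mask fields use the `Name:<tab>value` layout of /proc/<pid>/status; SigQ is not a hex mask and is not handled here.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Read up to size - 1 bytes with a single read() call and NUL-terminate.
 * Returns the number of bytes read, or -1 on error.
 */
static ssize_t status_read_once(const char *path, char *buf, size_t size)
{
	int fd = open(path, O_RDONLY);
	ssize_t n;

	if (fd < 0)
		return -1;
	n = read(fd, buf, size - 1);	/* one read, no retry loop */
	close(fd);
	if (n < 0)
		return -1;
	buf[n] = '\0';
	return n;
}

/*
 * Find the line that starts with "name:" and parse its hexadecimal value.
 * Returns 0 on success, -1 if the field is absent or malformed.
 */
static int status_hex_field(const char *buf, const char *name,
			    unsigned long long *val)
{
	size_t len = strlen(name);
	const char *line = buf;

	while (line && *line) {
		if (strncmp(line, name, len) == 0 && line[len] == ':')
			return sscanf(line + len + 1, "%llx", val) == 1 ? 0 : -1;
		line = strchr(line, '\n');	/* advance to the next line */
		if (line)
			line++;
	}
	return -1;
}
```

A caller would do something like `status_read_once("/proc/self/status", buf, sizeof(buf))` followed by `status_hex_field(buf, "SigBlk", &mask)`. Every Sig* value parsed from that one buffer then comes from the same read, which is the property a coherence-sensitive program relies on.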
OK. Andrew, please drop

	proc-make-collect_sigign_sigcatch-rcu-safe.patch
	proc-make-task_sig-lockless.patch

OK. Not that I really understand why we need atomicity, but OK.

I was going to remove ->siglock from fs/proc/ completely (except
do_io_accounting), but given that nobody replied to the do_task_stat
patches, OK.

Anyway, these changes are simple, we can reconsider them later.

Oleg.
--
If you have a small userspace buffer, that was previously possible too.

David
--
Ah, yes. I read that as you meant how procfs accessed the actual data
structures, not how the user accessed procfs. It might be worth
clarifying.

I don't know of anything this will affect adversely. In fact, I'm not
sure there was a guarantee that it would be atomic anyway. So as far as
I'm concerned, you can add:

Btw, avoid has an 'i' in it... :-)
--
Another reason to update the changelog ;)

Andrew, please find the updated changelog for
proc-make-task_sig-lockless.patch

If this is not convenient, please ignore or tell me what is the "right"
way to fix the changelog when the patch is already in -mm.

------------------------------------------------------------------------------
Now that task->signal can't go away and collect_sigign_sigcatch()
is rcu-safe, task_sig() doesn't need ->siglock.

Remove lock_task_sighand() and unnecessary sigemptyset's, move
collect_sigign_sigcatch() under rcu_read_lock().

Of course, this means we read pending/blocked/etc nonatomically and
we can report this info in some intermediate state. Say, a signal can
be reported as both pending and ignored, or we can report
->sigpending != 0 while pending/shpending are empty, etc. Hopefully
this is OK for proc, we never promised this info should be atomic.

Probably we can change do_task_stat() to avoid ->siglock too, except
we can't get tty_nr lockless.

Also, remove the "is this correct?" comment. I think it is safe
to dereference __task_cred(p)->user under rcu lock. In any case,
->siglock can't help to protect cred->user.
--
