TASK_WAKEKILL && /sbin/init (was: [PATCH 1/2] schedule: fix TASK_WAKEKILL vs SIGKILL race)

From: Oleg Nesterov
Date: Thursday, June 5, 2008 - 8:23 am

Sorry Matthew, I left this part unanswered because I didn't have the
time yesterday...

On 06/04, Matthew Wilcox wrote:

If lock_page_killable() fails because the task was killed by SIGKILL or another
fatal signal, do_generic_file_read() returns -EIO.

This seems to be OK, because userspace won't actually see this error: the
task will dequeue SIGKILL and exit.

However, /sbin/init is different: it will dequeue SIGKILL, ignore it, and be
confused by this bogus -EIO. Please note that while this bug is not likely,
it is _not_ theoretical. User space does sometimes send unhandled fatal
signals to init.

Imho, this is 2.6.26 material. Unless I missed something, of course.

It is not clear to me what we should do. I'd like very much to avoid adding
more SIGNAL_UNKILLABLE checks, but perhaps we don't have another choice.
We can fix the bug with

	--- kernel/signal.c
	+++ kernel/signal.c
	@@ -974,7 +974,7 @@ void zap_other_threads(struct task_struc
	 
	 int fastcall __fatal_signal_pending(struct task_struct *tsk)
	 {
	-	return sigismember(&tsk->pending.signal, SIGKILL);
	+	return signal_group_exit(tsk->signal);
	 }

, but this makes __fatal_signal_pending() slower, and because we use
tsk->signal, schedule() (in particular) can't use this helper.
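
To make the difference between the two checks concrete, here is a small
userspace model (a sketch only, not kernel code). Every model_* and MODEL_*
name is hypothetical. The pending set is reduced to a bitmask, and
signal_group_exit() is approximated by a single flag, ignoring the
group_exit_task test. It assumes, as the reasoning above implies, that init
can have SIGKILL in its private pending set without a group exit having
started.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel definitions. */
#define MODEL_SIGKILL		9
#define MODEL_GROUP_EXIT	0x1u

struct model_signal {
	unsigned int flags;		/* shared by the whole thread group */
};

struct model_task {
	unsigned long pending;		/* bit (n - 1) set: signal n is queued */
	struct model_signal *signal;
};

/* Models the current check: is SIGKILL in the private pending set? */
static bool model_pending_sigkill(const struct model_task *tsk)
{
	assert(tsk != NULL);
	return (tsk->pending >> (MODEL_SIGKILL - 1)) & 1UL;
}

/* Models the alternative check: has a group exit really started? */
static bool model_group_exit(const struct model_task *tsk)
{
	assert(tsk != NULL && tsk->signal != NULL);
	return (tsk->signal->flags & MODEL_GROUP_EXIT) != 0;
}
```

For a task that is really dying (SIGKILL pending and the group-exit flag set)
both helpers return true. For an init-like task (SIGKILL pending, flag clear)
only model_pending_sigkill() returns true, which is the mismatch that
produces the bogus error.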

Anyway. How about the (untested/uncompiled) patch below for now? -EINTR or
-ERESTARTNOINTR looks "more correct" regardless.
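
As a rough illustration of why a restart code is friendlier to init than
-EIO, here is a userspace model (a sketch only, not kernel code). All
model_* and MODEL_* names are hypothetical; the numeric values mirror the
kernel's EIO (5) and ERESTARTNOINTR (513). The loop stands in for the
return-to-user path, which in the real kernel rewinds and re-executes the
system call instead of looping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_EIO		5
#define MODEL_ERESTARTNOINTR	513

struct model_ctx {
	bool signal_pending;	/* an ignored fatal signal is still queued */
	int fail_errno;		/* error the killable wait reports */
	int calls;		/* how many times the read body ran */
};

/* Models the read path: fails while the signal is pending. */
static long model_read(struct model_ctx *ctx)
{
	ctx->calls++;
	if (ctx->signal_pending)
		return -ctx->fail_errno;
	return 4096;		/* pretend one page was read */
}

/* Models syscall exit for an init-like task that ignores the signal. */
static long model_syscall(struct model_ctx *ctx)
{
	long ret;

	assert(ctx != NULL);
	for (;;) {
		ret = model_read(ctx);
		/* The signal is dequeued and ignored on the way out. */
		ctx->signal_pending = false;
		if (ret != -MODEL_ERESTARTNOINTR)
			return ret;	/* this is what userspace sees */
		/* Restart code: run the call again, invisibly to the caller. */
	}
}
```

With fail_errno set to MODEL_EIO the caller gets -5 after a single pass. With
MODEL_ERESTARTNOINTR the body runs twice and the caller only ever sees the
successful result.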

Oleg.

--- mm/filemap.c
+++ mm/filemap.c
@@ -188,7 +188,7 @@ static int sync_page(void *word)
 static int sync_page_killable(void *word)
 {
 	sync_page(word);
-	return fatal_signal_pending(current) ? -EINTR : 0;
+	return fatal_signal_pending(current) ? -ERESTARTNOINTR : 0;
 }
 
 /**
@@ -1000,8 +1000,9 @@ page_ok:
 
 page_not_up_to_date:
 		/* Get exclusive access to the page ... */
-		if (lock_page_killable(page))
-			goto readpage_eio;
+		error = lock_page_killable(page);
+		if (error)
+			goto readpage_error;
 
 		/* Did it get truncated before we got the lock? */
 		if (!page->mapping) {
@@ -1029,8 +1030,9 @@ readpage:
 		}
 
 		if (!PageUptodate(page)) {
-			if (lock_page_killable(page))
-				goto readpage_eio;
+			error = lock_page_killable(page);
+			if (error)
+				goto readpage_error;
 			if (!PageUptodate(page)) {
 				if (page->mapping == NULL) {
 					/*

