[PATCH] proc: don't confuse /bin/ps by zombie delay_group_leader's

Previous thread: [PATCH] softlockup: minor cleanup, don't check task->state twice by Oleg Nesterov on Saturday, August 30, 2008 - 10:08 am. (2 messages)

Next thread: [REGRESSION 2.6.27-rc4-git7] mtd broken by cmdfilter move by Mikael Pettersson on Saturday, August 30, 2008 - 10:04 am. (6 messages)
From: Oleg Nesterov
Date: Saturday, August 30, 2008 - 10:08 am

When the main thread exits,

	void *do_thread(void *arg)
	{
		pause();
		return NULL;
	}

	int main(void)
	{
		pthread_t thr;
		pthread_create(&thr, NULL, do_thread, NULL);

		syscall(__NR_exit, 0);
		return 0;
	}

/bin/ps's output looks really confusing, as if the whole process is dead.
I think this even looks like a kernel bug to the user, because it sees a
zombie which is not going to be reaped.

Change get_task_state() to report "S (sleeping)" in this case. Still not
perfect because the task can be confused with the kernel thread (its ->mm
is NULL), but imho better anyway.

Also, uninline get_task_state(), it has 2 callers.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>

--- 2.6.27-rc4/fs/proc/array.c~ZOMBIE_LEADER	2008-07-30 13:12:47.000000000 +0400
+++ 2.6.27-rc4/fs/proc/array.c	2008-08-30 18:17:10.000000000 +0400
@@ -146,10 +146,14 @@ static const char *task_state_array[] = 
 	"X (dead)"		/* 32 */
 };
 
-static inline const char *get_task_state(struct task_struct *tsk)
+static const char *get_task_state(struct task_struct *tsk)
 {
-	unsigned int state = (tsk->state & TASK_REPORT) | tsk->exit_state;
 	const char **p = &task_state_array[0];
+	unsigned int state = tsk->exit_state | (tsk->state & TASK_REPORT);
+
+	/* don't confuse /bin/ps if the whole process is not dead */
+	if (tsk->exit_state && delay_group_leader(tsk))
+		state = TASK_INTERRUPTIBLE;
 
 	while (state) {
 		p++;

--

From: Roland McGrath
Date: Sunday, August 31, 2008 - 6:07 pm

I don't think it's right to change /proc/pid/task/tid/status this way.
The real status is true and correct for the individual task, no matter what.
If you want to change /proc/pid/status, then you ought to split it into
proc_tgid_status and proc_tid_status variants to differ this way.

I don't really have an opinion either way about this for /proc/pid/status.
We've known it behaved this way for a very long time, and I've always
considered it procps's problem to display such situations in ways that
users find most useful.  It can already quickly tell that situation by a
"Threads:" line with >1 when "State:" says "Z".  If you do change it, it
might be nicer to display it as a (compatible) special case:

State:	S (delayed-leader)


Thanks,
Roland
--

From: Oleg Nesterov
Date: Monday, September 1, 2008 - 3:06 am

Agreed. It is much better to change /bin/ps and do not touch the kernel.
Albert, what do you think?



Or "Z (delayed-leader)" if we fix ps. In this case both /proc/pid/status
and /proc/pid/task/tid/status can report the same state.

Oleg.

--

Previous thread: [PATCH] softlockup: minor cleanup, don't check task->state twice by Oleg Nesterov on Saturday, August 30, 2008 - 10:08 am. (2 messages)

Next thread: [REGRESSION 2.6.27-rc4-git7] mtd broken by cmdfilter move by Mikael Pettersson on Saturday, August 30, 2008 - 10:04 am. (6 messages)