[PATCH 5/6] call_usermodehelper: simplify/fix UMH_NO_WAIT case

Previous thread: [PATCH, resend] x86: fix placement of FIX_OHCI1394_BASE by Jan Beulich on Monday, March 15, 2010 - 3:11 am. (3 messages)

Next thread: [PATCH] mm: remove return value of putback_lru_pages by Minchan Kim on Monday, March 15, 2010 - 6:16 am. (4 messages)

Hey, so I've rediffed and tested the user mode helper bits from my previous
patch set that got wrecked when the rcustring bits got pulled.  I've got he
origional summary below, and the patches follwoing.  Note, their just my
changes, rediffed to work without Andis patch set.  The other umh call sites can
still use plain old call_usermodehelper as they did previously.  I'll submit
patches as needed for other callers to make use of the init/cleanup functions as
I have the time/ability to test those changes properly.  The changes below I've
been able to test and verify myself.

Hey all-
        So, about 6 months ago, I made a set of changes to how the
core-dump-to-a-pipe feature in the kernel works.  We had reports of several
races, including some reports of apps bypassing our recursion check so that a
process that was forked as part of a core_pattern setup could infinitely crash
and refork until the system crashed.

        We fixes those by improving our recursion checks.  The new check
basically refuses to fork a process if its core limit is zero, which works well.

        Unfortunately, I've been getting grief from maintainer of user space
programs that are inserted as the forked process of core_pattern.  They contend
that in order for their programs (such as abrt and apport) to work, all the
running processes in a system must have their core limits set to a non-zero
value, to which I say 'yes'.  I did this by design, and think thats the right
way to do things.

        But I've been asked to ease this burden on user space enough times that
I thought I would take a look at it.  The first suggestion was to make the
recursion check fail on a non-zero 'special' number, like one.  That way the
core collector process could set its core size ulimit to 1, and enable the
kernel's recursion detection.  This isn't a bad idea on the surface, but I don't
like it since its opt-in, in that if a program like abrt or apport has a bug and
fails to set such a core limit, we're left with a ...
From: Neil Horman
Date: Monday, March 15, 2010 - 5:33 am

Rediff of the usermodehelper changes after all the wreckage.  This patch
encompasses the patches that were formally in -mm as:
kmod-add-init-function-to-usermodehelper.patch
kmod-add-init-function-to-usermodehelper-fix.patch

I also rolled in changes from Oleg, moving the call of the init method down in
____call_usermodehelper, so the init function could override the nice level on
the process.

Create new function call_usermodehelper_fns() and allow it
to assign both an init and cleanup function, as we'll as arbitrary data.

The init function is called from the context of the forked process and
allows for customization of the helper process prior to calling exec.  Its
return code gates the continuation of the process, or causes its exit.
Also add an arbitrary data pointer to the subprocess_info struct allowing
for data to be passed from the caller to the new process, and the
subsequent cleanup process

Also, use this patch to cleanup the cleanup function.  It currently takes
an argp and envp pointer for freeing, which is ugly.  Lets instead just
make the subprocess_info structure public, and pass that to the cleanup
and init routines

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Andi Kleen <andi@firstfloor.org>


 include/linux/kmod.h |   51 +++++++++++++++++++++++++++++++++++++++++----------
 kernel/kmod.c        |   51 +++++++++++++++++++++++++++++----------------------
 kernel/sys.c         |    6 +++---
 3 files changed, 73 insertions(+), 35 deletions(-)

diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index facb27f..f9edf63 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -23,6 +23,7 @@
 #include <linux/stddef.h>
 #include <linux/errno.h>
 #include <linux/compiler.h>
+#include <linux/workqueue.h>
 
 #define KMOD_PATH_LEN 256
 
@@ -45,7 +46,27 @@ static inline int request_module_nowait(const char *name, ...) { return -ENOSYS;
 
 struct key;
 struct file;
-struct ...
From: Oleg Nesterov
Date: Monday, March 15, 2010 - 10:34 am

This change looks unnecessary, but doesn't hurt.

Reviewed-by: Oleg Nesterov <oleg@redhat.com>

--

From: Neil Horman
Date: Monday, March 15, 2010 - 10:56 am

Yeah, I concur, it was part of the origional patch, when that
call_usermodehelper had already been replaced with call_usermodehelper_cleanup.
I made my change anyway, figuring part of the purpose of this longer term would
be to convert the call_usermodehelper functions to the init/cleanup enabled
variants, which andi's patchset will eventually do here
--

From: Neil Horman
Date: Monday, March 15, 2010 - 5:36 am

Rediff of the usermodehelper changes after all the wreckage.  This patch
encompasses the patches that were formally in -mm as:

kmod-replace-call_usermodehelper_pipe-with-use-of-umh-init-function-and-resolve-limit.patch
kmod-replace-call_usermodehelper_pipe-with-use-of-umh-init-function-and-resolve-limitcleanup.patch

kmod: replace call_usermodehelper_pipe() with use of umh init function and resolve limit

The first patch in this series introduced an init function to the
call_usermodehelper api so that processes could be customized by caller.
This patch takes advantage of that fact, by customizing the helper in
do_coredump to create the pipe and set its core limit to one (for our
recusrsion check).  This lets us clean up the previous uglyness in the
usermodehelper internals and factor call_usermodehelper out entirely.
While I'm at it, we can also modify the helper setup to look for a core
limit value of 1 rather than zero for our recursion check

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Andi Kleen <andi@firstfloor.org>


 fs/exec.c            |   63 ++++++++++++++++++++++++++++++++++-----
 include/linux/kmod.h |    7 ----
 kernel/kmod.c        |   82 ---------------------------------------------------
 3 files changed, 56 insertions(+), 96 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index f351cdb..64a50b4 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1784,6 +1784,50 @@ static void wait_for_dump_helpers(struct file *file)
 }
 
 
+/*
+ * uhm_pipe_setup
+ * helper function to customize the process used
+ * to collect the core in userspace.  Specifically
+ * it sets up a pipe and installs it as fd 0 (stdin)
+ * for the process.  Returns 0 on success, or
+ * PTR_ERR on failure.
+ * Note that it also sets the core limit to 1.  This
+ * is a special value that we use to trap recursive
+ * core dumps
+ */
+static int umh_pipe_setup(struct subprocess_info *info)
+{
+	struct file *rp, *wp;
+	struct fdtable ...
From: Oleg Nesterov
Date: Monday, March 15, 2010 - 10:39 am

Reviewed-by: Oleg Nesterov <oleg@redhat.com>

--

From: Oleg Nesterov
Date: Monday, March 15, 2010 - 12:46 pm

More wrecked patches on top of Neil's refactoring.

None was changed, so I keeped the acks.

Oleg.

--

From: Oleg Nesterov
Date: Monday, March 15, 2010 - 12:46 pm

call_usermodehelper_keys() uses call_usermodehelper_setkeys() to change
subprocess_info->cred in advance. Now that we have info->init() we can
change this code to set tgcred->session_keyring in context of execing
kernel thread.

Note: since currently call_usermodehelper_keys() is never called with
UMH_NO_WAIT, call_usermodehelper_keys()->key_get() and umh_keys_cleanup()
are not really needed, we could rely on install_session_keyring_to_cred()
which does key_get() on success.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: David Howells <dhowells@redhat.com>
---

 include/linux/kmod.h         |   17 -----------------
 kernel/kmod.c                |   18 ------------------
 security/keys/internal.h     |    1 +
 security/keys/process_keys.c |    3 +--
 security/keys/request_key.c  |   32 ++++++++++++++++++++++++++++++++
 5 files changed, 34 insertions(+), 37 deletions(-)

--- 34-rc1/include/linux/kmod.h~1_CONVERT_KEYS	2010-03-15 20:00:42.000000000 +0100
+++ 34-rc1/include/linux/kmod.h	2010-03-15 20:04:34.000000000 +0100
@@ -71,8 +71,6 @@ struct subprocess_info *call_usermodehel
 						  char **envp, gfp_t gfp_mask);
 
 /* Set various pieces of state into the subprocess_info structure */
-void call_usermodehelper_setkeys(struct subprocess_info *info,
-				 struct key *session_keyring);
 void call_usermodehelper_setfns(struct subprocess_info *info,
 		    int (*init)(struct subprocess_info *info),
 		    void (*cleanup)(struct subprocess_info *info),
@@ -111,21 +109,6 @@ call_usermodehelper(char *path, char **a
 				       NULL, NULL, NULL);
 }
 
-static inline int
-call_usermodehelper_keys(char *path, char **argv, char **envp,
-			 struct key *session_keyring, enum umh_wait wait)
-{
-	struct subprocess_info *info;
-	gfp_t gfp_mask = (wait == UMH_NO_WAIT) ? GFP_ATOMIC : GFP_KERNEL;
-
-	info = call_usermodehelper_setup(path, argv, envp, gfp_mask);
-	if (info == NULL)
-		return ...
From: Oleg Nesterov
Date: Monday, March 15, 2010 - 12:47 pm

Now that nobody ever changes subprocess_info->cred we can kill this
member and related code. ____call_usermodehelper() always runs in the
context of freshly forked kernel thread, it has the proper ->cred
copied from its parent kthread, keventd.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: David Howells <dhowells@redhat.com>
---

 include/linux/cred.h |    1 
 include/linux/kmod.h |    1 
 kernel/cred.c        |   54 ---------------------------------------------------
 kernel/kmod.c        |   19 -----------------
 4 files changed, 75 deletions(-)

--- 34-rc1/include/linux/cred.h~2_KILL_INFO_CRED	2010-03-11 13:11:50.000000000 +0100
+++ 34-rc1/include/linux/cred.h	2010-03-15 20:10:12.000000000 +0100
@@ -156,7 +156,6 @@ extern int copy_creds(struct task_struct
 extern struct cred *cred_alloc_blank(void);
 extern struct cred *prepare_creds(void);
 extern struct cred *prepare_exec_creds(void);
-extern struct cred *prepare_usermodehelper_creds(void);
 extern int commit_creds(struct cred *);
 extern void abort_creds(struct cred *);
 extern const struct cred *override_creds(const struct cred *);
--- 34-rc1/include/linux/kmod.h~2_KILL_INFO_CRED	2010-03-15 20:04:34.000000000 +0100
+++ 34-rc1/include/linux/kmod.h	2010-03-15 20:10:12.000000000 +0100
@@ -55,7 +55,6 @@ enum umh_wait {
 struct subprocess_info {
 	struct work_struct work;
 	struct completion *complete;
-	struct cred *cred;
 	char *path;
 	char **argv;
 	char **envp;
--- 34-rc1/kernel/cred.c~2_KILL_INFO_CRED	2010-02-15 11:15:21.000000000 +0100
+++ 34-rc1/kernel/cred.c	2010-03-15 20:10:12.000000000 +0100
@@ -347,60 +347,6 @@ struct cred *prepare_exec_creds(void)
 }
 
 /*
- * prepare new credentials for the usermode helper dispatcher
- */
-struct cred *prepare_usermodehelper_creds(void)
-{
-#ifdef CONFIG_KEYS
-	struct thread_group_cred *tgcred = NULL;
-#endif
-	struct cred *new;
-
-#ifdef CONFIG_KEYS
-	tgcred = kzalloc(sizeof(*new->tgcred), ...
From: Oleg Nesterov
Date: Monday, March 15, 2010 - 12:47 pm

____call_usermodehelper() correctly calls flush_signal_handlers()
to set SIG_DFL, but sigemptyset(->blocked) and recalc_sigpending()
are not needed.

This kthread was forked by workqueue thread, all signals must be
unblocked and ignored, no pending signal is possible.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---

 kernel/kmod.c |    3 ---
 1 file changed, 3 deletions(-)

--- 34-rc1/kernel/kmod.c~3_DONT_UNBLOCK	2010-03-15 20:10:12.000000000 +0100
+++ 34-rc1/kernel/kmod.c	2010-03-15 20:17:47.000000000 +0100
@@ -134,11 +134,8 @@ static int ____call_usermodehelper(void 
 	struct subprocess_info *sub_info = data;
 	int retval;
 
-	/* Unblock all signals */
 	spin_lock_irq(&current->sighand->siglock);
 	flush_signal_handlers(current, 1);
-	sigemptyset(&current->blocked);
-	recalc_sigpending();
 	spin_unlock_irq(&current->sighand->siglock);
 
 	/* We can run anywhere, unlike our parent keventd(). */

--

From: Oleg Nesterov
Date: Monday, March 15, 2010 - 12:48 pm

__call_usermodehelper(UMH_NO_WAIT) has 2 problems:

	- if kernel_thread() fails, call_usermodehelper_freeinfo()
	  is not called.

	- for unknown reason UMH_NO_WAIT has UMH_WAIT_PROC logic,
	  we spawn yet another thread which waits until the user
	  mode application exits.

Change the UMH_NO_WAIT code to use ____call_usermodehelper() instead of
wait_for_helper(), and do call_usermodehelper_freeinfo() unconditionally.
We can rely on CLONE_VFORK, do_fork(CLONE_VFORK) until the child exits
or execs.

With or without this patch UMH_NO_WAIT does not report the error if
kernel_thread() fails, this is correct since the caller doesn't wait
for result.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---

 kernel/kmod.c |   10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

--- 34-rc1/kernel/kmod.c~5_UMH_NO_WAIT	2010-03-15 20:28:31.000000000 +0100
+++ 34-rc1/kernel/kmod.c	2010-03-15 20:32:23.000000000 +0100
@@ -205,10 +205,7 @@ static int wait_for_helper(void *data)
 			sub_info->retval = ret;
 	}
 
-	if (sub_info->wait == UMH_NO_WAIT)
-		call_usermodehelper_freeinfo(sub_info);
-	else
-		complete(sub_info->complete);
+	complete(sub_info->complete);
 	return 0;
 }
 
@@ -217,13 +214,13 @@ static void __call_usermodehelper(struct
 {
 	struct subprocess_info *sub_info =
 		container_of(work, struct subprocess_info, work);
-	pid_t pid;
 	enum umh_wait wait = sub_info->wait;
+	pid_t pid;
 
 	/* CLONE_VFORK: wait until the usermode helper has execve'd
 	 * successfully We need the data structures to stay around
 	 * until that is done.  */
-	if (wait == UMH_WAIT_PROC || wait == UMH_NO_WAIT)
+	if (wait == UMH_WAIT_PROC)
 		pid = kernel_thread(wait_for_helper, sub_info,
 				    CLONE_FS | CLONE_FILES | SIGCHLD);
 	else
@@ -232,6 +229,7 @@ static void __call_usermodehelper(struct
 
 	switch (wait) {
 	case UMH_NO_WAIT:
+		call_usermodehelper_freeinfo(sub_info);
 		break;
 
 	case UMH_WAIT_PROC:

--

From: Oleg Nesterov
Date: Monday, March 15, 2010 - 12:48 pm

1. wait_for_helper() calls allow_signal(SIGCHLD) to ensure the child
   can't autoreap itself.

   However, this means that a spurious SIGCHILD from user-space can
   set TIF_SIGPENDING and:

   	- kernel_thread() or sys_wait4() can fail due to signal_pending()

   	- worse, wait4() can fail before ____call_usermodehelper() execs
   	  or exits. In this case the caller may kfree(subprocess_info)
   	  while the child still uses this memory.

   Change the code to use SIG_DFL instead of magic "(void __user *)2"
   set by allow_signal(). This means that SIGCHLD won't be delivered,
   yet the child won't autoreap itsefl.

   The problem is minor, only root can send a signal to this kthread.

2. If sys_wait4(&ret) fails it doesn't populate "ret", in this case
   wait_for_helper() reports a random value from uninitialized var.

   With this patch sys_wait4() should never fail, but still it makes
   sense to initialize ret = -ECHILD so that the caller can notice
   the problem.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
---

 kernel/kmod.c |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

--- 34-rc1/kernel/kmod.c~4_SIGCHLD_RACE	2010-03-15 20:17:47.000000000 +0100
+++ 34-rc1/kernel/kmod.c	2010-03-15 20:28:31.000000000 +0100
@@ -175,16 +175,16 @@ static int wait_for_helper(void *data)
 	struct subprocess_info *sub_info = data;
 	pid_t pid;
 
-	/* Install a handler: if SIGCLD isn't handled sys_wait4 won't
-	 * populate the status, but will return -ECHILD. */
-	allow_signal(SIGCHLD);
+	/* If SIGCLD is ignored sys_wait4 won't populate the status. */
+	spin_lock_irq(&current->sighand->siglock);
+	current->sighand->action[SIGCHLD-1].sa.sa_handler = SIG_DFL;
+	spin_unlock_irq(&current->sighand->siglock);
 
 	pid = kernel_thread(____call_usermodehelper, sub_info, SIGCHLD);
 	if (pid < 0) {
 		sub_info->retval = pid;
 	} else {
-		int ret;
-
+		int ret = -ECHILD;
 		/*
 		 * Normally it is bogus to call ...
From: Oleg Nesterov
Date: Monday, March 15, 2010 - 12:49 pm

UMH_WAIT_EXEC should report the error if kernel_thread() fails, like
UMH_WAIT_PROC does.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---

 kernel/kmod.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- 34-rc1/kernel/kmod.c~6_UMH_WAIT_EXEC	2010-03-15 20:32:23.000000000 +0100
+++ 34-rc1/kernel/kmod.c	2010-03-15 20:37:07.000000000 +0100
@@ -235,10 +235,10 @@ static void __call_usermodehelper(struct
 	case UMH_WAIT_PROC:
 		if (pid > 0)
 			break;
-		sub_info->retval = pid;
 		/* FALLTHROUGH */
-
 	case UMH_WAIT_EXEC:
+		if (pid < 0)
+			sub_info->retval = pid;
 		complete(sub_info->complete);
 	}
 }

--

From: Oleg Nesterov
Date: Tuesday, March 16, 2010 - 12:37 pm

And some more, this time in fs/exec.c.

do_coredump() is unreadable and needs cleanups.

Perhaps we should add the new helper which opens the file or pipe...

Oleg.

--

From: Oleg Nesterov
Date: Tuesday, March 16, 2010 - 12:38 pm

do_coredump() does a lot of file checks after it opens the file or
calls usermode helper. But all of these checks are only needed in
!ispipe case.

Move this code into the "else" branch and kill the ugly repetitive
ispipe checks.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---

 fs/exec.c |   61 ++++++++++++++++++++++++++++++-------------------------------
 1 file changed, 30 insertions(+), 31 deletions(-)

--- 34-rc1/fs/exec.c~1_ISREG	2010-03-15 20:00:42.000000000 +0100
+++ 34-rc1/fs/exec.c	2010-03-16 18:06:12.000000000 +0100
@@ -1834,7 +1834,6 @@ void do_coredump(long signr, int exit_co
 	char corename[CORENAME_MAX_SIZE + 1];
 	struct mm_struct *mm = current->mm;
 	struct linux_binfmt * binfmt;
-	struct inode * inode;
 	const struct cred *old_cred;
 	struct cred *cred;
 	int retval = 0;
@@ -1911,9 +1910,6 @@ void do_coredump(long signr, int exit_co
 	ispipe = format_corename(corename, signr);
 	unlock_kernel();
 
-	if ((!ispipe) && (cprm.limit < binfmt->min_coredump))
-		goto fail_unlock;
-
  	if (ispipe) {
 		if (cprm.limit == 1) {
 			/*
@@ -1966,39 +1962,42 @@ void do_coredump(long signr, int exit_co
 			       corename);
 			goto fail_dropcount;
  		}
- 	} else
+	} else {
+		struct inode *inode;
+
+		if (cprm.limit < binfmt->min_coredump)
+			goto fail_unlock;
+
 		cprm.file = filp_open(corename,
 				 O_CREAT | 2 | O_NOFOLLOW | O_LARGEFILE | flag,
 				 0600);
-	if (IS_ERR(cprm.file))
-		goto fail_dropcount;
-	inode = cprm.file->f_path.dentry->d_inode;
-	if (inode->i_nlink > 1)
-		goto close_fail;	/* multiple links - don't dump */
-	if (!ispipe && d_unhashed(cprm.file->f_path.dentry))
-		goto close_fail;
+		if (IS_ERR(cprm.file))
+			goto fail_unlock;
 
-	/* AK: actually i see no reason to not allow this for named pipes etc.,
-	   but keep the previous behaviour for now. */
-	if (!ispipe && !S_ISREG(inode->i_mode))
-		goto close_fail;
-	/*
-	 * Dont allow local users get cute and trick others to coredump
-	 * into their pre-created ...
From: Oleg Nesterov
Date: Tuesday, March 16, 2010 - 12:38 pm

- kill "int dump_count", argv_split(argcp) accepts argcp == NULL.

- move "int dump_count" under " if (ispipe)" branch, fail_dropcount
  can check ispipe.

- move "char **helper_argv" as well, change the code to do argv_free()
  right after call_usermodehelper_fns().

- If call_usermodehelper_fns() fails goto close_fail label instead
  of closing the file by hand.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---

 fs/exec.c |   39 +++++++++++++++++----------------------
 1 file changed, 17 insertions(+), 22 deletions(-)

--- 34-rc1/fs/exec.c~2_IFIFO	2010-03-16 18:06:12.000000000 +0100
+++ 34-rc1/fs/exec.c	2010-03-16 18:09:13.000000000 +0100
@@ -1838,10 +1838,7 @@ void do_coredump(long signr, int exit_co
 	struct cred *cred;
 	int retval = 0;
 	int flag = 0;
-	int ispipe = 0;
-	char **helper_argv = NULL;
-	int helper_argc = 0;
-	int dump_count = 0;
+	int ispipe;
 	static atomic_t core_dump_count = ATOMIC_INIT(0);
 	struct coredump_params cprm = {
 		.signr = signr,
@@ -1911,6 +1908,9 @@ void do_coredump(long signr, int exit_co
 	unlock_kernel();
 
  	if (ispipe) {
+		int dump_count;
+		char **helper_argv;
+
 		if (cprm.limit == 1) {
 			/*
 			 * Normally core limits are irrelevant to pipes, since
@@ -1932,6 +1932,7 @@ void do_coredump(long signr, int exit_co
 			printk(KERN_WARNING "Aborting core\n");
 			goto fail_unlock;
 		}
+		cprm.limit = RLIM_INFINITY;
 
 		dump_count = atomic_inc_return(&core_dump_count);
 		if (core_pipe_limit && (core_pipe_limit < dump_count)) {
@@ -1941,26 +1942,21 @@ void do_coredump(long signr, int exit_co
 			goto fail_dropcount;
 		}
 
-		helper_argv = argv_split(GFP_KERNEL, corename+1, &helper_argc);
+		helper_argv = argv_split(GFP_KERNEL, corename+1, NULL);
 		if (!helper_argv) {
 			printk(KERN_WARNING "%s failed to allocate memory\n",
 			       __func__);
 			goto fail_dropcount;
 		}
 
-		cprm.limit = RLIM_INFINITY;
-
-		/* SIGPIPE can happen, but it's just never processed */
-		cprm.file = NULL;
-		if ...
From: Oleg Nesterov
Date: Tuesday, March 16, 2010 - 12:39 pm

Given that do_coredump() calls put_cred() on exit path, it is a bit ugly
to do put_cred() + "goto fail" twice, just add the new "fail_creds" label.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---

 fs/exec.c |   16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

--- 34-rc1/fs/exec.c~3_CREDS	2010-03-16 18:09:13.000000000 +0100
+++ 34-rc1/fs/exec.c	2010-03-16 19:09:50.000000000 +0100
@@ -1859,10 +1859,8 @@ void do_coredump(long signr, int exit_co
 		goto fail;
 
 	cred = prepare_creds();
-	if (!cred) {
-		retval = -ENOMEM;
+	if (!cred)
 		goto fail;
-	}
 
 	down_write(&mm->mmap_sem);
 	/*
@@ -1870,8 +1868,7 @@ void do_coredump(long signr, int exit_co
 	 */
 	if (mm->core_state || !__get_dumpable(cprm.mm_flags)) {
 		up_write(&mm->mmap_sem);
-		put_cred(cred);
-		goto fail;
+		goto fail_creds;
 	}
 
 	/*
@@ -1886,10 +1883,8 @@ void do_coredump(long signr, int exit_co
 	}
 
 	retval = coredump_wait(exit_code, &core_state);
-	if (retval < 0) {
-		put_cred(cred);
-		goto fail;
-	}
+	if (retval < 0)
+		goto fail_creds;
 
 	old_cred = override_creds(cred);
 
@@ -2006,9 +2001,10 @@ fail_dropcount:
 	if (ispipe)
 		atomic_dec(&core_dump_count);
 fail_unlock:
+	coredump_finish(mm);
 	revert_creds(old_cred);
+fail_creds:
 	put_cred(cred);
-	coredump_finish(mm);
 fail:
 	return;
 }

--

From: Oleg Nesterov
Date: Tuesday, March 16, 2010 - 12:39 pm

- move the cprm.mm_flags checks up, before we take mmap_sem

- move down_write(mmap_sem) and ->core_state check from do_coredump()
  to coredump_wait()

This simplifies the code and makes the locking symmetrical.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---

 fs/exec.c |   19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

--- 34-rc1/fs/exec.c~4_MMAP_SEM	2010-03-16 19:09:50.000000000 +0100
+++ 34-rc1/fs/exec.c	2010-03-16 19:28:23.000000000 +0100
@@ -1659,12 +1659,15 @@ static int coredump_wait(int exit_code, 
 	struct task_struct *tsk = current;
 	struct mm_struct *mm = tsk->mm;
 	struct completion *vfork_done;
-	int core_waiters;
+	int core_waiters = -EBUSY;
 
 	init_completion(&core_state->startup);
 	core_state->dumper.task = tsk;
 	core_state->dumper.next = NULL;
-	core_waiters = zap_threads(tsk, mm, core_state, exit_code);
+
+	down_write(&mm->mmap_sem);
+	if (!mm->core_state)
+		core_waiters = zap_threads(tsk, mm, core_state, exit_code);
 	up_write(&mm->mmap_sem);
 
 	if (unlikely(core_waiters < 0))
@@ -1857,20 +1860,12 @@ void do_coredump(long signr, int exit_co
 	binfmt = mm->binfmt;
 	if (!binfmt || !binfmt->core_dump)
 		goto fail;
+	if (!__get_dumpable(cprm.mm_flags))
+		goto fail;
 
 	cred = prepare_creds();
 	if (!cred)
 		goto fail;
-
-	down_write(&mm->mmap_sem);
-	/*
-	 * If another thread got here first, or we are not dumpable, bail out.
-	 */
-	if (mm->core_state || !__get_dumpable(cprm.mm_flags)) {
-		up_write(&mm->mmap_sem);
-		goto fail_creds;
-	}
-
 	/*
 	 *	We cannot trust fsuid as being the "true" uid of the
 	 *	process nor do we know its entire history. We only know it

--

Previous thread: [PATCH, resend] x86: fix placement of FIX_OHCI1394_BASE by Jan Beulich on Monday, March 15, 2010 - 3:11 am. (3 messages)

Next thread: [PATCH] mm: remove return value of putback_lru_pages by Minchan Kim on Monday, March 15, 2010 - 6:16 am. (4 messages)