[PATCH 04/24] CRED: Neuter sys_capset() [ver #7]

From: David Howells
Date: Wednesday, August 6, 2008 - 8:37 am

I've brought my patchset up to date with regard to the recent merge melee and
built the patches on top of the next branch of James's security testing tree as
per his request.

A tarball of these patches can be retrieved from:

	http://people.redhat.com/~dhowells/cow-creds-7.tar.bz2

I've been testing these patches with the LTP syscalls and SELinux test scripts.

---
There are three parts to this project:

 (1) Implement COW credentials.

 (2) Pass the cred pointer through the vfs_xxx() functions and suchlike to all
     the places that need them.

 (3) Document it.

I'm intending to use this code to implement FS-Cache/CacheFiles, but it could
also be used for NFSD.


The associated patches implement (1) and part of (3).  Some things to note:

 (a) All of {,e,s,fs}{u,g}id and supplementary groups, capabilities, secure
     bits, keyrings, and the task security pointer have migrated into struct
     cred.

 (b) Changing a task's credentials involves creating a new struct cred (call
     prepare_creds()) and then using RCU to change things over (call
     commit_creds()).

 (c) task_struct::cred is a const struct cred *, as are all pointers that
     aren't used specifically for creating new credentials.  This catches, at
     compile time, places that change creds when they shouldn't.

     To get a new ref on a const cred, use get_cred() which casts away the
     const and calls atomic_inc().

 (d) It is no longer possible for a task to instantiate another task's
     keyrings.  The keyrings code tries to make sure that the required keyrings
     are present in request_key(), and redirects any attempt to nominate a
     process-specific keyring when instantiating a key to whatever keyring was
     suggested by sys_request_key() (or it uses the default).

 (e) sys_capset() is neutered: it can only affect the caller.

 (f) execve() is cleaner.  The changes are all worked out in a new set of
     credentials, then the whole lot is installed in ...
From: David Howells
Date: Wednesday, August 6, 2008 - 8:38 am

Wrap access to SELinux's task SID, using task_sid() and current_sid() as
appropriate.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
---

 security/selinux/hooks.c |  416 ++++++++++++++++++++++++----------------------
 1 files changed, 220 insertions(+), 196 deletions(-)


diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 1efc990..27d1779 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -171,10 +171,35 @@ static int cred_alloc_security(struct cred *cred)
 	return 0;
 }
 
+/*
+ * get the security ID of a task
+ */
+static inline u32 task_sid(const struct task_struct *task)
+{
+	const struct task_security_struct *tsec;
+	u32 sid;
+
+	rcu_read_lock();
+	tsec = __task_cred(task)->security;
+	sid = tsec->sid;
+	rcu_read_unlock();
+	return sid;
+}
+
+/*
+ * get the security ID of the current task
+ */
+static inline u32 current_sid(void)
+{
+	const struct task_security_struct *tsec = current_cred()->security;
+
+	return tsec->sid;
+}
+
 static int inode_alloc_security(struct inode *inode)
 {
-	struct task_security_struct *tsec = current->cred->security;
 	struct inode_security_struct *isec;
+	u32 sid = current_sid();
 
 	isec = kmem_cache_zalloc(sel_inode_cache, GFP_NOFS);
 	if (!isec)
@@ -185,7 +210,7 @@ static int inode_alloc_security(struct inode *inode)
 	isec->inode = inode;
 	isec->sid = SECINITSID_UNLABELED;
 	isec->sclass = SECCLASS_FILE;
-	isec->task_sid = tsec->sid;
+	isec->task_sid = sid;
 	inode->i_security = isec;
 
 	return 0;
@@ -207,15 +232,15 @@ static void inode_free_security(struct inode *inode)
 
 static int file_alloc_security(struct file *file)
 {
-	struct task_security_struct *tsec = current->cred->security;
 	struct file_security_struct *fsec;
+	u32 sid = current_sid();
 
 	fsec = kzalloc(sizeof(struct file_security_struct), GFP_KERNEL);
 	if (!fsec)
 		return -ENOMEM;
 
-	fsec->sid = tsec->sid;
-	fsec->fown_sid = ...
From: David Howells
Date: Wednesday, August 6, 2008 - 8:38 am

Rename is_single_threaded() to is_wq_single_threaded() so that a new
is_single_threaded() can be created that refers to tasks rather than
waitqueues.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
---

 kernel/workqueue.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)


diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 4a26a13..e4e2bd3 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -83,21 +83,21 @@ static cpumask_t cpu_singlethread_map __read_mostly;
 static cpumask_t cpu_populated_map __read_mostly;
 
 /* If it's single threaded, it isn't in the list of workqueues. */
-static inline int is_single_threaded(struct workqueue_struct *wq)
+static inline int is_wq_single_threaded(struct workqueue_struct *wq)
 {
 	return wq->singlethread;
 }
 
 static const cpumask_t *wq_cpu_map(struct workqueue_struct *wq)
 {
-	return is_single_threaded(wq)
+	return is_wq_single_threaded(wq)
 		? &cpu_singlethread_map : &cpu_populated_map;
 }
 
 static
 struct cpu_workqueue_struct *wq_per_cpu(struct workqueue_struct *wq, int cpu)
 {
-	if (unlikely(is_single_threaded(wq)))
+	if (unlikely(is_wq_single_threaded(wq)))
 		cpu = singlethread_cpu;
 	return per_cpu_ptr(wq->cpu_wq, cpu);
 }
@@ -767,7 +767,7 @@ init_cpu_workqueue(struct workqueue_struct *wq, int cpu)
 static int create_workqueue_thread(struct cpu_workqueue_struct *cwq, int cpu)
 {
 	struct workqueue_struct *wq = cwq->wq;
-	const char *fmt = is_single_threaded(wq) ? "%s" : "%s/%d";
+	const char *fmt = is_wq_single_threaded(wq) ? "%s" : "%s/%d";
 	struct task_struct *p;
 
 	p = kthread_create(worker_thread, cwq, fmt, wq->name, cpu);

--

From: David Howells
Date: Wednesday, August 6, 2008 - 8:38 am

Prettify commoncap.c.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Reviewed-by: James Morris <jmorris@namei.org>
---

 security/commoncap.c |  304 +++++++++++++++++++++++++++++++++++++++++---------
 1 files changed, 247 insertions(+), 57 deletions(-)


diff --git a/security/commoncap.c b/security/commoncap.c
index 3da0ada..c91b1af 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -39,16 +39,22 @@ int cap_netlink_recv(struct sk_buff *skb, int cap)
 		return -EPERM;
 	return 0;
 }
-
 EXPORT_SYMBOL(cap_netlink_recv);
 
-/*
+/**
+ * cap_capable - Determine whether a task has a particular effective capability
+ * @tsk: The task to query
+ * @cap: The capability to check for
+ *
+ * Determine whether the nominated task has the specified capability amongst
+ * its effective set, returning 0 if it does, -ve if it does not.
+ *
  * NOTE WELL: cap_capable() cannot be used like the kernel's capable()
- * function.  That is, it has the reverse semantics: cap_capable()
- * returns 0 when a task has a capability, but the kernel's capable()
- * returns 1 for this case.
+ * function.  That is, it has the reverse semantics: cap_capable() returns 0
+ * when a task has a capability, but the kernel's capable() returns 1 for this
+ * case.
  */
-int cap_capable (struct task_struct *tsk, int cap)
+int cap_capable(struct task_struct *tsk, int cap)
 {
 	__u32 cap_raised;
 
@@ -59,6 +65,14 @@ int cap_capable (struct task_struct *tsk, int cap)
 	return cap_raised ? 0 : -EPERM;
 }
 
+/**
+ * cap_settime - Determine whether the current process may set the system clock
+ * @ts: The time to set
+ * @tz: The timezone to set
+ *
+ * Determine whether the current process may set the system clock and timezone
+ * information, returning 0 if permission granted, -ve if denied.
+ */
 int cap_settime(struct timespec *ts, struct timezone *tz)
 {
 	if (!capable(CAP_SYS_TIME))
@@ -66,6 +80,15 @@ int cap_settime(struct timespec ...
From: David Howells
Date: Wednesday, August 6, 2008 - 8:38 am

Make inode_has_perm() and file_has_perm() take a cred pointer rather than a
task pointer.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
---

 security/selinux/hooks.c |  140 ++++++++++++++++++++++++++++++----------------
 1 files changed, 92 insertions(+), 48 deletions(-)


diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 27d1779..fb76940 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -172,16 +172,25 @@ static int cred_alloc_security(struct cred *cred)
 }
 
 /*
+ * get the security ID of a set of credentials
+ */
+static inline u32 cred_sid(const struct cred *cred)
+{
+	const struct task_security_struct *tsec;
+
+	tsec = cred->security;
+	return tsec->sid;
+}
+
+/*
  * get the security ID of a task
  */
 static inline u32 task_sid(const struct task_struct *task)
 {
-	const struct task_security_struct *tsec;
 	u32 sid;
 
 	rcu_read_lock();
-	tsec = __task_cred(task)->security;
-	sid = tsec->sid;
+	sid = cred_sid(__task_cred(task));
 	rcu_read_unlock();
 	return sid;
 }
@@ -196,6 +205,8 @@ static inline u32 current_sid(void)
 	return tsec->sid;
 }
 
+/* Allocate and free functions for each kind of security blob. */
+
 static int inode_alloc_security(struct inode *inode)
 {
 	struct inode_security_struct *isec;
@@ -1366,7 +1377,7 @@ static inline u32 signal_to_av(int sig)
 }
 
 /*
- * Check permission betweeen a pair of tasks, e.g. signal checks,
+ * Check permission between a pair of tasks, e.g. signal checks,
  * fork check, ptrace check, etc.
  * tsk1 is the actor and tsk2 is the target
  */
@@ -1429,7 +1440,7 @@ static int task_has_system(struct task_struct *tsk,
 /* Check whether a task has a particular permission to an inode.
    The 'adp' parameter is optional and allows other audit
    data to be passed (e.g. the dentry). */
-static int inode_has_perm(struct task_struct *tsk,
+static int ...
From: David Howells
Date: Wednesday, August 6, 2008 - 8:38 am

Separate per-task-group keyrings from signal_struct and dangle their anchor
from the cred struct rather than the signal_struct.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
---

 include/linux/cred.h         |   16 +++++++
 include/linux/key.h          |    8 +--
 include/linux/sched.h        |    6 ---
 kernel/cred.c                |   61 ++++++++++++++++++++++++++
 kernel/fork.c                |    8 ---
 security/keys/process_keys.c |   99 +++++++++++++++++-------------------------
 security/keys/request_key.c  |   34 ++++++--------
 7 files changed, 132 insertions(+), 100 deletions(-)


diff --git a/include/linux/cred.h b/include/linux/cred.h
index 5c4e098..28e1d0e 100644
--- a/include/linux/cred.h
+++ b/include/linux/cred.h
@@ -72,6 +72,21 @@ extern int in_group_p(gid_t);
 extern int in_egroup_p(gid_t);
 
 /*
+ * The common credentials for a thread group
+ * - shared by CLONE_THREAD
+ */
+#ifdef CONFIG_KEYS
+struct thread_group_cred {
+	atomic_t	usage;
+	pid_t		tgid;			/* thread group process ID */
+	spinlock_t	lock;
+	struct key	*session_keyring;	/* keyring inherited over fork */
+	struct key	*process_keyring;	/* keyring private to this process */
+	struct rcu_head	rcu;			/* RCU deletion hook */
+};
+#endif
+
+/*
  * The security context of a task
  *
  * The parts of the context break down into two categories:
@@ -114,6 +129,7 @@ struct cred {
 					 * keys to */
 	struct key	*thread_keyring; /* keyring private to this thread */
 	struct key	*request_key_auth; /* assumed request_key authority */
+	struct thread_group_cred *tgcred; /* thread-group shared credentials */
 #endif
 #ifdef CONFIG_SECURITY
 	void		*security;	/* subjective LSM security */
diff --git a/include/linux/key.h b/include/linux/key.h
index 599a37c..1a18ae9 100644
--- a/include/linux/key.h
+++ b/include/linux/key.h
@@ -278,9 +278,7 @@ extern ctl_table key_sysctls[];
  */
 extern void switch_uid_keyring(struct user_struct ...
From: David Howells
Date: Wednesday, August 6, 2008 - 8:38 am

Attach creds to file structs and discard f_uid/f_gid.

file_operations::open() methods (such as hppfs_open()) should use file->f_cred
rather than current_cred().  At the moment file->f_cred will be current_cred()
at this point.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
---

 arch/mips/kernel/vpe.c              |    4 ++--
 drivers/isdn/hysdn/hysdn_procconf.c |    6 ++++--
 fs/coda/file.c                      |    2 +-
 fs/file_table.c                     |    7 ++++---
 fs/hppfs/hppfs.c                    |    4 ++--
 include/linux/fs.h                  |    2 +-
 net/ipv4/netfilter/ipt_LOG.c        |    4 ++--
 net/ipv6/netfilter/ip6t_LOG.c       |    4 ++--
 net/netfilter/nfnetlink_log.c       |    5 +++--
 net/netfilter/xt_owner.c            |   16 ++++++++--------
 net/sched/cls_flow.c                |    4 ++--
 11 files changed, 31 insertions(+), 27 deletions(-)


diff --git a/arch/mips/kernel/vpe.c b/arch/mips/kernel/vpe.c
index 972b2d2..09786e4 100644
--- a/arch/mips/kernel/vpe.c
+++ b/arch/mips/kernel/vpe.c
@@ -1085,8 +1085,8 @@ static int vpe_open(struct inode *inode, struct file *filp)
 	v->load_addr = NULL;
 	v->len = 0;
 
-	v->uid = filp->f_uid;
-	v->gid = filp->f_gid;
+	v->uid = filp->f_cred->fsuid;
+	v->gid = filp->f_cred->fsgid;
 
 #ifdef CONFIG_MIPS_APSP_KSPD
 	/* get kspd to tell us when a syscall_exit happens */
diff --git a/drivers/isdn/hysdn/hysdn_procconf.c b/drivers/isdn/hysdn/hysdn_procconf.c
index 484299b..8f9f491 100644
--- a/drivers/isdn/hysdn/hysdn_procconf.c
+++ b/drivers/isdn/hysdn/hysdn_procconf.c
@@ -246,7 +246,8 @@ hysdn_conf_open(struct inode *ino, struct file *filep)
 	}
 	if (card->debug_flags & (LOG_PROC_OPEN | LOG_PROC_ALL))
 		hysdn_addlog(card, "config open for uid=%d gid=%d mode=0x%x",
-			     filep->f_uid, filep->f_gid, filep->f_mode);
+			     filep->f_cred->fsuid, filep->f_cred->fsgid,
+			     filep->f_mode);
 
 	if ((filep->f_mode & (FMODE_READ | ...
From: David Howells
Date: Wednesday, August 6, 2008 - 8:38 am

Pass credentials through dentry_open() so that the COW creds patch can have
SELinux's flush_unauthorized_files() pass the appropriate creds back to itself
when it opens its null chardev.

The security_dentry_open() call also now takes a creds pointer, as does the
dentry_open hook in struct security_operations.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
---

 fs/ecryptfs/ecryptfs_kernel.h |    3 ++-
 fs/ecryptfs/kthread.c         |    9 +++++----
 fs/ecryptfs/main.c            |    3 ++-
 fs/exportfs/expfs.c           |    4 +++-
 fs/hppfs/hppfs.c              |    6 ++++--
 fs/nfsctl.c                   |    3 ++-
 fs/nfsd/nfs4recover.c         |    3 ++-
 fs/nfsd/vfs.c                 |    3 ++-
 fs/open.c                     |   17 +++++++++++------
 fs/xfs/linux-2.6/xfs_ioctl.c  |    3 ++-
 include/linux/fs.h            |    4 +++-
 include/linux/security.h      |    7 ++++---
 ipc/mqueue.c                  |   11 +++++++----
 security/capability.c         |    2 +-
 security/security.c           |    4 ++--
 security/selinux/hooks.c      |   15 +++++++++------
 16 files changed, 61 insertions(+), 36 deletions(-)


diff --git a/fs/ecryptfs/ecryptfs_kernel.h b/fs/ecryptfs/ecryptfs_kernel.h
index b73fb75..e73414f 100644
--- a/fs/ecryptfs/ecryptfs_kernel.h
+++ b/fs/ecryptfs/ecryptfs_kernel.h
@@ -709,7 +709,8 @@ int ecryptfs_init_kthread(void);
 void ecryptfs_destroy_kthread(void);
 int ecryptfs_privileged_open(struct file **lower_file,
 			     struct dentry *lower_dentry,
-			     struct vfsmount *lower_mnt);
+			     struct vfsmount *lower_mnt,
+			     const struct cred *cred);
 int ecryptfs_init_persistent_file(struct dentry *ecryptfs_dentry);
 
 #endif /* #ifndef ECRYPTFS_KERNEL_H */
diff --git a/fs/ecryptfs/kthread.c b/fs/ecryptfs/kthread.c
index c440c6b..c6d7a4d 100644
--- a/fs/ecryptfs/kthread.c
+++ b/fs/ecryptfs/kthread.c
@@ -73,7 +73,7 @@ static int ecryptfs_threadfn(void *ignored)
 ...
From: David Howells
Date: Wednesday, August 6, 2008 - 8:38 am

Inaugurate copy-on-write credentials management.  This uses RCU to manage the
credentials pointer in the task_struct with respect to accesses by other tasks.
A process may only modify its own credentials, and so does not need locking to
access or modify its own credentials.

A mutex (cred_replace_mutex) is added to the task_struct to control the effect
of PTRACE_ATTACHED on credential calculations, particularly with respect to
execve().

With this patch, the contents of an active credentials struct may not be
changed directly; rather a new set of credentials must be prepared, modified
and committed using something like the following sequence of events:

	struct cred *new = prepare_creds();
	int ret = blah(new);
	if (ret < 0) {
		abort_creds(new);
		return ret;
	}
	return commit_creds(new);

There are some exceptions to this rule: the keyrings pointed to by the active
credentials may be instantiated - keyrings violate the COW rule as managing
COW keyrings is tricky, given that it is possible for a task to directly alter
the keys in a keyring in use by another task.
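
As an illustration only, the prepare/modify/commit sequence above can be
modelled in userspace.  This is a minimal sketch, not the kernel code: every
toy_* name is hypothetical, the reference count is a plain int (single-threaded
model), and where the kernel would publish the new pointer through RCU the
sketch just assigns it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct cred; not the kernel layout. */
struct toy_cred {
	int		usage;	/* reference count (single-threaded model) */
	unsigned int	uid;
	unsigned int	gid;
};

struct toy_task {
	const struct toy_cred *cred;	/* const: active creds are read-only */
};

/* Drop a reference, freeing the creds when the last one goes away. */
void toy_put_cred(const struct toy_cred *cred)
{
	struct toy_cred *c = (struct toy_cred *)cred;	/* cast away const */

	if (--c->usage == 0)
		free(c);
}

/* Copy the task's current creds into a private, modifiable set. */
struct toy_cred *toy_prepare_creds(const struct toy_task *task)
{
	struct toy_cred *new = malloc(sizeof(*new));

	if (!new)
		return NULL;
	memcpy(new, task->cred, sizeof(*new));
	new->usage = 1;
	return new;
}

/* Discard a prepared set that was never installed. */
void toy_abort_creds(struct toy_cred *new)
{
	toy_put_cred(new);
}

/* Install the new set and drop the task's reference on the old one. */
int toy_commit_creds(struct toy_task *task, struct toy_cred *new)
{
	const struct toy_cred *old = task->cred;

	task->cred = new;	/* the kernel would use RCU here */
	toy_put_cred(old);
	return 0;
}

/* Change the uid, leaving the task's creds untouched on failure. */
int toy_setuid(struct toy_task *task, unsigned int uid)
{
	struct toy_cred *new = toy_prepare_creds(task);

	if (!new)
		return -1;
	if (uid == (unsigned int)-1) {	/* reject an invalid uid */
		toy_abort_creds(new);
		return -1;
	}
	new->uid = uid;
	return toy_commit_creds(task, new);
}
```

The point of the model is that the old set is never written to after it has
been installed: anyone still holding a reference to it keeps seeing the old
values, and a failed change costs nothing more than freeing the unused copy.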

This patch and the preceding patches have been tested with the LTP SELinux
testsuite.


This patch makes several logical sets of alteration:

 (1) execve().

     This now prepares and commits credentials in various places in the
     security code rather than altering the current creds directly.

 (2) Temporary credential overrides.

     do_coredump() and sys_faccessat() now prepare their own credentials and
     temporarily override the ones currently on the acting thread, whilst
     preventing interference from other threads by holding cred_replace_mutex
     on the thread being dumped.

     This will be replaced in a future patch by something that hands down the
     credentials directly to the functions being called, rather than altering
     the task's objective credentials.

 (3) LSM interface.

     A number of functions have been changed, added or removed:

     (*) ...
From: David Howells
Date: Wednesday, August 6, 2008 - 8:38 am

Make execve() take advantage of copy-on-write credentials, allowing it to set
up the credentials in advance, and then commit the whole lot after the point
of no return.

This patch and the preceding patches have been tested with the LTP SELinux
testsuite.

This patch makes several logical sets of alteration:

 (1) execve().

     The credential bits from struct linux_binprm are, for the most part,
     replaced with a single credentials pointer (bprm->cred).  This means that
     all the creds can be calculated in advance and then applied at the point
     of no return with no possibility of failure.

     I would like to replace bprm->cap_effective with:

	cap_isclear(bprm->cap_effective)

     but this seems impossible due to special behaviour for processes of pid 1
     (they always retain their parent's capability masks where normally they'd
     be changed - see cap_bprm_set_creds()).

     The following sequence of events now happens:

     (a) At the start of do_execve, the current task's cred_exec_mutex is
     	 locked to prevent PTRACE_ATTACH from obsoleting the calculation of
     	 creds that we make.

     (a) prepare_exec_creds() is then called to make a copy of the current
     	 task's credentials and prepare it.  This copy is then assigned to
     	 bprm->cred.

  	 This renders security_bprm_alloc() and security_bprm_free()
     	 unnecessary, and so they've been removed.

     (b) The determination of unsafe execution is now performed immediately
     	 after (a) rather than later on in the code.  The result is stored in
     	 bprm->unsafe for future reference.

     (c) prepare_binprm() is called, possibly multiple times.

     	 (i) This applies the result of set[ug]id binaries to the new creds
     	     attached to bprm->cred.  Personality bit clearance is recorded,
     	     but now deferred on the basis that the exec procedure may yet
     	     fail.

         (ii) This then calls the new security_bprm_set_creds().  This should
	     ...
From: David Howells
Date: Wednesday, August 6, 2008 - 8:39 am

Document credentials and the new credentials API.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 Documentation/credentials.txt |  563 +++++++++++++++++++++++++++++++++++++++++
 1 files changed, 563 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/credentials.txt


diff --git a/Documentation/credentials.txt b/Documentation/credentials.txt
new file mode 100644
index 0000000..3caf1ea
--- /dev/null
+++ b/Documentation/credentials.txt
@@ -0,0 +1,563 @@
+			     ====================
+			     CREDENTIALS IN LINUX
+			     ====================
+
+By: David Howells <dhowells@redhat.com>
+
+Contents:
+
+ (*) Overview.
+
+ (*) Types of credentials.
+
+ (*) File markings.
+
+ (*) Task credentials.
+
+     - Accessing task credentials.
+     - Accessing another task's credentials.
+     - Altering credentials.
+     - Managing credentials.
+
+ (*) Open file credentials.
+
+ (*) Overriding the VFS's use of credentials.
+
+
+========
+OVERVIEW
+========
+
+There are several parts to the security check performed by Linux when one
+object acts upon another:
+
+ (1) Objects.
+
+     Objects are things in the system that may be acted upon directly by
+     userspace programs.  Linux has a variety of actionable objects, including:
+
+	- Tasks
+	- Files/inodes
+	- Sockets
+	- Message queues
+	- Shared memory segments
+	- Semaphores
+	- Keys
+
+     As a part of the description of all these objects there is a set of
+     credentials.  What's in the set depends on the type of object.
+
+ (2) Object ownership.
+
+     Amongst the credentials of most objects, there will be a subset that
+     indicates the ownership of that object.  This is used for resource
+     accounting and limitation (disk quotas and task rlimits for example).
+
+     In a standard UNIX filesystem, for instance, this will be defined by the
+     UID marked on the inode.
+
+ (3) The objective context.     
+
+     Also amongst the credentials of those objects, ...
From: David Howells
Date: Wednesday, August 6, 2008 - 8:39 am

Allow kernel services to override LSM settings appropriate to the actions
performed by a task by duplicating a set of credentials, modifying it and then
using task_struct::cred to point to it when performing operations on behalf of
a task.

This is used, for example, by CacheFiles which has to transparently access the
cache on behalf of a process that thinks it is doing, say, NFS accesses with a
potentially inappropriate (with respect to accessing the cache) set of
credentials.

This patch provides two LSM hooks for modifying a task security record:

 (*) security_kernel_act_as() which allows modification of the security datum
     with which a task acts on other objects (most notably files).

 (*) security_kernel_create_files_as() which allows modification of the
     security datum that is used to initialise the security data on a file that
     a task creates.

The patch also provides four new credentials handling functions, which wrap the
LSM functions:

 (1) prepare_kernel_cred()

     Prepare a set of credentials for a kernel service to use, based either on
     a daemon's credentials or on init_cred.  All the keyrings are cleared.

 (2) set_security_override()

     Set the LSM security ID in a set of credentials to a specific security
     context, assuming permission from the LSM policy.

 (3) set_security_override_from_ctx()

     As (2), but takes the security context as a string.

 (4) set_create_files_as()

     Set the file creation LSM security ID in a set of credentials to be the
     same as that on a particular inode.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> [Smack changes]
Signed-off-by: David Howells <dhowells@redhat.com>
---

 include/linux/cred.h       |    6 ++
 include/linux/security.h   |   28 +++++++++++
 kernel/cred.c              |  113 ++++++++++++++++++++++++++++++++++++++++++++
 security/capability.c      |   12 +++++
 security/security.c        |   10 ++++
 security/selinux/hooks.c   |   46 ++++++++++++++++++
 ...
From: David Howells
Date: Wednesday, August 6, 2008 - 8:39 am

Add a 'kernel_service' object class to SELinux and give this object class two
access vectors: 'use_as_override' and 'create_files_as'.

The first vector is used to grant a process the right to nominate an alternate
process security ID for the kernel to use as an override for the SELinux
subjective security when accessing stuff on behalf of another process.

For example, CacheFiles when accessing the cache on behalf of a process
accessing an NFS file needs to use a subjective security ID appropriate to the
cache rather than the one the calling process is using.  The cachefilesd
daemon will nominate the security ID to be used.

The second vector is used to grant a process the right to nominate a file
creation label for a kernel service to use.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 security/selinux/include/av_perm_to_string.h |    2 ++
 security/selinux/include/av_permissions.h    |    2 ++
 security/selinux/include/class_to_string.h   |    5 +++++
 security/selinux/include/flask.h             |    1 +
 4 files changed, 10 insertions(+), 0 deletions(-)


diff --git a/security/selinux/include/av_perm_to_string.h b/security/selinux/include/av_perm_to_string.h
index 1223b4f..c0c8854 100644
--- a/security/selinux/include/av_perm_to_string.h
+++ b/security/selinux/include/av_perm_to_string.h
@@ -176,3 +176,5 @@
    S_(SECCLASS_DCCP_SOCKET, DCCP_SOCKET__NAME_CONNECT, "name_connect")
    S_(SECCLASS_MEMPROTECT, MEMPROTECT__MMAP_ZERO, "mmap_zero")
    S_(SECCLASS_PEER, PEER__RECV, "recv")
+   S_(SECCLASS_KERNEL_SERVICE, KERNEL_SERVICE__USE_AS_OVERRIDE, "use_as_override")
+   S_(SECCLASS_KERNEL_SERVICE, KERNEL_SERVICE__CREATE_FILES_AS, "create_files_as")
diff --git a/security/selinux/include/av_permissions.h b/security/selinux/include/av_permissions.h
index c4c5116..0ba79fe 100644
--- a/security/selinux/include/av_permissions.h
+++ b/security/selinux/include/av_permissions.h
@@ -841,3 +841,5 @@
 #define DCCP_SOCKET__NAME_CONNECT                 0x00800000UL
 ...
From: David Howells
Date: Wednesday, August 6, 2008 - 8:39 am

Differentiate the objective and real subjective credentials from the effective
subjective credentials on a task by introducing a second credentials pointer
into the task_struct.

task_struct::real_cred then refers to the objective and apparent real
subjective credentials of a task, as perceived by the other tasks in the
system.

task_struct::cred then refers to the effective subjective credentials of a
task, as used by that task when it's actually running.  These are not visible
to the other tasks in the system.

__task_cred(task) then refers to the objective/real credentials of the task in
question.

current_cred() refers to the effective subjective credentials of the current
task.

prepare_creds() uses the objective creds as a base and commit_creds() changes
both pointers in the task_struct (indeed commit_creds() requires them to be the
same).

override_creds() and revert_creds() change the subjective creds pointer only,
and the former returns the old subjective creds.  These are used by NFSD,
faccessat() and do_coredump(), and will be used by CacheFiles.

In SELinux, current_has_perm() is provided as an alternative to
task_has_perm().  This uses the effective subjective context of current,
whereas task_has_perm() uses the objective/real context of the subject.
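
The split between the two pointers can be sketched in userspace as follows.
This is a simplified model under stated assumptions: the toy_* names are
hypothetical, reference counting and RCU are left out entirely, and only a
single fsuid field stands in for the whole credential set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cred. */
struct toy_cred {
	unsigned int fsuid;
};

struct toy_task {
	const struct toy_cred *real_cred;	/* objective: what others see */
	const struct toy_cred *cred;		/* subjective: what we act with */
};

/* Swap in temporary subjective creds; return the ones displaced. */
const struct toy_cred *toy_override_creds(struct toy_task *task,
					  const struct toy_cred *new)
{
	const struct toy_cred *old = task->cred;

	task->cred = new;
	return old;
}

/* Restore the subjective creds saved by toy_override_creds(). */
void toy_revert_creds(struct toy_task *task, const struct toy_cred *old)
{
	task->cred = old;
}

/* What another task would see (models __task_cred(task)). */
unsigned int toy_objective_fsuid(const struct toy_task *task)
{
	return task->real_cred->fsuid;
}

/* What the task itself acts with (models current_cred()). */
unsigned int toy_current_fsuid(const struct toy_task *task)
{
	return task->cred->fsuid;
}
```

In this model an override changes only what toy_current_fsuid() reports;
toy_objective_fsuid() is unaffected until both pointers are replaced together,
which is the role the description above gives to commit_creds().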

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/nfsd/auth.c            |    5 +++
 include/linux/cred.h      |   29 ++++++++++----------
 include/linux/init_task.h |    1 +
 include/linux/sched.h     |    5 +++
 kernel/cred.c             |   38 ++++++++++++++++++--------
 kernel/fork.c             |    6 +++-
 security/selinux/hooks.c  |   65 ++++++++++++++++++++++++++++-----------------
 7 files changed, 95 insertions(+), 54 deletions(-)


diff --git a/fs/nfsd/auth.c b/fs/nfsd/auth.c
index 836ffa1..0184fe9 100644
--- a/fs/nfsd/auth.c
+++ b/fs/nfsd/auth.c
@@ -34,6 +34,8 @@ int nfsd_setuser(struct svc_rqst *rqstp, struct svc_export *exp)
 	int flags = nfsexp_flags(rqstp, exp);
 	int ret;
 ...
From: David Howells
Date: Wednesday, August 6, 2008 - 8:37 am

Fix the setting of PF_SUPERPRIV by __capable() as it could corrupt the flags
of the target process if that is not the current process and it is trying to
change its own flags in a different way at the same time.

__capable() is using neither atomic ops nor locking to protect t->flags.  This
patch removes __capable() and introduces has_capability() that doesn't set
PF_SUPERPRIV on the process being queried.
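
The distinction can be sketched in userspace like this.  It is an illustrative
model only: the toy_* names and the single-word capability mask are
assumptions, not the kernel's definitions.  The query helper takes a const
pointer, so the compiler rejects any attempt to write to the flags of the task
being inspected; only the variant that acts on the caller records that
privilege was used.

```c
#include <assert.h>

#define TOY_PF_SUPERPRIV 0x100u	/* hypothetical "used privilege" flag */

/* Hypothetical stand-in for the relevant parts of task_struct. */
struct toy_task {
	unsigned int flags;
	unsigned int cap_effective;	/* one bit per capability */
};

/* Pure query: safe to call on any task, never touches its flags. */
int toy_has_capability(const struct toy_task *task, int cap)
{
	return (task->cap_effective >> cap) & 1u;
}

/* Check the caller itself and note that privilege was exercised. */
int toy_capable(struct toy_task *current_task, int cap)
{
	if (!toy_has_capability(current_task, cap))
		return 0;
	current_task->flags |= TOY_PF_SUPERPRIV;	/* own flags only */
	return 1;
}
```

Because the flag update is a non-atomic read-modify-write, confining it to the
caller's own flags word is what removes the race described above.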

This patch further splits security_ptrace() in two:

 (1) security_ptrace_may_access().  This passes judgement on whether one
     process may access another only (PTRACE_MODE_ATTACH for ptrace() and
     PTRACE_MODE_READ for /proc), and takes a pointer to the child process.
     current is the parent.

 (2) security_ptrace_traceme().  This passes judgement on PTRACE_TRACEME only,
     and takes only a pointer to the parent process.  current is the child.

     In Smack and commoncap, this uses has_capability() to determine whether
     the parent will be permitted to use PTRACE_ATTACH if normal checks fail.
     This does not set PF_SUPERPRIV.


Two of the instances of __capable() actually only act on current, and so have
been changed to calls to capable().

Of the places that were using __capable():

 (1) The OOM killer calls __capable() thrice when weighing the killability of a
     process.  All of these now use has_capability().

 (2) cap_ptrace() and smack_ptrace() were using __capable() to check to see
     whether the parent was allowed to trace any process.  As mentioned above,
     these have been split.  For PTRACE_ATTACH and /proc, capable() is now
     used, and for PTRACE_TRACEME, has_capability() is used.

 (3) cap_safe_nice() only ever saw current, so now uses capable().

 (4) smack_setprocattr() rejected accesses to tasks other than current just
     after calling __capable(), so the order of these two tests has been
     switched and capable() is used instead.

 (5) In smack_file_send_sigiotask(), we need to allow privileged processes to
     ...
From: David Howells
Date: Wednesday, August 6, 2008 - 8:37 am

Disperse the bits of linux/key-ui.h as the reason they were put here (keyfs)
didn't get in.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
---

 include/keys/keyring-type.h |   31 ++++++++++++++++++++
 include/linux/key-ui.h      |   66 -------------------------------------------
 security/keys/internal.h    |   31 ++++++++++++++++++++
 security/keys/keyring.c     |    1 +
 security/keys/request_key.c |    2 +
 5 files changed, 64 insertions(+), 67 deletions(-)
 create mode 100644 include/keys/keyring-type.h
 delete mode 100644 include/linux/key-ui.h


diff --git a/include/keys/keyring-type.h b/include/keys/keyring-type.h
new file mode 100644
index 0000000..843f872
--- /dev/null
+++ b/include/keys/keyring-type.h
@@ -0,0 +1,31 @@
+/* Keyring key type
+ *
+ * Copyright (C) 2008 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _KEYS_KEYRING_TYPE_H
+#define _KEYS_KEYRING_TYPE_H
+
+#include <linux/key.h>
+#include <linux/rcupdate.h>
+
+/*
+ * the keyring payload contains a list of the keys to which the keyring is
+ * subscribed
+ */
+struct keyring_list {
+	struct rcu_head	rcu;		/* RCU deletion hook */
+	unsigned short	maxkeys;	/* max keys this list can hold */
+	unsigned short	nkeys;		/* number of keys currently held */
+	unsigned short	delkey;		/* key to be unlinked by RCU */
+	struct key	*keys[0];
+};
+
+
+#endif /* _KEYS_KEYRING_TYPE_H */
diff --git a/include/linux/key-ui.h b/include/linux/key-ui.h
deleted file mode 100644
index e8b8a7a..0000000
--- a/include/linux/key-ui.h
+++ /dev/null
@@ -1,66 +0,0 @@
-/* key-ui.h: key userspace interface stuff
- *
- * Copyright (C) 2004 Red Hat, Inc. All ...
From: David Howells
Date: Wednesday, August 6, 2008 - 8:37 am

Constify the kernel_cap_t arguments to the capset LSM hooks.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: James Morris <jmorris@namei.org>
---

 include/linux/security.h |   44 ++++++++++++++++++++++++--------------------
 security/commoncap.c     |   10 ++++++----
 security/security.c      |   12 ++++++------
 security/selinux/hooks.c |   10 ++++++----
 4 files changed, 42 insertions(+), 34 deletions(-)


diff --git a/include/linux/security.h b/include/linux/security.h
index dc23a3d..c3eed5a 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -49,8 +49,12 @@ extern int cap_settime(struct timespec *ts, struct timezone *tz);
 extern int cap_ptrace_may_access(struct task_struct *child, unsigned int mode);
 extern int cap_ptrace_traceme(struct task_struct *parent);
 extern int cap_capget(struct task_struct *target, kernel_cap_t *effective, kernel_cap_t *inheritable, kernel_cap_t *permitted);
-extern int cap_capset_check(kernel_cap_t *effective, kernel_cap_t *inheritable, kernel_cap_t *permitted);
-extern void cap_capset_set(kernel_cap_t *effective, kernel_cap_t *inheritable, kernel_cap_t *permitted);
+extern int cap_capset_check(const kernel_cap_t *effective,
+			    const kernel_cap_t *inheritable,
+			    const kernel_cap_t *permitted);
+extern void cap_capset_set(const kernel_cap_t *effective,
+			   const kernel_cap_t *inheritable,
+			   const kernel_cap_t *permitted);
 extern int cap_bprm_set_security(struct linux_binprm *bprm);
 extern void cap_bprm_apply_creds(struct linux_binprm *bprm, int unsafe);
 extern int cap_bprm_secureexec(struct linux_binprm *bprm);
@@ -1289,12 +1293,12 @@ struct security_operations {
 	int (*capget) (struct task_struct *target,
 		       kernel_cap_t *effective,
 		       kernel_cap_t *inheritable, kernel_cap_t *permitted);
-	int (*capset_check) (kernel_cap_t *effective,
-			     kernel_cap_t *inheritable,
-			     kernel_cap_t *permitted);
-	void (*capset_set) ...
From: David Howells
Date: Wednesday, August 6, 2008 - 8:37 am

Take away the ability for sys_capset() to affect processes other than current.

This means that current will not need to lock its own credentials against
interference by other processes when reading them.

This has effectively been the case for a while anyway, since:

 (1) Without LSM enabled, sys_capset() is disallowed.

 (2) With file-based capabilities, sys_capset() is neutered.
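
As a rough illustration of the restriction described above, the check can be
modelled in userspace as below.  This is a sketch only, not the code from the
patch: capset_target_check() is a hypothetical name, and the only assumption
carried over from capset(2) is that a pid of 0 refers to the caller.

```c
#include <errno.h>
#include <sys/types.h>

/*
 * Hypothetical userspace model, not the kernel's implementation: decide
 * whether a capset-style request may proceed once only the caller itself
 * may be targeted.  A pid of 0 is taken to mean "the calling process".
 */
static int capset_target_check(pid_t target, pid_t self)
{
	if (target != 0 && target != self)
		return -EPERM;	/* other processes can no longer be targeted */
	return 0;		/* the caller may alter its own capabilities */
}
```

With a rule of this shape, nothing but the task itself can rewrite its
capability sets, which is what lets it read them without taking a lock.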

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Andrew G. Morgan <morgan@kernel.org>
Acked-by: James Morris <jmorris@namei.org>
---

 fs/open.c                |   12 ---
 include/linux/security.h |   48 +++-------
 kernel/capability.c      |  215 +++-------------------------------------------
 security/commoncap.c     |   29 ++----
 security/security.c      |   18 ++--
 security/selinux/hooks.c |   10 +-
 6 files changed, 52 insertions(+), 280 deletions(-)


diff --git a/fs/open.c b/fs/open.c
index 07da935..0f410b1 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -441,17 +441,7 @@ asmlinkage long sys_faccessat(int dfd, const char __user *filename, int mode)
 	current->fsgid = current->gid;
 
 	if (!issecure(SECURE_NO_SETUID_FIXUP)) {
-		/*
-		 * Clear the capabilities if we switch to a non-root user
-		 */
-#ifndef CONFIG_SECURITY_FILE_CAPABILITIES
-		/*
-		 * FIXME: There is a race here against sys_capset.  The
-		 * capabilities can change yet we will restore the old
-		 * value below.  We should hold task_capabilities_lock,
-		 * but we cannot because user_path_at can sleep.
-		 */
-#endif /* ndef CONFIG_SECURITY_FILE_CAPABILITIES */
+		/* Clear the capabilities if we switch to a non-root user */
 		if (current->uid)
 			old_cap = cap_set_effective(__cap_empty_set);
 		else
diff --git a/include/linux/security.h b/include/linux/security.h
index 80c4d00..dc23a3d 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -49,8 +49,8 @@ extern int cap_settime(struct timespec *ts, struct timezone *tz);
 extern int ...
From: David Howells
Date: Wednesday, August 6, 2008 - 8:37 am

Change current->fs[ug]id to current_fs[ug]id() so that fsgid and fsuid can be
separated from the task_struct.
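
The value of going through an accessor can be sketched in plain C as below.
All of the names here (sketch_cred, sketch_task, sketch_current and the two
accessors) are hypothetical stand-ins rather than the kernel's definitions;
the point is only that callers stop naming the storage location, so the
fields can later move behind a pointer without touching the callers.

```c
#include <sys/types.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
struct sketch_cred {
	uid_t fsuid;
	gid_t fsgid;
};

struct sketch_task {
	const struct sketch_cred *cred;	/* IDs now live behind a pointer */
};

/* Stand-in for the kernel's "current" task pointer. */
static struct sketch_task *sketch_current;

/* Callers use these accessors and never touch the layout directly. */
static inline uid_t sketch_current_fsuid(void)
{
	return sketch_current->cred->fsuid;
}

static inline gid_t sketch_current_fsgid(void)
{
	return sketch_current->cred->fsgid;
}
```

If the layout changes again, only the two accessor bodies need editing.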

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
---

 arch/ia64/kernel/perfmon.c                |    4 ++--
 arch/powerpc/platforms/cell/spufs/inode.c |    4 ++--
 drivers/isdn/capi/capifs.c                |    4 ++--
 drivers/usb/core/inode.c                  |    4 ++--
 fs/9p/fid.c                               |    2 +-
 fs/9p/vfs_inode.c                         |    4 ++--
 fs/9p/vfs_super.c                         |    4 ++--
 fs/affs/inode.c                           |    4 ++--
 fs/anon_inodes.c                          |    4 ++--
 fs/attr.c                                 |    4 ++--
 fs/bfs/dir.c                              |    4 ++--
 fs/cifs/cifsproto.h                       |    2 +-
 fs/cifs/dir.c                             |   12 ++++++------
 fs/cifs/inode.c                           |    8 ++++----
 fs/cifs/misc.c                            |    4 ++--
 fs/coda/cache.c                           |    6 +++---
 fs/coda/upcall.c                          |    2 +-
 fs/devpts/inode.c                         |    4 ++--
 fs/dquot.c                                |    2 +-
 fs/exec.c                                 |    4 ++--
 fs/ext2/balloc.c                          |    2 +-
 fs/ext2/ialloc.c                          |    4 ++--
 fs/ext3/balloc.c                          |    2 +-
 fs/ext3/ialloc.c                          |    4 ++--
 fs/ext4/balloc.c                          |    3 +--
 fs/ext4/ialloc.c                          |    4 ++--
 fs/fat/file.c                             |    2 +-
 fs/fuse/dev.c                             |    4 ++--
 fs/gfs2/inode.c                           |   10 +++++-----
 fs/hfs/inode.c                            |    4 ++--
 fs/hfsplus/inode.c                        |    4 ++--
 fs/hpfs/namei.c                         ...
From: David Howells
Date: Wednesday, August 6, 2008 - 8:38 am

Detach the credentials from task_struct, duplicating them in copy_process()
and releasing them in __put_task_struct().
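
The get/put reference counting this patch introduces for detached
credentials can be modelled in userspace with C11 atomics, roughly as
follows.  This is a sketch under stated assumptions: rc_cred, rc_alloc(),
rc_get(), rc_put() and the rc_freed counter are hypothetical names, and
malloc()/free() stand in for whatever allocator the kernel uses.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical userspace model of a refcounted credentials record. */
struct rc_cred {
	atomic_int usage;	/* number of outstanding references */
	unsigned int uid;
};

static int rc_freed;		/* counts records released, for the demo */

/* Allocate a record holding one reference for the caller. */
static struct rc_cred *rc_alloc(unsigned int uid)
{
	struct rc_cred *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	atomic_init(&c->usage, 1);
	c->uid = uid;
	return c;
}

/* Take an extra reference; the caller must pair it with rc_put(). */
static struct rc_cred *rc_get(struct rc_cred *c)
{
	atomic_fetch_add(&c->usage, 1);
	return c;
}

/* Drop a reference, freeing the record when the last one goes away. */
static void rc_put(struct rc_cred *c)
{
	/* atomic_fetch_sub() returns the previous value of the counter */
	if (atomic_fetch_sub(&c->usage, 1) == 1) {
		rc_freed++;
		free(c);
	}
}
```

A record allocated with rc_alloc() and referenced once more with rc_get()
survives the first rc_put() and is freed by the second.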

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
---

 include/linux/cred.h       |   29 ++++++++++++++
 include/linux/init_task.h  |   16 -------
 include/linux/sched.h      |    1 
 include/linux/security.h   |   26 ++++++------
 kernel/Makefile            |    2 -
 kernel/cred.c              |   94 ++++++++++++++++++++++++++++++++++++++++++++
 kernel/fork.c              |   24 +++--------
 security/capability.c      |    8 ++--
 security/security.c        |    8 ++--
 security/selinux/hooks.c   |   32 +++++++--------
 security/smack/smack_lsm.c |   20 +++++----
 11 files changed, 177 insertions(+), 83 deletions(-)
 create mode 100644 kernel/cred.c


diff --git a/include/linux/cred.h b/include/linux/cred.h
index 17ad725..a0fefc0 100644
--- a/include/linux/cred.h
+++ b/include/linux/cred.h
@@ -158,4 +158,33 @@ do {						\
 	*(_gid) = current->cred->fsgid;		\
 } while(0)
 
+extern void __put_cred(struct cred *);
+extern int copy_creds(struct task_struct *, unsigned long);
+
+/**
+ * get_cred - Get a reference on a set of credentials
+ * @cred: The credentials to reference
+ *
+ * Get a reference on the specified set of credentials.  The caller must
+ * release the reference.
+ */
+static inline struct cred *get_cred(struct cred *cred)
+{
+	atomic_inc(&cred->usage);
+	return cred;
+}
+
+/**
+ * put_cred - Release a reference to a set of credentials
+ * @cred: The credentials to release
+ *
+ * Release a reference to a set of credentials, deleting them when the last ref
+ * is released.
+ */
+static inline void put_cred(struct cred *cred)
+{
+	if (atomic_dec_and_test(&(cred)->usage))
+		__put_cred(cred);
+}
+
 #endif /* _LINUX_CRED_H */
diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index 0abd837..234c24a 100644
--- ...
From: David Howells
Date: Wednesday, August 6, 2008 - 8:37 am

Change most current->e?[ug]id to current_e?[ug]id().

Change some task->e?[ug]id to task_e?[ug]id().  In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these are addressed
by a later patch (see the patch entitled "CRED: Use RCU to access another
task's creds and to release a task's own creds").

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
---

 arch/ia64/kernel/mca_drv.c       |    2 -
 arch/ia64/kernel/perfmon.c       |   19 +++++---
 arch/ia64/kernel/signal.c        |    4 +-
 arch/mips/kernel/mips-mt-fpaff.c |    5 +-
 arch/parisc/kernel/signal.c      |    2 -
 arch/powerpc/mm/fault.c          |    2 -
 arch/s390/hypfs/inode.c          |    4 +-
 arch/x86/mm/fault.c              |    2 -
 drivers/block/loop.c             |    6 ++-
 drivers/char/tty_audit.c         |    6 ++-
 drivers/gpu/drm/drm_fops.c       |    2 -
 drivers/media/video/cpia.c       |    2 -
 drivers/net/tun.c                |    4 +-
 drivers/net/wan/sbni.c           |    9 ++--
 drivers/usb/core/devio.c         |    8 ++-
 fs/affs/super.c                  |    4 +-
 fs/autofs/inode.c                |    4 +-
 fs/autofs4/inode.c               |    4 +-
 fs/autofs4/waitq.c               |    4 +-
 fs/binfmt_elf_fdpic.c            |    8 ++-
 fs/cifs/cifs_fs_sb.h             |    2 -
 fs/cifs/connect.c                |    4 +-
 fs/cifs/ioctl.c                  |    2 -
 fs/dquot.c                       |    2 -
 fs/ecryptfs/messaging.c          |   18 ++++----
 fs/ecryptfs/miscdev.c            |   20 +++++---
 fs/exec.c                        |   14 +++---
 fs/fat/inode.c                   |    4 +-
 fs/fcntl.c                       |    2 -
 fs/hfs/super.c                   |    4 +-
 fs/hfsplus/options.c             |    4 +-
 fs/hpfs/super.c                  |    4 +-
 fs/inotify_user.c                |    2 -
 fs/ioprio.c                      |    4 +-
 ...
From: David Howells
Date: Wednesday, August 6, 2008 - 8:38 am

Wrap current->cred and a few other accessors to hide their actual
implementation.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
---

 drivers/net/tun.c             |    8 +-
 drivers/usb/core/devio.c      |   10 +-
 fs/binfmt_elf.c               |   10 +-
 fs/binfmt_elf_fdpic.c         |    9 +-
 fs/exec.c                     |    5 +
 fs/fcntl.c                    |    3 -
 fs/file_table.c               |    7 +-
 fs/hugetlbfs/inode.c          |    5 +
 fs/ioprio.c                   |    4 -
 fs/smbfs/dir.c                |    3 -
 include/linux/cred.h          |  187 +++++++++++++++++++++++++++++++----------
 include/linux/securebits.h    |    2 
 ipc/mqueue.c                  |    2 
 ipc/shm.c                     |    4 -
 kernel/sys.c                  |   59 ++++++-------
 kernel/uid16.c                |   31 ++++---
 net/core/scm.c                |    2 
 net/sunrpc/auth.c             |   14 ++-
 security/commoncap.c          |    2 
 security/keys/process_keys.c  |    4 -
 security/keys/request_key.c   |   11 +-
 security/selinux/exports.c    |    8 +-
 security/selinux/xfrm.c       |    6 +
 security/smack/smack_access.c |    2 
 security/smack/smack_lsm.c    |   26 +++---
 security/smack/smackfs.c      |    4 -
 26 files changed, 269 insertions(+), 159 deletions(-)


diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 4174855..0b618bb 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -644,6 +644,7 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
 	struct tun_net *tn;
 	struct tun_struct *tun;
 	struct net_device *dev;
+	const struct cred *cred = current_cred();
 	int err;
 
 	tn = net_generic(net, tun_net_id);
@@ -654,11 +655,12 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
 
 		/* Check permissions */
 		if (((tun->owner != -1 &&
-		      current_euid() != tun->owner) ||
+		     ...
From: David Howells
Date: Wednesday, August 6, 2008 - 8:37 am

Separate the task security context from task_struct.  At this point, the
security data is temporarily embedded in the task_struct with two pointers
pointing to it.

Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in
entry.S via asm-offsets.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
---

 arch/alpha/kernel/asm-offsets.c   |   11 +-
 arch/alpha/kernel/entry.S         |   10 +
 arch/mips/kernel/kspd.c           |    4 -
 arch/s390/kernel/compat_linux.c   |   28 ++--
 arch/sparc64/kernel/sys_sparc32.c |   28 ++--
 drivers/connector/cn_proc.c       |    8 +
 fs/binfmt_elf.c                   |   12 +-
 fs/binfmt_elf_fdpic.c             |   12 +-
 fs/exec.c                         |    4 -
 fs/fcntl.c                        |    4 -
 fs/file_table.c                   |    4 -
 fs/fuse/dir.c                     |   12 +-
 fs/hugetlbfs/inode.c              |    4 -
 fs/ioprio.c                       |   12 +-
 fs/nfsd/auth.c                    |   22 ++-
 fs/nfsd/nfs4recover.c             |   12 +-
 fs/nfsd/nfsfh.c                   |    6 -
 fs/open.c                         |   17 +-
 fs/proc/array.c                   |   18 +--
 fs/proc/base.c                    |   16 +-
 fs/xfs/linux-2.6/xfs_cred.h       |    6 -
 fs/xfs/linux-2.6/xfs_globals.h    |    2 
 fs/xfs/linux-2.6/xfs_ioctl.c      |    2 
 fs/xfs/xfs_inode.h                |    2 
 fs/xfs/xfs_vnodeops.h             |   10 +
 include/linux/cred.h              |  155 +++++++++++++++++++---
 include/linux/init_task.h         |   24 ++-
 include/linux/sched.h             |   52 +------
 include/linux/securebits.h        |    2 
 ipc/mqueue.c                      |    2 
 ipc/shm.c                         |    4 -
 kernel/auditsc.c                  |   44 +++---
 kernel/capability.c               |    4 -
 kernel/cgroup.c                   |    4 -
 kernel/exit.c                     |   10 +
 ...
From: David Howells
Date: Wednesday, August 6, 2008 - 8:38 am

Use RCU to access another task's creds and to release a task's own creds.
This means that it will be possible for the credentials of a task to be
replaced without another task (a) requiring a full lock to read them, and (b)
seeing deallocated memory.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
---

 drivers/connector/cn_proc.c  |   16 +++++++----
 fs/binfmt_elf.c              |    8 ++++-
 fs/binfmt_elf_fdpic.c        |    8 ++++-
 fs/fcntl.c                   |   15 +++++++---
 fs/fuse/dir.c                |   23 ++++++++++-----
 fs/ioprio.c                  |   14 +++++++--
 fs/proc/array.c              |   32 ++++++++++++++-------
 fs/proc/base.c               |   32 ++++++++++++++++-----
 include/linux/cred.h         |    3 +-
 kernel/auditsc.c             |   33 ++++++++++++----------
 kernel/cgroup.c              |   16 +++++------
 kernel/exit.c                |   14 ++++++---
 kernel/futex.c               |   22 +++++++++-----
 kernel/futex_compat.c        |    7 +++--
 kernel/ptrace.c              |   22 +++++++++-----
 kernel/sched.c               |   31 ++++++++++++++------
 kernel/signal.c              |   49 ++++++++++++++++++++------------
 kernel/sys.c                 |   11 +++++--
 kernel/tsacct.c              |    6 +++-
 mm/mempolicy.c               |    8 +++--
 mm/migrate.c                 |    8 +++--
 mm/oom_kill.c                |    6 ++--
 security/commoncap.c         |   64 ++++++++++++++++++++++++++----------------
 security/keys/permission.c   |   10 ++++---
 security/keys/process_keys.c |   24 +++++++++-------
 security/selinux/selinuxfs.c |   13 ++++++---
 security/smack/smack_lsm.c   |   32 +++++++++++----------
 27 files changed, 335 insertions(+), 192 deletions(-)


diff --git a/drivers/connector/cn_proc.c b/drivers/connector/cn_proc.c
index 354c1ff..c5afc98 100644
--- a/drivers/connector/cn_proc.c
+++ b/drivers/connector/cn_proc.c
@@ -106,6 ...
From: David Howells
Date: Wednesday, August 6, 2008 - 8:37 am

Alter the use of the key instantiation and negation functions' link-to-keyring
arguments.  Currently this specifies a keyring in the target process to link
the key into, creating the keyring if it doesn't exist.  This, however, can be
a problem for copy-on-write credentials as it means that the instantiating
process can alter the credentials of the requesting process.

This patch alters the behaviour such that:

 (1) If keyctl_instantiate_key() or keyctl_negate_key() are given a specific
     keyring by ID (ringid >= 0), then that keyring will be used.

 (2) If keyctl_instantiate_key() or keyctl_negate_key() are given one of the
     special constants that refer to the requesting process's keyrings
     (KEY_SPEC_*_KEYRING, all <= 0), then:

     (a) If sys_request_key() was given a keyring to use (destringid) then the
     	 key will be attached to that keyring.

     (b) If sys_request_key() was given a NULL keyring, then the key being
     	 instantiated will be attached to the default keyring as set by
     	 keyctl_set_reqkey_keyring().

 (3) No extra link will be made.
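
The decision points above can be sketched as a small selection function.
This is an illustrative model only: pick_dest() and the dest_kind values are
hypothetical names, not the kernel's API.  It assumes that positive IDs name
specific keyrings and negative values are the special constants.  The
description assigns 0 to both ranges, so the sketch treats 0 as "no keyring
nominated", which is an assumption.

```c
/* Hypothetical outcome labels for where the new key gets linked. */
enum dest_kind {
	DEST_NONE,		/* assumption: ringid 0 nominates nothing */
	DEST_SPECIFIC,		/* (1) the keyring named by ringid */
	DEST_REQUEST_KEYRING,	/* (2a) the keyring given to request_key */
	DEST_DEFAULT_KEYRING,	/* (2b) the requestor's default keyring */
};

/*
 * ringid:     the keyring argument supplied by the instantiator
 * destringid: the keyring the requestor passed, or 0 if it passed none
 */
static enum dest_kind pick_dest(int ringid, int destringid)
{
	if (ringid > 0)
		return DEST_SPECIFIC;
	if (ringid == 0)
		return DEST_NONE;
	/* ringid < 0: one of the special per-process constants */
	if (destringid != 0)
		return DEST_REQUEST_KEYRING;
	return DEST_DEFAULT_KEYRING;
}
```

In this model the instantiator never picks a keyring inside the requestor's
credentials; it either names a concrete keyring or defers to the requestor.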

Decision point (1) follows current behaviour, and allows those instantiators
who've searched for a specifically named keyring in the requestor's keyring so
as to partition the keys by type to still have their named keyrings.

Decision point (2) allows the requestor to make sure that the key or keys that
get produced by request_key() go where they want, whilst allowing the
instantiator to request that the key is retained.  This is mainly useful for
situations where the instantiator makes a secondary request, the key for which
should be retained by the initial requestor:

	+-----------+        +--------------+        +--------------+
	|           |        |              |        |              |
	| Requestor |------->| Instantiator |------->| Instantiator |
	|           |        |              |        |              |
	+-----------+        +--------------+        +--------------+
	           ...
From: James Morris
Date: Thursday, August 7, 2008 - 1:50 am

These patches don't apply from here on:

Applying CRED: Detach the credentials from task_struct [ver #7]
error: patch failed: include/linux/cred.h:158
error: include/linux/cred.h: patch does not apply
error: patch failed: include/linux/init_task.h:115
error: include/linux/init_task.h: patch does not apply
error: patch failed: include/linux/sched.h:1109
error: include/linux/sched.h: patch does not apply
error: patch failed: kernel/fork.c:145
error: kernel/fork.c: patch does not apply
error: patch failed: security/selinux/hooks.c:166
error: security/selinux/hooks.c: patch does not apply
error: patch failed: security/smack/smack_lsm.c:984
error: security/smack/smack_lsm.c: patch does not apply
Patch failed at 0008.



- James
-- 
James Morris
<jmorris@namei.org>
--

From: David Howells
Date: Thursday, August 7, 2008 - 2:32 am

Hmmm...  Applies for me on top of your next branch (commit ID
421fae06be9e0dac45747494756b3580643815f9).

David
--

From: James Morris
Date: Thursday, August 7, 2008 - 2:56 am

Interesting, looks like git-am got confused when processing the patches as 
a single mailbox, but worked ok when run on a per-patch basis.


- James
-- 
James Morris
<jmorris@namei.org>
--

From: James Morris
Date: Thursday, August 7, 2008 - 5:03 am

These patches are now in 
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6#next-creds

This is the 'next' branch + the credentials patches.

Stephen, could you add 'next-creds' to linux-next?


- James
-- 
James Morris
<jmorris@namei.org>
--

From: Stephen Rothwell
Date: Thursday, August 7, 2008 - 5:23 pm

Hi James,

On Thu, 7 Aug 2008 22:03:57 +1000 (EST) James Morris <jmorris@namei.org> wrote:

OK, I have added it from today with you and David as contacts.  It is
right at the end of the merge, so it will go away if it is too painful (I
don't expect it to be, just letting you know).

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/