[PATCH 14/16] NFS: Use local caching [try #3]

From: David Howells
Date: Friday, August 10, 2007 - 9:04 am

These patches add local caching for network filesystems such as NFS and AFS.

FS-Cache now runs fully asynchronously as required by Trond Myklebust for NFS.

--
Changes:
[try #3]:

 (*) Added missing file to CacheFiles patch.

 (*) Made new security functions return errors and pass actual return data via
     argument pointer.

 (*) Cleaned up NFS patch.

 (*) The 'fsc' flag must now be passed to NFS mount by the string options.

 (*) Split the NFS patch into three as requested by Trond.

[try #2]:

 (*) The CacheFiles module no longer accepts directory fds in its cull and
     inuse commands from cachefilesd.  Instead it uses the current working
     directory of the calling process as the basis for looking up the object.
     Corollary to this, fget_light() no longer needs to be exported.

--
A tarball of the patches is available at:

	http://people.redhat.com/~dhowells/fscache/patches/nfs+fscache-22.tar.bz2


To use this version of CacheFiles, cachefilesd-0.9 is also required.  It is
available as an SRPM:

	http://people.redhat.com/~dhowells/fscache/cachefilesd-0.9-1.fc7.src.rpm

Or as individual bits:

	http://people.redhat.com/~dhowells/fscache/cachefilesd-0.9.tar.bz2
	http://people.redhat.com/~dhowells/fscache/cachefilesd.fc
	http://people.redhat.com/~dhowells/fscache/cachefilesd.if
	http://people.redhat.com/~dhowells/fscache/cachefilesd.te
	http://people.redhat.com/~dhowells/fscache/cachefilesd.spec

The .fc, .if and .te files are the SELinux policy sources (file contexts,
interface and type enforcement) for cachefilesd.

David
-

From: David Howells
Date: Friday, August 10, 2007 - 9:05 am

The attached patch causes read_cache_pages() to release page-private data on a
page for which add_to_page_cache() fails or the filler function fails. This
permits pages with caching references associated with them to be cleaned up.

The invalidatepage() address space op is called (indirectly) to do the honours.

Signed-Off-By: David Howells <dhowells@redhat.com>
---

 mm/readahead.c |   40 ++++++++++++++++++++++++++++++++++++++--
 1 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index 39bf45d..12d1378 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -15,6 +15,7 @@
 #include <linux/backing-dev.h>
 #include <linux/task_io_accounting_ops.h>
 #include <linux/pagevec.h>
+#include <linux/buffer_head.h>
 
 void default_unplug_io_fn(struct backing_dev_info *bdi, struct page *page)
 {
@@ -51,6 +52,41 @@ EXPORT_SYMBOL_GPL(file_ra_state_init);
 
 #define list_to_page(head) (list_entry((head)->prev, struct page, lru))
 
+/*
+ * see if a page needs releasing upon read_cache_pages() failure
+ * - the caller of read_cache_pages() may have set PG_private before calling,
+ *   such as the NFS fs marking pages that are cached locally on disk, thus we
+ *   need to give the fs a chance to clean up in the event of an error
+ */
+static void read_cache_pages_invalidate_page(struct address_space *mapping,
+					     struct page *page)
+{
+	if (PagePrivate(page)) {
+		if (TestSetPageLocked(page))
+			BUG();
+		page->mapping = mapping;
+		do_invalidatepage(page, 0);
+		page->mapping = NULL;
+		unlock_page(page);
+	}
+	page_cache_release(page);
+}
+
+/*
+ * release a list of pages, invalidating them first if need be
+ */
+static void read_cache_pages_invalidate_pages(struct address_space *mapping,
+					      struct list_head *pages)
+{
+	struct page *victim;
+
+	while (!list_empty(pages)) {
+		victim = list_to_page(pages);
+		list_del(&victim->lru);
+		read_cache_pages_invalidate_page(mapping, victim);
+	}
+}
+
 /**
  * ...
From: David Howells
Date: Friday, August 10, 2007 - 9:05 am

Recruit a couple of page flags to aid in cache management.  The following extra
flags are defined:

 (1) PG_fscache (PG_owner_priv_2)

     The marked page is backed by a local cache and is pinning resources in the
     cache driver.

 (2) PG_fscache_write (PG_owner_priv_3)

     The marked page is being written to the local cache.  The page may not be
     modified whilst this is in progress.

If PG_fscache is set, then things that checked for PG_private will now also
check for that.  This includes things like truncation and page invalidation.
The function page_has_private() has been added to detect this.
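
As a rough illustration of the combined check, here is a user-space sketch.
The bit numbers for PG_private and PG_owner_priv_2 are the ones in the diff
below; struct model_page and the model_* helpers are hypothetical stand-ins,
and the real page_has_private() lives in the elided part of the patch and may
well be written differently.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Bit numbers as they appear in include/linux/page-flags.h in this patch. */
#define PG_private		11
#define PG_owner_priv_2		13
#define PG_fscache		PG_owner_priv_2

/* Hypothetical stand-in for struct page: only the flags word is modelled. */
struct model_page {
	unsigned long flags;
};

/* Set one flag bit in the model page. */
static void model_set_flag(struct model_page *page, unsigned int bit)
{
	assert(bit < sizeof(page->flags) * CHAR_BIT);
	page->flags |= 1UL << bit;
}

/*
 * True if either PG_private or PG_fscache is set, i.e. if the page holds
 * fs-private data or is pinning resources in a local cache.
 */
static bool model_page_has_private(const struct model_page *page)
{
	const unsigned long mask = (1UL << PG_private) | (1UL << PG_fscache);

	return (page->flags & mask) != 0;
}
```

The point of the single helper is that callers such as truncation and
invalidation do not need to know which of the two bits triggered the cleanup.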

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/splice.c                |    2 +-
 include/linux/page-flags.h |   30 +++++++++++++++++++++++++++++-
 include/linux/pagemap.h    |   11 +++++++++++
 mm/filemap.c               |   16 ++++++++++++++++
 mm/migrate.c               |    2 +-
 mm/page_alloc.c            |    3 +++
 mm/readahead.c             |    9 +++++----
 mm/swap.c                  |    4 ++--
 mm/swap_state.c            |    4 ++--
 mm/truncate.c              |   10 +++++-----
 mm/vmscan.c                |    2 +-
 11 files changed, 76 insertions(+), 17 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index c010a72..ae4f5b7 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -58,7 +58,7 @@ static int page_cache_pipe_buf_steal(struct pipe_inode_info *pipe,
 		 */
 		wait_on_page_writeback(page);
 
-		if (PagePrivate(page))
+		if (page_has_private(page))
 			try_to_release_page(page, GFP_KERNEL);
 
 		/*
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 209d3a4..eaf9854 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -83,19 +83,24 @@
 #define PG_private		11	/* If pagecache, has fs-private data */
 
 #define PG_writeback		12	/* Page is under writeback */
+#define PG_owner_priv_2		13	/* Owner use. If pagecache, fs may use */
 #define PG_compound		14	/* Part of a compound page */
 ...
From: David Howells
Date: Friday, August 10, 2007 - 9:05 am

Provide an add_wait_queue_tail() function to add a waiter to the back of a
wait queue instead of the front.
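
To show why the position matters, the following user-space sketch models a
wait queue as a circular doubly linked list that is woken by walking from the
head.  All names here (struct node, add_head(), add_tail(), wake_order()) are
made up for the illustration; it assumes nothing about the kernel beyond
"front waiters are woken first, tail waiters last".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node standing in for a wait_queue_t on its queue. */
struct node {
	struct node *prev, *next;
	int id;
};

/* An empty queue is a head that points at itself. */
static void queue_init(struct node *head)
{
	head->prev = head->next = head;
	head->id = -1;
}

static void insert_between(struct node *n, struct node *prev,
			   struct node *next)
{
	assert(n != prev && n != next);
	n->prev = prev;
	n->next = next;
	prev->next = n;
	next->prev = n;
}

/* Model of adding a waiter at the front of the queue. */
static void add_head(struct node *head, struct node *n)
{
	insert_between(n, head, head->next);
}

/* Model of adding a waiter at the back of the queue. */
static void add_tail(struct node *head, struct node *n)
{
	insert_between(n, head->prev, head);
}

/* Record waiter ids in the order a head-to-tail wakeup would visit them. */
static size_t wake_order(const struct node *head, int *out, size_t max)
{
	const struct node *p;
	size_t n = 0;

	for (p = head->next; p != head && n < max; p = p->next)
		out[n++] = p->id;
	return n;
}
```

With two waiters added at the head and a third added at the tail, the third is
visited last regardless of when it was queued.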

Signed-off-by: David Howells <dhowells@redhat.com>
---

 include/linux/wait.h |    1 +
 kernel/wait.c        |   18 ++++++++++++++++++
 2 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/include/linux/wait.h b/include/linux/wait.h
index 0e68628..4cae7db 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -118,6 +118,7 @@ static inline int waitqueue_active(wait_queue_head_t *q)
 #define is_sync_wait(wait)	(!(wait) || ((wait)->private))
 
 extern void FASTCALL(add_wait_queue(wait_queue_head_t *q, wait_queue_t * wait));
+extern void FASTCALL(add_wait_queue_tail(wait_queue_head_t *q, wait_queue_t * wait));
 extern void FASTCALL(add_wait_queue_exclusive(wait_queue_head_t *q, wait_queue_t * wait));
 extern void FASTCALL(remove_wait_queue(wait_queue_head_t *q, wait_queue_t * wait));
 
diff --git a/kernel/wait.c b/kernel/wait.c
index 444ddbf..7acc9cc 100644
--- a/kernel/wait.c
+++ b/kernel/wait.c
@@ -29,6 +29,24 @@ void fastcall add_wait_queue(wait_queue_head_t *q, wait_queue_t *wait)
 }
 EXPORT_SYMBOL(add_wait_queue);
 
+/**
+ * add_wait_queue_tail - Add a waiter to the back of a waitqueue
+ * @q: the wait queue to append the waiter to
+ * @wait: the waiter to be queued
+ *
+ * Add a waiter to the back of a waitqueue so that it gets woken up last.
+ */
+void fastcall add_wait_queue_tail(wait_queue_head_t *q, wait_queue_t *wait)
+{
+	unsigned long flags;
+
+	wait->flags &= ~WQ_FLAG_EXCLUSIVE;
+	spin_lock_irqsave(&q->lock, flags);
+	__add_wait_queue_tail(q, wait);
+	spin_unlock_irqrestore(&q->lock, flags);
+}
+EXPORT_SYMBOL(add_wait_queue_tail);
+
 void fastcall add_wait_queue_exclusive(wait_queue_head_t *q, wait_queue_t *wait)
 {
 	unsigned long flags;

-

From: David Howells
Date: Friday, August 10, 2007 - 9:05 am

The attached patch adds a generic intermediary (FS-Cache) by which filesystems
may call on local caching capabilities, and by which local caching backends may
make caches available:

	+---------+
	|         |                        +--------------+
	|   NFS   |--+                     |              |
	|         |  |                 +-->|   CacheFS    |
	+---------+  |   +----------+  |   |  /dev/hda5   |
	             |   |          |  |   +--------------+
	+---------+  +-->|          |  |
	|         |      |          |--+
	|   AFS   |----->| FS-Cache |
	|         |      |          |--+
	+---------+  +-->|          |  |
	             |   |          |  |   +--------------+
	+---------+  |   +----------+  |   |              |
	|         |  |                 +-->|  CacheFiles  |
	|  ISOFS  |--+                     |  /var/cache  |
	|         |                        +--------------+
	+---------+

The patch also documents the netfs interface and the cache backend
interface provided by the facility.


There are a number of reasons why I'm not using i_mapping to do this.
These have been discussed a lot on the LKML and CacheFS mailing lists,
but to summarise the basics:

 (1) Most filesystems don't do hole reportage.  Holes in files are treated as
     blocks of zeros and can't be distinguished otherwise, making it difficult
     to distinguish blocks that have been read from the network and cached from
     those that haven't.

 (2) The backing inode must be fully populated before being exposed to
     userspace through the main inode because the VM/VFS goes directly to the
     backing inode and does not interrogate the front inode on VM ops.

     Therefore:

     (a) The backing inode must fit entirely within the cache.

     (b) All backed files currently open must fit entirely within the cache at
     	 the same time.

     (c) A working set of files in total larger than the cache may not be
     	 cached.

     (d) A file may not grow larger than the available ...
From: David Howells
Date: Friday, August 10, 2007 - 9:05 am

This one-line patch fixes the missing export of copy_page introduced
by the cachefile patches.  This patch is not yet upstream, but is required
for cachefile on ia64.  It will be pushed upstream when cachefile goes
upstream.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-Off-By: David Howells <dhowells@redhat.com>
---

 arch/ia64/kernel/ia64_ksyms.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/ia64/kernel/ia64_ksyms.c b/arch/ia64/kernel/ia64_ksyms.c
index bd17190..20c3546 100644
--- a/arch/ia64/kernel/ia64_ksyms.c
+++ b/arch/ia64/kernel/ia64_ksyms.c
@@ -43,6 +43,7 @@ EXPORT_SYMBOL(__do_clear_user);
 EXPORT_SYMBOL(__strlen_user);
 EXPORT_SYMBOL(__strncpy_from_user);
 EXPORT_SYMBOL(__strnlen_user);
+EXPORT_SYMBOL(copy_page);
 
 /* from arch/ia64/lib */
 extern void __divsi3(void);

-

From: David Howells
Date: Friday, August 10, 2007 - 9:05 am

Add an address space operation to write one single page of data to an inode at
a page-aligned location (thus permitting the implementation to be highly
optimised).

This is used by CacheFiles to store the contents of netfs pages into their
backing file pages.

Supply a generic implementation for this that uses the prepare_write() and
commit_write() address_space operations to bound a copy directly into the page
cache.

Hook the Ext2 and Ext3 operations to the generic implementation.

Signed-Off-By: David Howells <dhowells@redhat.com>
---

 fs/ext2/inode.c    |    2 +
 fs/ext3/inode.c    |    3 ++
 include/linux/fs.h |    7 ++++
 mm/filemap.c       |   95 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 107 insertions(+), 0 deletions(-)

diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index 0079b2c..b3e4b50 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -695,6 +695,7 @@ const struct address_space_operations ext2_aops = {
 	.direct_IO		= ext2_direct_IO,
 	.writepages		= ext2_writepages,
 	.migratepage		= buffer_migrate_page,
+	.write_one_page		= generic_file_buffered_write_one_page,
 };
 
 const struct address_space_operations ext2_aops_xip = {
@@ -713,6 +714,7 @@ const struct address_space_operations ext2_nobh_aops = {
 	.direct_IO		= ext2_direct_IO,
 	.writepages		= ext2_writepages,
 	.migratepage		= buffer_migrate_page,
+	.write_one_page		= generic_file_buffered_write_one_page,
 };
 
 /*
diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
index de4e316..93809eb 100644
--- a/fs/ext3/inode.c
+++ b/fs/ext3/inode.c
@@ -1713,6 +1713,7 @@ static const struct address_space_operations ext3_ordered_aops = {
 	.releasepage	= ext3_releasepage,
 	.direct_IO	= ext3_direct_IO,
 	.migratepage	= buffer_migrate_page,
+	.write_one_page	= generic_file_buffered_write_one_page,
 };
 
 static const struct address_space_operations ext3_writeback_aops = {
@@ -1727,6 +1728,7 @@ static const struct address_space_operations ext3_writeback_aops = {
 ...
From: David Howells
Date: Friday, August 10, 2007 - 9:05 am

Add a function to install a monitor on the page lock waitqueue for a particular
page, thus allowing the page being unlocked to be detected.

This is used by CacheFiles to detect read completion on a page in the backing
filesystem so that it can then copy the data to the waiting netfs page.

Signed-Off-By: David Howells <dhowells@redhat.com>
---

 include/linux/pagemap.h |    5 +++++
 mm/filemap.c            |   19 +++++++++++++++++++
 2 files changed, 24 insertions(+), 0 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index d1049b6..452fdcf 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -220,6 +220,11 @@ static inline void wait_on_page_fscache_write(struct page *page)
 extern void end_page_fscache_write(struct page *page);
 
 /*
+ * Add an arbitrary waiter to a page's wait queue
+ */
+extern void add_page_wait_queue(struct page *page, wait_queue_t *waiter);
+
+/*
  * Fault a userspace page into pagetables.  Return non-zero on a fault.
  *
  * This assumes that two userspace pages are always sufficient.  That's
diff --git a/mm/filemap.c b/mm/filemap.c
index 5e419a2..c60c24e 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -518,6 +518,25 @@ void fastcall wait_on_page_bit(struct page *page, int bit_nr)
 EXPORT_SYMBOL(wait_on_page_bit);
 
 /**
+ * add_page_wait_queue - Add an arbitrary waiter to a page's wait queue
+ * @page - Page defining the wait queue of interest
+ * @waiter - Waiter to add to the queue
+ *
+ * Add an arbitrary @waiter to the wait queue for the nominated @page.
+ */
+void add_page_wait_queue(struct page *page, wait_queue_t *waiter)
+{
+	wait_queue_head_t *q = page_waitqueue(page);
+	unsigned long flags;
+
+	spin_lock_irqsave(&q->lock, flags);
+	__add_wait_queue(q, waiter);
+	spin_unlock_irqrestore(&q->lock, flags);
+}
+
+EXPORT_SYMBOL_GPL(add_page_wait_queue);
+
+/**
  * unlock_page - unlock a locked page
  * @page: the page
  *

-

From: David Howells
Date: Friday, August 10, 2007 - 9:05 am

Export a number of functions for CacheFiles's use.

Signed-Off-By: David Howells <dhowells@redhat.com>
---

 fs/super.c       |    2 ++
 kernel/auditsc.c |    2 ++
 2 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/fs/super.c b/fs/super.c
index fc8ebed..c0d99dd 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -270,6 +270,8 @@ int fsync_super(struct super_block *sb)
 	return sync_blockdev(sb->s_bdev);
 }
 
+EXPORT_SYMBOL_GPL(fsync_super);
+
 /**
  *	generic_shutdown_super	-	common helper for ->kill_sb()
  *	@sb: superblock to kill
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 3401293..0112179 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -1526,6 +1526,8 @@ add_names:
 	}
 }
 
+EXPORT_SYMBOL_GPL(__audit_inode_child);
+
 /**
  * auditsc_get_stamp - get local copies of audit_context values
  * @ctx: audit_context for the task

-

From: David Howells
Date: Friday, August 10, 2007 - 9:05 am

Make it possible for a process's file creation SID to be temporarily overridden
by CacheFiles so that files created in the cache have the right label attached.

Without this facility, files created in the cache will be given the current
file creation SID of whatever process happens to have invoked CacheFiles
indirectly by means of opening a netfs file at the time the cache file is
created.
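
The intended usage pattern is save, override, create, restore.  A user-space
sketch of that pattern follows; the mock_* functions and
create_cache_file_as() are hypothetical, and only the (secid, oldsecid)
argument shape is borrowed from the set_fscreate_secid hook in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;

/* Stand-in for the calling task's file creation SID. */
static u32 mock_fscreate_secid;
/* Label that the mock "filesystem" gave to the last file created. */
static u32 mock_last_label;

/* Same argument shape as the hook: set a new SID, hand back the old one. */
static int mock_set_fscreate_secid(u32 secid, u32 *oldsecid)
{
	assert(oldsecid != NULL);
	*oldsecid = mock_fscreate_secid;
	mock_fscreate_secid = secid;
	return 0;
}

/* Pretend to create a file; it is labelled with the current creation SID. */
static int mock_create_file(void)
{
	mock_last_label = mock_fscreate_secid;
	return 0;
}

/* Create a cache file under cache_secid, then restore the caller's SID. */
static int create_cache_file_as(u32 cache_secid)
{
	u32 saved, ignored;
	int ret, ret2;

	ret = mock_set_fscreate_secid(cache_secid, &saved);
	if (ret < 0)
		return ret;

	ret = mock_create_file();

	/* Always put the caller's SID back, even if creation failed. */
	ret2 = mock_set_fscreate_secid(saved, &ignored);
	return ret < 0 ? ret : ret2;
}
```

The restore step is what keeps the override temporary: the process that
happened to trigger the cache file creation gets its own creation SID back.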

Signed-Off-By: David Howells <dhowells@redhat.com>
---

 include/linux/security.h |   39 +++++++++++++++++++++++++++++++++++++++
 security/dummy.c         |   14 ++++++++++++++
 security/selinux/hooks.c |   20 ++++++++++++++++++++
 3 files changed, 73 insertions(+), 0 deletions(-)

diff --git a/include/linux/security.h b/include/linux/security.h
index c11dc8a..edd1677 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -1147,6 +1147,15 @@ struct request_sock;
  *	@secdata contains the security context.
  *	@seclen contains the length of the security context.
  *
+ * @get_fscreate_secid:
+ *	Get the current FS security ID.
+ *	@secid points the location in which to return the security ID.
+ *
+ * @set_fscreate_secid:
+ *	Set the current FS security ID.
+ *	@secid contains the security ID to set.
+ *	@oldsecid points the location in which to return the old security ID.
+ *
  * This is the main security structure.
  */
 struct security_operations {
@@ -1330,6 +1339,8 @@ struct security_operations {
  	int (*setprocattr)(struct task_struct *p, char *name, void *value, size_t size);
 	int (*secid_to_secctx)(u32 secid, char **secdata, u32 *seclen);
 	void (*release_secctx)(char *secdata, u32 seclen);
+	int (*get_fscreate_secid)(u32 *secid);
+	int (*set_fscreate_secid)(u32 secid, u32 *oldsecid);
 
 #ifdef CONFIG_SECURITY_NETWORK
 	int (*unix_stream_connect) (struct socket * sock,
@@ -2127,6 +2138,16 @@ static inline void security_release_secctx(char *secdata, u32 seclen)
 	return security_ops->release_secctx(secdata, seclen);
 }
 
+static inline int ...
From: Casey Schaufler
Date: Friday, August 10, 2007 - 9:52 am

I still object to the use of sids in LSM interfaces. I still owe you
a viable alternative.


Casey Schaufler
casey@schaufler-ca.com
-

From: David Howells
Date: Friday, August 10, 2007 - 9:05 am

Add an act-as SID to task_security_struct that is equivalent to fsuid/fsgid in
task_struct.  This permits a task to perform operations as if it is the
overriding SID, without changing its own SID as that might be needed to control
access to the process by ptrace, signals, /proc, etc.

This is useful for CacheFiles in that it allows CacheFiles to access the cache
files and directories using the cache's security context rather than the
security context of the process on whose behalf it is working, and in the
context of which it is running.

Signed-Off-By: David Howells <dhowells@redhat.com>
---

 include/linux/security.h          |   36 ++++++++
 security/dummy.c                  |   14 +++
 security/selinux/exports.c        |    2 
 security/selinux/hooks.c          |  162 +++++++++++++++++++++++--------------
 security/selinux/include/objsec.h |    1 
 security/selinux/selinuxfs.c      |    2 
 security/selinux/xfrm.c           |    6 +
 7 files changed, 156 insertions(+), 67 deletions(-)

diff --git a/include/linux/security.h b/include/linux/security.h
index edd1677..194ef49 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -1156,6 +1156,18 @@ struct request_sock;
  *	@secid contains the security ID to set.
  *	@oldsecid points the location in which to return the old security ID.
  *
+ * @act_as_secid:
+ *	Set the security ID as which to act, returning the security ID as which
+ *      the process was previously acting.
+ *	@secid contains the security ID to act as.
+ *	@oldsecid points the location in which to return the displaced security ID.
+ *
+ * @act_as_self:
+ *	Reset the security ID as which to act to be the same as the process's
+ *      owning security ID, and return the security ID as which the process was
+ *      previously acting.
+ *	@oldsecid points the location in which to return the displaced security ID.
+ *
  * This is the main security structure.
  */
 struct security_operations {
@@ -1341,6 +1353,8 @@ struct ...
From: David Howells
Date: Friday, August 10, 2007 - 9:05 am

Permit an inode's security ID to be obtained by the CacheFiles module.  This is
then used as the SID with which files and directories will be created in the
cache.

Signed-Off-By: David Howells <dhowells@redhat.com>
---

 include/linux/security.h |   19 +++++++++++++++++++
 security/dummy.c         |    7 +++++++
 security/selinux/hooks.c |    9 +++++++++
 3 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/include/linux/security.h b/include/linux/security.h
index 194ef49..a54958a 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -414,6 +414,11 @@ struct request_sock;
  *	the size of the buffer required.
  *	Returns number of bytes used/required on success.
  *
+ * @inode_get_secid:
+ *	Retrieve the security ID from an inode.
+ *	@inode refers to the inode to get the security ID from.
+ *	@secid points the location in which to return the security ID.
+ *
  * Security hooks for file operations
  *
  * @file_permission:
@@ -1256,6 +1261,7 @@ struct security_operations {
   	int (*inode_getsecurity)(const struct inode *inode, const char *name, void *buffer, size_t size, int err);
   	int (*inode_setsecurity)(struct inode *inode, const char *name, const void *value, size_t size, int flags);
   	int (*inode_listsecurity)(struct inode *inode, char *buffer, size_t buffer_size);
+	int (*inode_get_secid)(struct inode *inode, u32 *secid);
 
 	int (*file_permission) (struct file * file, int mask);
 	int (*file_alloc_security) (struct file * file);
@@ -1818,6 +1824,13 @@ static inline int security_inode_listsecurity(struct inode *inode, char *buffer,
 	return security_ops->inode_listsecurity(inode, buffer, buffer_size);
 }
 
+static inline int security_inode_get_secid(struct inode *inode, u32 *secid)
+{
+	if (unlikely(IS_PRIVATE(inode)))
+		return 0;
+	return security_ops->inode_get_secid(inode, secid);
+}
+
 static inline int security_file_permission (struct file *file, int mask)
 {
 	return security_ops->file_permission (file, mask);
@@ ...
From: David Howells
Date: Friday, August 10, 2007 - 9:05 am

Get the SID under which the CacheFiles module should operate so that the
SELinux security system can control the accesses it makes.

Signed-Off-By: David Howells <dhowells@redhat.com>
---

 include/linux/security.h |   20 ++++++++++++++++++++
 security/dummy.c         |    7 +++++++
 security/selinux/hooks.c |    7 +++++++
 3 files changed, 34 insertions(+), 0 deletions(-)

diff --git a/include/linux/security.h b/include/linux/security.h
index a54958a..593a4d0 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -1173,6 +1173,14 @@ struct request_sock;
  *      previously acting.
  *	@oldsecid points the location in which to return the displaced security ID.
  *
+ * @cachefiles_get_secid:
+ *	Determine the security ID for the CacheFiles module to use when
+ *	accessing the filesystem containing the cache.
+ *	@secid contains the security ID under which cachefiles daemon is
+ *      running.
+ *	@modsecid contains the pointer to where the security ID for the module
+ *	is to be stored.
+ *
  * This is the main security structure.
  */
 struct security_operations {
@@ -1361,6 +1369,7 @@ struct security_operations {
 	int (*set_fscreate_secid)(u32 secid, u32 *oldsecid);
 	int (*act_as_secid)(u32 secid, u32 *oldsecid);
 	int (*act_as_self)(u32 *oldsecid);
+	int (*cachefiles_get_secid)(u32 secid, u32 *modsecid);
 
 #ifdef CONFIG_SECURITY_NETWORK
 	int (*unix_stream_connect) (struct socket * sock,
@@ -2185,6 +2194,11 @@ static inline int security_act_as_self(u32 *oldsecid)
 	return security_ops->act_as_self(oldsecid);
 }
 
+static inline int security_cachefiles_get_secid(u32 secid, u32 *modsecid)
+{
+	return security_ops->cachefiles_get_secid(secid, modsecid);
+}
+
 /* prototypes */
 extern int security_init	(void);
 extern int register_security	(struct security_operations *ops);
@@ -2897,6 +2911,12 @@ static inline u32 security_act_as_self(u32 *oldsecid)
 	return 0;
 }
 
+static inline int security_cachefiles_get_secid(u32 secid, u32 ...
From: David Howells
Date: Friday, August 10, 2007 - 9:06 am

Add an FS-Cache cache-backend that permits a mounted filesystem to be used as a
backing store for the cache.


CacheFiles uses a userspace daemon to do some of the cache management - such as
reaping stale nodes and culling.  This is called cachefilesd and lives in
/sbin.  The source for the daemon can be downloaded from:

	http://people.redhat.com/~dhowells/cachefs/cachefilesd.c

And an example configuration from:

	http://people.redhat.com/~dhowells/cachefs/cachefilesd.conf

The filesystem and data integrity of the cache are only as good as those of the
filesystem providing the backing services.  Note that CacheFiles does not
attempt to journal anything since the journalling interfaces of the various
filesystems are very specific in nature.

CacheFiles creates a proc-file - "/proc/fs/cachefiles" - that is used for
communication with the daemon.  Only one thing may have this open at once, and
whilst it is open, a cache is at least partially in existence.  The daemon
opens this and sends commands down it to control the cache.

CacheFiles is currently limited to a single cache.

CacheFiles attempts to maintain at least a certain percentage of free space on
the filesystem, shrinking the cache by culling the objects it contains to make
space if necessary - see the "Cache Culling" section.  This means it can be
placed on the same medium as a live set of data, and will expand to make use of
spare space and automatically contract when the set of data requires more
space.


============
REQUIREMENTS
============

The use of CacheFiles and its daemon requires the following features to be
available in the system and in the cache filesystem:

	- dnotify.

	- extended attributes (xattrs).

	- openat() and friends.

	- bmap() support on files in the filesystem (FIBMAP ioctl).

	- The use of bmap() to detect a partial page at the end of the file.

It is strongly recommended that the "dir_index" option is enabled on Ext3
filesystems being used as a ...
From: David Howells
Date: Friday, August 10, 2007 - 9:06 am

The attached patch makes it possible for the NFS filesystem to make use of the
network filesystem local caching service (FS-Cache).

To be able to use this, an updated mount program is required.  This can be
obtained from:

	http://people.redhat.com/steved/fscache/util-linux/

To mount an NFS filesystem to use caching, add an "fsc" option to the mount:

	mount warthog:/ /a -o fsc

Signed-Off-By: David Howells <dhowells@redhat.com>
---

 fs/nfs/Makefile           |    1 
 fs/nfs/client.c           |    5 +
 fs/nfs/file.c             |   51 ++++++
 fs/nfs/fscache-def.c      |  288 +++++++++++++++++++++++++++++++++++
 fs/nfs/fscache.c          |  374 +++++++++++++++++++++++++++++++++++++++++++++
 fs/nfs/fscache.h          |  144 +++++++++++++++++
 fs/nfs/inode.c            |   48 +++++-
 fs/nfs/read.c             |   28 +++
 fs/nfs/sysctl.c           |   44 +++++
 include/linux/nfs_fs.h    |    8 +
 include/linux/nfs_fs_sb.h |    7 +
 11 files changed, 988 insertions(+), 10 deletions(-)

diff --git a/fs/nfs/Makefile b/fs/nfs/Makefile
index b55cb23..07c9345 100644
--- a/fs/nfs/Makefile
+++ b/fs/nfs/Makefile
@@ -16,4 +16,5 @@ nfs-$(CONFIG_NFS_V4)	+= nfs4proc.o nfs4xdr.o nfs4state.o nfs4renewd.o \
 			   nfs4namespace.o
 nfs-$(CONFIG_NFS_DIRECTIO) += direct.o
 nfs-$(CONFIG_SYSCTL) += sysctl.o
+nfs-$(CONFIG_NFS_FSCACHE) += fscache.o fscache-def.o
 nfs-objs		:= $(nfs-y)
diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index a49f9fe..f1783b2 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -41,6 +41,7 @@
 #include "delegation.h"
 #include "iostat.h"
 #include "internal.h"
+#include "fscache.h"
 
 #define NFSDBG_FACILITY		NFSDBG_CLIENT
 
@@ -137,6 +138,8 @@ static struct nfs_client *nfs_alloc_client(const char *hostname,
 	clp->cl_state = 1 << NFS4CLNT_LEASE_EXPIRED;
 #endif
 
+	nfs_fscache_get_client_cookie(clp);
+
 	return clp;
 
 error_3:
@@ -168,6 +171,8 @@ static void nfs_free_client(struct nfs_client *clp)
 
 	nfs4_shutdown_client(clp);
 ...
From: David Howells
Date: Friday, August 10, 2007 - 9:06 am

Changes to the kernel configuration definitions and to the NFS mount options to
allow the local caching support added by the previous patch to be enabled.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/Kconfig        |    8 ++++++++
 fs/nfs/client.c   |   14 ++++++++++----
 fs/nfs/internal.h |    2 ++
 fs/nfs/super.c    |   40 ++++++++++++++++++++++++++++++++++------
 4 files changed, 54 insertions(+), 10 deletions(-)

diff --git a/fs/Kconfig b/fs/Kconfig
index 7feb4cb..76d5d16 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -1600,6 +1600,14 @@ config NFS_V4
 
 	  If unsure, say N.
 
+config NFS_FSCACHE
+	bool "Provide NFS client caching support (EXPERIMENTAL)"
+	depends on EXPERIMENTAL
+	depends on NFS_FS=m && FSCACHE || NFS_FS=y && FSCACHE=y
+	help
+	  Say Y here if you want NFS data to be cached locally on disc through
+	  the general filesystem cache manager
+
 config NFS_DIRECTIO
 	bool "Allow direct I/O on NFS files"
 	depends on NFS_FS
diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index f1783b2..0de4db4 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -543,7 +543,8 @@ error:
 /*
  * Create a version 2 or 3 client
  */
-static int nfs_init_server(struct nfs_server *server, const struct nfs_mount_data *data)
+static int nfs_init_server(struct nfs_server *server, const struct nfs_mount_data *data,
+			   unsigned int extra_options)
 {
 	struct nfs_client *clp;
 	int error, nfsvers = 2;
@@ -580,6 +581,7 @@ static int nfs_init_server(struct nfs_server *server, const struct nfs_mount_dat
 	server->acregmax = data->acregmax * HZ;
 	server->acdirmin = data->acdirmin * HZ;
 	server->acdirmax = data->acdirmax * HZ;
+	server->options = extra_options;
 
 	/* Start lockd here, before we might error out */
 	error = nfs_start_lockd(server);
@@ -776,6 +778,7 @@ void nfs_free_server(struct nfs_server *server)
  * - keyed on server and FSID
  */
 struct nfs_server *nfs_create_server(const struct nfs_mount_data *data,
+				     unsigned ...
From: David Howells
Date: Friday, August 10, 2007 - 9:06 am

Display the local caching state in /proc/fs/nfsfs/volumes.
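
For anyone wanting to check the new column from userspace, a small parsing
sketch is given here.  It assumes the six whitespace-separated columns implied
by the format string in this patch (NV, SERVER, PORT, DEV, FSID, FSC) and that
none of the fields contains a space; volume_line_fsc() is a hypothetical
helper, not part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Inspect one line of /proc/fs/nfsfs/volumes.
 * Returns 1 if the FSC column reads "yes", 0 if it reads "no", and -1 for
 * anything else (the header line, a short line, an unexpected value).
 */
static int volume_line_fsc(const char *line)
{
	char fsc[8];

	assert(line != NULL);

	/* Skip the first five columns and keep the sixth. */
	if (sscanf(line, "%*s %*s %*s %*s %*s %7s", fsc) != 1)
		return -1;
	if (strcmp(fsc, "yes") == 0)
		return 1;
	if (strcmp(fsc, "no") == 0)
		return 0;
	return -1;
}
```

Note that the kernel side pads "no" with a trailing space to keep the column
width constant; %s skips that, so the comparison is against plain "no".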

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/nfs/client.c  |    7 ++++---
 fs/nfs/fscache.h |   12 ++++++++++++
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index 0de4db4..d350668 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -1319,7 +1319,7 @@ static int nfs_volume_list_show(struct seq_file *m, void *v)
 
 	/* display header on line 1 */
 	if (v == &nfs_volume_list) {
-		seq_puts(m, "NV SERVER   PORT DEV     FSID\n");
+		seq_puts(m, "NV SERVER   PORT DEV     FSID              FSC\n");
 		return 0;
 	}
 	/* display one transport per line on subsequent lines */
@@ -1333,12 +1333,13 @@ static int nfs_volume_list_show(struct seq_file *m, void *v)
 		 (unsigned long long) server->fsid.major,
 		 (unsigned long long) server->fsid.minor);
 
-	seq_printf(m, "v%d %02x%02x%02x%02x %4hx %-7s %-17s\n",
+	seq_printf(m, "v%d %02x%02x%02x%02x %4hx %-7s %-17s %s\n",
 		   clp->cl_nfsversion,
 		   NIPQUAD(clp->cl_addr.sin_addr),
 		   ntohs(clp->cl_addr.sin_port),
 		   dev,
-		   fsid);
+		   fsid,
+		   nfs_server_fscache_state(server));
 
 	return 0;
 }
diff --git a/fs/nfs/fscache.h b/fs/nfs/fscache.h
index 44bb0d1..77f3450 100644
--- a/fs/nfs/fscache.h
+++ b/fs/nfs/fscache.h
@@ -56,6 +56,17 @@ extern void __nfs_fscache_invalidate_page(struct page *, struct inode *);
 extern int nfs_fscache_release_page(struct page *, gfp_t);
 
 /*
+ * indicate the client caching state as readable text
+ */
+static inline const char *nfs_server_fscache_state(struct nfs_server *server)
+{
+	if (server->nfs_client->fscache &&
+	    (server->options & NFS_OPTION_FSCACHE))
+		return "yes";
+	return "no ";
+}
+
+/*
  * release the caching state associated with a page if undergoing complete page
  * invalidation
  */
@@ -110,6 +121,7 @@ static inline void nfs_fscache_unregister(void) {}
 static inline void nfs_fscache_get_client_cookie(struct ...
From: Casey Schaufler
Date: Friday, August 10, 2007 - 3:13 pm

How would you expect an LSM that is not SELinux to interface with
CacheFiles? You have gone to a great deal of effort to support the
requirements of an SELinux system, and that's good, but you have
extended the LSM interface to expose SELinux data structures (secids)
and require them for the operation of CacheFiles, and that's bad.
The data used within an LSM is private to the LSM, and this applies
to SELinux as well as to any other LSM that may come along, such
as the Smack LSM I'm working on. This applies to task data as well
as file data. Further, the behavior of the system in the presence
of an LSM should be controlled by the LSM, it is more than a little
scary that CacheFiles is enforcing SELinux policy based on secids
that may be coming from a different LSM.

I applaud the integration of CacheFiles with SELinux. Unfortunately,
you've done so using the LSM interface in such a way that an LSM
other than SELinux is likely to demonstrate inappropriate behaviors
in the presence of CacheFiles because you have so carefully integrated
the SELinux requirements.

If the integration with SELinux is important to you, and I would
expect that it is given the work you've put into it, I suggest that
the SELinux specific behaviors be identified so that another LSM
can provide the behavior appropriate to the policy it chooses to
enforce and put that into SELinux with an LSM interface. I know
that you're looking at a significant effort to do that, but I
wouldn't think that you'd want CacheFiles to behave badly in the
presence of an LSM that doesn't happen to be SELinux.

I also know it's tempting to point out that SELinux is the only
upstream LSM. I hope to change that before too long, and I know
there are others with ambitions as well. I would not like to see
CacheFiles have to get excluded in the presence of other LSMs
and I doubt you would either.


Casey Schaufler
casey@schaufler-ca.com
-

From: David Howells
Date: Saturday, August 11, 2007 - 1:41 am

You have to understand that I didn't know that much about the LSM interface,
so I asked advice of the Red Hat security people, who, naturally, pointed me
at the SELinux mailing list.  I knew my stuff would have to work with SELinux
to be used with RH stuff.

Furthermore, as you pointed out, there aren't any other LSM modules upstream
yet for me to work against.  I would like CacheFiles to work with all LSM
modules in general, but I don't know how to do that yet.

I'm open to suggestion as to how to modify things to support any LSM.


Btw, do you understand the problems that CacheFiles has to deal with?  If I
set this down clearly, this may help you or someone else suggest a better way
to do things.

  (1) Some random process tries to access a file on a network filesystem
      (NFS example).

  (2) NFS goes to the cache to attempt to read the data from there prior to
      going to the network.

  (3) The cache driver wants to access the files in the cache, but it's
      running in the security context of either the aforementioned random
      process, or one of the threads in FS-Cache's thread pool.

      This security context, however, doesn't necessarily give it the rights
      to access what's in the cache, so the driver has to be permitted to act
      as a context appropriate to accessing the cache, without changing the
      overall security context of the random process (which would impact
      things trying to act on that process - kill() for example).

  (4) Assuming the data is found in the cache, all well and good, but if it
      isn't, the cache driver will have to create some files in the cache.

      Now, if the cache driver just went ahead and created the files, they
      could end up with their own security contexts being derived from the
      random process's security context, thus potentially making it impossible
      for other processes to access the cache.

      So the file-creation part of the security context must also be
      overridden ...
From: Casey Schaufler
Date: Saturday, August 11, 2007 - 8:56 am

While neither is upstream you can certainly look at AppArmor and Smack,

It's been a long time since I dealt with file system caching, and
that was under Unix, and I don't claim to have a working understanding

I think that this is the point you should attack. Control the security
characteristics of the cache driver properly and you shouldn't need the

Can you run the cache as an independent thread and send it messages
rather than trying to do things in the context of the calling process?

Yes, and the SELinux semantics for what label to give a file don't
help much, either. The problem with the "act_as" interfaces is that
I wouldn't expect them to be any more reliable than the old access()

Ideally you want to be running in the right context to create the
new file so that no one can use it and then label it "correctly"


The cache driver is a unique case with an unusual function. It's pretty
obvious that the kernel architecture, the VFS architecture, LSM, SELinux,
NFS and pretty much everyone else has given no thought whatever to the
implications of their designs on file system caching. For all concerned,
I'll say "sorry 'bout that".

Casey Schaufler
casey@schaufler-ca.com
-

From: David Howells
Date: Monday, August 13, 2007 - 3:54 am

How?  The cache driver acts on behalf of someone else.  That someone else has
one security context, but the cache itself has to have a different context so
that the cache can be shared.


It introduces more complexity, which I believe you were just arguing against
above...  It also incurs more kernel threads - which I really really want to
avoid.

I would rank the complexity and resource overhead of the act-as stuff in LSM
(or at least in SELinux) as much less than what you're suggesting.

As it stands, the FS-Cache layer has a pool of threads that CacheFiles makes
use of, but this can't be bound to the security of a specific cache because



I suspect that's more by the fact that security wasn't particularly thought
about when these interfaces were first written.  As with everything in the

Meaning you think I should just give up on this?

How about I reduce the interface I'm proposing to two functions:

  (1) int security_act_as(struct task_struct *context)

	Temporarily make the current process act as the given task, including,
	for example, for SELinux, the security ID with which this task acts on
	things, and the security ID with which this task creates files.

  (2) int security_act_as_self(void);

	Restore the context as which we're acting.

This would mean that the task's security context would have to be able to store
acting security IDs for everything, but I don't think that's too much of a
stretch resourcewise.
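
As a rough userspace illustration of the two functions above (not kernel
code: struct model_task, model_current and the model_ prefix are all made up
for this sketch, and plain integers stand in for security IDs), the idea that
a task stores separate acting IDs alongside its own could look like this:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace model of a task's security state (not kernel code). */
struct model_task {
	unsigned int sid;		/* label governing what others may do to the task */
	unsigned int create_sid;	/* label given to files the task creates */
	unsigned int act_sid;		/* acting copy of sid, used when acting on things */
	unsigned int act_create_sid;	/* acting copy of create_sid */
};

/* Stand-in for the kernel's "current" task pointer. */
struct model_task *model_current;

/* Temporarily make the current task act as the given task. */
int model_security_act_as(const struct model_task *context)
{
	if (!model_current || !context)
		return -EINVAL;
	/* Only the acting IDs change; sid and create_sid are left alone. */
	model_current->act_sid = context->sid;
	model_current->act_create_sid = context->create_sid;
	return 0;
}

/* Restore the acting IDs from the task's own, untouched IDs. */
int model_security_act_as_self(void)
{
	if (!model_current)
		return -EINVAL;
	model_current->act_sid = model_current->sid;
	model_current->act_create_sid = model_current->create_sid;
	return 0;
}
```

Because sid itself is never written, anything that acts upon the task (a
signal sender, for instance) would keep seeing the original label while the
override is in force.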

David
-

From: Casey Schaufler
Date: Monday, August 13, 2007 - 6:46 am

No, sorry, sometimes I sound meaner than I really am. I meant that

I haven't looked into the issues at all and I bet there are plenty,
maybe in audit and places outside of the security realm, but this
looks like a clean approach from the LSM interface standpoint. Do
you want the entire task or just task->security? I could see it
either way, but I suspect the task is your best bet. If you call
security_act_as() twice, then security_act_as_self() do you pop a
stack, or return to the initial state? How about security_act_as(NULL)
returning you to the initial state, and dropping security_act_as_self()?
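
To make the NULL variant concrete, here is a small userspace sketch (struct
sketch_task, sketch_current and sketch_act_as() are hypothetical names, and
strings stand in for whatever the LSM really stores). It assumes the
non-stacking answer to the question above: a second call overwrites the
first, and NULL always returns to the initial state.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model: strings stand in for LSM-private label data. */
struct sketch_task {
	const char *label;	/* label the task carries */
	const char *act_label;	/* label the task currently acts as */
};

/* Stand-in for the kernel's "current" task pointer. */
struct sketch_task *sketch_current;

/*
 * Act as the given task, or return to the initial state if context is NULL.
 * Nothing is stacked: each call simply overwrites the acting label.
 */
int sketch_act_as(const struct sketch_task *context)
{
	if (!sketch_current)
		return -EINVAL;
	sketch_current->act_label =
		context ? context->label : sketch_current->label;
	return 0;
}
```

With this shape there is no separate security_act_as_self(), at the cost that
a nested user cannot get back to an intermediate state.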

Thank you for taking the effort to address the issues I raised.
I appreciate your willingness to accommodate my concerns even after
I'd flamed you.


Casey Schaufler
casey@schaufler-ca.com
-

From: David Howells
Date: Monday, August 13, 2007 - 7:51 am

It would probably have to be the task struct, lest the security information

Good point.  I've pondered that.  What I have at the moment partly acts like a
stack in that I store some of the shifted-out context on the machine stack (in
struct cachefiles_secctx).  The act-as context should probably be shifted too,

That would be fine.

Actually, to address Stephen Smalley's requirements also, how about making
things a bit more complex.  Have the following suite of functions:

 (1) int security_get_context(struct sec **_context);

	This allocates and gives the caller a blob that describes the current
	context of all the LSM module states attached to the current task and
	stores a pointer to it in *_context.

 (2) int security_push(struct sec *context, struct sec **_old_context)

	This causes all the LSM modules on the current task to switch to a new
	acting state, passing back the old state.  It does not change how
	other tasks do things to this one.

 (3) int security_pop(struct sec *context)

	This causes all the LSM modules on the current task to switch to a new
	acting state, deleting the old state.  It does not change how
	other tasks do things to this one.

 (4) int security_delete_context(struct sec *context)

	This deletes a context blob.

The context blob could then be structured very simply.  Give each loaded LSM
module an integer index as it is registered.  Having a limit to the number of
LSM modules would make things simpler.  The blob would then be an array of
void pointers, one per LSM module, indexed by the integer index for each one.
If you don't have a limit on the number of LSM modules, you'd also need a
count of slots in the blob.
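
A minimal userspace sketch of that blob and the four functions, under several
assumptions that are mine rather than part of the proposal: a fixed limit of
four LSM modules (MODEL_MAX_LSM), a global model_task_sec standing in for the
state attached to the current task, and push taking ownership of the blob it
installs so that pop can delete it. All model_ names are hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

#define MODEL_MAX_LSM 4		/* assumed fixed limit on loaded LSM modules */

/* The context blob: one slot per LSM module, NULL if the module has no state. */
struct model_sec {
	void *slot[MODEL_MAX_LSM];
};

/* Stand-in for the acting context attached to the current task. */
struct model_sec *model_task_sec;

/* Allocate a blob describing the current context (shallow copy of the slots). */
int model_get_context(struct model_sec **_context)
{
	struct model_sec *c;

	if (!_context)
		return -EINVAL;
	c = calloc(1, sizeof(*c));
	if (!c)
		return -ENOMEM;
	if (model_task_sec)
		*c = *model_task_sec;
	*_context = c;
	return 0;
}

/* Switch to a new acting state, passing the displaced state to the caller. */
int model_push(struct model_sec *context, struct model_sec **_old_context)
{
	if (!context || !_old_context)
		return -EINVAL;
	*_old_context = model_task_sec;
	model_task_sec = context;
	return 0;
}

/* Switch back to the given state, deleting the state that was pushed. */
int model_pop(struct model_sec *context)
{
	free(model_task_sec);
	model_task_sec = context;
	return 0;
}

/* Delete a blob that was obtained but never pushed. */
int model_delete_context(struct model_sec *context)
{
	free(context);
	return 0;
}
```

Since the displaced state is handed back to the caller and kept on the
caller's stack, pushes can nest to any depth as long as each pop is given the
pointer returned by the matching push.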

Any LSM module that wanted to implement the above four functions would fill
in or otherwise use the slot that belongs to it.  Otherwise the slot would
just be left NULL.

For example:

	context --->+--------+                                    +---------+
	            | SLOT 0 |----------------------------------->| ...
From: Stephen Smalley
Date: Monday, August 13, 2007 - 7:57 am

Seems like over-design - we don't need to support LSM stacking, and we
don't need to support pushing/popping more than one level of context.

What was the objection again to the original interface, aside from
replacing "u32 secids" with "void* security blobs"?

-- 
Stephen Smalley
National Security Agency

-

From: David Howells
Date: Monday, August 13, 2007 - 8:22 am

It will, at some point hopefully, be possible for someone to try, say, NFS
exporting a cached ISO9660 mount (CDROM) - in which case, we should allow
for two levels of stack.  If we can pass the displaced context to the caller
to restore later then that allows for more or less unlimited depth.

It occurs to me that the following is almost good enough, but not quite:

  (1) int security_get_context(void **_context);

	This allocates and gives the caller a blob that describes the current
	context of all the LSM module states attached to the current task and
	stores a pointer to it in *_context.

  (2) int security_push(void *context, void **_old_context)

	This causes all the LSM modules on the current task to switch to a new
	acting state, passing back the old state.  It does not change how
	other tasks do things to this one.

  (3) int security_pop(void *context)

	This causes all the LSM modules on the current task to switch to a new
	acting state, deleting the old state.  It does not change how
	other tasks do things to this one.

  (4) int security_delete_context(void *context)

I still need a way to transform the cachefilesd context into the kernel's
context.  See patch:

   Subject: [Linux-cachefs] [PATCH 12/16] CacheFiles: Get the SID under which
	the CacheFiles module should operate [try #3]

However, this seems to add a fairly generic transformation, so that could be
generalised:


I got the impression that Casey thought much of this was tied to SELinux, but
rereading his/her emails, I'm not so certain.  Maybe that's sufficient.  Casey?

However, I've realised a problem (as outlined above) with what I've got.
Namely its stack isn't necessarily deep enough.  Alternatively, nfsd perhaps
should suppress caching on what it reads.

David
-

From: Casey Schaufler
Date: Monday, August 13, 2007 - 9:20 am

I assume that you're talking about the LSM specific data changing,
not the LSM itself.

If you change the task->security information you are definitely going
to change what other tasks can do to the calling task. This is part of
the dark side of label swapping. This is what I was trying to suggest
when I said that if you're going to switch labels you switch to a
system-daemon label, do your work, then change the file label explicitly.
Stephen may have a trick up his sleeve for SELinux, but I don't




I did get the impression that your initial design was focused
on SELinux, and that the implications of alternative LSM modules
had not been very high on your priority list. It's clear from

That's the really nice thing about cans of worms.
They come in six-packs.



Casey Schaufler
casey@schaufler-ca.com
-

From: David Howells
Date: Monday, August 13, 2007 - 9:31 am

I dealt with that in my current act-as patch.  Under SELinux a task has two
primary labels.  One with which it is labelled and is used to govern effects
upon it, and one that is used to act upon things and follows changes to the

In CacheFiles case, the cachefilesd daemon's security label into the label the

Yeah...

David
-

From: Casey Schaufler
Date: Monday, August 13, 2007 - 9:58 am

The specification of your push interface that the push operation
not affect how others access the process is OK for SELinux, but
not for any other MAC scheme that I've dealt with, and I think
that's most of them. Nuts. Smack, for example, uses exactly one
label on the process for all purposes.

Are you concerned about accesses other than signals? Signals
could be straightforward to deal with in a pushed situation, but
I'd hesitate to say that the solution would generalize without

I'm not sure I understand what this is doing.


Casey Schaufler
casey@schaufler-ca.com
-

From: David Howells
Date: Monday, August 13, 2007 - 12:52 pm

It's a fairly important concept.  The victimisation security context on a
process must not change, even if the kernel overrides the security context
that that process acts as so that it can transparently do work on its behalf.

IMO, the right way to do this is to pass the security context directly to

There's also /proc and ptrace() for example.  ps -z must not show the

CacheFiles consists of two parts: the kernel module which creates things in
the cache and does accesses into the cache on behalf of processes that access
cached filesystems, and the userspace daemon that builds cull tables and
deletes things.

The reason there are two security labels is that the daemon's label gives it
just enough rights to be able to do its job.  More or less all it can do is
lookup, opendir, readdir, stat, rmdir, unlink and open the chardev for talking
to the kernel module.  This means that the daemon can't, for example, be made
to read or modify cache storage objects.

This means, however, that the daemon's label isn't sufficient for the kernel
module to do its job.  But since there's no way for the kernel module to
directly get a label (and indeed it doesn't know the label it needs), a
transformation has to be applied that turns the process label used by the
daemon into a process label that the kernel, and only the kernel, can use.

The kernel's label gives it, amongst other things, the additional rights to do
mkdir, creat, open, read, write, setxattr, getxattr, rename - things the
daemon isn't allowed to do.
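
The split can be pictured with a toy userspace model in which each label is
just a set of permitted operations. Everything here is an illustration: the
cf_ names and label strings are invented, the kernel label is assumed to be a
superset of the daemon's, and a real LSM would express all of this in policy
rather than in bit flags.

```c
#include <stddef.h>

/* The operations named above, as bit flags. */
enum {
	OP_LOOKUP   = 1 << 0,
	OP_OPENDIR  = 1 << 1,
	OP_READDIR  = 1 << 2,
	OP_STAT     = 1 << 3,
	OP_RMDIR    = 1 << 4,
	OP_UNLINK   = 1 << 5,
	OP_CHARDEV  = 1 << 6,	/* open the chardev to talk to the module */
	OP_MKDIR    = 1 << 7,
	OP_CREAT    = 1 << 8,
	OP_OPEN     = 1 << 9,
	OP_READ     = 1 << 10,
	OP_WRITE    = 1 << 11,
	OP_SETXATTR = 1 << 12,
	OP_GETXATTR = 1 << 13,
	OP_RENAME   = 1 << 14
};

#define CF_DAEMON_OPS (OP_LOOKUP | OP_OPENDIR | OP_READDIR | OP_STAT | \
		       OP_RMDIR | OP_UNLINK | OP_CHARDEV)
#define CF_KERNEL_EXTRA_OPS (OP_MKDIR | OP_CREAT | OP_OPEN | OP_READ | \
			     OP_WRITE | OP_SETXATTR | OP_GETXATTR | OP_RENAME)

/* Hypothetical label: a name plus the operations it permits. */
struct cf_label {
	const char *name;
	unsigned int ops;
};

const struct cf_label cf_daemon_label = { "daemon", CF_DAEMON_OPS };
const struct cf_label cf_kernel_label = {
	"kernel", CF_DAEMON_OPS | CF_KERNEL_EXTRA_OPS	/* superset is assumed */
};

/* Transform the daemon's label into the label the kernel module acts as. */
const struct cf_label *cf_transform(const struct cf_label *daemon)
{
	return daemon == &cf_daemon_label ? &cf_kernel_label : NULL;
}

/* Nonzero if the label permits every operation in ops. */
int cf_may(const struct cf_label *label, unsigned int ops)
{
	return label && (label->ops & ops) == ops;
}
```

The model cannot show the "only the kernel can use it" property; that part
has to come from the policy refusing to let any userspace process run under
the transformed label.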

David
-

From: Casey Schaufler
Date: Monday, August 13, 2007 - 2:44 pm

With Smack you can leave the label alone, raise CAP_MAC_OVERRIDE,
do your business of setting the label correctly, and then drop
the capability. No new hooks required.


Casey Schaufler
casey@schaufler-ca.com
-

From: David Howells
Date: Tuesday, August 14, 2007 - 2:39 am

That sounds like a contradiction.  How can you both leave it alone and set it?

David
-

From: Casey Schaufler
Date: Tuesday, August 14, 2007 - 8:53 am

Whoops, sorry. You leave the process label alone and explicitly
set the file label using the xattr interfaces.


Casey Schaufler
casey@schaufler-ca.com
-

From: Stephen Smalley
Date: Tuesday, August 14, 2007 - 10:42 am

xattr interfaces don't help with the initial labeling of the file when
it is created.

-- 
Stephen Smalley
National Security Agency

-

From: Casey Schaufler
Date: Wednesday, August 15, 2007 - 9:30 am

That's true. The daemon needs to run with an appropriate label.
I don't believe that this is a situation with a really simple solution


Casey Schaufler
casey@schaufler-ca.com
-

From: David Howells
Date: Tuesday, August 14, 2007 - 10:58 am

That's the wrong way to do things.  There'd then be a window in which
cachefilesd (the userspace daemon) could attempt to view the file when the
file has the wrong label attached.

David
-

From: Stephen Smalley
Date: Tuesday, August 14, 2007 - 10:50 am

Except that CAP_MAC_OVERRIDE doesn't exist upstream, and if it did, it
would represent Smack-specific logic in the core kernel (when you're
complaining about SELinux-specific logic there).  So even that would
have to be encapsulated within a hook.

-- 
Stephen Smalley
National Security Agency

-

From: Casey Schaufler
Date: Monday, August 13, 2007 - 8:42 am

LSM stacking has always been contentious and I don't see
that it addresses the issue, which is changing the data used

The objection centers around exposing LSM specific data outside
the LSM, and it applies to either secids or blobs, really. If you
need this information outside the LSM odds are good that what you're
using it for is going to be LSM specific, and hence should be inside
the LSM. I admit to two gray areas, audit and system service tasks
such as the two cited here. I like simplicity and find the single
security_act_as() interface attractive for the latter case.


Casey Schaufler
casey@schaufler-ca.com
-

From: Stephen Smalley
Date: Monday, August 13, 2007 - 6:50 am

I don't see how that helps with nfsd assuming the label of a remote

-- 
Stephen Smalley
National Security Agency

-

From: Casey Schaufler
Date: Monday, August 13, 2007 - 8:10 am

Well, assuming that nfsd assuming the label of a remote client is
a good idea ...

    newtask = taskstructdup(current);
    newtask->security = security_of_client;
    security_act_as(newtask);
    ... do interesting things ...
    security_act_as_self(); /* security_act_as(NULL); ? */
    cleanup_newtask(...)

... would be the basic flow. For what it's worth, and the whole
issue is being debated with gusto elsewhere, there are enough
problems with nfsd using this approach that it may not be worth


Casey Schaufler
casey@schaufler-ca.com
-

From: Stephen Smalley
Date: Monday, August 13, 2007 - 6:01 am

Parts of it are unique, but some of the same issues crop up in nfs - we
will need a way there as well for nfsd to assume the client process'
label for permission checking and new file labeling purposes, and the
act_as hook is not fundamentally different than what nfsd does today
with the fsuid/fsgid, just applied to the security label.

-- 
Stephen Smalley
National Security Agency

-
