Re: [PATCH -V4 08/11] vfs: Add new file and directory create permission flags

Previous thread: [PATCH] x86-64: simplify is32entry.S again by Jan Beulich on Friday, September 24, 2010 - 5:35 am. (2 messages)

Next thread: [PATCH v4] power: introduce library for device-specific OPPs by Nishanth Menon on Friday, September 24, 2010 - 5:50 am. (9 messages)
From: Aneesh Kumar K.V
Date: Friday, September 24, 2010 - 5:48 am

Hi,

The following set of patches implements VFS changes needed to implement
a new acl model for linux. Rich ACLs are an implementation of NFSv4 ACLs,
extended by file masks to fit into the standard POSIX file permission model.
They are designed to work seamlessly locally as well as across the NFSv4 and
CIFS/SMB2 network file system protocols.

The patch set consists of four parts:

The first set of patches, posted as a follow up, contains VFS changes needed
to implement the Rich ACL model. The second set [1] contains the Rich ACL model
and Ext4 implementation. The third set [2] contains mapping of Rich ACL to
NFSv4 ACL (how to apply file mask to access mask) and implementation of
Richacl ACL for NFS server and client. The fourth set [3] contains POSIX ACL
to Rich ACL mapping and its ext4 usage.

[1] git://git.kernel.org/pub/scm/linux/kernel/git/agruen/linux-2.6-richacl.git richacl-minimal
[2] git://git.kernel.org/pub/scm/linux/kernel/git/agruen/linux-2.6-richacl.git richacl-upstream
[3] git://git.kernel.org/pub/scm/linux/kernel/git/agruen/linux-2.6-richacl.git richacl-fullset

A user-space utility for displaying and changing richacls is available at [4]
(a number of examples can be found at http://acl.bestbits.at/richacl/examples.html).

[4] git://git.kernel.org/pub/scm/linux/kernel/git/agruen/richacl.git master

To test richacl on ext4 use -o richacl mount option. This mount option may later be
dropped in favour of a feature flag.

More details regarding richacl can be found at 
http://acl.bestbits.at/richacl/

Changes from V3:
a) Droped may_delete and may_create inode operations callback and reworked
   the patch series to use additional check flags. 
b) Rebased to the latest kernel
c) The patch series now contain only the minimal VFS changes.

Changes from V2:
1) Git repo include check-acl branch that drop newly added inode_operations
   callback in favour for additional access check flags (MAY_CREATE_FILE,
   MAY_CREATE_DIR, MAY_DELETE_CHILD, MAY_DELETE_SELF, ...
From: Aneesh Kumar K.V
Date: Friday, September 24, 2010 - 5:48 am

From: Andreas Gruenbacher <agruen@suse.de>

Some file permission models differentiate between writing to a file
(MAY_WRITE) and appending to it (MAY_WRITE | MAY_APPEND).  Pass all the
mask flags down to iop->check_acl so that filesystems can distinguish
between writing and appending.

All users of iop->check_acl pass the mask value back into
posix_acl_permission(); strip off the additional mask flags there.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/namei.c     |    4 +---
 fs/posix_acl.c |    2 ++
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index c5920b5..7cadd07 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -174,8 +174,6 @@ static int acl_permission_check(struct inode *inode, int mask,
 {
 	umode_t			mode = inode->i_mode;
 
-	mask &= MAY_READ | MAY_WRITE | MAY_EXEC;
-
 	if (current_fsuid() == inode->i_uid)
 		mode >>= 6;
 	else {
@@ -192,7 +190,7 @@ static int acl_permission_check(struct inode *inode, int mask,
 	/*
 	 * If the DACs are ok we don't need any capability check.
 	 */
-	if ((mask & ~mode) == 0)
+	if ((mask & (MAY_READ | MAY_WRITE | MAY_EXEC) & ~mode) == 0)
 		return 0;
 	return -EACCES;
 }
diff --git a/fs/posix_acl.c b/fs/posix_acl.c
index 39df95a..c60bddf 100644
--- a/fs/posix_acl.c
+++ b/fs/posix_acl.c
@@ -213,6 +213,8 @@ posix_acl_permission(struct inode *inode, const struct posix_acl *acl, int want)
 	const struct posix_acl_entry *pa, *pe, *mask_obj;
 	int found = 0;
 
+	want &= MAY_READ | MAY_WRITE | MAY_EXEC;
+
 	FOREACH_ACL_ENTRY(pa, acl, pe) {
                 switch(pa->e_tag) {
                         case ACL_USER_OBJ:
-- 
1.7.0.4

--

From: Aneesh Kumar K.V
Date: Friday, September 24, 2010 - 5:48 am

From: Andreas Gruenbacher <agruen@suse.de>

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/namei.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 7cadd07..bf822b8 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -167,7 +167,7 @@ EXPORT_SYMBOL(putname);
 #endif
 
 /*
- * This does basic POSIX ACL permission checking
+ * This does the basic permission checking
  */
 static int acl_permission_check(struct inode *inode, int mask,
 		int (*check_acl)(struct inode *inode, int mask))
@@ -212,7 +212,7 @@ int generic_permission(struct inode *inode, int mask,
 	int ret;
 
 	/*
-	 * Do the basic POSIX ACL permission checks.
+	 * Do the basic permission checks.
 	 */
 	ret = acl_permission_check(inode, mask, check_acl);
 	if (ret != -EACCES)
@@ -246,6 +246,8 @@ int generic_permission(struct inode *inode, int mask,
  * We use "fsuid" for this, letting us set arbitrary permissions
  * for filesystem access without changing the "normal" uids which
  * are used for other things.
+ *
+ * When checking for MAY_APPEND, MAY_WRITE must also be set in @mask.
  */
 int inode_permission(struct inode *inode, int mask)
 {
-- 
1.7.0.4

--

From: Aneesh Kumar K.V
Date: Friday, September 24, 2010 - 5:48 am

From: Andreas Gruenbacher <agruen@suse.de>

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/namei.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 24896e8..c5920b5 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -200,7 +200,7 @@ static int acl_permission_check(struct inode *inode, int mask,
 /**
  * generic_permission  -  check for access rights on a Posix-like filesystem
  * @inode:	inode to check access rights for
- * @mask:	right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC)
+ * @mask:	right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC, ...)
  * @check_acl:	optional callback to check for Posix ACLs
  *
  * Used to check for read/write/execute permissions on a file.
@@ -242,7 +242,7 @@ int generic_permission(struct inode *inode, int mask,
 /**
  * inode_permission  -  check for access rights to a given inode
  * @inode:	inode to check permission on
- * @mask:	right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC)
+ * @mask:	right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC, ...)
  *
  * Used to check for read/write/execute permissions on an inode.
  * We use "fsuid" for this, letting us set arbitrary permissions
@@ -288,7 +288,7 @@ int inode_permission(struct inode *inode, int mask)
 /**
  * file_permission  -  check for additional access rights to a given file
  * @file:	file to check access rights for
- * @mask:	right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC)
+ * @mask:	right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC, ...)
  *
  * Used to check for read/write/execute permissions on an already opened
  * file.
-- 
1.7.0.4

--

From: Jeff Layton
Date: Friday, September 24, 2010 - 8:50 am

On Fri, 24 Sep 2010 18:18:10 +0530

This may just be my own ignorance of ACL semantics and unfamiliarity
with the ACL code in general. It seems a bit unusual though...

Just to be clear...this patch implies that with richacls you can deny
or grant access to the owner of the file even if the mode bits say
otherwise. With POSIX acls, this seems to be the other way around.

Hmm....guess I should read the spec...

-- 
Jeff Layton <jlayton@redhat.com>
--

From: Aneesh Kumar K. V
Date: Friday, September 24, 2010 - 11:55 am

To be POSIX compatible we need to ensure that additional file access
control mechanisms may only further restrict the access permissions defined
by the file permission bits. So with rich acl, similar to POSIX ACL,
we use file mask as an upper bound of the acess permission
allowed. Unlike POSIX ACL where the 'owner' and 'other' ACL entry access
mask is kept in sync with mode bits, rich acl include a file mask even
for 'owner' and 'everyone' entries. 

The patch that gives more details about the permission check algo can be
found at 

http://git.kernel.org/?p=linux/kernel/git/agruen/linux-2.6-richacl.git;a=commitdiff;h=...

-aneesh
--

From: Andreas Gruenbacher
Date: Monday, September 27, 2010 - 6:03 am

That's true but I don't think it fully answers Jeff's question.

With POSIX ACLs, the owner file permission bits are always identical to the 
permissions that the owner is granted through the ACL.  Therefore, when 
acl_permission_check() is invoked on behalf of the owner, the ACL does not 
need to be consulted at all.  For non-owners, the ACL always needs to be 
checked.  This optimization is also true for richacls for the base permissions 
(read, write, execute), but:

 * Some permissions are more fine-grained than the file mode permission
   bits: richacls distinguish between write and append, and between creating
   directories and non-directories.

 * Some permissions go beyond what the owner is implicitly allowed or what can
   be expressed with read, write, execute: in a richacl, a user can be granted
   the right to delete a specific file even without write access to the
   containing directory and to take ownership of a file

(* In addition, a richacl can grant the right to chmod and set the acl of a
   file, and to explicitly set the file timestamps.  These are permissions
   which the owner is implicitly allowed anyway, so they are not relevant to
   this change to acl_permission_check().)

To handle those cases correctly too, we always look at the acl for richacls, 
even for the owner.  (We could still skip the acl check in some, but fewer, 
cases.)

Thanks,
Andreas
--

From: Jeff Layton
Date: Friday, September 24, 2010 - 8:54 am

On Fri, 24 Sep 2010 18:18:11 +0530
									^^^^^

						^^^^ this is a little
						scary, but even if it's
						a directory, it'll get
						kicked out in a later
						check. Would it be
						clearer to move up the
						S_ISDIR() check in this
						function and then pass


-- 
Jeff Layton <jlayton@redhat.com>
--

From: Aneesh Kumar K. V
Date: Friday, September 24, 2010 - 12:16 pm

Can you elaborate on this ? 

-aneesh

--

From: Andreas Gruenbacher
Date: Monday, September 27, 2010 - 6:14 am

Ah, you mean this:

--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2450,7 +2450,9 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
 	if (!inode)
 		return -ENOENT;
 
-	error = may_create(dir, new_dentry, S_ISDIR(inode->i_mode));
+	if (S_ISDIR(inode->i_mode))
+		return -EPERM;
+	error = may_create(dir, new_dentry, 0);
 	if (error)
 		return error;
 
@@ -2464,8 +2466,6 @@ int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_de
 		return -EPERM;
 	if (!dir->i_op->link)
 		return -EPERM;
-	if (S_ISDIR(inode->i_mode))
-		return -EPERM;
 
 	error = security_inode_link(old_dentry, dir, new_dentry);
 	if (error)

This is a clear improvement; I don't think it matters that user-space will
get -EPERM instead of -EXDEV when trying to hard-link a directory across
devices.

Thanks,
Andreas
--

From: Andreas Dilger
Date: Sunday, January 2, 2011 - 10:20 pm

Actually, I think it is required by POSIX that EXDEV is returned when cross-linking files. For mv calling rename() at least it uses a return EXDEV to copy the file to the target filesystem instead of just giving up and returning an error. It may be other applications like tar have a similar dependency for link(). 

Cheers, Andreas

--

From: Andreas Dilger
Date: Sunday, January 2, 2011 - 10:59 pm

Never mind, I didn't notice this was only for hard linking directories...

Cheers, Andreas

--

From: Aneesh Kumar K. V
Date: Monday, January 3, 2011 - 7:20 am

The patch is needed only with richacl patches, that add an extra argument to
may_create.

-aneesh
--

From: J. Bruce Fields
Date: Monday, October 11, 2010 - 5:24 pm

The one thing I remember not liking before was a flag that told the user
whether a given ACL was originally mapped from POSIX or not.  Is that
still there?

Overall I'm for doing this: I don't like NFSv4/Windows ACLs more than
anyone else, but they're too useful to ignore, and the mapping that
Samba and the NFSv4 server try to do is painful for users.

--

From: Aneesh Kumar K. V
Date: Tuesday, October 12, 2010 - 12:17 am

We still have that. But we can resolve that once we decide on how to
migrate an existing file system containing posix acl to richacl. Most of
those patches will need to be updated based on the feedback from
different local file system maintainers. That is why those patches are
pushed towards the end and is  part of last set.

What we need in the first step is to get VFS changes reviewed.Once we
agree on the VFS changes done, then we can start looking at the changes
upto NFS richacl nfs support. When get that merged then we can start
having discussion on how local file system maintainers want to migrate


-aneesh
--

From: J. Bruce Fields
Date: Monday, October 25, 2010 - 12:09 pm

Is there any progress on this?

--b.
--

From: Aneesh Kumar K. V
Date: Monday, October 25, 2010 - 9:35 pm

The next step would be to get Al Viro or Christoph to look at the
proposed VFS changes and get an ACK on them. Meanwhile i can rebase
the full series to the latest linux kernel. 

Apart from that is there any specific changes you would like to see
as a part of richacl patch series. Would you like to see the full
patchset posted to the list or should we go in steps as mentioned above
?

-aneesh
--

Previous thread: [PATCH] x86-64: simplify is32entry.S again by Jan Beulich on Friday, September 24, 2010 - 5:35 am. (2 messages)

Next thread: [PATCH v4] power: introduce library for device-specific OPPs by Nishanth Menon on Friday, September 24, 2010 - 5:50 am. (9 messages)