Re: [PATCH 16/35] union-mount: Writable overlays/union mounts documentation

From: Valerie Aurora
Date: Thursday, April 29, 2010 - 1:20 pm

On Thu, Apr 29, 2010 at 11:33:39AM +0200, Miklos Szeredi wrote:

Sure.  The short version is that unionfs has to allocate another copy
of each file system structure - inode, etc. - and then keep an array
of the matching structures from each of the file system layers.  Each
unionfs file system op copies data up and down between the unionfs
structures and the underlying structures, and then calls the lower
file system op as necessary.  Often it has to duplicate code from the
VFS before calling the lower file system ops.
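
To make the stacking cost concrete, here is a minimal userspace sketch of that layout.  It is not unionfs code: struct lower_inode, struct stacked_inode, MAX_LAYERS and the stacked_*() helpers are all hypothetical names made up for illustration, and the sketch ignores copyup of read-only layers entirely.  It only shows the duplicate structure, the per-layer array, and the copying in both directions around the lower op.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LAYERS 4   /* hypothetical limit, for illustration only */

/* Stand-in for a lower file system's inode (not the kernel's struct inode). */
struct lower_inode {
    unsigned int mode;
    long size;
};

/* Stacking-style inode: a second copy of the attributes plus an array of
 * the matching inodes from each layer.  Index 0 is the topmost layer. */
struct stacked_inode {
    unsigned int mode;
    long size;
    struct lower_inode *lower[MAX_LAYERS];
    int nlayers;
};

/* Find the topmost layer that actually has this inode, or NULL. */
static struct lower_inode *stacked_top(const struct stacked_inode *si)
{
    assert(si->nlayers <= MAX_LAYERS);
    for (int i = 0; i < si->nlayers; i++)
        if (si->lower[i])
            return si->lower[i];
    return NULL;
}

/* Copy attributes up from the lower inode into the duplicate. */
static int stacked_refresh(struct stacked_inode *si)
{
    struct lower_inode *li = stacked_top(si);

    if (!li)
        return -1;
    si->mode = li->mode;
    si->size = li->size;
    return 0;
}

/* A stacked op: apply the change to the lower inode (standing in for the
 * lower file system op), then copy the result back up. */
static int stacked_chmod(struct stacked_inode *si, unsigned int mode)
{
    struct lower_inode *li = stacked_top(si);

    if (!li)
        return -1;
    li->mode = mode;
    return stacked_refresh(si);
}
```

Every op in this style pays for the lookup in the array and the copy back up, which is the overhead described above.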

Where union mounts has the advantage is that we make zero copies of
file system data structures and therefore don't need copyup or
interposition on as many ops.  But if you wait until the file system
op is called, you have to attach your union-related data to the
associated data structure, and the underlying file system is already
using the private data pointer.  And you have to keep a copy of the
underlying file system ops.  And each data structure can be part of
multiple unions.  So you end up with an effective second copy of the
file system data structure and a mess of linked lists or pointers.


Unfortunately, dentries aren't unioned - paths (dentry/mnt pairs) are.
So you can get the parent dentry in the file system op, but the dentry
is potentially part of many different mounts.  There's no mapping from
a lower-level read-only dentry to the covering read-write parent
dentry because the read-only dentry could potentially be mounted in 5
different places.  Which union mount is this dentry part of?  You have
to record the parent's path during lookup and carry it around until
you do the copyup - for every syscall that alters a file, not just
open() and write(), but chmod(), etc.  So if you implement it in the
VFS, you don't have to carry that info across the file system op
boundary.
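
A toy model of the dentry-versus-path problem, as a hedged sketch: the structs below are simplified stand-ins, not the kernel's struct dentry, struct vfsmount or struct path, and the union_upper field and covering_mount() helper are hypothetical.  The point is only that the same dentry shows up in two paths, and the covering read-write layer can be found from the path but not from the dentry alone.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified mount: union_upper is a hypothetical field pointing at the
 * read-write mount that covers this one in a union, or NULL. */
struct mount {
    const char *mountpoint;
    struct mount *union_upper;
};

/* Simplified dentry: note that it carries no mount information at all. */
struct dentry {
    const char *name;
};

/* A path is a (mount, dentry) pair; only the pair identifies a union. */
struct path {
    struct mount *mnt;
    struct dentry *dentry;
};

/* Answerable from a path.  There is no equivalent taking only a dentry,
 * because one dentry can be reached through many mounts. */
static struct mount *covering_mount(const struct path *p)
{
    assert(p && p->mnt);
    return p->mnt->union_upper;
}
```

For example, if one read-only file system is mounted under both /a and /b with a different writable layer on each, the two paths share a dentry pointer and covering_mount() still returns two different mounts.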

I think the chmod() case really shows the issues well.  user_path_nd()
records the parent's path during lookup (in an inefficient, possibly
racy manner), then union_copyup() does the copy (too early, before a
lot of permission checks).  The underlying file system doesn't get
involved until the ->setattr() call in notify_change(), and all that
gets is the dentry.


That's somewhat of an issue right now.  For union mounts to be most
efficient and wonderful, system calls should be separated into two
sequential parts called from the same context as the user_path()
lookup:

1) permission checks and all read-only checks that can fail.
[union copyup happens here]
2) the actual write or change to the file system

Otherwise we have to push the parent nameidata down through the stack
to where the actual change happens.  So if I want to avoid copying up
the file unless chmod() succeeds, in the current code structure I'd
have to add a nameidata and a mnt to notify_change()'s arguments.  But
this is an optimization, not a correctness problem.
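
The two-part split could look roughly like the following userspace sketch.  Everything in it is an assumption made for illustration: struct file_obj, chmod_check(), copyup(), chmod_commit() and union_chmod() are hypothetical names, the permission rule is deliberately simplistic, and none of this is the actual patch set's code.  It only demonstrates the ordering: checks that can fail first, copyup second, the change last.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy file: on_upper is false while it lives only on the read-only layer. */
struct file_obj {
    unsigned int owner_uid;
    unsigned int mode;
    bool on_upper;
};

static int copyup_count;   /* counts copyups so the ordering can be observed */

/* Part 1: checks that can fail, with no side effects. */
static int chmod_check(const struct file_obj *f, unsigned int uid)
{
    if (uid != 0 && uid != f->owner_uid)
        return -EPERM;
    return 0;
}

/* Copyup happens between the two parts, only once the checks have passed. */
static void copyup(struct file_obj *f)
{
    if (!f->on_upper) {
        f->on_upper = true;
        copyup_count++;
    }
}

/* Part 2: the actual change, which is assumed not to fail in this model. */
static void chmod_commit(struct file_obj *f, unsigned int mode)
{
    f->mode = mode;
}

static int union_chmod(struct file_obj *f, unsigned int uid, unsigned int mode)
{
    int err = chmod_check(f, uid);

    if (err)
        return err;        /* failed chmod(): no copyup was done */
    copyup(f);
    chmod_commit(f, mode);
    return 0;
}
```

With this ordering a chmod() that fails the permission check leaves the file on the lower layer, and nothing about the parent's path has to be passed below the point where the check and the copyup both happen.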

-VAL