Re: funlink() for fun!

From: Greg A. Woods
Date: Monday, July 14, 2003 - 11:30 am

[ On Monday, July 14, 2003 at 18:34:27 (+0200), Matthias Buelow wrote: ]

Yes, "file handles" as some folks call them.  They are effectively
vnodes in the *BSD terminology.


Exposing vnodes to userland is effectively what getfh(2) already does.


We already have fhopen(2), fhstat(2), and fhstatfs(2).

Of course these calls (including getfh()) are currently restricted to
the superuser because they have not had the necessary ACL semantics
defined for them.

We are missing at least fhchdir(2), though for the superuser this can be
emulated with, for example:  fchdir(fhopen(getfh(".")))


You can't do that (directly) in any filesystem that has all of the same
semantics as a unix filesystem.

File names are just pointers to files that exist in special files called
directory files.  By convention the first usable file in a filesystem
(inode #2 in FFS) is also the root directory for the namespace we lay
over top of the filesystem.  By convention we have the first two
entries in a
directory file point to the directory file itself and the parent
directory file.
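
As a rough illustration of that last convention, here is a minimal
sketch in portable C that checks whether "dir/." and "dir/.." really do
point at the directory itself and at its parent.  The helper names
same_file() and dot_entries_consistent() are made up for this example;
only stat(2) and the st_dev/st_ino fields are assumed from the system.

```c
#define _XOPEN_SOURCE 700
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Do two stat results refer to the same file (same device and inode)? */
static int
same_file(const struct stat *a, const struct stat *b)
{
	return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}

/*
 * Hypothetical helper: returns 1 if "dir/." is dir itself and "dir/.."
 * is `parent`, 0 if either entry points elsewhere, -1 on error.
 */
static int
dot_entries_consistent(const char *parent, const char *dir)
{
	char dot[PATH_MAX], dotdot[PATH_MAX];
	struct stat sp, sd, sdot, sdotdot;

	if (snprintf(dot, sizeof(dot), "%s/.", dir) >= (int)sizeof(dot) ||
	    snprintf(dotdot, sizeof(dotdot), "%s/..", dir) >= (int)sizeof(dotdot))
		return -1;
	if (stat(parent, &sp) == -1 || stat(dir, &sd) == -1 ||
	    stat(dot, &sdot) == -1 || stat(dotdot, &sdotdot) == -1)
		return -1;
	return same_file(&sdot, &sd) && same_file(&sdotdot, &sp);
}
```

Note that this only ever goes from a name to a file; nothing in the
result of stat(2) leads back to a name.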

However by convention we do not have the filename(s) recorded in the
files themselves and thus the only way to find the name for a file is to
traverse the directory structure until one encounters a name pointing to
the file in question.  Of course since a file may have more than one
name there's never any sure way to know if the name encountered is the
one the user had in mind for such a multi-named file.  Finding all the
names for a file is of course possible (especially since the link count
tells us how many to look for), but it still doesn't help decide which
was intended by the user.
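
To make the cost of that traversal concrete, here is a minimal sketch in
portable C of what a userland "find -inum" amounts to.  The function
name find_names() is hypothetical, and the sketch assumes the caller
already has the device and inode numbers (e.g. from fstat(2) on an open
descriptor).  It is not how any kernel-side funlink() would be written;
it only shows that every directory under the starting point has to be
read.

```c
#define _XOPEN_SOURCE 700
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: walk the tree rooted at `dir` and print every
 * path whose (device, inode) pair matches the target.  Returns the
 * number of names found, or -1 if `dir` cannot be opened.
 */
static int
find_names(const char *dir, dev_t dev, ino_t ino)
{
	DIR *dp = opendir(dir);
	struct dirent *de;
	int found = 0;

	if (dp == NULL)
		return -1;
	while ((de = readdir(dp)) != NULL) {
		char path[PATH_MAX];
		struct stat st;

		/* skip the self and parent entries to avoid looping */
		if (strcmp(de->d_name, ".") == 0 ||
		    strcmp(de->d_name, "..") == 0)
			continue;
		if (snprintf(path, sizeof(path), "%s/%s", dir,
		    de->d_name) >= (int)sizeof(path))
			continue;	/* path too long; skipped in this sketch */
		if (lstat(path, &st) == -1)
			continue;
		if (st.st_dev == dev && st.st_ino == ino) {
			printf("%s\n", path);
			found++;
		}
		/* descend into sub-directories; lstat() means symlinks are not followed */
		if (S_ISDIR(st.st_mode)) {
			int n = find_names(path, dev, ino);

			if (n > 0)
				found += n;
		}
	}
	closedir(dp);
	return found;
}
```

A caller could stop early once the count reaches st_nlink, but in the
worst case the whole namespace is still scanned, and the result is only
a list of candidates, not the one name the user had in mind.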

Note that we don't want to try to record the filename(s) in the file
because there would be significantly more overhead and complication to
maintain those "reverse pointers", especially if you consider the number
of possible updates needed in a hierarchical filesystem for an operation
such as "mv /usr /user".  We also don't want to do this because we don't
want to have to allocate variable numbers of disk blocks for one
inode (which we would likely end up having to do sometimes if a file had
many names, even on filesystems with large blocks).

Once you begin down the path of designing a filesystem which has as its
major attributes a hierarchical naming scheme with directories and
sub-directories, and which allows multiple hard links to files, then you
must give up on the notion of having a guaranteed and fast way to
determine a file's name when all you have is a file handle, or file
descriptor, inode, or vnode, etc.  You can still find the name(s) for a
file when given one of those index pointers, but the time it will take
depends on the size of the namespace, and anyone who's run "find -inum"
on a very large filesystem will know this can be a very long time.  It's
not impossible -- it's just a lot more painful than we might desire,
especially when we highly desire to do it for something like funlink().


fhunlink(2) faces the same problems as my funlink(2) does.  Inodes can
still have multiple names and their names can still be changed and moved
within the directory tree between the time the handle or descriptor was
obtained and the time the unlink is attempted.


Unix filesystems, and indeed all hierarchical filesystems which support
multiple hard links and allow file rename operations, cannot produce any
such indirection without forcing great overhead on operations we
currently consider simple and fast for unix filesystems to implement.

The emulation of this indirection, and its possible optimizations, is
exactly what I've proposed for funlink(2) (and would be shared with
fhunlink(2)), and the way I've proposed it only impacts this new
operation, and not as far as I can tell any existing operation.


We already have getfh(), fhopen(), etc.  Continuing on with fhchdir()
would be a natural extension to the established unix API, especially if
ACL semantics were defined for these system calls.  Rewriting open() and
dup() to be library calls sitting atop fhopen() and fhdup() would
probably be possible, though probably not wise.  :-)


It's not surprising at all if you consider the desire for a hierarchical
filesystem, especially one that supports hard links and rename operations.

-- 
						Greg A. Woods

+1 416 218-0098                  VE3TCP            RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com>          Secrets of the Weird <woods@weird.com>