On Nov 28, 2006 05:54 +0000, Christoph Hellwig wrote:
As usual, Christoph is a model of diplomacy :-).
IMHO, this is a logical extension to readv/writev. It allows a single
readx/writex syscall to specify different targets in the file instead of
needing separate syscalls. So, for example, a single syscall could be
given to dump a sparse corefile or a compiled+linked binary, allowing
the filesystem to optimize the allocations instead of getting essentially
random IO from several separate syscalls.
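
To make the idea concrete, here is a minimal userspace sketch of what a
writex()-style call would replace. The names writex_sketch() and
struct xtvec_sketch are hypothetical and are not taken from the proposal;
the one-to-one pairing of memory segments with file extents is a
simplifying assumption. The loop of pwrite() calls is exactly the
"several separate syscalls" pattern that a real writex() would hand to
the filesystem in one go.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical file-extent descriptor: where in the file a piece lands. */
struct xtvec_sketch {
    off_t  xtv_off;   /* file offset of this extent */
    size_t xtv_len;   /* length of this extent in bytes */
};

/*
 * Userspace stand-in for a writex()-style call (assumed interface).
 * Simplification: memory segment i maps to file extent i, same length.
 * Returns the total number of bytes written, or -1 on error.
 */
ssize_t writex_sketch(int fd, const struct iovec *iov,
                      const struct xtvec_sketch *xtv, size_t count)
{
    ssize_t total = 0;

    for (size_t i = 0; i < count; i++) {
        size_t done = 0;

        assert(iov[i].iov_len == xtv[i].xtv_len);
        while (done < xtv[i].xtv_len) {
            /* One syscall per extent: the cost a real writex() avoids. */
            ssize_t n = pwrite(fd, (const char *)iov[i].iov_base + done,
                               xtv[i].xtv_len - done,
                               xtv[i].xtv_off + (off_t)done);
            if (n <= 0)
                return -1;
            done += (size_t)n;
            total += n;
        }
    }
    return total;
}

/*
 * Usage: write two scattered pieces of a sparse file in one logical call.
 * Returns the resulting file size, or -1 on failure.
 */
off_t writex_demo(const char *path)
{
    char head[] = "HEAD", tail[] = "TAIL";
    struct iovec iov[2] = {
        { .iov_base = head, .iov_len = 4 },
        { .iov_base = tail, .iov_len = 4 },
    };
    struct xtvec_sketch xtv[2] = {
        { .xtv_off = 0,    .xtv_len = 4 },
        { .xtv_off = 4096, .xtv_len = 4 },   /* leaves a hole in between */
    };
    int fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);

    if (fd < 0)
        return -1;
    ssize_t n = writex_sketch(fd, iov, xtv, 2);
    off_t size = lseek(fd, 0, SEEK_END);
    close(fd);
    unlink(path);
    return n == 8 ? size : -1;
}
```

With the whole extent list visible at once, a filesystem could plan the
allocation for both pieces up front instead of reacting to each pwrite()
as it arrives.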
This is a big win for clustered filesystems. Some "stat" items are
a lot more work to gather than others, and if an application (e.g.
"ls --color" which is default on all distros) doesn't need anything
except the file mode to print "*" and color an executable green it
is a waste to gather the remaining ones.
My objection to the current proposal is that it should be possible
to _completely_ specify which fields are required and which are not,
instead of having a split "required" and "optional" section to the
stat data. In some implementations, it might be desirable to only
find the file blocks (e.g. ls -s, du -s) and not the owner, links,
metadata, so why implement a half-baked version of a "lite" stat()?
Also, why pass the "st_litemask" as a parameter set in the struct
(which would have to be reset on each call) instead of as a parameter
to the function (makes the calling convention much clearer)?
int statlite(const char *fname, struct stat *buf, unsigned long *statflags);
[ readdirplus not referenced ]
It would be prudent, IMHO, if we are proposing statlite() and
readdirplus() syscalls, for readdirplus() to be implemented
as a special case of statlite(). That avoids the need for yet another
version in the future, "readdirpluslite()" or whatever...
Namely, readdirplus() takes a "statflags" parameter (per above) so that
the dirent_plus data only has to retrieve the required stat data (e.g. ls
-iR only needs the inode number) and not all of it. Each returned stat
has a mask of valid fields in it, as e.g. some items might be in cache
already and can contain more information than others.
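
A minimal userspace sketch of that behaviour, built from opendir(),
readdir() and fstatat(), might look as follows. The names
readdirplus_sketch(), struct dirent_plus_sketch and the RP_* bits are
hypothetical, as is the fixed-size output array; none of them come from
the proposal. The point is only that an "ls -iR" style caller asking
for the inode number alone never triggers a per-entry stat, and that
each entry reports which fields it actually carries.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical field bits; names and values are illustrative only. */
#define RP_INO  0x01UL
#define RP_MODE 0x02UL
#define RP_SIZE 0x04UL

/* Hypothetical "dirent plus stat" record with a per-entry valid mask. */
struct dirent_plus_sketch {
    char          name[256];
    struct stat   st;
    unsigned long valid;   /* which RP_* fields in st can be trusted */
};

/*
 * Reads up to max entries of directory path into out[].
 * statflags names the stat fields the caller needs.
 * Returns the number of entries stored, or -1 if the directory
 * cannot be opened.
 */
int readdirplus_sketch(const char *path, unsigned long statflags,
                       struct dirent_plus_sketch *out, int max)
{
    DIR *dir = opendir(path);
    struct dirent *de;
    int n = 0;

    assert(out != NULL);
    if (dir == NULL)
        return -1;

    while (n < max && (de = readdir(dir)) != NULL) {
        struct dirent_plus_sketch *e = &out[n];
        struct stat tmp;

        memset(e, 0, sizeof(*e));
        strncpy(e->name, de->d_name, sizeof(e->name) - 1);

        /* The inode number comes for free with the directory entry. */
        e->st.st_ino = de->d_ino;
        e->valid = RP_INO;

        /* Anything beyond the inode number costs a per-entry stat. */
        if ((statflags & ~RP_INO) != 0 &&
            fstatat(dirfd(dir), de->d_name, &tmp,
                    AT_SYMLINK_NOFOLLOW) == 0) {
            e->st = tmp;
            e->valid |= RP_MODE | RP_SIZE;
        }
        n++;
    }
    closedir(dir);
    return n;
}
```

A caller checks e->valid before reading a field, so an entry that
happens to carry more than was asked for (because it was cached, say)
can be used without a second lookup.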
Strange, group is called HECIWG, website is "hecewg"?
Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.