On Dec 04, 2006 10:15 -0500, Trond Myklebust wrote:

I think the "barrier semantics" are something that has just crept into this discussion and is confusing the issue.

The primary goal (IMHO) of this syscall is to allow the filesystem (primarily distributed cluster filesystems, but HFS and NTFS developers seem on board with this too) to avoid tens to thousands of stat RPCs in very common ls -R, find, etc. kind of operations.

I can't see how fadvise() could help this case. Yes, it would tell the filesystem that it could do readahead of the readdir() data, but the app will still be doing stat() on each of the thousands of files in the directory, instantiating inodes and dentries on that node (which need locking, and potentially immediate lock revocation if the files are being written to by other nodes). In some cases (e.g. rm -r, grep -r) that might even be a win, because the client will soon be touching all of those files, but not necessarily in the ls -lR, find cases.

The filesystem can't always do "stat-ahead" on the files, because that requires instantiating an inode on the client which may be stale (lock revoked) by the time the app gets to it. The app (and the VFS) have no idea just how stale it is, or whether the stat is a "real" stat or "only" the readdir stat (because the fadvise would only be useful on the directory, and not on all of the child entries), so it would need to re-stat the file. Also, this would potentially blow the client's real working set of inodes out of cache.

Doing things en masse with readdirplus() also allows the filesystem to do the stat() operations in parallel internally (which is a net win if there are many servers involved) instead of serially as the application would do.

Cheers, Andreas

PS - I changed the topic to separate this from the openfh() thread.

--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
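For reference, the access pattern described above (readdir() followed by one stat per entry, as ls -l or find would do) can be sketched roughly as follows. This is only an illustration using existing POSIX calls; the helper name count_stat_calls() is made up for this example, and nothing here shows the proposed readdirplus() interface, whose signature is not given in this message. On a cluster filesystem, each lstat() in the loop may turn into a separate RPC, issued one after another.

```c
#define _POSIX_C_SOURCE 200809L  /* for lstat() under strict -std= modes */

#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not from the original message): walk a single
 * directory the way "ls -l" does and return how many lstat() calls
 * were issued, or -1 if the directory cannot be opened.
 */
static long count_stat_calls(const char *dirpath)
{
    DIR *dir = opendir(dirpath);
    if (dir == NULL)
        return -1;

    long calls = 0;
    struct dirent *de;

    while ((de = readdir(dir)) != NULL) {
        /* Skip the self and parent entries. */
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;

        /* Fixed-size buffer for the sketch; overlong paths are skipped. */
        char path[4096];
        int n = snprintf(path, sizeof(path), "%s/%s", dirpath, de->d_name);
        if (n < 0 || (size_t)n >= sizeof(path))
            continue;

        /* One stat per entry, issued serially by the application. */
        struct stat st;
        calls++;
        (void)lstat(path, &st);
    }

    closedir(dir);
    return calls;
}
```

Calling count_stat_calls() on a directory with N entries issues N lstat() calls, which is the per-file cost that a batched interface would let the filesystem collapse or parallelize internally.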
