On Wed, Aug 18, 2010 at 03:53:59PM +1000, Neil Brown wrote:
We can also skip the update whenever current_nfsd_time is greater than
the inode's mtime--that's enough to ensure that the next
file_update_time() call will get a time different from the inode's
current mtime.
And that means that a sequence like

	file_update_time()
	N nfsd_getattr()'s

doesn't make N updates to current_nfsd_time, when only 1 was necessary.
... which would only happen on hardware that could process a getattr and
a data update per nanosecond continuously for a jiffy.
OK, got it, I think: so this is the same as a global version of Alan's
clock, except that the extra ticks only happen when they need to.
The properties it satisfies:
	- It's still a single global clock, so it's consistent between
	  files.
	- It degenerates to jiffies in the absence of getattr's from
	  nfsd.
	- It need only invalidate the other cpus' cached value of the
	  clock on the first getattr of a file that follows less than a
	  jiffy after an update of the file's data.
	- Absent utime(), time going backwards, or futuristic hardware,
	  it guarantees that two nfsd reads of an inode's mtime will
	  return different values iff the inode's data was modified in
	  between the two.
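
To check my understanding, here is a rough userspace sketch of the scheme
as described above. It is not kernel code: the sketch_* names, the struct,
and NSEC_PER_JIFFY (which assumes HZ=250) are all made up for illustration,
and locking and per-cpu caching are ignored entirely. Only
current_nfsd_time and the "skip the bump when the clock is already ahead
of mtime" rule come from the discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: HZ=250, so one jiffy is 4,000,000 ns. Illustrative only. */
#define NSEC_PER_JIFFY 4000000ULL

/* Hypothetical stand-in for the one inode field that matters here. */
struct sketch_inode {
	uint64_t mtime_ns;
};

/* The single global clock, in nanoseconds. */
static uint64_t current_nfsd_time;

/* Counts the extra 1 ns ticks, so the demo can show how rare they are. */
static unsigned long sketch_bumps;

/* Timer tick: move the clock to the jiffy boundary, never backwards. */
static void sketch_jiffy_tick(uint64_t jiffies)
{
	uint64_t t = jiffies * NSEC_PER_JIFFY;

	if (t > current_nfsd_time)
		current_nfsd_time = t;
}

/* A data update stamps the inode with the current global clock value. */
static void sketch_file_update_time(struct sketch_inode *inode)
{
	inode->mtime_ns = current_nfsd_time;
}

/*
 * Report mtime to the client. If the clock is already past the inode's
 * mtime, the next update is guaranteed a different timestamp, so the
 * bump is skipped. Otherwise advance the clock by 1 ns.
 */
static uint64_t sketch_nfsd_getattr(const struct sketch_inode *inode)
{
	if (current_nfsd_time > inode->mtime_ns)
		return inode->mtime_ns;

	current_nfsd_time += 1;
	sketch_bumps++;
	return inode->mtime_ns;
}

/* One update followed by three getattrs, then a second update. */
static unsigned long sketch_demo(void)
{
	struct sketch_inode a = { 0 };
	uint64_t m1, m2, m3, m4;

	current_nfsd_time = 0;
	sketch_bumps = 0;

	sketch_jiffy_tick(1);
	sketch_file_update_time(&a);
	m1 = sketch_nfsd_getattr(&a);	/* bumps the clock once */
	m2 = sketch_nfsd_getattr(&a);	/* clock already ahead: no bump */
	m3 = sketch_nfsd_getattr(&a);	/* likewise */
	assert(m1 == m2 && m2 == m3);	/* no modification, same mtime */
	assert(sketch_bumps == 1);

	sketch_file_update_time(&a);	/* data modified within the jiffy */
	m4 = sketch_nfsd_getattr(&a);
	assert(m4 != m1);		/* modification is visible */

	sketch_jiffy_tick(2);		/* back on a jiffy boundary */
	assert(current_nfsd_time == 2 * NSEC_PER_JIFFY);

	return sketch_bumps;
}
```

In this sketch the three getattrs after the first update cost one bump
between them, and the second update gets an mtime 1 ns later than the
first, which is the "different values iff modified" property. The later
jiffy tick also shows the clock stepping by 1 ns within a jiffy and then
jumping back to a jiffy boundary.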
Shortcomings:
	- The clock advances only in steps of either 1 jiffy or 1 ns.
	  This will look odd. But when the alternative is steps of 1
	  jiffy or 0 ns, it seems an improvement....
	- A slowdown due to file_update_time() marking inodes dirty
	  more frequently?
	- Doesn't help with ext3. Oh well.
Would the extra expense rule out treating sys_stat() the same as nfsd?
It would be nice to be able to solve the same problem for userspace
nfsd's (or any other application that might be using mtime to save
rereading data).
--b.