Re: Proposal: Use hi-res clock for file timestamps

Previous thread: defunct processes owned by init, not being reaped ? by Chris Friesen on Friday, August 13, 2010 - 11:22 am. (1 message)

Next thread: BUG: IPv6 stops working after a while, needs ip ne del command to reset by Thomas Habets on Friday, August 13, 2010 - 10:55 am. (24 messages)
From: Patrick J. LoPresti
Date: Friday, August 13, 2010 - 11:25 am

For concreteness, let me start with the patch I have in mind.  Call it
"patch version 1".


--- linux-2.6.32.13-0.4/kernel/time.c.orig      2010-08-13 10:52:50.000000000 -0700
+++ linux-2.6.32.13-0.4/kernel/time.c   2010-08-13 10:53:20.000000000 -0700
@@ -229,7 +229,8 @@ SYSCALL_DEFINE1(adjtimex, struct timex _
  */
 struct timespec current_fs_time(struct super_block *sb)
 {
-        struct timespec now = current_kernel_time();
+        struct timespec now;
+        getnstimeofday(&now);
         return timespec_trunc(now, sb->s_time_gran);
 }
 EXPORT_SYMBOL(current_fs_time);

...

I recently spent nearly a week tracking down an NFS cache coherence
problem in an application:

http://www.spinics.net/lists/linux-nfs/msg14974.html

Here is what caused my problem:

1) File dir/A is created locally on NFS server.
2) NFS client does LOOKUP on file dir/B, gets ENOENT.
3) File dir/B is created locally on NFS server.

In my case, these all happened in less than 4 milliseconds (much less,
actually).  Since HZ on my system is 250, the file creation in step
(3) failed to update the ctime/mtime on the directory.  The result is
that the NFS client's "dentry lookup cache" became stale, but did not
know it was stale (since it relies on the directory ctime/mtime to
detect that).  Worse, the staleness persists even if additional
changes are made to the directory from the NFS client, thanks to NFS
v3's "weak cache consistency" optimizations.
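
To make the failure concrete, here is a minimal userspace sketch of the effect, not kernel code. It assumes a clock that only advances once per tick at HZ=250; the helper names coarse_time_ns() and change_is_invisible() are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define HZ           250                    /* tick rate quoted in the message */
#define NSEC_PER_SEC 1000000000ULL
#define TICK_NSEC    (NSEC_PER_SEC / HZ)    /* 4,000,000 ns = 4 ms */

/* Hypothetical stand-in for a tick-based clock: round a precise
 * nanosecond time down to the start of the tick that contains it. */
static uint64_t coarse_time_ns(uint64_t precise_ns)
{
    return precise_ns - (precise_ns % TICK_NSEC);
}

/* Returns 1 if a client that cached the directory mtime taken at
 * cached_ns cannot tell that the directory changed again at change_ns,
 * because both events were stamped with the same tick. */
static int change_is_invisible(uint64_t cached_ns, uint64_t change_ns)
{
    assert(change_ns >= cached_ns);
    return coarse_time_ns(cached_ns) == coarse_time_ns(change_ns);
}
```

With these definitions, a second create 300 usec after the first lands in the same 4 ms tick and leaves the directory mtime unchanged, which is the situation in steps (1) to (3).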

Why did this take me a week to diagnose?  Because I am using XFS, and
I know XFS and NFS use nanosecond resolution for file timestamps.  It
never occurred to me that, here in 2010, Linux would have an actual
file timestamp resolution 6.5 orders of magnitude worse.

I know, I know, "use NFS v4 and i_version".  But that is not the
point.  The point is that 4 milliseconds is a very long time these
days; an awful lot of file system operations can happen in such an
interval.

I am guessing the objection to the above patch will be:  "Waaah it's
slow!"  My ...
From: john stultz
Date: Friday, August 13, 2010 - 11:45 am

On Fri, Aug 13, 2010 at 11:25 AM, Patrick J. LoPresti

Your stats are off here. The only fast clocksource on x86 is the TSC,
and it's busted on many, many systems. The CPU vendors have only
recently taken it seriously and resolved the majority of problems
(however, issues still remain on large NUMA systems, but it's much
better than the story was 3-7 years ago).

On those TSC-broken systems that use the HPET or acpi_pm, a
getnstimeofday call can take 0.5-1.3us, so the penalty can be quite
severe. And even with the TSC, expect some performance impact, as
reading hardware and doing the multiply is more costly than just
fetching a value from memory.

thanks
-john
--

From: Patrick J. LoPresti
Date: Friday, August 13, 2010 - 11:57 am

Thank you for the correction.  Still, the number of systems where TSC
works is large, it is growing over time, and....  Really now,

So you are saying my proposal is a bad idea forever?  (But then why
even bother having nanosecond resolution on ext4?)

Or that it is a bad idea for now?

Or that it needs to be refined?  Maybe use hi-res precision on systems

Relative to file system operations?  Seriously?  What performance hit
would you expect on real-world applications?
Something like 0.1% (10 nsec / 10 usec) worst case?

 - Pat
--

From: john stultz
Date: Friday, August 13, 2010 - 12:09 pm

I'm not judging the idea as good/bad, just providing information for

If you can show this does not affect performance in benchmarks, etc, I'm
sure it will be easier to push the patch. Outside of performance, I
don't think there's much of an issue with the change.

So other than "show some numbers", my only thought that might make the
patch more attractive is that rather than a global change, or a static
CONFIG_ option, would it maybe make more sense as a mount option?

thanks
-john

--

From: Patrick J. LoPresti
Date: Friday, August 13, 2010 - 1:53 pm

I really like this idea.

Consider the following "revision 2" of my proposal:

1) Add a function pointer "current_fs_time" to struct super_block.

2) Replace all calls of the form:

    current_fs_time(sb);

with

    sb->current_fs_time(sb);

3) Arrange for the default value to point to the current implementation.

These first three could be one patch.  They change no functionality;
they just enable the next step.

Finally:

4) Add a mount option to cause sb->current_fs_time(sb) to use the
hi-res implementation.
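
The four steps can be sketched in userspace C as follows. This is only an illustration of the shape of the change: struct mock_super_block, the two mock time sources, and the hires_opt flag are all hypothetical stand-ins, not the kernel's struct super_block or any real mount option.

```c
#include <assert.h>
#include <stdint.h>

struct fs_time { int64_t sec; int32_t nsec; };

/* Hypothetical stand-ins for the two kernel time sources. */
static struct fs_time mock_tick_time  = { 100, 4000000 };  /* updated once per tick */
static struct fs_time mock_hires_time = { 100, 4123456 };  /* read from hardware */

struct mock_super_block {
    uint32_t s_time_gran;  /* timestamp granularity in ns */
    /* Step 1: the per-superblock function pointer. */
    struct fs_time (*current_fs_time)(struct mock_super_block *sb);
};

/* Truncate to the filesystem granularity, in the spirit of timespec_trunc(). */
static struct fs_time trunc_to_gran(struct fs_time t, uint32_t gran)
{
    assert(gran > 0 && gran <= 1000000000u);
    t.nsec -= t.nsec % (int32_t)gran;
    return t;
}

static struct fs_time fs_time_coarse(struct mock_super_block *sb)
{
    return trunc_to_gran(mock_tick_time, sb->s_time_gran);
}

static struct fs_time fs_time_hires(struct mock_super_block *sb)
{
    return trunc_to_gran(mock_hires_time, sb->s_time_gran);
}

/* Step 3: default to the existing behaviour.
 * Step 4: a (hypothetical) mount flag selects the hi-res source. */
static void mock_mount(struct mock_super_block *sb, uint32_t gran, int hires_opt)
{
    sb->s_time_gran = gran;
    sb->current_fs_time = hires_opt ? fs_time_hires : fs_time_coarse;
}
```

Callers would then use sb->current_fs_time(sb) as in step 2. Note that truncation still applies, so a filesystem with one-second granularity gains nothing from the hi-res source.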

Comments?

 - Pat
--

From: Bret Towe
Date: Saturday, August 14, 2010 - 6:50 pm

I'm not sure how NFS works, but if this is a client-side issue I don't
see anything wrong with a CONFIG_ item. If it's server-side, it might be
better off as a procfs or sysfs tunable so reboots are not required to
change the setting.

Performance-wise, why would there be any difference? The same number of
bits are being set on the disk drive, no?
--

From: Andi Kleen
Date: Tuesday, August 17, 2010 - 7:54 am

I suspect it will be a performance disaster on x86 for VFS intensive
applications on capable file systems. VFS is very performance
critical. These checks lurk in unexpected places too, e.g. on /dev
access.

Even TSC is much slower than just reading the variable.

Also you should check if the file system granularity
even supports it, it's completely wasted on an ext3 for example.

Maybe as an optional sysctl, default to off.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.
--

From: J. Bruce Fields
Date: Tuesday, August 17, 2010 - 10:41 am

OK, so that leaves us with the race, even on newer filesystems:

	1. File is modified, mtime updated
	2. Client fetches mtime to revalidate cache
	3. File is modified again, mtime updated
	4. Client fetches new mtime to revalidate cache

If step 3 doesn't change the mtime, then step 4 (no matter how much
later it is performed) will return the wrong result, and client
applications will see stale data.

If we want to avoid that race, every modification of file data must
result in the mtime being updated to something different from the last
mtime seen by the client.

(A slight window between data modification and mtime update may be OK,
as long as the update happens eventually, and before the change is
committed to disk--close-to-open semantics mean that NFS clients can
live with not seeing changes until data is written to disk.)

Possible responses:

	- Tell everyone to use NFSv4 (and make sure we have
	  changeattr/i_version working correctly).
	- Use a finer-grained time source.  (I believe you when you say
	  the TSC is too slow, but maybe we should run some tests to
	  make sure.)
	- Increment mtime by a nanosecond when necessary.  
	- ?

--b.
--

From: Andi Kleen
Date: Tuesday, August 17, 2010 - 11:29 am

You'll always have a race window with time, the only way around


You cannot be more precise than the backing file system: this causes
non-monotonicity when the inodes are flushed (has happened in the past).

-Andi
--

From: Patrick J. LoPresti
Date: Tuesday, August 17, 2010 - 11:50 am

True.  But non-toy filesystems designed post-1990 support nanosecond
timestamps.  One-second resolution is a disaster for NFS servers and
always has been.  I wonder how many man-hours have been wasted dealing
with the problem...  I personally have seen it dozens of times in my
career, but "use XFS" was always a solution.  Until now, that is, when
300usec round trip network times, 3GHz processors, and "weak cache
consistency" optimizations have conspired to bring it back, thanks to
4msec resolution timestamps.

Even aside from any NFS issues, I myself would prefer accurate
timestamps over a 10% boost for tight loops calling "utimes()" or
whatever.  But maybe that is just me.

Anyway, to repeat my revised proposal:

1) Add a "sb_current_fs_time" member to "struct super_block".  Make it
a pointer to a function returning "struct timespec".  Have it default
to the current low-resolution implementation.

2) Modify the function "current_fs_time(struct super_block *sb)" to
just "return sb->sb_current_fs_time(sb)".  Might as well inline it
too.

3) Add a mount option to allow the selection of the high-res time
source; i.e., to set sb_current_fs_time to point to the high-res
implementation.

Would patches implementing this stand a realistic chance of being accepted?

 - Pat
--

From: J. Bruce Fields
Date: Tuesday, August 17, 2010 - 12:04 pm

Agreed, but as a practical matter, nanosecond resolution would extend

If we wanted to look into this, what would you suggest (hardware,
workload) to demonstrate the worst case?  (Or are the results from the
TSC or any other higher-precision time source likely to be useless for

Right, I think that we probably have to give up ext3 as a lost cause.
But perhaps we could get away with a hack like this on filesystems that
can store nanoseconds.

--b.
--

From: Patrick J. LoPresti
Date: Tuesday, August 17, 2010 - 12:18 pm

I do not think so.

The problem with "increment mtime by a nanosecond when necessary" is
that timestamps can wind up out of order.  As in:

1) Do a bunch of operations on file A
2) Do one operation on file B

Imagine each operation on A incrementing its timestamp by a nanosecond
"just because".  If all of these operations happen in less than 4 ms,
you can wind up with the timestamp on B being EARLIER than the
timestamp on A.  That is a big no-no (think "make" or anything else
relying on timestamps for relative times).

If you can prove that the last modification on B happens after the
last modification on A, then it is very bad for the mtime on B to be
earlier than the mtime on A.  I guarantee that will break things in
the real world.
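
The ordering problem can be reproduced with a small userspace sketch. It assumes a purely per-file rule (use the tick time, or old mtime plus one nanosecond if the tick time would not change it); bump_mtime() and b_looks_older_than_a() are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define TICK_NSEC 4000000ULL  /* 4 ms tick, i.e. HZ=250 */

/* Hypothetical per-file rule: take the tick-based time, but if that
 * would leave the file's mtime unchanged, add one nanosecond instead. */
static uint64_t bump_mtime(uint64_t old_mtime_ns, uint64_t now_ns)
{
    uint64_t coarse = now_ns - (now_ns % TICK_NSEC);
    return coarse > old_mtime_ns ? coarse : old_mtime_ns + 1;
}

/* Apply n_ops_on_a modifications to file A, then one modification to
 * file B, all inside the tick that starts at tick_start_ns.
 * Returns 1 if B ends up with an earlier mtime than A. */
static int b_looks_older_than_a(uint64_t tick_start_ns, unsigned n_ops_on_a)
{
    uint64_t mtime_a = 0, mtime_b = 0;
    unsigned i;

    assert(tick_start_ns % TICK_NSEC == 0);
    assert(n_ops_on_a + 1 < TICK_NSEC);

    for (i = 0; i < n_ops_on_a; i++)
        mtime_a = bump_mtime(mtime_a, tick_start_ns + i);
    mtime_b = bump_mtime(mtime_b, tick_start_ns + n_ops_on_a);
    return mtime_b < mtime_a;
}
```

Three quick operations on A followed by one on B leave A a couple of nanoseconds "newer" than B even though B was modified last, which is the inversion described above.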

As you say, high-resolution timestamps "will extend the useful
lifetime of NFSv3 by quite a bit".  They are also a good idea in
principle, IMO.  Correctness is almost always more important than
performance.

 - Pat
--

From: Alan Cox
Date: Tuesday, August 17, 2010 - 12:39 pm

[time resolution bits of data][value incremented value for that time]


	if (time_now == time_last)
		return { time_last , ++ct };
	else {
		ct = 0;
		time_last = time_now;
		return { time_last , 0 };
	}

providing it is done with the same 'ct' across the fs and you can't do
enough ops/second to wrap the nanosecs - which should be fine for now,

Alan
--

From: J. Bruce Fields
Date: Tuesday, August 17, 2010 - 12:29 pm

Right, so if I understand correctly, you're proposing a time source
that's global to the filesystem and that guarantees it will always
return a unique value by incrementing the nanoseconds field if jiffies
haven't changed since the last time it was called.

(Does it really need to be global across all filesystems?  Or is it
unreasonable to expect your unbelievably-fast make's to behave well when
sources and targets live on different filesystems?)

--b.
--

From: Alan Cox
Date: Tuesday, August 17, 2010 - 12:52 pm

I don't believe it does for the NFS semantics. You can't do it globally
because then you get weirdness between local file systems that support
u/nsecs and those that don't.

It's enough to fix NFS I believe.

Alan
--

From: Neil Brown
Date: Tuesday, August 17, 2010 - 10:53 pm

On Tue, 17 Aug 2010 15:29:38 -0400

I'm not sure you even want to pay for a per-filesystem atomic access when
updating mtime.  mnt_want_write - called at the same time - seems to go to
some lengths to avoid an atomic operation.

I think that nfsd should be the only place that has to pay the atomic
penalty, as it is where the need is.

I imagine something like this:
 - Create a global struct timespec which is protected by a seqlock
   Call it current_nfsd_time or similar.
 - file_update_time reads this and uses it if it is newer than
   current_fs_time.
 - nfsd updates it whenever it reads an mtime out of an inode that matches
   current_fs_time to the granularity of 1/HZ.
   If the current value is before current_kernel_time, it
   is set to current_kernel_time, otherwise tv_nsec is incremented -
   unless that would take it more than jiffies_to_usecs(1)*1000
   beyond current_kernel_time.
 - the global 'struct timespec' is zeroed whenever system time is set
   backwards.

Then - providing the fs stores nanosecond timestamps - we should have stable,
globally ordered, precise (if not entirely accurate) time stamps, and a
penalty would only be paid when nfsd actually needs the information.


[[You could probably make ext3 work reasonably well by adding a mount option
  which:
    - advertises s_time_gran as 1
    - when storing: rounds timestamps up to the next second if tv_nsec != 0
    - when loading: sets the timestamp to the current time if the stored
      number matches current_kernel_time().tv_sec+1
  You would get occasional forward jumps in mtime, but usually when you
  aren't looking, and at least you would not get real changes that are not
  reflected in mtime
]]
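
The store and load rules in the bracketed aside can be sketched as two small functions. This is an illustration only, assuming the on-disk field holds whole seconds; struct fs_time, store_rounded_up() and load_stored() are hypothetical names and not part of ext3.

```c
#include <assert.h>
#include <stdint.h>

struct fs_time { int64_t sec; int32_t nsec; };

/* Storing: the on-disk field keeps whole seconds only, so round up
 * whenever there is a sub-second part. */
static int64_t store_rounded_up(struct fs_time t)
{
    assert(t.nsec >= 0 && t.nsec < 1000000000);
    return t.nsec != 0 ? t.sec + 1 : t.sec;
}

/* Loading: a stored value of now.sec + 1 can only have come from
 * rounding up within the current second, so substitute the current
 * time; otherwise use the stored second as is. */
static struct fs_time load_stored(int64_t disk_sec, struct fs_time now)
{
    struct fs_time t = { disk_sec, 0 };

    if (disk_sec == now.sec + 1)
        t = now;
    return t;
}
```

A timestamp loaded more than a second after it was stored comes back rounded up, which is the "occasional forward jump" the aside mentions.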

NeilBrown
--

From: Patrick J. LoPresti
Date: Wednesday, August 18, 2010 - 7:46 am

I think nfsd can simply update current_nfsd_time whenever the mtime it
reads from an inode is >= current_nfsd_time.  (The invariant you need
to maintain is that whenever nfsd reads an mtime, any timestamps
produced after that have a later time.  So just code it that way


But I do not believe this works.

1) Modify file A
2) Modify file B
3) File A experiences one of those "occasional forward jumps in mtime"
(inode evicted + read back within 1 second)
4) mtimes on A and B are now out of order -- very bad

As Bruce mentioned, ext3 is a lost cause.

Regardless of any of this, however, the first step is to provide a
mount option to select the timestamp algorithm...  Because it is still
absurd that I cannot have accurate timestamps on my files here in the
21st century.

Once that is done, the rest is just providing the alternative
implementations and choosing defaults.

 - Pat
--

From: J. Bruce Fields
Date: Wednesday, August 18, 2010 - 10:32 am

We can also skip the update whenever current_nfsd_time is greater than
the inode's mtime--that's enough to ensure that the next
file_update_time() call will get a time different from the inode's
current mtime.

And that means that a sequence like

	file_update_time()
	N nfsd_getattr()'s


... which would only happen on hardware that could process a getattr and

OK, got it, I think: so this is the same as a global version of Alan's
clock, except that the extra ticks only happen when they need to.

The properties it satisfies:

	- It's still a single global clock, so it's consistent between
	  files.
	- It degenerates to jiffies in the absence of getattr's from
	  nfsd.
	- It need only invalidate the other cpus' cached value of the
	  clock on the first getattr of a file that follows less than a
	  jiffy after an update of the file's data.
	- Absent utime(), time going backwards, or futuristic hardware,
	  it guarantees that two nfsd reads of an inode's mtime will
	  return different values iff the inode's data was modified in
	  between the two.

Shortcomings:

	- The clock advances in units only of either 1 jiffy or 1 ns.
	  This will look odd.  But when the alternative is units of 1
	  jiffy or 0 ns, it seems an improvement....
	- A slowdown due to file_update_time() marking inodes
	  dirty more frequently?
	- Doesn't help with ext3.  Oh well.

Would the extra expense rule out treating sys_stat() the same as nfsd?
It would be nice to be able to solve the same problem for userspace
nfsd's (or any other application that might be using mtime to save
rereading data).

--b.
--

From: Chuck Lever
Date: Wednesday, August 18, 2010 - 11:15 am

Would it help if we only did this for directories, for now?


-- 
chuck[dot]lever[at]oracle[dot]com




--

From: Neil Brown
Date: Wednesday, August 18, 2010 - 4:41 pm

On Wed, 18 Aug 2010 14:15:51 -0400

I don't quite see how close-to-open really affects this issue - it still
relies on the timestamps and so can cache old data if a file update didn't
change the timestamp.

In my mind the difference is that near-concurrent access to files usually
involves file locking which flushes caches (and if it doesn't then you have
bigger problems) while near-concurrent access to directories relies on the
natural atomicity of dir operations so no locking or flushing occurs.

So I agree that this is probably more of an issue for directories than for
files, and that implementing it just for directories would be a sensible
first step with lower expected overhead - just my reasoning seems to be a bit
different.

Thanks,
NeilBrown
--

From: Neil Brown
Date: Wednesday, August 18, 2010 - 5:52 pm

On Thu, 19 Aug 2010 09:41:36 +1000

Just to be sure we are on the same page:
  file_update_time would always refer to current_nfsd_time, but nfsd would
  only update current_nfsd_time when a directory was examined (and the other
  conditions were met).


So my current thinking on how this would look - names have been changed:

 - global timespec 'current_fs_precise_time' is zeroed when
   current_kernel_time moves backwards and is protected by a seqlock

 - current_fs_time would be
         now = max(current_kernel_time(), current_fs_precise_time)
         return timespec_trunc(now, sb->s_time_gran)
   (with appropriate seqlock protection)

 - new function in fs/inode.c
         get_precise_time(timestamp)
                cft = current_fs_time()
                if (timestamp == cft)
                   write_seqlock()
                   if cft == current_fs_precise_time
                        current_fs_precise_time.tv_nsec++
                   else if cft > current_fs_precise_time
                        current_fs_precise_time = cft
                   write_sequnlock()
                return timestamp

  - nfsd xdr response routine does
             ts = inode->i_mtime
             if (S_ISDIR(inode->i_mode))
                ts = get_precise_time(ts)
             xdr_encode_timespec(ts)


get_precise_time() probably needs a bit more subtlety to handle different
s_time_gran values and possible races, but I think it is fairly close.
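
Read as ordinary single-threaded C, the outline above might look like the sketch below. Several things here are assumptions and not part of the outline: times are flattened to nanosecond counts, the seqlock and the s_time_gran truncation are left out, struct precise_clock and the clk_ names are hypothetical, and precise_time is set to cft + 1 whenever the timestamp matches. That last point deliberately differs from the else-branch of the pseudocode, on the assumption that the goal is for the next current_fs_time() to differ from the value just handed out.

```c
#include <assert.h>
#include <stdint.h>

struct precise_clock {
    uint64_t kernel_time;   /* stand-in for current_kernel_time(), per tick */
    uint64_t precise_time;  /* stand-in for current_fs_precise_time */
};

/* max(current_kernel_time, current_fs_precise_time), without truncation. */
static uint64_t clk_current_fs_time(const struct precise_clock *c)
{
    return c->precise_time > c->kernel_time ? c->precise_time : c->kernel_time;
}

/* Called when a timestamp is about to be handed to an NFS client.
 * If the clock would still produce the same value, push it forward by
 * one nanosecond so the next mtime update is distinguishable. */
static uint64_t clk_get_precise_time(struct precise_clock *c, uint64_t timestamp)
{
    uint64_t cft = clk_current_fs_time(c);

    /* Assumes no utime() and no backwards clock step. */
    assert(timestamp <= cft);

    if (timestamp == cft)
        c->precise_time = cft + 1;
    return timestamp;
}
```

In this form the extra nanosecond is only spent when a client actually observes a timestamp within the current tick; without such reads the clock degenerates to the tick-based time.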

Then if we ever had an xstat or similar that could ask for precise
timestamps, it just makes a similar call to get_precise_time.
Also if we added code later to use a hires timer on hardware where it was
efficient, get_precise_time could test for that and become a no-op

Yes, I should probably turn this into a patch ... maybe another day.

NeilBrown
--

From: J. Bruce Fields
Date: Wednesday, August 18, 2010 - 7:08 pm

Odd name for something that returns nothing of interest;
bump_precise_time() might be closer?

And unique_time might be better than precise_time, since the property
we're asking for is that mtime on a changed file be new?  (Or
		     /*
		      * Make sure the next mtime stored will be
		      * something different from timestamp:

What's the cft < current_fs_precise_time case?

--

From: Neil Brown
Date: Wednesday, August 18, 2010 - 7:44 pm

On Wed, 18 Aug 2010 22:08:03 -0400

Agreed on both counts, though I'm not keen on 'bump' myself.
  got_unique_time()
because that is what we just did...  I prefer the name to reflect why the
function is called, rather than what the function is expected to do about it.

The current_fs_precise_time has been incremented with a resolution higher
than s_time_gran.  i.e. s_time_gran > 1.
I'm not really sure what we want to do about that.
Maybe we should be incrementing tv_nsec by s_time_gran as long as that is
significantly less than jiffies_to_usecs(1)*1000, but I don't know what I mean
by 'significantly'.

The only values I can find for s_time_gran in current code are 1, 100, 1000
and 1000000000.
All those are either way bigger than a jiffy or significantly smaller, but
suppose a filesystem came along that chose 1000000 (i.e. millisecond
timestamps) - should we increment tv_nsec by 1000000, or not, or cross that
bridge when we come to it?

For reference:
  default is 1000000000  (this would cover ext2, ext3, reiserfs, fat, sysv, ...)
  cifs, smbfs, ntfs are 100
  udf, ceph are 1000
  rest (btrfs, ext4, gfs2, jfs, nilfs, ocfs2, xfs and virtual filesystems) are 1
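
One way to express the "increment by s_time_gran if it is smaller than a jiffy" idea is sketched below. It assumes the simple "less than" comparison (no notion of "significantly"), and retire_step_ns() is a hypothetical helper name, not an existing kernel function.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000u

/* Pick how far to advance tv_nsec when a timestamp must be made unique.
 * Returns s_time_gran when it is smaller than one tick, and 0 when the
 * filesystem cannot represent anything finer than a tick anyway. */
static uint32_t retire_step_ns(uint32_t s_time_gran, uint32_t hz)
{
    uint32_t tick_ns;

    assert(s_time_gran > 0 && hz > 0);
    tick_ns = NSEC_PER_SEC / hz;
    return s_time_gran < tick_ns ? s_time_gran : 0;
}
```

For the granularities listed above at HZ=250 this gives steps of 1, 100 and 1000 ns, and 0 for one-second filesystems. A hypothetical millisecond filesystem would step by 1000000 ns at HZ=250 but get 0 at HZ=1000.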

NeilBrown

--

From: J. Bruce Fields
Date: Thursday, August 19, 2010 - 3:46 pm

Maybe "retire" for a pithier version of never_use_again:

/**
 * retire_timestamp - prevent a timestamp from being reused as an mtime.
 * @timestamp
 *
 * Advance the clock used to generate mtimes to guarantee that the
 * given timestamp will not be reused on any future mtime update.
 * This allows the given timestamp to be passed back to users such as
 * nfs clients which need the guarantee that mtimes will always change
 * on file updates.
 *
 * Depending on the filesystem's s_time_gran this may not be an ironclad
 * guarantee.
 */


How about just scratching "significantly" and saying "less"?  As long as
we know jiffies is the default time source for mtimes, that should be


Interesting list, thanks!

--b.
--

From: Neil Brown
Date: Wednesday, August 18, 2010 - 4:47 pm

On Wed, 18 Aug 2010 13:32:03 -0400


It would be nice, but I would be loath to add any cost to 'stat' unless we
knew it was needed.
If we had an xstat() which could explicitly ask for
high-precision-time-stamps, then yes - otherwise maybe not.

(or maybe define a system:linux.xxxx xattr which would read as a
high-precision time stamp...  I seem to be warming to the idea of using the
xattr interface for enhancing stat).

NeilBrown
--

From: J. Bruce Fields
Date: Wednesday, August 18, 2010 - 11:54 am

Only if those mtime changes are also followed immediately by nfsd reads
of the mtime.

That will be the typical case for nfsd writes, though.

--b.
--

From: Andi Kleen
Date: Wednesday, August 18, 2010 - 12:25 pm

If multiple writers are changing the same location in quick succession 
you have a hot cache line that gets bounced around.  It doesn't need reads,
although reads make it even worse.

There's a lot of effort currently to make the VFS more parallel
and less synchronized and it would be bad to regress here again.

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.
--

From: J. Bruce Fields
Date: Wednesday, August 18, 2010 - 12:30 pm

OK, at this point one of us is confused, and I'm not sure which.

Is the "same location" that you're referring to the current_nfsd_time?

Neil's suggestion is to only modify current_nfsd_time on nfsd getattr,
*not* on the write operation that modifies the file data.


Understood.

--b.
--

From: Patrick J. LoPresti
Date: Tuesday, August 17, 2010 - 12:34 pm

Yes, that would work.   Assuming you use atomic counters, else there
is a risk of the visible time ticking backwards.  It seems like a lot
of effort just to avoid having accurate timestamps on your files,
though.

I am having trouble seeing why this is a better idea than a simple
mount option to obtain decent resolution timestamps.  (Not that we
can't have both...)  Is there any objection to the mount option I am
proposing?

For the Nth time, I am willing to produce and test the patch, but not
if there is zero chance of it being accepted.

 - Pat
--

From: Alan Cox
Date: Tuesday, August 17, 2010 - 12:54 pm

I have none. I doubt I'd use it as it would be too expensive on system
performance for some of my boxes, while having an incrementing value is
cheap.

I don't see the two as conflicting - in fact the bits you need to do the
mount option are the bits you also need to do the counter version as
well. One fixes ordering at no real cost, the other adds high res
timestamps, both are useful.

Alan
--

From: Patrick J. LoPresti
Date: Tuesday, August 17, 2010 - 12:43 pm

A mount option could also allow a choice of timestamp resolutions:

Traditional (i.e., fast)
Alan Cox NFS hack (a tad slower but should fix NFS)
High-res time (slowest but most accurate)

I will work on a patch this week (weekend at the latest).

Thanks, Alan.

 - Pat
--

From: J. Bruce Fields
Date: Tuesday, August 17, 2010 - 12:45 pm

I kind of hate to have mount options that are required for nfs exports
to work correctly; it soon makes things too complicated for users to
reliably get right, so distributions end up setting them, and then we
all end up taking the performance tradeoff anyway.

But a mount-option-based version may at least be useful for further
experiments.

--b.
--

From: john stultz
Date: Wednesday, August 18, 2010 - 6:41 pm

Yea, there aren't simple answers. Clocksource hardware varies
drastically in resolution and access time across systems and
architectures. Further, clocksources may change while the system is
up, so we don't really expose the hardware resolution.

On x86, access latency varies from ~50ns (TSC) to ~1.3us (ACPI PM).
(And that is ignoring the PIT, which can be 18us per call - luckily
almost no hardware uses that). The resolution similarly scales from
sub-ns (TSC @ > 1 GHz CPUs) to ~279ns (ACPI PM). Of course, across
architectures you will see even more variance.

thanks
-john
--

From: J. Bruce Fields
Date: Wednesday, August 18, 2010 - 7:31 pm

The race in question occurs when you manage to check mtime between two
file data updates, with all three operations occurring within a clock
tick.

No idea if that's feasible in hundreds of nanoseconds.

I'm also not sure how to judge the access latency.  Certainly a
microsecond is a lot compared to just reading a cached mtime value.

Will we ever see them go backwards?  (So if I know I wrote to file B
after writing to file A, is there ever a case where I could end up with
an earlier mtime on B than A?)

--b.
--

From: john stultz
Date: Wednesday, August 18, 2010 - 8:17 pm

I think this is what Andi meant that you'll always race with time and

You should not. However, there have been bugs in the past, and there
will probably be a few more in the future.

There are also theoretical issues with SMP systems where the TSCs are
not perfectly synced, but the window for those races should be small
(ie: smaller than can be detected - otherwise we'll throw out the TSC).


thanks
-john


--

From: David Woodhouse
Date: Wednesday, August 18, 2010 - 11:20 am

Um, can't you? You can't *store* timestamps which are more precise, but
they can be in cache can't they?

And since you're not going to drop it from cache and bring it back in
again within 4ms, that ought to suffice?

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation

--

From: Patrick J. LoPresti
Date: Wednesday, August 18, 2010 - 11:32 am

No.  That is how Linux used to work, and it caused many problems,

Not the problem.  As usual, the problem is out-of-order timestamps:

1) Modify file A
2) Modify file B
3) File B's inode gets evicted, truncating its timestamp to disk resolution
4) Call stat() on B, bringing it back in with truncated resolution

And boom, B appears to be OLDER than A.  Which is not allowed.
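
The four steps can be reproduced with a few lines of userspace C. This is only an illustration; struct fs_time, evict_and_reload() and is_older() are hypothetical names, and the eviction is modelled as nothing more than truncation to the on-disk granularity.

```c
#include <assert.h>
#include <stdint.h>

struct fs_time { int64_t sec; int32_t nsec; };

/* Model an inode being written out and read back: the in-memory
 * timestamp loses everything finer than the on-disk granularity. */
static struct fs_time evict_and_reload(struct fs_time t, uint32_t disk_gran_ns)
{
    assert(disk_gran_ns > 0 && disk_gran_ns <= 1000000000u);
    t.nsec -= t.nsec % (int32_t)disk_gran_ns;
    return t;
}

/* Returns 1 if x is strictly earlier than y. */
static int is_older(struct fs_time x, struct fs_time y)
{
    return x.sec < y.sec || (x.sec == y.sec && x.nsec < y.nsec);
}
```

If A is stamped at 100.2 s and B at 100.3 s, then reloading B from a filesystem with one-second granularity yields 100.0 s, and B now compares as older than A.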

This is exactly what happened when Linux first added sub-second
timestamps to the generic VFS layer.  Many complaints about "make"
rebuilding files unnecessarily, among other things.  Eventually it got
fixed by the introduction of current_fs_time().

 - Pat
--

From: Andi Kleen
Date: Wednesday, August 18, 2010 - 11:53 am

No you can't. The initial implementation did that and it broke someone's
make. After that the VFS was fixed to never be more precise than the
backing file system.

-Andi
--
