Re: HAMMER update 19-June-2008 (56C) (HEADS UP - MEDIA CHANGED)

Previous thread: HAMMER update 19-June-2008 (56C) (HEADS UP - MEDIA CHANGED) by Matthew Dillon on Thursday, June 19, 2008 - 11:23 pm. (2 messages)

Next thread: Re: HAMMER update 19-June-2008 (56C) (HEADS UP - MEDIA CHANGED) by Matthew Dillon on Friday, June 20, 2008 - 11:20 am. (2 messages)
From: Matthew Dillon
Date: Friday, June 20, 2008 - 9:57 am

Yah, I agree.  Here's a quick summary of the issues:

	* UNDO records are used to compartmentalize atomic changes which
	  cover multiple disk blocks.  For example, if you 'rm' a file
	  and a crash occurs, you want the state of the filesystem to
	  either show the file and its directory entry both removed, or show
	  the file and its directory entry both still present.

	* Updates to the inode_data, which holds the stat/chmod info for
	  a file object, typically require rolling a new inode_data record
	  with the old one still available via the filesystem history.  For
	  example, if you append some stuff to an existing file, an old
	  version of the inode_data must be present in order to 'see' the
	  previous state of the file (in particular, the previous st_size
	  of the file).

	* BUT, having to do any of the above when updating atime and mtime
	  would be really expensive.

	  - atime gets updated all the time. We definitely do not want to
	    roll UNDO records *or* new inode_data records.

	  - mtime gets updated all the time in certain situations, such as
	    when overwriting a file (e.g. in ways that do not modify the
	    file's size).

	  - mtime is often used to uniquely determine whether a file has
	    been modified.

	* And, finally, we want mirroring to work properly even if the
	  filesystem is mounted 'nohistory' (told not to roll new
	  inode_data records).  Or, for that matter, if individual files
	  are chflagged 'nohistory'.
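
    To make the first point concrete, here is a toy sketch of how UNDO
    records keep a multi-block change atomic.  It is not HAMMER's code or
    on-disk format: the "disk" is just a dict of block number -> bytes,
    and the names UndoLog, write_block, commit and recover are made up
    for illustration.

```python
# Toy model only. A real filesystem must get each UNDO record onto
# stable storage before the block it protects is overwritten.
class UndoLog:
    def __init__(self, disk):
        self.disk = disk
        self.records = []  # (block_no, old_contents) pairs, oldest first

    def write_block(self, block_no, new_contents):
        # Save the old image first so the change can be rolled back
        # if the transaction never reaches commit().
        self.records.append((block_no, self.disk.get(block_no)))
        self.disk[block_no] = new_contents

    def commit(self):
        # Every block of the transaction is in place; the UNDO
        # records are no longer needed.
        self.records.clear()

    def recover(self):
        # After a crash: restore old images, newest record first.
        for block_no, old in reversed(self.records):
            if old is None:
                self.disk.pop(block_no, None)
            else:
                self.disk[block_no] = old
        self.records.clear()


def crash_during_rm():
    # Block 1 holds the directory entry, block 2 the file's inode.
    disk = {1: b"dirent:foo", 2: b"inode:foo"}
    log = UndoLog(disk)
    log.write_block(1, b"")  # directory entry removed ...
    # ... crash here, before the inode is removed and before commit()
    log.recover()
    return disk
```

    After recover() the directory entry is back, so the file and its
    directory entry are both still present, which is one of the two
    acceptable outcomes for the 'rm' example.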

    The bane of HAMMER's design is that we absolutely do not want to roll
    new inode_data records unless we have to, so here is what I am going to
    do:

	* ATime will be updated asynchronously and will not be CRCd, so
	  the B-Tree element's CRC field does not have to be updated.
	  (thus no UNDO records need to be generated either).

	* MTime will be updated semi-synchronously and will be CRCd.
	  (It will be fully synchronous from the point of view of
	  anyone using the filesystem, of course).  UNDO ...
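
    The difference between the two fields can be sketched roughly as
    follows.  The record layout here (size, mtime, atime) and the
    function names are hypothetical, not HAMMER's actual structures; the
    only point is that a field left outside the CRC'd region can change
    without the stored CRC having to change.

```python
import struct
import zlib


def pack_covered(size, mtime):
    # Hypothetical layout: only these fields are covered by the CRC.
    return struct.pack("<QQ", size, mtime)


def record_crc(size, mtime, atime):
    # atime is deliberately left out of the CRC'd region, so an
    # atime-only update leaves the stored CRC valid.
    return zlib.crc32(pack_covered(size, mtime))
```

    With this layout, record_crc(10, 100, 1) equals
    record_crc(10, 100, 999), while changing mtime from 100 to 101
    yields a different CRC and therefore requires the CRC field to be
    rewritten.
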
From: Dan M
Date: Friday, June 20, 2008 - 10:43 am

On Fri, Jun 20, 2008 at 12:57 PM, Matthew Dillon

Pardon my ignorance if I am missing something, I haven't looked much
into HAMMER yet.

Will the FS have the same atomic update features that UFS has? Meaning
fsync(2) returns only when all directory entries are safely on the
disk (whether it's with softupdate-type ordering or journaling). It's
important for mail servers and such so they don't lose messages at the
time of powerfail/crash. If you dig around mailing lists, you'll find
interesting stories of people who ran mail servers on an FS mounted
async (the default Linux EXT2/3 mount) and found some of their
messages in lost+found. AFAIK, at least on Linux, fsync returns early
in that case (not atomic), so software written with BSD behavior in
mind wasn't safe to run without patching.
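
For reference, the kind of delivery sequence this is about looks
roughly like the sketch below. It is a generic POSIX pattern, not tied
to HAMMER or to any particular mail server, and deliver_durably is a
made-up name. It fsyncs the containing directory as well as the file,
which is the conservative approach when the filesystem is not known to
make the directory entry durable as part of fsync on the file.

```python
import os


def deliver_durably(directory, name, data):
    tmp_path = os.path.join(directory, name + ".tmp")
    final_path = os.path.join(directory, name)

    # Write the message under a temporary name and force it to disk.
    with open(tmp_path, "xb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())

    # Atomically give the message its final name.
    os.rename(tmp_path, final_path)

    # Force the directory entry itself to disk (POSIX systems only).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return final_path
```

Only after the function returns would the server acknowledge the
message to the sender.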

Also, will there be a feature to grow and/or shrink the FS live without
having to unmount? I can do this right now with XFS and LVM on Linux
(grow, but not shrink), and it's working amazingly well and very
quickly to boot.

Thanks.

-- 
Dan