Re: [Tux3] dump feature / Documentation (was: Re: Design note: Offline filesystem check)

Previous thread: [Tux3] Time to truncate by Daniel Phillips on Monday, September 1, 2008 - 6:24 pm. (6 messages)

Next thread: [Tux3] [PATCH] Make sure ileaf offsets are in non-descending order by Conrad Meyer on Monday, September 1, 2008 - 11:56 pm. (10 messages)
From: Daniel Phillips
Date: Friday, December 12, 2008 - 2:54 pm

Here are a few notes on some simple things we can do to get some
filesystem checking going early, while being pretty lazy.

Tux3 has a very simple structure.  The superblock points to two 
different entities:

  - Filesystem tree: a btree of btrees (inode table from which
    data btrees descend)

  - Log blocks that define changes to the filesystem tree
    (format not precisely defined yet)

Tux3 has other, higher level structures, but they are all mapped into 
data files, such as the allocation bitmap, extended attribute atom 
table and directory files.  We can check the internal structure of 
these files too, however first it would be nice to know that the file 
structure itself is consistent, which is what today's proposal is 
about.

The complete and consistent state of the Tux3 filesystem is obtained by 
applying the changes encoded in log blocks to blocks encoded on disk, 
leaving the result in block cache (block images indexed by disk 
location).  We can then traverse the entire Tux3 filesystem tree in a 
straightforward way through the block cache, either with our userspace 
code or in kernel.  This idea can form the basis of a rudimentary 
filesystem consistency checker.
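
To make the replay step concrete, here is a minimal sketch. The log
format is not defined yet, so everything below is an assumption made for
illustration: the record layout (struct logrec), the toy sizes, and the
name replay_log are all hypothetical and are not Tux3 code. It only shows
the idea of copying the on-disk block images into a cache indexed by disk
location and then applying the logged changes in order, leaving the disk
image itself untouched.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCKSIZE 16      /* toy block size, hypothetical */
#define VOLUME_BLOCKS 8   /* toy volume size, hypothetical */

/*
 * Hypothetical log record: overwrite len bytes at offset within block.
 * The real Tux3 log format was not defined when this was written.
 */
struct logrec {
	uint64_t block;
	uint16_t offset, len;
	uint8_t data[BLOCKSIZE];
};

/*
 * Build the consistent state in cache: start from the on-disk block
 * images, then apply each log record in order.  Returns 0 on success
 * or -1 on the first out-of-bounds record (cache is then only partly
 * updated and should not be trusted).
 */
int replay_log(uint8_t *cache, const uint8_t *disk,
	       const struct logrec *log, unsigned count)
{
	memcpy(cache, disk, (size_t)VOLUME_BLOCKS * BLOCKSIZE);
	for (unsigned i = 0; i < count; i++) {
		const struct logrec *rec = &log[i];
		if (rec->block >= VOLUME_BLOCKS || rec->offset > BLOCKSIZE ||
		    rec->len > BLOCKSIZE - rec->offset)
			return -1;
		memcpy(cache + rec->block * BLOCKSIZE + rec->offset,
		       rec->data, rec->len);
	}
	return 0;
}
```

Later records simply overwrite earlier ones where they overlap, which is
one plausible ordering rule, not necessarily the one Tux3 will adopt.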

The general plan is to walk the filesystem tree via the block cache, 
checking the format of each index block before trusting it, and marking 
off the visited blocks in a bitmap as we go in order to detect multiply 
referenced blocks (also detecting cycles as a side effect).  At the end 
of the traversal, the visited bitmap should match the block allocation 
bitmaps.
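
The plan above can be sketched roughly as follows. This is a toy under
stated assumptions, not the Tux3 implementation: struct node, NODE_MAGIC,
the fixed-size cache array and the names walk and check_tree are all
hypothetical, and every block is modelled as an index node (a leaf is a
node with no children). What it does show is the order of operations:
mark the block as visited, check its format before trusting its child
pointers, recurse, and finally compare the visited bitmap against the
allocation bitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NBLOCKS 64            /* toy volume size, hypothetical */
#define MAX_CHILDREN 4
#define NODE_MAGIC 0x74757833u

/* Hypothetical in-memory index block, not the real Tux3 layout. */
struct node {
	uint32_t magic;               /* checked before the node is trusted */
	unsigned count;               /* number of child pointers in use */
	uint64_t child[MAX_CHILDREN]; /* block numbers of children */
};

enum { CHECK_OK, CHECK_BAD_FORMAT, CHECK_MULTIREF, CHECK_BITMAP_MISMATCH };

static bool test_bit(const uint8_t *map, uint64_t n)
{
	return map[n >> 3] & (1 << (n & 7));
}

static void set_bit(uint8_t *map, uint64_t n)
{
	map[n >> 3] |= 1 << (n & 7);
}

/* Depth-first walk through the block cache, marking visited blocks. */
static int walk(const struct node *cache, uint64_t block, uint8_t *visited)
{
	if (block >= NBLOCKS)
		return CHECK_BAD_FORMAT;
	if (test_bit(visited, block))
		return CHECK_MULTIREF; /* second reference, or a cycle */
	set_bit(visited, block);
	const struct node *node = &cache[block];
	if (node->magic != NODE_MAGIC || node->count > MAX_CHILDREN)
		return CHECK_BAD_FORMAT;
	for (unsigned i = 0; i < node->count; i++) {
		int err = walk(cache, node->child[i], visited);
		if (err)
			return err;
	}
	return CHECK_OK;
}

/* Walk from root, then require visited == allocated, bit for bit. */
int check_tree(const struct node *cache, uint64_t root,
	       const uint8_t *allocated)
{
	uint8_t visited[NBLOCKS / 8] = { 0 };
	assert(cache && allocated);
	int err = walk(cache, root, visited);
	if (err)
		return err;
	return memcmp(visited, allocated, sizeof visited)
		? CHECK_BITMAP_MISMATCH : CHECK_OK;
}
```

Because a block is marked before its children are followed, a cycle shows
up as an ordinary multiple reference and the recursion always terminates.
A real checker would probably report every problem rather than stop at
the first one.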

This simple idea works because Tux3 does not allow multiple references 
to blocks, which can be seen as a robustness feature.  Tux3 does not 
require multiple references to blocks to implement snapshotting because 
of the versioned extents design, which uses only a single pointer to 
each extent that is inherited by children of the version in which the 
extent was allocated.

When we start using log blocks (pretty soon) then ...
From: Conrad Meyer
Date: Monday, September 1, 2008 - 8:54 pm

diff -r dc0f4243e276 user/test/inode.c
--- a/user/test/inode.c	Mon Sep 01 12:51:37 2008 -0700
+++ b/user/test/inode.c	Mon Sep 01 16:56:37 2008 -0700
@@ -418,7 +418,7 @@
 	/* Always 8K regardless of blocksize */
 	int reserve = 1 << (sb->blockbits > 13 ? 0 : 13 - sb->blockbits);
 	for (int i = 0; i < reserve; i++)
-		printf("reserve %Lx\n", balloc_from_range(sb->bitmap, i, 1));
+		printf("reserve %Lx\n", (L)balloc_from_range(sb->bitmap, i, 1));
 
 	printf("---- create inode table ----\n");
 	sb->itree = new_btree(sb, &itree_ops);
diff -r dc0f4243e276 user/test/tux3.c
--- a/user/test/tux3.c	Mon Sep 01 12:51:37 2008 -0700
+++ b/user/test/tux3.c	Mon Sep 01 16:56:37 2008 -0700
@@ -136,7 +136,7 @@
 #endif
 		if (seekarg) {
 			u64 seek = strtoull(seekarg, NULL, 0);
-			printf("seek to %Li\n", seek);
+			printf("seek to %Li\n", (L)seek);
 			tuxseek(file, seek);
 		}
 		char text[2 << 16];
@@ -172,7 +172,7 @@
 		//tuxseek(file, (1LL << 60) - 12);
 		if (seekarg) {
 			u64 seek = strtoull(seekarg, NULL, 0);
-			printf("seek to %Li\n", seek);
+			printf("seek to %Li\n", (L)seek);
 			tuxseek(file, seek);
 		}
 		memset(buf, 0, sizeof(buf));



_______________________________________________
Tux3 mailing list
Tux3@tux3.org
http://tux3.org/cgi-bin/mailman/listinfo/tux3
From: Martin Steigerwald
Date: Sunday, December 14, 2008 - 11:53 am

Hi Daniel,


So this could be used to implement something like xfsdump or dump for
ext2/ext3/ext4?

Thanks for your verbose explanations. I do not understand everything, but
I like reading about it. I have already wondered whether I could help with
documentation by collecting bits and pieces, structuring them well, maybe
rewriting or adapting them in places, and checking them in as
documentation. But I can't promise that I will understand everything I
read. And it's difficult for me to find enough time for the job. Maybe it
would work if it could be done step by step.

But what would be needed for documentation anyway at the moment?

Ciao,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

From: Martin Steigerwald
Date: Sunday, December 14, 2008 - 12:12 pm

Anything for in-kernel merge? What could that be?

Developer related:
- some initial design notes
- UML / KVM howtos to invite developers?
- pointer to tux3.org and repositories?

User related:
- mount options
- mkfs options / man page
- readme with limitations in the current version and a warning not to
  use it on production data?

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

From: Daniel Phillips
Date: Sunday, December 14, 2008 - 2:53 pm

Collecting design notes, cleaning up and converting to html would be a
very nice contribution.  And the UML howto or things like it really
need organizing.

One convenient way to handle this is to have tux3/doc/html/<things> in
the Mercurial repository that we automatically upload to tux3.org, and
I can pull from you.

Regards,

Daniel

From: Daniel Phillips
Date: Sunday, December 14, 2008 - 2:43 pm

It could, however it would need an equivalent of xfsrestore too.

A tree walker like this could be adapted for some interesting analysis
purposes, for example it could find and classify all the in-use blocks
of a volume for more accurate analysis by a script like this:

   http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg01103.html

which was written to estimate the effectiveness of block level data
de-duplication.

Regards,

Daniel
