Here are a few notes on some simple things we can do to get some
filesystem checking going early, while being pretty lazy.
Tux3 has a very simple structure. The superblock points to two
different entities:
- Filesystem tree: a btree of btrees (inode table from which
data btrees descend)
- Log blocks that define changes to the filesystem tree
(format not precisely defined yet)
Tux3 has other, higher level structures, but they are all mapped into
data files, such as the allocation bitmap, extended attribute atom
table and directory files. We can check the internal structure of
these files too, but first it would be nice to know that the file
structure itself is consistent, which is what today's proposal is
about.
The complete and consistent state of the Tux3 filesystem is obtained by
applying the changes encoded in log blocks to blocks encoded on disk,
leaving the result in block cache (block images indexed by disk
location). We can then traverse the entire Tux3 filesystem tree in a
straightforward way through the block cache, either with our userspace
code or in kernel. This idea can form the basis of a rudimentary
filesystem consistency checker.
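To make the replay step concrete, here is a minimal sketch in C. The log
format is not defined yet, so the record layout below (overwrite len bytes
at offset within one block) is purely hypothetical, as are the names
logrec, blockcache, cache_read and replay and the toy sizes. It only shows
the idea: disk blocks are read into the cache on first use, log changes
are applied to the cached images in log order, and the disk is not
modified.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy sizes chosen for illustration only. */
enum { BLOCKSIZE = 16, VOLBLOCKS = 8 };

/* Hypothetical log record, NOT the Tux3 log format:
 * "overwrite len bytes at offset within block". */
struct logrec {
	unsigned block, offset, len;
	uint8_t data[BLOCKSIZE];
};

/* Block cache: block images indexed by disk location. */
struct blockcache {
	uint8_t image[VOLBLOCKS][BLOCKSIZE];
	uint8_t present[VOLBLOCKS];
};

/* Return the cached image of a block, reading it from disk on first use. */
static uint8_t *cache_read(struct blockcache *cache, const uint8_t *disk,
	unsigned block)
{
	if (!cache->present[block]) {
		memcpy(cache->image[block], disk + block * BLOCKSIZE, BLOCKSIZE);
		cache->present[block] = 1;
	}
	return cache->image[block];
}

/* Apply log records in order to the cached images; the disk is untouched.
 * Returns 0 on success, -1 on the first malformed record. */
static int replay(struct blockcache *cache, const uint8_t *disk,
	const struct logrec *log, unsigned count)
{
	for (unsigned i = 0; i < count; i++) {
		const struct logrec *rec = &log[i];
		if (rec->block >= VOLBLOCKS || rec->offset > BLOCKSIZE ||
		    rec->len > BLOCKSIZE - rec->offset)
			return -1;
		memcpy(cache_read(cache, disk, rec->block) + rec->offset,
			rec->data, rec->len);
	}
	return 0;
}
```

A caller would zero a struct blockcache, call replay() with the disk image
and the log records, and then run the tree walk against cache->image
rather than against the raw disk.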
The general plan is to walk the filesystem tree via the block cache,
checking the format of each index block before trusting it, and marking
off the visited blocks in a bitmap as we go in order to detect multiply
referenced blocks (also detecting cycles as a side effect). At the end
of the traversal, the visited bitmap should match the block allocation
bitmaps.
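The plan above can be sketched roughly as follows. This is a toy model
under stated assumptions, not Tux3 code: the block layout (toyblock), the
in-memory volume (toyfs) and the helper names are all hypothetical, and a
real index block check would look at the actual btree node format. The
sketch shows the three pieces: validate an index block before following
its pointers, mark each visited block in a bitmap so that a second
reference (or a cycle) is reported instead of followed, and compare the
visited bitmap with the allocation bitmap at the end.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical toy layout, NOT the Tux3 on-disk format. */
enum { MAXBLOCKS = 64, MAXKIDS = 4 };

struct toyblock {
	int is_index;              /* 1 = index block, 0 = leaf */
	unsigned count;            /* number of child pointers in use */
	unsigned child[MAXKIDS];   /* child block addresses */
};

struct toyfs {
	unsigned blocks;                    /* volume size in blocks */
	struct toyblock cache[MAXBLOCKS];   /* block cache indexed by address */
	uint8_t allocated[MAXBLOCKS / 8];   /* allocation bitmap */
};

static int bit_get(const uint8_t *map, unsigned bit)
{
	return (map[bit >> 3] >> (bit & 7)) & 1;
}

static void bit_set(uint8_t *map, unsigned bit)
{
	map[bit >> 3] |= (uint8_t)(1u << (bit & 7));
}

/* Check the format of an index block before trusting its pointers. */
static int index_sane(const struct toyfs *fs, const struct toyblock *b)
{
	if (b->count > MAXKIDS)
		return 0;
	for (unsigned i = 0; i < b->count; i++)
		if (b->child[i] >= fs->blocks)
			return 0;
	return 1;
}

/* Walk the subtree at block, marking visited blocks; returns error count. */
static int walk(const struct toyfs *fs, unsigned block, uint8_t *visited)
{
	if (bit_get(visited, block)) {
		printf("block %u referenced more than once\n", block);
		return 1; /* do not descend again, so cycles terminate */
	}
	bit_set(visited, block);
	const struct toyblock *b = &fs->cache[block];
	if (!b->is_index)
		return 0;
	if (!index_sane(fs, b)) {
		printf("block %u: bad index block\n", block);
		return 1;
	}
	int errors = 0;
	for (unsigned i = 0; i < b->count; i++)
		errors += walk(fs, b->child[i], visited);
	return errors;
}

/* Walk from root, then compare visited bitmap with allocation bitmap. */
static int check(const struct toyfs *fs, unsigned root)
{
	uint8_t visited[MAXBLOCKS / 8] = { 0 };
	if (fs->blocks > MAXBLOCKS || root >= fs->blocks)
		return 1;
	int errors = walk(fs, root, visited);
	for (unsigned i = 0; i < fs->blocks; i++)
		if (bit_get(visited, i) != bit_get(fs->allocated, i)) {
			printf("block %u: visited %d, allocated %d\n", i,
				bit_get(visited, i), bit_get(fs->allocated, i));
			errors++;
		}
	return errors;
}
```

Calling check(&fs, root) returns zero for a consistent toy volume and a
nonzero error count otherwise. Because a block that is already marked is
never descended into again, the recursion depth is bounded by the number
of blocks even when the tree contains a cycle.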
This simple idea works because Tux3 does not allow multiple references
to blocks, which can be seen as a robustness feature. Tux3 does not
require multiple references to blocks to implement snapshotting because
of the versioned extents design, which uses only a single pointer to
each extent that is inherited by children of the version in which the
extent was allocated.
When we start using log blocks (pretty soon) then ...

diff -r dc0f4243e276 user/test/inode.c
--- a/user/test/inode.c Mon Sep 01 12:51:37 2008 -0700
+++ b/user/test/inode.c Mon Sep 01 16:56:37 2008 -0700
@@ -418,7 +418,7 @@
/* Always 8K regardless of blocksize */
int reserve = 1 << (sb->blockbits > 13 ? 0 : 13 - sb->blockbits);
for (int i = 0; i < reserve; i++)
- printf("reserve %Lx\n", balloc_from_range(sb->bitmap, i, 1));
+ printf("reserve %Lx\n", (L)balloc_from_range(sb->bitmap, i, 1));
printf("---- create inode table ----\n");
sb->itree = new_btree(sb, &itree_ops);
diff -r dc0f4243e276 user/test/tux3.c
--- a/user/test/tux3.c Mon Sep 01 12:51:37 2008 -0700
+++ b/user/test/tux3.c Mon Sep 01 16:56:37 2008 -0700
@@ -136,7 +136,7 @@
#endif
if (seekarg) {
u64 seek = strtoull(seekarg, NULL, 0);
- printf("seek to %Li\n", seek);
+ printf("seek to %Li\n", (L)seek);
tuxseek(file, seek);
}
char text[2 << 16];
@@ -172,7 +172,7 @@
//tuxseek(file, (1LL << 60) - 12);
if (seekarg) {
u64 seek = strtoull(seekarg, NULL, 0);
- printf("seek to %Li\n", seek);
+ printf("seek to %Li\n", (L)seek);
tuxseek(file, seek);
}
memset(buf, 0, sizeof(buf));
_______________________________________________
Tux3 mailing list
Tux3@tux3.org
http://tux3.org/cgi-bin/mailman/listinfo/tux3
Hi Daniel,

So this could be used to implement something like xfsdump or dump for
ext2/ext3/ext4?

Thanks for your verbose explanations. I do not understand everything, but
I like reading about it. I already wondered whether I could help with
documentation by collecting bits and pieces together, structuring them in
a good way, maybe rewriting or adapting them in places, and checking them
in as documentation. But I can't promise that I understand everything I
read. And it's difficult for me to find enough time for the job. Maybe if
it could be done in a step by step manner.

But what would be needed for documentation anyway at the moment?

Ciao,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de

Anything for in-kernel merge? What could that be?

Developer related:
- some initial design notes
- UML / KVM howtos to invite developers?
- pointer to tux3.org and repositories?

User related:
- mount options
- mkfs options / man page
- readme with limitations in the current version and a warning not to
  use it on production data?

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de

Collecting design notes, cleaning them up and converting them to html
would be a very nice contribution. And the UML howto or things like it
really need organizing.

One convenient way to handle this is to have tux3/doc/html/<things> in
the Mercurial repository that we automatically upload to tux3.org, and I
can pull from you.

Regards,
Daniel

It could, however it would need an equivalent of xfsrestore too.

A tree walker like this could be adapted for some interesting analysis
purposes, for example it could find and classify all the in-use blocks
of a volume for more accurate analysis by a script like this:

http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg01103.html

which was written to estimate the effectiveness of block level data
de-duplication.

Regards,
Daniel
