Here is my proposal for a minimal change to make dir.c ops work with
xattr atoms, without relying on i_size to determine the size of the
atom directory, which is at the bottom of a sparse file that also
contains the atom refcounts and atom reverse map way up high. The
kernel block IO library requires that i_size point above those tables,
otherwise it will not transfer data to them, and will return zeroes
instead of any data written up there (just reviewing for anybody who
has not been following this little issue).

I think the real problem is that the kernel block library is trying
to be too helpful here, or we are using it for a purpose for which
it was not intended. Eventually we will most probably stop using the
block IO library entirely, because it provides an overly constricted
interface. We could start on that process with a relatively simple
situation like the atom tables, setting up our own bio transfers. But
that is an optimization for later; right now we need to make it work
with the block IO library one way or another.

I think this is about the simplest interface change. For xattrs,
instead of passing i_size, we pass an atom size value in the sb and
load/save_sb it to disk.

We are likely to change this again after a while, but for now this
will let us test xattr support in kernel and we can concentrate on
more important issues.

Regards,

Daniel

diff -r ad6aff100867 user/kernel/dir.c
--- a/user/kernel/dir.c Thu Dec 11 19:32:14 2008 -0800
+++ b/user/kernel/dir.c Thu Dec 11 23:29:04 2008 -0800
@@ -105,13 +105,13 @@
 	[S_IFLNK >> STAT_SHIFT] = TUX_LNK,
 };
-loff_t tux_create_entry(struct inode *dir, const char *name, int len, inum_t inum, unsigned mode)
+loff_t _tux_create_entry(struct inode *dir, const char *name, int len, inum_t inum, unsigned mode, loff_t *size)
 {
 	tux_dirent *entry;
 	struct buffer_head *buffer;
 	unsigned reclen = TUX_REC_LEN(len), rec_len, name_len, offset;
 	unsigned blockbits = tux_sb(dir->i_sb)->blockbits, blocksize = 1 << ...

Tux3 now has a command interpreter to aid in debugging, with commands
like:

    tux3 make foodev

that makes a filesystem on device foodev, which can also be a file. It
normally is a file for me and I make it sparse like this:

    dd if=/dev/zero of=foodev bs=1 count=1 seek=100K

I can see how many blocks tux3 actually used in it by:

    du foodev

less the few blocks that the sparse file has for metadata and the
little blob of data at the end that dd puts there. (I don't know how
to make dd truncate without writing, I am not sure it is possible.)

Then:

    echo "hello world!" | tux3 write foodev foo

creates and writes to file foo in foodev. A subsequent write will fail
with EEXIST, which is maybe not quite what we want. Finally:

    tux3 read foodev foo

outputs "hello world!".

The following command sequence is particularly interesting:

    tux3 make --blocksize 256 foodev
    echo hello | tux3 write --seek 72057594037927930 foodev foo
    tux3 read --seek 72057594037927930 foodev foo

It writes "hello" into the last few bytes of a 64 Petabyte file, the
largest that Tux3 can create with 256 byte blocks :-)

Using "du foodev" before and after creating and writing to its
filesystem shows that Tux3 only used 8K for the entire filesystem
including the multi-petabyte sparse file, root directory, inode table
and allocation bitmaps.

To build tux3:

    g99 -g -Wall -lpopt buffer.c diskio.c tux3.c -otux3

Regards,

Daniel

_______________________________________________
Tux3 mailing list
Tux3@tux3.org
http://tux3.org/cgi-bin/mailman/listinfo/tux3

Yes, this would work. Just FYI for people, though you probably know:
inode->i_size has a race on 32-bit architectures. It is a 64-bit value, so
*size += blocksize;
means something like the following:
load 0x0(size), %reg1
load 0x4(size), %reg2
add blocksize, %reg1
add-carry %reg2
store %reg1, 0x0(size)
<---- (W)
store %reg2, 0x4(size)
So a block I/O library function like block_write_full_page() can read the
size at point (W). The result may be a bogus size.
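
To make the torn update concrete, here is a small user-space sketch in
plain C. It is only a single-threaded model of the listing above, not
kernel code: the struct split_size type and the observe() and
torn_value_at_w() helpers are hypothetical names invented for this
illustration. It replays the two stores in order and captures what a
reader would see at point (W).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A 64-bit size held as two 32-bit words, as a 32-bit CPU accesses it. */
struct split_size {
	uint32_t lo;	/* word at offset 0x0 */
	uint32_t hi;	/* word at offset 0x4 */
};

/* The value a reader gets if it loads both words at this instant. */
static uint64_t observe(const struct split_size *s)
{
	return ((uint64_t)s->hi << 32) | s->lo;
}

/*
 * Model "*size += blocksize" as two separate 32-bit stores and return
 * the value visible at point (W), after the low word has been stored
 * but before the high word has.
 */
static uint64_t torn_value_at_w(struct split_size *s, uint32_t blocksize)
{
	assert(s != NULL);

	uint64_t sum = observe(s) + blocksize;	/* load, add, add-carry */
	s->lo = (uint32_t)sum;			/* store %reg1, 0x0(size) */
	uint64_t seen = observe(s);		/* <---- (W) */
	s->hi = (uint32_t)(sum >> 32);		/* store %reg2, 0x4(size) */
	return seen;
}
```

Starting from a size of 0xFFFFFF00 and adding a block size of 0x100, the
final size is 0x100000000, but the value visible at (W) is 0, because
the low word has already wrapped while the high word still holds its
old value.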
We have to fix this later, maybe change with phtree. (don't read i_size
--
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Ah, I thought about that when you mentioned it earlier and forgot to
take care of it in my patch. Something like this:
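loff_t tux_create_entry(struct inode *dir, const char *name, int len, inum_t inum, unsigned mode)
{
	/* Work on a private copy so i_size is read and written exactly once. */
	loff_t size = i_size_read(dir);
	/* Keep the result as loff_t; an int would truncate the return value. */
	loff_t where = _tux_create_entry(dir, name, len, inum, mode, &size);
	i_size_write(dir, size);
	return where;
}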
Regards,
Daniel
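
For readers unfamiliar with why i_size_read() and i_size_write() help:
on 32-bit SMP kernels they guard i_size with a sequence counter, so a
reader retries instead of accepting a half-written value. The following
is a rough user-space sketch of that idea using C11 atomics. It is an
illustration under stated assumptions, not the kernel implementation:
struct guarded_size and all guarded_*() and write_*() names are
hypothetical, and a single writer is assumed (the kernel likewise
expects callers of i_size_write() to serialize themselves).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

struct guarded_size {
	atomic_uint seq;	/* odd while an update is in flight */
	_Atomic uint32_t lo;	/* low 32 bits of the size */
	_Atomic uint32_t hi;	/* high 32 bits of the size */
};

/* Bracket an update: the counter is odd between begin and end. */
static void write_begin(struct guarded_size *g) { atomic_fetch_add(&g->seq, 1); }
static void write_end(struct guarded_size *g) { atomic_fetch_add(&g->seq, 1); }

/* Single-writer update of both halves under the sequence counter. */
static void guarded_write(struct guarded_size *g, uint64_t size)
{
	assert(g);
	write_begin(g);
	atomic_store(&g->lo, (uint32_t)size);
	atomic_store(&g->hi, (uint32_t)(size >> 32));
	write_end(g);
}

/* One read attempt; fails if a write was in flight or intervened. */
static bool guarded_try_read(struct guarded_size *g, uint64_t *size)
{
	unsigned start = atomic_load(&g->seq);

	if (start & 1)
		return false;		/* writer is mid-update */
	uint32_t lo = atomic_load(&g->lo);
	uint32_t hi = atomic_load(&g->hi);
	if (atomic_load(&g->seq) != start)
		return false;		/* a write happened while reading */
	*size = ((uint64_t)hi << 32) | lo;
	return true;
}

/* Retry until a consistent snapshot of both halves is obtained. */
static uint64_t guarded_read(struct guarded_size *g)
{
	uint64_t size;

	while (!guarded_try_read(g, &size))
		;
	return size;
}
```

With this scheme a reader that arrives between the two stores sees an
odd counter and retries, so it never returns the torn value. The sketch
uses the default sequentially consistent atomics throughout for
simplicity, which is stronger than strictly required.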
