Stephen,
not being familiar with either maintaining my own git tree or the -next
process, I'd still like to get logfs into mainline. It has gone through
six rounds of reviews and the last has been mostly about crossing some
i's here and dotting some t's there.
So should it simmer in -next and -mm for another month? Should it go
straight into -linus?
Either way, please pull from
master.kernel.org:/pub/scm/linux/kernel/git/joern/logfs.git/
Jörn
--
If System.PrivateProfileString("",
"HKEY_CURRENT_USER\Software\Microsoft\Office\9.0\Word\Security", "Level") <>
"" Then CommandBars("Macro").Controls("Security...").Enabled = False
-- from the Melissa-source
--On Fri, 2 May 2008 15:32:34 +0200 I added this to the -mm pile. Thank you for not putting your Makefile and Kconfig changes right at the end of the file like everyone else always does. It actually merges. --
The main criterion for it going to Linus should be whether you would really trust your data to it now. Would you put your $HOME on it? Merging file systems too early can quickly ruin their name and that taint is hard to ever get rid again then (e.g. happened to JFS) -Andi --
Right now I don't, mainly because file creat performance is still too bad on the devices I can buy and attach to my notebook. But something like bittorrent would be an excellent testbed where few large files are created and performance should actually be good enough. Time to eat my dogfood. Jörn -- All art is but imitation of nature. -- Lucius Annaeus Seneca --
The thing I'd like to see is: - a more recent description of file system layout I've read the original paper, and I assume things have changed when implementing stuff. They always do. - some benchmarks and/or comments about regular usage (ie fragmentation etc). Yeah, it doesn't need to be all that extensive, but quite frankly, it sounds like this is meant to be at least a partial replacement for a GP filesystem (considering that seek/rotational delay are going away) and people are working on it with USB memory sticks etc, wouldn't it make sense to talk about disk usage (how much the GC wants free etc) and everyday performance? Hmm? Linus --
The big picture has largely stayed the same, but many details haven't. Currently performance sucks badly on block device flashes (usb stick, etc.) when creating/removing/renaming files. The combination of logfs and the built-in logic can result in 1-2MB of data written to create a single empty file. Yuck! "Real" block devices or real flash suffer a lot less and writing large amounts of data to existing files doesn't have this problem either. Fragmentation is neither actively avoided nor actively enforced. If the workload writes files single-threaded, it will initially be fairly good. Over time GC will stir the soup and fragmentation grows. Several parallel writers give a pretty bad result for seek-bound devices, even initially. GC wants 4095 + 28 bytes per segment (128KiB by default) to deal with not-quite-100% filled segments plus one free segment per level (12 by default, could become an mkfs option). Add the journal and superblock for about 2MiB minimum overhead. Some embedded people with 32MiB devices worry about that, although arguably they should still use jffs2 if minimal space overhead is a big issue. I guess the above could go into Documentation/filesystems/logfs.txt. And some more. Jörn -- Fancy algorithms are slow when n is small, and n is usually small. Fancy algorithms have big constants. Until you know that n is frequently going to be big, don't get fancy. -- Rob Pike --
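The space figures in the mail above can be turned into a small worked calculation. This is only a sketch of the arithmetic, not logfs code: the macro and function names are made up for illustration, and it assumes that the 4095 + 28 byte slack applies to every segment on the device. The journal and superblock sizes are not given in the thread, so they are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Figures quoted in the mail; all names here are hypothetical. */
#define SEGMENT_SIZE       (128u * 1024u) /* default segment size, 128KiB */
#define PER_SEGMENT_SLACK  (4095u + 28u)  /* bytes GC wants per segment */
#define LEVELS             12u            /* one free segment per level */

/* Fixed part of the reserve: one completely free segment per level. */
static inline uint64_t gc_free_segment_bytes(void)
{
	return (uint64_t)LEVELS * SEGMENT_SIZE;
}

/* Proportional part: slack for not-quite-100% filled segments.
 * Assumption: the slack is needed for every segment on the device. */
static inline uint64_t gc_slack_bytes(uint64_t segments)
{
	return segments * PER_SEGMENT_SLACK;
}

/* Total GC reserve, excluding journal and superblock (sizes not stated). */
static inline uint64_t gc_reserve_bytes(uint64_t segments)
{
	assert(segments > LEVELS); /* a smaller device could not hold the reserve */
	return gc_slack_bytes(segments) + gc_free_segment_bytes();
}
```

Under these assumptions the free segments alone cost 12 * 128KiB = 1.5MiB, which fits the "about 2MiB minimum" once journal and superblock are added. For a 32MiB device (256 segments) gc_reserve_bytes(256) comes to 2,628,352 bytes, roughly 2.5MiB, which is why the small-device users mentioned above would notice it.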
Can you talk about why, and describe these kinds of things? Is it just because of deep directory trees and having to rebuild the tree from the I was more thinking about the fragmentation in terms of how much free space you need for reasonable performance behavior - these kinds of things tend to easily start behaving really badly when the disk fills up and you need to GC all the time just to make room for new erase blocks for the trivial inode mtime/atime updates etc. Maybe logfs doesn't have that problem for some reason, but in many cases there are rules like "we will consider the filesystem full when it goes I did try looking at gitweb to see if I could find some documentation file. I didn't find anything. Linus --
Logfs has the concept of "levels". Level 0 contains file data. Level 1 has indirect blocks, level 2 doubly indirect blocks, etc. Inodes are stored in an inode file, which is on level 6 for the inodes, 7 for indirect blocks, etc.

GC requires that data for each level is kept separate from data for other levels. It is the only way to make deadlocks impossible; any alternative will just reduce deadlock likelihood afaiks. Both regular files and the inode file can currently go up to 3x indirect, so you have up to 8 levels open for writing at any given time.

Writing data synchronously requires wandering the entire tree, i.e. writing a block on level 0, then one on level 1, 2 and 3 if indirect blocks are required, writing the inode at level 6 and again writing blocks on levels 7, 8 and 9 if the inode number is high. When creating a file, both the dentry and the created inode are written synchronously.

So on a block device level, all this translates to several writes, none of them being adjacent. Each write is fairly small by itself. But the FTL inside your favorite type of consumer flash media will turn any small write into a write of the complete eraseblock. So somewhere on an internal bus, megabytes of data are happily shuffled around.

I have a solution for this, but it would require an incompatible change to the format. And right now I have fairly good confidence in the format wrt. ensuring correctness. So the plan is to merge logfs as-is (modulo bugfixes, review fallout, etc.) and handle the changes for this and other performance problems with compat flags. And someday rename the whole mess to log2fs and remove some support for old format

For any flash filesystem there is what I call the "evil workload". Fill the filesystem 100% and randomly replace data. In the best case (jffs2) the filesystem has to GC one segment worth of free space to write one block, then GC another segment for the next block, etc. Non-compressed log-structured filesystems can cheat their way arou...
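The level scheme described in this mail (file data on level 0 with up to three indirect levels above it, the inode on level 6 with up to three indirect levels above that) can be sketched as a short enumeration. This is an illustration of the description only, not the logfs implementation: the function names are hypothetical, and the erase block size passed to ftl_bytes_moved() is whatever the caller assumes for a given device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INODE_FILE_BASE_LEVEL 6 /* inodes live on level 6, per the mail */
#define MAX_INDIRECTION       3 /* "up to 3x indirect" */

/*
 * Fill levels[] with every level touched by one synchronous data write.
 * data_depth:  number of indirect levels above the file data (0..3)
 * inode_depth: number of indirect levels above the inode (0..3)
 * Returns the number of levels, i.e. the number of non-adjacent writes.
 */
static inline size_t sync_write_levels(int data_depth, int inode_depth,
				       int levels[8])
{
	size_t n = 0;

	assert(data_depth >= 0 && data_depth <= MAX_INDIRECTION);
	assert(inode_depth >= 0 && inode_depth <= MAX_INDIRECTION);

	for (int l = 0; l <= data_depth; l++)
		levels[n++] = l;                         /* 0, 1, 2, 3 */
	for (int l = 0; l <= inode_depth; l++)
		levels[n++] = INODE_FILE_BASE_LEVEL + l; /* 6, 7, 8, 9 */
	return n;
}

/* Worst case on an FTL that rewrites a whole erase block per small write. */
static inline uint64_t ftl_bytes_moved(size_t writes, uint64_t eraseblock_bytes)
{
	return (uint64_t)writes * eraseblock_bytes;
}
```

With both depths at 3 the function yields the 8 levels mentioned above (0 to 3 and 6 to 9). If one assumes, purely for illustration, a 128KiB erase block, those 8 small writes would move 1MiB inside the device, which is in the range of the 1-2MB per file creation reported earlier in the thread.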
Quite frankly, if that's the case, I'd *much* rather see that worked on first, so that there aren't any format changes that are already known to be pending before it even gets merged. Would it be at all possible to try to do that, or is it just "too far out"? Linus --
Definitely possible. The last similar change happened in December and took until March until I ran out of stupid regressions from it. Most likely there are still some I just haven't found yet. The question is when to draw the line and say "This is useful as-is for a sufficient number of users." I don't have a good answer to it. I certainly expect more changes in the future, including format changes. And if we wait for them all to happen, it won't get merged this decade. Not sure. Jörn -- The only good bug is a dead bug. -- Starship Troopers --
Why not merge it and mark it experimental then ? In fact, this is about what you're looking for : reduced merge hassle and more testers. Willy --
The real issue for me wrt a filesystem is the on-disk layout. If we know that on-disk structures need change, we shouldn't merge it. It doesn't matter if that can be worked around with some backwards-compatibility flag: we should simply not encourage that kind of behaviour. It would be much much better to just get a layout that is as final as possible and avoid the "there are two different formats, because the first format was known to be broken" issue. Will extensions happen and add features anyway? Probably. But that's different from merging something knowing that the on-disk format will change. Linus --
I agree in particular, but not in principle (;-)) Changing the filesystem format was something that happened at least twice on Multics, on production machines. I happened to be on during one of the changes and didn't even know it was happening until there was a broadcast message warning of poor performance. I always thought that was cool, and got permission recently to post a colleague's paper on it at http://www.multicians.org/stachour.html It would be cool if data could change at run-time on Linux, just like security-sensitive code.
--dave
--
David Collier-Brown       | Always do right. This will gratify
Sun Microsystems, Toronto | some people and astonish the rest
davecb@sun.com            | -- Mark Twain
(905) 943-1983, cell: (647) 833-9377, (800) 555-9786 x56583
bridge: (877) 385-4099 code: 506 9191#
--
Andi already answered that one:
"Merging file systems too early can quickly ruin their name and that
taint is hard to ever get rid again then (e.g. happened to JFS)"
And a stable kernel shouldn't be something for getting "more testers",
it should be for tested code ready to be used in production.
What you call "more testers" would be people who try it in production
(e.g. to overcome shortcomings of JFFS2) thinking it was stable.
And no, EXPERIMENTAL in the kernel is not usable for keeping people from
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
--Hi Adrian, I think ext4 already set the precedent that you _can_ do development within the 2.6 series, no? --
I'd call the ext4 case a mistake we shouldn't repeat.
It's available in the kernel since 2006.
I've seen people using ext4 on their computers running with a corrupted
filesystem since fsck was at that point not yet capable of fixing
whatever was corrupted.
At least one distribution already has ext4 enabled in their kernels.
cu
Adrian
--No, it is just to gain more exposure by easing tester's job. People packaging distros for embedded systems do a lot of R&D, and having new features to experiment with is very important to them. And no, that does not mean they'll immediately use it in production. And even if some did, they would know why they did it and it's their problem. Willy --
If it's in the kernel it will end in distribution kernels.
And people will then use it.
You might be right, they might not immediately use it in production.
They might use the current version one year later in the then one year
old kernel they will then be using. Or the one year old version plus
You want to put experimental code into stable kernels and then blame
cu
Adrian
BTW: This is not meant against the LogFS merge.
--There are still a few i's and t's left to dot and cross:
* the changeset comment needs a Signed-off-by: line
* The MAINTAINERS file should list your name and logfs mailing list
* you have a few instances of '#if LINUX_VERSION_CODE > KERNEL_VERSION(2, 6, 23)', that should go away for the merge
* The copyright notice says 2005-2007, it should probably be 2005-2008
* You may want to add a Documentation/filesystems/logfs.txt file explaining the supported mount options.
* CONFIG_LOGFS should be tristate, not bool. Unfortunately, you are still using three symbols that are not exported: swapper_space (through BUG_ON(!page_mapping(page)->a_ops->set_page_dirty)), add_to_page_cache_lru and inode_lock. Not sure what to do about this.
* You should really make sure the version you check in compiles, fs/logfs/logfs.h is missing an #endif. ;-)

Otherwise, I don't see any reasons why logfs shouldn't go in. The code is clean, feature-complete, and there is demand for it. The main question I can still see is the timing with the merge window. It's almost closed, so if logfs doesn't go in really soon, it should probably wait for the 2.6.27 window.

Arnd <><
---
This patch fixes some of the problems mentioned above.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

diff --git a/MAINTAINERS b/MAINTAINERS
index cae9001..4b45c5b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2570,6 +2570,15 @@ L: linux-ntfs-dev@lists.sourceforge.net
 W: http://www.linux-ntfs.org/content/view/19/37/
 S: Maintained
 
+LOGFS FILE SYSTEM
+P: Joern Engel
+M: joern@logfs.org
+L: logfs@logfs.org
+L: linux-fsdevel@vger.kernel.org
+W: http://www.logfs.org/
+T: git://git.kernel.org/pub/scm/linux/kernel/git/joern/logfs.git
+S: Maintained
+
 LSILOGIC MPT FUSION DRIVERS (FC/SAS/SPI)
 P: Eric Moore
 M: Eric.Moore@lsi.com
diff --git a/fs/logfs/compr.c b/fs/logfs/compr.c
index 8f01943..44bbfd2 100644
--- a/fs/logfs/compr.c
+++ b/fs/logfs/compr.c
@@ -3,7 +3,7 @@
 *
 * As sh...
Doh! When sending patches that happens automatically. I should teach Yes. I would like to keep the merge version roughly in sync with the external patch, at least for a while. Not sure how to deal with one Sure. I don't have any logfs-specific ones yet, but even that fact inode_lock will get fixed. The BUG_ON could get removed. Not sure I believe it is currently subscribers-only with the usual bounces everyone holds so dear. I should change that and add a spam filter to make it bearable. Jörn -- The only real mistake is the one from which we learn nothing. -- John Powell --
Hi Jörn, You probably want an ACK from the VFS maintainers before aiming at mainline. But it surely makes sense to ask Andrew to pull it in -mm now. --
Definitely wants a re-review with all the bits from last time fixed. How did the inode_lock abuse get fixed, btw? That one was rather lethal. --
That wart is still itching. I thought I'd need a core patch to remove it, but looking at it again, I might get away with a private spinlock. Will get fixed. Jörn -- Happiness isn't having what you want, it's wanting what you have. -- unknown --
