linux-ext4 mailing list

From | Subject | Date
Joel Becker
Re: [Ocfs2-devel] [PATCH] OCFS2: Allow huge (> 16 TiB) v ...
[Added jbd2 Ccs. Sorry about the whole-patch-quote, but I want jbd2 folks to see what we're doing.] This is completely unsafe. Two reasons. First, you're checking the journal features after ocfs2_journal_load() has done recovery. This may or may not be safe; recovering a 32bit journal probably works even on a 64bit filesystem, and we shouldn't see that combination in the wild anyway. That's not so bad. Far worse is that you might recover a 64bit journal before you've checked the ...
Jul 6, 1:04 pm 2010
David Howells
Re: [PATCH] Rearrange i_flags to be consistent with FS_I ...
That occurred to me after I sent the patch. I can add some preprocessor guards for this. David --
Jul 6, 6:40 am 2010
David Howells
Re: [PATCH] Rearrange i_flags to be consistent with FS_I ...
They're not so dependent. They're based on the FS_IOC_[GS]ETFLAGS ioctl which even XFS translates its flags for. These ioctl flags must now remain invariant. Whilst they might have originated as Ext2/3/4 flags, they're now This can be argued one way or another, however aligning i_flags with something would probably be an improvement somewhere. Most of what I deal with is Ext3/4 based, and BTRFS-based is likely to become important too. David --
Jul 6, 4:45 pm 2010
Eric Sandeen
Re: inconsistent file placement
Using a recent e2fsprogs, and the "filefrag -v" command, will give you much more interesting layout information:

# filefrag -v testfile
Filesystem type is: ef53
File size of testfile is 1073741824 (262144 blocks, blocksize 4096)
 ext  logical  physical  expected  length  flags
   0        0   1865728             32768
   1    32768   1898496             32768
   2    65536   1931264             32768
   3    98304   1964032             32768
   4   131072   1996800              2048
   5   133120   2000896 ...
Jul 5, 7:38 pm 2010
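A small sketch of how the filefrag listing in Eric's message can be read: an extent is contiguous with its predecessor when its physical start equals the previous physical start plus the previous length. The rows below are copied from the listing; the function names are hypothetical, and only the complete rows (extents 0 to 4) are used because the snippet is truncated after that.

```python
# (ext, logical, physical, length) rows copied from the filefrag -v listing.
# Extent 5 is truncated in the archive snippet, so it is left out here.
extents = [
    (0, 0, 1865728, 32768),
    (1, 32768, 1898496, 32768),
    (2, 65536, 1931264, 32768),
    (3, 98304, 1964032, 32768),
    (4, 131072, 1996800, 2048),
]


def find_gaps(rows):
    """Return (ext, expected_physical, actual_physical) for each extent
    that does not start where the previous extent ended."""
    gaps = []
    for prev, cur in zip(rows, rows[1:]):
        expected = prev[2] + prev[3]  # previous physical start + length
        if cur[2] != expected:
            gaps.append((cur[0], expected, cur[2]))
    return gaps


def next_expected(rows):
    """Physical block where the next extent would start if it were contiguous."""
    last = rows[-1]
    return last[2] + last[3]


print(find_gaps(extents))      # extents 0-4 are physically contiguous
print(next_expected(extents))  # compare with the physical start of extent 5
```

Extents 0 to 4 come out contiguous, and the next contiguous block would be 1998848. The listing shows extent 5 starting at 2000896, so the first discontinuity falls there.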
Amir G.
Re: inconsistent file placement
The ext[23] (and I suppose 4 as well) uses the process pid % 16 to define a 'color' for the process. New files first block goal depends on that 'color' - the goal is one of 16 different offsets in the block group where the new file's inode was allocated (usually the block group of its parent directory). The logic behind this allocator is that multiple files created concurrently in the same directory would have less chance of stepping over each other's allocations. I am not sure what you ...
Jul 5, 11:52 pm 2010
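The pid-based "colour" Amir describes can be sketched roughly as follows. This is only an illustration of the arithmetic, not the kernel code: the function and parameter names are hypothetical, and the 32768 blocks per group used in the example is an assumption that matches a 4 KiB block size.

```python
def first_block_goal(group_first_block, blocks_per_group, pid, colours=16):
    """Illustrative goal for a new file's first block.

    The block group is divided into `colours` equal slices, and the
    process pid selects one of them (pid % colours).
    """
    colour_offset = (pid % colours) * (blocks_per_group // colours)
    return group_first_block + colour_offset


# Assumed example values: block group starting at block 0, 32768 blocks per group.
for pid in (1234, 1235, 1250):
    print(pid, first_block_goal(0, 32768, pid))
```

Under these assumptions, two runs of the same test with different pids can start the file at different offsets within the same block group, while pids that agree modulo 16 (1234 and 1250 here) get the same goal.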
Daniel Taylor
inconsistent file placement
I realize that it is generally not a good idea to tune an operating system, or subsystem, for benchmarking, but there's something that I don't understand about ext[234] that is badly affecting our product. File placement on newly-created file systems is inconsistent. I can't, yet, call it a bug, but I really need to understand what is happening, and I cannot find, in the source code, the source of the randomization (related to "goal"???). Disk drive performance for writing/reading large ...
Jul 5, 6:49 pm 2010
Daniel Taylor
RE: inconsistent file placement
In all of my recent tests, there has only been one file created, in the root directory of the freshly created and mounted file system.

  mkfs.ext[234] -b 65536 /dev/sda4
  mount <some options tested> /dev/sda4 /DataVolume
  touch /DataVolume/hex.txt
  "for i in 1 2 3 4 5; do dd if=/hex.txt bs=64K; \
   done >>/DataVolume/hex.txt"
  umount /DataVolume
  dumpe2fs /dev/sda4 >/<log file>

where /hex.txt is a 1G file on the NFS root. I tried with, and without, orlov on ext3 (-o orlov and -o oldalloc) and ...
Jul 6, 3:15 pm 2010
tytso
Re: inconsistent file placement
In ext3, it really is random. The randomness you're looking for can be found in fs/ext3/ialloc.c:find_group_orlov(), when it calls get_random_bytes(). This is responsible for "spreading" directories so they are spread across the block groups, to try to prevent fragmented files. Yes, if all you care about is benchmarks which only use 10% of the entire file system, and for which the benchmarks don't adequately simulate file system aging, the algorithms in ext3 will cause a lot of ...
Jul 6, 11:55 am 2010
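As a rough illustration of the directory "spreading" described above, here is a heavily simplified sketch. It is not the code in find_group_orlov(): the function name, the dictionary fields, and the selection rule are assumptions made for the example, and the random starting index merely stands in for the kernel's get_random_bytes() call.

```python
import random


def pick_group_for_top_level_dir(groups, rng=random):
    """Pick a block group index for a new top-level directory.

    `groups` is a list of dicts with 'free_inodes', 'free_blocks' and 'dirs'.
    The scan starts at a random group and prefers the group with the fewest
    directories among those with at least average free inodes and blocks.
    """
    n = len(groups)
    avg_inodes = sum(g["free_inodes"] for g in groups) / n
    avg_blocks = sum(g["free_blocks"] for g in groups) / n
    start = rng.randrange(n)  # the random starting point of the scan

    best, best_dirs = None, None
    for i in range(n):
        idx = (start + i) % n
        g = groups[idx]
        if g["free_inodes"] == 0:
            continue
        if g["free_inodes"] < avg_inodes or g["free_blocks"] < avg_blocks:
            continue
        if best is None or g["dirs"] < best_dirs:
            best, best_dirs = idx, g["dirs"]
    # Simplified fallback: if no group qualifies, use the random start.
    return best if best is not None else start


# A freshly created file system: every group looks the same.
fresh = [{"free_inodes": 8192, "free_blocks": 32000, "dirs": 0} for _ in range(8)]
print(pick_group_for_top_level_dir(fresh, random.Random(42)))
```

On a fresh file system all groups tie, so in this sketch the random starting point alone decides the result, which is one way the same sequence of directory creations could land in different block groups from run to run.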
Eric Sandeen
Re: inconsistent file placement
However, from the test description it looks like it is writing a file to the root dir, so there should be no parent-dir random spreading, right? -Eric --
Jul 6, 11:59 am 2010
tytso
Re: inconsistent file placement
Hmm, yes, I missed that part of Daniel's e-mail. He's just writing a single file. In that case, Amir is right, the only thing which would be causing this is the colour offset, at least for ext2 and ext3. This is to avoid fragmented files caused by two or more processes running on different CPUs all writing into the same block group. In the case of ext4, we don't use a pid-determined colour algorithm if delayed allocation is used, and the randomness is caused by the writeback system deciding ...
Jul 6, 3:01 pm 2010
tytso
Re: inconsistent file placement
Out of curiosity, what *are* the "common NAS benchmarks" in use today, and who chooses them? There have been times in the past when "common benchmarks" promulgated by reviewers have done active harm in the industry, driving disk drive manufacturers to choose unsafe defaults, all because the only thing people paid attention to was crappy benchmarks. Sometimes the right answer is to put a spotlight on deficient Delayed allocation is the default for ext4. If you are seeing random behaviour ...
Jul 6, 4:14 pm 2010
Eric Sandeen
Re: inconsistent file placement
orlov is an inode allocator for directory inodes; since you are creating 1 file in the root dir, they won't matter. It affects file placement because files prefer to be close to their parent dir, more or less, but in your case you are never allocating a directory so the point is moot. delalloc is the default as well. filefrag -v output would be much more enlightening than what you've shown so far... --
Jul 6, 4:34 pm 2010
Eric Sandeen
Re: inconsistent file placement
that patch is rather simplistic, FWIW; at least for XFS it -hurt- perf due to the unwritten->written conversion and the relatively small, frequent preallocations. More smarts to merge up multiple 1-byte-writes into a large preallocation might help, as the bug mentions. But ... is something like it already in samba? that'd be nifty, but I wasn't aware of that. There is a preallocation-sounding switch but I think it doesn't do what you think it does. I'd have to go look up details, ...
Jul 6, 4:39 pm 2010
Eric Sandeen
Re: extent counting fun
Well, I didn't actually look but I'm 98% sure it's just because it's not reporting the interspersed metadata blocks. Sorry, above was on ext3, that wasn't clear, just a stock dd-streamed Hm don't we have that already? Hmm... just xattr I guess. In any case it's still a question of whether ext3 extent count should be "fudged" to make blocks separated by metadata look contiguous or not ... --
Jul 5, 5:50 pm 2010
Eric Sandeen
Re: [PATCH] fix for consistency errors after crash
Hi Amir, I really do appreciate the effort, the patch, and the ping. :) I'll have to set aside some time to give it a hard look, but linking it back to existing bugs of mine raises that priority, thanks. :) --
Jul 6, 9:00 am 2010
Amir G.
Re: [PATCH] fix for consistency errors after crash
Hi Eric, I've seen that you guys had some open RH bugs on ext3, which all share the "bit already free" error. This bug I reported can explain many different problems in ext[34]. Essentially, every time there is a kernel crash (or hard reboot) during delete/truncate of a large file, it may result in a "bit already clear" error after reboot. The problem is very simple and so is the fix. I proved the problem with 100% recreation chances using a small patch, instead of running statistical ...
Jul 6, 6:00 am 2010
bugzilla-daemon
[Bug 15827] ext4_get_blocks may be called while ext4_tr ...
https://bugzilla.kernel.org/show_bug.cgi?id=15827

Dmitry Monakhov <dmonakhov@openvz.org> changed:

           What    |Removed |Added
----------------------------------------------------------------------------
             Status|NEW     |RESOLVED
         Resolution|        |CODE_FIX

--
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are ...
Jul 6, 12:15 am 2010
bugzilla-daemon
[Bug 15792] ext4_inode_info->i_flags modification is racy
https://bugzilla.kernel.org/show_bug.cgi?id=15792

Dmitry Monakhov <dmonakhov@openvz.org> changed:

           What    |Removed |Added
----------------------------------------------------------------------------
             Status|NEW     |RESOLVED
         Resolution|        |CODE_FIX

--
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are ...
Jul 6, 12:15 am 2010
bugzilla-daemon
[Bug 15742] Fallocated extents handled incorrectly if be ...
https://bugzilla.kernel.org/show_bug.cgi?id=15742

Dmitry Monakhov <dmonakhov@openvz.org> changed:

           What    |Removed |Added
----------------------------------------------------------------------------
             Status|NEW     |RESOLVED
         Resolution|        |CODE_FIX

--
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are ...
Jul 6, 12:16 am 2010