On Sun, May 18, 2008 at 10:03:55PM +0200, Andi Kleen wrote:

Given how rarely people have reported problems, I think it's a really
good idea to understand what exactly our exposure is for
$COMMON_HARDWARE. And I suspect the biggest question isn't the
hardware, but the workload. Here are the questions that I think are
worth asking:

* How often can we get corruption on a common desktop workload? Given
  that we're mostly kernel developers, and kernbench is probably worst
  case for desktops, that's useful.

* What is the performance hit on a common desktop workload? (Let's use
  kernbench for consistency.)

* How often can we get corruption on a hard-core enterprise
  application with lots of fsync()'s? (e.g., postmark, et al.)

* What is the performance hit on an fsync()-heavy workload?

I have a feeling that the likelihood of corruption when running
kernbench is minimal, but the performance hit is probably minimal as
well. And that the potential for corruption is higher for an
fsync-heavy workload, but that's also where we are seeing the
(reported) 30% hit.

The other thing which we should consider is that I suspect we can do
much better for ext4 given that we have journal checksums. As Chris
pointed out, right now, with barriers turned on, we are doing this:

    write log blocks
    flush #1
    write commit block
    flush #2
    write metadata blocks

If we don't mind mixing bh and bio functions, we could change it to
this for ext4 (when journal checksumming is enabled):

    write log blocks
    write commit block
    flush (via submitting an empty barrier block I/O request)
    write metadata blocks

This should hopefully reduce the performance hit by half, since we're
eliminating one of the flushes. Even more interesting would be moving
the flush until right before we attempt to write the metadata blocks,
and allowing data writes which don't require metadata updates through.
That should be safe, even in data=ordered mode.

The point is we should think about ways that we can optimize barrier
mode for ext4. If we do this, then it may be that people will find it
interesting to mount ext3 filesystems using ext4, even without making
any additional changes, because of the better choices of speed/safety
tradeoffs.

                                        - Ted
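
As a rough illustration of the two commit orderings described above, here is a minimal Python sketch. It is a toy model only: the step names are copied from the email, the function names are made up for this example, and nothing here touches real kernel or jbd2 code. It counts how many flushes each ordering issues and checks that a flush still separates the commit block from the metadata writes.

```python
# Toy model of the two journal commit orderings from the email.
# All function names are hypothetical; this is not kernel code.

def commit_with_two_flushes():
    """Current ordering with barriers turned on."""
    return [
        "write log blocks",
        "flush",                  # flush #1
        "write commit block",
        "flush",                  # flush #2
        "write metadata blocks",
    ]


def commit_with_checksum():
    """Proposed ext4 ordering when journal checksumming is enabled."""
    return [
        "write log blocks",
        "write commit block",
        "flush",                  # single flush (empty barrier request)
        "write metadata blocks",
    ]


def count_flushes(steps):
    """Number of cache flushes issued during one commit."""
    return sum(1 for step in steps if step == "flush")


def flushed_before_metadata(steps):
    """True if a flush sits between the commit block and the metadata writes."""
    commit = steps.index("write commit block")
    metadata = steps.index("write metadata blocks")
    return "flush" in steps[commit + 1:metadata]


if __name__ == "__main__":
    for name, steps in [("barriers today", commit_with_two_flushes()),
                        ("with checksum", commit_with_checksum())]:
        print(name, count_flushes(steps), flushed_before_metadata(steps))
```

The model only shows that the proposed ordering issues one flush per commit instead of two while still flushing before the metadata blocks are written. Whether that actually halves the performance hit is the open question the email raises, and would have to be measured.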
