On Tuesday 20 May 2008, Jamie Lokier wrote:

Jens and I talked about tossing the barriers completely and just doing FUA for all metadata writes. For drives with NCQ, we'll get something close to optimal because the higher layer elevators are already doing most of the hard work.

Either way, you do want the flush to cover all the data=ordered writes, at least all the ordered writes from the transaction you're about to commit. Telling the difference between data=ordered from an old transaction or from the running transaction gets into pushing ordering down to the lower levels (see below).

Adding explicit ordering into the IO path is really interesting. We toss a bunch of IO down to the lower layers with information about dependencies and let the lower layers figure it out. James had a bunch of ideas here, but I'm afraid the only people that understood it were James and the whiteboard he was scribbling on.

The trick is to code the ordering in such a way that an IO failure breaks the chain, and that the filesystem has some sensible chance to deal with all these requests that have failed because an earlier write failed.

Also, once we go down the ordering road, it is really tempting to forget that ordering does ensure consistency but doesn't tell us the write actually happened. fsync and friends need to hook into the dependency chain to wait for the barrier instead of waiting for the commit.

But, back to the short term for a second, what we need are some benchmarks for barriers on and off and some guidance from the ext34 maintainers about turning them on by default. We shouldn't be pushing this FS integrity decision off on the distros.

My test prog is definitely a worst case, but I'm pretty confident that most mail server workloads end up doing similar IO. A 16MB or 32MB disk cache is common these days, and that is a very sizable percentage of the jbd log size. I think the potential for corruptions on power failure is only growing over time.

-chris
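To make the "IO failure breaks the chain" idea concrete, here is a rough toy model in Python. It is only a sketch under assumptions: the names `Request` and `submit` and the status strings are made up for illustration, nothing here is kernel or block-layer API, and requests are assumed to be listed so that dependencies come before the writes that depend on them.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """Hypothetical write request with explicit ordering dependencies."""
    name: str
    deps: list = field(default_factory=list)  # requests that must complete first
    status: str = "pending"  # pending, done, failed, or cancelled


def submit(requests, failing=()):
    """Toy lower layer: run requests in the order given.

    A request is written only if every dependency finished as "done".
    If a dependency failed or was cancelled, the chain is broken and the
    request is marked "cancelled" instead of being written.
    `failing` names the requests whose write should be simulated as an error.
    """
    for req in requests:
        if any(dep.status != "done" for dep in req.deps):
            req.status = "cancelled"
        elif req.name in failing:
            req.status = "failed"
        else:
            req.status = "done"
    return {req.name: req.status for req in requests}


if __name__ == "__main__":
    data = Request("data")
    log = Request("log", deps=[data])
    commit = Request("commit", deps=[log])
    # The log write fails, so the commit block that depends on it is never written.
    print(submit([data, log, commit], failing={"log"}))
```

In this model the filesystem gets back one status per request, so it can tell the write that really failed ("failed") from the ones that were never issued because of it ("cancelled"). It also shows the fsync point: a caller that needs durability has to wait for "done" on the request it cares about, since having the ordering recorded says nothing about whether the write happened.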
