> On Wed, Apr 07 2010, Vivek Goyal wrote:
> > On Wed, Apr 07, 2010 at 05:18:12PM -0400, Jeff Moyer wrote:
> > > Hi again,
> > >
> > > So, here's another stab at fixing this. This patch is very much an RFC,
> > > so do not pull it into anything bound for Linus. ;-) For those new to
> > > this topic, here is the original posting:
> > > http://lkml.org/lkml/2010/4/1/344
> > >
> > > The basic problem is that, when running iozone on smallish files (up to
> > > 8MB in size) and including fsync in the timings, deadline outperforms
> > > CFQ by a factor of about 5 for 64KB files, and by about 10% for 8MB
> > > files. From examining the blktrace data, it appears that iozone will
> > > issue an fsync() call, and will have to wait until its CFQ timeslice
> > > has expired before the journal thread can run to actually commit data to
> > > disk.
> > >
> > > The approach below puts an explicit call into the filesystem-specific
> > > fsync code to yield the disk so that the jbd[2] process has a chance to
> > > issue I/O. This brings performance of CFQ in line with deadline.
> > >
> > > There is one outstanding issue with the patch that Vivek pointed out.
> > > Basically, this could starve out the sync-noidle workload if there is a
> > > lot of fsync-ing going on. I'll address that in a follow-on patch. For
> > > now, I wanted to get the idea out there for others to comment on.
> > >
> > > Thanks a ton to Vivek for spotting the problem with the initial
> > > approach, and for his continued review.
> > >
> >
> > Thanks Jeff. Conceptually this approach makes a lot of sense to me. Higher
> > layers explicitly tell CFQ not to idle, i.e. to yield the slice.
> >
> > My firefox timing test is performing much better now.
> >
> > real 0m15.957s
> > user 0m0.608s
> > sys 0m0.165s
> >
> > real 0m12.984s
> > user 0m0.602s
> > sys 0m0.148s
> >
> > real 0m13.057s
> > user 0m0.624s
> > sys 0m0.145s
> >
> > So we have to take care of two issues now.
> >
> > - Make it work with dm/md devices also. Somehow we shall have to propagate
> > this yield semantic down the stack.
>
> The way that Jeff set it up, it's completely parallel to e.g. congestion
> or unplugging. So that should be easily doable.
>