> On Wed, Mar 10, 2010 at 9:17 AM, Hans-Peter Jansen <hpj@urpla.net> wrote:
> > While this system usually operates fine, it suffers from delays that
> > are displayed in latencytop as: "Writing page to disk: 8425,5 ms":
> > ftp://urpla.net/lat-8.4sec.png, but we see them also in the 1.7-4.8 sec
> > range:
> > ftp://urpla.net/lat-1.7sec.png,
> > ftp://urpla.net/lat-2.9sec.png,
> > ftp://urpla.net/lat-4.6sec.png and
> > ftp://urpla.net/lat-4.8sec.png.
> >
> > From other observations, this issue "feels" like it is induced by
> > single synchronisation points in the block layer, e.g. if I create heavy
> > IO load on one RAID array, say resizing a VMware disk image, it can
> > take up to a minute to log in by ssh, although the ssh login does not
> > touch this area at all (different RAID arrays). Note that the
> > latencytop snapshots above were made during normal operation, not under
> > this kind of load.
> >
> > Might later kernels mitigate this problem? As this is a production
> > system that is used 6.5 days a week, I cannot do dangerous
> > experiments; switching to 64 bit is also a problem due to the legacy
> > stuff described above... OTOH, my users suffer from this, and anything
> > helping in this respect is highly appreciated.
>
> Seems like a 2.6.32 based kernel which has per-BDI writeback and "CFQ
> low latency mode" changes might help a good deal. I know that on one
> of my bigger machines (similar in specs to yours), which runs a lot of
> processes that do a decent amount of IO, latency and load average have
> gone down after going from a 2.6.31 kernel to a 2.6.32 kernel (Fedora
> 11 system).
>
> As Chris suggested, I've also heard that using the noop IO scheduler
> can work well on Areca controllers on some kernels and workloads.
> It's worth a shot and you can even try changing it at run-time.
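
For the run-time scheduler switch mentioned above, here is a rough sketch of how it could be scripted. It relies on the standard sysfs file /sys/block/<device>/queue/scheduler, which lists the available schedulers with the active one in square brackets. The function names are made up for illustration, and the device name "sda" in the usage note is only a placeholder, so substitute whatever block device sits behind the Areca array. Writing to the file needs root, and the change does not survive a reboot.

```python
from pathlib import Path


def parse_schedulers(line):
    """Split a sysfs scheduler line such as 'noop deadline [cfq]'
    into (active, available). The active entry is the bracketed one."""
    active = None
    available = []
    for name in line.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def scheduler_path(device):
    """Return the sysfs path of the scheduler file for a block device."""
    return Path("/sys/block") / device / "queue" / "scheduler"


def get_scheduler(device):
    """Read the active and available IO schedulers for a device."""
    return parse_schedulers(scheduler_path(device).read_text())


def set_scheduler(device, name):
    """Switch the IO scheduler at run-time (needs root).
    Refuses names that the kernel does not list as available."""
    active, available = get_scheduler(device)
    if name not in available:
        raise ValueError(f"{name!r} not offered for {device}: {available}")
    if name != active:
        # Same effect as: echo noop > /sys/block/<device>/queue/scheduler
        scheduler_path(device).write_text(name + "\n")
```

Calling get_scheduler("sda") before and after set_scheduler("sda", "noop") should show the bracketed entry move to noop. Since the old scheduler name is returned by the first call, switching back is one more set_scheduler call if latency does not improve.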