Re: [PATCH 5/5] writeback: introduce writeback_control.more_io to indicate more io

To: Fengguang Wu <wfg@...>
Cc: David Chinner <dgc@...>, Andrew Morton <akpm@...>, <linux-kernel@...>, Ken Chen <kenchen@...>, Andrew Morton <akpm@...>, Michael Rubin <mrubin@...>
Date: Friday, October 5, 2007 - 3:41 am

On Fri, Oct 05, 2007 at 11:36:52AM +0800, Fengguang Wu wrote:

I walked right into that one ;)


From this, if we have more_io on one superblock and we skip pages on a
different superblock, the combination of the two will cause us to stop
writeback for a while. Is this the right thing to do?


To me it reads as:

	while (!done) {
		/* sync all data or until one inode skips */
		congestion_wait(up to 100ms);
	}

and it ignores that we might have more superblocks with dirty data
on them that we haven't flushed because we skipped pages on
an inode on a different block device.



If that's the worst case, then it's far better than the current
"wait 30s for every 4MB".  ;)

Still, if it can be improved....


And probably good enough to make it unnoticeable.


Ah, ok. That I understand ;)


But it takes a modern SATA disk ~40-50ms to write 4MB (80-100MB/s).
IOWs, what you've timed above is a burst workload, not a steady
state behaviour. And it actually shows that the elevator queues
are growing, in contrast to your goal of preventing them from
growing.
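
The 40-50ms figure above is just size divided by sustained rate. As a
minimal sketch (write_time_ms is a hypothetical helper for illustration,
not a kernel function, and it ignores seek time and drive caching):

```c
#include <assert.h>

/*
 * Time in milliseconds to write size_mb megabytes at a sustained
 * rate of rate_mb_per_s megabytes per second. Illustrative helper
 * only; it models a steady stream and nothing else.
 */
double write_time_ms(double size_mb, double rate_mb_per_s)
{
	assert(rate_mb_per_s > 0.0);
	return size_mb * 1000.0 / rate_mb_per_s;
}
```

With size_mb = 4 this gives 50ms at 80MB/s and 40ms at 100MB/s, which is
the range quoted for a modern SATA disk.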

In more detail, the first half of the trace indicates no pages under
writeback, which tends to imply that all I/O is complete by the
time wb_kupdate is woken - it's been sucked into the drive
cache as fast as possible.

About half way through we start to see windup of the number of
pages under writeback of about 800-900 pages per printk.  That's
1024 pages minus 1 or 2 512k I/Os. This implies that the disk cache
is now full and the disk has reached saturation. I/O is now
being queued in the elevator. The last trace has 13051 pages under
writeback, which at 128 pages per I/O is ~100 queued 512k I/Os.
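
The page arithmetic in that paragraph can be sketched as follows. This
assumes 4KiB pages (which is what makes a 512k I/O equal to 128 pages);
the helper names are hypothetical and exist only for this illustration:

```c
#include <assert.h>

#define PAGE_SIZE_BYTES 4096L	/* assumption: 4KiB pages */

/* Number of pages carried by one I/O of io_bytes bytes. */
long pages_per_io(long io_bytes)
{
	assert(io_bytes > 0 && io_bytes % PAGE_SIZE_BYTES == 0);
	return io_bytes / PAGE_SIZE_BYTES;
}

/*
 * Net growth in pages under writeback for one batch: pages submitted
 * minus the pages retired by the I/Os that completed meanwhile.
 */
long net_pages_added(long batch_pages, long completed_ios, long io_pages)
{
	return batch_pages - completed_ios * io_pages;
}

/* Approximate number of queued I/Os, rounded to the nearest whole I/O. */
long queued_ios(long pages_under_writeback, long io_pages)
{
	assert(io_pages > 0);
	return (pages_under_writeback + io_pages / 2) / io_pages;
}
```

A 1024 page batch with one or two 512k completions nets 896 or 768
pages, matching the 800-900 seen per printk, and 13051 pages at 128
pages per I/O works out to roughly 102 queued I/Os.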

The default queue depth with cfq is 128 requests, and IIRC it
congests at 7/8 full, or 112 requests. IOWs, the file that you
wrote was about 10MB short of what is needed to see congestion on
your test rig.
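
As a rough sketch of that threshold check: the 7/8 fraction below is
the figure recalled from memory above, not something taken from the
block layer source, and both function names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Request count at which the queue is treated as congested, using
 * the 7/8-full figure from the discussion (an assumption here, not
 * the kernel's exact formula).
 */
int congestion_threshold(int nr_requests)
{
	assert(nr_requests > 0);
	return nr_requests * 7 / 8;
}

/* True once the number of queued requests reaches the threshold. */
bool queue_congested(int queued, int nr_requests)
{
	return queued >= congestion_threshold(nr_requests);
}
```

For a 128 request queue the threshold comes out at 112, so a queue
holding around 100 requests has not yet reached it.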

So the trace shows we slept on neither congestion nor more_io
and it points towards congestion being the thing that will typically
block us on large file I/O. Before drawing any conclusions on
whether wbc.more_io is needed or not, do you have any way of
producing skipped pages when more_io is set?


You are using ext3? That would be my guess based simply on the write
rate - ext3 has long been stuck at about that speed for buffered
writes even on much faster block devices.  If I'm right, try using
XFS and see how much differently it behaves. I bet you hit
congestion much sooner than you expect. ;)

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group