On Sat, 2009-08-22 at 10:51 +0800, Wu Fengguang wrote:
quoted text > > +++ a/mm/page-writeback.c
quoted text > > @@ -465,7 +439,6 @@ get_dirty_limits(unsigned long *pbackgro
> > bdi_dirty = dirty * bdi->max_ratio / 100;
> >
> > *pbdi_dirty = bdi_dirty;
> > task_dirty_limit(current, pbdi_dirty);
> > }
> > }
> > @@ -499,45 +472,12 @@ static void balance_dirty_pages(struct a
> > };
> >
> > get_dirty_limits(&background_thresh, &dirty_thresh,
> > + &bdi_thresh, bdi);
> >
> > nr_reclaimable = global_page_state(NR_FILE_DIRTY) +
> > + global_page_state(NR_UNSTABLE_NFS);
> > + nr_writeback = global_page_state(NR_WRITEBACK) +
> > + global_page_state(NR_WRITEBACK_TEMP);
> >
> > /*
> > * In order to avoid the stacked BDI deadlock we need
> > @@ -557,16 +497,48 @@ static void balance_dirty_pages(struct a
> > bdi_nr_writeback = bdi_stat(bdi, BDI_WRITEBACK);
> > }
> >
>
> > + /* always throttle if over threshold */
> > + if (nr_reclaimable + nr_writeback < dirty_thresh) {
>
> That 'if' is a big behavior change. It effectively blocks every one
> and canceled Peter's proportional throttling work: the less a process
> dirtied, the less it should be throttled.
Hmm, I think you're right, I had not considered that, thanks for
catching that.
quoted text > I'd propose to remove the above 'if' and liberate the following three 'if's.
That might work, but it looses the total dirty_thresh constraint. The
sum of per-bdi dirties _should_ not be larger than that, but I'm not
sure it won't ever be.
The clip code Richard removed ensured that, and I think I wrote that out
of more than sheer paranoia, but I'm not sure anymore :/
I'll go over the numeric stuff again to see where it could go wrong.
If we can bound the error (I'm suspecting it was some numerical error
bound) we should be good and can indeed do this.
quoted text > > +
> > + if (bdi_nr_reclaimable + bdi_nr_writeback <= bdi_thresh)
> > + break;
> > +
> > + /*
> > + * Throttle it only when the background writeback cannot
> > + * catch-up. This avoids (excessively) small writeouts
> > + * when the bdi limits are ramping up.
> > + */
> > + if (nr_reclaimable + nr_writeback <
> > + (background_thresh + dirty_thresh) / 2)
> > + break;
> > +
> > + /* done enough? */
> > + if (pages_written >= write_chunk)
> > + break;
> > + }
> > + if (!bdi->dirty_exceeded)
> > + bdi->dirty_exceeded = 1;
> >
> > + /* Note: nr_reclaimable denotes nr_dirty + nr_unstable.
> > + * Unstable writes are a feature of certain networked
> > + * filesystems (i.e. NFS) in which data may have been
> > + * written to the server's write cache, but has not yet
> > + * been flushed to permanent storage.
> > + * Only move pages to writeback if this bdi is over its
> > + * threshold otherwise wait until the disk writes catch
> > + * up.
> > + */
> > + if (bdi_nr_reclaimable > bdi_thresh) {
> > + writeback_inodes(&wbc);
> > + pages_written += write_chunk - wbc.nr_to_write;
>
> > + if (wbc.nr_to_write == 0)
> > + continue;
>
> What's the purpose of the above 2 lines?
I think I should slow down, I seem to have totally missed these two
lines when I read the patch :/
quoted text > > + }
> > congestion_wait(BLK_RW_ASYNC, HZ/10);
> > }
> >
> > if (bdi_nr_reclaimable + bdi_nr_writeback < bdi_thresh &&
> > + bdi->dirty_exceeded)
> > bdi->dirty_exceeded = 0;
> >
> > if (writeback_in_progress(bdi))
> > @@ -580,10 +552,8 @@ static void balance_dirty_pages(struct a
> > * In normal mode, we start background writeout at the lower
> > * background_thresh, to keep the amount of dirty memory low.
> > */
> > + if ((laptop_mode && pages_written) || (!laptop_mode &&
> > + (nr_reclaimable > background_thresh)))
> > bdi_start_writeback(bdi, NULL, 0, WB_SYNC_NONE);
> > }
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to
majordomo@vger.kernel.org
More majordomo info at
http://vger.kernel.org/majordomo-info.html