On Thu, 2007-11-15 at 08:32 -0800, Linus Torvalds wrote:
I wondered about that part the other day when I went through the BDI
dirty code due to that iozone thing..
The initial commit states:
commit d90e4590519d196004efbb308d0d47596ee4befe
Author: akpm <akpm>
Date: Sun Oct 13 16:33:20 2002 +0000
[PATCH] reduce the dirty threshold when there's a lot of mapped
Dirty memory thresholds are currently set by /proc/sys/vm/dirty_ratio.
Background writeout levels are controlled by
/proc/sys/vm/dirty_background_ratio.
Problem is that these levels are hard to get right - they are too
static. If there is a lot of mapped memory around then the 40%
clamping level causes too much dirty data. We do lots of scanning in
page reclaim, and the VM generally starts getting into distress. Extra
swapping, extra page unmapping.
It would be much better to simply tell the caller of write(2) to slow
down - to write out their dirty data sooner, to make those written
pages trivially reclaimable. Penalise the offender, not the innocent
page allocators.
This patch changes the writer throttling code so that we clamp down
much harder on writers if there is a lot of mapped memory in the
machine. We only permit memory dirtiers to dirty up to 50% of unmapped
memory before forcing them to clean their own pagecache.
BKrev: 3da9a050Mz7H6VkAR9xo6ongavTMrw
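For reference, the clamping that changelog describes amounts to roughly the following. This is a user-space sketch, not the kernel code: the function name, its parameters and the 5% floor are assumptions made for illustration; only the "at most half of the unmapped percentage" rule comes from the changelog above.

```c
/*
 * Illustrative sketch only; names and the 5% floor are assumptions,
 * not taken from the kernel source.
 *
 * vm_dirty_ratio: configured dirty threshold, in percent
 * mapped_pages:   pages currently mapped
 * total_pages:    pages the thresholds are computed against
 *
 * Returns the effective dirty threshold, in percent.
 */
int effective_dirty_ratio(int vm_dirty_ratio,
                          unsigned long mapped_pages,
                          unsigned long total_pages)
{
	int unmapped_ratio;
	int dirty_ratio = vm_dirty_ratio;

	/* Nothing sensible to compute against; keep the configured value. */
	if (total_pages == 0)
		return dirty_ratio;

	/* Percentage of memory that is not mapped. */
	unmapped_ratio = 100 - (int)((mapped_pages * 100) / total_pages);

	/* Writers may dirty at most 50% of unmapped memory. */
	if (dirty_ratio > unmapped_ratio / 2)
		dirty_ratio = unmapped_ratio / 2;

	/* Assumed lower bound so writers are never throttled to zero. */
	if (dirty_ratio < 5)
		dirty_ratio = 5;

	return dirty_ratio;
}
```

With the threshold configured at 40% and 60% of memory mapped, this returns 20, so writers get throttled at half the configured level.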
But because dirty mapped pages are no longer special, I'd say the reason
for its existence is gone. So,
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
As for the highmem part, that was due to buffer cache, and unfortunately
that is still true. Although maybe we can do something smart with the
per-bdi stuff.