[PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()

To: linux-kernel <linux-kernel@...>, Linus Torvalds <torvalds@...>, Andrew Morton <akpm@...>
Cc: Christoph Hellwig <hch@...>, David Howells <dhowells@...>, Nick Piggin <nickpiggin@...>, Dave Chinner <dgc@...>, Trond Myklebust <trond.myklebust@...>, <mark.fasheh@...>, hugh <hugh@...>, stable <stable@...>
Date: Monday, October 8, 2007 - 12:54 pm

It seems that with the recent usage of ->page_mkwrite() a little detail
was overlooked.

.22-rc1 merged OCFS2 usage of this hook
.23-rc1 merged XFS usage
.24-rc1 will most likely merge NFS usage

Please consider this for .23 final and maybe even .22.x

---
Subject: mm: set_page_dirty_balance() vs ->page_mkwrite()

All the current page_mkwrite() implementations also set the page dirty, which
results in set_page_dirty_balance() _not_ calling balance, because the page is
already found dirty.

This allows us to dirty a _lot_ of pages without ever hitting
balance_dirty_pages(). Not good (tm).

Force a balance call if ->page_mkwrite() was successful.
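
The effect described above can be reproduced in a small userspace model. This is only a sketch: every model_* name below is a hypothetical stand-in rather than kernel code, and the throttle is reduced to a counter. It assumes that the dirtying helper reports success only on a clean-to-dirty transition, which is what the description above implies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: hypothetical stand-ins, not the kernel's types. */
struct model_page {
	bool dirty;
};

static unsigned long balance_calls;	/* how often the throttle was reached */

/* Stand-in for the balance call: just count it. */
static void model_balance(void)
{
	balance_calls++;
}

/* Returns true only when the page goes from clean to dirty. */
static bool model_set_page_dirty(struct model_page *page)
{
	bool was_dirty = page->dirty;

	page->dirty = true;
	return !was_dirty;
}

/* Balance on a clean->dirty transition, or whenever page_mkwrite is set. */
static void model_set_page_dirty_balance(struct model_page *page,
					 int page_mkwrite)
{
	assert(page != NULL);
	if (model_set_page_dirty(page) || page_mkwrite)
		model_balance();
}

/*
 * Simulate nr_faults write faults in which the ->page_mkwrite() handler
 * has already dirtied the page. Returns the number of balance calls.
 */
static unsigned long model_write_faults(unsigned long nr_faults,
					int page_mkwrite)
{
	unsigned long i;

	balance_calls = 0;
	for (i = 0; i < nr_faults; i++) {
		struct model_page page = { .dirty = false };

		page.dirty = true;	/* done by the mkwrite handler */
		model_set_page_dirty_balance(&page, page_mkwrite);
	}
	return balance_calls;
}
```

In this model, model_write_faults(1000, 0) never reaches the balance call, while model_write_faults(1000, 1) reaches it on every fault.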

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 include/linux/writeback.h |    2 +-
 mm/memory.c               |    9 +++++++--
 mm/page-writeback.c       |    4 ++--
 3 files changed, 10 insertions(+), 5 deletions(-)

Index: linux-2.6/include/linux/writeback.h
===================================================================
--- linux-2.6.orig/include/linux/writeback.h
+++ linux-2.6/include/linux/writeback.h
@@ -137,7 +137,7 @@ int sync_page_range(struct inode *inode,
 			loff_t pos, loff_t count);
 int sync_page_range_nolock(struct inode *inode, struct address_space *mapping,
 			loff_t pos, loff_t count);
-void set_page_dirty_balance(struct page *page);
+void set_page_dirty_balance(struct page *page, int page_mkwrite);
 void writeback_set_ratelimit(void);

/* pdflush.c */
Index: linux-2.6/mm/memory.c
===================================================================
--- linux-2.6.orig/mm/memory.c
+++ linux-2.6/mm/memory.c
@@ -1559,6 +1559,7 @@ static int do_wp_page(struct mm_struct *
 	struct page *old_page, *new_page;
 	pte_t entry;
 	int reuse = 0, ret = 0;
+	int page_mkwrite = 0;
 	struct page *dirty_page = NULL;

 	old_page = vm_normal_page(vma, address, orig_pte);
@@ -1607,6 +1608,8 @@ static int do_wp_page(struct mm_struct *
page_cache_release(old_page);
i...

To: Peter Zijlstra <peterz@...>
Cc: linux-kernel <linux-kernel@...>, Linus Torvalds <torvalds@...>, Andrew Morton <akpm@...>, Christoph Hellwig <hch@...>, David Howells <dhowells@...>, Dave Chinner <dgc@...>, Trond Myklebust <trond.myklebust@...>, <mark.fasheh@...>, hugh <hugh@...>, stable <stable@...>
Date: Monday, October 8, 2007 - 2:37 am

-

To: Nick Piggin <nickpiggin@...>
Cc: Peter Zijlstra <peterz@...>, linux-kernel <linux-kernel@...>, Linus Torvalds <torvalds@...>, Andrew Morton <akpm@...>, Christoph Hellwig <hch@...>, David Howells <dhowells@...>, Dave Chinner <dgc@...>, Trond Myklebust <trond.myklebust@...>, <mark.fasheh@...>, hugh <hugh@...>, stable <stable@...>
Date: Monday, October 8, 2007 - 7:36 pm

block_page_mkwrite() is just using generic interfaces to do this,
same as pretty much any write() system call. The idea was to make it
as similar to the write() call path as possible...

However, unlike generic_file_buffered_write(), we are not calling
balance_dirty_pages_ratelimited(mapping) between
->prepare/commit_write call pairs. Perhaps this should be added to
block_page_mkwrite() after the page is unlocked....
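
The ordering suggested here (throttle only after the page has been unlocked) can be sketched in userspace as follows. This is an illustration under stated assumptions, not the real block_page_mkwrite(): the sketch_* names are hypothetical, the prepare/commit pair is collapsed into simply marking the page dirty, and the ratelimited balance call is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: hypothetical stand-in for a page. */
struct sketch_page {
	bool locked;
	bool dirty;
};

static unsigned long sketch_throttle_calls;

/* Stand-in for the ratelimited balance call; must run without the lock. */
static void sketch_balance_ratelimited(const struct sketch_page *page)
{
	assert(!page->locked);
	sketch_throttle_calls++;
}

/* Model of a mkwrite handler that throttles after unlocking the page. */
static int sketch_page_mkwrite(struct sketch_page *page)
{
	assert(page != NULL);

	page->locked = true;
	page->dirty = true;	/* stands in for the prepare/commit pair */
	page->locked = false;

	/* The throttle may block, so it is called only after the unlock. */
	sketch_balance_ratelimited(page);
	return 0;
}
```

Placing the call after the unlock means the writer is never throttled while holding the page lock, which is the same position the call has relative to each prepare/commit pair in the write() path described above.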

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
-

To: David Chinner <dgc@...>
Cc: Peter Zijlstra <peterz@...>, linux-kernel <linux-kernel@...>, Linus Torvalds <torvalds@...>, Andrew Morton <akpm@...>, Christoph Hellwig <hch@...>, David Howells <dhowells@...>, Trond Myklebust <trond.myklebust@...>, <mark.fasheh@...>, hugh <hugh@...>, stable <stable@...>
Date: Monday, October 8, 2007 - 3:47 am

That sounds pretty sane, in terms of matching with
generic_file_buffered_write.
-

To: Nick Piggin <nickpiggin@...>
Cc: David Chinner <dgc@...>, Peter Zijlstra <peterz@...>, linux-kernel <linux-kernel@...>, Linus Torvalds <torvalds@...>, Andrew Morton <akpm@...>, Christoph Hellwig <hch@...>, David Howells <dhowells@...>, Trond Myklebust <trond.myklebust@...>, hugh <hugh@...>, stable <stable@...>
Date: Monday, October 8, 2007 - 10:12 pm

I agree. We could also insert a call to balance_dirty_pages_ratelimited() in
__ocfs2_page_mkwrite.
--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
mark.fasheh@oracle.com
-

To: Mark Fasheh <mark.fasheh@...>
Cc: David Chinner <dgc@...>, Peter Zijlstra <peterz@...>, linux-kernel <linux-kernel@...>, Linus Torvalds <torvalds@...>, Andrew Morton <akpm@...>, Christoph Hellwig <hch@...>, David Howells <dhowells@...>, Trond Myklebust <trond.myklebust@...>, hugh <hugh@...>, stable <stable@...>
Date: Monday, October 8, 2007 - 10:50 am

Hmm, Peter's patch got merged -- I suppose that's fine for 2.6.23 though...

-
