Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

!MAILaRCHIVE_VOTE_RePLACE
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
To: Erez Zadok <ezk@...>
Cc: Pekka Enberg <penberg@...>, Ryan Finnie <ryan@...>, Andrew Morton <akpm@...>, <linux-kernel@...>, <linux-fsdevel@...>, <cjwatson@...>, <linux-mm@...>
Date: Thursday, October 25, 2007 - 12:40 pm

On Mon, 22 Oct 2007, Erez Zadok wrote:

unionfs_writepage returns it in two different cases: when it can't
find the underlying page; and when the underlying writepage returns
it.  I'd say it's wrong to return it in both cases.

In the first case, you don't really want your page put back to the head
of the active list, you want to come back to try it again quite soon
(I think): so you should just redirty and unlock and pretend success.

ramdisk uses A_W_A because none of its pages will ever become freeable
(and comment points out it'd be better if they weren't even on the
LRUs - I think several people have recently been putting forward
patches to keep such timewasters off the LRUs).

shmem uses A_W_A when there's no swap (left), or when the underlying
shm is marked as locked in memory: in each case, best to move on to
look for other pages to swap out.  (But I'm not quite convincing myself
that the temporarily out-of-swap case is different from yours above.)
It's about fixing some horrid busy loops where vmscan kept going
over the same hopeless pages repeatedly, instead of moving on to
better candidates.  Oh, there's a third case, when move_to_swap_cache
fails: that's rare, and I think I was just too lazy to separate them.

In your second case, I fail to see why the unionfs level should
mimic the lower level: you've successfully copied data and marked
the lower level pages as dirty, vmscan will come back to those in
due course, but it's just a waste of time for it to come back to
the unionfs pages again - isn't it?


unionfs_writepages handles the sync/msync/fsync leaking of A_W_A to
userspace issue, as does Pekka & Andrew's patch to write_cache_pages,
as does my patch to shmem_writepage.  And I'm contending that
unionfs_writepage should in no case return A_W_A up.

But so long as A_W_A is still defined, unionfs_writepage does
still need to check for it after calling the lower level ->writepage
(because it needs to do the missing unlock_page): unionfs_writepages
prevents unionfs_writepage being called on the normal writeout path,
but it's still getting called by vmscan under memory pressure.

(I'm in the habit of saying "vmscan" rather than naming the functions
in question, because every few months someone restructures that file
and changes their names.  I exaggerate, but it's happened often enough.)


For so long as AOP_WRITEPAGE_ACTIVATE exists, unionfs_writepage needs to
check for it coming from the lower level ->writepage, as I said above.

But your/Pekka's unionfs_writepages doesn't need to worry about it
at all, because Andrew/Pekka's write_cache_pages fix prevents it
leaking up in the !reclaim case (as does my shmem_writepage fix):
please remove that AOP_WRITEPAGE_ACTIVATE comment from unionfs_writepages.

Hugh
-
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
[PATCH] fix tmpfs BUG and AOP_WRITEPAGE_ACTIVATE, Hugh Dickins, (Wed Oct 24, 5:02 pm)
Re: [PATCH] fix tmpfs BUG and AOP_WRITEPAGE_ACTIVATE, Andrew Morton, (Wed Oct 24, 5:08 pm)
Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userl..., Hugh Dickins, (Thu Oct 25, 12:40 pm)