Re: [RFC 0/3] Recursive reclaim (on __PF_MEMALLOC)

To: Christoph Lameter <clameter@...>
Cc: <linux-mm@...>, <linux-kernel@...>, <akpm@...>, <dkegel@...>, Peter Zijlstra <a.p.zijlstra@...>, David Miller <davem@...>, Nick Piggin <npiggin@...>
Date: Wednesday, September 5, 2007 - 5:20 am

On Tuesday 14 August 2007 07:21, Christoph Lameter wrote:

Hi Christoph,

Over the last two weeks we have tested your patch set in the context of 
ddsnap, which used to be prone to deadlock before we added a series of 
anti-deadlock measures, including Peter's anti-deadlock patch set, our 
own bio throttling code and judicious use of PF_MEMALLOC mode.  This 
cocktail of patches finally banished the deadlocks, none of which have 
been seen during several months of heavy testing.  The question you are 
no doubt interested in is whether your patch set also solves the same 
deadlocks.

The results are mixed.  I will briefly describe the test setup now.  If 
you are interested in specific details for independent verification, we 
can provide the full recipe separately.  We used the patches here:

   http://zumastor.googlecode.com/svn/trunk/ddsnap/patches/2.6.21.1/

driven by the scripted storage application here:

   http://zumastor.googlecode.com/svn/trunk/zumastor/

If we remove our anti-deadlock measures, including the ddsnap.vm.fixes 
(a roll-up of Peter's patch set) and the request throttling code in 
dm-ddsnap.c, and apply your patch set instead, we hit deadlock on the 
socket write path after a few hours (traceback tomorrow).  So your 
patch set by itself is a stability regression.

There is also some good news for you here.  The combination of our 
throttling code, plus your recursive reclaim patches and some fiddling 
with PF_LESS_THROTTLE has so far survived testing without deadlocking.  
In other words, as far as we have tested it, your patch set can 
substitute for Peter's and produce the same effect, provided that we 
throttle the block IO traffic.

Just to recap, we have identified two essential ingredients in the 
recipe for writeout deadlock prevention:

   1) Throttle block IO traffic to a bounded maximum memory use.

   2) Guarantee availability of the required amount of memory.

Now we have learned that (1) is not optional with either the peterz or 
the clameter approach, and we are wondering which is the better way to
handle (2).
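
To make ingredient (1) concrete, here is a minimal user-space sketch of a
byte-counting throttle.  It is not the code in dm-ddsnap.c: the names
io_throttle, throttle_try_charge and throttle_uncharge are hypothetical,
and it is single-threaded with no locking.  A kernel version would
presumably sleep on a wait queue rather than return false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical throttle: bounds the memory pinned by in-flight block IO. */
struct io_throttle {
    size_t limit;     /* maximum bytes allowed in flight at once */
    size_t in_flight; /* bytes currently charged to submitted requests */
};

static void throttle_init(struct io_throttle *t, size_t limit)
{
    t->limit = limit;
    t->in_flight = 0;
}

/*
 * Try to charge a request against the limit.  Returns false when the
 * request does not fit, in which case the caller must wait for a
 * completion before submitting.  in_flight never exceeds limit, so the
 * subtraction below cannot underflow.
 */
static bool throttle_try_charge(struct io_throttle *t, size_t bytes)
{
    if (bytes > t->limit - t->in_flight)
        return false;
    t->in_flight += bytes;
    return true;
}

/* Called on IO completion to release the bytes charged at submission. */
static void throttle_uncharge(struct io_throttle *t, size_t bytes)
{
    assert(bytes <= t->in_flight);
    t->in_flight -= bytes;
}
```

The point of the sketch is only that memory use by in-flight requests has
a hard upper bound, which is what lets the amount of memory needed for
(2) be stated in advance.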

If we accept for the moment that both approaches to (2) are equally 
effective at preventing deadlock (this is debatable) then the next 
criterion on the list for deciding the winner would be efficiency.  This 
is a slight oversimplification to be sure, since we are also interested 
in issues of maintainability, provability and general forward progress.  
However, since none of the latter is directly measurable, efficiency is 
a good place to start.

It is clear which approach is more efficient: Peter's.  This is because 
no scanning is required to pop a free page off a free list, so scanning 
work is not duplicated.  How much more efficient is an open question.  
Hopefully we will measure that soon.
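
The "no scanning" argument can be illustrated with a minimal user-space
sketch of a reserve kept as a singly linked free list.  This is not
Peter's code: struct reserve, reserve_push and reserve_pop are
hypothetical names, and locking is omitted.  Taking a page is a constant
amount of work no matter how much memory is in use.

```c
#include <stddef.h>

/* A free page, linked through its own storage (hypothetical layout). */
struct page_node {
    struct page_node *next;
};

/* Hypothetical reserve: a LIFO list of pages set aside in advance. */
struct reserve {
    struct page_node *free_list; /* head of the list, NULL when empty */
    size_t count;                /* number of pages currently held */
};

/* Return a page to the reserve: O(1). */
static void reserve_push(struct reserve *r, struct page_node *p)
{
    p->next = r->free_list;
    r->free_list = p;
    r->count++;
}

/* Take a page from the reserve: O(1), no scanning.  NULL if empty. */
static struct page_node *reserve_pop(struct reserve *r)
{
    struct page_node *p = r->free_list;

    if (p == NULL)
        return NULL;
    r->free_list = p->next;
    r->count--;
    return p;
}
```

A reclaim-based approach instead has to find a reclaimable page at
allocation time, and that search is the scanning work referred to above.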

Briefly touching on other factors:

  * Peter's patch set is much bigger than yours.  The active ingredients
    need to be separated out from the other peterz bits such as reserve
    management APIs so we can make a fairer comparison.

  * Your patch set here does not address the question of atomic
    allocation, though I see you have been busy with that elsewhere.
    Adding code to take care of this means you will start catching up
    with Peter in complexity.

  * The questions Peter raised about how you will deal with loads
    involving heavy anonymous allocations are still open.  This looks
    like more complexity on the way.

  * You depend on maintaining a global dirty page limit while Peter's
    approach does not.  So we see the peterz approach as progress
    towards eliminating one of the great thorns in our side:
    congestion_wait deadlocks, which we currently hack around in a
    thoroughly disgusting way (PF_LESS_THROTTLE abuse).

  * Which approach allows us to run with a higher dirty page threshold?
    More dirty page caching is better.  We will test the two approaches
    head to head on this issue pretty soon.

Regards,

Daniel