Re: Sudden and massive page cache eviction

From: Simon Kirby
Date: Wednesday, December 1, 2010 - 2:15 am

On Thu, Nov 25, 2010 at 04:33:01PM +0100, Peter Schüller wrote:


Disclaimer: I have no idea what I'm doing. :)

Your buddyinfo looks to be pretty low for order 3 and above, before and
after the sudden eviction, so my guess is that it's probably related to
the issues I'm seeing with fragmentation, but maybe not fighting between
zones, since you seem to have a larger Normal zone than DMA32.  (Not
sure, you didn't post /proc/zoneinfo).  Also, you seem to be on an
actual NUMA system, so other things are happening there, too.

If you have munin installed (it looks like you do), try enabling the
buddyinfo plugin available since munin 1.4.4.  It graphs the buddyinfo
data, so it could be lined up with the memory graphs (thanks Erik).
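
If you just want a quick number without munin, something like the
following might do.  It is only a rough sketch: it assumes the usual
/proc/buddyinfo layout ("Node N, zone NAME" followed by one free-block
count per order, starting at order 0), and the function name
high_order_free is made up for illustration.

```shell
#!/bin/bash
# Sum the free blocks of order >= 3 for each zone in a buddyinfo-format
# file (defaults to /proc/buddyinfo).  high_order_free is a hypothetical
# helper name, not an existing tool.
high_order_free() {
        awk '{
                total = 0
                # Field 5 holds the order-0 count, so order 3 starts at field 8.
                for (f = 8; f <= NF; f++)
                        total += $f
                printf "%s %s zone %s: %d free blocks of order >= 3\n", $1, $2, $4, total
        }' "${1:-/proc/buddyinfo}"
}

# Only run against the live file where it exists (Linux).
if [ -r /proc/buddyinfo ]; then
        high_order_free
fi
```

Running it in a loop with a timestamp would give something to line up
against the time of the eviction.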

[snip]


Note that all of the above are actually attempting order-3 allocations
first; see /sys/kernel/slab/kmalloc-1024/order, for instance.  The "8"
means "8 pages per slab", which means order 3 (2^3 = 8 pages) is the
attempted allocation size.
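
To check the order-to-pages relationship for a given cache, a small
sketch like this may help.  It assumes a SLUB kernel that exposes
/sys/kernel/slab/<cache>/order; the function name slab_pages is
invented here, and caches that do not exist on your kernel are simply
skipped.

```shell
#!/bin/bash
# Print the allocation order and the implied pages per slab (2^order)
# for each slab cache directory given.  slab_pages is a hypothetical
# helper name.
slab_pages() {
        local dir order
        for dir in "$@"; do
                # Skip caches that are missing or unreadable on this kernel.
                [ -r "$dir/order" ] || continue
                order=$(cat "$dir/order")
                echo "$(basename "$dir"): order $order = $((1 << order)) pages per slab"
        done
}

slab_pages /sys/kernel/slab/kmalloc-1024
```

An order of 3 should print 8 pages per slab, matching the "8" above.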

I did the following on a system to test, but the free memory situation
did not actually improve.  It seems that even order-1 allocations alone
are enough to cause too much order-0 reclaim.  Even a "while true; do
sleep .01; done" loop caused free memory to start increasing, because
the order-1 (task_struct allocation) watermarks kept waking kswapd while
our other usual VM activity was happening.

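#!/bin/bash
# Lower the SLUB allocation order: order 0 for caches whose objects fit
# in one 4096-byte page, order 1 for everything larger.  Run as root.

for i in /sys/kernel/slab/*/; do
        if [ "$(cat "${i}object_size")" -le 4096 ]; then
                echo 0 > "${i}order"
        else
                echo 1 > "${i}order"
        fi
done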

But this is on another machine, without Mel's patch, and with 8 GB
memory, so a bigger Normal zone.

[snip]


Try installing the "perf" tool.  It can be built from the kernel tree in
tools/perf, and then you usually can just copy the binary around.  You
can use it to trace the points which cause kswapd to wake up, which will
show which processes are doing it, the order, flags, etc.

Just before the eviction is about to happen (or whenever), try this:

perf record --event vmscan:mm_vmscan_wakeup_kswapd --filter 'order>=3' \
	--call-graph -a sleep 30

Then view the recorded events with "perf trace", which should spit out
something like this:

    lmtp-3531  [003] 432339.243851: mm_vmscan_wakeup_kswapd: nid=0 zid=2 order=3
    lmtp-3531  [003] 432339.243856: mm_vmscan_wakeup_kswapd: nid=0 zid=1 order=3
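
If there are a lot of events, it may be easier to summarize them per
process and order.  This is just a sketch that assumes the line format
shown above (process-pid in the first column, an "order=N" field
somewhere on the line); count_wakeups is a made-up name, and process
names containing spaces would not be parsed correctly.

```shell
#!/bin/bash
# Count kswapd wakeups per process and order from trace lines on stdin,
# most frequent first.  count_wakeups is a hypothetical helper name.
count_wakeups() {
        awk '/mm_vmscan_wakeup_kswapd:/ {
                order = "?"
                # Find the order=N field wherever it appears on the line.
                for (f = 1; f <= NF; f++)
                        if ($f ~ /^order=/)
                                order = substr($f, 7)
                count[$1 " order=" order]++
        }
        END {
                for (key in count)
                        print count[key], key
        }' | sort -rn
}

# Example: perf trace | count_wakeups
```

For the two lines above, this would print "2 lmtp-3531 order=3".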

The process which woke kswapd may not be directly responsible for the
allocation, as a network interrupt or something could have happened on
top of it.  See "perf report", which is a bit dodgy at least for me, to
see the stack traces, which might make things clearer.  For example, my
traces show that even kswapd wakes kswapd sometimes, but it's because of
a trace like this:

    -      9.09%  kswapd0  [kernel.kallsyms]  [k] wakeup_kswapd
         wakeup_kswapd
         __alloc_pages_nodemask
         alloc_pages_current
         new_slab
         __slab_alloc
         __kmalloc_node_track_caller
         __alloc_skb
         __netdev_alloc_skb
         bnx2_poll_work
         bnx2_poll
         net_rx_action
         __do_softirq
         call_softirq
         do_softirq
         irq_exit
         do_IRQ
         ret_from_intr
         truncate_inode_pages
         proc_evict_inode
         evict
         iput
         dentry_iput
         d_kill
         __shrink_dcache_sb
         shrink_dcache_memory
         shrink_slab
         kswapd

Anyway, maybe you'll see some interesting traces.  If kswapd isn't waking
very often, you can also trace "kmem:mm_page_alloc" or similar (see "perf
list"), or try a smaller order or a longer sleep.

Cheers,

Simon-