Re: PROBLEM: oom killer and swap weirdness on 2.6.3* kernels

Previous thread: WARNING: at fs/namespace.c:648 commit_tree+0xf1/0x10b() by Luis Claudio R. Goncalves on Friday, May 14, 2010 - 5:35 am. (1 message)

Next thread: [PATCH] Use microsecond granularity for taskstats accounting by Michael Holzheu on Friday, May 14, 2010 - 6:22 am. (3 messages)
From: dave b
Date: Friday, May 14, 2010 - 5:53 am

In 2.6.3* kernels (test case was performed on the 2.6.33.3 kernel)
when physical memory runs out and there is a large swap partition -
the system completely stalls.

I noticed that when running Debian lenny using dm-crypt with
encrypted / and swap on a 2.6.33.3 kernel (and all of the 2.6.3*
series, iirc), once all physical memory is used (swappiness was left at
the default of 60) the system hangs and does not respond. It can resume
normal operation some time later, but it seems to take a *very*
long time for the oom killer to come in. Obviously with swapoff this
doesn't happen: the oom killer comes in and does its job.


free -m
             total       used       free     shared    buffers     cached
Mem:          1980       1101        879          0         58        201
-/+ buffers/cache:        840       1139
Swap:        24943          0      24943


My simple test case is

dd if=/dev/zero of=/tmp/stall
and wait till /tmp fills...
--

From: dave b
Date: Friday, May 14, 2010 - 6:14 am

Sorry - I forgot to say I am running x86-64
--

From: dave b
Date: Thursday, May 20, 2010 - 12:27 am

Is there a reason no one has taken any interest in my email?
The behaviour isn't found on the 2.6.26 debian kernel. So I was
thinking that it might be due to my intel graphics card / memory
interplay?

--

From: Hugh Dickins
Date: Friday, May 21, 2010 - 2:18 pm

Is that tmpfs sized the default 50% of RAM?

But I wonder if you're suffering from a bug which KOSAKI-San just
identified, and for which he has very recently posted this patch:
please try it and let us all know - thanks.

Hugh

[PATCH] tmpfs: Insert tmpfs cache pages to inactive list at first

Shaohua Li reported that parallel file copies on tmpfs can lead to
the OOM killer. This is a regression caused by commit 9ff473b9a7
(vmscan: evict streaming IO first). Wow, it is a 2-year-old patch!

Currently, tmpfs file cache is inserted into the active list at first.
This means the insertion not only increases the number of pages in the
anon LRU, but also reduces the anon scanning ratio. Therefore, vmscan
gets totally confused. It scans almost only the file LRU even though
the system has plenty of unused tmpfs pages.

Historically, lru_cache_add_active_anon() was used for two reasons.
1) To prioritize shmem pages over regular file cache.
2) To avoid reclaim priority inversion of used-once pages.

But we've lost both motivations because (1) we now have separate
anon and file LRU lists, so inserting into the active list doesn't help
such prioritization. (2) In the past, one pte access bit would cause page
activation, so inserting into the inactive list with the pte access bit
set meant higher priority than inserting into the active list. This
priority inversion could lead to unintended LRU churn, but it was already
solved by commit 645747462 (vmscan: detect mapped file pages used only once).
(Thanks Hannes, you are great!)

Thus, now we can use lru_cache_add_anon() instead.

Reported-by: Shaohua Li <shaohua.li@intel.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
 mm/filemap.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index b941996..023ef61 100644
--- a/mm/filemap.c
+++ ...
--

From: dave b
Date: Wednesday, May 26, 2010 - 8:45 pm

That was just a simple test case with dd. That test case might be
invalid, but it is trying to trigger out of memory, and doing this any
other way still causes the problem. I note that while playing with some
BIOS settings I was actually able to trigger what appeared to be graphics
corruption issues when I launched KDE applications ... nothing shows
up in dmesg, so this might just be a conflict between xorg and the
kernel with those BIOS settings...

Anyway, this is no longer a 'problem' for me since I disabled
overcommit and altered the values for dirty_ratio and
dirty_background_ratio, and I can no longer trigger it.
--

From: David Rientjes
Date: Tuesday, June 1, 2010 - 1:52 pm

Disabling overcommit should always do it, but I'd be interested to know if
restoring dirty_ratio to 40 would help your use case.
--

From: dave b
Date: Monday, July 26, 2010 - 7:05 am

Actually it turns out on 2.6.34.1 I can trigger this issue. What it
really is, is that linux doesn't invoke the oom killer when it should
and kill something off. This is *really* annoying.

I used the following script (on 2.6.34.1):
cat ./scripts/disable_over_commit
#!/bin/bash
echo 2 > /proc/sys/vm/overcommit_memory
echo 40 > /proc/sys/vm/dirty_ratio
echo 5 > /proc/sys/vm/dirty_background_ratio

And I was still able to reproduce this bug.
Here is some C code to trigger the condition I am talking about.


#include <stdlib.h>
#include <stdio.h>

int main(void)
{
	while(1)
	{
		malloc(1000);
	}

	return 0;
}
--

From: David Rientjes
Date: Monday, July 26, 2010 - 3:12 pm

I'm not exactly sure what you're referring to, it's been two months and 
you're using a new kernel and now you're saying that the oom killer isn't 
being utilized when the original problem statement was that it was killing 
--

From: dave b
Date: Monday, July 26, 2010 - 9:39 pm

Sorry about the timespan :(
Well actually it is the same issue. Originally the oom killer wasn't
being invoked and now the problem is still it isn't invoked - it
doesn't come and kill things - my desktop just sits :)
I have since replaced the hard disk - which I thought could be the
issue. I am thinking that because I have shared graphics not using KMS
- with intel graphics - this may be the root cause.

--
All things that are, are with more spirit chased than enjoyed.		--
Shakespeare, "Merchant of Venice"
--

From: KOSAKI Motohiro
Date: Monday, July 26, 2010 - 9:46 pm

Do you mean the issue goes away if you disable intel graphics?
If so, we need help from the intel graphics driver folks. Sorry, linux-mm
folks don't know the intel graphics details.


--

From: dave b
Date: Monday, July 26, 2010 - 9:49 pm

Well the only other system I have running the 2.6.34.1 kernel atm is
an arm based system.
I originally sent this to the kernel list and was told I should
probably forward it to the mm list.
It may be a general issue or it could just be specific :)

--
"Not Hercules could have knock'd out his brains, for he had none."		--
Shakespeare
--

From: KOSAKI Motohiro
Date: Monday, July 26, 2010 - 11:09 pm

Hmm.. I'm puzzled 8-)

I don't understand why nobody else can reproduce your issue
even though your reproducer program is very simple.

So, I'm guessing there is a hidden condition needed to reproduce it, but
I have no idea how to find it.



--

From: dave b
Date: Tuesday, July 27, 2010 - 1:09 am

I will try with the latest ubuntu and report how that goes (that will
be using fairly new xorg etc.). It is likely to be a hidden issue just
with the intel graphics driver. However, my concern is that it isn't,
and that it is about how shared graphics memory is handled :)
--

From: dave b
Date: Tuesday, July 27, 2010 - 3:40 am

Ok my desktop still stalled and no oom killer was invoked when I added
swap to a live-cd of 10.04 amd64.

*Without* *swap* *on* - the oom killer was invoked - here is a copy of it.

[  298.180542] Xorg invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=0
[  298.180553] Xorg cpuset=/ mems_allowed=0
[  298.180560] Pid: 3808, comm: Xorg Not tainted 2.6.32-21-generic #32-Ubuntu
[  298.180564] Call Trace:
[  298.180583]  [<ffffffff810b37cd>] ? cpuset_print_task_mems_allowed+0x9d/0xb0
[  298.180595]  [<ffffffff810f64f4>] oom_kill_process+0xd4/0x2f0
[  298.180603]  [<ffffffff810f6ab0>] ? select_bad_process+0xd0/0x110
[  298.180609]  [<ffffffff810f6b48>] __out_of_memory+0x58/0xc0
[  298.180616]  [<ffffffff810f6cde>] out_of_memory+0x12e/0x1a0
[  298.180626]  [<ffffffff81540c9e>] ? _spin_lock+0xe/0x20
[  298.180633]  [<ffffffff810f9d21>] __alloc_pages_slowpath+0x511/0x580
[  298.180641]  [<ffffffff810f9eee>] __alloc_pages_nodemask+0x15e/0x1a0
[  298.180650]  [<ffffffff8112ca57>] alloc_pages_current+0x87/0xd0
[  298.180657]  [<ffffffff810f8e0e>] __get_free_pages+0xe/0x50
[  298.180666]  [<ffffffff81154994>] __pollwait+0xb4/0xf0
[  298.180673]  [<ffffffff814e09a5>] unix_poll+0x25/0xc0
[  298.180682]  [<ffffffff81449bea>] sock_poll+0x1a/0x20
[  298.180688]  [<ffffffff811545b2>] do_select+0x3a2/0x6d0
[  298.180696]  [<ffffffff811548e0>] ? __pollwait+0x0/0xf0
[  298.180702]  [<ffffffff811549d0>] ? pollwake+0x0/0x60
[  298.180708]  [<ffffffff811549d0>] ? pollwake+0x0/0x60
[  298.180714]  [<ffffffff811549d0>] ? pollwake+0x0/0x60
[  298.180721]  [<ffffffff811549d0>] ? pollwake+0x0/0x60
[  298.180727]  [<ffffffff811549d0>] ? pollwake+0x0/0x60
[  298.180732]  [<ffffffff811549d0>] ? pollwake+0x0/0x60
[  298.180737]  [<ffffffff811549d0>] ? pollwake+0x0/0x60
[  298.180741]  [<ffffffff811549d0>] ? pollwake+0x0/0x60
[  298.180745]  [<ffffffff811549d0>] ? pollwake+0x0/0x60
[  298.180749]  [<ffffffff811550ba>] core_sys_select+0x18a/0x2c0
[  298.180777]  [<ffffffffa001eced>] ? drm_ioctl+0x13d/0x480 ...
--

From: KOSAKI Motohiro
Date: Tuesday, July 27, 2010 - 4:14 am

This stack seems similar to the following bug. Can you please try disabling
the intel graphics driver?






--

From: dave b
Date: Tuesday, July 27, 2010 - 4:26 am

Ok I am not sure how to do that :)
I could revert the patch and see if it 'fixes' this :)
--

From: KOSAKI Motohiro
Date: Tuesday, July 27, 2010 - 10:06 pm

Oops, no, reverting is not a good action. The patch is correct.
Probably my explanation was not clear. Sorry.

I was hoping you would disable the 'driver' (i.e. use vga), not the patch.

Thanks.


--

From: dave b
Date: Wednesday, July 28, 2010 - 12:14 am

Oh you mean in xorg, I will also blacklist the module. Sure, that patch
might not be it, but in 2.6.26 the problem isn't there :)
--

From: dave b
Date: Thursday, July 29, 2010 - 2:47 am

Ok I re-tested with 2.6.26 and 2.6.34.1
So I will describe what happens below:

2.6.26 - with xorg running
"Given I have a test file called a.out
 And I can see Xorg
 And I am using 2.6.26
 And I have swap on
 When I run a.out
 Then I see the system freeze up slightly
 And the hard drive churns (and the cpu is doing something, as the
large fan kicks in)
 And after a while the system unfreezes"

2.6.26 - from single mode - before xorg starts and i915 is *not* loaded.
"Given I have a test file called a.out
 And I cannot see Xorg
 And I am using 2.6.26
 And I have swap on
 When I run a.out
 Then I see the system freeze up
 And the system fan doesn't spin any faster
 And the system just sits idle"

2.6.34.1
With and without xorg - WITH spam on the same behaviour as in the
2.6.26 kernel appears (when xorg is not loaded).

OOM attached from the 2.6.26 kernel when I used magic keys to invoke
the oom killer :) (this was on the 2.6.26 kernel - before i915 had
loaded and in single mode).

[  280.323899] SysRq : Manual OOM execution
[  280.324009] events/0 invoked oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0
[  280.324056] Pid: 9, comm: events/0 Not tainted 2.6.26-2-amd64 #1
[  280.324098]
[  280.324099] Call Trace:
[  280.324200]  [<ffffffff8027388c>] oom_kill_process+0x57/0x1dc
[  280.324247]  [<ffffffff8023b49d>] __capable+0x9/0x1c
[  280.324290]  [<ffffffff80273bb7>] badness+0x188/0x1c7
[  280.324341]  [<ffffffff80273deb>] out_of_memory+0x1f5/0x28e
[  280.324396]  [<ffffffff8037824c>] moom_callback+0x0/0x1a
[  280.324449]  [<ffffffff80243070>] run_workqueue+0x82/0x111
[  280.324497]  [<ffffffff8024393d>] worker_thread+0xd5/0xe0
[  280.324543]  [<ffffffff80246171>] autoremove_wake_function+0x0/0x2e
[  280.324596]  [<ffffffff80243868>] worker_thread+0x0/0xe0
[  280.324637]  [<ffffffff8024604b>] kthread+0x47/0x74
[  280.324678]  [<ffffffff802300ed>] schedule_tail+0x27/0x5c
[  280.326721]  [<ffffffff8020cf38>] child_rip+0xa/0x12
[  280.326788]  ...
--

From: dave b
Date: Thursday, July 29, 2010 - 2:48 am

s/spam/same/
--

From: dave b
Date: Tuesday, September 21, 2010 - 6:01 am

Ok this issue is still around and still *really* annoying.
So I had a 5mb text file, I put %s/\n/, in vim, and my desktop stalled as
vim used memory. It sat there for ~10 minutes before the oom
killer finally woke up and did something....
This is on totally different hardware now (amd phenom, ddr3 ram, SATA 3
disk) and here is some dmesg output :)


Sep 21 22:41:44 RANDOMBOXEN kernel: [329160.956367] kjournald     D
ffff88011be59a00     0   982      2 0x00000000
Sep 21 22:41:44 RANDOMBOXEN kernel: [329160.956370]  ffff88011bf9fbf0
0000000000000046 ffff88011bf9fbc0 ffffffffa00f0775
Sep 21 22:41:44 RANDOMBOXEN kernel: [329160.956373]  ffff88011bf9ffd8
0000000000013900 ffff88011bf9ffd8 ffff88011be59680
Sep 21 22:41:44 RANDOMBOXEN kernel: [329160.956375]  0000000000013900
0000000000013900 0000000000013900 0000000000013900
Sep 21 22:41:44 RANDOMBOXEN kernel: [329160.956377] Call Trace:
Sep 21 22:41:44 RANDOMBOXEN kernel: [329160.956399]
[<ffffffffa00f0775>] ? dm_table_unplug_all+0x54/0xc6 [dm_mod]
Sep 21 22:41:44 RANDOMBOXEN kernel: [329160.956405]
[<ffffffff812e4f80>] io_schedule+0x7b/0xc1
Sep 21 22:41:44 RANDOMBOXEN kernel: [329160.956408]
[<ffffffff8110d0ea>] sync_buffer+0x3b/0x3f
Sep 21 22:41:44 RANDOMBOXEN kernel: [329160.956409]
[<ffffffff812e5488>] __wait_on_bit+0x47/0x79
Sep 21 22:41:44 RANDOMBOXEN kernel: [329160.956411]
[<ffffffff8110d0af>] ? sync_buffer+0x0/0x3f
Sep 21 22:41:44 RANDOMBOXEN kernel: [329160.956413]
[<ffffffff8110d0af>] ? sync_buffer+0x0/0x3f
Sep 21 22:41:44 RANDOMBOXEN kernel: [329160.956415]
[<ffffffff812e5524>] out_of_line_wait_on_bit+0x6a/0x77
Sep 21 22:41:44 RANDOMBOXEN kernel: [329160.956418]
[<ffffffff8105b678>] ? wake_bit_function+0x0/0x2a
Sep 21 22:41:44 RANDOMBOXEN kernel: [329160.956419]
[<ffffffff8110d06f>] __wait_on_buffer+0x1f/0x21
Sep 21 22:41:44 RANDOMBOXEN kernel: [329160.956425]
[<ffffffffa0165824>] journal_commit_transaction+0xa42/0xfba [jbd]
Sep 21 22:41:44 RANDOMBOXEN kernel: [329160.956427]
[<ffffffff812e4e36>] ? ...