PROBLEM: kernel memory subsystem incorrectly invokes OOM killer under certain situations

To: <linux-kernel@...>
Date: Sunday, October 14, 2007 - 7:04 pm

Hi linux-kernel,

[1.] One line summary of the problem:

kernel memory subsystem incorrectly invokes OOM killer under certain situations

[2.] Full description of the problem/report:

My guess is that whatever invokes the OOM killer is incorrectly
deciding that memory allocated for disk cache operations cannot be
reclaimed, or that the OOM killer code itself is incorrectly killing
processes when the cause of the memory exhaustion is the disk cache
subsystem (and not a runaway process).

Specifically: I have a RedHat AS4u5 2.6.9-55.0.6.ELsmp host with
4 GB of RAM, running vmware 1.0.4 with another AS4 system as a guest,
which has 3 virtual SCSI drives. Running the following command in the
guest reliably causes the host OOM killer to terminate my vmware process:

dd if=/dev/sdb of=/dev/sdc

(to clone the contents of a 16 GB virtual disk). The host has only
one file system, 2 TB in size.

While it is easiest to demonstrate the problem with vmware, this does
not appear to be a problem with vmware itself.
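
One way to test the guess above would be to sample the host's
/proc/meminfo while the guest dd is running, and see whether free
memory shrinks while buffers and page cache grow. The following is
only a minimal sketch: MemFree, Buffers and Cached are standard
/proc/meminfo keys, but the function name meminfo_summary and the
output layout are made up for illustration and are not part of the
original report.

```shell
#!/bin/sh
# Print free vs. buffer/page-cache memory from a meminfo-style file.
# meminfo_summary is a hypothetical helper name used for illustration.
meminfo_summary() {
    file=${1:-/proc/meminfo}
    awk '
        $1 == "MemFree:" { free  = $2 }
        $1 == "Buffers:" { buf   = $2 }
        $1 == "Cached:"  { cache = $2 }
        END { printf "free=%d kB buffers=%d kB cached=%d kB\n", free, buf, cache }
    ' "$file"
}

# Take one sample now if /proc/meminfo is readable.
if [ -r /proc/meminfo ]; then
    meminfo_summary
fi

# To sample every 5 seconds on the host while the guest dd runs:
#   while :; do date; meminfo_summary; sleep 5; done
```

If cached memory is large and free memory is small at the moment the
OOM killer fires, that would be consistent with the guess above,
though it would not prove it.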

[3.] Keywords (i.e., modules, networking, kernel):

/usr/src/redhat/BUILD/kernel-2.6.9/linux-2.6.9/mm/oom_kill.c

OOM killer

[4.] Kernel version (from /proc/version):

Linux version 2.6.9-55.0.6.ELsmp (brewbuilder@hs20-bc2-3.build.redhat.com) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-8)) #1 SMP Thu Aug 23 11:11:20 EDT 2007

[5.] Output of Oops.. message (if applicable) with symbolic information
resolved (see Documentation/oops-tracing.txt)

Here is the messages log output showing the offending oom-kill:

Oct 14 21:05:14 dor kernel: oom-killer: gfp_mask=0xd0
Oct 14 21:05:14 dor kernel: Mem-info:
Oct 14 21:05:14 dor kernel: DMA per-cpu:
Oct 14 21:05:14 dor kernel: cpu 0 hot: low 2, high 6, batch 1
Oct 14 21:05:14 dor kernel: cpu 0 cold: low 0, high 2, batch 1
Oct 14 21:05:14 dor kernel: cpu 1 hot: low 2, high 6, batch 1
Oct 14 21:05:14 dor kernel: cpu 1 cold: low 0, high 2, batch 1
Oct 14 21:05:14 dor kernel: cpu 2 hot: low 2, high 6, batch 1
Oct 14 21:05:14 dor kernel: cpu 2 cold: low 0...
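
The gfp_mask=0xd0 value on the first log line can be decoded into
allocation flags. The sketch below assumes the bit values found in
2.6-era include/linux/gfp.h (__GFP_WAIT 0x10, __GFP_IO 0x40,
__GFP_FS 0x80, and so on); a vendor kernel may differ, so the values
should be checked against the headers of the kernel actually running.
The function name decode_gfp is illustrative only.

```shell
#!/bin/sh
# Decode a gfp_mask value into flag names.
# ASSUMPTION: bit values are taken from 2.6-era include/linux/gfp.h;
# verify them against the source of the kernel in question.
decode_gfp() {
    mask=$(( $1 ))
    out=""
    for pair in 0x01:__GFP_DMA 0x02:__GFP_HIGHMEM 0x10:__GFP_WAIT \
                0x20:__GFP_HIGH 0x40:__GFP_IO 0x80:__GFP_FS; do
        bit=${pair%%:*}     # hex bit value before the colon
        name=${pair#*:}     # flag name after the colon
        if [ $(( $mask & $bit )) -ne 0 ]; then
            out="${out:+$out }$name"
        fi
    done
    echo "$out"
}

decode_gfp 0xd0
```

Under those assumed values, 0xd0 is __GFP_WAIT | __GFP_IO | __GFP_FS,
which is the combination those kernels define as GFP_KERNEL, i.e. an
ordinary kernel allocation that is allowed to sleep and start I/O.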
