Re: OOM killer problem - how to read the kernel log?

Previous thread: PATCH 6/6] scsi: megaraid_sas - Update version and changelog by bo yang on Friday, November 9, 2007 - 5:44 am. (1 message)

Next thread: [git pull] scheduler fixes by Ingo Molnar on Friday, November 9, 2007 - 6:06 pm. (1 message)
To: <linux-kernel@...>
Date: Friday, November 9, 2007 - 5:35 pm

We have a database server which we've had some problems with, we've
had some serious crashes (particularly during postgres vacuuming)
which couldn't be fixed without hardware reboot. We assumed that the
problem would go away by itself after upgrading the server to a new
box with 32G of RAM ... but, alas, one of the first days in operation
it first killed lots of postgres children during the evening, and
finally it crashed and required hardware reboot. The box is certainly
not out of memory, but after reading some posts here, I've understood
that memory management is a complex issue.

We're running 32 bits linux, maybe it would help to upgrade to 64 bits?

I'm pretty sure that everything we need to know is located in
/var/log/kernel.log - if we could only interpret it, that is. I'm
particularly curious about those lines:

Nov 8 18:15:17 localhost kernel: DMA free:3556kB min:68kB low:84kB
high:100kB active:52kB inactive:0kB present:16256kB pages_scanned:208
all_unreclaimable? yes
Nov 8 18:15:17 localhost kernel: lowmem_reserve[]: 0 873 33512 33512
Nov 8 18:15:17 localhost kernel: Normal free:3676kB min:3744kB
low:4680kB high:5616kB active:892kB inactive:1792kB present:894080kB
pages_scanned:3468 all_unreclaimable? yes
Nov 8 18:15:17 localhost kernel: lowmem_reserve[]: 0 0 261112 261112
Nov 8 18:15:17 localhost kernel: HighMem free:7347404kB min:512kB
low:35536kB high:70560kB active:15982108kB inactive:9299956kB
present:33422336kB pages_scanned:0 all_unreclaimable? no
Nov 8 18:15:17 localhost kernel: lowmem_reserve[]: 0 0 0 0
Nov 8 18:15:17 localhost kernel: DMA: 7*4kB 15*8kB 5*16kB 0*32kB
0*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3556kB
Nov 8 18:15:17 localhost kernel: Normal: 1*4kB 0*8kB 24*16kB 2*32kB
1*64kB 0*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3588kB
Nov 8 18:15:17 localhost kernel: HighMem: 2054*4kB 60332*8kB
54264*16kB 14863*32kB 4487*64kB 3716*128kB 1243*256kB 3228*512kB
80*1024kB 979*2048kB 169*4096kB = 7347608kB

I've copied o...

To: Tobias Brox <tobixen@...>
Cc: <linux-kernel@...>
Date: Saturday, November 10, 2007 - 10:02 am

Hi,

Although you can address up to 64GB of RAM with PAE, the address space
of directly accessible memory is still at 32bit (4GB) and access to
higher memory costs low memory and processing overhead for the mapping

kern.log has wrong permissions, slabinfo and meminfo are empty.

Hannes
-

To: Tobias Brox <tobixen@...>, <linux-kernel@...>
Date: Saturday, November 10, 2007 - 12:09 pm

Point taken. I've gotten quite some replies on this, also off the
list. We will upgrade to 64 bits, but I don't know when we will get
that done, so now we're wondering what we can do for short-term
solutions.

My oh my. I'm really awfully bad at testing things. Sorry for that.
Links should work now. Ok, now it's even tested ;-)
-

To: Tobias Brox <tobixen@...>
Cc: <linux-kernel@...>
Date: Friday, November 9, 2007 - 5:54 pm

Big-time. There are a lot of "artificial" internal kernel memory
limits in the 32-bit architecture, so you can run up against those
(triggering the OOM killer) even if you have plenty of application
memory.

32GB in the box actually might be counter-productive unless you're
running 64-bit, because just tracking that much memory will consume
precious low kernel RAM.

-Doug
-

Previous thread: PATCH 6/6] scsi: megaraid_sas - Update version and changelog by bo yang on Friday, November 9, 2007 - 5:44 am. (1 message)

Next thread: [git pull] scheduler fixes by Ingo Molnar on Friday, November 9, 2007 - 6:06 pm. (1 message)