Getting "out of low memory" error on 2.6.24. Doesn't appear to happen on 2.6.23

Submitted by Anonymous on August 1, 2008 - 3:56pm.
Linux

I have a Sun x4450 machine with 16 Intel 2.4 GHz cores and 128 GB of RAM. I'm having trouble running kernel 2.6.24 on it. Here is the error I get during bootup:

Console: colour VGA+ 80x25
console [tty0] enabled
Checking aperture...
PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
low bootmem alloc of 67108864 bytes failed!
Kernel panic - not syncing: Out of low memory

I tried using a very similar kernel config on 2.6.23 and it appears to work:

Console: colour VGA+ 80x25
console [tty0] enabled
Checking aperture...
PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
Placing software IO TLB between 0x710ce000 - 0x750ce000
Memory: 132308340k/134742016k available (3394k kernel code, 1908680k reserved, 1607k data, 324k init)
SLUB: Genslabs=23, HWalign=64, Order=0-1, MinObjects=4, CPUs=16, Nodes=1
Calibrating delay using timer specific routine.. 4803.00 BogoMIPS (lpj=24015016)
... machine boots ...

I'm fairly lost at this point. I tried adding some printk calls to the source, but that didn't get me very far. It looks like the failure happens in __alloc_bootmem_core(): it gets to 'restart_scan' and, I think, ends up returning NULL. It appears to be trying to allocate only 64MB. I also noticed a comment in the source for __alloc_bootmem_core():

* We 'merge' subsequent allocations to save space. We might 'lose'
* some fraction of a page if allocations cannot be satisfied due to
* size constraints on boxes where there is physical RAM space
* fragmentation - in these cases (mostly large memory boxes) this
* is not a problem.

Does anyone think that might be pertinent to my situation?

Here's the kernel config: http://pastebin.com/m5fc3d1db

Thanks in advance!

first difference?

August 1, 2008 - 5:00pm

where do the two bootups differ first? is there a difference in the number, order or size of the bootmem allocations before the failing one? you could, for example, printk every allocation that is bigger than every one before it, and print an allocation count at regular intervals or in the "low bootmem alloc" message in __alloc_bootmem_low(), to get it all onto one screen. are there different limits on the possible size of the bootmem, i.e. using one or more nodes' memory? are there differences in 'mapsize', i.e. in the start and end addresses passed into init_bootmem_core() for any of the nodes?
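To make the "where do they differ first" question concrete, here is a minimal sketch that compares two captured boot logs line by line. The helper name `first_difference` is hypothetical, and the sample lines are the console output quoted in the original post (spelling normalized). Real captures may carry timestamps that would need stripping first.

```python
from itertools import zip_longest


def first_difference(log_a, log_b):
    """Return (index, line_a, line_b) for the first differing line, or None."""
    for i, (a, b) in enumerate(zip_longest(log_a, log_b)):
        if a != b:
            return i, a, b
    return None


# Lines shared by both boots, as quoted in the original post.
common = [
    "Console: colour VGA+ 80x25",
    "console [tty0] enabled",
    "Checking aperture...",
    "PCI-DMA: Using software bounce buffering for IO (SWIOTLB)",
]
boot_2_6_24 = common + [
    "low bootmem alloc of 67108864 bytes failed!",
    "Kernel panic - not syncing: Out of low memory",
]
boot_2_6_23 = common + [
    "Placing software IO TLB between 0x710ce000 - 0x750ce000",
]

result = first_difference(boot_2_6_24, boot_2_6_23)
print(result)
```

With only the quoted lines, the first divergence is the SWIOTLB allocation itself, so extra printk output before that point is what would reveal an earlier difference.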

i don't think the comment has anything to do with your problem. if the last allocation left a partial page (A: allocated, F: free, N: new, H: hole):

page i   page i+1
AAFFFFFF FFFFFFFF

and you allocate a new block, a new free page is searched first:

AAFFFFFF NNNFFFFF

and then it is merged by shifting forward to the point where the last allocation left off:

AANNNFFF FFFFFFFF

but if there's a hole in between, no merging takes place:

AAFFFFFF HHHHHHHH NNNFFFFF

so you lose some amount of memory, but the size of the wasted fragments is below the page size (4096 bytes in most cases). due to allocation order you can lose multiple fragments per hole, but if the allocations are quasi-random (i.e. no crafted sequence of worst case sizes) the number should stay low and there are not that many holes.
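The diagrams above can be replayed with a toy model. This is only a sketch of the merging behaviour described in this reply, not the kernel's __alloc_bootmem_core(), which works on a page bitmap and also handles alignment and a goal address. The class name `ToyBootmem`, the 8-unit page size (chosen to match the diagrams) and the `wasted` counter are all assumptions made for illustration.

```python
PAGE = 8  # units per page, matching the 8-character pages in the diagrams


class ToyBootmem:
    def __init__(self, layout):
        # One character per page: 'F' = usable and untouched, 'H' = hole.
        self.pages = list(layout)
        self.last_end = 0  # offset just past the previous allocation
        self.wasted = 0    # units lost in partial pages that could not be merged

    def alloc(self, size):
        npages = -(-size // PAGE)  # ceiling division
        # Search for the first run of untouched usable pages that is big enough.
        for first in range(len(self.pages) - npages + 1):
            if all(p == "F" for p in self.pages[first:first + npages]):
                break
        else:
            return None  # no suitable run found
        start = first * PAGE
        tail = -self.last_end % PAGE  # free units left in the last partial page
        if tail and self.last_end + tail == start:
            start = self.last_end     # merge: shift back to where we left off
        else:
            self.wasted += tail       # a hole is in between: the tail is lost
        end = start + size
        for p in range(start // PAGE, (end - 1) // PAGE + 1):
            self.pages[p] = "A"
        self.last_end = end
        return start


# Contiguous pages: the second allocation is merged into page i.
merged = ToyBootmem("FFFF")
print(merged.alloc(2), merged.alloc(3), merged.wasted)

# A hole after page i: no merging, and the rest of page i is lost.
holed = ToyBootmem("FHF")
print(holed.alloc(2), holed.alloc(3), holed.wasted)
```

In the second case the loss is the unused tail of one page, which is always smaller than a page, so on its own it cannot account for a failed 64MB allocation.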

Ok - figured it out.

August 4, 2008 - 9:04pm
Anonymous (not verified)

Ok - figured it out. I added kernel parameter swiotlb=16 and all is well.
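A quick arithmetic check shows why this parameter is relevant. The software IO TLB window printed by the working 2.6.23 boot is exactly the size of the allocation that fails on 2.6.24. The slab constants below are assumptions (a 2 KiB slab and rounding up to 128-slab segments, as recalled from lib/swiotlb.c) and should be checked against the actual kernel source. The helper `swiotlb_bytes` is hypothetical.

```python
# Assumed constants, not taken from this thread: verify against lib/swiotlb.c.
SLAB_BYTES = 1 << 11   # assumed IO_TLB_SHIFT = 11, i.e. 2 KiB per slab
SEGMENT_SLABS = 128    # assumed IO_TLB_SEGSIZE

failed_alloc = 67108864             # from "low bootmem alloc of 67108864 bytes failed!"
window = 0x750ce000 - 0x710ce000    # from "Placing software IO TLB between ..."


def swiotlb_bytes(nslabs):
    """Bounce buffer size for a given slab count, under the assumptions above."""
    rounded = -(-nslabs // SEGMENT_SLABS) * SEGMENT_SLABS  # round up to a segment
    return rounded * SLAB_BYTES


print(window == failed_alloc, failed_alloc // (1 << 20), "MiB")
print(swiotlb_bytes(16) // 1024, "KiB requested with swiotlb=16 (under the assumed constants)")
```

If those constants hold, swiotlb=16 asks for a few hundred KiB of low memory instead of 64MB, which would explain why the boot-time allocation then succeeds.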
