Re: bug in ext3 code causing OOM error on systems with small memory

Previous thread: e2fsck doesn't repair broken ext4 by Jörg on Friday, March 12, 2010 - 1:54 pm. (5 messages)

Next thread: [PATCH] memory leakage in ext4_ext_zeroout by jing zhang on Friday, March 12, 2010 - 11:33 pm. (2 messages)
From: Andrew Morton
Date: Friday, March 12, 2010 - 2:57 pm

(cc's added)

On Sat, 6 Mar 2010 10:31:07 +0100

Thanks.  Here's Frans's patch:

--- a/fs/ext3/balloc.c~a
+++ a/fs/ext3/balloc.c
@@ -1581,6 +1581,8 @@ retry_alloc:
 		gdp = ext3_get_group_desc(sb, group_no, &gdp_bh);
 		if (!gdp)
 			goto io_error;
+		if (!gdp->bg_free_blocks_count)
+			continue;
 		free_blocks = le16_to_cpu(gdp->bg_free_blocks_count);
 		/*
 		 * skip this group if the number of
_


where

	free_blocks = le16_to_cpu(gdp->bg_free_blocks_count);

so this is zero.  We set my_rsv to NULL and then fail to do the
`free_blocks <= (windowsz/2)' check, which would have caused the
bailout.


Is anyone interested in taking a look?
--

From: Jan Kara
Date: Monday, March 15, 2010 - 11:43 am

Hi,

  I'd just add a comment why this check is needed but otherwise the patch
looks fine. Maybe I'd just use free_blocks in the check. I know that
zero-check works fine even with disk-endian value but still... And I agree
that the Mingming's patch probably caused the regression.
  Frans, do you agree with the patch below and can I add you Signed-off-by
to it (see Documentation/SubmittingPatches)?

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR
---

From 0e7e5dd29c072fa7afe0a25d64d41682a07d7dff Mon Sep 17 00:00:00 2001
From: Frans van de Wiel <fvdw@fvdw.eu>
Date: Mon, 15 Mar 2010 19:29:34 +0100
Subject: [PATCH] ext3: Avoid loading bitmaps for full groups during block allocation

There is no point in loading bitmap for groups which are completely full.
This causes noticeable performance problems (and memory pressure) on small
systems with large full filesystem
(http://marc.info/?l=linux-ext4&m=126843108314310&w=2).

Jan Kara: Added a comment and changed check to use cpu-endian value.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext3/balloc.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/fs/ext3/balloc.c b/fs/ext3/balloc.c
index 161da2d..c0980fc 100644
--- a/fs/ext3/balloc.c
+++ b/fs/ext3/balloc.c
@@ -1583,6 +1583,12 @@ retry_alloc:
 			goto io_error;
 		free_blocks = le16_to_cpu(gdp->bg_free_blocks_count);
 		/*
+		 * skip this group (and avoid loading bitmap) if there
+		 * are no free blocks
+		 */
+		if (!free_blocks)
+			continue;
+		/*
 		 * skip this group if the number of
 		 * free blocks is less than half of the reservation
 		 * window size.
-- 
1.6.4.2

--

From: Frans van de Wiel
Date: Tuesday, March 16, 2010 - 12:50 pm

Dear Jan, Andrew

The patch looks fine to me, if you say using free_blocks is better in the if 
statement I believe you, as said I am not a very experienced C programmer.
I just used "common sense" to locate this loop causing problems on my 
system.
I will sign it off as you requested and double check it in the weekend by 
compiling the kernel again with this patch.

PS there is one thing, think a similar patch is required in balloc.c in 
fs/ext2 as well.
There is the same loop only it does not cause on OOM error but it 
significantly delays the creation of a sub folder (25 seconds  on my disk of 
500 GB, with the patch its done it less then a second)

kind regards, Frans van de Wiel

--------------------------------------------------
From: "Jan Kara" <jack@suse.cz>
Sent: Monday, March 15, 2010 7:43 PM
To: "Andrew Morton" <akpm@linux-foundation.org>
Cc: "Frans van de Wiel" <fvdw@fvdw.eu>; <adilger@sun.com>; 
<linux-ext4@vger.kernel.org>; "Mingming Cao" <cmm@us.ibm.com>; "Jan Kara" 
<jack@suse.cz>
--

Previous thread: e2fsck doesn't repair broken ext4 by Jörg on Friday, March 12, 2010 - 1:54 pm. (5 messages)

Next thread: [PATCH] memory leakage in ext4_ext_zeroout by jing zhang on Friday, March 12, 2010 - 11:33 pm. (2 messages)