Re: [PATCH] ext4: BUG_ON triggered in ext4_mb_complex_scan_group()

Previous thread: [Bug 15690] New: system hanged while doing "sync" by bugzilla-daemon on Sunday, April 4, 2010 - 7:20 am. (4 messages)

Next thread: EXT4_IOC_MOVE_EXT file corruption! by Darrick J. Wong on Monday, April 5, 2010 - 3:02 pm. (6 messages)
From: jing zhang
Date: Sunday, April 4, 2010 - 9:30 pm

From: Jing Zhang <zj.barak@gmail.com>

Date: Mon Apr 5 12:10:50     2010

BUG_ON is triggered at [line:1762] in ext4_mb_complex_scan_group(),
and fix is to change the return value of mb_find_extent() if the
needed extent can not be found. The fix can also be applied to other
points where mb_find_extent() is called.

Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Andreas Dilger <adilger@sun.com>
Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Cc: "Aneesh Kumar K. V" <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Jing Zhang <zj.barak@gmail.com>
---
 fs/ext4/mballoc.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index bba1282..fbb6899 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -1321,7 +1321,7 @@ static int mb_find_extent(struct ext4_buddy
*e4b, int order, int block,
 		ex->fe_len = 0;
 		ex->fe_start = 0;
 		ex->fe_group = 0;
-		return 0;
+		return -ENOSPC;
 	}

 	/* FIXME dorp order completely ? */
@@ -1358,6 +1358,8 @@ static int mb_find_extent(struct ext4_buddy
*e4b, int order, int block,
 	}

 	BUG_ON(ex->fe_start + ex->fe_len > (1 << (e4b->bd_blkbits + 3)));
+	if (needed > ex->fe_len)
+		return -ENOSPC;
 	return ex->fe_len;
 }

@@ -1758,7 +1760,8 @@ void ext4_mb_complex_scan_group(struct
ext4_allocation_context *ac,
 			break;
 		}

-		mb_find_extent(e4b, 0, i, ac->ac_g_ex.fe_len, &ex);
+		if (0 > mb_find_extent(e4b, 0, i, ac->ac_g_ex.fe_len, &ex))
+			break;
 		BUG_ON(ex.fe_len <= 0);
 		if (free < ex.fe_len) {
 			ext4_grp_locked_error(sb, e4b->bd_group,
--

From: Eric Sandeen
Date: Sunday, April 4, 2010 - 9:36 pm

Do you have a test case which can trigger this BUG_ON?


--

From: jing zhang
Date: Sunday, April 4, 2010 - 10:01 pm

No I do not have.

But you can check [line: 1762] and [line: 1321].

And please info me the result, if you like, after check.

                - zj
--

From: tytso
Date: Wednesday, April 7, 2010 - 8:31 am

Again, line numbers make it hard to determine what you're talking
about.  And we need a scenario description that might trigger this.
If you can't give it, then my fear is that you don't understand how
the code is used, in which case the risk is extremely high that that
your patch will not be correct, and may in fact may make things worse.

I could spend time trying to reverse engineer the scenario where this
might happen, but the fact of the matter is, no one has reported these
BUG_ON's ever triggering as far as I know.  So when I have free time,
I might get back to looking for them.  Otherwise, I have higher
priority things to work on, and it's going to be up to you to pursuade
me that this could actually happen in real life, and how.

Best regards,

					- Ted
--

Previous thread: [Bug 15690] New: system hanged while doing "sync" by bugzilla-daemon on Sunday, April 4, 2010 - 7:20 am. (4 messages)

Next thread: EXT4_IOC_MOVE_EXT file corruption! by Darrick J. Wong on Monday, April 5, 2010 - 3:02 pm. (6 messages)