[PATCH] ext4: memory leakage in ext4_mb_init()

From: jing zhang
Date: Sunday, March 21, 2010 - 7:01 am

From: Jing Zhang <zj.barak@gmail.com>

Date: Sun Mar 21 21:59:35     2010

Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Andreas Dilger <adilger@sun.com>
Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Jing Zhang <zj.barak@gmail.com>

---

--- linux-2.6.32/fs/ext4/mballoc.c	2009-12-03 11:51:22.000000000 +0800
+++ ext4_mm_leak/mballoc3.c	2010-03-21 21:37:18.000000000 +0800
@@ -2360,6 +2360,24 @@ err_freesgi:
 	return -ENOMEM;
 }

+static void ext4_mb_destroy_backend(struct super_block *sb)
+{
+	ext4_group_t ngroups = ext4_get_groups_count(sb);
+	ext4_group_t i;
+	struct ext4_sb_info *sbi = EXT4_SB(sb);
+	int j;
+	int num_meta_group_infos = (ngroups + EXT4_DESC_PER_BLOCK(sb) -1)
+					>> EXT4_DESC_PER_BLOCK_BITS(sb);
+	for (i = 0; i < ngroups; i++)
+		kfree(ext4_get_group_info(sb, i));
+
+	for (j = 0; j < num_meta_group_infos; j++)
+		kfree(sbi->s_group_info[j]);
+
+	kfree(sbi->s_group_info);
+	iput(sbi->s_buddy_cache);
+}
+
 int ext4_mb_init(struct super_block *sb, int needs_recovery)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(sb);
@@ -2419,6 +2437,7 @@ int ext4_mb_init(struct super_block *sb,
 	if (sbi->s_locality_groups == NULL) {
 		kfree(sbi->s_mb_offsets);
 		kfree(sbi->s_mb_maxs);
+		ext4_mb_destroy_backend(sb);
 		return -ENOMEM;
 	}
 	for_each_possible_cpu(i) {
--

From: tytso
Date: Sunday, March 21, 2010 - 6:27 pm

It would be better if this could be done by making ext4_mb_release()
more flexible, and then calling ext4_mb_release() if there is an error
setting up the data structures in ext4_mb_init().

						- Ted
--

From: jing zhang
Date: Tuesday, March 23, 2010 - 5:47 am

Yeah, Ted, going through ext4_mb_release() is clearer.

---

diff --git a/linux-2.6.32/fs/ext4/mballoc.c b/ext4_mm_leak/mballoc3.c
index bba1282..99ca2de 100644
--- a/linux-2.6.32/fs/ext4/mballoc.c
+++ b/ext4_mm_leak/mballoc3.c
@@ -2417,8 +2417,7 @@ int ext4_mb_init(struct super_block *sb, int
needs_recovery)

 	sbi->s_locality_groups = alloc_percpu(struct ext4_locality_group);
 	if (sbi->s_locality_groups == NULL) {
-		kfree(sbi->s_mb_offsets);
-		kfree(sbi->s_mb_maxs);
+		ext4_mb_release(sb);
 		return -ENOMEM;
 	}
 	for_each_possible_cpu(i) {
@@ -2511,7 +2510,8 @@ int ext4_mb_release(struct super_block *sb)
 				atomic_read(&sbi->s_mb_discarded));
 	}

-	free_percpu(sbi->s_locality_groups);
+	if (sbi->s_locality_groups)
+		free_percpu(sbi->s_locality_groups);
 	if (sbi->s_proc)
 		remove_proc_entry("mb_groups", sbi->s_proc);
--

From: Aneesh Kumar K. V
Date: Friday, March 26, 2010 - 1:57 am

We may want to make sure that we can safely call ext4_mb_release that
early. what i would suggest is to move s_locality_group allocation

-aneesh
--

From: jing zhang
Date: Friday, March 26, 2010 - 7:40 am

I will try what you suggest next week.

--

From: jing zhang
Date: Sunday, March 28, 2010 - 1:13 am

cool idea, Aneesh, and I got it.

---

--- old-linux-2.6.32/fs/ext4/mballoc.c	2009-12-03 11:51:22.000000000 +0800
+++ ext4_mm_leak/mballoc3-2.c	2010-03-28 16:13:20.000000000 +0800
@@ -2397,11 +2397,27 @@ int ext4_mb_init(struct super_block *sb,
 		i++;
 	} while (i <= sb->s_blocksize_bits + 1);

+	sbi->s_locality_groups = alloc_percpu(struct ext4_locality_group);
+	if (sbi->s_locality_groups == NULL) {
+		kfree(sbi->s_mb_offsets);
+		kfree(sbi->s_mb_maxs);
+		return -ENOMEM;
+	}
+	for_each_possible_cpu(i) {
+		struct ext4_locality_group *lg;
+		lg = per_cpu_ptr(sbi->s_locality_groups, i);
+		mutex_init(&lg->lg_mutex);
+		for (j = 0; j < PREALLOC_TB_SIZE; j++)
+			INIT_LIST_HEAD(&lg->lg_prealloc_list[j]);
+		spin_lock_init(&lg->lg_prealloc_lock);
+	}
+
 	/* init file for buddy data */
 	ret = ext4_mb_init_backend(sb);
 	if (ret != 0) {
 		kfree(sbi->s_mb_offsets);
 		kfree(sbi->s_mb_maxs);
+		free_percpu(sbi->s_locality_groups);
 		return ret;
 	}

@@ -2415,20 +2431,6 @@ int ext4_mb_init(struct super_block *sb,
 	sbi->s_mb_order2_reqs = MB_DEFAULT_ORDER2_REQS;
 	sbi->s_mb_group_prealloc = MB_DEFAULT_GROUP_PREALLOC;

-	sbi->s_locality_groups = alloc_percpu(struct ext4_locality_group);
-	if (sbi->s_locality_groups == NULL) {
-		kfree(sbi->s_mb_offsets);
-		kfree(sbi->s_mb_maxs);
-		return -ENOMEM;
-	}
-	for_each_possible_cpu(i) {
-		struct ext4_locality_group *lg;
-		lg = per_cpu_ptr(sbi->s_locality_groups, i);
-		mutex_init(&lg->lg_mutex);
-		for (j = 0; j < PREALLOC_TB_SIZE; j++)
-			INIT_LIST_HEAD(&lg->lg_prealloc_list[j]);
-		spin_lock_init(&lg->lg_prealloc_lock);
-	}

 	if (sbi->s_proc)
--

From: tytso
Date: Saturday, April 3, 2010 - 9:53 am

Hi Jing,

If you're wondering why I'm taking a long time to respond to your
patches, it is because they are very ill-formed.  They don't conform
to the submitting patches guidelines, the patch comments don't
adequately explain the why the patch is needed, and what testing has
been done, and you tend to throw in patches that aren't correctly
submitted in the middle of comments, and in some cases it's not clear
whether this patch is suppose to be in addition to the previous patch,
and combined into a separate commit, or kept as two separate patches,
etc.

As a result, it takes, much, MUCH, MUCH longer for me to review the
patches for correctness.  I will eventually get to them, but I may end
up working on other patches which are better formed and easier for me
to evaluate for correctness and quality.

If you do submit new patches, especially in this thread where you have
already submitted so many different patches, I would appreicate it if
you could explicitly state that a particular patch has been superceded
by another, or has been withdrawn.  I will try to keep score on the
patchwork web site (for example, I rejected your bb_free_cache patch),
but you've been so prolific with patches, some of which haven't been
very well explained, that I may have lost track of all of your
submissions.   

Please bear with me.

Best regards,

					- Ted




--

From: jing zhang
Date: Saturday, April 3, 2010 - 6:05 pm

Sorry again, Ted.

There are so much for a newbie to learn to do cool work, especially to
meet your requirements. I am happy, any way, to patch ext4 under your

Without the cool git, though I am learning how to take advantage of
it, I could not manage all the patches delivered. In fact, I dig the
patches with UltraEdit for modifying the C code, Cygwin for git and
diff -Npu, and virtual machine for compiling. My kid, 11 years old
boy, has to share the HP notebook with me playing games.

And please laugh at me, I am not living in stone age.

I resolve conflicts and dependencies between patches in the way that
they are carried out by independent guys, since I am told that git is
cool enough. But indeed I created so many hard work for you, sorry.

Is it possible for me to patch not based upon the stock version I
downloaded at kernel.org, but upon the patched version, say the latest

No problem. I like ext4, maybe still the native file system of GNU
Linux, and the cool guys such as Alen Cox, Andy Clin, Ingo Molnar,
Rusty Russell, Jens Axboe, Alexey N. Kuznetsov (where r u now, ANK?),
Mike Haertel, Avi Kivity, Ted, Reiser, and their cool works which help
many newbies to understand Linux, to apply the cool ideas and methods
they learned in Linux to their everyday work, to earn bread and salt
in this hard time.

One of the amazing cools of Linux, I believe, besides free, is to
change ideas and lives of individuals, and to change what should be
changed in the society in which individuals live.

I am being changed to do patch in appreciated way.

Thanks
--

From: tytso
Date: Sunday, April 4, 2010 - 11:08 am

How much testing are you doing before submitting patches, out of

Having independent patches is actually better --- but I think you're
misunderstanding what I was complaining about before.  Patches should
that are accepted into mainline should do one and only one thing.  So
if someone suggests that you make changes to your submitted patch,
ideally what you should do is to resubmit the patch with the fixes ---
and not submit a patch which is a delta to the previous one.

This is especially true if the original patch is buggy; one of the
things we try very hard to maintain is that the kernel tree compile
cleanly, and pass the regression test suite, between every single
commit.  In other words, we try to avoid knowingly introducing a
regression in a patch and fixing it in a subsequent patch.  This
allows things like "git bisect" to work, and it also makes it easier
for people to look at the commit history to understand why certain
changes were made, and especially when trying to find how a bug was
introduced into ext4.  Ultimately, this is about keeping the kernel
source code easily maintainable.  This means that incrased code
complexity has to be justified, and code and code changes have to be
meticulously documented.

Best regards,

					- Ted
--

From: jing zhang
Date: Sunday, April 4, 2010 - 8:53 pm

Yes, Ted, it is curiosity that drives me to do hard works, including patch ext4.
I already told you that I am happy to patch ext4.
I also like to share/deliver my patches to the developers and
maintainers of ext4, simply because I find something not correct or
perfect in the version I got at kernel.org, and because I am not 100%
sure. And review is necessary, as described in Andi Kleen's "On
submitting kernel patches", which I read last night. And I got that on
cmdline,

           git --stat --summary  a/x.c  b/x.c  >  x-01.diff

may get nice output, correct?

It is hard work for me, a newbie, though curious, to do perfect patch,
and I try my best to describe the patch in English, which is not my
native language but one of the most beautiful languages on earth.

What is behind my curiosity is the belief that I will be freed both by
Jesus in West and by Buddha in East, today or tomorrow, if I deliver
what I can to those who need.

And after operations on cmdline, I compile the modified, modprobe, dd,

Whatever you complain, I try to be a good listener every time:)
GNU Linux is free, is MIT free?
And I am free to change, to be responsible for what I did, or maybe

I think what is called ext4 will change, and more will be freed by the
change, since it is under your maintenance, at least currently, a
model of maintainer, especially of strict requirement and kind
patience to patches received.

Good weekend, and please review my new patches next week.

Thank you, Ted.

                       - zj
--

From: Eric Sandeen
Date: Sunday, April 4, 2010 - 9:27 pm

It is the language barrier that is making some of this difficult,
but I'm not complaining - you speak English much better than I speak any
second language.  :)

Ted meant that -he- was curious about how much testing you were doing.


More testing than this would be good; dd is very minimal.

One of our new standard tests for et4 is the xfstests test suite from 
http://git.kernel.org/?p=fs/xfs/xfstests-dev.git;a=summary

It is a collection of many tests developed for xfs, but many tests are
generic and can run on ext4 as well.  I would suggest that after you have
several patches ready, you should at least run through the tests in this
collection.  It won't catch every mistake but it runs a large variety of 
tests, much more stressful than dd.

Thanks for your email, and thanks for clearly spending time looking for
ways to improve ext4.  I think that with practice, you will be a good 
contributor.

Ted can certainly be a patient maintainer - read his suggestions and the
kernel patch submission guidelines, and I think you will get better at this.

Do your best to explain the reasons for your patches, and any testing you
have done, and describe any test which can show a bug that you find - 
and we can help to clarify changelogs if they need it.

-Eric
--

From: jing zhang
Date: Sunday, April 4, 2010 - 9:51 pm

You are good guy:) Is English your native language? And, I am curious,



I am not good at testing, partially because it is hard to setup the
required environment, sometimes several hard disks are needed, maybe a
few boxes, but I try to analyse the C code while reading and
understanding the works by great maintainers and developers of Linux
kernel.

                - zj
--

From: Eric Sandeen
Date: Sunday, April 4, 2010 - 9:59 pm

No, because I am a native English speaker and I understood the 

Testing really is critical to development; some things can be done by
inspection, but if you don't test it is hard to know if you made a
mistake.

You can always test inside a vm, or on a loopback file, on a single box.

Without testing, you are asking others to do testing for you
(unless the change is so obvious that it can be trusted) 



--

From: jing zhang
Date: Sunday, April 4, 2010 - 10:08 pm

Before linux-2.6.32 is released, would you like tell me, how is ext4 tested?
Is tough testing able to catch all bugs?

      - zj
--

From: tytso
Date: Monday, April 5, 2010 - 5:42 am

For one thing, we use the XFSQA (aka xfstests) test suite.  I usually
run the "quick" set of tests between each commit, and I run a much
longer "auto" test suite before submitting a set of patches to Linus.

For specific patches, we also like to be able to test it specifically.
So often we'll create a small filesystem image and then see what the
file system does with and without the patch.  For example, if we're
worried that a specific file system corruption will cause a BUG_ON,
we'll create a small file system, corrupt it using the debugfs tool,
and then demonstrate that without the patch, we get the BUG_ON.  With
the patch applied, the BUG_ON should go away.

So the patch-specific tests are there so we can make sure the patch
does what we think it should, and it fixes the problem it claims to
fix.  The XFSQA test suite is to there to make sure that we don't
accidentally introduce bugs while trying to improve the file system.

And of course, the final safeguard is patch review by others.  This is
where decent comments and making sure the code is maintainable is
critically important.  Since this takes other developer's time,
especially senior developers like Eric and myself's time, which is
rather valuable and hard to find, it's why patch submitters need
shoulder more of the work in terms of refining patches and
resubmitting new versions of the patches.

						- Ted

P.S.  Yes, Eric was absolutely correct.  I wanted to know how much
testing you had been doing on your patches before submitting them.  At
least few of your patches have caused me to wonder whether the patches
had gone through sufficient testing.
--

From: jing zhang
Date: Tuesday, April 6, 2010 - 6:43 am

First, Mr. Theodore Ts'o, your reply is a great honor to me.

And I learn more about ext4 and its maintainers and developers,
specially that testing is important to ext4. Thanks.

In my view, one of the important parts of a patch is that the patcher
is really concerning what is patched, another is whether the questions
listed in the patch do exist, and whether the solution, if provided by
the patcher, is correct.

Even if the provide solution is ok, I think, it is the privilege,
responsibility and duty of maintainers to execute the final
submitting, and therefore it is not the duty of patchers to make sure
that the solution is 100% correct and perfect, testing is good or not.

I like to be a maintainer of some subsystem of GNU Linux, since it is
my shame that end users experience panic, crash, bug, bug_on caused by
what is under my maintenance, that end users are treated as rats and
monkeys in laboratory even though I am not paid by any end user, and
the nice reputations of distributors, say red hat, of service
providers, say IBM.

To be a maintainer, I will apologize on my homepage once a patch
received, if the questions listed in the patch do exist, simply

I still concern how to correctly measure the lost, in the next half of
2010, of time and value of end users, say 100, caused by what is buggy

Do you like opera, Mr. Theodore Ts'o?

If you like, I invite you Beijing Opera and tea, and I pay the air
ticket, double trip, from Los Angeles to Beijing. And please info me

If end users of ext4 can not be treated, for any reason, as rats and
monkeys, lets go ahead, please, focusing upon patch.

           - zj
--

From: tytso
Date: Tuesday, April 6, 2010 - 7:21 am

Well, the problem is there are many more patch submitters than there
are maintainers, and so your proposal simply doesn't scale.  Consider
that for some maintainers, there may be 10 or 20 or 30 or more patch
submitters in their subsystem.  With that kind of
submitter-to-maintainer ratio, the patch submitter simply has to do
much more of the work, since otherwise the subsystem maintainer simply
can't keep up.

Most maintainers actually spend much less time dealing with patch
submitters than I have invested with you.  They simply send a NACK,
and maybe 2-3 sentences explaning of what still needs fixing, and then
it's up to the patch submitter to resubmit the patch.  Some patches
are resubmitted 5, 6 times before the maintainer finally accepts it.

I happen to believe that we need to encourage newcomers to the kernel
developer community, and so I spend more time mentoring people who are
new to the process.  But at the end of the day, I have only so many
hours that I can spend on newcomers, and so after a while, I will
start serving other patch submitters who are more capable of providing

Sure, but that means the patches which are submitted has to be high
quality.  If I get half-tested patches, or patches where you can't
give me a reproduction case that demonstrates the BUG_ON, and it's not
obvious whether or not the patch is safe, I have to worry about the
possibility that applying your patch may make the the file system more
likely to crash, not less.

Ext4 is actually quite stable at this point.  Very large numbers of
people are using it, and most users are quite happy.  So at this
point, I have to weigh the risk that a patch might introduce a bug
against the claim that it fixes the bug --- especially if the patch
submitter can't explain how the BUG_ON might have been triggered
without their patch, and can't explain to me how much testing he or
she has done before sumitting the patch to me.

As far as your concern about end users getting treated as "rats and
monkeys", ...
From: jing zhang
Date: Sunday, April 4, 2010 - 10:18 pm

And I am also curious about what educational degree did you win, Eric,
on the language of English, at Harvard University?

          - zj
--

From: tytso
Date: Monday, April 5, 2010 - 5:43 am

Jing,

That's uncalled for.  Please be polite.  Eric is a native speaker of
English, and he quite correctly interpreted my question.

Thanks,

	     	      			    - Ted
--

From: Aneesh Kumar K. V
Date: Friday, March 26, 2010 - 1:54 am

A better fix would be move s_locality_group allocation before
ext4_mb_init_backend

-aneesh
--