[2.6.36-rc1] List corruption when using initrd.

From: Tetsuo Handa
Date: Monday, August 16, 2010 - 10:17 pm

Hello.

I get a list_add corruption message when booting with an initrd on Debian Sarge.
Config is at http://I-love.SAKURA.ne.jp/tmp/config-2.6.36-rc1-2 .

[    7.140845] VFS: Mounted root (cramfs filesystem) readonly on device 1:0.
[    7.192635] mount used greatest stack depth: 2004 bytes left
[    7.212497] linuxrc used greatest stack depth: 1572 bytes left
[    7.214451] debug: unmapping init memory c158d000..c1751000
[    7.218958] Write protecting the kernel text: 3648k
[    7.220230] Write protecting the kernel read-only data: 1700k
[    7.222770] ------------[ cut here ]------------
[    7.223823] WARNING: at fs/inode.c:692 unlock_new_inode+0x78/0xc0()
[    7.225249] Hardware name: VMware Virtual Platform
[    7.228818] Modules linked in:
[    7.229668] Pid: 1, comm: swapper Not tainted 2.6.36-rc1 #1
[    7.230963] Call Trace:
[    7.231535]  [<c103dbe8>] ? printk+0x18/0x20
[    7.232478]  [<c10d8778>] ? unlock_new_inode+0x78/0xc0
[    7.233668]  [<c103d18c>] warn_slowpath_common+0x7c/0xa0
[    7.234882]  [<c10d8778>] ? unlock_new_inode+0x78/0xc0
[    7.237936]  [<c103d24d>] warn_slowpath_null+0x1d/0x40
[    7.239257]  [<c10d8778>] unlock_new_inode+0x78/0xc0
[    7.240411]  [<c10d8d5e>] ? iget_locked+0x2e/0x50
[    7.241590]  [<c113c219>] get_cramfs_inode+0x49/0x80
[    7.242731]  [<c113cad6>] cramfs_lookup+0x196/0x1c0
[    7.245605]  [<c10d6736>] ? d_lookup+0x26/0x50
[    7.246633]  [<c10cd017>] do_lookup+0x137/0x1b0
[    7.247717]  [<c10ce887>] do_last+0x67/0x450
[    7.248695]  [<c10cee5d>] do_filp_open+0x1ed/0x500
[    7.249804]  [<c10acc55>] ? __get_user_pages+0xe5/0x2d0
[    7.250996]  [<c10ace92>] ? get_user_pages+0x52/0x60
[    7.252087]  [<c11d39ac>] ? _copy_from_user+0x3c/0x70
[    7.253266]  [<c10c8818>] ? put_arg_page+0x8/0x10
[    7.256494]  [<c10c8be4>] ? copy_strings+0x194/0x1b0
[    7.257644]  [<c10c8f90>] open_exec+0x30/0xe0
[    7.258665]  [<c10fcce2>] load_script+0x1c2/0x220
[    7.259757]  [<c106a000>] ? trace_hardirqs_off_caller+0xf0/0x110
[    ...
From: Tetsuo Handa
Date: Tuesday, August 17, 2010 - 12:51 am

Bisection completed.

commit 7e496299d4d2ad8083effed6c5a18313a919edc6
tmpfs: make tmpfs scalable with percpu_counter for used blocks

Regards.
--

From: Hugh Dickins
Date: Tuesday, August 17, 2010 - 3:23 pm

Thanks for reporting and bisecting.  Certainly there's a bug in shmem
(that we ought to have caught long before it reached 36-rc1: sorry),
and it is probably the cause of your crashes; but it's possible there's
a similar bug elsewhere too, something else messing up the percpu_counters
list - so please check if the patch below really does fix it for you.
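
To illustrate the failure mode described above, here is a minimal userspace
sketch, not kernel code. It assumes a counter that links itself onto a global
list when initialised, as percpu_counter_init() does on kernels that keep a
percpu_counters list. All names below (struct counter, counter_init,
counter_destroy, reuse_memory, later_init_succeeds) are hypothetical stand-ins,
and the memset() merely simulates the freed memory being reused by someone else.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Minimal doubly linked list, loosely modelled on the kernel's list_head. */
struct list_node {
	struct list_node *next, *prev;
};

/* Hypothetical counter that registers itself on a global list. */
struct counter {
	long count;
	struct list_node link;
};

static struct list_node counters = { &counters, &counters };

/* Insert at the head, but first verify the neighbours still agree with
 * each other, similar in spirit to the kernel's debug list_add check. */
static bool list_add_checked(struct list_node *new, struct list_node *head)
{
	struct list_node *prev = head, *next = head->next;

	if (next->prev != prev || prev->next != next)
		return false;	/* the kernel would warn about list_add corruption */
	next->prev = new;
	new->next = next;
	new->prev = prev;
	prev->next = new;
	return true;
}

static void list_del_node(struct list_node *n)
{
	assert(n->next && n->prev);
	n->next->prev = n->prev;
	n->prev->next = n->next;
	n->next = n->prev = n;
}

static bool counter_init(struct counter *c)
{
	c->count = 0;
	return list_add_checked(&c->link, &counters);
}

static void counter_destroy(struct counter *c)
{
	list_del_node(&c->link);
}

/* Simulates the counter's memory being recycled after its owner is freed. */
static void reuse_memory(struct counter *c)
{
	memset(c, 0, sizeof(*c));
}

/* Returns true if a second counter can still be registered after the
 * first counter's memory has been recycled. */
static bool later_init_succeeds(bool destroy_first)
{
	struct counter a, b;

	counters.next = counters.prev = &counters;	/* start from an empty list */
	if (!counter_init(&a))
		return false;
	if (destroy_first)
		counter_destroy(&a);	/* unlink before the memory goes away */
	reuse_memory(&a);
	return counter_init(&b);
}
```

In this sketch, later_init_succeeds(true) unlinks the first counter before its
memory is recycled, so the second registration passes the check.
later_init_succeeds(false) leaves a stale node on the list, and the next
insertion notices that the neighbour's back pointer no longer matches.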

I notice your dmesg also showed fs/inode.c:692 unlock_new_inode()
warnings from get_cramfs_inode(): those would be unrelated, and
probably from Al Viro's recent changes in cramfs/inode.c - Cc'ed.

Thanks,
Hugh


[PATCH] shmem: put_super must percpu_counter_destroy

list_add() corruption messages reported from shmem_fill_super()'s recently
introduced percpu_counter_init(): shmem_put_super() needs to remember to
percpu_counter_destroy().  And also check error from percpu_counter_init().

Reported-and-bisected-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Hugh Dickins <hughd@google.com>
---

 mm/shmem.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- 2.6.36-rc1/mm/shmem.c	2010-08-16 00:18:01.000000000 -0700
+++ linux/mm/shmem.c	2010-08-17 14:42:56.000000000 -0700
@@ -2325,7 +2325,10 @@ static int shmem_show_options(struct seq
 
 static void shmem_put_super(struct super_block *sb)
 {
-	kfree(sb->s_fs_info);
+	struct shmem_sb_info *sbinfo = SHMEM_SB(sb);
+
+	percpu_counter_destroy(&sbinfo->used_blocks);
+	kfree(sbinfo);
 	sb->s_fs_info = NULL;
 }
 
@@ -2367,7 +2370,8 @@ int shmem_fill_super(struct super_block
 #endif
 
 	spin_lock_init(&sbinfo->stat_lock);
-	percpu_counter_init(&sbinfo->used_blocks, 0);
+	if (percpu_counter_init(&sbinfo->used_blocks, 0))
+		goto failed;
 	sbinfo->free_inodes = sbinfo->max_inodes;
 
 	sb->s_maxbytes = SHMEM_MAX_BYTES;
--
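
The two hunks above follow a common pairing rule: check the result of an init
call, and undo it in the matching teardown path. A minimal userspace sketch of
that pattern follows. It is not the shmem code; struct counter, struct sb_info,
struct super, fill_super and put_super are hypothetical names, and the rule
that zero slots is an error exists only to make the failure path easy to
exercise.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical counter that owns an allocation. */
struct counter {
	long *slots;
};

static int counter_init(struct counter *c, size_t nslots)
{
	if (nslots == 0)
		return -EINVAL;	/* assumption: used here to force a failure */
	c->slots = calloc(nslots, sizeof(*c->slots));
	return c->slots ? 0 : -ENOMEM;
}

static void counter_destroy(struct counter *c)
{
	free(c->slots);
	c->slots = NULL;
}

struct sb_info {
	struct counter used_blocks;
};

struct super {
	struct sb_info *info;
};

/* Set up the private info; on any failure nothing stays allocated. */
static int fill_super(struct super *sb, size_t nslots)
{
	struct sb_info *info = calloc(1, sizeof(*info));
	int err;

	sb->info = NULL;
	if (!info)
		return -ENOMEM;

	err = counter_init(&info->used_blocks, nslots);
	if (err)
		goto failed;	/* do not ignore the init result */

	sb->info = info;
	return 0;

failed:
	free(info);
	return err;
}

/* Tear down in the reverse order of fill_super(). */
static void put_super(struct super *sb)
{
	struct sb_info *info = sb->info;

	if (!info)
		return;
	counter_destroy(&info->used_blocks);	/* destroy before freeing the owner */
	free(info);
	sb->info = NULL;
}
```

A caller would pair the two: fill_super(&sb, 4) followed later by
put_super(&sb). Calling fill_super(&sb, 0) returns an error and leaves
sb.info as NULL, so put_super() is then a no-op.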

From: Tetsuo Handa
Date: Tuesday, August 17, 2010 - 6:13 pm

This patch resolved the list_add() corruption messages.
The unlock_new_inode() warnings remain after applying this patch.

Thank you.
--

From: Hugh Dickins
Date: Tuesday, August 17, 2010 - 8:28 pm

That's good, but puzzling.  I'll mention it in the patch comment, since

Bigger thanks to you.  Patch to Linus follows.

Hugh
--

From: Hugh Dickins
Date: Tuesday, August 17, 2010 - 8:32 pm

list_add() corruption messages reported from shmem_fill_super()'s recently
introduced percpu_counter_init(): shmem_put_super() needs to remember to
percpu_counter_destroy().  And also check error from percpu_counter_init().

Reported to fix oopses in __free_pipe_info() but I cannot work that out!

Reported-and-bisected-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Hugh Dickins <hughd@google.com>
Tested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
---

 mm/shmem.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- 2.6.36-rc1/mm/shmem.c	2010-08-16 00:18:01.000000000 -0700
+++ linux/mm/shmem.c	2010-08-17 14:42:56.000000000 -0700
@@ -2325,7 +2325,10 @@ static int shmem_show_options(struct seq
 
 static void shmem_put_super(struct super_block *sb)
 {
-	kfree(sb->s_fs_info);
+	struct shmem_sb_info *sbinfo = SHMEM_SB(sb);
+
+	percpu_counter_destroy(&sbinfo->used_blocks);
+	kfree(sbinfo);
 	sb->s_fs_info = NULL;
 }
 
@@ -2367,7 +2370,8 @@ int shmem_fill_super(struct super_block
 #endif
 
 	spin_lock_init(&sbinfo->stat_lock);
-	percpu_counter_init(&sbinfo->used_blocks, 0);
+	if (percpu_counter_init(&sbinfo->used_blocks, 0))
+		goto failed;
 	sbinfo->free_inodes = sbinfo->max_inodes;
 
 	sb->s_maxbytes = SHMEM_MAX_BYTES;
--

From: Tim Chen
Date: Wednesday, August 18, 2010 - 9:50 am

It was my bad.  I had those two chunks of code in a previous version
of the patch but somehow missed them in the final one.

Acked-by: Tim Chen <tim.c.chen@linux.intel.com>



--
