Re: ls & flush-btrfs-1 sit at 100% sys

Previous thread: [GIT PULL][PATCH v2 0/6] btrfs: Add lzo compression support by Li Zefan on Wednesday, November 17, 2010 - 7:08 pm. (13 messages)

Next thread: Root fs on raid1 by admin on Wednesday, November 17, 2010 - 11:14 pm. (4 messages)
From: Brian Sullivan
Date: Wednesday, November 17, 2010 - 10:03 pm

I had been running 2.6.32 for many months without any issues.  Btrfs
on top of a raid6 md array.  Filesystem is at 9/11TB used.

I updated to 2.6.34 for a week or so and had no problem.
Updated to 2.6.36 for a few days and no problems.
Updated to 2.6.37 and now I cannot read from the array.

So I boot up to 2.6.37...
I can run btrfsck and it finds no problems.
I can mount the fs, no problem.
I can run df, no problem.
I can cd to the fs, no problem... even a few folders down, as long as
I can remember exact path because...
Soon as I do 'ls' anywhere it gets stuck.

It doesn't return, I check top and both ls and flush-btrfs-1 are
sitting at ~50% sys usage each.

I let the system sit in this state all day today while I was at work
(~9 hours) and ls never returned.  Even while in this state however,
df still is working.  I can still cd to directories (as long as I
recall their exact path).

Not sure what is going on here.
--

From: Chris Ball
Date: Wednesday, November 17, 2010 - 10:15 pm

Hi,

   > It doesn't return, I check top and both ls and flush-btrfs-1 are
   > sitting at ~50% sys usage each.

Does anything new appear in dmesg when the hang happens?  Can you run
alt-sysrq-t (show tasks) and send us the output for the ls process?
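
For anyone following along: if sysrq is enabled, writing `t` to
/proc/sysrq-trigger should have the same effect as alt-sysrq-t, and the
task dump ends up in the kernel log. The helper below is only a rough
sketch for pulling one task's block out of saved dmesg output. The name
`task_blocks` is made up for illustration, and the header pattern is an
assumption based on the dump format shown later in this thread.

```python
import re

# Header line of a task in a sysrq-t dump, e.g.
# "[ 8114.870020] ls            R  running task        0  3438   3375 0x00000004"
# Assumption: timestamp, one space, command name, one-letter state.
# Stack and frame lines have two spaces after the timestamp, so they
# do not match this pattern.
HEADER = re.compile(r"^\[\s*\d+\.\d+\]\s(\S+)\s+([A-Z])\s")

def task_blocks(dmesg_text, comm):
    """Return a list of line-lists, one per task whose command name is comm."""
    blocks, current = [], None
    for line in dmesg_text.splitlines():
        m = HEADER.match(line)
        if m:
            # A new task header ends whatever block was being collected.
            current = [line] if m.group(1) == comm else None
            if current is not None:
                blocks.append(current)
        elif current is not None:
            current.append(line)
    return blocks
```

With the dump saved to a file (say via `dmesg > tasks.txt`, a
hypothetical file name), `task_blocks(open("tasks.txt").read(), "ls")`
would return the header and trace lines for each ls process.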

- Chris.
-- 
Chris Ball   <cjb@laptop.org>
One Laptop Per Child
--

From: Brian Sullivan
Date: Wednesday, November 17, 2010 - 11:03 pm

Nothing shows up in dmesg.

[ 8114.870020] ls            R  running task        0  3438   3375 0x00000004
[ 8114.870020]  ffff88036339dab8 0000000000000086 ffff88036339da60 ffff88036339dfd8
[ 8114.870020]  00000000000139c0 0000000000000000 ffff88036339dfd8 ffff88036339dfd8
[ 8114.870020]  00000000000139c0 ffff88034f670398 ffff88034f6703a0 ffff88034f670000
[ 8114.870020] Call Trace:
[ 8114.870020]  [<ffffffff8159f7b4>] ? schedule+0x224/0x660
[ 8114.870020]  [<ffffffff815a01de>] schedule_timeout+0x19e/0x2e0
[ 8114.870020]  [<ffffffff81057690>] enqueue_task_fair+0x50/0x60
[ 8114.870020]  [<ffffffff8105d550>] enqueue_task+0x70/0xd0
[ 8114.870020]  [<ffffffff8105e9be>] ? try_to_wake_up+0x18e/0x3f0
[ 8114.870020]  [<ffffffff8105ec20>] ? default_wake_function+0x0/0x20
[ 8114.870020]  [<ffffffff815a0196>] ? schedule_timeout+0x156/0x2e0
[ 8114.870020]  [<ffffffff81181399>] ? writeback_inodes_sb_nr_if_idle+0x49/0x70
[ 8114.870020]  [<ffffffffa0e84607>] ? shrink_delalloc+0x127/0x170 [btrfs]
[ 8114.870020]  [<ffffffffa0e84727>] ? reserve_metadata_bytes+0xd7/0x1f0 [btrfs]
[ 8114.870020]  [<ffffffffa0e84913>] ? btrfs_block_rsv_add+0x43/0x60 [btrfs]
[ 8114.870020]  [<ffffffff81085e00>] ? autoremove_wake_function+0x0/0x40
[ 8114.870020]  [<ffffffffa0e8498b>] ? btrfs_trans_reserve_metadata+0x5b/0xa0 [btrfs]
[ 8114.870020]  [<ffffffffa0e9a0be>] ? start_transaction+0xbe/0x210 [btrfs]
[ 8114.870020]  [<ffffffff8116fa80>] ? filldir+0x0/0xf0
[ 8114.870020]  [<ffffffffa0e9a423>] ? btrfs_start_transaction+0x13/0x20 [btrfs]
[ 8114.870020]  [<ffffffffa0e9d3e8>] ? btrfs_dirty_inode+0x98/0x120 [btrfs]
[ 8114.870020]  [<ffffffff8116fa80>] ? filldir+0x0/0xf0
[ 8114.870020]  [<ffffffff81182d9a>] ? __mark_inode_dirty+0x3a/0x200
[ 8114.870020]  [<ffffffff811754f4>] ? touch_atime+0xf4/0x100
[ 8114.870020]  [<ffffffff8116f92c>] ? vfs_readdir+0xcc/0xd0
[ 8114.870020]  [<ffffffff8116f9ba>] ? sys_getdents+0x8a/0xe0
[ 8114.870020]  [<ffffffff815a2515>] ? page_fault+0x25/0x30
[ 8114.870020]  [<ffffffff8100c132>] ? ...
--

From: Daniel J Blueman
Date: Thursday, November 18, 2010 - 4:08 am

Interesting. If you mount the filesystem with 'noatime,nodiratime' or
'ro', does it allow ls to return?
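
Why this might help, going by the ls trace earlier in the thread:
sys_getdents calls touch_atime, which dirties the inode and makes btrfs
start a transaction, so even a plain directory read ends up waiting on a
metadata reservation. With atime updates off (or the fs read-only) that
path should be skipped. Below is a rough Python sketch for checking
which options a mount actually has. The function names and the
/mnt/array mount point are hypothetical, and it assumes only the usual
/proc/mounts field layout.

```python
def mount_options(mounts_text, mount_point):
    """Return the option set for mount_point from /proc/mounts-style text.

    Returns None if the mount point is not listed.
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            # Keep the last match, since a later mount shadows earlier ones.
            found = set(fields[3].split(","))
    return found

def atime_updates_disabled(options):
    """True if reads on this mount should not dirty inodes via atime."""
    # Both a read-only mount and noatime stop atime updates.
    return "ro" in options or "noatime" in options
```

For a live check one could pass in `open("/proc/mounts").read()` and the
real mount point in place of /mnt/array.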

Daniel
-- 
Daniel J Blueman
--

From: Brian Sullivan
Date: Thursday, November 18, 2010 - 11:30 am

Yep actually, with noatime,nodiratime ls is fine.  I didn't try ro but
I assume that'll work too.  So with noatime,nodiratime I can go around
in the tree and ls works.  If I try to touch a new file, touch doesn't
return.  If I then ls in that same folder, ls doesn't return either.
So yeah, it seems to hang as soon as something has to write.

Also, after I run touch and it doesn't return, I look at top and nothing
is spinning; everything is at 0% usage.  After a minute or so, touch
and flush-btrfs-1 jump to 50% sys each and sit there.

I think this was captured just before they went to 50% usage, while they were still sitting at 0%:
[  420.110021] touch         S ffff8803646c9a58     0  3337   3252 0x00000000
[  420.110021]  ffff8803416c9a28 0000000000000086 0000000000000000 ffff8803416c9fd8
[  420.110021]  00000000000139c0 00000000000139c0 ffff8803416c9fd8 ffff8803416c9fd8
[  420.110021]  00000000000139c0 ffff8803646c9a58 ffff8803646c9a60 ffff8803646c96c0
[  420.110021] Call Trace:
[  420.110021]  [<ffffffff815a0196>] schedule_timeout+0x156/0x2e0
[  420.110021]  [<ffffffff810733b0>] ? process_timeout+0x0/0x10
[  420.110021]  [<ffffffffa0dbe607>] shrink_delalloc+0x127/0x170 [btrfs]
[  420.110021]  [<ffffffffa0dbe727>] reserve_metadata_bytes+0xd7/0x1f0 [btrfs]
[  420.110021]  [<ffffffffa0dbe913>] btrfs_block_rsv_add+0x43/0x60 [btrfs]
[  420.110021]  [<ffffffffa0dbe98b>] btrfs_trans_reserve_metadata+0x5b/0xa0 [btrfs]
[  420.110021]  [<ffffffffa0dd40be>] start_transaction+0xbe/0x210 [btrfs]
[  420.110021]  [<ffffffffa0dd4423>] btrfs_start_transaction+0x13/0x20 [btrfs]
[  420.110021]  [<ffffffffa0dda8a4>] btrfs_create+0x84/0x220 [btrfs]
[  420.110021]  [<ffffffff81168c94>] ? generic_permission+0x24/0xc0
[  420.110021]  [<ffffffff8116a7a8>] vfs_create+0xb8/0x110
[  420.110021]  [<ffffffff8116a870>] __open_namei_create+0x70/0x100
[  420.110021]  [<ffffffff8116b366>] do_last+0x486/0x4e0
[  420.110021]  [<ffffffff8116c27e>] do_filp_open+0x25e/0x620
[  420.110021]  [<ffffffff812d5e37>] ? __strncpy_from_user+0x27/0x60
[  ...
--

From: Chris Mason
Date: Friday, November 19, 2010 - 7:32 am

So, based on this trace we're banging on the delalloc flushing to free
up room.
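
To make the chain in these traces easier to read, here is a rough Python
sketch that pulls the function names out of call-trace text. The pattern
and the name `parse_frames` are illustrative assumptions based on the
frame format posted above, not a kernel tool. On x86 a "?" in front of a
symbol generally means the address was found on the stack but was not
confirmed as part of the live call chain, so those frames are flagged as
unreliable.

```python
import re

# One frame of a kernel call trace, e.g.
# "[  420.110021]  [<ffffffffa0dbe607>] shrink_delalloc+0x127/0x170 [btrfs]"
# Group 1: optional "?" marker, group 2: symbol, group 3: optional module.
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:[ \t]+\[(\w+)\])?"
)

def parse_frames(trace_text):
    """Return (function, module_or_None, reliable) tuples in trace order."""
    frames = []
    # finditer over the whole text, so a frame wrapped across two lines
    # by a mail client is still matched (\s+ spans the line break).
    for m in FRAME.finditer(trace_text):
        frames.append((m.group(2), m.group(3), m.group(1) is None))
    return frames
```

Run over the touch trace above, the reliable btrfs frames should come
out in the order shrink_delalloc, reserve_metadata_bytes,
btrfs_block_rsv_add, btrfs_trans_reserve_metadata, start_transaction,
which is the reservation path waiting on delalloc flushing.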

I just wanted to confirm, you're seeing this with 2.6.37-rc?  I thought
I had fixed up this delalloc hammering.

-chris
--

From: Josef Bacik
Date: Friday, November 19, 2010 - 7:46 am

Also can you run with this patch

http://www.spinics.net/lists/linux-btrfs/msg06890.html

It's a crap bug which will make us look like we're out of space when we aren't, and
we'll flush a lot more.  Thanks,

Josef
--

From: Brian Sullivan
Date: Friday, November 19, 2010 - 1:09 pm

Will try tonight.

-Brian
--

From: Brian Sullivan
Date: Monday, November 22, 2010 - 4:29 pm

Got 2.6.37-rc2 from kernel.org, applied this patch, and still not able
to write to the filesystem.

I hooked up another array (md raid w/btrfs on top) and this one
works fine.  Should I just wipe out the broken fs and recreate it?
Or is there anything else I should try?

--

From: Chris Mason
Date: Monday, November 22, 2010 - 5:54 pm

So with the patch are you still seeing the 100% system time?

-chris
--

From: Brian Sullivan
Date: Tuesday, November 23, 2010 - 1:27 pm

Yes, no change with patch.

-Brian
--

From: Chris Mason
Date: Tuesday, November 23, 2010 - 2:07 pm

Ok, the short term solution is going to be adding another drive to your
FS and letting the space spill over to there.

I'd like to try and reproduce here though, if it isn't too difficult to
keep things in the current state?

-chris
--
