Re: -mm: xfs lockdep warning

Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
From: Torsten Kaiser
Date: Saturday, September 25, 2010 - 6:08 am

[adding Tejun Heo, because I'm partly blaming his workqueues]

On Mon, Sep 20, 2010 at 9:13 PM, Christoph Hellwig <hch@infradead.org> wrote:

I had the same lockdep report in mainline 2.6.36-rc5, when I enabled
lockdep to try to debug a hang under high load and some memory
pressure.
Do you have a patch I could try to get rid of this lockdep report to
see, if there is another one lurking behind it?

As for the hang: Today I wanted to install updates on my gentoo system
and that included an update of koffice. Because I allow emerge
(gentoos package manager) to process 4 packages in parallel and
specify -j5 for each of the compiles the result was, that emerge
decided to build 4 koffice programs in parallel which resulted in a
load ~18 on my 4 core system. With all these gcc running and the
compiles happening on a tmpfs this also resulted in some memory
pressure. (But not in a swap storm)

The first time I got this hang, I was not able to capture useful
informations, the second time it happened I got a report from the hung
task timeout:
3 processes where blocked in:
[ 1203.056532]  [<ffffffff810801c0>] ? sync_page+0x0/0x50
[ 1203.061836]  [<ffffffff8156078d>] ? io_schedule+0x3d/0x60
[ 1203.067390]  [<ffffffff810801fd>] ? sync_page+0x3d/0x50
probably because of these:
[ 1202.872424] INFO: task plasma-desktop:3169 blocked for more than 120 seconds.
[ 1202.879883] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 1202.888118]  ffff88011ef860b0 0000000000000086 000000011ef860b0
ffff880114faff48
[ 1202.895936]  ffff8800071cf1c0 0000000000012780 ffff880114faffd8
0000000000012780
[ 1202.903777]  ffff880114faffd8 ffff88011ef86310 ffff880114faffd8
ffff88011ef86308
[ 1202.911586] Call Trace:
[ 1202.914140]  [<ffffffff815612c1>] ? __mutex_lock_slowpath+0xe1/0x160
[ 1202.920693]  [<ffffffff810d1e57>] ? __d_lookup+0x97/0x120
[ 1202.926272]  [<ffffffff81560d8a>] ? mutex_lock+0x1a/0x40
[ 1202.931783]  [<ffffffff810c9dbf>] ? do_lookup+0x10f/0x190
[ 1202.937426]  [<ffffffff810c7e24>] ? acl_permission_check+0x54/0xb0
[ 1202.943898]  [<ffffffff810ca8d9>] ? link_path_walk+0x4c9/0x9a0
[ 1202.950028]  [<ffffffff810d65f4>] ? mnt_want_write+0x34/0x70
[ 1202.955947]  [<ffffffff810caecf>] ? path_walk+0x5f/0xe0
[ 1202.961416]  [<ffffffff810caf9b>] ? do_path_lookup+0x4b/0x50
[ 1202.967325]  [<ffffffff810cbc35>] ? user_path_at+0x55/0xb0
[ 1202.973020]  [<ffffffff8102e860>] ? __dequeue_entity+0x40/0x50
[ 1202.979054]  [<ffffffff8107bcf4>] ? perf_event_task_sched_out+0x44/0x260
[ 1202.985995]  [<ffffffff810c2e46>] ? vfs_fstatat+0x36/0x80
[ 1202.991600]  [<ffffffff815602ac>] ? schedule+0x24c/0x6f0
[ 1202.997157]  [<ffffffff810c2f8f>] ? sys_newstat+0x1f/0x50
[ 1203.002784]  [<ffffffff81003035>] ? device_not_available+0x15/0x20
[ 1203.009204]  [<ffffffff8100246b>] ? system_call_fastpath+0x16/0x1b

[ 1202.400046] INFO: task kworker/u:8:845 blocked for more than 120 seconds.
[ 1202.407128] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 1202.415372]  ffff88007ffa8b60 0000000000000046 ffff88007f74fe98
0000000000000000
[ 1202.423229]  ffff8800071cf770 0000000000012780 ffff8800070e9fd8
0000000000012780
[ 1202.431054]  ffff8800070e9fd8 ffff88007ffa8dc0 ffff8800070e9fd8
ffff88007ffa8db8
[ 1202.438898] Call Trace:
[ 1202.441495]  [<ffffffff814361ed>] ? md_write_start+0x9d/0x190
[ 1202.447536]  [<ffffffff81052e70>] ? autoremove_wake_function+0x0/0x30
[ 1202.454311]  [<ffffffff810925ca>] ? zone_nr_free_pages+0x9a/0xb0
[ 1202.460639]  [<ffffffff8142c423>] ? make_request+0x23/0x800
[ 1202.466496]  [<ffffffff811d90ea>] ? blkcipher_walk_done+0x8a/0x230
[ 1202.472990]  [<ffffffff811df4c4>] ? crypto_cbc_encrypt+0xe4/0x180
[ 1202.479327]  [<ffffffff8102a150>] ? twofish_encrypt+0x0/0x10
[ 1202.485276]  [<ffffffff81431ac3>] ? md_make_request+0xc3/0x220
[ 1202.491412]  [<ffffffff811eafca>] ? generic_make_request+0x18a/0x330
[ 1202.498096]  [<ffffffff814484fc>] ? crypt_convert+0x25c/0x310
[ 1202.504090]  [<ffffffff81448881>] ? kcryptd_crypt+0x2d1/0x3f0
[ 1202.510031]  [<ffffffff814485b0>] ? kcryptd_crypt+0x0/0x3f0
[ 1202.516081]  [<ffffffff8104d2eb>] ? process_one_work+0xfb/0x370
[ 1202.522123]  [<ffffffff8104ee4c>] ? worker_thread+0x16c/0x360
[ 1202.527935]  [<ffffffff8104ece0>] ? worker_thread+0x0/0x360
[ 1202.533541]  [<ffffffff8104ece0>] ? worker_thread+0x0/0x360
[ 1202.539131]  [<ffffffff810529e6>] ? kthread+0x96/0xa0
[ 1202.544245]  [<ffffffff81003194>] ? kernel_thread_helper+0x4/0x10
[ 1202.550363]  [<ffffffff81052950>] ? kthread+0x0/0xa0
[ 1202.555346]  [<ffffffff81003190>] ? kernel_thread_helper+0x0/0x10

After that I enabled lockdep and retried the same update: This time no
hang, only the same lockdep report that Yang Ruirui had.

So I'm currently at a loss how I should continue from here. The XFS
false positive drowns any other lockdep problems, but XFS had some
hang problems with the new workqueues. (Personally I had no other hang
with earlier 2.6.36-rcs)
As seen in the call trace from the kworker, there are more workqueues
involved, thats why I added Tejun to the CC. My root filesystem is XFS
on dm-crypt on a md/raid1, so it might have been something there.

If you need more information, just ask, I will try to provide it.

Torsten
--
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
-mm: xfs lockdep warning, Yang Ruirui, (Thu Sep 16, 12:46 am)
Re: -mm: xfs lockdep warning, Dave Chinner, (Thu Sep 16, 5:52 pm)
Re: -mm: xfs lockdep warning, Christoph Hellwig, (Mon Sep 20, 12:13 pm)
Re: -mm: xfs lockdep warning, Torsten Kaiser, (Sat Sep 25, 6:08 am)
Re: -mm: xfs lockdep warning, Tejun Heo, (Mon Oct 4, 2:01 am)
Re: -mm: xfs lockdep warning, Dave Chinner, (Mon Oct 4, 2:21 am)
Re: -mm: xfs lockdep warning, Torsten Kaiser, (Tue Oct 5, 3:09 am)
Re: -mm: xfs lockdep warning, Tejun Heo, (Tue Oct 5, 9:51 am)
Re: -mm: xfs lockdep warning, Tejun Heo, (Tue Oct 5, 9:55 am)
Re: -mm: xfs lockdep warning, Torsten Kaiser, (Tue Oct 5, 10:54 am)
Re: -mm: xfs lockdep warning, Torsten Kaiser, (Tue Oct 5, 11:09 am)
Re: -mm: xfs lockdep warning, Christoph Hellwig, (Tue Oct 5, 11:17 am)
Re: -mm: xfs lockdep warning, Tejun Heo, (Thu Oct 7, 4:14 am)
Re: -mm: xfs lockdep warning, Torsten Kaiser, (Thu Oct 7, 6:21 am)
Re: -mm: xfs lockdep warning, Tejun Heo, (Thu Oct 7, 7:05 am)
Re: -mm: xfs lockdep warning, Torsten Kaiser, (Thu Oct 7, 8:15 am)
Re: -mm: xfs lockdep warning, Tejun Heo, (Thu Oct 7, 8:41 am)
Re: -mm: xfs lockdep warning, Tejun Heo, (Fri Oct 8, 5:40 am)