https://bugzilla.kernel.org/show_bug.cgi?id=15906 Summary: serious performance regression in "umount" on ext4 over LVM Product: File System Version: 2.5 Kernel Version: 2.6.33 Platform: All OS/Version: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: ext4 AssignedTo: fs_ext4@kernel-bugs.osdl.org ReportedBy: kees@outflux.net Regression: Yes As root, if /tmp is on an LV (or possibly even without this): cd /tmp dd if=/dev/zero of=test.ext4 bs=1 count=1 seek=1G mkfs.ext4 -F test.ext4 mkdir -p /mnt/test mount -o loop test.ext4 /mnt/test echo $(seq 65536) | (cd /mnt/test; xargs touch) time umount /mnt/test Prior to 2.6.32, this took a second or so. Now it takes over a minute. Reproduced in both Ubuntu 10.04 (2.6.32) and Fedora 12 (2.6.33). https://bugs.launchpad.net/ubuntu/+source/linux/+bug/543617 The current Ubuntu work-around is only useful on an otherwise idle system. -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 --- Comment #1 from Kees Cook <kees@outflux.net> 2010-05-04 20:37:32 --- Sorry, I meant Fedora 13 (Beta). -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 --- Comment #2 from Kees Cook <kees@outflux.net> 2010-05-04 20:41:55 --- https://bugzilla.redhat.com/show_bug.cgi?id=588930 -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 --- Comment #3 from Kees Cook <kees@outflux.net> 2010-05-04 20:52:46 --- As an interesting oddity, creating an ext3 filesystem, but then attempting to mount it as ext4 shows the same pathological behavior. Mounting as ext3, it's fine again. -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 Eric Sandeen <sandeen@redhat.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |sandeen@redhat.com --- Comment #4 from Eric Sandeen <sandeen@redhat.com> 2010-05-04 21:05:24 --- As I mentioned on the RH bz, it seems barrier-related. I thought most barrier enhancements/changes were before 2.6.31, though. Was this really fast for you on .31? Lots of lvm barrier changes went into .30. mounting either the host fs, the loopback file, or both with "nobarrier" avoids all the cache flushes from the barriers, and it's speedy again. -Eric -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 --- Comment #5 from Kees Cook <kees@outflux.net> 2010-05-04 21:11:03 --- Hm, this was really a reduced test-case, as my initial situations where I noticed this was umounting LVM snapshots, then later umounting aufs unions (in an effort to get away from 3x slow-down of snapshot IO). Doing this is just as slow as loopback for me: VG=systemvg lvcreate -n testlv -L1G $VG mkfs.ext4 -F /dev/$VG/testlv mkdir -p /mnt/test mount -o loop /dev/$VG/testlv /mnt/test seq 65536 | (cd /mnt/test; xargs touch) time umount /mnt/test Should 68db1961bbf4e16c220ccec4a780e966bc1fece3 be expected to fix it too? -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 --- Comment #6 from Kees Cook <kees@outflux.net> 2010-05-04 21:12:51 --- Hm, it seemed to be progressively worse since 2.6.28 (when it was okay). It was not good in 2.6.31, and bad in 2.6.32. So I don't have data for 2.6.29 or .30. -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 keith mannthey <kmannth@us.ibm.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |kmannth@us.ibm.com --- Comment #7 from keith mannthey <kmannth@us.ibm.com> 2010-05-04 21:34:38 --- Eric is spot on. At 2.6.30 and dm/lvm started supporting barriers, this can cause serious performance issues depending on your configuration. Try mounting with the " -o nobarrier " option to see if your performance is restored. -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 --- Comment #8 from Eric Sandeen <sandeen@redhat.com> 2010-05-04 21:48:27 --- Kees, no, 68db1961bbf4e16c220ccec4a780e966bc1fece3 made the test on loopback slower, because now a barrier from a filesystem on a loopback device issues an fsync on the underlying filesystem which issues a cache flush on the underlying block device .... you get the picture. Unless we really have gratuitous barrier issues here, there's probably not a whole lot to be done, though maybe some batching is possible. As for the business about ext3, there is nothing mysterious or odd there either, ext3 mounts with barriers off by default, despite my best efforts, and mounting it with the ext4 driver does turn them on. -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 --- Comment #9 from Kees Cook <kees@outflux.net> 2010-05-05 00:28:42 --- Well, it seems to me that there is something wrong somewhere in here since "sync; umount" takes seconds and "umount" alone takes minutes. I can, however, confirm that adding "-o nobarrier" causes the symptom to vanish. -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 Theodore Tso <tytso@mit.edu> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |tytso@mit.edu --- Comment #10 from Theodore Tso <tytso@mit.edu> 2010-05-05 03:18:02 --- To be fair, there does seem to be a real problem here. See my comment here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/543617/comments/12 It looks like if there are dirty inodes, whatever writeback path is used by umount(8) is much more inefficient than that used by sync(8). Specifically it seems to be doing a journal commit between each dirty inode which is flushed out. -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 --- Comment #11 from Eric Sandeen <sandeen@redhat.com> 2010-05-05 04:00:04 --- Yep, fair enough, and on closer inspection its doing way more IO than expected, wrapping the journal 4 times. :) -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 --- Comment #12 from Eric Sandeen <sandeen@redhat.com> 2010-05-05 04:46:16 --- so basically: sync_inodes_sb bdi_sync_writeback (sync_mode = WB_SYNC_ALL) ... writeback_inodes_wb writeback_single_inode ext4_write_inode if (WB_SYNC_ALL) ext4_force_commit <--- ouch. -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 Dmitry Monakhov <dmonakhov@openvz.org> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |dmonakhov@openvz.org --- Comment #13 from Dmitry Monakhov <dmonakhov@openvz.org> 2010-05-05 07:28:10 --- Yep. i've already know that issue. In fact it was broken by followng commit From 03ba3782e8dcc5b0e1efe440d33084f066e38cae Mon Sep 17 00:00:00 2001 From: Jens Axboe <jens.axboe@oracle.com> Date: Wed, 9 Sep 2009 09:08:54 +0200 Subject: [PATCH] writeback: switch to per-bdi threads for flushing data The problem with __sync_filesystem(0) is no longer works on umount because sb can not be pined s_mount sem is downed for write and s_root is NULL. And in fact ext3 is also broken in case of "-obarrier=1" The patch attached fix the original regression, but there is one more issue left A delalloc option. In fact dirty inode is still dirty even after first call of writeback_single_inode which is called from __sync_filesystem(0) due to delalloc allocation happen during inode write. So it takes second __sync_filesystem call to clear dirty flags. Currently i'm working on that issue. I hope i'll post a solution today. -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
bugzilla-daemon@bugzilla.kernel.org writes: Hi Jens, Just FYI, we have found a regression which was caused by your famous writeback patch 03ba3782e8dcc5b0e1efe440d33084f066e38cae I'm not allowed to add you to CC in BZ, that's why i wrote this mail. Before the patch __sync_filesystem() called writeback_single_inode() directly, but now it is called indirectly (from flush-X:X task) which require a super_block in question to be pinned. But this is impossible to pin this SB on umount because we already hold s_umount sem for write, so effectively we already pinned that SB. So my proposal is to treat umount similar to WB_SYNC_ALL, and skip --
Hmm I see, yes that is a bug. How about adding a WB_SYNC_NONE_PINNED or
something to that effect, which acts like WB_SYNC_NONE but the caller is
required to hold the s_umount already (like WB_SYNC_ALL).
Something like the below. It compiles, but not tested.
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 4b37f7c..4327465 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -245,19 +245,20 @@ static void bdi_sync_writeback(struct backing_dev_info *bdi,
* @bdi: the backing device to write from
* @sb: write inodes from this super_block
* @nr_pages: the number of pages to write
+ * @sb_locked: caller already holds sb umount sem.
*
* Description:
* This does WB_SYNC_NONE opportunistic writeback. The IO is only
* started when this function returns, we make no guarentees on
- * completion. Caller need not hold sb s_umount semaphore.
+ * completion. Caller specifies whether sb umount sem is held already or not.
*
*/
void bdi_start_writeback(struct backing_dev_info *bdi, struct super_block *sb,
- long nr_pages)
+ long nr_pages, int sb_locked)
{
struct wb_writeback_args args = {
.sb = sb,
- .sync_mode = WB_SYNC_NONE,
+ .sync_mode = sb_locked ? WB_SYNC_NONE_PIN : WB_SYNC_NONE,
.nr_pages = nr_pages,
.range_cyclic = 1,
};
@@ -577,7 +578,8 @@ static enum sb_pin_state pin_sb_for_writeback(struct writeback_control *wbc,
/*
* Caller must already hold the ref for this
*/
- if (wbc->sync_mode == WB_SYNC_ALL) {
+ if (wbc->sync_mode == WB_SYNC_ALL ||
+ wbc->sync_mode == WB_SYNC_NONE_PIN) {
WARN_ON(!rwsem_is_locked(&sb->s_umount));
return SB_NOT_PINNED;
}
@@ -1183,6 +1185,18 @@ static void wait_sb_inodes(struct super_block *sb)
iput(old_inode);
}
+static void __writeback_inodes_sb(struct super_block *sb, int sb_locked)
+{
+ unsigned long nr_dirty = global_page_state(NR_FILE_DIRTY);
+ unsigned long nr_unstable = global_page_state(NR_UNSTABLE_NFS);
+ long nr_to_write;
+
+ nr_to_write = ...Yes, the core idea definitely looks better. Caller must know in what conditions it submit work to flush task and umount is just one of such 'special' situations. BTW do we have writeback-subsystem maintainer who --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 --- Comment #14 from Dmitry Monakhov <dmonakhov@openvz.org> 2010-05-05 07:30:04 --- Created an attachment (id=26224) --> (https://bugzilla.kernel.org/attachment.cgi?id=26224) Patch attached fix original regression -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 Jens Axboe <axboe@kernel.dk> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |axboe@kernel.dk --- Comment #15 from Theodore Tso <tytso@mit.edu> 2010-05-05 19:44:04 --- The summary for this patch should probably be changed, yes? It should happen with filesystems other than ext4 that use barriers (i.e., btrfs, XFS), and it should happen on a single disk just as easily as a LVM. The latter I can confirm --- you can see the performance degradation quite easily on a single disk, so the mention of LVM in the summary is probably misleading. -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 Kees Cook <kees@outflux.net> changed: What |Removed |Added ---------------------------------------------------------------------------- Summary|serious performance |performance regression in |regression in "umount" on |"umount" of filesystems |ext4 over LVM |using boundaries Summary|serious performance |performance regression in |regression in "umount" on |"umount" of filesystems |ext4 over LVM |using barriers -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
QAqPqqaAAaQw On 5/5/10, bugzilla-daemon@bugzilla.kernel.org -- Sent from my mobile device Greg Freemyer Head of EDD Tape Extraction and Processing team Litigation Triage Solutions Specialist http://www.linkedin.com/in/gregfreemyer CNN/TruTV Aired Forensic Imaging Demo - http://insession.blogs.cnn.com/2010/03/23/how-computer-evidence-gets-retrieved/ The Norcross Group The Intersection of Evidence & Technology http://www.norcrossgroup.com --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 Kees Cook <kees@outflux.net> changed: What |Removed |Added ---------------------------------------------------------------------------- Summary|serious performance |performance regression in |regression in "umount" on |"umount" of filesystems |ext4 over LVM |using boundaries Summary|serious performance |performance regression in |regression in "umount" on |"umount" of filesystems |ext4 over LVM |using barriers -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 --- Comment #16 from Greg.Freemyer@gmail.com 2010-05-05 21:36:46 --- QAqPqqaAAaQw On 5/5/10, bugzilla-daemon@bugzilla.kernel.org -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
That was the equivalent of a butt dial. I'd delete it from the bugzilla if I knew how. Sorry Greg -- Greg Freemyer Head of EDD Tape Extraction and Processing team Litigation Triage Solutions Specialist http://www.linkedin.com/in/gregfreemyer CNN/TruTV Aired Forensic Imaging Demo - http://insession.blogs.cnn.com/2010/03/23/how-computer-evidence-gets-retrieved/ The Norcross Group The Intersection of Evidence & Technology http://www.norcrossgroup.com --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 --- Comment #17 from Dmitry Monakhov <dmonakhov@openvz.org> 2010-05-06 03:48:58 --- Just a note, performance degradation is not that huge for XFS, as soon as i understand this is because it batch several inode writes in to one 'log' so less barriers required But nor than less this is generic writeback issue. -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 --- Comment #18 from Jens Axboe <axboe@kernel.dk> 2010-05-06 07:08:55 --- Created an attachment (id=26255) --> (https://bugzilla.kernel.org/attachment.cgi?id=26255) Explicitly pass in whether sb is pinned or not Revision two of this patch. I got rid of WB_SYNC_NONE_PIN, since we should not further muddy the sync type and whether or not the sb is pinned. Instead lets add a specific flag for this. Untested, but it compiles. -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 --- Comment #19 from Jens Axboe <axboe@kernel.dk> 2010-05-06 07:10:30 --- BTW, I believe that Chris has observed this problem on btrfs as well. If you have a lot of dirty inodes and do the umount, all the writeback will be done with WB_SYNC_ALL which will cause a log commit for each on btrfs. End result is the same, umount runs a lot slower than it otherwise should have. -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 --- Comment #20 from Dmitry Monakhov <dmonakhov@openvz.org> 2010-05-07 09:45:32 --- I can confirm that Jens's fix the original regression for me, so IMHO first part well done. About second one. Proposed patch was posted may be found here: http://marc.info/?l=linux-fsdevel&m=127322500915287&w=2 -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 Thierry Vignaud <thierry.vignaud@gmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |thierry.vignaud@gmail.com --- Comment #20 from Dmitry Monakhov <dmonakhov@openvz.org> 2010-05-07 09:45:32 --- I can confirm that Jens's fix the original regression for me, so IMHO first part well done. About second one. Proposed patch was posted may be found here: http://marc.info/?l=linux-fsdevel&m=127322500915287&w=2 --- Comment #21 from Dmitry Monakhov <dmonakhov@openvz.org> 2010-05-07 10:49:00 --- ######################################### # Testcase #Prep stage for ((i=0;i<1000;i++)) ;do echo test > /tmp/FILES/file-$i ;done tar cf /tmp/files.tar /tmp/FILES #Actual measurements FSTYPE=ext4 OPT='-obarrier=1' mkfs.$FSTYPE /dev/sdb1 mount /dev/sdb1 /mnt $OPT time tar xf /tmp/files.tar -C /mnt/ time umount /mnt ######################################### w/o update dflags patch real 0m0.142s user 0m0.005s sys 0m0.093s real 0m8.506s user 0m0.000s sys 0m0.016s with update dflags patch real 0m0.105s user 0m0.006s sys 0m0.086s real 0m0.148s user 0m0.002s sys 0m0.007s Seems where are no visible diffidence for btrfs and xfs. -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 Dmitry Monakhov <dmonakhov@openvz.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Attachment #26224|0 |1 is obsolete| | --- Comment #22 from Dmitry Monakhov <dmonakhov@openvz.org> 2010-05-17 03:28:50 --- (From update of attachment 26224) Superseded by Jens fix. -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 --- Comment #23 from Dmitry Monakhov <dmonakhov@openvz.org> 2010-05-17 03:44:19 --- Created an attachment (id=26403) --> (https://bugzilla.kernel.org/attachment.cgi?id=26403) Fix async inodes writeback for FS with delalloc Attach second patch, after it has passed 1 week review cycle in a list. Seems that we have an agreement for both issues. IMHO They are 99.999% percent candidates for 2.6.35-rc1(and then for stable). Who will take care about patches? Jens, will you? -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 --- Comment #24 from Jens Axboe <axboe@kernel.dk> 2010-05-17 10:52:54 --- I'll add both patches to for-2.6.35 and mark them with a stable tag. -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 Andrew Morton <akpm@linux-foundation.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED CC| |akpm@linux-foundation.org Resolution| |CODE_FIX -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 --- Comment #25 from Kees Cook <kees@outflux.net> 2010-05-19 19:12:08 --- Thanks again to everyone who dug into this. I'd been pulling my hair out out for months on it. :) -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 Chuck Ebbert <cebbert@redhat.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |cebbert@redhat.com --- Comment #26 from Chuck Ebbert <cebbert@redhat.com> 2010-05-21 01:04:55 --- This patch is triggering a warning: [ http://www.kerneloops.org/oops.php?number=3142224 ] WARNING: at fs/fs-writeback.c:597 writeback_inodes_wb+0x229/0x380() Hardware name: MS-7250 Modules linked in: fuse sunrpc ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 ext2 dm_multipath kvm_amd kvm uinput snd_hda_codec_atihdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device rndis_wlan snd_pcm cfg80211 snd_timer snd rfkill soundcore snd_page_alloc rndis_host edac_core ppdev cdc_ether i2c_nforce2 usbnet k8temp edac_mce_amd parport_pc forcedeth mii parport serio_raw sha256_generic aes_x86_64 aes_generic cbc dm_crypt firewire_ohci pata_acpi firewire_core crc_itu_t floppy ata_generic pata_amd sata_nv radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan] Pid: 1035, comm: flush-253:0 Not tainted 2.6.33.4-95.fc13.x86_64 #1 Call Trace: [] warn_slowpath_common+0x77/0x8f [] warn_slowpath_null+0xf/0x11 [] writeback_inodes_wb+0x229/0x380 [] wb_writeback+0x13d/0x1bc [] ? call_rcu_sched+0x10/0x12 [] ? call_rcu+0x9/0xb [] wb_do_writeback+0x69/0x14d [] bdi_writeback_task+0x3a/0xaa [] ? bdi_start_fn+0x0/0xd4 [] bdi_start_fn+0x6c/0xd4 [] ? bdi_start_fn+0x0/0xd4 [] kthread+0x7a/0x82 [] kernel_thread_helper+0x4/0x10 [] ? kthread+0x0/0x82 [] ? kernel_thread_helper+0x0/0x10 -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 --- Comment #27 from Eric Sandeen <sandeen@redhat.com> 2010-05-21 02:07:01 --- static int pin_sb_for_writeback(struct writeback_control *wbc, struct inode *inode, struct super_block **psb) { struct super_block *sb = inode->i_sb; /* * If this sb is already pinned, nothing more to do. If not and * *psb is non-NULL, unpin the old one first */ if (sb == *psb) return 0; else if (*psb) unpin_sb_for_writeback(psb); /* * Caller must already hold the ref for this */ if (wbc->sync_mode == WB_SYNC_ALL || wbc->sb_pinned) { WARN_ON(!rwsem_is_locked(&sb->s_umount)); <<----- here return 0; } http://git.kernel.org/?p=linux/kernel/git/axboe/linux-2.6-block.git;a=commitdiff;h=e91... is what Jens had upstream, and I think http://git.kernel.org/?p=linux/kernel/git/axboe/linux-2.6-block.git;a=commitdiff;h=30f... -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 --- Comment #28 from Jens Axboe <axboe@kernel.dk> 2010-05-21 06:08:09 --- Eric is spot on, that latter commit does indeed fix up that issue. The problem is that the caller holds the umount sem when initiating the WB_SYNC_NONE writeback, and thus it passes down ->sb_pinned == 1. But since we clear the work state for WB_SYNC_NONE early, the caller can then drop the umount sem before we are actually done writing out inodes. So the fix is queued up with the original patch, when it goes upstream there should be no warnings. -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 --- Comment #29 from Eric Sandeen <sandeen@redhat.com> 2010-05-22 04:38:10 --- (Fedora has the updated patches now, btw - 2.6.33.4-102.fc13.x86_64 I think) -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
https://bugzilla.kernel.org/show_bug.cgi?id=15906 Brian Bloniarz <phunge0@hotmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |phunge0@hotmail.com --- Comment #30 from Brian Bloniarz <phunge0@hotmail.com> 2010-05-26 16:44:37 --- I'm trying to get these fixes merged into the ubuntu release, because ubuntu's workaround has issues. The downstream bug for this is https://bugs.launchpad.net/ubuntu/+source/linux/+bug/585092 I backported to ubuntu's kernel, which took a little fixup. You'll find the patches attached on that bug. 0001-writeback-fix-WB_SYNC_NONE-writeback-from-umount.patch 0002-writeback-Update-dirty-flags-in-two-steps.patch 0003-writeback-ensure-that-WB_SYNC_NONE-writeback-with-sb.patch If anyone who's familiar with these code paths could review my backports, I'd appreciate it. They apply cleanly to mainline v2.6.32.11, which is the base for ubuntu's kernel. -- Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are watching the assignee of the bug. --
