Yeah, the background threshold is not (yet) scaled. So it can happen
that the bdi_dirty limit is below the background limit.
I'm curious though as to these stalls, though, I can't seem to think of
what goes wrong.. esp since most writeback seems to happen from pdflush.
(or I'm totally misreading it - quite a possible as I'm still recovering
from a serious cold and not all the green stuff has yet figured out its
proper place wrt brain cells 'n stuff)
I still have this patch floating around:
---
Subject: mm: speed up writeback ramp-up on clean systems
We allow violation of bdi limits if there is a lot of room on the
system. Once we hit half the total limit we start enforcing bdi limits
and bdi ramp-up should happen. Doing it this way avoids many small
writeouts on an otherwise idle system and should also speed up the
ramp-up.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
mm/page-writeback.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
Index: linux-2.6/mm/page-writeback.c
===================================================================
--- linux-2.6.orig/mm/page-writeback.c 2007-09-28 10:08:33.937415368 +0200
+++ linux-2.6/mm/page-writeback.c 2007-09-28 10:54:26.018247516 +0200
@@ -355,8 +355,8 @@ get_dirty_limits(long *pbackground, long
*/
static void balance_dirty_pages(struct address_space *mapping)
{
- long bdi_nr_reclaimable;
- long bdi_nr_writeback;
+ long nr_reclaimable, bdi_nr_reclaimable;
+ long nr_writeback, bdi_nr_writeback;
long background_thresh;
long dirty_thresh;
long bdi_thresh;
@@ -376,11 +376,26 @@ static void balance_dirty_pages(struct a
get_dirty_limits(&background_thresh, &dirty_thresh,
&bdi_thresh, bdi);
+
+ nr_reclaimable = global_page_state(NR_FILE_DIRTY) +
+ global_page_state(NR_UNSTABLE_NFS);
+ nr_writeback = global_page_state(NR_WRITEBACK);
+
bdi_nr_reclaimable = bdi_stat(bdi, BDI_RECLAIMABLE);
bdi_nr_writeback = bdi_stat(bdi, ...Me confused too. The new debug patch will confirm whether emerge is I think this patch is OK for 2.6.24 :-) -
On Fri, 2 Nov 2007 18:33:29 +0800 Given the problems we're having in there I'm a bit reluctant to go tossing hastily put together and inadequately tested stuff onto the fire. And that's what this patch looks like to me. Wanna convince me otherwise? -
Not really hastily, I think it was written before the stuff hit
mainline. Inadequately tested, perhaps, its been in my and probably Wu's
kernels for a while. Granted that's not a lot of testing in the face of
I'm perfectly happy with this patch earning its credits in -mm for a
while and maybe going in around -rc4 or something like that (hoping that
by then we've fixed these nagging issues).
Another patch I did come up with yesterday - not driven by any problems
in that area - could perhaps join this one on that path:
---
Subject: mm: bdi: tweak task dirty penalty
Penalizing heavy dirtiers with 1/8-th the total dirty limit might be rather
excessive on large memory machines. Use sqrt to scale it sub-linearly.
Update the comment while we're there.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
mm/page-writeback.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
Index: linux-2.6-2/mm/page-writeback.c
===================================================================
--- linux-2.6-2.orig/mm/page-writeback.c
+++ linux-2.6-2/mm/page-writeback.c
@@ -213,17 +213,21 @@ static inline void task_dirties_fraction
}
/*
- * scale the dirty limit
+ * Task specific dirty limit:
*
- * task specific dirty limit:
+ * dirty -= 8 * sqrt(dirty) * p_{t}
*
- * dirty -= (dirty/8) * p_{t}
+ * Penalize tasks that dirty a lot of pages by lowering their dirty limit. This
+ * avoids infrequent dirtiers from getting stuck in this other guys dirty
+ * pages.
+ *
+ * Use a sub-linear function to scale the penalty, we only need a little room.
*/
void task_dirty_limit(struct task_struct *tsk, long *pdirty)
{
long numerator, denominator;
long dirty = *pdirty;
- u64 inv = dirty >> 3;
+ u64 inv = 8*int_sqrt(dirty);
task_dirties_fraction(tsk, &numerator, &denominator);
inv *= numerator;
-
I still have not seen a trigger of the "throttle_vm_writeout". This time installing 2.6.24-rc1 again it not even triggerd any other debugs apart from the one in wb_kupdate. But 300Mb of new files might still not trigger this with 4Gb of RAM. I'm currently testing 2.6.23-mm1 with this patch and the second I also don't know. But looking at the time the system takes to write out 8kb, I'm starting to suspect that something is writing this out, applied, but did not fix the stalls. Here the complete log from vmstat 10 and the syslog from an install of vanilla 2.6.24-rc1. (Please note: I installed the source of vanilla 2.6.24-rc1, but I am still using 2.6.23-mm1!) All lines with [note] are my comments about what the system was doing, both logs are from the same run, so the notes should be more or less in sync. I used SysRq+L to insert the SysRq-Helptext into the syslog as marker... The visible effects are similar to the unmerge run, but the stalls during the moving did only start later. But the same effect after emerge finished and the almost all dirty data was written, was visible: I can still hear the disks and see the hdd light flickering (mostly on) for much, much longer than it should take to write 8kb. vmstat 10: [note]emerge start 1 0 0 3668496 332 187748 0 0 0 29 39 491 3 0 96 0 1 0 0 3623940 332 188880 0 0 83 17 1724 3893 15 2 81 1 0 0 0 3559488 332 252432 0 0 1021 48 11719 4536 9 4 74 13 2 0 0 3482220 332 311916 0 0 70 60 93 3818 11 3 86 0 1 0 0 3289352 332 486932 0 0 2 35 33 11997 25 3 72 0 1 0 0 3174036 332 596412 0 0 10 33 35 3937 21 4 75 0 2 0 0 3215756 332 555292 0 0 6 28 85 742 12 12 76 0 2 0 0 3202128 332 559792 0 0 32 9 34 1566 31 1 68 0 2 0 0 3192804 332 568072 0 0 60 46 172 4206 30 2 67 1 3 0 0 3202424 332 ...
That's stalled waiting on the inode cluster buffer lock. That implies that the inode lcuser is already being written out and the inode has been redirtied during writeout. Does the kernel you are testing have the "flush inodes in ascending inode number order" patches applied? If so, can you remove that patch and see if the problem goes away? Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group -
It's 2.6.23-mm1 with only some small fixes. In it's broken-out directory I see: git-xfs.patch and writeback-fix-periodic-superblock-dirty-inode-flushing.patch writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists-2.patch writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists-3.patch writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists-4.patch writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists-5.patch writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists-6.patch writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists-7.patch writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists.patch writeback-fix-time-ordering-of-the-per-superblock-inode-lists-8.patch writeback-introduce-writeback_controlmore_io-to-indicate-more-io.patch I don't know if the patch you mentioned is part of that version of the mm-patchset. Torsten -
I can now confirm, that I see this also with the current mainline-git-version I used 2.6.24-rc1-git-b4f555081fdd27d13e6ff39d455d5aefae9d2c0c plus the fix for the sg changes in ieee1394. Bisecting would be troublesome, as the sg changes prevent mainline to boot with my normal config / kill my network. treogen ~ # vmstat 10 procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa -> starting emerge 1 0 0 3627072 332 157724 0 0 97 13 41 189 2 2 94 2 0 0 0 3607240 332 163736 0 0 599 10 332 951 2 1 93 4 0 0 0 3601920 332 167592 0 0 380 2 218 870 1 1 98 0 0 0 0 3596356 332 171648 0 0 404 21 182 818 0 0 99 0 0 0 0 3579328 332 180436 0 0 878 12 147 912 1 1 97 2 0 0 0 3575376 332 182776 0 0 236 4 244 953 1 1 95 3 2 1 0 3571792 332 185084 0 0 232 7 256 1003 2 1 95 2 0 0 0 3564844 332 187364 0 0 228 605 246 1167 2 1 93 4 0 0 0 3562128 332 189784 0 0 230 4 527 1238 2 1 93 4 0 1 0 3558764 332 191964 0 0 216 24 438 1059 1 1 93 6 0 0 0 3555120 332 193868 0 0 199 36 406 959 0 0 92 8 0 0 0 3552008 332 195928 0 0 197 11 458 1023 1 1 90 8 0 0 0 3548728 332 197660 0 0 183 7 496 1086 1 1 90 8 0 0 0 3545560 332 199372 0 0 170 8 483 1017 1 1 90 9 0 1 0 3542124 332 201256 0 0 190 1 544 1137 1 1 88 10 1 0 0 3536924 332 203296 0 0 195 7 637 1209 2 1 89 8 1 1 0 3485096 332 249184 0 0 101 16 10372 4537 13 3 76 8 2 0 0 3442004 332 279728 0 0 1086 40 219 1349 7 3 87 4 -> emerge is done reading its package database 1 0 0 3254796 332 448636 0 ...
Ok, so it's probably a side effect of the writeback changes. Attached are two patches (two because one was in a separate patchset as a standalone change) that should prevent async writeback from blocking on locked inode cluster buffers. Apply the xfs-factor-inotobp patch first. Can you see if this fixes the problem? Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group
Applied both patches against the kernel mentioned above. This blows up at boot: [ 80.807589] Filesystem "dm-0": Disabling barriers, not supported by the underlying device [ 80.820241] XFS mounting filesystem dm-0 [ 80.913144] ------------[ cut here ]------------ [ 80.914932] kernel BUG at drivers/md/raid5.c:143! [ 80.916751] invalid opcode: 0000 [1] SMP [ 80.918338] CPU 3 [ 80.919142] Modules linked in: [ 80.920345] Pid: 974, comm: md1_raid5 Not tainted 2.6.24-rc1 #3 [ 80.922628] RIP: 0010:[<ffffffff804b6ee4>] [<ffffffff804b6ee4>] __release_stripe+0x164/0x170 [ 80.925935] RSP: 0018:ffff8100060e7dd0 EFLAGS: 00010002 [ 80.927987] RAX: 0000000000000000 RBX: ffff81010141c288 RCX: 0000000000000000 [ 80.930738] RDX: 0000000000000000 RSI: ffff81010141c288 RDI: ffff810004fb3200 [ 80.933488] RBP: ffff810004fb3200 R08: 0000000000000000 R09: 0000000000000005 [ 80.936240] R10: 0000000000000e00 R11: ffffe200038465e8 R12: ffff81010141c298 [ 80.938990] R13: 0000000000000286 R14: ffff810004fb3330 R15: 0000000000000000 [ 80.941741] FS: 000000000060c870(0000) GS:ffff810100313700(0000) knlGS:0000000000000000 [ 80.944861] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b [ 80.947080] CR2: 00007fff7b295000 CR3: 0000000101842000 CR4: 00000000000006e0 [ 80.949830] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 80.952580] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 80.955332] Process md1_raid5 (pid: 974, threadinfo ffff8100060e6000, task ffff81000645c730) [ 80.958584] Stack: ffff81010141c288 00000000000001f4 ffff810004fb3200 ffffffff804b6f2d [ 80.961761] 00000000000001f4 ffff81010141c288 ffffffff804c8bd0 0000000000000000 [ 80.964681] ffff8100060e7ee8 ffffffff804bd094 ffff81000645c730 ffff8100060e7e70 [ 80.967518] Call Trace: [ 80.968558] [<ffffffff804b6f2d>] release_stripe+0x3d/0x60 [ 80.970677] [<ffffffff804c8bd0>] md_thread+0x0/0x100 [ 80.972629] [<ffffffff804bd094>] ...
Now testing v2.6.24-rc1-650-gb55d1b1+ the fix for the missapplied raid5-patch Applying your two patches ontop of that does not fix the stalls. vmstat 10 output from unmerging (uninstalling) a kernel: 1 0 0 3512188 332 192644 0 0 185 12 368 735 10 3 85 1 -> emerge starts to remove the kernel source files 3 0 0 3506624 332 192836 0 0 15 9825 2458 8307 7 12 81 0 0 0 0 3507212 332 192836 0 0 0 554 630 1233 0 1 99 0 0 0 0 3507292 332 192836 0 0 0 537 580 1328 0 1 99 0 0 0 0 3507168 332 192836 0 0 0 633 626 1380 0 1 99 0 0 0 0 3507116 332 192836 0 0 0 1510 768 2030 1 2 97 0 0 0 0 3507596 332 192836 0 0 0 524 540 1544 0 0 99 0 procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 0 0 0 3507540 332 192836 0 0 0 489 551 1293 0 0 99 0 0 0 0 3507528 332 192836 0 0 0 527 510 1432 1 1 99 0 0 0 0 3508052 332 192840 0 0 0 2088 910 2964 2 3 95 0 0 0 0 3507888 332 192840 0 0 0 442 565 1383 1 1 99 0 0 0 0 3508704 332 192840 0 0 0 497 529 1479 0 0 99 0 0 0 0 3508704 332 192840 0 0 0 594 595 1458 0 0 99 0 0 0 0 3511492 332 192840 0 0 0 2381 1028 2941 2 3 95 0 0 0 0 3510684 332 192840 0 0 0 699 600 1390 0 0 99 0 0 0 0 3511636 332 192840 0 0 0 741 661 1641 0 0 100 0 0 0 0 3524020 332 192840 0 0 0 2452 1080 3910 2 3 95 0 0 0 0 3524040 332 192844 0 0 0 530 617 1297 0 0 99 0 0 0 0 3524128 332 192844 0 0 0 812 674 1667 0 1 99 0 0 0 0 3527000 332 193672 0 0 339 721 754 1681 3 2 93 1 -> emerge is finished, no ...
So you are having RAID5 problems as well? I'm struggling to understand what possible changed in XFS or writeback that would lead to stalls like this, esp. as you appear to be removing files when the stalls occur. Rather than vmstat, can you use something like iostat to show how busy your disks are? i.e. are we seeing RMW cycles in the raid5 or some such issue. At this point, can you run a "sync" and see how long that takes to complete? The only thing I can think that woul dbe written out after this point is inodes, but even then it seems to go on for a long, long time and it really doesn't seem like XFS is holding up the inode writes. Another option is to use blktrace/blkparse to determine which process is Well, the patches I posted should prevent blocking in the places that it was seen, so if that does not stop the slowdowns then either the writeback code is not feeding us inodes fast enough or the block device below is having some kind of problem.... Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group -
The first 2.6.24-rc1-git-kernel that I patched with your patches did
not boot for me. (Oops send in one of my previous mails) But given
that the stacktrace was not xfs related and I had seen this patch on
the lkml, I tried to fix this Oops this way.
meta-data=/dev/mapper/root isize=256 agcount=32, agsize=4731132 blks
= sectsz=512 attr=1
data = bsize=4096 blocks=151396224, imaxpct=25
= sunit=0 swidth=0 blks, unwritten=1
naming =version 2 bsize=4096
log =internal bsize=4096 blocks=32768, version=1
= sectsz=512 sunit=0 blks, lazy-count=0
Already tried that: http://lkml.org/lkml/2007/11/2/178
See the logs from the second unmerge in the second half of the mail.
Yes, I completly agree that this is much to long. Thats why I included
the after-emerge-finished parts of the logs. But I still partly
I don't think its the block device, because reading/writing larger
files do not seem to be troubled. It looks much more like an inode
problem. For example both installing and uninstalling kernel source
trees show these stalls, but during uninstalling this is much more
noticeable.
But I agree that this might not be xfs specific, as this showed up at
the same time as other people started reporting about the 100% iowait
bug. Could be that this is the same bug and the differences between
reiserfs and xfs might explain the iowait vs. idle. Or that I don't
see the 100% iowait is something else on my system...
Torsten
-
Just a crazy idea,.. Could there be a set_page_dirty() that doesn't have balance_dirty_pages() call near? For example modifying meta data in unlink? Such a situation could lead to an excess of dirty pages and the next call to balance_dirty_pages() would appear to stall, as it would desperately try to get below the limit again. -
Only if accounting of the dirty pages is also broken. In the unmerge testcase I see most of the time only <200kb of dirty data in /proc/meminfo. The system has 4Gb of RAM so I'm not sure if it should ever be valid to stall even the emerge/install testcase. Torsten Now building a kernel with the skipped-pages-accounting-patch reverted... -
The above messages and the below 'D' state pdflush indicate that one single writeback_inodes(4MB) call takes a long time(up to 10s!) to complete. Let's try reverting the below patch with `patch -R`? It looks like ------------------------------------------------------ Subject: writeback: remove pages_skipped accounting in __block_write_full_page() From: Fengguang Wu <wfg@mail.ustc.edu.cn> It can be produced by the following test scheme: # cat bin/test-writeback.sh grep nr_dirty /proc/vmstat echo 1 > /proc/sys/fs/inode_debug dd if=/dev/zero of=/var/x bs=1K count=204800& while true; do grep nr_dirty /proc/vmstat; sleep 1; done # bin/test-writeback.sh nr_dirty 19207 nr_dirty 19207 nr_dirty 30924 204800+0 records in 204800+0 records out 209715200 bytes (210 MB) copied, 1.58363 seconds, 132 MB/s nr_dirty 47150 nr_dirty 47141 nr_dirty 47142 nr_dirty 47142 nr_dirty 47142 nr_dirty 47142 nr_dirty 47205 nr_dirty 47214 nr_dirty 47214 nr_dirty 47214 nr_dirty 47214 nr_dirty 47214 nr_dirty 47215 nr_dirty 47216 nr_dirty 47216 nr_dirty 47216 nr_dirty 47154 nr_dirty 47143 nr_dirty 47143 nr_dirty 47143 nr_dirty 47143 nr_dirty 47143 nr_dirty 47142 nr_dirty 47142 nr_dirty 47142 nr_dirty 47142 nr_dirty 47134 nr_dirty 47134 nr_dirty 47135 nr_dirty 47135 nr_dirty 47135 nr_dirty 46097 <== -1038 nr_dirty 46098 nr_dirty 46098 nr_dirty 46098 [...] nr_dirty 46091 nr_dirty 46092 nr_dirty 46092 nr_dirty 45069 <== -1023 nr_dirty 45056 nr_dirty 45056 nr_dirty 45056 [...] nr_dirty 37822 nr_dirty 36799 <== -1023 [...] nr_dirty 36781 nr_dirty 35758 <== -1023 [...] nr_dirty 34708 nr_dirty 33672 <== -1024 [...] nr_dirty 33692 nr_dirty 32669 <== -1023 % ls -li /var/x 847824 -rw-r--r-- 1 root root 200M 2007-08-12 04:12 /var/x % dmesg|grep 847824 # generated by a debug printk [ 529.263184] redirtied inode 847824 line 548 [ 564.250872] redirtied inode 847824 line 548 [ 594.272797] redirtied inode 847824 line 548 [ 629.231330] redirtied inode 847824 ...
I have now testet v2.6.24-rc1-748-g2655e2c with above patch reverted. This does still stall. Both "vmstat 10" and "iostat -x 10" output from this test: procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 2 0 0 3700592 0 85424 0 0 31 83 108 244 2 1 95 1 -> emerge reads something, don't knwo for sure what... 1 0 0 3665352 0 87940 0 0 239 2 343 585 2 1 97 0 0 0 0 3657728 0 91228 0 0 322 35 445 833 0 0 99 0 1 0 0 3653136 0 94692 0 0 330 33 455 844 1 1 98 0 0 0 0 3646836 0 97720 0 0 289 3 422 751 1 1 98 0 0 0 0 3616468 0 99692 0 0 185 33 399 614 9 3 87 1 -> starts to remove the kernel tree 0 0 0 3610452 0 102592 0 0 138 3598 1398 3945 3 6 90 1 0 0 0 3607136 0 104548 0 0 2 5962 1919 6070 4 9 87 0 0 0 0 3606636 0 105080 0 0 0 1539 810 2200 1 2 97 0 -> first stall 28 sec. 0 0 0 3606592 0 105292 0 0 0 698 679 1390 0 1 99 0 0 0 0 3606440 0 105532 0 0 0 658 690 1457 0 0 99 0 0 0 0 3606068 0 106128 0 0 1 1780 947 1982 1 3 96 0 -> second stall 24 sec. 0 0 0 3606036 0 106464 0 0 4 858 758 1457 0 1 98 0 0 0 0 3605380 0 106872 0 0 0 1173 807 1880 1 2 97 0 0 0 0 3605000 0 107748 0 0 1 2413 1103 2996 2 4 94 0 -> third stall 38 sec. 0 0 0 3604488 0 108472 0 0 45 897 748 1577 0 1 98 0 0 0 0 3604176 0 108764 0 0 0 824 752 1700 0 1 98 0 0 0 0 3604012 0 108988 0 0 0 660 643 1237 0 1 99 0 0 0 0 3608936 0 110120 0 0 1 3490 1232 3455 3 5 91 0 -> fourth stall 64 sec. 1 0 0 ...
^^ Apparently we are doing synchronous writes. That would explain why it is slow. We shouldn't be doing synchronous writes here. I'll see if I can reproduce this. <goes off and looks> Yes, I can reproduce the sync writes coming out of xfssyncd. I'll look into this further and send a patch when I have something concrete. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group -
Ok, so it's not synchronous writes that we are doing - we're just submitting bio's tagged as WRITE_SYNC to get the I/O issued quickly. The "synchronous" nature appears to be coming from higher level locking when reclaiming inodes (on the flush lock). It appears that inode write clustering is failing completely so we are writing the same block multiple times i.e. once for each inode in the cluster we have to write. This must be a side effect of some other change as we haven't changed anything in the reclaim code recently..... /me scurries off to run some tests Indeed it is. The patch below should fix the problem - the inode clusters weren't getting set up properly when inodes were being read in or allocated. This is a regression, introduced by this mod: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=da353b0d64... Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group --- fs/xfs/xfs_iget.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Index: 2.6.x-xfs-new/fs/xfs/xfs_iget.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_iget.c 2007-11-02 13:44:46.000000000 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_iget.c 2007-11-07 13:08:42.534440675 +1100 @@ -248,7 +248,7 @@ finish_inode: icl = NULL; if (radix_tree_gang_lookup(&pag->pag_ici_root, (void**)&iq, first_index, 1)) { - if ((iq->i_ino & mask) == first_index) + if ((XFS_INO_TO_AGINO(mp, iq->i_ino) & mask) == first_index) icl = iq->i_cluster; } -
Works for me. The only remaining stalls are sub second and look completely valid, considering the amount of files being removed. iostat 10 from this test: 3 0 0 3500192 332 204956 0 0 105 8512 1809 6473 6 10 83 1 0 0 0 3500200 332 204576 0 0 0 4367 1355 3712 2 6 92 0 2 0 0 3504264 332 203528 0 0 0 6805 1912 4967 4 8 88 0 0 0 0 3511632 332 203528 0 0 0 2843 805 1791 2 4 94 0 0 0 0 3516852 332 203516 0 0 0 3375 879 2712 3 5 93 0 0 0 0 3530544 332 202668 0 0 186 776 488 1152 4 2 89 4 0 0 0 3574788 332 204960 0 0 226 326 358 787 0 1 98 0 0 0 0 3576820 332 204960 0 0 0 376 332 737 0 0 99 0 0 0 0 3578432 332 204960 0 0 0 356 293 606 1 1 99 0 0 0 0 3580192 332 204960 0 0 0 101 104 384 0 0 99 0 I'm pleased to note that this is now much faster again. Thanks! Tested-by: Torsten Kaiser <just.for.lkml@googlemail.com> CC's please note: It looks like this was really a different problem then the 100% iowait that was seen with reiserfs. Also the one complete stall I have seen is probably something else. But I have not been able to reproduce this again with -mm and have never seen this on mainline, so I will just ignore that single event until I see it again. -
Great - thanks for reporting the problem and testing the fix. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group -
This patch has not yet made its way into 2.6.24 (rc3). Is it intended? Maybe the fix can wait for 2.6.25, but wanted to make sure... -- Damien Wyart -
The patch is in the XFS dev tree being QA'd, and we will push it to 2.6.24-rcX in the next few days. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group -
