xfs slow to delete directory trees

Submitted by Anonymous
on April 26, 2009 - 11:23am

xfs is so slow at deleting directory trees that I wonder if there's some performance problem in the xfs code. I have read that slow deletes are a known characteristic of xfs, but still: minutes (more than 120 seconds) to delete a few lousy directories containing a mere 370M of files?

On the same hardware, while not busy, deleting the Linux kernel source took 9 seconds under xfs and 1.5 seconds under reiserfs. And that's with friendly xfs options. With unfriendly xfs options, rm took 90 seconds. If the system is busy, it's far worse: the worst time I saw was over 5 minutes! I've seen this on several different servers (but all from the same manufacturer). Perhaps most suspicious of all, top shows that rm is taking a lot of CPU time and its status is usually 'D'. Under reiserfs, though, the deletion is still reasonably fast. Why is this operation so slow under xfs?

One possible factor is that we are not using SAS drives. I have heard speculation that SATA is poor at this sort of operation. We've also tried fiddling with the xfs mount options. The options currently used on /home seem to be about the fastest we can make xfs. The options used on /ROOTDOCS aren't so good.
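To see exactly which options differ between two mounts, something like the following can put them side by side. This is only a sketch: mount_opts is a made-up helper name, and it assumes the usual /proc/mounts layout (device, mount point, type, options). The kernel may report options there slightly differently from what the mount command prints.

```shell
# Hypothetical helper: mount_opts FILE MOUNTPOINT
# Prints the options of MOUNTPOINT as listed in a mount table such as
# /proc/mounts, one option per line, sorted so two mounts can be diffed.
mount_opts() {
    awk -v mp="$2" '$2 == mp { print $4 }' "$1" | tr ',' '\n' | sort
}
```

For the server in this post it would be used roughly as: mount_opts /proc/mounts /home > home.opts; mount_opts /proc/mounts /ROOTDOCS > rootdocs.opts; diff home.opts rootdocs.opts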

Here's the dirt on a relatively simple server, no RAID, only 1 hard drive.

Dmesg reports this about the hard drive:

SCSI subsystem initialized
libata version 3.00 loaded.
ahci 0000:00:1f.2: version 3.0
ahci 0000:00:1f.2: PCI INT B -> GSI 17 (level, low) -> IRQ 17
ahci 0000:00:1f.2: AHCI 0001.0200 32 slots 6 ports 3 Gbps 0x3f impl SATA mode
ahci 0000:00:1f.2: flags: 64bit ncq sntf stag pm led clo pmp pio slum part ems
ahci 0000:00:1f.2: setting latency timer to 64
scsi0 : ahci
scsi1 : ahci
scsi2 : ahci
scsi3 : ahci
scsi4 : ahci
scsi5 : ahci
ata1: SATA max UDMA/133 abar m2048@0xfc500000 port 0xfc500100 irq 4345
ata2: SATA max UDMA/133 abar m2048@0xfc500000 port 0xfc500180 irq 4345
ata3: SATA max UDMA/133 abar m2048@0xfc500000 port 0xfc500200 irq 4345
ata4: SATA max UDMA/133 abar m2048@0xfc500000 port 0xfc500280 irq 4345
ata5: SATA max UDMA/133 abar m2048@0xfc500000 port 0xfc500300 irq 4345
ata6: SATA max UDMA/133 abar m2048@0xfc500000 port 0xfc500380 irq 4345
input: ImPS/2 Generic Wheel Mouse as /devices/platform/i8042/serio1/input/input1
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ATA-8: ST3500320NS, AN05, max UDMA/133
ata1.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata1.00: configured for UDMA/133
ata2: SATA link down (SStatus 0 SControl 300)
ata3: SATA link down (SStatus 0 SControl 300)
ata4: SATA link down (SStatus 0 SControl 300)
ata5: SATA link down (SStatus 0 SControl 300)
ata6: SATA link down (SStatus 0 SControl 300)
scsi 0:0:0:0: Direct-Access ATA ST3500320NS AN05 PQ: 0 ANSI: 5
pata_it8213 0000:07:00.0: version 0.0.3
vendor=8086 device=244e
pata_it8213 0000:07:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
scsi6 : pata_it8213
scsi7 : pata_it8213
ata7: PATA max UDMA/66 cmd 0x4098 ctl 0x4090 bmdma 0x4080 irq 17
ata8: DUMMY

And here are some timings and system info:

u1@testserver1:~> time wget http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.29.1.tar.bz2
--2009-04-26 13:43:23-- http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.29.1.tar.bz2
Resolving kernel.org... 149.20.20.133, 204.152.191.37
Connecting to kernel.org|149.20.20.133|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 56553335 (54M) [application/x-bzip2]
Saving to: `linux-2.6.29.1.tar.bz2'

100%[==========================================================================================================>] 56,553,335 10.1M/s in 5.3s

2009-04-26 13:43:29 (10.1 MB/s) - `linux-2.6.29.1.tar.bz2' saved [56553335/56553335]

real 0m6.026s
user 0m0.072s
sys 0m0.472s
u1@testserver1:~> time tar xjf linux-2.6.29.1.tar.bz2

real 0m17.815s
user 0m11.045s
sys 0m2.140s
u1@testserver1:~> time rm -rf linux-2.6.29.1

real 0m9.173s
user 0m0.024s
sys 0m0.944s
u1@testserver1:~> exit
logout
testserver1:/home/u1 # cd /ROOTDOCS/
testserver1:/ROOTDOCS # time tar xjf /home/u1/linux-2.6.29.1.tar.bz2

real 0m31.986s
user 0m11.061s
sys 0m2.436s
testserver1:/ROOTDOCS # time rm -rf linux-2.6.29.1/

real 1m30.038s
user 0m0.016s
sys 0m1.312s
testserver1:/ROOTDOCS # cd /usr/src
testserver1:/usr/src # ls
linux-2.6.27.21-0.1-obj packages
testserver1:/usr/src # time tar xjf /home/u1/linux-2.6.29.1.tar.bz2

real 0m19.357s
user 0m10.985s
sys 0m3.588s
testserver1:/usr/src # time rm -rf linux-2.6.29.1/

real 0m1.443s
user 0m0.016s
sys 0m1.392s

testserver1:/usr/src # df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda1 8385660 3147744 5237916 38% /
udev 4090184 104 4090080 1% /dev
/dev/sda6 460998656 2581324 458417332 1% /ROOTDOCS
/dev/sda5 12568576 1392568 11176008 12% /home
tmpfs 4090184 4 4090180 1% /tmp
testserver1:/usr/src # mount
/dev/sda1 on / type reiserfs (rw,noatime,data=writeback,noacl)
/proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
debugfs on /sys/kernel/debug type debugfs (rw)
udev on /dev type tmpfs (rw)
devpts on /dev/pts type devpts (rw,mode=0620,gid=5)
/dev/sda6 on /ROOTDOCS type xfs (rw,noatime,nodiratime,noikeep,logbufs=8,allocsize=512m)
/dev/sda5 on /home type xfs (rw,noatime,nodiratime,noikeep,logbufs=8,allocsize=256k,sunit=512,swidth=512)
tmpfs on /tmp type tmpfs (rw,mode=1777)
fusectl on /sys/fs/fuse/connections type fusectl (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
proc on /var/lib/ntp/proc type proc (ro)
testserver1:/usr/src # uname -a
Linux testserver1 2.6.27.21-0.1-default #1 SMP 2009-03-31 14:50:44 +0200 x86_64 x86_64 x86_64 GNU/Linux
testserver1:/usr/src # cat /etc/SuSE-release
openSUSE 11.1 (x86_64)
VERSION = 11.1
testserver1:/usr/src #
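
For anyone who wants to repeat the extract-then-delete test on their own mounts, the steps above can be sketched as a small shell function. The name bench_rm and the scratch directory name are made up for illustration. The sync before the timer is an addition that the session above did not have: it keeps the delete from being timed against writes still pending from the extract. The timing uses whole seconds from date, which is coarse but enough to tell 9 seconds from 90.

```shell
# Hypothetical helper: bench_rm DIR TARBALL
# Extracts TARBALL into a scratch directory under DIR, then reports how
# many whole seconds "rm -rf" needs to delete the tree again.
bench_rm() {
    dir=$1
    tarball=$2
    # Scratch directory on the filesystem being tested (GNU mktemp assumed).
    work=$(mktemp -d "$dir/rmbench.XXXXXX") || return 1
    # GNU tar detects bzip2/gzip on extraction, so no "j" flag is needed.
    tar xf "$tarball" -C "$work" || { rm -rf "$work"; return 1; }
    sync                      # flush the extract before timing the delete
    start=$(date +%s)
    rm -rf "$work"
    end=$(date +%s)
    echo "$dir $((end - start))"
}
```

With the paths from this post, a run might look like: bench_rm /home/u1 /home/u1/linux-2.6.29.1.tar.bz2; bench_rm /ROOTDOCS /home/u1/linux-2.6.29.1.tar.bz2; bench_rm /usr/src /home/u1/linux-2.6.29.1.tar.bz2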