Re: performance "regression" in cfq compared to anticipatory, deadline and noop

From: Daniel J Blueman
Date: Sunday, May 11, 2008 - 6:14 am

I've been experiencing this for a while as well: single-process
(i.e. sync) reads are almost 50% slower if slice_idle is 1ms or
more (e.g. the default of 8) [1], which seems phenomenal.

Jens, is this the expected price to pay for optimal busy-spindle
scheduling, a design issue, a bug, or am I missing something entirely?

Thanks,
  Daniel

--- [1]

# cat /sys/block/sda/queue/iosched/slice_idle
8
# echo 1 >/proc/sys/vm/drop_caches
# dd if=/dev/sda of=/dev/null bs=64k count=5000
5000+0 records in
5000+0 records out
327680000 bytes (328 MB) copied, 4.92922 s, 66.5 MB/s

# echo 0 >/sys/block/sda/queue/iosched/slice_idle
# echo 1 >/proc/sys/vm/drop_caches
# dd if=/dev/sda of=/dev/null bs=64k count=5000
5000+0 records in
5000+0 records out
327680000 bytes (328 MB) copied, 2.74098 s, 120 MB/s

# hdparm -Tt /dev/sda

/dev/sda:
 Timing cached reads:   15464 MB in  2.00 seconds = 7741.05 MB/sec
 Timing buffered disk reads:  342 MB in  3.01 seconds = 113.70 MB/sec

[120 MB/s is the known platter rate for this disc, so this is expected]
-- 
Daniel J Blueman
--
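
To make the size of the drop in [1] concrete, here is a small sketch that recomputes both throughputs from the byte counts and timings dd printed, then the relative drop. The helper names throughput_mbs and drop_pct are made up for this illustration; the figures are the ones quoted above. With these numbers the drop works out to roughly 44%, in line with the "almost 50%" estimate.

```shell
#!/bin/sh
# Throughput in MB/s (decimal megabytes, as dd reports them).
# Arguments: bytes copied, elapsed seconds.
throughput_mbs() {
    awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.1f\n", bytes / secs / 1000000 }'
}

# Percentage by which "slow" falls short of "fast".
# Arguments: slow rate, fast rate.
drop_pct() {
    awk -v slow="$1" -v fast="$2" \
        'BEGIN { printf "%.1f\n", (fast - slow) / fast * 100 }'
}

idle8=$(throughput_mbs 327680000 4.92922)   # run with slice_idle=8
idle0=$(throughput_mbs 327680000 2.74098)   # run with slice_idle=0

echo "slice_idle=8: ${idle8} MB/s"
echo "slice_idle=0: ${idle0} MB/s"
echo "drop: $(drop_pct "$idle8" "$idle0")%"
```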

From: Kasper Sandberg
Date: Sunday, May 11, 2008 - 7:02 am

This appears to be what I get as well.

root@quadstation # dd if=/dev/sda of=/dev/null bs=64k count=5000
5000+0 records in
5000+0 records out
327680000 bytes (328 MB) copied, 5.48209 s, 59.8 MB/s
root@quadstation # echo 0 >/sys/block/sda/queue/iosched/slice_idle
root@quadstation # echo 1 >/proc/sys/vm/drop_caches
root@quadstation # dd if=/dev/sda of=/dev/null bs=64k count=5000
5000+0 records in
5000+0 records out
327680000 bytes (328 MB) copied, 2.93932 s, 111 MB/s
root@quadstation # hdparm -Tt /dev/sda
 Timing cached reads:   7264 MB in  2.00 seconds = 3633.82 MB/sec
 Timing buffered disk reads:  322 MB in  3.01 seconds = 107.00 MB/sec
root@quadstation # echo 0 >/sys/block/sda/queue/iosched/slice_idle
root@quadstation # echo 1 >/proc/sys/vm/drop_caches
root@quadstation # hdparm -Tt /dev/sda
 Timing cached reads:   15268 MB in  2.00 seconds = 7643.54 MB/sec
 Timing buffered disk reads:  328 MB in  3.01 seconds = 108.85 MB/sec


To be sure, I did it all again:
noop:
root@quadstation # echo 1 >/proc/sys/vm/drop_caches
root@quadstation # dd if=/dev/sda of=/dev/null bs=64k count=5000
5000+0 records in
5000+0 records out
327680000 bytes (328 MB) copied, 2.85503 s, 115 MB/s
root@quadstation # echo 1 >/proc/sys/vm/drop_caches
root@quadstation # hdparm -tT /dev/sda
 Timing cached reads:   14076 MB in  2.00 seconds = 7045.78 MB/sec
 Timing buffered disk reads:  328 MB in  3.01 seconds = 109.12 MB/sec

anticipatory:
root@quadstation # echo 1 >/proc/sys/vm/drop_caches
root@quadstation # dd if=/dev/sda of=/dev/null bs=64k count=5000
5000+0 records in
5000+0 records out
327680000 bytes (328 MB) copied, 2.96948 s, 110 MB/s
root@quadstation # echo 1 >/proc/sys/vm/drop_caches
root@quadstation # hdparm -tT /dev/sda
 Timing cached reads:   13424 MB in  2.00 seconds = 6719.29 MB/sec
 Timing buffered disk reads:  328 MB in  3.01 seconds = 109.13 MB/sec

cfq:
root@quadstation # echo 1 >/proc/sys/vm/drop_caches
root@quadstation # dd if=/dev/sda of=/dev/null bs=64k count=5000
5000+0 ...
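
The per-scheduler runs above can be scripted so that every scheduler gets the same cold-cache dd read. This is only a sketch: the function name bench_schedulers and the variables DEV_NAME, SYSFS_ROOT and DROP_CACHES are made up for this illustration, and switching schedulers through /sys/block/<dev>/queue/scheduler is assumed here, since the thread does not show how the scheduler was changed.

```shell
#!/bin/sh
# Hypothetical knobs for this sketch; the defaults match the device
# used in the thread.
DEV_NAME=${DEV_NAME:-sda}
SYSFS_ROOT=${SYSFS_ROOT:-/sys/block}
DROP_CACHES=${DROP_CACHES:-/proc/sys/vm/drop_caches}

# Run the same sequential read once per scheduler named in the arguments.
bench_schedulers() {
    for sched in "$@"; do
        # Select the I/O scheduler for this device (assumed interface).
        echo "$sched" > "$SYSFS_ROOT/$DEV_NAME/queue/scheduler" || return 1
        # Drop the page cache so the read really hits the disk.
        echo 1 > "$DROP_CACHES"
        echo "== $sched =="
        # dd prints its statistics on stderr; keep only the summary line.
        dd if="/dev/$DEV_NAME" of=/dev/null bs=64k count=5000 2>&1 | tail -n 1
    done
}
```

Run as root, for example: bench_schedulers noop anticipatory deadline cfq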
From: Jens Axboe
Date: Tuesday, May 13, 2008 - 5:20 am

Indeed, that is of course a bug. The initial mail here mentions this as
a regression - which kernel was the last that worked ok?

If someone would send me a blktrace of such a slow run, that would be
nice. Basically just do a blktrace /dev/sda (or whatever device) while
doing the hdparm, preferably storing output files on a different
device. Then send the raw sda.blktrace.* files to me. Thanks!

-- 
Jens Axboe

--
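
A minimal sketch of the capture Jens describes: start blktrace on the device, run hdparm while it traces, then stop the trace. The function name capture_trace and the OUT_DIR default are made up for this illustration; OUT_DIR is assumed to sit on a different disk than the one being traced. The -d (device) and -D (output directory) options are standard blktrace options, and stopping it with a plain kill assumes blktrace flushes its buffers on SIGTERM.

```shell
#!/bin/sh
# Hypothetical defaults: trace /dev/sda, write results to another disk.
DEV=${DEV:-/dev/sda}
OUT_DIR=${OUT_DIR:-/mnt/other/blktrace-cfq}

capture_trace() {
    mkdir -p "$OUT_DIR" || return 1
    # Trace in the background; per-CPU files such as sda.blktrace.0
    # are written into OUT_DIR.
    blktrace -d "$DEV" -D "$OUT_DIR" &
    trace_pid=$!
    sleep 1   # give blktrace a moment to set up before the benchmark
    # The slow run being investigated, with its output kept alongside.
    hdparm -tT "$DEV" > "$OUT_DIR/hdparm.txt" 2>&1
    # Stop the trace and wait for it to finish writing.
    kill "$trace_pid"
    wait "$trace_pid" || echo "blktrace exited with status $?" >&2
    ls "$OUT_DIR"
}
```

Run capture_trace as root. The resulting sda.blktrace.* files in OUT_DIR are the raw files to send.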

From: Matthew
Date: Tuesday, May 13, 2008 - 5:58 am

[snip]
...

Hi Jens,

I called this a "regression" since I wasn't sure whether this is a real
bug or just something introduced recently. I only just started to use cfq
as my main io-scheduler, so I can't tell ...

Testing 2.6.17 is unfortunately more or less impossible for me (reiser4;
hardware too new - problems with jmicron).

Google "says" that it has seemingly existed since at least 2.6.18
(Ubuntu DapperDrake) [see:
http://ubuntuforums.org/showpost.php?p=1484633&postcount=12]

Well - back to topic:

For a blktrace, one needs to enable CONFIG_BLK_DEV_IO_TRACE, right?
And blktrace can be obtained from your git repo?

Thanks

Mat
--

From: Jens Axboe
Date: Tuesday, May 13, 2008 - 6:05 am

Yes on both counts, or just grab a blktrace snapshot from:

http://brick.kernel.dk/snaps/blktrace-git-latest.tar.gz

if you don't use git.

-- 
Jens Axboe

--

From: Kasper Sandberg
Date: Tuesday, May 13, 2008 - 6:51 am

I am afraid I cannot tell you exactly.

But I do have some additional information for you.

I have a server with a disk identical to mine, but with an
older Intel AHCI controller.

With hdparm, this one gets 80 MB/s with cfq and 100 MB/s with
anticipatory/deadline/noop.

This server is running Debian stable with a .18 kernel. I am sad to say,
however, that I will be unable to do any testing on this box, since it
is a production server and I cannot shut it down.

haltek:~/blktrace# ./blktrace  /dev/sda
BLKTRACESETUP: Inappropriate ioctl for device
Failed to start trace on /dev/sda

However, on the box where you saw the previous numbers, I will certainly
be able to provide you with the data you need.

I expect to get around to doing this this afternoon, or tonight at
~02:00.

--

From: Kasper Sandberg
Date: Tuesday, May 13, 2008 - 5:33 pm

Well :) not too far off (02:32 now)

http://62.242.235.92/~redeeman/blktrace.tar.bz2

It contains the blktraces for cfq, noop, anticipatory and deadline,
along with the output of blktrace and hdparm.


--
