Re: Flash IO slow 1.5 MB/s

Previous thread: [GIT PULL] lockdep stat fixes by Frederic Weisbecker on Monday, May 3, 2010 - 8:45 pm. (4 messages)

Next thread: linux-next: manual merge of the pcmcia tree with the wireless tree by Stephen Rothwell on Monday, May 3, 2010 - 8:57 pm. (1 message)
From: Trenton D. Adams
Date: Monday, May 3, 2010 - 8:52 pm

If I mount my USB key with the "sync" option, I get transfer speeds
of 500 KB/s or less.  If I use the GNOME defaults, I get 60M/s+ for a
while, and then it drops steadily over time, back down to 500 KB/s.
The GNOME defaults are...

/dev/sdc1 on /media/FLASH type vfat (rw,nosuid,nodev,noatime,uhelper=hal,shortname=lower,flush,uid=500)

I have run similar tests on both a Rally2 64G USB stick and SanDisk
Ultra (15M/s) 8G SDHC cards.  I get lousy performance on both, unless
I set dirty_bytes.  Both are FAT32.  As you can see below, 14 minutes
to transfer less than a couple of gigabytes is a little nutty.  The
3 minutes is a lot nicer.  I am using 2.6.33 with a patch from
https://bugzilla.kernel.org/show_bug.cgi?id=15374

As an example, check out this rsync run:
time rsync -v --progress /home/share/*.avi /media/disk/
1.avi
   709911016 100%    8.88MB/s    0:01:16 (xfer#1, to-check=1/2)
2.avi
   621254748 100%    8.07MB/s    0:01:13 (xfer#2, to-check=0/2)

sent 1331328404 bytes  received 50 bytes  1510298.87 bytes/sec
total size is 1331165764  speedup is 1.00

real    14m40.863s
user    0m8.473s
sys     0m9.525s

It really looks like there's a scheduling issue.  The system seems to
be IO thrashing on the flash drive, and performance bounces all over
the place.  Sometimes it's really low, like 2.73M/s, and other times
it's really fast, like 28.86M/s.  Although you can't see it in the
output above, there were times when rsync was registering 200kb/s.
None of these figures is "really" accurate, as everything is queued
for writing, but the final result of 1.5M/s (calculated from the
"real" time) is terrible.
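
The 1.5M/s figure can be reproduced from the numbers above by dividing
the bytes rsync reports as sent by the wall-clock "real" time.  A
minimal sketch, assuming the duration is in the NmS.SSSs form that
`time` prints; the helper names to_seconds and avg_mb_per_s are
illustrative, not part of any tool, and durations with an hours field
are not handled:

```shell
#!/bin/sh
# Convert a `time`-style duration such as "14m40.863s" to seconds.
to_seconds() {
    printf '%s\n' "$1" |
        awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# avg_mb_per_s BYTES DURATION: average throughput in MB/s (1 MB = 10^6 bytes).
avg_mb_per_s() {
    awk -v bytes="$1" -v secs="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", bytes / secs / 1000000 }'
}

# Bytes sent and "real" time taken from the rsync run above.
avg_mb_per_s 1331328404 14m40.863s   # prints 1.51
```

The per-file rates rsync prints are much higher because they measure
how fast data enters the page cache, not how fast it reaches the
device.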

I have not seen performance this bad on a normal USB drive, only on
my USB flash drive, which is FAT32.  In addition, Windows and Mac
systems easily reach 9M/s write speeds on my Rally2.

If I do the following...
          echo 16000000 > /proc/sys/vm/dirty_bytes
the performance is 9-12M/s all the way through the transfer.  It is
also interesting to note that it ...
From: Theodore Tso
Date: Tuesday, May 4, 2010 - 4:20 am

Very interesting.  How much memory do you have?  (The core tuning parameter is dirty_ratio, which defaults to 20):

dirty_bytes:  Contains the amount of dirty memory at which a process generating disk writes will itself start writeback.  If dirty_bytes is written, dirty_ratio becomes a function of its value (dirty_bytes / the amount of dirtyable system memory). 

dirty_ratio:  Contains, as a percentage of total system memory, the number of pages at which a process which is generating disk writes will itself start writing out dirty data.

Or put another way, what is dirty_ratio after you set dirty_bytes to 16000000?
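
Both values can be read back from /proc/sys/vm after changing either one.  A minimal sketch; the function name show_dirty_knobs and its optional directory argument are illustrative conveniences, and the script only queries the live system when the files are readable:

```shell
#!/bin/sh
# Print dirty_ratio and dirty_bytes from a sysctl directory.
# The directory argument is hypothetical; it defaults to /proc/sys/vm.
show_dirty_knobs() {
    vm_dir="${1:-/proc/sys/vm}"
    for knob in dirty_ratio dirty_bytes; do
        printf '%s=%s\n' "$knob" "$(cat "$vm_dir/$knob")"
    done
}

# Only query the running kernel when the knobs are actually present.
if [ -r /proc/sys/vm/dirty_ratio ] && [ -r /proc/sys/vm/dirty_bytes ]; then
    show_dirty_knobs
fi
```

Running it once before and once after the echo into dirty_bytes shows how one setting affects the other.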

Something else that would be very interesting is blktrace runs with and without dirty_bytes set to 16000000. 

Also, what was the average size of the files which you were writing out with that rsync command?  Were they all avi files that were tens of megabytes?   Hundreds of megabytes?

It does sound like the writeback code is doing something disastrously wrong on USB drives, perhaps interacting with the fat fs code.

-- Ted


--

From: Trenton D. Adams
Date: Tuesday, May 4, 2010 - 11:05 am

Ahh, this one didn't get to LKML either, oops.  I don't use gmail often. :P


dirty_ratio goes to 0 whenever I set dirty_bytes to anything, and
dirty_bytes goes to 0 when I set dirty_ratio.  I even tried setting

tdanotebook linux # zcat /proc/config.gz  | grep DEBUG_FS
CONFIG_DEBUG_FS=y

tdanotebook linux # mount | grep debug
none on /sys/kernel/debug type debugfs (rw)

tdanotebook linux # blktrace -d /dev/mmcblk0p1
BLKTRACESETUP: Inappropriate ioctl for device

Usually hundreds of megabytes.  I can do a few hundred without a
problem.  Looks like 20% is 800M.
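
The 800M figure corresponds to roughly 4 GB of memory.  A minimal
sketch of the arithmetic, assuming MemTotal as a stand-in for the
"dirtyable" memory the kernel actually uses (so the result is only an
upper-bound estimate); ratio_to_bytes is an illustrative helper and the
4000000 kB example value is an assumption, not a number from this
thread:

```shell
#!/bin/sh
# ratio_to_bytes RATIO MEM_KB: bytes that RATIO percent of MEM_KB kilobytes represents.
ratio_to_bytes() {
    echo $(( $2 * 1024 * $1 / 100 ))
}

# Assumed example: 20% of 4,000,000 kB.
ratio_to_bytes 20 4000000   # prints 819200000

# On a live system, use MemTotal from /proc/meminfo as a rough estimate.
if [ -r /proc/meminfo ]; then
    mem_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
    ratio_to_bytes 20 "$mem_kb"
fi
```

Against that, a dirty_bytes setting of 16000000 is about fifty times
smaller than the default threshold.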
--

From: Paul Hartman
Date: Tuesday, May 4, 2010 - 10:34 am

On Mon, May 3, 2010 at 10:52 PM, Trenton D. Adams

I have a similar experience (posted to this list a few months ago)
with mounting a flash device (mobile phone) in USB mass storage mode.
When the I/O scheduler for that device is CFQ, write performance is
really terrible. When I change the scheduler to deadline, performance
is several times better. In 2.6.32 pdflush was replaced and CFQ
performance saw a 4x increase, but it is still far too slow.

CFQ in <=2.6.31: 450KB/sec
CFQ in >=2.6.32: 2MB/sec
Deadline in all: 17MB/sec

I didn't try anything with dirty_bytes.

FWIW :)
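
For anyone repeating the comparison, the active scheduler is the
bracketed entry in /sys/block/<dev>/queue/scheduler.  A minimal sketch;
the helper names are illustrative and "sdc" is only an example device
name, so substitute whatever your flash device shows up as:

```shell
#!/bin/sh
# active_scheduler LINE: print the bracketed entry from a scheduler listing.
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# show_scheduler DEV: report the active scheduler for a block device such as "sdc".
show_scheduler() {
    active_scheduler "$(cat "/sys/block/$1/queue/scheduler")"
}

# Example listing in the format the sysfs file uses.
active_scheduler "noop deadline [cfq]"   # prints cfq

# Switching needs root and takes effect immediately, for example:
#   echo deadline > /sys/block/sdc/queue/scheduler
```

The setting is per device, so it has to be checked on the whole-disk
node (sdc), not on a partition (sdc1).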
--

From: Trenton D. Adams
Date: Tuesday, May 4, 2010 - 11:00 am

On Tue, May 4, 2010 at 11:34 AM, Paul Hartman

Oops, my message didn't reach the LKML, sorry for the spam Paul.

I switched to deadline and dirty_ratio 20 for my flash device, and I
am seeing VERY slow performance as well.  I get a lot of freezing up
of rsync, where the progress just stops (visually anyhow), which is
the same as what I see with cfq.  However, it's not 14 minutes as it
was in my original email...

[11:44 trenta@tdanotebook web] $ time rsync -v --progress /home/share/DVD/*.avi /media/disk/
facing-the-giants.avi
  709911016 100%    5.49MB/s    0:02:03 (xfer#1, to-check=1/2)
jonah.avi
  621254748 100%   15.97MB/s    0:00:37 (xfer#2, to-check=0/2)

sent 1331328404 bytes  received 50 bytes  4430377.55 bytes/sec
total size is 1331165764  speedup is 1.00

real    4m59.657s
user    0m8.553s
sys     0m9.501s


With dirty_bytes 16000000, I still get twice the speed out of deadline.

[11:53 trenta@tdanotebook web] $ time rsync -v --progress /home/share/DVD/*.avi /media/disk/
facing-the-giants.avi
  709911016 100%    7.62MB/s    0:01:28 (xfer#1, to-check=1/2)
jonah.avi
  621254748 100%    7.64MB/s    0:01:17 (xfer#2, to-check=0/2)

sent 1331328404 bytes  received 50 bytes  7948229.58 bytes/sec
total size is 1331165764  speedup is 1.00

real    2m47.244s
user    0m8.429s
sys     0m9.377s


So, perhaps it's a combination of the schedulers and something else in
the kernel?  And perhaps, CFQ just amplifies something else in the
kernel, more than deadline does?
--

From: Paul Hartman
Date: Monday, May 10, 2010 - 8:16 am

On Tue, May 4, 2010 at 1:00 PM, Trenton D. Adams

In my case I also noticed that if I'm using CFQ and leave everything
as normal, the problem only shows up when I copy more than 1 file
before syncing. For example, with 2 test files 700M each in size:

# one file at a time with sync in-between, fast speeds:
$ sync; time sh -c "cp file1 /mnt/usb; sync; cp file2 /mnt/usb; sync"

real    1m25.697s
user    0m0.005s
sys     0m2.509s

# copy two files in a row, then sync, speed is bad:
$ sync; time sh -c "cp file1 file2 /mnt/usb; sync"

real    6m51.439s
user    0m0.007s
sys     0m2.615s

(and, like you, if I mount with "sync" option the speed is basically terrible)

I've tested on 2 machines and had the same results on both, almost
identical timings in fact. Both 64-bit (Core 2 E6600, Core i7 920).
Others who have the same device have tested and some experience the
problem, some do not. I'm not sure of their system specs.

In my case the first machine had 8GB of RAM and the second had 12GB of
RAM, and in both cases actual RAM use by the system was around 1G,
leaving the rest to be used for disk caching etc., in case it is
related to having a large amount of RAM.
--

From: Dave Chinner
Date: Tuesday, May 4, 2010 - 5:14 pm

Perhaps it might be worth taking the writeback tracing code
from this patch series:

http://marc.info/?l=linux-fsdevel&m=127173141007222&w=2

And seeing what that tells you about how writeback is acting
differently....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
--
