Re: Areca vs. ZFS performance testing.

From: Jeremy Chadwick
Date: Thursday, October 30, 2008 - 8:32 pm

Cross-posting this to freebsd-fs, as I'm sure people there will have
other recommendations.  (This is one of those rare cross-posting
situations.....)


I think these sets of tests are good.  There are some others I'd like to
see, but they'd only be applicable if the 1231-ML has hardware cache.  I

The general concept is: "the more RAM the better".  However, if you're
using RELENG_7, then there's not much point (speaking solely about ZFS)
to getting more than maybe 3 or 4GB; you're still limited to a 2GB kmap
maximum.

Regarding size of the array vs. memory usage: as long as you tune kmem
and ZFS ARC, you shouldn't have much trouble.  There have been some
key people reporting lately that they run very large ZFS arrays without
issue, with proper tuning.

Also, just a reminder: do not pick a value of 2048M for kmem_size or
kmem_size_max; the machine won't boot/work.  You shouldn't go above
something like 1536M, although some have tuned slightly above that
with success.  (You need to remember that there is more to kernel
memory allocation than just this, so you don't want to exhaust it all

Only on CURRENT; 7.x cannot, and AFAIK, will never be able to, as the
engineering efforts required to fix it are too great.

I look forward to seeing your numbers.  Someone here might be able to
compile them into some graphs and other whatnots to make things easier
for future readers.

Thanks for doing all of this!

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

_______________________________________________
freebsd-fs@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
From: Danny Carroll
Date: Thursday, October 30, 2008 - 9:07 pm

[Empty message]
From: Jeremy Chadwick
Date: Thursday, October 30, 2008 - 9:34 pm

I'd like to see the performance difference between these scenarios:

- Memory cache enabled on Areca, write caching enabled on disks
- Memory cache enabled on Areca, write caching disabled on disks
- Memory cache disabled on Areca, write caching enabled on disks
- Memory cache disabled on Areca, write caching disabled on disks
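
The four scenarios above are simply every combination of two on/off settings. A minimal Python sketch for enumerating them, for example to label benchmark runs consistently; the function name and the label wording are illustrative only and have nothing to do with the Areca tools:

```python
from itertools import product


def cache_scenarios():
    """Return a label for every combination of Areca cache and disk write cache."""
    states = ("enabled", "disabled")
    return [
        f"Memory cache {areca} on Areca, write caching {disk} on disks"
        for areca, disk in product(states, states)
    ]


if __name__ == "__main__":
    for label in cache_scenarios():
        print("-", label)
```

Generating the labels rather than typing them makes it harder to skip a combination by accident when more settings are added later.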

I don't know if the controller will let you disable use of memory cache,
but I'm hoping it does.  I'm pretty sure it lets you disable disk

All of the tuning variables apply to i386 and amd64.

You do not need the vfs.zfs.debug variable; I'm not sure why you enabled
that.  I imagine it will have some impact on performance.

I do not know anything about kern.maxvnodes, or vfs.zfs.vdev.cache.size.

The tuning variables I advocate for a system with 2GB of RAM or more,
on RELENG_7, are:

vm.kmem_size="1536M"
vm.kmem_size_max="1536M"
vfs.zfs.arc_min="16M"
vfs.zfs.arc_max="64M"
vfs.zfs.prefetch_disable="1"
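
For anyone scripting their test runs, here is a rough Python sketch that renders the tunables above as loader.conf lines. The helper name and the sanity checks are assumptions made for this example: the 2048M ceiling only encodes the advice given earlier in this thread, and the arc_min/arc_max check is plain common sense, not something the loader enforces.

```python
def render_loader_conf(kmem_mb=1536, arc_min_mb=16, arc_max_mb=64):
    """Render the RELENG_7 ZFS tunables from this thread as loader.conf lines.

    The checks below are sanity checks added for this sketch; they are
    not performed by the FreeBSD loader itself.
    """
    if kmem_mb >= 2048:
        raise ValueError("kmem_size of 2048M or more is reported not to boot")
    if arc_min_mb > arc_max_mb:
        raise ValueError("arc_min must not exceed arc_max")
    settings = [
        ("vm.kmem_size", f"{kmem_mb}M"),
        ("vm.kmem_size_max", f"{kmem_mb}M"),
        ("vfs.zfs.arc_min", f"{arc_min_mb}M"),
        ("vfs.zfs.arc_max", f"{arc_max_mb}M"),
        ("vfs.zfs.prefetch_disable", "1"),
    ]
    return "\n".join(f'{name}="{value}"' for name, value in settings)


if __name__ == "__main__":
    print(render_loader_conf())
```

Calling it with, say, arc_min_mb=32 and arc_max_mb=80 gives the next step up when raising the ARC limits in 16MB increments.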

You can gradually increase arc_min and arc_max by ~16MB increments as
you see fit; you should see general performance improvements as they
get larger (more data being kept in the ARC), but don't get too crazy.
I've tuned arc_max up to 128MB before with success, but I don't want

The only reason you need to adjust kmem_size and kmem_size_max is to
increase the amount of available kmap memory which ZFS relies heavily
on.  If the values are too low, under heavy I/O, the kernel will panic
with kmem exhaustion messages (see the ZFS Wiki for what some look
like, or my Wiki).

I would recommend you stick with a consistent set of loader.conf
tuning variables, and focus entirely on comparing the performance of
ZFS on the Areca controller vs. the ICH controller.

You can perform a "ZFS tuning comparison" later.  One step at a time;
don't over-exert yourself quite yet.  :-)

You can add raidz2 to this comparison list too if you feel it's
worthwhile, but I think most people will be using raidz1.

-- 
| Jeremy Chadwick                                ...
From: Andrew Snow
Date: Thursday, October 30, 2008 - 9:44 pm

It's probably worth playing with vfs.zfs.cache_flush_disable when using
the hardware RAID.

By default, ZFS will flush the entire hardware cache just to make sure
the ZFS Intent Log (ZIL) has been written.

This isn't so bad on a group of hard disks with small caches, but bad if
you have 256MB of controller write cache.

From: Danny Carroll
Date: Thursday, October 30, 2008 - 9:50 pm

Ok.


From: Peter Schuller
Date: Sunday, November 2, 2008 - 8:09 am

Flushing the cache to constituent drives also has a direct impact on
latency, even without any dirty data (save what you just wrote) in
the cache. If you're doing anything that calls fsync() frequently,
you're unlikely to want to wait for actual persistence to disk (with
battery-backed cache).

In any case, why would the actual RAID controller cache be flushed,
unless someone explicitly configured it that way? I would expect a
regular BIO_FLUSH (that's all ZFS is doing, right?) to be satisfied by
the data being contained in the controller cache, under the assumption
that it is battery backed, and that the storage volume/controller has
not been explicitly configured to not rely on the battery
for persistence.

Please correct me if I'm wrong, but if synchronous writing to your
RAID device results in actually waiting for underlying disks to commit
the data to platters, that sounds like a driver/controller
problem/policy issue rather than anything that should be fixed by
tweaking ZFS.

Or is it the case that ZFS does both a "regular" request to commit
data (which I thought was the purpose of BIO_FLUSH, even though the
"FLUSH" sounds more specific) and separately does a "flush any actual
caches no matter what" type of request that ends up bypassing
controller policy (because it is needed on stupid SATA drives or
such)?

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <peter.schuller@infidyne.com>'
Key retrieval: Send an E-Mail to getpgpkey@scode.org
E-Mail: peter.schuller@infidyne.com Web: http://www.scode.org

From: Danny Carroll
Date: Thursday, October 30, 2008 - 9:47 pm

Does it matter what type of disk we are talking about?   What I mean is,
do you want to see this with both Raid5 and Raid6 arrays?

Also, I'm pretty sure that in JBod mode the cache (on the card) will do
nothing.  But I am not certain, so I'll do the tests there as well.

What about stripe sizes?  I mainly use big files so I was going to
stripe accordingly.  But the bonnie++ tests might give strange results

It's been a while since I've had a hardware raid card.  I'll see what is


At the moment I am not hitting anywhere near the max vnodes setting.  So


Once I am settled on a 'starting point' I won't be altering it for the

Yeah, this is weekend stuff for me at the moment, it will take me some
time to get things done.  Firstly I need to figure out how I am going to
hook up 10 drives to my system.  I don't have the drive-bay space and I
am not shelling out for a new case so I am hunting around for an ancient

I might as well do both.

-D
From: Danny Carroll
Date: Thursday, October 30, 2008 - 9:55 pm

The manual suggests that the write cache can be disabled.   Perhaps
there is no read cache for this card.

-D
From: Danny Carroll
Date: Wednesday, November 12, 2008 - 10:46 pm

The initial results for an ICH9 vs Areca in JBOD mode can be found here:
http://www.dannysplace.net/ZFS-JBODTests.html

Summary:
	5 Disk ZFS RaidZ array with atime turned off.
	ICH9      - block reads  avg 400MByte/Sec
	ICH9      - block writes avg 150MByte/Sec
	ArecaJBOD - block reads  avg 300MByte/Sec
	ArecaJBOD - block writes avg 160MByte/Sec


The Areca seems to be slower in all except char and block writes.  Block
reads are 75% as fast as the ICH9 and rewrites are about 85% as fast.
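
As a quick check of the arithmetic, a small Python sketch that expresses the Areca averages as a percentage of the ICH9 averages. It uses only the four rounded figures from the summary above; the function and variable names are made up for this example.

```python
def relative_speed(candidate_mb_s, baseline_mb_s):
    """Return candidate throughput as a percentage of the baseline."""
    return 100.0 * candidate_mb_s / baseline_mb_s


# Averages quoted in the summary above (MByte/sec).
ich9 = {"block reads": 400, "block writes": 150}
areca_jbod = {"block reads": 300, "block writes": 160}

if __name__ == "__main__":
    for test, baseline in ich9.items():
        pct = relative_speed(areca_jbod[test], baseline)
        print(f"{test}: Areca JBOD at {pct:.0f}% of ICH9")
```

Block reads come out at 75%, matching the figure quoted above, and block writes at roughly 107%.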

There seems to be little difference between enabling and disabling the
disk cache on the Areca.  This leads me to two conclusions:
	1. Disabling the write cache does nothing on Seagate drives.
	2. IO to the drives is so slow that a write cache is irrelevant.

These are just some quick tests that I started with, mainly to compare
the areca bus versus the ich9 bus.  If someone has any tuning
suggestions, then now is the time to make them before I migrate the ICH9
drives to the Areca bus.

-D
p.s. My OS details are:
FreeBSD 7.1-PRERELEASE #3: Tue Nov  4 13:58:49 EST 2008
localhost# cat /etc/sysctl.conf
kern.maxvnodes=400000
net.key.preferred_oldsa=0
net.key.blockacq_count=0
kern.ipc.maxsockbuf=400000
net.inet.ip.fastforwarding=1
net.inet.tcp.rfc1323=1
kern.ipc.maxsockbuf=16777216
net.local.stream.sendspace=82320
net.local.stream.recvspace=82320
net.inet.tcp.local_slowstart_flightsize=10
net.inet.tcp.nolocaltimewait=1
net.inet.tcp.delayed_ack=1
net.inet.tcp.delacktime=100
net.inet.tcp.mssdflt=1460
net.inet.tcp.sendspace=78840
net.inet.tcp.recvspace=78840
net.inet.tcp.slowstart_flightsize=54
net.inet.tcp.inflight.enable=1
net.inet.tcp.inflight.min=6144
net.inet.tcp.hostcache.expire=3900

localhost# cat /boot/loader.conf
hw.em.rxd=4096
hw.em.txd=4096
vm.kmem_size="1536M"
vm.kmem_size_max="1536M"
smb_load="YES"
smbus_load="YES"
ichsmb_load="YES"


From: Dieter
Date: Wednesday, November 12, 2008 - 3:57 pm

I have a couple of the ST31000340AS 1TB disks as well as older lower capacity
Seagates, and turning the write cache on/off makes a MASSIVE (roughly 10:1)
difference in write speed.

Jeremy reports "about 13%" with Seagate ST3120026AS:
http://lists.freebsd.org/pipermail/freebsd-hardware/2008-October/005450.html

Perhaps there is something about the Areca or the testing?  Is the write cache
really getting turned on/off?

You're getting about 2-3x the speed I'd expect if the write cache were off,
so maybe it is still on but there is a bottleneck elsewhere?

Have you tried a simple test with /dev/zero and dd to a raw drive to
eliminate the effects of the filesystem?
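
A rough Python analogue of such a dd run, as a sketch only: it writes zero-filled blocks sequentially and reports MByte/sec. The function name, block size and total size are arbitrary choices for this example. As written, the demo targets a scratch file, so it still goes through the filesystem; pointing it at a raw device node instead would take the filesystem out of the picture but DESTROYS the data on that device, and whether fsync() is accepted on a device node has not been verified here.

```python
import os
import tempfile
import time


def sequential_write_mb_s(path, total_mb=64, block_kb=1024):
    """Write zero-filled blocks to path and return throughput in MByte/sec.

    WARNING: if path is a raw device, its contents are overwritten.
    """
    block = memoryview(bytes(block_kb * 1024))
    count = (total_mb * 1024) // block_kb
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.monotonic()
        for _ in range(count):
            written = 0
            while written < len(block):  # os.write() may write only part
                written += os.write(fd, block[written:])
        os.fsync(fd)  # include the time needed to push data out of OS buffers
        elapsed = time.monotonic() - start
    finally:
        os.close(fd)
    megabytes = count * len(block) / (1024 * 1024)
    return megabytes / max(elapsed, 1e-9)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch_dir:
        target = os.path.join(scratch_dir, "scratch.bin")
        print(f"{sequential_write_mb_s(target, total_mb=16):.1f} MByte/sec")
```

Running it once with the disk write cache on and once with it off should show whether the setting is really taking effect.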
From: Jeremy Chadwick
Date: Thursday, November 13, 2008 - 12:43 am

The Areca controller he has can do caching of its own (it has 256MBytes
of cache).  Meaning, if you disable write cache on the disks (but not
the Areca controller itself), all of the caching being done is purely
controller-based.  The actual disk writes between the controller and the
disk will, of course, be "slow" -- but between the OS and the
controller, things should appear fast.

Let me outline the 4 test scenarios (I thought I did this in my original
mail to Danny, but I believe I also said "don't get caught up in
excessive granularity because it'll just confuse people now" -- case in
point):

- Areca cache disabled, disk write cache enabled
- Areca cache disabled, disk write cache disabled
- Areca cache enabled, disk write cache enabled
- Areca cache enabled, disk write cache disabled [**]

As I understand it, Danny performed the tests with the [**]
configuration.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

From: Danny Carroll
Date: Thursday, November 13, 2008 - 6:59 am

It is entirely possible.  I do not know however if the Areca cache works
just for Raid or also in JBOD mode.

The card can be configured via a web interface (it has its own NIC),
via the CLI, or via the BIOS.   The only setting I do see is: Disk Write
Cache Mode.  This is what I have tested.  It might have been the Areca
cache I turned off, or it might have been the disk caches that I turned
off.  I hope it is the former, otherwise what is the purpose of having a
battery backup unit?  If the disks cache the write, then you will
probably lose data anyway.

I think, once I turn on Raid mode, there will be an option to turn
on/off caching in the raid part of the config.  The manual shows me that
there is an option there, but it only indicates that you can change the
cache mode from WriteBack to WriteThrough.  But for now, since it's in

The tests should have names:
Test 1: Areca cache disabled, disk write cache enabled
Test 2: Areca cache disabled, disk write cache disabled
Test 3: Areca cache enabled, disk write cache enabled
Test 4: Areca cache enabled, disk write cache disabled


You did outline these; I thought I was performing test 2 because I am
assuming that when you turn on JBOD mode, you do not get caching on the
controller.

Once I am sure there is not something glaringly wrong with the FreeBSD
side of things I'll run as many of these tests as I can.  For now, I
think it is only tests 1 and 2.

So, my thoughts remain, why was the read performance the same, and the
write performance actually marginally better, after I turned off the
cache?  I did a reboot after I turned off the cache but I did not power
cycle the drives.  Perhaps that is the answer?

Or perhaps simply the Areca controller cannot turn off the cache on the
ST31000340AS drives.

Or perhaps the cache is ALWAYS enabled and cannot be turned off on the
controller.  That means I was doing test 4 as Jeremy suggested.  That
seems a likely possibility as well.  In fact, thinking about it now, it
makes ...
From: Nikolay Denev
Date: Thursday, November 13, 2008 - 8:06 am

On 13 Nov, 2008, at 15:59, Danny Carroll wrote:

I think some RAID controllers do not use the cache when you export the
disks as pass-thru/JBOD, but on some controllers you can work around
this by making every disk a RAID0 (stripe) array with only one disk.
Dunno if that would work on the Areca...

[snip]

-- 
Regards,
Nikolay Denev
From: Danny Carroll
Date: Thursday, November 13, 2008 - 1:46 pm

You can probably do that with this controller as well.
However if I look at the manual I do not see an option to disable the
cache for Raid sets.  Only to change it from Write-back to
Write-Through.   I guess write-through is *almost* as if the cache is
disabled.

-D
From: Willem Jan Withagen
Date: Thursday, November 13, 2008 - 1:32 am

Just as a polite question, since I'm very much in favor of doing
benchmarking and do appreciate these kinds of tests.

You might want to add an introductory page to your results describing how
you set up the test:
	Details of the hardware
	Details of the disk setup
	possible version and options with bonnie
	The script you used....

This would allow others to redo your experiment and try to figure out why their 
numbers are different.

--WjW

From: Danny Carroll
Date: Thursday, November 13, 2008 - 4:09 am

Good idea.

Actually, what I will do eventually is *also* post the results to the
mailing list.  It will probably be around long after my own server is gone.

-D


From: Scott Long
Date: Thursday, November 13, 2008 - 9:49 am

The Areca controller likely doesn't buffer/cache for disks in JBOD mode,
as others in this thread have stated.  Without buffering, simple disk
controllers will almost always be faster than accelerated raid
controllers because the accelerated controllers add more latency between
the host and the disk.  A simple controller will directly funnel data
from the host to the disk as soon as it receives a command.  An
accelerated controller, however, has a CPU and a mini-OS on it that has
to schedule the work coming from the host and handle its own tasks and
interrupts.  This adds latency that quickly adds up under benchmarks.
Your numbers clearly demonstrate this.

Scott
From: Danny Carroll
Date: Thursday, November 13, 2008 - 1:59 pm

That's nice to know.  I'm not sure it tells us why the Non-Cached writes
were about 8% faster though.  The other thing about the "NoWriteCache"
test I performed that I neglected to mention yesterday is that I
actually panic'd the box (running out of memory).   This was the first
time I have had that happen with ZFS even though in previous testing
(with cache enabled) I punished the box for a lot longer.

Perhaps the ZFS caching took over where the disk caching left off?
Could that explain why I did not see a negative difference in the
numbers between Cache enabled and Cache disabled?

One of the questions I wanted to answer for myself was just this:  "Does
a battery-backed cache on an Areca card protect me when I am in JBOD
mode."  If the Areca does not buffer/cache in JBOD mode then that means
the answer is no.

-D
From: Eirik Øverby
Date: Sunday, November 16, 2008 - 1:27 pm

I have noticed that my 3ware controllers, after updating firmware  
recently, have removed the JBOD option entirely, classifying it as  
something you wouldn't want to do with that kind of hardware anyway. I  
believed then, and even more so now, they are correct.

Use the RAID-0 disk trick to be able to utilize the controller cache.  
And regarding write-back vs write-through; I believe write-through is  
equivalent to disabling controller write cache, however it WILL cache  
the writes in order to respond to future reads of the data being  
written. I would guess, but I don't know, that this also goes for disk- 
level caches too, though, so it probably doesn't matter.

/Eirik

From: Danny Carroll
Date: Sunday, November 16, 2008 - 8:15 pm

It kinda depends.  If there were a good 8 or 16+ port SATA card out
there that *simply* did SATA with no bells and whistles, then there
would be no point buying a Raid adaptor when you want to use things like
ZFS.


It is interesting to me that the default setting on the Areca card was
to have the disk caches turned on.  I think that is strange because by
default you have a situation that can lead to data loss even if you have
a battery backup unit.

-D
From: Matt Simerson
Date: Sunday, November 16, 2008 - 11:06 pm

Allow me to introduce you to Marvell. They sell the SATA controller  
used in the Sun thumper (X4500). I've used that same SATA controller  
under OpenSolaris and FreeBSD. Unfortunately, that controller doesn't  
use multi-lane cables. When you pack in 3 controllers and 24 disks,  
it's a cabling disaster.


The Areca cards do NOT have the cache enabled by default. I ordered  
the optional battery and RAM upgrade for my collection of 1231ML  
cards. Even with the BBWC, the cache is not enabled by default. I had  
to go out of my way to enable it, on every single controller.

Matt

From: Jeremy Chadwick
Date: Monday, November 17, 2008 - 12:08 am

I participated in that thread.

	http://freebsd.monkey.org/freebsd-fs/200808/msg00028.html

The questions I had never got answered.  The most important one being:
have you actually performed a hard failure or forced disk swap with both
the Areca and Marvell controllers?  And how does FreeBSD behave when you
do this?

I've a feeling it works fine on the Areca (since CAM/da(4) are used),
but if the Marvell card uses ata(4) (and I'm guessing it does) I'm
concerned.  Why?

For sake of comparison: Promise controllers are considered one of the
most well-supported controllers under FreeBSD, mainly due to Soren
having access to their documentation; yet, when I attempted to do an
actual disk upgrade, the Promise controller did nothing but cause me
grief, forcing me to yank the entire card from my system.

http://wiki.freebsd.org/JeremyChadwick/ZFS_disk_upgrade_gone_bad

Users should read this story and the follow-up.  And in my situation,
the disk wasn't even bad/failed.

What was supposed to be a simple procedure (and it was with Intel AHCI,
as you'll read) turned into a complete nightmare.  Take my story and
apply it to a production datacentre -- but with an 8 or 16-port card and
a shelf of disks.  What're you going to tell your boss when this stuff
fails like how I documented?  "Yeah so I need US$600 to replace the
card"  "Why?  We don't have that kind of budget.  Is the card bad?  Can
we RMA it?"  "No, the card isn't bad"  "Then what is the problem?"
"Well you see......"

So when I see someone say "Yeah, try the <XXX card>, it works great", my
first response is "Just how well have you actually tested failure or
upgrade scenarios?"  Most don't, and instead just *assume* come
fail-time, that everything will "just work" -- and they find out the
horrible truth when it's already too late.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                ...
From: Danny Carroll
Date: Wednesday, January 7, 2009 - 5:33 pm

I'd like to post some results of what I have found with my tests.
I did a few different types of tests.  Basically a set of 5-disk tests
and a set of 12-disk tests.

I did this because I only had 5 ports available on my onboard controller
and I wanted to see how the areca compared to that.  I also wanted to
see comparisons between JBOD, Passthru and hardware raid5.

I have not tested raid6 or raidz2.

You can see the results here:
http://www.dannysplace.net/quickweb/filesystem%20tests.htm

An explanation of each of the tests:
ICH9_ZFS                    5 disk zfs raidz test with onboard SATA
                            ports.
ARECAJBOD_ZFS               5 disk zfs raidz test with Areca SATA
                            ports configured in JBOD mode.
ARECAJBOD_ZFS_NoWriteCache  5 disk zfs raidz test with Areca SATA
                            ports configured in JBOD mode and with
                            disk caches disabled.
ARECARAID                   5 disk zfs single-disk test with Areca
                            raid5 array.
ARECAPASSTHRU               5 disk zfs raidz test with Areca SATA
                            ports configured in Passthru mode.  This
                            means that the onboard areca cache is
                            active.
ARECARAID-UFS2              5 disk ufs2 single-disk test with Areca
                            raid5 array.
ARECARAID-BIG               12 disk zfs single-disk test with Areca
                            raid5 array.
ARECAPASSTHRU_12            12 disk zfs raidz test with Areca SATA
                            ports configured in Passthru mode.  This
                            means that the onboard areca cache is
                            active.


I'll probably be opting for the ARECAPASSTHRU_12 configuration.   Mainly
because I do not need amazing read speeds (network port would be
saturated anyway) and I think that the raidz implementation would be
more fault tolerant.  By that I mean if you have a disk read error
during a rebuild then as I understand it, raidz will write off that
block (and hopefully tell me about dead files) but continue with the
rest of the rebuild.

This is something I'd love to test for real, just to see what happens.
But I am not sure how I could do that.  Perhaps removing one drive, then
a few random writes to a remaining disk (or two) and seeing how it ...
From: Zaphod Beeblebrox
Date: Thursday, January 8, 2009 - 12:40 am

... been a long time since I've seen someone link stuff on this list that
won't show in Firefox.  Pretty sad that it's just a table of values that
would be just as well presented as text.
From: Koen Smits
Date: Thursday, January 8, 2009 - 12:48 am

My guess is it probably has to do with the way ZFS does cache flushes:
http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Cache_Flushes
It might be worth it to disable the forced flushing and test again, if you
feel like it.

-Koen
From: Danny Carroll
Date: Thursday, January 8, 2009 - 7:30 pm

I've just done this and the results are on the same page:

http://www.dannysplace.net/quickweb/filesystem%20tests.htm
The Excel version is here:

http://www.dannysplace.net/quickweb/filesystem%20tests.xls

It is a major improvement but I do not know 100% for sure if the disks
are protected by the write cache/battery backup when in Passthrough mode.

When creating a passthrough disk the "Volume Cache Mode" can be set to
"Write Back" or "Write Through".  This makes me feel as though the cache
is being used and that when the cache is used, so is the BBU.    But I
cannot be 100% sure.  I will send an email to Areca support to ask.

-D
From: Koen Smits
Date: Friday, January 9, 2009 - 1:46 am

Those numbers are pretty good, right? Who needs onboard XOR anyway :)

From: Danny Carroll
Date: Friday, January 9, 2009 - 2:02 am

Those numbers are great, but I would love to know that writes to the
disks are also protected by the battery backup.  If not then I'll be
forced to use either hardware raid5/6 or perhaps some other
configuration.  Maybe 6 stripe sets in a raidz array?

At the end of the day however I really don't care about the performance,
even the slowest of the tests I did would be fast enough to saturate a
gigabit ethernet port, which is way fast enough for me.  But it's an
interesting set of tests...

-D
From: Koen Smits
Date: Friday, January 9, 2009 - 8:58 am

Please let us know what Areca says about the caching.
If you ask me, these results definitely are cached.

From: Danny Carroll
Date: Friday, January 9, 2009 - 9:58 pm

Ah yes, but are they cached by the OS or by the array controller :-)

-D
From: Danny Carroll
Date: Tuesday, January 20, 2009 - 11:40 pm

Sorry for the delay.

Areca got back to me.  It took a few days but I got someone who seemed
be writeback or writethrough in some situations but when that is not an
option, writethrough is the default.  I could not get any information
about read caching although I might send an email to see what happens.

Here is the transcript of the conversation:
Me:
I have a rather simple question about the 1231 controller.
Can you please explain the difference between using disks in JBOD mode
and using disks in passthrough mode.   I have a feeling that the
controller uses its onboard cache when in passthrough mode.  Is this
the case?

Also, are both read and write operations cached?

Areca Support:
Dear Sir,
the only difference is
in JBOD mode, controller configure all drives as passthrough disk.
in RAID mode, you have to configure passthrough disk by yourself in RAID
mode

in other words, you can use raid with passthrough disks at same time in
RAID mode but JBOD mode not.

Me:
So does that mean if I use passthrough, I am not protected by the
cache/battery backup?  I ask because there is an option for cache mode
when creating a passthrough disk.  i.e. Write-Back or Write-Through

Areca Support:
Dear Sir,
in JBOD mode, the default setting writeback mode.

with writeback mode, you will need a battery module to protect the data
remain in cache in case you got a power failure problem.

Me:
And so in Passthrough mode I am still protected with the battery backup?

So JBOD = WriteBack Cache with protection of the battery backup.
Passthrough = WriteBack or WriteThrough also with protection of the
battery backup.

Is this correct?

Areca Support:
Dear Sir,
if you have battery module attached, yes.








From: Koen Smits
Date: Wednesday, January 21, 2009 - 2:16 am

[Empty message]
From: Andrew Snow
Date: Friday, January 9, 2009 - 7:38 pm

ZFS does not require battery-backed disk cache, as long as disks and 
controller flush their cache when they are told to by the OS.

ZFS then only issues sync/flush commands for the ZIL (transaction log);
the majority of I/Os are free to sit in cache and complete when they are
ready.  Data that is not fsync()'d by the application may be lost on a
power outage, but applications like databases do fsync(), so they are
protected.
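
To illustrate the fsync() point, here is a minimal Python sketch of what an
application does when it needs its data on stable storage before carrying
on.  The helper name durable_write and the file name record.bin are made up
for this example and are not from any particular database.  Whether the
bytes really reach the platters still depends on the disks and controller
honouring the flush request, as described above.

```python
import os
import tempfile


def durable_write(path: str, payload: bytes) -> None:
    """Write payload to path and block until the OS reports it flushed.

    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(payload)
        # os.write() may write fewer bytes than requested, so loop.
        while len(view) > 0:
            written = os.write(fd, view)
            view = view[written:]
        # Ask the kernel to push the file's data to stable storage.
        # Without this call the data may sit in cache and be lost on
        # a power outage.
        os.fsync(fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "record.bin")
        durable_write(target, b"committed transaction\n")
        with open(target, "rb") as fh:
            print(fh.read())
```

Writes that skip the os.fsync() call are the ones that are "free to sit in
cache" and are at risk if power is lost before they complete.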

- Andrew

From: Nikolay Denev
Date: Thursday, January 8, 2009 - 2:19 am

There is a big difference between hardware RAID and ZFS raidz with 12 disks
on the get_block test.
Maybe it would be interesting to rerun this test with ZFS prefetch
disabled?

--
Regards,
Nikolay Denev

From: Danny Carroll
Date: Monday, November 17, 2008 - 4:43 am

Interesting.  Wish I had seen it before.  To be honest I did consider
this board but I was really in favour of PCIe over PCIX.  That might

Are you talking about the Areca cache or the disks' own caches?

On my board it was enabled.  But maybe mine was the exception.

-D
From: Matt Simerson
Date: Monday, November 17, 2008 - 2:04 pm

Disk caching is a completely different animal, and one which I didn't  
mention.  I spoke only about the write cache on the controller. Mine
all arrived off by default, which is a VERY reasonable default  

Perhaps it's model specific, or your vendor configured it that way. Or  
you got a return that someone else monkeyed with. I'm not going to  
speak for Areca but it seems quite odd that Areca would ship them with  
the cache enabled. I've used many hundreds of RAID controllers over  
the years and without exception, every single one with a write cache  
had it disabled by default.

Matt
From: Danny Carroll
Date: Monday, November 17, 2008 - 4:46 pm

Ahhh, no I was talking about the disk cache setting.   That is the one
that is set to on by default (at least for me).

I find it strange that this is the case.  IMHO it makes the idea of a

I guess I had a return model.  It's not really a big deal.

-D
From: Wes Morgan
Date: Monday, November 17, 2008 - 4:26 am

>> Eirik 
From: Matt Simerson
Date: Monday, November 17, 2008 - 3:07 pm

I talked to a storage vendor of ours that has sold several SuperMicro  
systems like ours where the client was using OpenSolaris and having  
similar stability issues to what we see on FreeBSD. It seems to be a  
lack of maturity in ZFS that underlies these problems.

It appears that running ZFS on FreeBSD will either thrill or horrify.  
When I tested with modest I/O requirements, it worked great and I was  
tickled. But when I built these new systems as backup servers, I was
generating immensely more disk I/O. I started with 7.0 release and saw  
crashes hourly. With tuning, I was only crashing once or twice a day  
(always memory related). With 16GB of RAM.

I ran for a month with one server on JBOD with RAIDZ2 and another with  
RAIDZ across two RAID 5 arrays. Then I lost a disk and consequently  
the array on the JBOD server. Since RAID 5 had proved to run so much  
faster, I ditched the Marvell cards, installed a pair of 1231MLs and  
reformatted it with RAID 5. Both 24 disk systems have been ZFS RAIDZ  
across two RAID 5 hardware arrays for months since. If I build another  
system tomorrow, that's exactly how I'd do it.

After upgrading to 8-HEAD and applying The Great ZFS Patch, I am  
content with only having to reboot the systems once every 7-12 days.

I have another system with only 8 disks and 4GB of RAM with ZFS  
running on a single RAID 5 array.  Under the same workload as the 24  
disk systems, it was crashing at least once a day. This was existing  
hardware, so we were confident it wasn't hardware issues. I finally  
resolved it by wiping the disks clean, creating a GPT partition on the  
array and using UFS.  The system hasn't crashed once since and is far  
more responsive under heavy load than my ZFS systems.

Of course, all of this might get a fair bit better soon:

http://svn.freebsd.org/viewvc/base?view=revision&revision=185029

Matt
From: Jan Mikkelsen
Date: Tuesday, December 2, 2008 - 3:38 am

Hi,


I am seeing I/O related lockups on 7.1-PRE with an Areca ARC-1220 controller
and eight drives in a RAID-6 array.  The same hardware works fine with 6.3.

When I run gstat while it is happening I see I/O performance drop and the
time to service each write (ms/w) goes up, and then suddenly goes back down
to a sensible value.  I have seen it get to about 22000ms.

The system is essentially unusable for writes, which limits the utility a
bit.  Reads seem fine.

Is this similar to the behaviour you saw?

Thanks,

Jan Mikkelsen

From: Wes Morgan
Date: Tuesday, December 2, 2008 - 5:04 am

Not quite. The zfs deadlock/hang affected both reads and writes, blocking
either of them indefinitely. They were "fixed" by the most recent set of 
patches in -current.
