Cross-posting this to freebsd-fs, as I'm sure people there will have other recommendations. (This is one of those rare cross-posting situations.....)

I think these sets of tests are good. There are some others I'd like to see, but they'd only be applicable if the 1231-ML has hardware cache. I

The general concept is: "the more RAM the better". However, if you're using RELENG_7, then there's not much point (speaking solely about ZFS) to getting more than maybe 3 or 4GB; you're still limited to a 2GB kmap maximum.

Regarding size of the array vs. memory usage: as long as you tune kmem and ZFS ARC, you shouldn't have much trouble. There have been some key people reporting lately that they run very large ZFS arrays without issue, with proper tuning.

Also, just a reminder: do not pick a value of 2048M for kmem_size or kmem_size_max; the machine won't boot/work. You shouldn't go above something like 1536M, although some have tuned slightly above that with success. (You need to remember that there is more to kernel memory allocation than just this, so you don't want to exhaust it all

Only on CURRENT; 7.x cannot, and AFAIK, will never be able to, as the engineering efforts required to fix it are too great.

I look forward to seeing your numbers. Someone here might be able to compile them into some graphs and other whatnots to make things easier for future readers.

Thanks for doing all of this!

--
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

_______________________________________________
freebsd-fs@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
I'd like to see the performance difference between these scenarios:

- Memory cache enabled on Areca, write caching enabled on disks
- Memory cache enabled on Areca, write caching disabled on disks
- Memory cache disabled on Areca, write caching enabled on disks
- Memory cache disabled on Areca, write caching disabled on disks

I don't know if the controller will let you disable use of memory cache, but I'm hoping it does. I'm pretty sure it lets you disable disk

All of the tuning variables apply to i386 and amd64. You do not need the vfs.zfs.debug variable; I'm not sure why you enabled that. I imagine it will have some impact on performance.

I do not know anything about kern.maxvnodes, or vfs.zfs.vdev.cache.size.

The tuning variables I advocate for a system with 2GB of RAM or more, on RELENG_7, are:

  vm.kmem_size="1536M"
  vm.kmem_size_max="1536M"
  vfs.zfs.arc_min="16M"
  vfs.zfs.arc_max="64M"
  vfs.zfs.prefetch_disable="1"

You can gradually increase arc_min and arc_max by ~16MB increments as you see fit; you should see general performance improvements as they get larger (more data being kept in the ARC), but don't get too crazy. I've tuned arc_max up to 128MB before with success, but I don't want

The only reason you need to adjust kmem_size and kmem_size_max is to increase the amount of available kmap memory which ZFS relies heavily on. If the values are too low, under heavy I/O, the kernel will panic with kmem exhaustion messages (see the ZFS Wiki for what some look like, or my Wiki).

I would recommend you stick with a consistent set of loader.conf tuning variables, and focus entirely on comparing the performance of ZFS on the Areca controller vs. the ICH controller. You can perform a "ZFS tuning comparison" later. One step at a time; don't over-exert yourself quite yet. :-)

You can add raidz2 to this comparison list too if you feel it's worthwhile, but I think most people will be using raidz1.

--
| Jeremy Chadwick ...
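As a rough sketch of the "raise arc_max in ~16MB steps" advice above, the shell function below prints a loader.conf fragment with the RELENG_7 values quoted in this message and a caller-supplied arc_max. The name `zfs_tuning` is made up for illustration and is not a FreeBSD tool; the function only prints text and does not touch /boot/loader.conf.

```sh
#!/bin/sh
# Hypothetical helper (not part of FreeBSD): print the loader.conf lines
# quoted above, varying only vfs.zfs.arc_max (in MB, default 64).
zfs_tuning() {
    arc_max_mb=${1:-64}
    cat <<EOF
vm.kmem_size="1536M"
vm.kmem_size_max="1536M"
vfs.zfs.arc_min="16M"
vfs.zfs.arc_max="${arc_max_mb}M"
vfs.zfs.prefetch_disable="1"
EOF
}

# Starting point, then one 16MB step up.
zfs_tuning 64
zfs_tuning 80
```

Keeping everything except arc_max fixed between runs makes it easier to attribute a benchmark change to that one variable.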
It's probably worth playing with vfs.zfs.cache_flush_disable when using the hardware RAID. By default, ZFS will flush the entire hardware cache just to make sure the ZFS Intent Log (ZIL) has been written. This isn't so bad on a group of hard disks with small caches, but bad if you have 256MB of controller write cache.
Ok.
Flushing the cache to constituent drives also has a direct impact on latency, even without any dirty data (save what you have just written) in the cache. If you're doing anything that does frequent fsync()s, you're likely to not want to wait for actual persistence to disk (with battery backed cache).

In any case, why would the actual RAID controller cache be flushed, unless someone explicitly configured it such? I would expect a regular BIO_FLUSH (that's all ZFS is doing, right?) to be satisfied by the data being contained in the controller cache, under the assumption that it is battery backed, and that the storage volume/controller has not been explicitly configured otherwise to not rely on the battery for persistence.

Please correct me if I'm wrong, but if synchronous writing to your RAID device results in actually waiting for underlying disks to commit the data to platters, that sounds like a driver/controller problem/policy issue rather than anything that should be fixed by tweaking ZFS.

Or is it the case that ZFS does both a "regular" request to commit data (which I thought was the purpose of BIO_FLUSH, even though the "FLUSH" sounds more specific) and separately does a "flush any actual caches no matter what" type of request that ends up bypassing controller policy (because it is needed on stupid SATA drives or such)?

--
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <peter.schuller@infidyne.com>'
Key retrieval: Send an E-Mail to getpgpkey@scode.org
E-Mail: peter.schuller@infidyne.com Web: http://www.scode.org
Does it matter what type of disk we are talking about? What I mean is, do you want to see this with both RAID5 and RAID6 arrays? Also, I'm pretty sure that in JBOD mode the cache (on the card) will do nothing. But I am not certain, so I'll do the tests there as well.

What about stripe sizes? I mainly use big files so I was going to stripe accordingly. But the bonnie++ tests might give strange results

It's been a while since I've had a hardware RAID card. I'll see what is

At the moment I am not hitting anywhere near the max vnodes setting. So

Once I am settled on a 'starting point' I won't be altering it for the

Yeah, this is weekend stuff for me at the moment, it will take me some time to get things done. Firstly I need to figure out how I am going to hook up 10 drives to my system. I don't have the drive-bay space and I am not shelling out for a new case so I am hunting around for an ancient

I might as well do both.

-D
The manual suggests that the write cache can be disabled. Perhaps there is no read cache for this card. -D
The initial results for ICH9 vs Areca in JBOD mode can be found here:
http://www.dannysplace.net/ZFS-JBODTests.html

Summary: 5-disk ZFS RaidZ array with atime turned off.

ICH9 - block reads avg 400MByte/Sec
ICH9 - block writes avg 150MByte/Sec
ArecaJBOD - block reads avg 300MByte/Sec
ArecaJBOD - block writes avg 160MByte/Sec

The Areca seems to be slower in all except char and block writes. Block reads are 75% as fast as the ICH9 and rewrites are about 85% as fast.

There seems to be little difference between enabling and disabling the disk cache on the Areca. This leads me to two possible conclusions:
1. Disabling the write cache does nothing on Seagate drives.
2. IO to the drives is so slow that a write cache is irrelevant.

These are just some quick tests that I started with, mainly to compare the Areca bus versus the ICH9 bus. If someone has any tuning suggestions, then now is the time to make them before I migrate the ICH9 drives to the Areca bus.

-D

p.s. My OS details are:
FreeBSD 7.1-PRERELEASE #3: Tue Nov 4 13:58:49 EST 2008

localhost# cat /etc/sysctl.conf
kern.maxvnodes=400000
net.key.preferred_oldsa=0
net.key.blockacq_count=0
kern.ipc.maxsockbuf=400000
net.inet.ip.fastforwarding=1
net.inet.tcp.rfc1323=1
kern.ipc.maxsockbuf=16777216
net.local.stream.sendspace=82320
net.local.stream.recvspace=82320
net.inet.tcp.local_slowstart_flightsize=10
net.inet.tcp.nolocaltimewait=1
net.inet.tcp.delayed_ack=1
net.inet.tcp.delacktime=100
net.inet.tcp.mssdflt=1460
net.inet.tcp.sendspace=78840
net.inet.tcp.recvspace=78840
net.inet.tcp.slowstart_flightsize=54
net.inet.tcp.inflight.enable=1
net.inet.tcp.inflight.min=6144
net.inet.tcp.hostcache.expire=3900

localhost# cat /boot/loader.conf
hw.em.rxd=4096
hw.em.txd=4096
vm.kmem_size="1536M"
vm.kmem_size_max="1536M"
smb_load="YES"
smbus_load="YES"
ichsmb_load="YES"
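The relative figures in the summary can be re-derived from the quoted averages with a few lines of shell. This is only an arithmetic sketch; the helper name `pct` is invented here, and the inputs are the four MByte/sec averages from the message.

```sh
#!/bin/sh
# pct A B: print A as a whole-number percentage of B.
pct() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", a / b * 100 }'
}

# Averages quoted in the summary above (MByte/sec).
echo "Areca JBOD block reads vs ICH9:  $(pct 300 400)%"
echo "Areca JBOD block writes vs ICH9: $(pct 160 150)%"
```

Block reads work out to 75% of the ICH9 figure, matching the message, while block writes land slightly above 100%.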
I have a couple of the ST31000340AS 1TB disks as well as older lower capacity Seagates, and turning the write cache on/off makes a MASSIVE (roughly 10:1) difference in write speed. Jeremy reports "about 13%" with Seagate ST3120026AS: http://lists.freebsd.org/pipermail/freebsd-hardware/2008-October/005450.html Perhaps there is something about the Areca or the testing? Is the write cache really getting turned on/off? You're getting about 2-3x the speed I'd expect if the write cache were off, so maybe it is still on but there is a bottleneck elsewhere? Have you tried a simple test with /dev/zero and dd to a raw drive to eliminate the effects of the filesystem?
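A minimal sketch of the /dev/zero and dd test suggested above follows. The function name `write_test` and the block size are choices made here, not something from the thread. Note that pointing it at a raw disk node overwrites whatever is on that disk, so the runnable part below only exercises a scratch file.

```sh
#!/bin/sh
# write_test TARGET [COUNT_MB]: stream zeros to TARGET in 1 MiB blocks.
# dd reports elapsed time and throughput on stderr when it finishes.
# WARNING: if TARGET is a raw disk device, its contents are overwritten.
write_test() {
    target=$1
    count_mb=${2:-1024}
    dd if=/dev/zero of="$target" bs=1048576 count="$count_mb"
}

# Harmless dry run against a scratch file (goes through the filesystem,
# so it is not a raw-device measurement).
scratch=$(mktemp)
write_test "$scratch" 8
rm -f "$scratch"
```

For the real measurement the target would be the raw disk node, for example `write_test /dev/da0 1024`, where `/dev/da0` is a placeholder device name. Running it once with the disk write cache on and once with it off would show whether the setting is actually taking effect.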
The Areca controller he has can do caching of its own (it has 256MBytes of cache). Meaning, if you disable write cache on the disks (but not the Areca controller itself), all of the caching being done is purely controller-based. The actual disk writes between the controller and the disk will, of course, be "slow" -- but between the OS and the controller, things should appear fast.

Let me outline the 4 test scenarios (I thought I did this in my original mail to Danny, but I believe I also said "don't get caught up in excessive granularity because it'll just confuse people now" -- case in point):

- Areca cache disabled, disk write cache enabled
- Areca cache disabled, disk write cache disabled
- Areca cache enabled, disk write cache enabled
- Areca cache enabled, disk write cache disabled [**]

As I understand it, Danny performed the tests with the [**] configuration.

--
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
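The four combinations above are simply the cross product of two on/off settings, which can be enumerated mechanically so that none is missed when scripting benchmark runs. The sketch below prints them in the same order as the list; the function name `list_scenarios` and the "Test N" labels are only a numbering convenience.

```sh
#!/bin/sh
# Enumerate the four cache combinations in the order listed above.
list_scenarios() {
    n=1
    for areca in disabled enabled; do
        for disk in enabled disabled; do
            echo "Test $n: Areca cache $areca, disk write cache $disk"
            n=$((n + 1))
        done
    done
}

list_scenarios
```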
It is entirely possible. I do not know however if the Areca cache works just for RAID or also in JBOD mode.

The card can be configured via a web interface (it has its own NIC), via the CLI, or via the BIOS. The only setting I do see is: Disk Write Cache Mode. This is what I have tested. It might have been the Areca cache I turned off, or it might have been the disk caches that I turned off. I hope it is the former, otherwise what is the purpose of having a battery backup unit? If the disks cache the write, then you will probably lose data anyway.

I think, once I turn on RAID mode, there will be an option to turn on/off caching in the RAID part of the config. The manual shows me that there is an option there, but it only indicates that you can change the cache mode from WriteBack to WriteThrough. But for now, since it's in

The tests should have names:

Test 1: Areca cache disabled, disk write cache enabled
Test 2: Areca cache disabled, disk write cache disabled
Test 3: Areca cache enabled, disk write cache enabled
Test 4: Areca cache enabled, disk write cache disabled

You did outline these. I thought I was performing test 2 because I am assuming that when you turn on JBOD mode, you do not get caching on the controller.

Once I am sure there is not something glaringly wrong with the FreeBSD side of things I'll run as many of these tests as I can. For now, I think it is only tests 1 and 2.

So, my thoughts remain: why was the read performance the same, and the write performance actually marginally better, after I turned off the cache? I did a reboot after I turned off the cache but I did not power cycle the drives. Perhaps that is the answer? Or perhaps simply the Areca controller cannot turn off the cache on the ST31000340AS drives. Or perhaps the cache is ALWAYS enabled and cannot be turned off on the controller. That means I was doing test 4 as Jeremy suggested. That seems a likely possibility as well. In fact, thinking about it now, it makes ...
On 13 Nov, 2008, at 15:59, Danny Carroll wrote: I think some RAID controllers do not use the cache when you export the disks as pass-thru/JBOD, but on some controllers you can work around this by making every disk a RAID0 (stripe) array with only one disk. Dunno if that would work on the Areca... [snip] -- Regards, Nikolay Denev
You can probably do that with this controller as well. However, if I look at the manual I do not see an option to disable the cache for RAID sets, only to change it from Write-Back to Write-Through. I guess write-through is *almost* as if the cache is disabled. -D
Just as a polite suggestion, since I'm very much in favor of doing benchmarking and do appreciate these kinds of tests. You might want to add an introductory page to your results describing how you set up the test:

- Details of the hardware
- Details of the disk setup
- Possibly the version and options used with bonnie
- The script you used....

This would allow others to redo your experiment and try to figure out why their numbers are different.

--WjW
Good idea. Actually, what I will do eventually is *also* post the results to the mailing list. It will probably be around long after my own server is gone. -D
The Areca controller likely doesn't buffer/cache for disks in JBOD mode, as others in this thread have stated. Without buffering, simple disk controllers will almost always be faster than accelerated raid controllers because the accelerated controllers add more latency between the host and the disk. A simple controller will directly funnel data from the host to the disk as soon as it receives a command. An accelerated controller, however, has a CPU and a mini-OS on it that has to schedule the work coming from the host and handle its own tasks and interrupts. This adds latency that quickly adds up under benchmarks. Your numbers clearly demonstrate this. Scott
That's nice to know. I'm not sure it tells us why the non-cached writes were about 8% faster though.

The other thing about the "NoWriteCache" test I performed that I neglected to mention yesterday is that I actually panic'd the box (running out of memory). This was the first time I have had that happen with ZFS, even though in previous testing (with cache enabled) I punished the box for a lot longer. Perhaps the ZFS caching took over where the disk caching left off? Could that explain why I did not see a negative difference in the numbers between cache enabled and cache disabled?

One of the questions I wanted to answer for myself was just this: "Does a battery-backed cache on an Areca card protect me when I am in JBOD mode?" If the Areca does not buffer/cache in JBOD mode then that means the answer is no.

-D
I have noticed that my 3ware controllers, after updating firmware recently, have removed the JBOD option entirely, classifying it as something you wouldn't want to do with that kind of hardware anyway. I believed then, and even more so now, they are correct. Use the RAID-0 disk trick to be able to utilize the controller cache. And regarding write-back vs write-through: I believe write-through is equivalent to disabling the controller write cache; however, it WILL cache the writes in order to respond to future reads of the data being written. I would guess, but I don't know, that this also goes for disk-level caches too, though, so it probably doesn't matter. /Eirik
It kinda depends. If there were a good 8 or 16+ port SATA card out there that *simply* did SATA with no bells and whistles, then there would be no point buying a Raid adaptor when you want to use things like ZFS. It is interesting to me that the default setting on the Areca card was to have the disk caches turned on. I think that is strange because by default you have a situation that can lead to data loss even if you have a battery backup unit. -D
Allow me to introduce you to Marvell. They sell the SATA controller used in the Sun Thumper (X4500). I've used that same SATA controller under OpenSolaris and FreeBSD. Unfortunately, that controller doesn't use multi-lane cables. When you pack in 3 controllers and 24 disks, it's a cabling disaster. The Areca cards do NOT have the cache enabled by default. I ordered the optional battery and RAM upgrade for my collection of 1231ML cards. Even with the BBWC, the cache is not enabled by default. I had to go out of my way to enable it, on every single controller. Matt
I participated in that thread.

http://freebsd.monkey.org/freebsd-fs/200808/msg00028.html

The questions I had never got answered. The most important one being: have you actually performed a hard failure or forced disk swap with both the Areca and Marvell controllers? And how does FreeBSD behave when you do this?

I've a feeling it works fine on the Areca (since CAM/da(4) are used), but if the Marvell card uses ata(4) (and I'm guessing it does) I'm concerned. Why?

For sake of comparison: Promise controllers are considered one of the most well-supported controllers under FreeBSD, mainly due to Soren having access to their documentation; yet, when I attempted to do an actual disk upgrade, the Promise controller did nothing but cause me grief, forcing me to yank the entire card from my system.

http://wiki.freebsd.org/JeremyChadwick/ZFS_disk_upgrade_gone_bad

Users should read this story and the follow-up. And in my situation, the disk wasn't even bad/failed. What was supposed to be a simple procedure (and it was with Intel AHCI, as you'll read) turned into a complete nightmare.

Take my story and apply it to a production datacentre -- but with an 8 or 16-port card and a shelf of disks. What're you going to tell your boss when this stuff fails like how I documented?

"Yeah so I need US$600 to replace the card"
"Why? We don't have that kind of budget. Is the card bad? Can we RMA it?"
"No, the card isn't bad"
"Then what is the problem?"
"Well you see......"

So when I see someone say "Yeah, try the <XXX card>, it works great", my first response is "Just how well have you actually tested failure or upgrade scenarios?" Most don't, and instead just *assume* come fail-time, that everything will "just work" -- and they find out the horrible truth when it's already too late.

--
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator ...
I'd like to post some results of what I have found with my tests. I did a few different types of tests. Basically a set of 5-disk tests and a set of 12-disk tests. I did this because I only had 5 ports available on my onboard controller and I wanted to see how the Areca compared to that. I also wanted to see comparisons between JBOD, Passthru and hardware raid5. I have not tested raid6 or raidz2.

You can see the results here:
http://www.dannysplace.net/quickweb/filesystem%20tests.htm

An explanation of each of the tests:

ICH9_ZFS
  5 disk zfs raidz test with onboard SATA ports.
ARECAJBOD_ZFS
  5 disk zfs raidz test with Areca SATA ports configured in JBOD mode.
ARECAJBOD_ZFS_NoWriteCache
  5 disk zfs raidz test with Areca SATA ports configured in JBOD mode and with disk caches disabled.
ARECARAID
  5 disk zfs single-disk test with Areca raid5 array.
ARECAPASSTHRU
  5 disk zfs raidz test with Areca SATA ports configured in Passthru mode. This means that the onboard Areca cache is active.
ARECARAID-UFS2
  5 disk ufs2 single-disk test with Areca raid5 array.
ARECARAID-BIG
  12 disk zfs single-disk test with Areca raid5 array.
ARECAPASSTHRU_12
  12 disk zfs raidz test with Areca SATA ports configured in Passthru mode. This means that the onboard Areca cache is active.

I'll probably be opting for the ARECAPASSTHRU_12 configuration. Mainly because I do not need amazing read speeds (network port would be saturated anyway) and I think that the raidz implementation would be more fault tolerant. By that I mean if you have a disk read error during a rebuild then as I understand it, raidz will write off that block (and hopefully tell me about dead files) but continue with the rest of the rebuild.

This is something I'd love to test for real, just to see what happens. But I am not sure how I could do that. Perhaps removing one drive, then a few random writes to a remaining disk (or two) and seeing how it ...
... been a long time since I've seen someone link stuff on this list that won't show in Firefox. Pretty sad that it's just a table of values that would be just as well presented as text.
My guess is it probably has to do with the way ZFS does cache flushes: http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Cache_Flushes It might be worth it to disable the forced flushing and test again, if you feel like it. -Koen
I've just done this and the results are on the same page: http://www.dannysplace.net/quickweb/filesystem%20tests.htm The Excel version is here: http://www.dannysplace.net/quickweb/filesystem%20tests.xls It is a major improvement but I do not know 100% for sure if the disks are protected by the write cache/battery backup when in Passthrough mode. When creating a passthrough disk the "Volume Cache Mode" can be set to "Write Back" or "Write Through". This makes me feel as though the cache is being used and that when the cache is used, so is the BBU. But I cannot be 100% sure. I will send an email to Areca support to ask. -D
Those numbers are pretty good, right? Who needs onboard XOR anyway :)
Those numbers are great, but I would love to know that writes to the disks are also protected by the battery backup. If not then I'll be forced to use either hardware raid5/6 or perhaps some other configuration. Maybe 6 stripe sets in a raidz array? At the end of the day however I really don't care about the performance; even the slowest of the tests I did would be fast enough to saturate a gigabit ethernet port, which is way fast enough for me. But it's an interesting set of tests... -D
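The "saturate a gigabit ethernet port" remark can be sanity-checked with one division. The sketch below converts a link speed in Mbit/sec to MByte/sec using decimal units and ignoring protocol overhead, both of which are simplifying assumptions; `link_mbytes_per_sec` is a made-up helper name.

```sh
#!/bin/sh
# Raw line rate of a link in MByte/sec (decimal units, protocol overhead ignored).
link_mbytes_per_sec() {
    awk -v mbit="$1" 'BEGIN { printf "%.0f\n", mbit / 8 }'
}

link_mbytes_per_sec 1000
```

Gigabit Ethernet tops out at 125 MByte/sec on the wire, so an array that sustains more than that, such as the 150 MByte/sec block-write average from the earlier JBOD summary, would not be the bottleneck for network clients.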
Please let us know what Areca says about the caching. If you ask me, these results definitely are cached.
Ah yes, but are they cached by the OS or by the array controller? :-) -D
Sorry for the delay. Areca got back to me. It took a few days but I got someone who seemed be writeback or writethrough in some situations but when that is not an option, writethrough is the default. I could not get any information about read caching although I might send an email to see what happens.

Here is the transcript of the conversation:

Me: I have a rather simple question about the 1231 controller. Can you please explain the difference between using disks in JBOD mode and using disks in passthrough mode. I have a feeling that the controller uses its onboard cache when in passthrough mode. Is this the case? Also, are both read and write operations cached?

Areca Support: Dear Sir, the only difference is in JBOD mode, controller configure all drives as passthrough disk. in RAID mode, you have to configure passthrough disk by yourself in RAID mode in other words, you can use raid with passthrough disks at same time in RAID mode but JBOD mode not.

Me: So does that mean if I use passthrough, I am not protected by the cache/battery backup? I ask because there is an option for cache mode when creating a passthrough disk. i.e. Write-Back or Write-Through

Areca Support: Dear Sir, in JBOD mode, the default setting writeback mode. with writeback mode, you will need a battery module to protect the data remain in cache in case you got a power failure problem.

Me: And so in Passthrough mode I am still protected with the battery backup? So JBOD = WriteBack Cache with protection of the battery backup. Passthrough = WriteBack or WriteThrough also with protection of the battery backup. Is this correct?

Areca Support: Dear Sir, if you have battery module attached, yes.
ZFS does not require battery-backed disk cache, as long as disks and controller flush their cache when they are told to by the OS. ZFS then only issues sync/flush commands for the ZIL (transaction log), while the majority of I/Os are free to sit in cache and complete when they are ready. Data that is not fsync()'d by the application may be lost on power outage, but stuff like databases do fsync() so they are protected. - Andrew
There is a big difference between hardware RAID and ZFS raidz with 12 disks on the get_block test; maybe it would be interesting to rerun this test with zfs prefetch disabled? -- Regards, Nikolay Denev
Interesting. Wish I had seen it before. To be honest I did consider this board but I was really in favour of PCIe over PCIX. That might

Are you talking about the Areca cache or the disks' own caches? On my board it was enabled. But maybe mine was the exception.

-D
Disk caching is a completely different animal, and one which I didn't mention. I spoke only about the write cache on the controller. Mine all arrived off by default, which is a VERY reasonable default

Perhaps it's model specific, or your vendor configured it that way. Or you got a return that someone else monkeyed with. I'm not going to speak for Areca but it seems quite odd that Areca would ship them with the cache enabled. I've used many hundreds of RAID controllers over the years and without exception, every single one with a write cache had it disabled by default.

Matt
Ahhh, no I was talking about the disk cache setting. That is the one that is set to on by default (at least for me). I find it strange that this is the case. IMHO it makes the idea of a I guess I had a return model. It's not really a big deal. -D
I talked to a storage vendor of ours that has sold several SuperMicro systems like ours where the client was using OpenSolaris and having similar stability issues to what we see on FreeBSD. It seems to be a lack of maturity in ZFS that underlies these problems.

It appears that running ZFS on FreeBSD will either thrill or horrify. When I tested with modest I/O requirements, it worked great and I was tickled. But when I built these new systems as backup servers, I was generating immensely more disk I/O. I started with 7.0 release and saw crashes hourly. With tuning, I was only crashing once or twice a day (always memory related). With 16GB of RAM.

I ran for a month with one server on JBOD with RAIDZ2 and another with RAIDZ across two RAID 5 arrays. Then I lost a disk and consequently the array on the JBOD server. Since RAID 5 had proved to run so much faster, I ditched the Marvell cards, installed a pair of 1231MLs and reformatted it with RAID 5. Both 24 disk systems have been ZFS RAIDZ across two RAID 5 hardware arrays for months since. If I build another system tomorrow, that's exactly how I'd do it.

After upgrading to 8-HEAD and applying The Great ZFS Patch, I am content with only having to reboot the systems once every 7-12 days.

I have another system with only 8 disks and 4GB of RAM with ZFS running on a single RAID 5 array. Under the same workload as the 24 disk systems, it was crashing at least once a day. This was existing hardware, so we were confident it wasn't hardware issues. I finally resolved it by wiping the disks clean, creating a GPT partition on the array and using UFS. The system hasn't crashed once since and is far more responsive under heavy load than my ZFS systems.

Of course, all of this might get a fair bit better soon:
http://svn.freebsd.org/viewvc/base?view=revision&revision=185029

Matt
Hi, I am seeing I/O related lockups on 7.1-PRE with an Areca ARC-1220 controller and eight drives in a RAID-6 array. The same hardware works fine with 6.3. When I run gstat while it is happening I see I/O performance drop and the time to service each write (ms/w) goes up, and then suddenly goes back down to a sensible value. I have seen it get to about 22000ms. The system is essentially unusable for writes, which limits the utility a bit. Reads seem fine. Is this similar to the behaviour you saw? Thanks, Jan Mikkelsen
Not quite. The ZFS deadlock/hang affected both reads and writes, blocking either of them indefinitely. They were "fixed" by the most recent set of patches in -current.
