Hi

we are experiencing massive performance problems with two of our Linux servers that contain 3ware controllers on a Tyan mainboard and a couple of 1T disks.

During the daily cron job that uses rsync to sync a 500G file system from another machine to the raid on the 3ware controller the load jumps up, and the machine becomes sluggish as hell. For example, an ssh login to that machine takes minutes to complete and ldap becomes unreliable while the rsync job is running. Even Nagios complains about the machine being down while rsync is running.

We tried the Cent-OS 2.6.18-based kernel, 2.6.23.y and linus-git from today, but all three kernels show the same very poor performance as soon as data is written to the disks on the 3ware controller. In particular commit 1e6c38c, i.e.

	[SCSI] 3w-9xxx: fix abysmal write performance on some motherboards

which is contained in linus-current but not in the other two kernels mentioned above does not seem to make any difference.

We also tried different Raid Configurations, to no avail. ATM we're using a raid10 over 4 disks with write cache enabled. Below there's some more info about the card, dmesg and lspci output and our kernel config.

A similar machine works fine with FreeBSD, so I really think it's a problem with the linux driver.

ATM this machine is only used as a fallback for the main server, so we'll be able to reboot and test patches.

Thanks
Andre
--------------------------------------------------------------------
From the 3DM2 web interface:

Model: 9500S-4LP
Firmware: FE9X 2.08.00.009
Driver: 2.26.02.010
BIOS: BE9X 2.03.01.052
Memory Installed 112 MB
# of Ports 4
# of Units 1
# of Drives 4
--------------------------------------------------------------------
From dmesg (linus-git):

Driver 'sd' needs updating - please use bus_type methods
3ware 9000 Storage Controller device driver for Linux v2.26.02.010.
ACPI: PCI Interrupt 0000:03:03.0[A] -> GSI 24 (level, low) -> IRQ 24 in...
Could you give some numbers, please?

However, there are some known issues:

http://forums.storagereview.net/index.php?showtopic=25923
http://tumbleweed.org.za/2007/02/16/horrific-performance-with-3ware-raid

Symptoms are reasonable performance with large block ops, but really bad performance with small block ops.

	time (cp -a linux-2.6.24.2 linux-2.6.24.2b; sync)

With some tuning, this takes 50 seconds here with a 9650SE in a 4-disk raid5 setup (very bad, a single disk will do it in <30s!!!). But reading and writing large files with large block sizes usually runs at more than 100 MB/s.

Best regards,
Arnd
--
Thanks, this helped a lot. However, there does not seem to be a way to make the system more responsive, which is really the problem we are experiencing.

Andre
-- 
The only person who always got his work done by Friday was Robinson Crusoe
This is not 3ware-specific, but kernel 2.6.24 has new per-device write throttling that might help with the responsiveness issue:

http://kernelnewbies.org/LinuxChanges#head-92340ffcec39e7c2a09fd933243fb18eda57f1b4
http://lwn.net/Articles/245600/

Also, check to see if the 3ware controller has a background initialize or verify in progress, since that will obviously slow things down until it is complete.

Tony
--
Yes, but we tried both 2.6.24 and 2.6.25-rc, so Peter's new write-throttling code doesn't seem to help much in our situation. I'll play a bit with the various /proc/sys/vm/* knobs to see if that makes ...

That's certainly not the case.

Thanks
Andre
--
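For reference, the /proc/sys/vm knobs that control writeback can be listed with a few lines of shell. This is only a sketch: the four file names are the usual dirty-page tunables, while PROC_ROOT and show_dirty_knobs are made-up names for illustration, and nothing in this thread reports which values (if any) help on this hardware.

```shell
#!/bin/sh
# Sketch: list the writeback-related VM tunables and their current values.
# PROC_ROOT is a hypothetical override so the function can be pointed at a
# copy of the tree; on a real machine it defaults to /proc.
PROC_ROOT="${PROC_ROOT:-/proc}"

# Print "name = value" for each knob that exists on this kernel.
show_dirty_knobs() {
    for knob in dirty_ratio dirty_background_ratio \
                dirty_expire_centisecs dirty_writeback_centisecs; do
        file="$PROC_ROOT/sys/vm/$knob"
        if [ -r "$file" ]; then
            echo "$knob = $(cat "$file")"
        fi
    done
}
```

Calling show_dirty_knobs before and after an experiment makes it easy to record what was changed. Writing a smaller number into dirty_background_ratio makes background writeback start earlier, and dirty_ratio sets the point at which writing processes are throttled; whether that improves responsiveness here would have to be measured.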
Andre,

Can you try turning down /sys/block/sdX/device/queue_depth to 16 and see if that improves your responsiveness?

-Adam
--
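The suggestion above can be wrapped in a small helper that records the old value before changing it. This is a sketch under assumptions: SYSFS_ROOT, get_queue_depth and set_queue_depth are made-up names for illustration, and the device name has to be replaced with the one that sits on the 3ware unit.

```shell
#!/bin/sh
# Sketch: show and change the SCSI queue depth of one block device.
# SYSFS_ROOT is a hypothetical override for trying this out on a copy of
# the tree; on a real machine it defaults to /sys and writing needs root.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print the current queue depth of device $1 (e.g. sda).
get_queue_depth() {
    cat "$SYSFS_ROOT/block/$1/device/queue_depth"
}

# Set the queue depth of device $1 to $2 and report the old and new value.
set_queue_depth() {
    dev=$1
    depth=$2
    file="$SYSFS_ROOT/block/$dev/device/queue_depth"
    if [ ! -w "$file" ]; then
        echo "cannot write $file (wrong device name, or not root?)" >&2
        return 1
    fi
    old=$(cat "$file")
    echo "$depth" > "$file"
    echo "$dev: queue_depth $old -> $(cat "$file")"
}
```

As root, `set_queue_depth sda 16` would apply the value suggested above. A value written to sysfs does not survive a reboot, so it would have to be set again at boot time, for example from an init script.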
Yes, that setting seems to improve responsiveness greatly.

Thanks a lot
Andre
--
You're putting your box under astronomical load. This is generally regarded as a bad idea, regardless of how well your storage controller is performing.

Can you measure the single-threaded throughput (say, copying one huge file, and then syncing) to give us a baseline performance figure?

rsync will happily peg your box, your network, and your cat if you let it.

-- Chris
--
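A baseline of the kind requested above could be taken roughly as follows. This is a sketch, not the method anyone in the thread used: it writes zeros with dd instead of copying an existing file, the 1 MiB block size and the name write_test are assumptions, and conv=fsync (GNU dd) is used so that the time includes flushing the data to disk.

```shell
#!/bin/sh
# Sketch: single-threaded sequential write test.
# Writes $2 MiB of zeros to file $1 in 1 MiB blocks, waits until the data
# has been flushed to disk, and prints a rough throughput figure.
write_test() {
    target=$1
    mib=$2
    start=$(date +%s)
    dd if=/dev/zero of="$target" bs=1M count="$mib" conv=fsync 2>/dev/null || return 1
    end=$(date +%s)
    elapsed=$((end - start))
    # Whole-second resolution only: avoid dividing by zero on short runs.
    [ "$elapsed" -gt 0 ] || elapsed=1
    echo "$mib MiB in ${elapsed}s: about $((mib / elapsed)) MiB/s"
}
```

For a meaningful number the file should be much larger than the machine's RAM, for example `write_test /path/on/raid/tmpfile 20000`, where the path is a placeholder for a file on the array. Timing an ssh login from another machine while the test runs gives a simple measure of responsiveness.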
The machine becomes sluggish also when I write directly to the raid array. A simple

	dd if=/dev/zero of=tmpfile

Single threaded throughput seems to be ok (140M/s). The problem is that the machine becomes unresponsive.

Andre
--
Actually, it's normal for pdflush to spawn up to 8 threads when you're dirtying memory faster than it can be written to disk. Load going to 4

Does the machine become unresponsive during the single-threaded test, or only when doing the rsync?

-- Chris
--
It takes noticeably longer to ssh into the machine also in the single-threaded case (using dd to write to the device), but the system remains usable. When the rsync job is running, it becomes unusable quickly.

However, reducing the queue depth with

	echo 16 > /sys/block/sda/device/queue_depth

as suggested by Adam, solves all problems.

Thanks all for your help.
Andre
--
Hello Andre,

do you have the write-back cache of the controller enabled for your disks? When you disable this cache, the controller will also disable the disks' write cache, which results in a write performance of 3 to 8 MB/s per disk.

Cheers,
Bernd

--
Bernd Schubert
Q-Leap Networks GmbH
--
Yes, I do. Performance is poor anyway.

Andre
--
