Performance problems with 3ware 9500S-4LP and 2.6.25-rc3

To: adam radford <aradford@...>
Cc: Tony Battersby <tonyb@...>, Johannes <johannes.woerner@...>, James Bottomley <James.Bottomley@...>, linux-scsi <linux-scsi@...>, linux-kernel <linux-kernel@...>
Date: Tuesday, February 26, 2008 - 1:43 pm

Hi

we are experiencing massive performance problems with two of our
Linux servers that contain 3ware controllers on a Tyan mainboard and
a couple of 1T disks.

During the daily cron job that uses rsync to sync a 500G file system
from another machine to the raid on the 3ware controller the load
jumps up, and the machine becomes sluggish as hell. For example, an
ssh login to that machine takes minutes to complete and ldap becomes
unreliable while the rsync job is running. Even Nagios complains
about the machine being down while rsync is running.

We tried the CentOS 2.6.18-based kernel, 2.6.23.y and linus-git from
today, but all three kernels show the same very poor performance as
soon as data is written to the disks on the 3ware controller.

In particular commit 1e6c38c, i.e.

	    [SCSI] 3w-9xxx: fix abysmal write performance on some motherboards

which is contained in linus-current but not in the other two kernels
mentioned above does not seem to make any difference.

We also tried different Raid Configurations, to no avail. ATM we're
using a raid10 over 4 disks with write cache enabled.

Below there's some more info about the card, dmesg and lspci output
and our kernel config. A similar machine works fine with FreeBSD,
so I really think it's a problem with the linux driver.

ATM this machine is only used as a fallback for the main server,
so we'll be able to reboot and test patches.

Thanks
Andre
--------------------------------------------------------------------
From the 3DM2 web interface:

	Model: 9500S-4LP
	Firmware: FE9X 2.08.00.009
	Driver: 2.26.02.010
	BIOS: BE9X 2.03.01.052
	Memory Installed: 112 MB
	# of Ports: 4
	# of Units: 1
	# of Drives: 4

--------------------------------------------------------------------
From dmesg (linus-git):

	Driver 'sd' needs updating - please use bus_type methods
	3ware 9000 Storage Controller device driver for Linux v2.26.02.010.
	ACPI: PCI Interrupt 0000:03:03.0[A] -> GSI 24 (level, low) -> IRQ 24
	in...
To: Andre Noll <maan@...>
Cc: adam radford <aradford@...>, Tony Battersby <tonyb@...>, Johannes Wörner <johannes.woerner@...>, James Bottomley <James.Bottomley@...>, linux-scsi <linux-scsi@...>, linux-kernel <linux-kernel@...>
Date: Tuesday, February 26, 2008 - 7:07 pm

Could you give some numbers, please?

However there are some known issues:

http://forums.storagereview.net/index.php?showtopic=25923
http://tumbleweed.org.za/2007/02/16/horrific-performance-with-3ware-raid

Symptoms are reasonable performance with large block ops, but really bad performance with small block ops.

	time (cp -a linux-2.6.24.2 linux-2.6.24.2b; sync)

With some tuning, this takes 50 seconds here on a 9650SE in a 4-disk raid5 setup (very bad; a single disk does it in <30s!).
But reading and writing large files with large block sizes usually exceeds 100 MB/s.


Best regards,
Arnd
--
To: Arnd Hannemann <hannemann@...>
Cc: adam radford <aradford@...>, Tony Battersby <tonyb@...>, Johannes <johannes.woerner@...>, James Bottomley <James.Bottomley@...>, linux-scsi <linux-scsi@...>, linux-kernel <linux-kernel@...>
Date: Wednesday, February 27, 2008 - 6:11 am

Thanks, this helped a lot. However, there does not seem to be a way
to make the system more responsive, which is really the problem we
are experiencing.

Andre
--
The only person who always got his work done by Friday was Robinson Crusoe
To: Andre Noll <maan@...>
Cc: Arnd Hannemann <hannemann@...>, adam radford <aradford@...>, Johannes Wörner <johannes.woerner@...>, James Bottomley <James.Bottomley@...>, linux-scsi <linux-scsi@...>, linux-kernel <linux-kernel@...>
Date: Wednesday, February 27, 2008 - 10:26 am

This is not 3ware-specific, but kernel 2.6.24 has new per-device write
throttling that might help with the responsiveness issue:

http://kernelnewbies.org/LinuxChanges#head-92340ffcec39e7c2a09fd933243fb18eda57f1b4
http://lwn.net/Articles/245600/

Also, check to see if the 3ware controller has a background initialize
or verify in progress, since that will obviously slow things down until
it is complete.

Tony

--
To: Tony Battersby <tonyb@...>
Cc: Arnd Hannemann <hannemann@...>, adam radford <aradford@...>, Johannes <johannes.woerner@...>, James Bottomley <James.Bottomley@...>, linux-scsi <linux-scsi@...>, linux-kernel <linux-kernel@...>
Date: Wednesday, February 27, 2008 - 1:05 pm

Yes, but we tried both 2.6.24 and 2.6.25-rc, so Peter's new
write-throttling code doesn't seem to help much in our situation. I'll
play a bit with the various /proc/sys/vm/* knobs to see if that makes

That's certainly not the case.

Thanks
Andre
--
The only person who always got his work done by Friday was Robinson Crusoe
To: Andre Noll <maan@...>
Cc: Tony Battersby <tonyb@...>, Arnd Hannemann <hannemann@...>, Johannes Wörner <johannes.woerner@...>, James Bottomley <James.Bottomley@...>, linux-scsi <linux-scsi@...>, linux-kernel <linux-kernel@...>
Date: Wednesday, February 27, 2008 - 4:06 pm

Andre,

Can you try turning down /sys/block/sdX/device/queue_depth to 16 and
see if that improves your responsiveness?

-Adam
--
To: adam radford <aradford@...>
Cc: Tony Battersby <tonyb@...>, Arnd Hannemann <hannemann@...>, Johannes <johannes.woerner@...>, James Bottomley <James.Bottomley@...>, linux-scsi <linux-scsi@...>, linux-kernel <linux-kernel@...>
Date: Thursday, February 28, 2008 - 5:46 am

Yes, that setting seems to improve responsiveness greatly.

Thanks a lot
Andre
--
The only person who always got his work done by Friday was Robinson Crusoe
To: Andre Noll <maan@...>
Cc: adam radford <aradford@...>, Tony Battersby <tonyb@...>, Johannes Wörner <johannes.woerner@...>, James Bottomley <James.Bottomley@...>, linux-scsi <linux-scsi@...>, linux-kernel <linux-kernel@...>
Date: Tuesday, February 26, 2008 - 2:33 pm

You're putting your box under astronomical load.  This is generally 
regarded as a bad idea, regardless of how well your storage controller 
is performing.  Can you measure the single-threaded throughput (say, 
copying one huge file, and then syncing) to give us a baseline 
performance figure?  rsync will happily peg your box, your network, and 
your cat if you let it.

	-- Chris
--
To: Chris Snook <csnook@...>
Cc: adam radford <aradford@...>, Tony Battersby <tonyb@...>, Johannes <johannes.woerner@...>, James Bottomley <James.Bottomley@...>, linux-scsi <linux-scsi@...>, linux-kernel <linux-kernel@...>
Date: Wednesday, February 27, 2008 - 6:11 am

The machine becomes sluggish also when I write directly to the raid
array. A simple

	dd if=/dev/zero of=tmpfile


Single threaded throughput seems to be ok (140M/s). The problem is
that the machine becomes unresponsive.
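
For a repeatable baseline figure, a bounded variant of the dd test can be sketched as below. This is only a rough sketch: the function name write_test and its arguments are made up for illustration, bs=1M assumes GNU dd, and the reported rate is coarse because it uses whole seconds. It includes the time taken by sync, so the figure reflects data flushed to disk rather than data sitting in the page cache.

```shell
#!/bin/sh
# write_test FILE SIZE_MB  (hypothetical helper, not part of any tool)
# Writes SIZE_MB MiB of zeros to FILE, flushes dirty data, and prints
# the elapsed wall-clock time and a rough throughput figure.
write_test() {
    file=$1
    size_mb=$2
    start=$(date +%s)
    # bs=1M is GNU dd syntax; count bounds the run so it ends on its own.
    dd if=/dev/zero of="$file" bs=1M count="$size_mb" 2>/dev/null || return 1
    # Include the flush in the timing, otherwise the page cache hides the disk.
    sync
    end=$(date +%s)
    elapsed=$((end - start))
    # Avoid dividing by zero for very short runs.
    [ "$elapsed" -gt 0 ] || elapsed=1
    echo "wrote ${size_mb} MiB in ${elapsed}s (~$((size_mb / elapsed)) MiB/s)"
}

# Example: write_test tmpfile 4096
```

The file should be larger than RAM if the goal is to see sustained write speed; trying an ssh login from another machine while it runs gives a crude responsiveness check.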

Andre
--
The only person who always got his work done by Friday was Robinson Crusoe
To: Andre Noll <maan@...>
Cc: adam radford <aradford@...>, Tony Battersby <tonyb@...>, Johannes Wörner <johannes.woerner@...>, James Bottomley <James.Bottomley@...>, linux-scsi <linux-scsi@...>, linux-kernel <linux-kernel@...>
Date: Wednesday, February 27, 2008 - 3:55 pm

Actually, it's normal for pdflush to spawn up to 8 threads when you're 
dirtying memory faster than it can be written to disk.  Load going to 4 

Does the machine become unresponsive during the single-threaded test, or 
only when doing the rsync?

	-- Chris
--
To: Chris Snook <csnook@...>
Cc: adam radford <aradford@...>, Tony Battersby <tonyb@...>, Johannes <johannes.woerner@...>, James Bottomley <James.Bottomley@...>, linux-scsi <linux-scsi@...>, linux-kernel <linux-kernel@...>
Date: Thursday, February 28, 2008 - 5:46 am

It takes noticeably longer to ssh into the machine also in the
single-threaded case (using dd to write to the device), but the
system remains usable. When the rsync job is running, it becomes
unusable quickly.

However, reducing the queue depth with

	echo 16 > /sys/block/sda/device/queue_depth

as suggested by Adam, solves all problems.
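
A small wrapper around that command, as a sketch only: the function name set_queue_depth and the SYSFS_ROOT override are invented here so the logic can be tried against a scratch directory; on a real system the path is the /sys/block/sdX/device/queue_depth file named above. It prints the old value first so the change can be undone, and it must run as root. A value written this way does not survive a reboot, so it would need to be reapplied at boot.

```shell
#!/bin/sh
# set_queue_depth DEV DEPTH  (hypothetical helper)
# Lowers the SCSI queue depth of one block device via sysfs.
# SYSFS_ROOT is an assumed override for testing; it defaults to /sys.
set_queue_depth() {
    dev=$1
    depth=$2
    knob="${SYSFS_ROOT:-/sys}/block/$dev/device/queue_depth"
    if [ ! -w "$knob" ]; then
        echo "cannot write $knob (wrong device name, or not root?)" >&2
        return 1
    fi
    # Show the previous value so it can be restored later.
    echo "old queue_depth for $dev: $(cat "$knob")"
    echo "$depth" > "$knob"
    echo "new queue_depth for $dev: $(cat "$knob")"
}

# Example (as root): set_queue_depth sda 16
```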

Thanks all for your help.
Andre
--
The only person who always got his work done by Friday was Robinson Crusoe
To: Andre Noll <maan@...>
Cc: adam radford <aradford@...>, Tony Battersby <tonyb@...>, Johannes <johannes.woerner@...>, James Bottomley <James.Bottomley@...>, linux-scsi <linux-scsi@...>, linux-kernel <linux-kernel@...>
Date: Tuesday, February 26, 2008 - 1:54 pm

Hello Andre,


Do you have the write-back cache of the controller enabled for your disks? 
When you disable this cache, the controller will also disable the disks' 
caches, which causes write performance of only 3 to 8 MB/s per disk.

Cheers,
Bernd


-- 
Bernd Schubert
Q-Leap Networks GmbH
--
To: Bernd Schubert <bs@...>
Cc: adam radford <aradford@...>, Tony Battersby <tonyb@...>, Johannes <johannes.woerner@...>, James Bottomley <James.Bottomley@...>, linux-scsi <linux-scsi@...>, linux-kernel <linux-kernel@...>
Date: Wednesday, February 27, 2008 - 6:10 am

Yes, I do. Performance is poor anyway.

Andre
--
The only person who always got his work done by Friday was Robinson Crusoe