Re: Patch RFC: Promise SATA300 TX4 hardware bug workaround.

From: Alexander Sabourenkov
Date: Thursday, November 1, 2007 - 3:34 pm

Hello.

I have ported the workaround for the hardware bug that causes data
corruption on Promise SATA300 TX4 cards to RELENG_7.

Bug description:
SATA300 TX4 hardware chokes if the last PRD entry (in a DMA transfer) is
larger than 164 bytes. This was found while analysing the vendor-supplied
Linux driver.

Workaround:
Split the trailing PRD entry if it's larger than 164 bytes.
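
The split can be sketched in isolation as follows. This is a minimal sketch under stated assumptions, not the code from the patch: struct prd_entry, split_trailing_prd() and its capacity parameter are hypothetical names, the real table layout (including any end-of-table flag) is not modeled, and the 164-byte limit is simply the figure from the bug description, passed in as a parameter so it can be changed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified PRD entry; not the driver's own structure. */
struct prd_entry {
    uint32_t addr;   /* physical address of the segment */
    uint32_t count;  /* segment length in bytes */
};

/* 41 dwords = 164 bytes, the limit given in the bug description. */
#define MAX_LAST_PRD_BYTES (41 * 4)

/*
 * Make sure the final entry of prd[0..n-1] is at most max_last bytes by
 * moving its tail into a new entry appended after it.
 * Returns the new entry count, or -1 if the arguments are invalid or the
 * table (capacity entries in total) has no free slot for the split.
 */
int
split_trailing_prd(struct prd_entry *prd, int n, int capacity,
    uint32_t max_last)
{
    assert(prd != NULL);
    if (n <= 0 || n > capacity || max_last == 0)
        return -1;
    if (prd[n - 1].count <= max_last)
        return n;                 /* already small enough, nothing to do */
    if (n == capacity)
        return -1;                /* no spare entry available */

    /* Keep the head in place, move the last max_last bytes to a new entry. */
    uint32_t head = prd[n - 1].count - max_last;
    prd[n].addr = prd[n - 1].addr + head;
    prd[n].count = max_last;
    prd[n - 1].count = head;
    return n + 1;
}
```

For a two-entry table whose last entry is 1000 bytes long, the call returns 3 and leaves trailing entries of 836 and 164 bytes. The -1 return for a full table is the case that the spare-entry concern is about.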

The two supplied patches fix the problem on my machine.

There is, however, a style problem with them. It seems the PRD entry
count is limited to 256. I have not found a good way to guarantee that
one entry is always available to do the split, hence the ugly solution of
patching ata-dma.c.


Patches, patched and original files are at http://lxnt.info/tx4/freebsd/.


--- ata-chipset.c.orig	2007-11-02 01:05:49.000000000 +0300
+++ ata-chipset.c	2007-11-02 01:05:49.000000000 +0300
@@ -142,6 +142,7 @@
 static int ata_promise_mio_command(struct ata_request *request);
 static void ata_promise_mio_reset(device_t dev);
 static void ata_promise_mio_dmainit(device_t dev);
+static void ata_promise_mio_dmasetprd(void *xsc, bus_dma_segment_t *segs, int nsegs, int error);
 static void ata_promise_mio_setmode(device_t dev, int mode);
 static void ata_promise_sx4_intr(void *data);
 static int ata_promise_sx4_command(struct ata_request *request);
@@ -185,7 +186,6 @@
 static int ata_check_80pin(device_t dev, int mode);
 static int ata_mode2idx(int mode);

-
 /*
  * generic ATA support functions
  */
@@ -3759,8 +3759,44 @@
 static void
 ata_promise_mio_dmainit(device_t dev)
 {
+    struct ata_channel *ch = device_get_softc(dev);
+	
     /* note start and stop are not used here */
     ata_dmainit(dev);
+
+    if (ch->dma)
+	ch->dma->setprd = ata_promise_mio_dmasetprd;
+}
+
+static void
+ata_promise_mio_dmasetprd(void *xsc, bus_dma_segment_t *segs, int nsegs, int error)
+{
+    #define PDC_MAXLASTSGSIZE 41*4
+    struct ata_dmasetprd_args *args = xsc;
+    struct ata_dma_prdentry *prd = args->dmatab;
+    int ...
From: Søren Schmidt
Date: Friday, November 2, 2007 - 3:27 am

Good catch!

However, from my quick glimpse at the Promise sources the limit seems to
be 32 Dwords, i.e. 32*4 = 128 bytes.
I'll investigate further and ask Promise for the gory details, stay tuned...
I don't think the PRD count limitation is a real problem; I've never seen
that long a list, and IIRC we never do more than 64K transfers in one go
(yet).
Anyhow, I need to get checks in for that, not just here...
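
To put rough numbers on the PRD count argument, here is a minimal sketch of the worst-case arithmetic. It assumes 4 KiB pages and one PRD entry per page touched (no coalescing of adjacent pages); worst_case_segments() and prd_table_has_room() are hypothetical helpers, and 256 is the entry limit mentioned earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>

#define PRD_TABLE_ENTRIES 256   /* entry limit mentioned in the thread */

/*
 * Worst-case number of pages touched by a buffer of len bytes that may
 * start at any offset within a page (assumption: one PRD entry per page).
 */
size_t
worst_case_segments(size_t len, size_t page_size)
{
    assert(page_size > 0);
    if (len == 0)
        return 0;
    /* Worst case: the buffer starts on the last byte of a page. */
    return (len + page_size - 2) / page_size + 1;
}

/*
 * Returns 1 if the table can hold the worst-case segment list plus one
 * spare entry for splitting the trailing segment, 0 otherwise.
 */
int
prd_table_has_room(size_t len, size_t page_size)
{
    return worst_case_segments(len, page_size) + 1 <= PRD_TABLE_ENTRIES;
}
```

Under these assumptions a 64K transfer needs at most 17 entries, or 18 with the spare, far below 256. A single 1 MiB transfer could need 257 plus the spare, which is the kind of case an explicit check would have to reject.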

Give me a few days and I'll get this figured out for 7-rel...



_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
From: Søren Schmidt
Date: Friday, November 2, 2007 - 3:41 am

Oh, and I forgot: do you have a surefire way to reproduce the problem so
the fix can be tested?

I've never been able to trigger this problem myself so far.


-Søren

From: Alexander Sabourenkov
Date: Friday, November 2, 2007 - 3:57 am

In (current) practice, yes, but check should be there even if only to 

dd if=/dev/ad8 of=/dev/null bs=1048576 count=1000 works every time.

I have tested it on my home machine:

Without the patch, the first timeouts and errors appear about 10 seconds
into the read.

With the patch, a read of the entire disk (320G) completed without errors.

Previous tests of analogous linux driver fix shown no errors and no data 

Seems like the bug is highly configuration-dependent, or
PCI-chipset-dependent, or just present in some production runs and not others.

-- 

./lxnt
From: Ulf Lilleengen
Date: Friday, November 16, 2007 - 7:43 am

Hi,

I tried the patch, but I end up with the partition table being incorrectly
read (probably) on the drives connected to my TX4 card. Normally, there's
one partition on the drive, but when I apply the patch, the drive provider
(ad6) is all that shows up in /dev. 

When I revert the patch, the partition (ad6s1) shows up in /dev again.

I applied both the ata-chipset patch and ata-dma patch to a RELENG_7 system.

-- 
Ulf Lilleengen
From: Søren Schmidt
Date: Friday, November 16, 2007 - 8:43 am

You should try the attached "official" patch and let me know if that
helps, thanks!

-Søren
From: Søren Schmidt
Date: Monday, November 19, 2007 - 1:02 am

Hi All!

I'd like to get the final verdict on the attached patch and whether it
fixes the problem or not.

Please test and report; it's a bit urgent if it needs to get into R7 :)


-Søren
From: Ulf Lilleengen
Date: Monday, November 19, 2007 - 3:34 am

Hi!

I'm sorry I wasn't able to test this earlier, but my office was locked during
the weekend and I was therefore not able to test until today. 

But the good news is, it works. I get no error messages when reading or writing
data to the drives anymore, and the partition table is correctly read so that
the correct device nodes show up. This should definitely go into 7.0 imho if
no bugs show up.

Thanks!

-- 
Ulf Lilleengen
From: Ari Suutari
Date: Monday, November 19, 2007 - 11:49 pm

I have a Promise TX2 (PDC20575). It didn't work with the 7.0 betas
before, but with this patch things run as well as they did
on 6.x.

     Ari S.


From: Thierry Herbelot
Date: Tuesday, November 20, 2007 - 12:13 am

Hello,

Does anyone have an idea why the Promise controllers seemed to work correctly
under 6.x, but then have issues with 7.0? (More precisely: was the existing
bug not triggered by the 6.x kernel?)

	Thierry
From: Joao Barros
Date: Tuesday, November 20, 2007 - 3:20 am

Apparently not all Promise controllers are/were affected. I've been
running CURRENT since Pawel committed ZFS with an onboard Promise:

atapci0: <Promise PDC20319 SATA150 controller> port
0xb000-0xb03f,0xb400-0xb40f,0xb800-0xb87f mem
0xfc024000-0xfc024fff,0xfc000000-0xfc01ffff irq 23 at device 4.0 on
pci4
ar0: 305245MB <Promise Fasttrak RAID0 (stripe 64 KB)> status: READY
ar1: 305245MB <Promise Fasttrak RAID0 (stripe 64 KB)> status: READY

atapci0@pci0:4:4:0:     class=0x010400 card=0x80f51043 chip=0x3319105a
rev=0x02 hdr=0x00
    vendor     = 'Promise Technology Inc'
    device     = 'PDC20319(??) FastTrak SATA150 TX4 Controller'
    class      = mass storage
    subclass   = RAID

The only problem I have, and I'm filing a PR for that, is that when booting
from CD with the controller enabled, the BTX loader just reboots.

-- 
Joao Barros
From: Søren Schmidt
Date: Tuesday, November 20, 2007 - 8:51 am

The problem is in the Promise HW, so it is bound to happen on 6.x as well.
That's at least not an ATA problem :)

-Søren


From: Thierry Herbelot
Date: Monday, November 19, 2007 - 11:38 am

Hello SoS,

From what I read, it seems that the last promise-fix3 patch is the same as the
previous promise-fix2, except for a cosmetic change.

Then I'd say go for it, as I was happy with promise_fix2.

	Thanks

	TfH
From: Matthew D. Fuller
Date: Sunday, December 9, 2007 - 8:44 pm

On Mon, Nov 19, 2007 at 09:02:33AM +0100 I heard the voice of

Behind the curve, as usual, I just upgraded one of my systems that's
had the problem in the past to RELENG_7 (which has the fix).  It's
since moved a bunch of data and done a bunch of builds without a hint
of trouble, so looks good to me.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.