Re: sata_sil24 broken since 2.6.23-rc4-mm1

To: Tejun Heo <htejun@...>, Jens Axboe <jens.axboe@...>
Cc: Jeff Garzik <jeff@...>, <linux-kernel@...>, <akpm@...>
Date: Sunday, October 7, 2007 - 10:39 am

[Adding Jens Axboe, the author of what looks like the probable cause]
On 10/7/07, Torsten Kaiser <just.for.lkml@googlemail.com> wrote:

Looking closer at
http://git.kernel.org/?p=linux/kernel/git/axboe/linux-2.6-block.git;a=commitdiff;h=ec6...
the change to libata.h seems bogus:

in ata_qc_first_sg:
old                                new
return qc->__sg                    return qc->__sg
qc->__sg - qc->__sg == 0           qc->n_iter=0
-> sg - qc->__sg corresponds to qc->n_iter

in ata_qc_next_sg:
sg++;                              sg_next(sg); qc->n_iter++;
sg - qc->__sg < qc->n_elem         qc->n_iter < qc->n_elem
-> sg - qc->__sg corresponds to qc->n_iter

but in ata_sg_is_last:
(sg - qc->__sg) +1 == qc->n_elem   qc->n_iter == qc->n_elem
If sg - qc->__sg corresponds to qc->n_iter, then shouldn't it be
qc->n_iter+1 == qc->n_elem?

That missing +1 would explain why SGE_TRM never gets set.
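
To make the suspected off-by-one concrete, here is a small stand-alone
model of the iteration. It is only a sketch, not the libata code:
struct qc_model and all function names are made up for illustration,
and only the n_iter and n_elem fields are taken from the comparison
above. It counts how many entries each variant of the "is last" check
would flag while walking the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two ata_queued_cmd fields discussed. */
struct qc_model {
    unsigned int n_elem; /* number of scatter/gather entries */
    unsigned int n_iter; /* index of the entry currently visited */
};

/* Check as in the new code: compares the index with the count. */
static bool is_last_as_committed(const struct qc_model *qc)
{
    return qc->n_iter == qc->n_elem;
}

/* Check with the +1, mirroring (sg - qc->__sg) + 1 == qc->n_elem. */
static bool is_last_with_plus_one(const struct qc_model *qc)
{
    return qc->n_iter + 1 == qc->n_elem;
}

/*
 * Walk the entries the way a driver would while filling its SGE table
 * and count how many of them the given check flags as the last one.
 */
static unsigned int count_terminators(unsigned int n_elem,
                                      bool (*is_last)(const struct qc_model *))
{
    struct qc_model qc = { .n_elem = n_elem, .n_iter = 0 };
    unsigned int marked = 0;

    assert(is_last != NULL);

    /* first_sg: n_iter = 0; next_sg: n_iter++ while n_iter < n_elem */
    for (qc.n_iter = 0; qc.n_iter < qc.n_elem; qc.n_iter++)
        if (is_last(&qc))
            marked++;

    return marked;
}
```

In this model the loop body only ever sees n_iter values from 0 to
n_elem-1, so the check without the +1 never fires, while the check
with the +1 fires exactly once, on the final entry.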

And it would fit the symptom that the boot fails at random: if the
"correct" garbage happens to sit where the sglist runs off the end, it
hangs the drive.
And that would even fit the two different errors that I got only once each:
* a completely illegal access (PCI master abort while fetching SGT)
* wrong alignment of the SGT (SGT not on qword boundary)
At those times the garbage seemed to point to invalid addresses.

But I still don't understand how the kernel could fail only
sometimes at bootup, yet work without any visible errors after
that. Is the sil chip rather intelligent about detecting corrupted
sglists and silently ignoring them?

Torsten
