I did not try it.
I'm not an libata expert and while this change looks suspicios, I
can't be 100% sure if that change was intended.
And I did not want to experiment this deep in the code and risk
corrupting the hole drive.
And that is, why I'm so unsure.
The error looks to serious to only cause random failures on one of two
drives on bootup.
I never had trouble with the remaining drive on the SiI-chip or both
drives if one got killed during booting.
I'm guessing that leaving the computer powered down long enough fills
the RAM with a special pattern that really hangs the drive, while
normaly it would just reject the invalid data. (I have ECC-RAM, does
this matter?)
Another guess might be that most of the time the Sil-chip correctly
terminates after the transfer-length is reached, even if SGE_TRM is
missing...
Torsten
-