Re: [smartmontools-support] exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen

Previous thread: oops during reboot in device_shutdown() by xerces8 on Saturday, August 30, 2008 - 2:52 pm. (3 messages)

Next thread: 2.6.26.[1-3] + x61 tablet + x6ultrabase: no resume after undocking by Steven King on Saturday, August 30, 2008 - 3:35 pm. (15 messages)
From: Justin Piszcz
Date: Saturday, August 30, 2008 - 3:12 pm

I have the same controller in my host as well, but it does not appear to
matter whether it happens on the ICH8 controller or other controllers.

I have noticed that my Velociraptors produce the same or a similar error to
yours (more so than the regular old raptor150s). I ran all the same tests as
you did, but they got me no closer to finding the root cause.

Besides the annoying messages in the kernel log/syslog/dmesg, does it
affect your system stability in any way as of yet?

I must add a very important note here, though: you are using an ICH8 chipset
and so am I, and we both have the same or similar problems. However, I also
have another machine set up VERY similarly (except for different HDDs in the
RAID5), and its RAID1 is the same as on one of my ICH8 boxes (dual
raptor150s). To date it has never, or only rarely, thrown the frozen error,
except when a disk actually failed or when NCQ was enabled for a WD drive
(NCQ + Linux is broken for WD).

I have disks in RAID sets (both RAID1 and RAID5) that get the same or similar
warnings as I mentioned above, and so far I have not noticed any impact from
these specific errors.

I think for now we just have to live with them, I am not sure what else
to say here..

CC'ing linux-ide and linux-kernel with your original error from the start
of this e-mail thread:

Here is a snippet from this morning - this time it came back to life:

[46874.898690] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[46874.898703] ata3.00: cmd c8/00:08:90:3c:59/00:00:00:00:00/ef tag 0 dma 4096 in
[46874.898705]          res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[46874.898709] ata3.00: status: { DRDY }
[46879.643962] ata3: port is slow to respond, please be patient (Status 0xd0)
[46884.473195] ata3: device not ready (errno=-16), forcing hardreset
[46884.473202] ata3: soft resetting link
[46912.740010] ata3.00: qc timeout (cmd 0xec)
[46912.740020] ata3.00: failed to IDENTIFY (I/O ...
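
[Editor's sketch] The hex fields in the log excerpt above can be decoded by
hand, which helps when comparing reports from different machines. The short
Python sketch below is not part of either poster's tooling; the function
names are made up for illustration. It assumes the conventional ATA status
register bit names (BSY, DRDY, DF, DSC, DRQ, CORR, IDX, ERR) and knows only
the two command opcodes that appear in the excerpt (0xc8 and 0xec).

```python
# Conventional ATA status register bits, most significant first.
# The names are an assumption based on the classic ATA register layout.
ATA_STATUS_BITS = [
    (0x80, "BSY"),
    (0x40, "DRDY"),
    (0x20, "DF"),
    (0x10, "DSC"),
    (0x08, "DRQ"),
    (0x04, "CORR"),
    (0x02, "IDX"),
    (0x01, "ERR"),
]

# Only the opcodes seen in the log excerpt; anything else is reported as unknown.
ATA_OPCODES = {0xC8: "READ DMA", 0xEC: "IDENTIFY DEVICE"}


def decode_status(value):
    """Return the names of the status bits set in value."""
    return [name for mask, name in ATA_STATUS_BITS if value & mask]


def first_byte(taskfile):
    """Return the leading byte of a 'cmd' or 'res' taskfile string.

    For a 'cmd' line this is the command opcode; for a 'res' line it is
    the status register value.
    """
    return int(taskfile.split("/", 1)[0], 16)


def opcode_name(opcode):
    """Map an opcode to a name, falling back to a hex placeholder."""
    return ATA_OPCODES.get(opcode, "unknown (0x%02x)" % opcode)


if __name__ == "__main__":
    cmd = "c8/00:08:90:3c:59/00:00:00:00:00/ef"
    res = "40/00:01:01:4f:c2/00:00:00:00:00/00"
    print("command:", opcode_name(first_byte(cmd)))
    print("result status:", decode_status(first_byte(res)))
    print("slow-to-respond status 0xd0:", decode_status(0xD0))
```

Read this way, the "res" status byte 0x40 matches the "status: { DRDY }"
line, while the later Status 0xd0 has the busy bit set, which fits the
"port is slow to respond" message.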
From: Jonas Petersson
Date: Sunday, August 31, 2008 - 3:00 am

Hi again Justin,


Very much so, yes.

At best, all disk access will hang for a while and then resume after the 
reset has worked out - this often happens a couple of times per day now.

At worst, the reset will not work and the disk is remounted read-only 
and I can sort of use the system a bit this way. It seems somewhat 
random how much still works: Up until today I could at least always use 
dmesg and tail various logs to try to hunt down what happened, but this 
morning dmesg could not be found and I got I/O errors when accessing 
anything in /var/log. Rebooting helped as usual.

This fatal variant has happened about every second day lately.

The first two weeks I had the system showed nothing at all like this: I 
have log files since July 26 and the first recorded (reset-able) glitch 
is from Aug 16. Obviously, any non-resettable problem would have been 

Yes, I would not point fingers at the ICH8 chipset either: The other 
MacBookPro I have experimented with now is a 2,2 (ATI based) and has 
ICH7, but I'm 99.9% sure my previous MacBookPro 3,1 (nvidia based) was 
ICH8 and it worked flawlessly (I saw no reason to swap for the 4,1 
version, but it was stolen from me in June). As far as I know the 
significant differences with my current MBP are just: higher screen 
resolution, multitouch ("iphone") touchpad and more memory. Alas, I 

I'll just clarify that the errno after "revalidation failed" is not 
always -5. When it ends up fatal I've also seen -3 and possibly 
something else too. I would have taken a screen shot this morning if 
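
[Editor's sketch] The errno values mentioned in this thread (-16 in the
quoted log, -5 and -3 after "revalidation failed") can be turned into
symbolic names with Python's standard errno module. This is only a
convenience sketch: describe_errno is a hypothetical helper, and it assumes
the script runs on a Linux system, where these low-numbered kernel error
codes match the userspace ones.

```python
import errno
import os


def describe_errno(value):
    """Return (symbolic name, message) for a kernel-style negative errno."""
    code = abs(value)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # The three values reported in this thread.
    for value in (-16, -5, -3):
        name, message = describe_errno(value)
        print("errno=%d: %s (%s)" % (value, name, message))
```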

For the record: My current theory is that it is some kind of hardware 
problem, either in the disk or on the motherboard, so I have persuaded 
my local Apple Store to swap the hard disk on Monday, after which they will 
run their full hardware stress test (4+ hours according to him). The 
stress test was apparently suggested from the central repair people (who 
have no idea I run Linux on it - the local techie knows, but has no 
problem with ...