Re: Possibly SATA related freeze killed networking and RAID

Previous thread: Re: [rfc] lockless get_user_pages for dio (and more) by Dave Kleikamp on Monday, December 10, 2007 - 5:30 pm. (7 messages)

Next thread: Re: Iomega ZIP-100 drive unsupported with jmicron JMB361 chip? by Robert Hancock on Monday, December 10, 2007 - 8:21 pm. (7 messages)
To: <linux-kernel@...>
Cc: <noah123@...>
Date: Monday, December 10, 2007 - 5:25 pm

Hello,

I think I'm experiencing the same problem:

09:16:34 : NETDEV WATCHDOG: eth0: transmit timed out
09:16:34 : eth0: Got tx_timeout. irq: 00000000
09:16:34 : eth0: Ring at 37e50000
09:16:34 : eth0: Dumping tx registers
09:16:34 :   0: 00000000 000000ff 00000003 025003ca 00000000 00000000 00000000 00000000
09:16:34 :  20: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

[...]

09:16:54 : ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
09:16:54 : ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
09:16:54 : ata6.00: cmd 25/00:08:1e:97:48/00:00:19:00:00/e0 tag 0 cdb 0x0 data 4096 in
09:16:54 :          res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
09:16:54 : ata5.00: cmd 25/00:70:1e:97:48/00:00:19:00:00/e0 tag 0 cdb 0x0 data 57344 in
09:16:54 :          res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
09:16:54 : ata6: soft resetting port
09:16:54 : ata5: soft resetting port
09:16:54 : ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
09:16:54 : ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
09:16:54 : NETDEV WATCHDOG: eth0: transmit timed out
09:16:54 : eth0: Got tx_timeout. irq: 00000032
09:16:54 : eth0: Ring at 37e50000
09:16:54 : eth0: Dumping tx registers
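
For anyone reading the excerpt: the "SATA link up" lines after the soft reset
include the raw SStatus and SControl registers. Below is a minimal sketch of
how one might decode them, assuming libata prints both values in hex and that
SStatus uses the usual SATA layout (DET in bits 3:0, SPD in bits 7:4, IPM in
bits 11:8). The function names are made up for illustration and are not part
of any kernel tool.

```python
import re

# Matches the register dump libata appends to "SATA link up" messages.
# Assumption: both values are printed as hexadecimal without a 0x prefix.
LINK_RE = re.compile(r"SStatus ([0-9A-Fa-f]+) SControl ([0-9A-Fa-f]+)")


def decode_sstatus(value):
    """Split an SStatus value into its DET, SPD and IPM fields."""
    return {
        "det": value & 0xF,          # 3 = device present, PHY communication established
        "spd": (value >> 4) & 0xF,   # 1 = 1.5 Gbps, 2 = 3.0 Gbps
        "ipm": (value >> 8) & 0xF,   # 1 = interface in active state
    }


def parse_link_line(line):
    """Return (decoded SStatus fields, raw SControl) or None if no match."""
    match = LINK_RE.search(line)
    if match is None:
        return None
    sstatus = int(match.group(1), 16)
    scontrol = int(match.group(2), 16)
    return decode_sstatus(sstatus), scontrol


if __name__ == "__main__":
    line = "09:16:54 : ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)"
    fields, scontrol = parse_link_line(line)
    print(fields, hex(scontrol))
```

Under those assumptions, SStatus 123 reads as DET=3, SPD=2, IPM=1, i.e. the
link itself came back up at 3.0 Gbps right after the reset on both ports.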

A more complete log can be found at:
http://www.e18.physik.tu-muenchen.de/~tnagel/misc/kernel-crash.log

The setup is strikingly similar to noah's (I'm quoting all of this from
memory; if anybody is interested in more detail, just ask):

Kernel: 2.6.22 (amd64, Debian patches, tainted)
Mainboard: Asus M2N-SLI Deluxe (nForce 570 SLI MCP --> MCP55, same as noah)
CPU: Athlon64 Dual-Core (same as noah)
RAM: 1GB
HD: 22 x Samsung HD501LJ 500GB (same as noah), 1-6 connected to chipset,
7-22 connected to RocketRaid 2340.

Like noah, I'm using software RAID (levels 1, 5 and 6), and as in noah's
case the problem occurred during a RAID check, in my case under heavy NFS
load which had been ongoing for ~4 days.  This is the ...