Re: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang

Previous thread: Re: REGRESSION: 2.6.20-rc3-git4: EIO not returned to direct i/o application following disk error by Michael Reed on Tuesday, January 9, 2007 - 7:05 pm. (1 message)

Next thread: [PATCH 1 of 2]: Make BH_Unwritten a first class bufferhead flag V2 by David Chinner on Tuesday, January 9, 2007 - 8:33 pm. (2 messages)
To: linux-kernel Mailing List <linux-kernel@...>
Cc: Jeb Cramer <cramerj@...>, John Ronciak <john.ronciak@...>, Jesse Brandeburg <jesse.brandeburg@...>, Jeff Kirsher <jeffrey.t.kirsher@...>, Auke Kok <auke-jan.h.kok@...>
Date: Tuesday, January 9, 2007 - 6:27 pm

Linux 2.6.19.1 SMP on Pentium D, Intel DQ965GF mobo.
Got this while bittorrenting knoppix:

2007-01-09 22:53:40.020693500 <4>NETFILTER drop IN=eth0 OUT= MAC=00:19:d1:00:5f:01:00:05:00:1c:58:1c:08:00 SRC=83.46.5.76 DST=80.223.106.128 LEN=121 TOS=0x00 PREC=0x00 TTL=112 ID=53273 PROTO=ICMP TYPE=3 CODE=3 [SRC=80.223.106.128 DST=192.168.1.37 LEN=93 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=UDP SPT=6881 DPT=6895 LEN=73 ]
2007-01-09 22:53:41.660249500 <3>e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
2007-01-09 22:53:41.660253500 <4> Tx Queue <0>
2007-01-09 22:53:41.660254500 <4> TDH <3c>
2007-01-09 22:53:41.660255500 <4> TDT <ca>
2007-01-09 22:53:41.660255500 <4> next_to_use <ca>
2007-01-09 22:53:41.660256500 <4> next_to_clean <3c>
2007-01-09 22:53:41.660257500 <4>buffer_info[next_to_clean]
2007-01-09 22:53:41.660258500 <4> time_stamp <8c3b8e4>
2007-01-09 22:53:41.660259500 <4> next_to_watch <3f>
2007-01-09 22:53:41.660274500 <4> jiffies <8c3bf13>
2007-01-09 22:53:41.660275500 <4> next_to_watch.status <0>
2007-01-09 22:53:42.660365500 <3>e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
2007-01-09 22:53:42.660368500 <4> Tx Queue <0>
2007-01-09 22:53:42.660369500 <4> TDH <3c>
2007-01-09 22:53:42.660370500 <4> TDT <ca>
2007-01-09 22:53:42.660370500 <4> next_to_use <ca>
2007-01-09 22:53:42.660371500 <4> next_to_clean <3c>
2007-01-09 22:53:42.660372500 <4>buffer_info[next_to_clean]
2007-01-09 22:53:42.660373500 <4> time_stamp <8c3b8e4>
2007-01-09 22:53:42.660374500 <4> next_to_watch <3f>
2007-01-09 22:53:42.660389500 <4> jiffies <8c3c2fb>
2007-01-09 22:53:42.6...
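
The dump above can be read with a small script. This is only a sketch under stated assumptions, not how the driver itself detects the hang. It assumes TDH/TDT are the hardware head/tail indices of the Tx descriptor ring and that time_stamp is the jiffies value recorded when the stuck buffer was queued. The ring size of 256 is an assumption (it is not stated in this thread). HZ=1000 is inferred from the two dumps above, which are one second and 0x3e8 jiffies apart. The function names are made up for illustration.

```python
import re

# Assumption: 256 Tx descriptors; the thread does not state the ring size.
RING_SIZE = 256
# Assumption: HZ=1000, inferred from the two dumps (1 s apart, 0x3e8 jiffies apart).
HZ = 1000

# Matches lines such as "<4> TDH <3c>" or "<4> next_to_watch.status <0>".
FIELD_RE = re.compile(r"<4>\s*([\w.]+)\s+<([0-9a-fA-F]+)>")

# Lines copied from the first dump in this message.
SAMPLE = """\
2007-01-09 22:53:41.660253500 <4> Tx Queue <0>
2007-01-09 22:53:41.660254500 <4> TDH <3c>
2007-01-09 22:53:41.660255500 <4> TDT <ca>
2007-01-09 22:53:41.660255500 <4> next_to_use <ca>
2007-01-09 22:53:41.660256500 <4> next_to_clean <3c>
2007-01-09 22:53:41.660257500 <4>buffer_info[next_to_clean]
2007-01-09 22:53:41.660258500 <4> time_stamp <8c3b8e4>
2007-01-09 22:53:41.660259500 <4> next_to_watch <3f>
2007-01-09 22:53:41.660274500 <4> jiffies <8c3bf13>
2007-01-09 22:53:41.660275500 <4> next_to_watch.status <0>
""".splitlines()


def parse_hang_dump(lines):
    """Collect the single-word hex fields of one hang dump into a dict."""
    fields = {}
    for line in lines:
        match = FIELD_RE.search(line)
        if match:
            fields[match.group(1)] = int(match.group(2), 16)
    return fields


def summarize(fields, ring_size=RING_SIZE, hz=HZ):
    """Return queued-descriptor count and age of the oldest buffer."""
    # Modulo handles the case where the tail has wrapped past the ring end.
    pending = (fields["TDT"] - fields["TDH"]) % ring_size
    # Ignores jiffies counter wrap-around, which is fine for a short interval.
    age_jiffies = fields["jiffies"] - fields["time_stamp"]
    return {
        "pending": pending,
        "age_jiffies": age_jiffies,
        "age_seconds": age_jiffies / hz,
    }


if __name__ == "__main__":
    print(summarize(parse_hang_dump(SAMPLE)))
```

Under these assumptions the first dump shows 142 descriptors queued between head and tail, with the oldest buffer 1583 jiffies old. TDH stays at 3c in both dumps, so the hardware head is not advancing.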

To: linux-kernel Mailing List <linux-kernel@...>, <7atbggg02@...>
Cc: Jeb Cramer <cramerj@...>, John Ronciak <john.ronciak@...>, Jesse Brandeburg <jesse.brandeburg@...>, Jeff Kirsher <jeffrey.t.kirsher@...>
Date: Tuesday, January 9, 2007 - 7:59 pm

I'm not sure whether v7.2.x already disables TSO automatically at 100 Mbit link speed;
probably not. It should.

Please try our updated driver from http://e1000.sf.net/ (7.3.20) against the same
kernel. There are some changes with regard to the ich8/TSO driver that might affect
this, so re-testing is worth it for us.

Also, please always include the full dmesg output. Feel free to CC
e1000-devel@lists.sourceforge.net on this.

Auke

To: linux-kernel Mailing List <linux-kernel@...>
Cc: Jeb Cramer <cramerj@...>, John Ronciak <john.ronciak@...>, Jesse Brandeburg <jesse.brandeburg@...>, Jeff Kirsher <jeffrey.t.kirsher@...>, Auke Kok <auke-jan.h.kok@...>
Date: Tuesday, January 9, 2007 - 9:10 pm

I now run 7.3.20-NAPI.

BTW, the Makefile is buggy: it does not take CC from the kernel's Makefile.
Using the wrong compiler can cause, for example, a reboot when loading the module.
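
One way to spot such a mismatch before loading the module is to compare the compiler recorded in /proc/version with the one that will build the module. The following is a rough sketch, assuming /proc/version contains a "gcc version X.Y.Z" substring (the exact format varies between kernel versions) and that `gcc -dumpversion` prints a comparable version string (newer gcc releases may print only the major number). The function names are hypothetical.

```python
import re
import subprocess


def kernel_gcc_version(proc_version_text):
    """Return the gcc version embedded in /proc/version text, or None."""
    match = re.search(r"gcc version (\d+(?:\.\d+)*)", proc_version_text)
    return match.group(1) if match else None


def same_compiler(proc_version_text, dumpversion_output):
    """True if the kernel's recorded gcc version equals the given one."""
    kernel = kernel_gcc_version(proc_version_text)
    return kernel is not None and kernel == dumpversion_output.strip()


def check_running_system(cc="gcc"):
    """Compare the running kernel's compiler with `cc` (not called here)."""
    with open("/proc/version") as handle:
        proc_version_text = handle.read()
    result = subprocess.run(
        [cc, "-dumpversion"], capture_output=True, text=True, check=True
    )
    return same_compiler(proc_version_text, result.stdout)
```

Calling `check_running_system()` on the target machine returns False when the versions differ or when the kernel's compiler string cannot be found; a False result only signals that a closer look is needed, not that the module is necessarily broken.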

I enabled TSO again. I will write again if TSO causes problems.
Why shouldn't it work with 100 Mbps? Not that it would help a lot,
but I ask this on principle.

/* disable TSO for pcie and 10/100 speeds, to avoid
* some hardware issues */

Issues on the motherboard or the NIC?

2007-01-10 02:39:51.889908500 <6>ACPI: PCI interrupt for device 0000:00:19.0 disabled
2007-01-10 02:39:54.545194500 <6>Intel(R) PRO/1000 Network Driver - version 7.3.20-NAPI
2007-01-10 02:39:54.545198500 <6>Copyright (c) 1999-2006 Intel Corporation.
2007-01-10 02:39:54.545395500 <6>ACPI: PCI Interrupt 0000:00:19.0[A] -> GSI 20 (level, low) -> IRQ 22
2007-01-10 02:39:54.545435500 <7>PCI: Setting latency timer of device 0000:00:19.0 to 64
2007-01-10 02:39:54.562905500 <6>e1000: 0000:00:19.0: e1000_probe: (PCI Express:2.5Gb/s:Width x1) 00:19:d1:00:5f:01
2007-01-10 02:39:54.638093500 <6>e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
2007-01-10 02:40:07.513619500 <6>ADDRCONF(NETDEV_UP): eth0: link is not ready
2007-01-10 02:40:07.614768500 <6>e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex, Flow Control: None
2007-01-10 02:40:07.614770500 <6>e1000: eth0: e1000_watchdog: 10/100 speed: disabling TSO
2007-01-10 02:40:07.614771500 <6>ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
2007-01-10 02:40:09.271631500 <3>e1000: eth0: e1000_reset: Hardware Error
2007-01-10 02:40:10.930000500 <6>e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex, Flow Control: None
2007-01-10 02:40:10.930049500 <6>e1000: eth0: e1000_watchdog: 10/100 speed: disabling TSO

PS. please do not delete Mail-Followup-To header field.


To: <7atbggg02@...>
Cc: linux-kernel Mailing List <linux-kernel@...>, Jeb Cramer <cramerj@...>, John Ronciak <john.ronciak@...>, Jesse Brandeburg <jesse.brandeburg@...>, Jeff Kirsher <jeffrey.t.kirsher@...>
Date: Tuesday, January 9, 2007 - 9:48 pm

There are known problems with that configuration; that's why the newer drivers disable
TSO for 10/100 speeds.
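
On a driver version that does not do this automatically, TSO can also be switched off by hand with ethtool. Below is a minimal sketch that builds the `ethtool -K <iface> tso on|off` command and optionally runs it. The helper name `set_tso` and its `run` parameter are made up for illustration, and running the command needs root privileges and an installed ethtool.

```python
import subprocess


def set_tso(iface, enable, run=False):
    """Build (and optionally execute) the ethtool command that toggles TSO."""
    argv = ["ethtool", "-K", iface, "tso", "on" if enable else "off"]
    if run:
        # Raises CalledProcessError if ethtool reports a failure.
        subprocess.run(argv, check=True)
    return argv


if __name__ == "__main__":
    # Print the command instead of running it; pass run=True to apply it.
    print(" ".join(set_tso("eth0", enable=False)))
```

The current offload settings can be inspected with `ethtool -k eth0`; the exact wording of its output differs between ethtool versions.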

Do you really think that you can see the performance gain from using TSO at those speeds?

We (the e1000 team) don't write drivers for the motherboard, only for the NIC.

I hit "reply-all" and I have no control over which field thunderbird removes or adds. I
have to manually add your e-mail address too? Maybe your mail client is broken instead?
Don't you want to receive replies?

Cheers,

Auke

To: linux-kernel Mailing List <linux-kernel@...>
Cc: Jeb Cramer <cramerj@...>, John Ronciak <john.ronciak@...>, Jesse Brandeburg <jesse.brandeburg@...>, Jeff Kirsher <jeffrey.t.kirsher@...>
Date: Tuesday, January 9, 2007 - 10:12 pm

loop-AES has some hacks which figure out the correct CC.

No.

I was thinking that if TSO does not work at 100, why would 1000 be
any better? But I can't test at 1000 speeds right now.

But if you say the driver is supposed to hang at 100 speed,
I believe you.

Ohh... that was fast.

2007-01-10 04:07:42.303056500 <3>e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
2007-01-10 04:07:42.303081500 <4> Tx Queue <0>
2007-01-10 04:07:42.303082500 <4> TDH <48>
2007-01-10 04:07:42.303083500 <4> TDT <fa>
2007-01-10 04:07:42.303084500 <4> next_to_use <fa>
2007-01-10 04:07:42.303085500 <4> next_to_clean <48>
2007-01-10 04:07:42.303086500 <4>buffer_info[next_to_clean]
2007-01-10 04:07:42.303087500 <4> time_stamp <9e332d8>
2007-01-10 04:07:42.303088500 <4> next_to_watch <49>
2007-01-10 04:07:42.303088500 <4> jiffies <9e336df>
2007-01-10 04:07:42.303094500 <4> next_to_watch.status <0>
2007-01-10 04:07:43.302826500 <3>e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
2007-01-10 04:07:43.302850500 <4> Tx Queue <0>
2007-01-10 04:07:43.302851500 <4> TDH <48>
2007-01-10 04:07:43.302852500 <4> TDT <34>
2007-01-10 04:07:43.302853500 <4> next_to_use <34>
2007-01-10 04:07:43.302854500 <4> next_to_clean <48>
2007-01-10 04:07:43.302855500 <4>buffer_info[next_to_clean]
2007-01-10 04:07:43.302855500 <4> time_stamp <9e332d8>
2007-01-10 04:07:43.302856500 <4> next_to_watch <49>
2007-01-10 04:07:43.302857500 <4> jiffies <9e33ac7>
2007-01-10 04:07:43.302862500 <4> next_to_watch.status <0>

Yes.

I am subscribed to the mailing list.
That's why my email was n...
