Hi,

I have a new notebook, a Thinkpad T60, which freezes at random intervals (anywhere from 30 minutes to two days) as long as I am using the on-board wired ethernet interface, which is an e1000e, [8086:109a]. As long as I keep using the WLAN, the system runs for weeks despite frequent suspend/resume cycles etc. The crashes really do seem to be tied to using the wired ethernet. This is a hard freeze, with nothing happening on the system; only a long push on the power button helps.

Additionally, sometimes, probably after suspend/resume, the wired ethernet does not come up properly again: ip addr claims "NO CARRIER" even though the LEDs on the interface and on the switch indicate that there is a link. No packets are received by the interface when it is in this state.

Both issues appear with 2.6.34 and 2.6.34.1. I didn't try to reproduce either issue with an older kernel; 2.6.34 was already out when I started using the T60.

To rule out defective hardware, I have tried with a second T60, with the same results.

Full dmesg and lspci -nn attached, please say if you need more.

Greetings
Marc

P.S.: Please Cc: me on replies, both linux-kernel and netdev are too big for me to follow in a timely manner. I am subscribed to both lists, but a Cc helps in getting a faster reply.

-- 
-----------------------------------------------------------------------------
Marc Haber         | "I don't trust Computers. They | Mailadresse im Header
Mannheim, Germany  |  lose things."    Winona Ryder | Fon: *49 621 72739834
Nordisch by Nature |  How to make an American Quilt | Fax: *49 3221 2323190
Adding e1000-devel (the Intel LAN developers list). Please supply the full dmesg you meant to attach with the original report, as well as the output of lspci -vvv. Thanks, Bruce. --
Stupid me.

Greetings
Marc
Please also provide an eeprom dump from the wired LOM via 'ethtool -e ethX'. Thanks, Bruce.--
Offset Values
------ ------
0x0000 00 16 41 aa be 37 30 0b b2 ff 51 00 ff ff ff ff
0x0010 53 00 03 01 6b 02 01 20 aa 17 9a 10 86 80 df 80
0x0020 00 00 00 20 54 7e 00 00 14 00 da 00 04 00 00 27
0x0030 c9 6c 50 31 3e 07 0b 04 8b 29 00 00 00 f0 02 0f
0x0040 08 10 00 00 04 0f ff 7f 01 4d ff ff ff ff ff ff
0x0050 14 00 1d 00 14 00 1d 00 af aa 1e 00 00 00 1d 00
0x0060 00 01 00 40 1f 12 07 40 ff ff ff ff ff ff ff ff
0x0070 ff ff ff ff ff ff ff ff ff ff ff ff ff ff 7a a7

Greetings
Marc
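As a quick sanity check on a dump like the one above, the bytes can be decoded to confirm that they belong to the adapter under discussion. The following Python sketch is only an illustration: the helper names are hypothetical, and the field positions (MAC address in the first six bytes, device ID in 16-bit word 0x0D, vendor ID in word 0x0E, little-endian) are assumptions based on where the known IDs visibly appear in this dump, not taken from a datasheet.

```python
# The eight data rows of the 'ethtool -e' output quoted in this thread.
DUMP = """\
0x0000 00 16 41 aa be 37 30 0b b2 ff 51 00 ff ff ff ff
0x0010 53 00 03 01 6b 02 01 20 aa 17 9a 10 86 80 df 80
0x0020 00 00 00 20 54 7e 00 00 14 00 da 00 04 00 00 27
0x0030 c9 6c 50 31 3e 07 0b 04 8b 29 00 00 00 f0 02 0f
0x0040 08 10 00 00 04 0f ff 7f 01 4d ff ff ff ff ff ff
0x0050 14 00 1d 00 14 00 1d 00 af aa 1e 00 00 00 1d 00
0x0060 00 01 00 40 1f 12 07 40 ff ff ff ff ff ff ff ff
0x0070 ff ff ff ff ff ff ff ff ff ff ff ff ff ff 7a a7
"""


def parse_dump(text: str) -> bytes:
    """Turn 'offset b0 b1 ...' rows into a flat byte string."""
    data = bytearray()
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the 'Offset Values' / dashes header rows.
        if not fields or not fields[0].startswith("0x"):
            continue
        data.extend(int(byte, 16) for byte in fields[1:])
    return bytes(data)


def word(data: bytes, index: int) -> int:
    """Read the 16-bit little-endian word at the given word index."""
    return int.from_bytes(data[2 * index:2 * index + 2], "little")


eeprom = parse_dump(DUMP)
mac = ":".join(f"{b:02x}" for b in eeprom[:6])   # assumed: MAC in bytes 0..5
vendor_id = word(eeprom, 0x0E)                   # assumed word offset
device_id = word(eeprom, 0x0D)                   # assumed word offset

print("MAC:", mac)
print(f"PCI ID: {vendor_id:04x}:{device_id:04x}")
```

Under those assumptions the decoded PCI ID agrees with the [8086:109a] given in the original report.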
On Fri, 30 Jul 2010 14:56:14 +0200 That's not very useful. The pcie capabilities are completely missing. --
Again, apologies. The attached lspci -vvv is from 2.6.35, right after the first freeze with this kernel version.

Greetings
Marc

00:00.0 Host bridge: Intel Corporation Mobile 945GM/PM/GMS, 943/940GML and 945GT Express Memory Controller Hub (rev 03)
        Subsystem: Lenovo ThinkPad T60
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
        Latency: 0
        Capabilities: [e0] Vendor Specific Information: Len=09 <?>

00:01.0 PCI bridge: Intel Corporation Mobile 945GM/PM/GMS, 943/940GML and 945GT Express PCI Express Root Port (rev 03) (prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
        I/O behind bridge: 00002000-00002fff
        Memory behind bridge: ee100000-ee1fffff
        Prefetchable memory behind bridge: 00000000d8000000-00000000dfffffff
        Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
        BridgeCtl: Parity- SERR- NoISA+ VGA+ MAbort- >Reset- FastB2B-
                PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
        Capabilities: [88] Subsystem: Lenovo Device 2014
        Capabilities: [80] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
                Address: fee0300c Data: 4161
        Capabilities: [a0] Express (v1) Root Port (Slot+), MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                        ExtTag- RBE- FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 128 bytes, ...
When the crashes occur - is there a trace on the screen?

Do you know of a way to reproduce the issue? For example were you downloading files, browsing the internet, or using the ethernet device

Doesn't seem that there were any attachments to this email.

Thanks,
Emil
Hi,

Not that I can see any. I'm working in X11, and the screen simply freezes. Mouse doesn't move any more, clock stops. System is not pingable, does not react to Ctrl-Alt-Bksp nor to any Magic SysRq, nor does it shut down when I have the power button issue an ACPI shutdown.

Sadly, no. I have it happen when typing in a remote ssh session, while on the other hand hundreds of megabytes can be downloaded just fine. It just happens "at random", two or three times a day. I have resorted to using the WLAN instead of the wired ethernet, but that's not a

Attached as well.

Greetings
Marc
I have a T60 running with 2.6.34.1 using your config and so far no issues. Looking at your lspci output - your system has slightly different HW, but I don't know if this is significant.

Are you loading the kernel with any parameters (cat /proc/cmdline)?

Do you have a firewall configured (iptables -L)?

Thanks,
Emil
BOOT_IMAGE=/vmlinuz-2.6.35-zgws1 root=/dev/mapper/root ro

I am working pretty intensively with virtual machines which are natted here and there. I have a handful of MASQUERADE rules in the

Tried, no improvement.

Greetings
Marc
[adding e1000-devel, the Intel wired ethernet developers mailing list]

We have had other recent reports of issues with this part that are due to ASPM L1 being enabled. Would you please try disabling L1 after the driver is loaded as follows (assuming your adapter is still at PCI bus/device/function 02:00.0 as indicated in the lspci output you provided earlier):

1) First check the hexadecimal value of the LnkCtl register -
   # setpci -s 2:0.0 0xf0
2) Disable ASPM (both L0s and L1) by zeroing out bits 0 and 1 in the value returned by the previous step. For example, if it returned 42 (hex 42, that is) -
   # setpci -s 2:0.0 0xf0=0x40
3) Confirm ASPM is disabled by checking the output from lspci again.

Please let us know if this helps your situation, thanks.

Bruce.
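The bit arithmetic in step 2 can be sketched as follows. This is only an illustration in Python with hypothetical helper names; it assumes that the byte read from 0xf0 is the low byte of the PCI Express Link Control register, where bits 1:0 encode the ASPM setting (00 disabled, 01 L0s, 10 L1, 11 both).

```python
ASPM_MASK = 0x03  # bit 0 = L0s entry enabled, bit 1 = L1 entry enabled


def clear_aspm_bits(lnkctl: int) -> int:
    """Return the LnkCtl byte with both ASPM enable bits zeroed."""
    return lnkctl & ~ASPM_MASK & 0xFF


def aspm_state(lnkctl: int) -> str:
    """Describe the ASPM setting encoded in bits 1:0 of the LnkCtl byte."""
    names = {0b00: "disabled", 0b01: "L0s", 0b10: "L1", 0b11: "L0s and L1"}
    return names[lnkctl & ASPM_MASK]


current = 0x42  # the example value from the instructions above
new = clear_aspm_bits(current)
print(f"{current:#04x} ({aspm_state(current)}) -> {new:#04x} ({aspm_state(new)})")
```

All other bits of the byte are left untouched, which is why the example value 42 becomes 40 rather than 00.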
Hi,

$ sudo setpci --version
setpci version 3.1.7
$ sudo setpci -s 2:0.0 0xf0
setpci: Missing width.
Try `setpci --help' for more information.
$

Looking at --help doesn't help me, sorry.

Greetings
Marc
Hmm, that's a newer version than I am familiar with. Apparently, more recent versions of the tool require a width to be specified for unnamed registers and/or registers for which the width is unknown. That being the case, append the width specifier .B (one byte) to the register address. For example:

# setpci -s 2:0.0 0xf0.B

HTH,
Bruce.
Hi Bruce,

It returned 42, and after setting 0x40, LnkCtl now says "ASPM Disabled". I'll dock the system now and will report after the weekend or after the first crash.

Greetings
Marc
Didn't work, freeze within the first 60 minutes after starting serious work. The system did sit idle for the night though without freezing.

Greetings
Marc
Hi,
As of 2.6.35.3, this issue has become worse. It now even happens
directly after a fresh boot that I cannot get a link on the wired
ethernet.
What information do you need to find out what's going on there?
Please Cc: me on replies
Greetings
Marc
$ sudo ethtool eth0
Settings for eth0:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Speed: Unknown!
        Duplex: Unknown! (255)
        Port: Twisted Pair
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: on
        MDI-X: Unknown
        Supports Wake-on: pumbag
        Wake-on: g
        Current message level: 0x00000001 (1)
        Link detected: no
$
[ 0.000000] #38 [0002810840 - 0002810850] BOOTMEM
[ 0.000000] #39 [0002810880 - 00028108f3] BOOTMEM
[ 0.000000] #40 [0002810900 - 0002810973] BOOTMEM
[ 0.000000] #41 [0002c00000 - 0002c10000] BOOTMEM
[ 0.000000] #42 [0002e00000 - 0002e10000] BOOTMEM
[ 0.000000] #43 [0002812980 - 0002812984] BOOTMEM
[ 0.000000] #44 [00028129c0 - 00028129c4] BOOTMEM
[ 0.000000] #45 [0002812a00 - 0002812a08] BOOTMEM
[ 0.000000] #46 [0002812a40 - 0002812a48] BOOTMEM
[ 0.000000] #47 [0002812a80 - 0002812b28] BOOTMEM
[ 0.000000] #48 [0002812b40 - 0002812ba8] BOOTMEM
[ 0.000000] #49 [0002812bc0 - 0002816bc0] BOOTMEM
[ 0.000000] #50 [0002816bc0 - 0002896bc0] BOOTMEM
[ 0.000000] #51 [0002896bc0 - 00028d6bc0] BOOTMEM
[ ...

Attached, for an interface in the working state. I'll deliver another

Will do in the next few days. I guess it would be better to file two bugs, one for the crashes and one for the no-link-in-some-situations issue, right? Or is it likely that they both have the same cause?

Greetings
Marc
Here we go, with a non-working interface.

Greetings
Marc
After reviewing the output from dmidecode I determined that your model T60 is slightly different than mine. It appears that you have the widescreen version. Is that correct?

Also you seem to be running a fairly old version of the BIOS (1.08). The latest is 1.18:
http://www-307.ibm.com/pc/support/site.wss/MIGR-67020.html

I would recommend that you upgrade your BIOS. If that does not help we can continue with the investigation. I will also try to locate a widescreen T60 that would hopefully help me with the repro.

Thanks,
Emil
Hi,

The dmidecode output is from the widescreen model, yes, but I also have two "normal" T60s with the non-wide 15" display (1400x1050 pixels). The freezes happen on all three.

The one I have at hand is running BIOS 2.26 dated 2010-04-01. I will also try updating the widescreen unit which is - not surprisingly - the one I use the

Thanks for that pointer, I am having difficulties in navigating the

I can give you ssh access to mine if you want to. Do you have IPv6

Please note that the freezes usually happen when the network is only lightly loaded, for example when I'm typing into an ssh window with nothing else happening on the box. When I do things that are rather traffic intensive such as a backup, the box is fine.

The "no link" issue appears most frequently on a system that has been running for some time and on which suspend-to-ram was used. I am traveling a lot, and every change of train or bus involves a suspend-resume cycle.

Greetings
Marc
Hi,

No, IPv6 is just some kind of luxury. The majority of work is done via IPv4. It is just easier to make the box accessible from the outside

I will try doing so. At the moment, I am pretty much traveling five days a week and do not have much opportunity to use the wired Ethernet, but I haven't seen the crashes recently. So it might as well

I will do that as soon as I see the issue again.

Greetings
Marc
If that helps -- there's a serial port option available for the T60's UltraBay too that works as ttyS0. Maciej --
Hi,

I have indeed recently bought a docking station. How do I obtain a trace from a frozen system?

Greetings
Marc
