e1000e NVM corruption issue status

To: LKML <linux-kernel@...>
Cc: Jiri Kosina <jkosina@...>, <agospoda@...>, Ronciak, John <john.ronciak@...>, Allan, Bruce W <bruce.w.allan@...>, Graham, David <david.graham@...>, <kkiel@...>, <jesse.brandeburg@...>, <tglx@...>, <chris.jones@...>, <arjan@...>
Date: Thursday, September 25, 2008 - 9:50 pm

A quick summary of the issue follows.  If you think you have more data,
please reply.  If you have had this issue, please reply with the output of
"cat /proc/iomem" and "lspci".  It will help us correlate data.

Problem: some users report that with many of the latest beta distros,
when e1000e loads during a reboot it reports "NVM checksum is not valid"
and the driver fails to load.

Result: At this point it appears that most users can load the e1000e
driver if they skip the error exit on NVM validation failure.  LAN traffic
may or may not work after that.  Some users report that they can dump their
eeprom using ethtool -e and see varying data; most report that the eeprom
read returns all ff ff ff.
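
For anyone who wants to inspect a saved dump, here is a minimal Python
sketch of the two checks described above: whether the eeprom reads back as
all ff, and whether the checksum would pass.  It rests on assumptions that
are not stated in this mail: that the dump is in the usual "ethtool -e" text
layout (an offset column followed by hex bytes), that words are stored
little-endian, and that the driver's generic check sums the first 0x40
16-bit words and expects 0xBABA.  The function names are made up for
illustration.

```python
import re

# Assumed from e1000e's generic NVM check: words 0x00..0x3F must sum to 0xBABA.
NVM_CHECKSUM_WORDS = 0x40
NVM_SUM = 0xBABA

# Matches data rows such as "0x0010:  ff ff ff ..." and skips header lines.
_ROW = re.compile(r"\s*0x[0-9a-fA-F]+:?\s+((?:[0-9a-fA-F]{2}\s*)+)$")


def parse_ethtool_dump(text):
    """Return the eeprom contents as bytes from 'ethtool -e' style text."""
    data = bytearray()
    for line in text.splitlines():
        m = _ROW.match(line)
        if m:
            data.extend(int(b, 16) for b in m.group(1).split())
    return bytes(data)


def is_blank(data):
    """True if every byte is 0xff, the symptom most reporters see."""
    return len(data) > 0 and all(b == 0xFF for b in data)


def checksum_ok(data):
    """True if the first 0x40 little-endian words sum to 0xBABA (mod 2^16)."""
    if len(data) < 2 * NVM_CHECKSUM_WORDS:
        raise ValueError("dump is too short to hold the checksummed area")
    total = 0
    for i in range(NVM_CHECKSUM_WORDS):
        total += int.from_bytes(data[2 * i:2 * i + 2], "little")
    return (total & 0xFFFF) == NVM_SUM
```

Typical use would be to read the saved text file, pass its contents to
parse_ethtool_dump(), and then call is_blank() and checksum_ok() on the
result.  This only inspects a dump; it does not touch the hardware.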

NOTE: if you have not had this problem but wish to continue using e1000e,
I strongly suggest you save a copy of your eeprom now with
"ethtool -e eth0 > savemyeep.txt"

Many of the reports seem to be related in time to a graphics crash, but no
one has been able to give us more detail about how to reproduce it.  We NEED
HELP reproducing this.  Steps, hints, anything.  We are trying rebooting
and suspending with opensuse, fedora, and ubuntu on several hardware
platforms.

This seems to affect both 32 and 64 bit kernels, but we haven't heard much
either way.

hardware affected:
laptops and desktops with 82566 or 82567 based LAN parts, which are 
machines with the ICH8 and ICH9 chipsets and a variety of processors.
The machines I know of that have reported the issue include:
Lenovo X300
HP 2510p
Intel DP35JO
Lenovo T61 (possibly)
Lenovo X61 (possibly)
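
To check whether a machine has one of the LAN parts named above, a small
Python sketch like the following can filter "lspci" output.  It assumes the
default lspci text format, where the device line contains "Ethernet
controller" and the part name (for example 82566MM); the function name and
the sample line in the comment are illustrative only.

```python
import re

# Part numbers from the list above, allowing a suffix such as "MM" or "LM".
_AFFECTED = re.compile(r"\b8256[67]\w*")


def find_affected_lan(lspci_text):
    """Return lspci lines that describe an 82566 or 82567 based LAN part."""
    hits = []
    for line in lspci_text.splitlines():
        # Illustrative line shape:
        # "00:19.0 Ethernet controller: Intel Corporation 82566MM Gigabit ..."
        if "Ethernet controller" in line and _AFFECTED.search(line):
            hits.append(line.strip())
    return hits
```

The text to pass in is whatever "lspci" prints on the machine in question.
An empty result only means no matching name was found in that text; it says
nothing about whether the eeprom is intact.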

Next steps:
We are still trying to reproduce the issue locally; we should have a
machine here tomorrow that reportedly had the issue with ubuntu.

We have a series of kernel patches that I will reply to this mail with 
that may help users willing to test.

We should have ready (hopefully tomorrow) an app that should be able to 
restore eeproms as long as the driver can still load.

We also have a band-aid patch that should allow "locking" of the NVM area
to prevent an errant write; we are looking to post that tomorrow.  This
should prevent the damage but will not really find the culprit.

Jesse
--
