[PATCH] x86: mce: Xeon75xx specific interface to get corrected memory error information v2

From: Andi Kleen
Date: Tuesday, March 23, 2010 - 10:40 pm

x86: mce: Xeon75xx specific interface to get corrected memory error information v2

[This version addresses the previous comments. It does not change
any interface to the outside and does not attempt to encode DIMMs
or anything like that, but only passes out the physical address of
a corrected error in the standard ADDR register field.
So from the outside it looks exactly as if the CPU supported this
natively, with no other special interfaces.

I hope this addresses previous concerns. I guess the DIMM error reporting
can be revisited once there's a new reporting interface. There are still
some traces of DIMM parsing in there, but it's only used for debug
purposes now.]

---

Xeon 75xx doesn't log physical addresses on corrected machine check
events in the standard architectural MSRs. Instead the address has to
be retrieved in a model specific way. This makes it impossible
to do predictive failure analysis.

Implement CPU model specific code to do this in mce-xeon75xx.c using a new hook
that is called from the generic poll code. The code retrieves
the physical address of the last corrected error from the platform
and makes the address look like a standard architectural MCA address for
further processing.
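
As an illustration only (this is not the code from the patch), the hook
arrangement described above might be sketched in userspace C as follows.
The names struct mce_sketch, model_poll_hook, fake_platform_last_pa(),
sketch_fill_address() and poll_one_bank() are hypothetical, and the
platform query is replaced by a fixed value because the real interface
is not public. Only the MCi_STATUS bit positions are architectural.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural IA32_MCi_STATUS bits. */
#define MCI_STATUS_VAL   (1ULL << 63)  /* register holds a valid error */
#define MCI_STATUS_UC    (1ULL << 61)  /* error was not corrected */
#define MCI_STATUS_ADDRV (1ULL << 58)  /* IA32_MCi_ADDR is valid */

/* Hypothetical, cut-down stand-in for the kernel's struct mce. */
struct mce_sketch {
	uint64_t status;
	uint64_t addr;
};

/* Hypothetical hook; stays NULL unless a model specific module sets it. */
static void (*model_poll_hook)(struct mce_sketch *m);

/* Placeholder for the undocumented platform query; returns a fixed value. */
static uint64_t fake_platform_last_pa(void)
{
	return 0x12345000ULL;
}

/* Model specific side: add an address to corrected errors that lack one. */
static void sketch_fill_address(struct mce_sketch *m)
{
	if (m->status & (MCI_STATUS_UC | MCI_STATUS_ADDRV))
		return;
	m->addr = fake_platform_last_pa();
	m->status |= MCI_STATUS_ADDRV;
}

/* Generic side: returns 1 if an event was seen, 0 if the bank was empty. */
static int poll_one_bank(struct mce_sketch *m)
{
	assert(m != NULL);
	if (!(m->status & MCI_STATUS_VAL))
		return 0;
	if (model_poll_hook)
		model_poll_hook(m);
	return 1;
}
```

With model_poll_hook left NULL the generic path behaves as before, which
is the property that lets the model specific part live in a module.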

There's no code to print this information on a panic because this only
works for corrected errors, and corrected errors do not usually result in 
panics.

The act of retrieving the PA information can take some time, so this
code has a rate limit to avoid taking too much CPU time on an error flood.
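
One possible shape for such a rate limit, shown purely as a sketch and
not taken from the patch, is a fixed-window budget: at most `budget`
lookups per window, with further errors in the same window logged
without an address. The struct lookup_limit type and lookup_allowed()
are hypothetical names, and the clock is passed in as a plain tick count
so that the logic can be exercised in userspace.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fixed-window limiter state. */
struct lookup_limit {
	uint64_t window_start;  /* tick at which the current window began */
	uint64_t window_len;    /* window length in ticks */
	unsigned budget;        /* lookups allowed per window */
	unsigned used;          /* lookups already spent in this window */
};

/* Returns true if an expensive address lookup may be done at time `now`. */
static bool lookup_allowed(struct lookup_limit *l, uint64_t now)
{
	/* Start a fresh window once the old one has expired. */
	if (now - l->window_start >= l->window_len) {
		l->window_start = now;
		l->used = 0;
	}
	if (l->used >= l->budget)
		return false;
	l->used++;
	return true;
}
```

A caller would skip the address retrieval whenever lookup_allowed()
returns false, so a flood costs only the cheap check.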

The whole thing can be loaded as a module and has suitable
PCI-IDs so that it can be auto-loaded by a distribution.
The code also checks explicitly for the expected CPU model
number to make sure this code doesn't run anywhere else.
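
A model check of this kind can be sketched as a decode of the CPUID
leaf 1 EAX signature. The snippet below is an illustration, not the
patch's code: the function names are hypothetical, and the expected
values (family 6, model 0x2e) are an assumption about the Xeon 75xx
signature rather than something stated in this thread. A real check
would also verify the vendor.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXPECTED_FAMILY 6
#define EXPECTED_MODEL  0x2e  /* assumption, see the text above */

/* Display family: the extended family is added only when base is 0xf. */
static unsigned sig_family(uint32_t sig)
{
	unsigned family = (sig >> 8) & 0xf;

	if (family == 0xf)
		family += (sig >> 20) & 0xff;
	return family;
}

/* Display model: extended model bits apply for base family 6 or 0xf. */
static unsigned sig_model(uint32_t sig)
{
	unsigned base_family = (sig >> 8) & 0xf;
	unsigned model = (sig >> 4) & 0xf;

	if (base_family == 0x6 || base_family == 0xf)
		model |= ((sig >> 16) & 0xf) << 4;
	return model;
}

/* True only for the one family/model pair the driver is written for. */
static bool is_expected_cpu(uint32_t sig)
{
	return sig_family(sig) == EXPECTED_FAMILY &&
	       sig_model(sig) == EXPECTED_MODEL;
}
```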

Signed-off-by: Andi Kleen <ak@linux.intel.com>

---
 arch/x86/Kconfig                          |    8 
 arch/x86/kernel/cpu/mcheck/Makefile       |    1 
 arch/x86/kernel/cpu/mcheck/mce-internal.h |    1 
 ...
From: Andi Kleen
Date: Monday, March 29, 2010 - 12:47 am

Ping?  Please review this patch. Thanks.

-- 
ak@linux.intel.com -- Speaking for myself only.
--

From: Hidetoshi Seto
Date: Monday, March 29, 2010 - 1:29 am

Could you point to a proper specification or datasheet so that we can
check what you are going to do here?


Thanks,
H.Seto

--

From: Andi Kleen
Date: Monday, March 29, 2010 - 2:01 am

You mean how the model specific interface works?

There's currently no public specification for the interface,
but it should be reasonably clear from reading the driver how
it works.

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.
--

From: Hidetoshi Seto
Date: Monday, March 29, 2010 - 3:46 am

It looks overengineered...

I have some questions: Is it impossible to get the address
after the polling handler has finished processing?  e.g. Is it possible to
implement this module as an mcelog add-on that is hooked & invoked
immediately after reading /dev/mcelog?  I guess there are
some limitations/restrictions on calling pfa_command().

Is there any alternative way to get the address?
Wouldn't polling like edac_i7 help here?

You pointed out "This makes it impossible to do predictive failure
analysis", but I guess we could do a rough-but-sufficient analysis that
only requires coarse resolution, such as sockets.  Or should we not expect
that one of the DIMMs connected to the socket is broken if the socket
reports corrected memory errors many times?


Thanks,
H.Seto



--

From: Andi Kleen
Date: Monday, March 29, 2010 - 3:55 am

You need to be in ring 0. In theory you could do it later, but then



The main predictive failure analysis interesting here is bad page
offlining and for that you need an address.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.
--
