Re: Hardware Error Kernel Mini-Summit

From: Borislav Petkov
Date: Tuesday, May 18, 2010 - 12:18 pm

From: "Luck, Tony" <tony.luck@intel.com>
Date: Tue, May 18, 2010 at 03:08:58PM -0400


Yep, that's the idea.


Well, we have a trace_mce_record tracepoint in the mcheck code which
calls all the necessary callbacks when an mcheck occurs. For the time
being, the idea is to use the mce.c ring buffer for early mchecks and
copy them to the regular ftrace per-cpu buffer after the latter has been
initialized. Later, we could switch to another early bootmem buffer if
there's a need to.
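
To make the staging idea concrete, here is a rough userspace sketch, not
the actual mce.c or ftrace code. All names in it (err_rec, log_error,
flush_early, the buffer sizes) are made up for illustration: records
land in a small static buffer until the per-cpu buffers are usable, and
are then copied over in one pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EARLY_SLOTS   8   /* hypothetical size of the early staging buffer */
#define NR_CPUS_DEMO  4   /* hypothetical CPU count for this sketch */
#define PERCPU_SLOTS  16  /* hypothetical per-cpu buffer capacity */

/* Hypothetical, trimmed-down error record; not the kernel's struct mce. */
struct err_rec {
	unsigned int cpu;
	unsigned int bank;
	uint64_t status;
};

struct percpu_buf {
	struct err_rec rec[PERCPU_SLOTS];
	size_t count;
};

static struct err_rec early_buf[EARLY_SLOTS];
static size_t early_count;
static size_t early_dropped;	/* records lost because the early buffer was full */
static struct percpu_buf percpu[NR_CPUS_DEMO];
static bool percpu_ready;

/* Append a record to the buffer of the CPU it was logged on. */
static bool percpu_push(const struct err_rec *r)
{
	struct percpu_buf *b;

	if (r->cpu >= NR_CPUS_DEMO)
		return false;
	b = &percpu[r->cpu];
	if (b->count == PERCPU_SLOTS)
		return false;
	b->rec[b->count++] = *r;
	return true;
}

/* Log a record: early buffer before init, per-cpu buffer afterwards. */
static bool log_error(const struct err_rec *r)
{
	if (percpu_ready)
		return percpu_push(r);
	if (early_count == EARLY_SLOTS) {
		early_dropped++;
		return false;
	}
	early_buf[early_count++] = *r;
	return true;
}

/* Call once when the per-cpu buffers are usable; returns records copied. */
static size_t flush_early(void)
{
	size_t i, copied = 0;

	assert(!percpu_ready);	/* the flush must happen exactly once */
	percpu_ready = true;
	for (i = 0; i < early_count; i++)
		if (percpu_push(&early_buf[i]))
			copied++;
	early_count = 0;
	return copied;
}
```

The sketch ignores locking and NMI-safety entirely, which the real
mcheck path obviously cannot.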

Also, we want to have a userspace daemon that reads out the MCEs from
the trace buffer and does further processing like thresholding etc. in
userspace.
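
As an illustration of what thresholding in such a daemon might look
like, here is a minimal fixed-window counter. This is only a sketch
under my own assumptions: the names (threshold_cfg, source_state,
threshold_event) are hypothetical, timestamps are assumed monotonic,
and reading the records out of the trace buffer is not shown.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical policy: at most 'limit' events per 'window_ns' nanoseconds. */
struct threshold_cfg {
	unsigned int limit;
	uint64_t window_ns;
};

/* Per-source state, e.g. one instance per (cpu, bank) or per DIMM. */
struct source_state {
	uint64_t window_start;
	unsigned int count;
};

/*
 * Account one event seen at time 'now' (monotonic, in ns).
 * Returns true when the count in the current window exceeds the limit.
 */
static bool threshold_event(struct source_state *s,
			    const struct threshold_cfg *cfg, uint64_t now)
{
	/* Start a new window on the first event or once the old one expired. */
	if (s->count == 0 || now - s->window_start >= cfg->window_ns) {
		s->window_start = now;
		s->count = 0;
	}
	s->count++;
	return s->count > cfg->limit;
}
```

With limit = 2 and window_ns = 1000, events at t = 0, 100 and 200 trip
the threshold on the third event, and an event at t = 1500 starts a
fresh window.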

Concerning critical errors, there we bypass the perf subsystem and
execute the smallest amount of code possible while trying to shut down
gracefully if the error type allows that.

These are the rough ideas at least...

-- 
Regards/Gruss,
Boris.

Operating Systems Research Center
Advanced Micro Devices, Inc.