From: "Luck, Tony" <tony.luck@intel.com>
Date: Tue, May 18, 2010 at 03:08:58PM -0400
Yep, that's the idea.
Well, we have a trace_mce_record tracepoint in the mcheck code which
calls all the necessary callbacks when an mcheck occurs. For the time
being, the idea is to use the mce.c ring buffer for early mchecks and
copy them to the regular ftrace per-cpu buffer after the latter has been
initialized. Later, we could switch to another early bootmem buffer if
there's a need.
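
To make the early-buffer handoff concrete, here is a minimal userspace
sketch of the idea, not kernel code. Everything in it is an assumption
made for illustration: struct mce_rec is a simplified stand-in rather
than the kernel's struct mce, the names log_mce() and
trace_buffers_init() and all the sizes are hypothetical, and the early
buffer is a plain bounded array that drops records when full instead of
a real ring buffer.

```c
#include <stddef.h>
#include <stdint.h>

#define EARLY_LOG_LEN 32   /* hypothetical size of the early buffer */
#define NR_CPUS       4    /* hypothetical CPU count */
#define PERCPU_LEN    64   /* hypothetical per-CPU buffer size */

/* Simplified stand-in for an MCE record; NOT the kernel's struct mce. */
struct mce_rec {
    uint64_t status;
    uint64_t addr;
    unsigned int cpu;
    unsigned int bank;
};

/* Early buffer, usable before the per-CPU trace buffers exist. */
static struct mce_rec early_log[EARLY_LOG_LEN];
static size_t early_count;

/* Stand-in for the regular per-CPU trace buffers. */
struct percpu_buf {
    struct mce_rec rec[PERCPU_LEN];
    size_t count;
};
static struct percpu_buf percpu[NR_CPUS];
static int trace_ready;

/* Append a record to the buffer of the CPU that reported it. */
static int percpu_push(const struct mce_rec *m)
{
    struct percpu_buf *b;

    if (m->cpu >= NR_CPUS)
        return -1;
    b = &percpu[m->cpu];
    if (b->count >= PERCPU_LEN)
        return -1;
    b->rec[b->count++] = *m;
    return 0;
}

/* Log one record: early buffer before init, per-CPU buffer afterwards. */
int log_mce(const struct mce_rec *m)
{
    if (trace_ready)
        return percpu_push(m);
    if (early_count >= EARLY_LOG_LEN)
        return -1;              /* early buffer full, record dropped */
    early_log[early_count++] = *m;
    return 0;
}

/*
 * Mark the per-CPU buffers as ready and replay the early records into
 * them in arrival order. Returns the number of records copied.
 */
size_t trace_buffers_init(void)
{
    size_t i, copied = 0;

    trace_ready = 1;
    for (i = 0; i < early_count; i++)
        if (percpu_push(&early_log[i]) == 0)
            copied++;
    early_count = 0;
    return copied;
}
```

Callers only ever use log_mce(); whether a record lands in the early
buffer or in a per-CPU buffer depends solely on whether
trace_buffers_init() has run yet.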
Also, we want a userspace daemon that reads the MCEs out of the trace
buffer and does further processing, such as thresholding.
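
As a rough illustration of what thresholding in such a daemon could look
like, here is a sketch that counts errors per CPU and bank within a time
window. It is only one possible policy, assumed for the example: the
function name account_error(), the limits and the window length are all
hypothetical, and reading the records from the trace buffer is left out
entirely.

```c
#include <stdint.h>

#define MAX_CPUS     4     /* hypothetical limits for the sketch */
#define MAX_BANKS    8
#define THRESHOLD    3     /* hypothetical: errors per window before acting */
#define WINDOW_SECS  60    /* hypothetical window length in seconds */

struct err_counter {
    uint64_t window_start;  /* time of the first error in the window */
    unsigned int count;     /* errors seen in the current window */
};

static struct err_counter counters[MAX_CPUS][MAX_BANKS];

/*
 * Account one error reported by (cpu, bank) at time `now` (seconds).
 * Returns 1 when the threshold is reached within the window, 0 when it
 * is not, and -1 for an out-of-range cpu or bank.
 */
int account_error(unsigned int cpu, unsigned int bank, uint64_t now)
{
    struct err_counter *c;

    if (cpu >= MAX_CPUS || bank >= MAX_BANKS)
        return -1;

    c = &counters[cpu][bank];

    /* Open a new window if none is active or the old one has expired. */
    if (c->count == 0 || now - c->window_start >= WINDOW_SECS) {
        c->window_start = now;
        c->count = 0;
    }

    c->count++;
    if (c->count >= THRESHOLD) {
        c->count = 0;       /* start over after signalling */
        return 1;
    }
    return 0;
}
```

The daemon would call account_error() for each record it reads and take
whatever action it likes on a return value of 1, for example logging a
warning or asking for a page to be offlined.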
For critical errors, we bypass the perf subsystem and execute the
smallest amount of code possible while trying to shut down gracefully,
if the error type allows that.
These are the rough ideas at least...
--
Regards/Gruss,
Boris.
Operating Systems Research Center
Advanced Micro Devices, Inc.