Hi Andi,
I recently took some time to look at the daemon side of mcelog. I found
a few issues with it[0] that I will detail in a separate mail, but I am
more concerned that in recent kernels, the first read() of the mcelog
device performed by mcelog after boot always fails. This is 100%
reproducible on several machines: the mcelog client always bombs out in
process() because the kernel returns a "no such device" (ENODEV) error
for the read.
The /dev/mcelog device exists, but the first read fails.
Can you do a fresh boot, run mcelog, and see what I mean?
Jon.
[0] ignoredev is not documented; cache-error-trigger uses $CPUS_AFFECTED
where it should use $AFFECTED_CPUS; the mcelog-client socket is always
created no matter what the config says (and is not deleted on shutdown).