exactly. The proposed patch brings the NMI notifier arms race to its
next level: it is in essence a "super high priority" notifier that wants
exclusivity. That is unnecessary: we already have sufficient
infrastructure in place to recognize the priority of notifiers, and die
notifiers make active use of it. (see the list below)
If the goal is to log relevant NMI events to NVRAM for later inspection
then the solution is to register a low-priority notifier which will
execute if no other notifier shows interest in an NMI event.
If the goal is to reboot the system if there's a "bad" NMI, then the
solution is to register a low-priority notifier which will execute and
panic the system if no other notifier shows interest in that NMI event.
This means all the 'good' NMI sources need to register at higher
priority and all the standard platform fallback notifiers need to
register at lowest priority. Debuggers go last. If any of the existing
notifiers in the kernel dont follow the rules and cause problems then we
can fix up those.
Here's a list of all current die notifiers we use in the kernel:
prio
----
# core kernel:
kernel/kprobes.c # instruction probing +INT_MAX
kernel/trace/trace.c # dump trace 150
# arch/x86:
arch/x86/mm/kmmio.c # mmiotrace, debug feature 0
arch/x86/kernel/kgdb.c # kgdb, kernel debugger -INT_MAX
arch/x86/kernel/crash.c # kdump 0
arch/x86/oprofile/nmi_int.c # oprofile 0
arch/x86/oprofile/nmi_timer_int.c # oprofile timer fallback 0
# drivers:
drivers/watchdog/hpwdt.c # HP watchdog driver +INT_MAX
drivers/char/ipmi/ipmi_watchdog.c # IPMI watchdog 150
drivers/misc/sgi-xp/xpc_main.c # SGI SN2 0
# upcoming:
arch/x86/mm/kmemcheck/smp.c # in tip/master 0
[ this should change to +INT_MAX ]
One change we could do is to move arch/x86/kernel/crash.c's priority to
-100. That would achieve the following effect: it would execute after
all known sources of NMIs, but before the lowest prio interactive
notifiers such as kgdb.
That way a platform driver could insert at priority -50 and execute
after the known self-generated NMI sources, but before any of the debug
and crash-tool notifiers. And the constants should be made symbolic as
well, so that we encode the purpose of a notifier, not the
implementation - and can map the purpose to priority and shape the
ordering flexibly.
furthermore, as i indicated it in my first reply, if the port 61H read
is causing problems on this platform (and i'd not be surprised if it did
so) we can turn it into a lowest-prio fallback notifier which only
executes if no known sources of NMIs show interest in an event.
Ingo
--