The LBRs are relatively cheap to keep enabled and provide some history
to OOPSen. Also, some CPUs are reported to keep them over soft-reset,
which allows us to use them to debug things like triple faults.

Therefore introduce a boot option, lbr_debug=on, which always enables
the LBRs and prints them on CPU init and in die().
Requested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
arch/x86/include/asm/perf_event.h | 7 ++
arch/x86/kernel/cpu/perf_event_intel.c | 5 -
arch/x86/kernel/cpu/perf_event_intel_lbr.c | 86 +++++++++++++++++++++++++++--
arch/x86/kernel/dumpstack.c | 5 +
4 files changed, 95 insertions(+), 8 deletions(-)
Index: linux-2.6/arch/x86/include/asm/perf_event.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/perf_event.h
+++ linux-2.6/arch/x86/include/asm/perf_event.h
@@ -155,9 +155,14 @@ extern void perf_events_lapic_init(void)
#define perf_instruction_pointer(regs) ((regs)->ip)
+void dump_lbr_state(void);
+void lbr_off(void);
+
#else
static inline void init_hw_perf_events(void) { }
-static inline void perf_events_lapic_init(void) { }
+static inline void perf_events_lapic_init(void) { }
+static inline void dump_lbr_state(void) { }
+static inline void lbr_off(void) { }
#endif
#endif /* _ASM_X86_PERF_EVENT_H */
Index: linux-2.6/arch/x86/kernel/cpu/perf_event_intel.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/cpu/perf_event_intel.c
+++ linux-2.6/arch/x86/kernel/cpu/perf_event_intel.c
@@ -804,10 +804,7 @@ static __initconst const struct x86_pmu
static void intel_pmu_cpu_starting(int cpu)
{
init_debug_store_on_cpu(cpu);
- /*
- * Deal with CPUs that don't clear their LBRs on power-up.
- */
- intel_pmu_lbr_reset();
+ intel_pmu_lbr_starting();
}
static void intel_pmu_cpu_dying(int cpu)
Index: ...

When this is enabled, it will prevent changing the LBR configuration to record only selected branches. Unless you are willing to accept filtered --
Sure, but since we don't support that silly config reg anyway that's pretty much not an issue ;-) --
I will provide a patch to make it available. This is needed for certain measurements. --
Yummie!

Have you got some sample lbr_debug=1 output as well by any chance, with a crash provoked somewhere? How good is the output in practice? (i.e. how many artificial entries do we have at the end of the buffer, filled with crash related addresses?)

Also, i think we should use something more descriptive than lbr_debug=y. Perhaps crash_trace=1 or so?

Plus, it would be nice to have a sysctl entry for this as well - so that production systems can enable this if they want to enrich the output of some difficult-to-analyze kernel crash, without yet another reboot.

Ingo --
lbr_debug=on actually, =1 doesn't parse.

I had some output, but I was still looking at finding out why my %pF formats for CPU0 didn't have

Well, I would like to keep LBR in the name, since that is the mechanism

Right, could do, but once it has crashed it is clearly too late to enable anything ;-) --
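To illustrate the point that lbr_debug=on parses while lbr_debug=1 does not, here is a minimal userspace sketch of an on/off-only option parser. It is not the parser from the patch; the function name parse_on_off() and its return convention are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical sketch: accept only the literal strings "on" and "off"
 * as the value of a boolean option. Returns 0 and stores the result in
 * *enabled on success, or -1 when the value is not recognised (which is
 * what happens to "1" or "y" here).
 */
int parse_on_off(const char *val, bool *enabled)
{
	if (!val || !enabled)
		return -1;

	if (strcmp(val, "on") == 0) {
		*enabled = true;
		return 0;
	}
	if (strcmp(val, "off") == 0) {
		*enabled = false;
		return 0;
	}
	return -1;	/* unknown value: leave *enabled untouched */
}
```

A parser of this shape rejects "1" and "y" outright, so an option written that way would be silently left at its default unless the caller reports the error.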
No. What i mean is that with your patch, a debugging session would go like this:

  < kernel crashes >                                         # reboot #1
  < admin logs in and scratches head >
  < admin consults kernel hackers and enables lbr_debug=1 in /etc/grub.conf >
  < admin reboots >                                          # reboot #2
  < kernel crashes again >                                   # reboot #3

With the sysctl we'd have one reboot less:

  < kernel crashes >                                         # reboot #1
  < admin logs in and scratches head >
  < admin consults kernel hackers and tweaks /proc/sys/kernel/x86/lbr_debug >
  < kernel crashes again >                                   # reboot #2

Thanks,

Ingo --
die() is too late. They will only contain the oops code then.

-Andi
-- ak@linux.intel.com -- Speaking for myself only. --
We do an lbr_off() in oops_begin(), or is there a better/earlier place we can do that? --
I had an old patch in the P4 era (slightly different but larger LBRs) which saved them all early in the exception handlers and then dumped them from the buffer. That's early enough that you only miss one or two.

The problem is that it's somewhat more expensive: the MSR reads are not cheap and they will slow down all your page faults.

I checked, but I can't find the old patch anymore. It could probably be redone.

-Andi
-- ak@linux.intel.com -- Speaking for myself only. --
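The trade-off discussed above (snapshot early in the exception handler versus reading the LBRs in die()) can be sketched with a small userspace simulation. This is not the lost patch and does not touch real MSRs: the ring depth LBR_NR, the struct layout, and all function names are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LBR_NR 16	/* assumed depth; the real depth is model-specific */

struct lbr_entry {
	uint64_t from, to;
};

/* Simulated LBR stack: a ring of from/to pairs plus a top-of-stack index. */
struct lbr_sim {
	struct lbr_entry ring[LBR_NR];
	unsigned int tos;	/* index of the most recently written entry */
	int enabled;
};

/* Record one taken branch, overwriting the oldest entry in the ring. */
void lbr_record(struct lbr_sim *l, uint64_t from, uint64_t to)
{
	if (!l->enabled)
		return;
	l->tos = (l->tos + 1) % LBR_NR;
	l->ring[l->tos].from = from;
	l->ring[l->tos].to = to;
}

/* Copy the whole ring into buf, newest entry first. */
void lbr_snapshot(const struct lbr_sim *l, struct lbr_entry buf[LBR_NR])
{
	for (unsigned int i = 0; i < LBR_NR; i++)
		buf[i] = l->ring[(l->tos + LBR_NR - i) % LBR_NR];
}

/* Count valid entries whose 'from' address lies below limit. */
size_t lbr_count_below(const struct lbr_entry buf[LBR_NR], uint64_t limit)
{
	size_t n = 0;

	for (unsigned int i = 0; i < LBR_NR; i++)
		if (buf[i].from && buf[i].from < limit)
			n++;
	return n;
}
```

If a caller fills the ring with 16 "pre-crash" branches, records two branches for exception entry, and snapshots at that point, 14 pre-crash entries survive in the copy. After a further 20 branches of oops-path code, a second snapshot contains none. The simulation ignores the cost Andi mentions: on real hardware each entry is an MSR read, paid on every exception that takes the early snapshot.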
