This patch implements code to handle ibs interrupts. If ibs data is
available a raw perf_event data sample is created and sent back to the
userland. Currently only the data is stored only in the raw data, but
this could be extended in a later patch by generating generic event
data such as the rip from the ibs sampling data.
Signed-off-by: Robert Richter <robert.richter@amd.com>
---
arch/x86/include/asm/msr-index.h | 3 ++
arch/x86/kernel/cpu/perf_event_amd.c | 65 +++++++++++++++++++++++++++++++++-
2 files changed, 67 insertions(+), 1 deletions(-)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index bc473ac..a7e4aa5 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -113,6 +113,7 @@
#define MSR_AMD64_IBSFETCHCTL 0xc0011030
#define MSR_AMD64_IBSFETCHLINAD 0xc0011031
#define MSR_AMD64_IBSFETCHPHYSAD 0xc0011032
+#define MSR_AMD64_IBSFETCH_SIZE 3
#define MSR_AMD64_IBSOPCTL 0xc0011033
#define MSR_AMD64_IBSOPRIP 0xc0011034
#define MSR_AMD64_IBSOPDATA 0xc0011035
@@ -120,7 +121,9 @@
#define MSR_AMD64_IBSOPDATA3 0xc0011037
#define MSR_AMD64_IBSDCLINAD 0xc0011038
#define MSR_AMD64_IBSDCPHYSAD 0xc0011039
+#define MSR_AMD64_IBSOP_SIZE 7
#define MSR_AMD64_IBSCTL 0xc001103a
+#define MSR_AMD64_IBS_SIZE_MAX MSR_AMD64_IBSOP_SIZE
/* Fam 10h MSRs */
#define MSR_FAM10H_MMIO_CONF_BASE 0xc0010058
diff --git a/arch/x86/kernel/cpu/perf_event_amd.c b/arch/x86/kernel/cpu/perf_event_amd.c
index 3dc327c..78b0b34 100644
--- a/arch/x86/kernel/cpu/perf_event_amd.c
+++ b/arch/x86/kernel/cpu/perf_event_amd.c
@@ -283,6 +283,69 @@ static inline void __amd_pmu_enable_ibs_event(struct hw_perf_event *hwc)
__x86_pmu_enable_event(hwc, IBS_OP_ENABLE);
}
+static int amd_pmu_check_ibs(int idx, unsigned int msr, u64 valid,
+ u64 reenable, int size, struct pt_regs *iregs)
+{
+ struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+ struct perf_event *event = ...Robert, Some comments below. If you have both regular counter intr + IBS you will double-count apic_perf_irqs. I would do: if (handled2 && !handled) inc_irq_stat(). --
Yes, this triggers a warning. Will change it and add 4 padding bytes True, will change this. -Robert -- Advanced Micro Devices, Inc. Operating System Research Center email: robert.richter@amd.com --
The u32 size header in the raw sampling data layout leads to unaligned memory access for 64 bit values. No matter if the padding is inserted at the beginning or end of the raw sample, it introduces either on the sending or receiving side an 32 bit offset. Wouldn't it be better to simply make size an u64 or to add another reserved u32 value to the header. This could introduce some overhead on smaller architectures, but currently this is only used on x86. The event size encoding of the ring_buffer implementation could be another alternative. -Robert -- Advanced Micro Devices, Inc. Operating System Research Center email: robert.richter@amd.com --
During code review I found a bug in perf_output_sample(). A fix is
--
From 6373951f1c660400650066b73c3bb2f6d232be67 Mon Sep 17 00:00:00 2001
From: Robert Richter <robert.richter@amd.com>
Date: Fri, 7 May 2010 15:49:56 +0200
Subject: [PATCH] perf: fix raw sample size if no sampling data is attached
The header size of a raw sample is not included in the total size of a
raw data sample. Thus, if no data is attached the size must be
null. In this case a buffer overflow may occur when copying the
sampling data.
Signed-off-by: Robert Richter <robert.richter@amd.com>
---
kernel/perf_event.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 9dbe8cd..f6ddae9 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -3229,7 +3229,7 @@ void perf_output_sample(struct perf_output_handle *handle,
u32 size;
u32 data;
} raw = {
- .size = sizeof(u32),
+ .size = 0,
.data = 0,
};
perf_output_put(handle, raw);
--
1.7.0.3
--
Advanced Micro Devices, Inc.
Operating System Research Center
email: robert.richter@amd.com
--
Uhhh, this padding data was completly hidden to me by a SEP field. It became visible to me with a far distance of at least 4km and only from the corner of my eye. So, *now* I realized that size is the size of padding data, not the size header! Please ignore this, I shouln't send patches at Friday afternoon. -- Advanced Micro Devices, Inc. Operating System Research Center email: robert.richter@amd.com --
When thinking about this I was wondering if it wouldn't make sense to better fix the alignment and move the data buffer to a 64 bit boundary. So take a look at the enclosed RFC patch. Though it breaks the ABI it would solve some problems. I think more than it introduces. Hopefully I found all effected code locations using it. -Robert -- From 2427dda67b072f27ecff678f8829b9e2fc537988 Mon Sep 17 00:00:00 2001 From: Robert Richter <robert.richter@amd.com> Date: Fri, 7 May 2010 15:32:45 +0200 Subject: [PATCH] perf: align raw sample data on 64-bit boundaries In the current implementation 64 bit raw sample data values are not aligned due to the 32 bit size header. The size header is located before the data buffer on a 64 bit boundary. This leads to unaligned memory access to the data buffer for arrays or structures containing 64 bit values. To avoid this, the patch adds a 32 bit reserved value to the raw sample data header. The data buffer starts then at a 64 bit boundary. This changes the ABI and requires changes in the userland tools. For tools/perf this is at a single location in event.c only. This could also introduce some overhead on smaller architectures, but currently this is only used on x86 or for transferring raw tracepoint data. Though an ABI change should be avoided, it is worth to align raw sample data on 64-bit boundaries as the change fixes unaligned memory access, eases the implementation for raw sample data and reduces the risk of data corruption due to different pad structures inserted by the compiler. Signed-off-by: Robert Richter <robert.richter@amd.com> --- include/linux/perf_event.h | 1 + kernel/perf_event.c | 21 ++++++++++----------- kernel/trace/trace_kprobe.c | 6 ++---- kernel/trace/trace_syscalls.c | 6 ++---- tools/perf/util/event.c | 4 ++-- 5 files changed, 17 insertions(+), 21 deletions(-) diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index 3fd5c82..016969c 100644 --- ...
No this is used on any architectures that event have a minimal support for perf events. I use tracepoint raw samples in sparc 64 for example (which has much more I don't think we should do this. Ok it's true we've screwed up something there but breaking the ABI is going to make the things even worst I think. I would feel better with a new PERF_SAMPLE_RAW_ALIGNED sample_type and schedule the deprecation of PERF_SAMPLE_RAW for later but keep it for some releases. What do you think? --
Isn't here the same alignment problem on archs there unsigned long is 64 bit? Also, most samples I found have a size of multiples of 8 bytes, so even on 32 bit archs there would be a padding of 4 bytes I was not sure how hard an ABI breakage would be. I think the small number of users of raw samples is manageable, but I understand if you This could be an alternative. Though, it duplicates code paths and introduces ugly sample type checks. Another alternative would be to check the size value, if it is (n * sizeof(u64)) we could asume 64 bit alignment. But maybe this makes things much worse. -Robert -- Advanced Micro Devices, Inc. Operating System Research Center email: robert.richter@amd.com --
Yeah there was an alignment problem in sparc 64 that I fixed in perf tools lately. The fix is more a hack though, the real solution would be to have this alignment thing fixed. I don't know how many people use it. But I prefer not to take that It doesn't duplicate much code paths, we only have a few corner cases to plug in. And more importantly, that would be temporary if we schedule the older PERF_SAMPLE_RAW in, say, three releases from now. This ensures an easy forward compatibility (older perf tools -> newer kernel). But the backward compatibility is less easy (newer perf tools -> older kernel) as it means we need to test dynamically if we have PERF_SAMPLE_RAW_ALIGNED, otherwise we need to fall back to using the older one. --
Frederic, I will send an updated version with PERF_SAMPLE_RAW_ALIGNED. -Robert -- Advanced Micro Devices, Inc. Operating System Research Center email: robert.richter@amd.com --
