The perf-events backend for OProfile that Will Deacon wrote in 8c1fc96f6fd1f361428ba805103af0d0eee65179 ("ARM: 6072/1: oprofile: use perf-events framework as backend") is of use to more architectures than just ARM. Move the code into drivers/oprofile/ so that SH can use it instead of the nearly identical copy of its OProfile code. The benefit of the backend is that it becomes necessary to only maintain one copy of the PMU accessor functions for each architecture, with bug fixes and new features benefiting both OProfile and perf. Note that I haven't been able to test these patches on an ARM board to see if I've caused any regressions. If anyone else could do that I'd appreciate it. This patch series is based on tip/master. Matt Fleming (3): sh: Accessor functions for the sh_pmu state oprofile: Abstract the perf-events backend for oprofile sh: Use the perf-events backend for oprofile arch/arm/oprofile/Makefile | 4 + arch/arm/oprofile/common.c | 176 +----------------------------------- arch/sh/include/asm/perf_event.h | 2 + arch/sh/kernel/perf_event.c | 13 +++ arch/sh/oprofile/Makefile | 4 + arch/sh/oprofile/common.c | 95 +++---------------- arch/sh/oprofile/op_impl.h | 33 ------- drivers/oprofile/oprofile_perf.c | 188 ++++++++++++++++++++++++++++++++++++++ include/linux/oprofile.h | 12 +++ 9 files changed, 240 insertions(+), 287 deletions(-) delete mode 100644 arch/sh/oprofile/op_impl.h create mode 100644 drivers/oprofile/oprofile_perf.c -- 1.7.2.1 --
Use the perf-events based wrapper for oprofile available in
drivers/oprofile. This allows us to centralise the code to control
performance counters.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
---
arch/sh/oprofile/Makefile | 4 ++
arch/sh/oprofile/common.c | 95 ++++++-------------------------------------
arch/sh/oprofile/op_impl.h | 33 ---------------
3 files changed, 18 insertions(+), 114 deletions(-)
delete mode 100644 arch/sh/oprofile/op_impl.h
diff --git a/arch/sh/oprofile/Makefile b/arch/sh/oprofile/Makefile
index 4886c5c..00ef946 100644
--- a/arch/sh/oprofile/Makefile
+++ b/arch/sh/oprofile/Makefile
@@ -6,4 +6,8 @@ DRIVER_OBJS = $(addprefix ../../../drivers/oprofile/, \
oprofilefs.o oprofile_stats.o \
timer_int.o )
+ifeq ($(CONFIG_PERF_EVENTS), y)
+DRIVER_OBJS += $(addprefix ../../../drivers/oprofile/, oprofile_perf.o)
+endif
+
oprofile-y := $(DRIVER_OBJS) common.o backtrace.o
diff --git a/arch/sh/oprofile/common.c b/arch/sh/oprofile/common.c
index ac60493..b5b0d5e 100644
--- a/arch/sh/oprofile/common.c
+++ b/arch/sh/oprofile/common.c
@@ -17,73 +17,23 @@
#include <linux/init.h>
#include <linux/errno.h>
#include <linux/smp.h>
+#include <linux/perf_event.h>
#include <asm/processor.h>
-#include "op_impl.h"
-
-static struct op_sh_model *model;
-
-static struct op_counter_config ctr[20];
extern void sh_backtrace(struct pt_regs * const regs, unsigned int depth);
-static int op_sh_setup(void)
-{
- /* Pre-compute the values to stuff in the hardware registers. */
- model->reg_setup(ctr);
-
- /* Configure the registers on all cpus. */
- on_each_cpu(model->cpu_setup, NULL, 1);
-
- return 0;
-}
-
-static int op_sh_create_files(struct super_block *sb, struct dentry *root)
-{
- int i, ret = 0;
-
- for (i = 0; i < model->num_counters; i++) {
- struct dentry *dir;
- char buf[4];
-
- snprintf(buf, sizeof(buf), "%d", i);
- dir = oprofilefs_mkdir(sb, root, buf);
-
- ret |= oprofilefs_create_ulong(sb, ...Introduce some accessor functions for getting at the name and number of
counters of the current sh_pmu instance.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
---
arch/sh/include/asm/perf_event.h | 2 ++
arch/sh/kernel/perf_event.c | 13 +++++++++++++
2 files changed, 15 insertions(+), 0 deletions(-)
diff --git a/arch/sh/include/asm/perf_event.h b/arch/sh/include/asm/perf_event.h
index 3d0c9f3..5b7fa84 100644
--- a/arch/sh/include/asm/perf_event.h
+++ b/arch/sh/include/asm/perf_event.h
@@ -25,6 +25,8 @@ struct sh_pmu {
extern int register_sh_pmu(struct sh_pmu *);
extern int reserve_pmc_hardware(void);
extern void release_pmc_hardware(void);
+extern int sh_pmu_num_events(void);
+extern const char *sh_pmu_name(void);
static inline void set_perf_event_pending(void)
{
diff --git a/arch/sh/kernel/perf_event.c b/arch/sh/kernel/perf_event.c
index 7a3dc35..086f788 100644
--- a/arch/sh/kernel/perf_event.c
+++ b/arch/sh/kernel/perf_event.c
@@ -60,6 +60,19 @@ static inline int sh_pmu_initialized(void)
}
/*
+ * Return the number of events for the current sh_pmu.
+ */
+int sh_pmu_num_events(void)
+{
+ return sh_pmu->num_events;
+}
+
+const char *sh_pmu_name(void)
+{
+ return sh_pmu->name;
+}
+
+/*
* Release the PMU if this is the last perf_event.
*/
static void hw_perf_event_destroy(struct perf_event *event)
--
1.7.2.1
--
Move the perf-events backend from arch/arm/oprofile into
drivers/oprofile so that the code can be shared between architectures.
This allows each architecture to maintain only a single copy of the
PMU accessor functions instead of one for both perf and OProfile. It
also becomes possible for other architectures to delete much of their
OProfile code in favour of the common code now available in
drivers/oprofile/oprofile_perf.c.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
---
arch/arm/oprofile/Makefile | 4 +
arch/arm/oprofile/common.c | 176 +-----------------------------------
drivers/oprofile/oprofile_perf.c | 188 ++++++++++++++++++++++++++++++++++++++
include/linux/oprofile.h | 12 +++
4 files changed, 207 insertions(+), 173 deletions(-)
create mode 100644 drivers/oprofile/oprofile_perf.c
diff --git a/arch/arm/oprofile/Makefile b/arch/arm/oprofile/Makefile
index e666eaf..def77a1 100644
--- a/arch/arm/oprofile/Makefile
+++ b/arch/arm/oprofile/Makefile
@@ -6,4 +6,8 @@ DRIVER_OBJS = $(addprefix ../../../drivers/oprofile/, \
oprofilefs.o oprofile_stats.o \
timer_int.o )
+ifeq ($(CONFIG_HW_PERF_EVENTS),y)
+DRIVERS_OBJS += $(addprefix ../../../drivers/oprofile/, oprofile_perf.o)
+endif
+
oprofile-y := $(DRIVER_OBJS) common.o
diff --git a/arch/arm/oprofile/common.c b/arch/arm/oprofile/common.c
index 0691176..50e4980 100644
--- a/arch/arm/oprofile/common.c
+++ b/arch/arm/oprofile/common.c
@@ -25,136 +25,9 @@
#include <asm/ptrace.h>
#ifdef CONFIG_HW_PERF_EVENTS
-/*
- * Per performance monitor configuration as set via oprofilefs.
- */
-struct op_counter_config {
- unsigned long count;
- unsigned long enabled;
- unsigned long event;
- unsigned long unit_mask;
- unsigned long kernel;
- unsigned long user;
- struct perf_event_attr attr;
-};
-
static int op_arm_enabled;
static DEFINE_MUTEX(op_arm_mutex);
-static struct op_counter_config *counter_config;
-static struct perf_event ...Nice, I didn't know such a backend already existed. Now that you have made it generic we should aim towards making it the only oprofile backend and getting rid of all the duplication. --
Definitely. I've added linux-arch in case there's some maintainers that want to use this now. The new generic code is in this patch, http://lkml.org/lkml/2010/8/23/118 For an example of how the SH oprofile stuff changed see, http://lkml.org/lkml/2010/8/23/117 --
Even better would be to do the surgery at a higher level and provide the oprofile API without the oprofile buffer management. My experience is that it doesn't scale, and on heavily threaded large SMP setup, there is an enormous amount of time wasted contending on the global oprofile buffer mutex. Cheers, Ben. --
Ben, Ideally, the OProfile userspace tools would talk native perf and we could do away with oprofilefs and the oprofile buffer altogether. Compatibility with the old [read current] OProfile tools can be maintained using these patches until the API is deprecated. Or is that a bit too optimistic? :) Will --
Heh, well, dunno, I suspect just moving userspace to native perf tools is going to happen as soon is not sooner :-) Ben. --
Hi Matt,
The downside is that it's only really applicable if all the subarch
targets which have OProfile support have equivalent perf support. I know
this is the case for SH and ARM, but I'm not sure about other
I tried to test them but they don't compile:
arch/arm/oprofile/common.c: In function 'oprofile_arch_exit':
arch/arm/oprofile/common.c:234: error: 'perf_events' undeclared (first use in this function)
arch/arm/oprofile/common.c:234: error: (Each undeclared identifier is reported only once
arch/arm/oprofile/common.c:234: error: for each function it appears in.)
arch/arm/oprofile/common.c:237: error: 'perf_num_counters' undeclared (first use in this function)
arch/arm/oprofile/common.c:246: error: 'counter_config' undeclared (first use in this function)
This is because the oprofile_arch_exit implementation for ARM frees
data structures that were previously allocated in oprofile_arch_init.
Since this is now done in op_perf_create_files, I'm not sure where the
freeing should be done. OProfile can be compiled as a module, so this
does need to be implemented somewhere (plus, if oprofile_arch_init fails
oprofile_arch_exit is called immediately). Perhaps an op_perf_exit()
function could be called from the arch code?
Looking at the existing ARM implementation, it's not entirely safe in
the case that oprofile_arch_init fails and needs something like:
diff --git a/arch/arm/oprofile/common.c b/arch/arm/oprofile/common.c
index 0691176..15d379f 100644
--- a/arch/arm/oprofile/common.c
+++ b/arch/arm/oprofile/common.c
@@ -275,10 +275,12 @@ out:
return ret;
}
-static void exit_driverfs(void)
+static void __exit exit_driverfs(void)
{
- platform_device_unregister(oprofile_pdev);
- platform_driver_unregister(&oprofile_driver);
+ if (!IS_ERR_OR_NULL(oprofile_pdev)) {
+ platform_device_unregister(oprofile_pdev);
+ platform_driver_unregister(&oprofile_driver);
+ }
}
#else
static int __init init_driverfs(void) ...Sure. This doesn't have to be a flag day. Architectures can move over if and when they're ready. I haven't looked very closely at any other architectures Eek! I totally messed this up, sorry. Thanks very much for compiling these and reviewing them. I've just grabbed an ARM toolchain so I'll compile the next version before I post it ;-) You've highlighted a good point - the allocation and freeing is done in the wrong places. We need a function in drivers/oprofile/oprofile_perf.c that is called from oprofile_arch_init() that allocates the 'counter_config' and 'perf_events[]' data structures. These can then be Ah, I see what you mean. I'll fold this change into my series to avoid conflicts, but as a separate patch. I'll retain your authorship in the commit just be sure to check it when I send out V2 of this series to make sure I haven't messed your patch up. Thanks for the review! --
