Hi, Paul, there are some sh changes in this set. I couldn't test them by myself, do you think you could give it a try? You can pull the set from this branch: git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git perf/hw-breakpoints This new shot brings the following features: - You can now profile userspace memory accesses from perf, whenever you are profiling cpu wide or task wide - You can profile kernel memory accesses from perf if you are doing a task bound profiling (previously you could do it only for cpu wide profiling). That requires admin privileges though. So if you want to profile the global variable "A" in your "test" application. Get the address of "A" and do: perf record -e mem:addr_of_A:rw ./test - Separate constraint space between data and instruction breakpoints. If you have separate addr registers for your instruction and data breakpoints, one type won't eat the remaining slot constraints of the other in the constraint table. - Handle the weight of a breakpoint. Simple breakpoints only consume one register but range breakpoints or other complicated things may eat more. Handle this case in the constraint table. - Handle the fact we don't always know at build time the total number of available breakpoint registers. For example Arm needs to get this information dynamically on runtime. TODO: - ptrace usually get breakpoint informations through virtual arch regs provided by the user. This requires a wasteful level of indirection where we need to translate the arch info to generic info, and to translate back again to arch info. I need to implement a shortcut for the ptrace case. Not only is it wasteful, but we may lose some information in the process as the generic interface will never be able to integrate 100% of the subtleties that archs can implement for their breakpoints which can be expressed through arch register images in ptrace. We only want the most common features in ...
We stopped checking disabled breakpoints because we weren't allowing breakpoints on NULL addresses. And gdb tends to set NULL addresses on inactive breakpoints. But refusing NULL addresses was actually a regression that has been fixed now. There is no reason anymore to not validate inactive breakpoint settings. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Cc: K. Prasad <prasad@linux.vnet.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Ingo Molnar <mingo@elte.hu> --- kernel/hw_breakpoint.c | 12 +----------- 1 files changed, 1 insertions(+), 11 deletions(-) diff --git a/kernel/hw_breakpoint.c b/kernel/hw_breakpoint.c index 03808ed..9ed9ae3 100644 --- a/kernel/hw_breakpoint.c +++ b/kernel/hw_breakpoint.c @@ -316,17 +316,7 @@ int register_perf_hw_breakpoint(struct perf_event *bp) if (ret) return ret; - /* - * Ptrace breakpoints can be temporary perf events only - * meant to reserve a slot. In this case, it is created disabled and - * we don't want to check the params right now (as we put a null addr) - * But perf tools create events as disabled and we want to check - * the params for them. - * This is a quick hack that will be removed soon, once we remove - * the tmp breakpoints from ptrace - */ - if (!bp->attr.disabled || !bp->overflow_handler) - ret = arch_validate_hwbkpt_settings(bp, bp->ctx->task); + ret = arch_validate_hwbkpt_settings(bp, bp->ctx->task); /* if arch_validate_hwbkpt_settings() fails then release bp slot */ if (ret) -- 1.6.2.3 --
The breakpoint generic layer assumes that archs always know in advance
the static number of address registers available to host breakpoints
through the HBP_NUM macro.
However this is not true for every archs. For example Arm needs to get
this information dynamically to handle the compatiblity between
different versions.
To solve this, this patch proposes to drop the static HBP_NUM macro
and let the arch provide the number of available slots through a
new hw_breakpoint_slots() function. For archs that have
CONFIG_HAVE_MIXED_BREAKPOINTS_REGS selected, it will be called once
as the number of registers fits for instruction and data breakpoints
together.
For the others it will be called first to get the number of
instruction breakpoint registers and another time to get the
data breakpoint registers, the targeted type is given as a
parameter of hw_breakpoint_slots().
Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Cc: K. Prasad <prasad@linux.vnet.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Ingo Molnar <mingo@elte.hu>
---
arch/sh/include/asm/hw_breakpoint.h | 5 +++
arch/x86/include/asm/hw_breakpoint.h | 5 +++
include/linux/hw_breakpoint.h | 10 ++++++
kernel/hw_breakpoint.c | 53 ++++++++++++++++++++++++++-------
kernel/trace/trace_ksym.c | 26 ++++------------
5 files changed, 68 insertions(+), 31 deletions(-)
diff --git a/arch/sh/include/asm/hw_breakpoint.h b/arch/sh/include/asm/hw_breakpoint.h
index c575cf5..8e5e433 100644
--- a/arch/sh/include/asm/hw_breakpoint.h
+++ b/arch/sh/include/asm/hw_breakpoint.h
@@ -46,6 +46,11 @@ struct pmu;
/* Maximum number of UBC channels */
#define HBP_NUM 2
+static inline int hw_breakpoint_slots(int type)
+{
+ return HBP_NUM;
+}
+
...Depending on their nature and on what an arch supports, breakpoints
may consume more than one address register. For example a simple
absolute address match usually only requires one address register.
But an address range match may consume two registers.
Currently our slot allocation constraints, that tend to reflect the
limited arch's resources, always consider that a breakpoint consumes
one slot.
Then provide a way for archs to tell us the weight of a breakpoint
through a new hw_breakpoint_weight() helper. This weight will be
computed against the generic allocation constraints instead of
a constant value.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Cc: K. Prasad <prasad@linux.vnet.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
---
kernel/hw_breakpoint.c | 63 ++++++++++++++++++++++++++++++++++-------------
1 files changed, 45 insertions(+), 18 deletions(-)
diff --git a/kernel/hw_breakpoint.c b/kernel/hw_breakpoint.c
index 8ead134..974498b 100644
--- a/kernel/hw_breakpoint.c
+++ b/kernel/hw_breakpoint.c
@@ -77,6 +77,11 @@ struct bp_busy_slots {
/* Serialize accesses to the above constraints */
static DEFINE_MUTEX(nr_bp_mutex);
+__weak int hw_breakpoint_weight(struct perf_event *bp)
+{
+ return 1;
+}
+
static inline enum bp_type_idx find_slot_idx(struct perf_event *bp)
{
if (bp->attr.bp_type & HW_BREAKPOINT_RW)
@@ -124,7 +129,7 @@ static int task_bp_pinned(struct task_struct *tsk, enum bp_type_idx type)
list_for_each_entry(bp, list, event_entry) {
if (bp->attr.type == PERF_TYPE_BREAKPOINT)
if (find_slot_idx(bp) == type)
- count++;
+ count += hw_breakpoint_weight(bp);
}
raw_spin_unlock_irqrestore(&ctx->lock, flags);
@@ -174,25 +179,40 @@ fetch_bp_busy_slots(struct bp_busy_slots *slots, struct perf_event ...There are two outstanding fashions for archs to implement hardware breakpoints. The first is to separate breakpoint address pattern definition space between data and instruction breakpoints. We then have typically distinct instruction address breakpoint registers and data address breakpoint registers, delivered with separate control registers for data and instruction breakpoints as well. This is the case of PowerPc and ARM for example. The second consists in having merged breakpoint address space definition between data and instruction breakpoint. Address registers can host either instruction or data address and the access mode for the breakpoint is defined in a control register. This is the case of x86 and Super H. This patch adds a new CONFIG_HAVE_MIXED_BREAKPOINTS_REGS config that archs can select if they belong to the second case. Those will have their slot allocation merged for instructions and data breakpoints. The others will have a separate slot tracking between data and instruction breakpoints. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Cc: K. Prasad <prasad@linux.vnet.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> --- arch/Kconfig | 11 +++++ arch/sh/Kconfig | 1 + arch/x86/Kconfig | 1 + include/linux/hw_breakpoint.h | 9 +++- kernel/hw_breakpoint.c | 86 ++++++++++++++++++++++++++++------------- 5 files changed, 78 insertions(+), 30 deletions(-) diff --git a/arch/Kconfig b/arch/Kconfig index f06010f..acda512 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -137,6 +137,17 @@ config HAVE_HW_BREAKPOINT bool depends on PERF_EVENTS +config HAVE_MIXED_BREAKPOINTS_REGS + bool + depends on HAVE_HW_BREAKPOINT + help + Depending on the arch implementation of ...
The current policies of breakpoints in x86 and SH are the following: - task bound breakpoints can only break on userspace addresses - cpu wide breakpoints can only break on kernel addresses The former rule prevents ptrace breakpoints to be set to trigger on kernel addresses, which is good. But as a side effect, we can't breakpoint on kernel addresses for task bound breakpoints. The latter rule simply makes no sense, there is no reason why we can't set breakpoints on userspace while performing cpu bound profiles. We want the following new policies: - task bound breakpoint can set userspace address breakpoints, with no particular privilege required. - task bound breakpoints can set kernelspace address breakpoints but must be privileged to do that. - cpu bound breakpoints can do what they want as they are privileged already. To implement these new policies, this patch checks if we are dealing with a kernel address breakpoint, if so and if the exclude_kernel parameter is set, we tell the user that the breakpoint is invalid, which makes a good generic ptrace protection. If we don't have exclude_kernel, ensure the user has the right privileges as kernel breakpoints are quite sensitive (risk of trap recursion attacks and global performance impacts). Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Cc: K. Prasad <prasad@linux.vnet.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Ingo Molnar <mingo@elte.hu> --- arch/sh/include/asm/hw_breakpoint.h | 3 +- arch/sh/kernel/hw_breakpoint.c | 31 +++++-------------------- arch/x86/include/asm/hw_breakpoint.h | 5 +-- arch/x86/kernel/hw_breakpoint.c | 41 +++++----------------------------- kernel/hw_breakpoint.c | 26 +++++++++++++++++++- ...
Sorry for the part on this file, it's an unrelated leftover, I'll drop it from the final set. --
You forgot to kill off the existing arch_validate_hwbkpt_settings() definition, so this part at least breaks the build. --
We were also using the va_in_userspace check for the case of signal
delivery, so I've just inverted the test for that. Perhaps there's a
cleaner way to handle it, though.
Other than that, everything seems to work ok. The ksym tracer and ptrace
tests still pass at least. Feel free to add my Tested/Acked-by on the
rest if you're respinning it at some point.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
---
diff --git a/arch/sh/include/asm/hw_breakpoint.h b/arch/sh/include/asm/hw_breakpoint.h
index 4d5e514..89890f6 100644
--- a/arch/sh/include/asm/hw_breakpoint.h
+++ b/arch/sh/include/asm/hw_breakpoint.h
@@ -54,8 +54,6 @@ static inline int hw_breakpoint_slots(int type)
/* arch/sh/kernel/hw_breakpoint.c */
extern int arch_check_bp_in_kernelspace(struct perf_event *bp);
extern int arch_validate_hwbkpt_settings(struct perf_event *bp);
-extern int arch_validate_hwbkpt_settings(struct perf_event *bp,
- struct task_struct *tsk);
extern int hw_breakpoint_exceptions_notify(struct notifier_block *unused,
unsigned long val, void *data);
diff --git a/arch/sh/kernel/hw_breakpoint.c b/arch/sh/kernel/hw_breakpoint.c
index 67564e3..efae6ab 100644
--- a/arch/sh/kernel/hw_breakpoint.c
+++ b/arch/sh/kernel/hw_breakpoint.c
@@ -344,8 +344,7 @@ static int __kprobes hw_breakpoint_handler(struct die_args *args)
perf_bp_event(bp, args->regs);
/* Deliver the signal to userspace */
- if (arch_check_va_in_userspace(bp->attr.bp_addr,
- bp->attr.bp_len)) {
+ if (!arch_check_bp_in_kernelspace(bp)) {
siginfo_t info;
info.si_signo = args->signr;
--
That looks fine. I wonder if I can assume that such tests, that are the same between x86 and Sh, can apply to every archs, in which Great, I'll then merge this fix to the appropriate patch and add the appropriate tags. --
Tag ptrace breakpoints with the exclude_kernel attribute set. This
will make it easier to set generic policies on breakpoints, when it
comes to ensure nobody unpriviliged try to breakpoint on the kernel.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Cc: K. Prasad <prasad@linux.vnet.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
---
arch/sh/kernel/ptrace_32.c | 2 +-
arch/x86/kernel/ptrace.c | 2 +-
include/linux/hw_breakpoint.h | 6 ++++++
3 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/arch/sh/kernel/ptrace_32.c b/arch/sh/kernel/ptrace_32.c
index 7759a9a..d4104ce 100644
--- a/arch/sh/kernel/ptrace_32.c
+++ b/arch/sh/kernel/ptrace_32.c
@@ -85,7 +85,7 @@ static int set_single_step(struct task_struct *tsk, unsigned long addr)
bp = thread->ptrace_bps[0];
if (!bp) {
- hw_breakpoint_init(&attr);
+ ptrace_breakpoint_init(&attr);
attr.bp_addr = addr;
attr.bp_len = HW_BREAKPOINT_LEN_2;
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index 055be0a..70c4872 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -688,7 +688,7 @@ static int ptrace_set_breakpoint_addr(struct task_struct *tsk, int nr,
struct perf_event_attr attr;
if (!t->ptrace_bps[nr]) {
- hw_breakpoint_init(&attr);
+ ptrace_breakpoint_init(&attr);
/*
* Put stub len and type to register (reserve) an inactive but
* correct bp
diff --git a/include/linux/hw_breakpoint.h b/include/linux/hw_breakpoint.h
index c70d27a..a0aa5a9 100644
--- a/include/linux/hw_breakpoint.h
+++ b/include/linux/hw_breakpoint.h
@@ -34,6 +34,12 @@ static inline void hw_breakpoint_init(struct perf_event_attr *attr)
attr->sample_period = 1;
}
+static inline void ptrace_breakpoint_init(struct perf_event_attr ...