Re: [PATCHv10 2.6.35-rc6-tip 9/14] trace: uprobes trace_event interface

From: Srikar Dronamraju
Date: Tuesday, July 27, 2010 - 4:08 am

Changelog from V9:
 - Resolved comments from Arnaldo on perf support for uprobes.
 - perf probe -S will now list only global binding functions as
   requested by Christoph Hellwig.
 - Moved Changelog to below Signed-off-by: line, so that it's not part
   of the patch description. (Suggested by Christoph.)

Changelog from V8:
 - Fix build issues reported by Christoph.
 - List available probes in a file without need to specify pid.

Changelog from V7:
 - New feature: perf probe lists available probes.
 - Fix perf probes for uprobes to exit with an error message on dwarf
   based probes.
 - Merge changes to kprobes traceevent infrastructure.
 - Merge changes to perf.

Changelog from V6:
 - Remove perf adjust symbols patch.

Changelog from V5:
  - Merged user_bkpt and user_bkpt_xol into uprobes.
  - Addressed comments till now.

Changelog from V4:
  - Rebased to tip tree. (2.6.35-rc3-tip)

Changelog from v3:
  - Reverted to background page replacement as suggested by Peter Zijlstra.
  - Dso in 'perf probe' can be either a short name or an absolute path.
  - Addressed comments from Masami, Frederic, Steven on traceevents and perf.

Changelog from v2:
  - Addressed comments from Oleg, including removal of interrupt context
    handlers, reverting background page replacement in favour of
    access_process_vm().

  - Provides perf interface for uprobes.

Changelog from v1:
 - Added trace_event interface for uprobes.
 - Addressed comments from Andrew Morton and Randy Dunlap.

For previous postings, please refer to http://lkml.org/lkml/2010/7/12/67,
http://lkml.org/lkml/2010/7/8/239, http://lkml.org/lkml/2010/6/29/299,
http://lkml.org/lkml/2010/6/14/41, http://lkml.org/lkml/2010/3/20/107
and http://lkml.org/lkml/2010/5/18/307

This patchset implements Uprobes which enables you to dynamically break
into any routine in a user space application and collect information
non-disruptively.

This patchset is a rework based on suggestions from discussions on lkml
in January ...
From: Srikar Dronamraju
Date: Tuesday, July 27, 2010 - 4:09 am

User bkpt will use the background page replacement approach to insert/delete
breakpoints. This approach is based on replace_page() and
write_protect_page(). Consequently, replace_page() and write_protect_page()
lose their static attribute.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
---

 include/linux/mm.h |    4 ++
 mm/ksm.c           |  112 -------------------------------------------------
 mm/memory.c        |  120 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 124 insertions(+), 112 deletions(-)


diff --git a/include/linux/mm.h b/include/linux/mm.h
index a2b4804..0fafef9 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -858,6 +858,10 @@ void account_page_dirtied(struct page *page, struct address_space *mapping);
 int set_page_dirty(struct page *page);
 int set_page_dirty_lock(struct page *page);
 int clear_page_dirty_for_io(struct page *page);
+int replace_page(struct vm_area_struct *vma, struct page *page,
+					struct page *kpage, pte_t orig_pte);
+int write_protect_page(struct vm_area_struct *vma, struct page *page,
+						      pte_t *orig_pte);
 
 extern unsigned long move_page_tables(struct vm_area_struct *vma,
 		unsigned long old_addr, struct vm_area_struct *new_vma,
diff --git a/mm/ksm.c b/mm/ksm.c
index 6c3e99b..ce432e1 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -713,118 +713,6 @@ static inline int pages_identical(struct page *page1, struct page *page2)
 	return !memcmp_pages(page1, page2);
 }
 
-static int write_protect_page(struct vm_area_struct *vma, struct page *page,
-			      pte_t *orig_pte)
-{
-	struct mm_struct *mm = vma->vm_mm;
-	unsigned long addr;
-	pte_t *ptep;
-	spinlock_t *ptl;
-	int swapped;
-	int err = -EFAULT;
-
-	addr = page_address_in_vma(page, vma);
-	if (addr == -EFAULT)
-		goto out;
-
-	ptep = page_check_address(page, mm, addr, &ptl, 0);
-	if (!ptep)
-		goto out;
-
-	if (pte_write(*ptep)) {
-		pte_t ...
From: Srikar Dronamraju
Date: Tuesday, July 27, 2010 - 4:09 am

Provides a mechanism in kernel to insert/remove breakpoints in
user space applications including
   - architecture independent mechanism to establish breakpoints in
     userspace applications.
   - helper functions for reading/writing/validating data/opcodes from
     target process's address space.
   - wrappers and default implementation (wherever possible) of
     architecture dependent functions (setting breakpoint)
   - preprocessing and postprocessing of singlestep on breakpoint hit

Single stepping inline is the traditional method where the original
instructions replace the breakpointed instructions on a breakpoint
hit.  This method works well with single-threaded applications.
However, it is racy with multithreaded applications.

In execution out of line, threads single-step on a copy of the
instruction. This method works well for both single-threaded and
multithreaded applications.

Uprobes uses the execution out of line method.

There could be other strategies like emulating an instruction. However
they are currently not implemented.

Insertion and removal of breakpoints is by "Background page
replacement", i.e., make a copy of the page, modify its contents,
set the pagetable and flush the TLBs. This uses an enhanced
replace_page() to COW the page. The modified page is visible only to the
process of interest. Others sharing the page will still see the old
copy.

You need to follow this up with the uprobes patch for your
architecture to define architecture specific functionality for
reading/writing/validating data/opcodes.

Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---

Changelog from V5: (Merge user_bkpt into uprobes)
  * Merged user_bkpt into uprobes as suggested by Christoph Hellwig
    and Peter Zijlstra.

Changelog from V3: (reimplement background page replacement)
  * Reimplemented background page replacement based on inputs
    from Peter Zijlstra.

Changelog from v2: (addressing comments ...
From: Srikar Dronamraju
Date: Tuesday, July 27, 2010 - 4:09 am

Provides slot allocation mechanism for execution out of line for use
with user space breakpointing.

The traditional method of replacing the original instructions on
breakpoint hit is racy when used on multithreaded applications.

Alternatives for the traditional method include:
	- Emulating the breakpointed instruction.
	- Execution out of line.

Emulating the instruction:
	This approach would use an in-kernel instruction emulator to
emulate the breakpointed instruction. This approach could be looked
into at a later point in time.

Execution out of line:
	In the execution out of line strategy, a new vma is injected into
the target process, and a copy of each breakpointed instruction
is stored in one of its slots. On breakpoint hit, the copy of the
instruction is single-stepped, leaving the breakpoint instruction as
is.  This method is architecture independent.

This method is useful while handling multithreaded processes.

This patch allocates one page per process for slots to be used to copy
the breakpointed instructions.

Current slot allocation mechanism:
1. Allocate one dedicated slot per user breakpoint. Each slot is big
enough to accommodate the biggest instruction for that architecture. (16
bytes for x86).
2. We currently allocate only one page for slots. Hence the number of
slots is limited to active breakpoint hits on that process.
3. Bitmap to track used slots.
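
The allocation scheme listed above can be sketched in user-space C roughly
as follows. This is only an illustration and not the code in
kernel/uprobes.c: the names xol_area, xol_init, xol_take_slot,
xol_copy_insn and xol_free_slot are hypothetical, and the 4096-byte page
size is an assumption. The 16-byte slot size is the x86 figure quoted in
item 1.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define XOL_PAGE_SIZE  4096u  /* assumption: one 4 KiB page per process */
#define XOL_SLOT_BYTES 16u    /* biggest x86 instruction, per item 1 */
#define XOL_NSLOTS     (XOL_PAGE_SIZE / XOL_SLOT_BYTES)
#define BITS_PER_WORD  (sizeof(unsigned long) * CHAR_BIT)
#define XOL_NWORDS     ((XOL_NSLOTS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical per-process area: one page of slots plus a usage bitmap. */
struct xol_area {
	unsigned long bitmap[XOL_NWORDS];
	unsigned char page[XOL_PAGE_SIZE];
};

static void xol_init(struct xol_area *area)
{
	memset(area, 0, sizeof(*area));
}

/* Claim the first free slot; returns its index, or -1 if the page is full. */
static int xol_take_slot(struct xol_area *area)
{
	for (unsigned int i = 0; i < XOL_NSLOTS; i++) {
		unsigned long mask = 1UL << (i % BITS_PER_WORD);
		unsigned long *word = &area->bitmap[i / BITS_PER_WORD];

		if (!(*word & mask)) {
			*word |= mask;
			return (int)i;
		}
	}
	return -1;
}

/* Copy a probed instruction into its slot; returns the slot address. */
static unsigned char *xol_copy_insn(struct xol_area *area, int slot,
				    const unsigned char *insn, size_t len)
{
	if (slot < 0 || (unsigned int)slot >= XOL_NSLOTS ||
	    len > XOL_SLOT_BYTES)
		return NULL;
	memcpy(&area->page[(size_t)slot * XOL_SLOT_BYTES], insn, len);
	return &area->page[(size_t)slot * XOL_SLOT_BYTES];
}

/* Clear the bitmap bit so that the slot can be handed out again. */
static void xol_free_slot(struct xol_area *area, int slot)
{
	if (slot >= 0 && (unsigned int)slot < XOL_NSLOTS)
		area->bitmap[(size_t)slot / BITS_PER_WORD] &=
			~(1UL << ((size_t)slot % BITS_PER_WORD));
}
```

Under these assumptions a single page holds 4096 / 16 = 256 slots, and
the bitmap costs one bit per slot.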

Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---

Changelog from V5: Merged into uprobes.

Changelog from V3:
   * Added a memory barrier after the slot gets initialized.

Changelog from V2: (addressing Oleg's comments)
   * Removed code in !CONFIG_UPROBES_XOL
   * Functions now pass pointer to uprobes_xol_area instead of pointer
     to void.

 include/linux/uprobes.h |    2 
 kernel/uprobes.c        |  283 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 285 insertions(+), 0 deletions(-)


diff --git ...
From: Srikar Dronamraju
Date: Tuesday, July 27, 2010 - 4:09 am

Provides x86 specific functions for instruction analysis and
instruction validation and x86 specific pre-processing and
post-processing of singlestep especially for RIP relative
instructions. Uses "x86: instruction decoder API" for validation and
analysis of user space instructions. This analysis is used at the time
of post-processing of breakpoint hit to do the necessary fix-ups.
There is support for breakpointing RIP relative instructions. However
there are still a few instructions that cannot be singlestepped.

Also defines TIF_UPROBE flag for x86.

This patch requires "x86: instruction decoder API"
http://lkml.org/lkml/2009/6/1/459

Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---

Changelog from V5: Merged into uprobes layer.

Changelog from V1:
   set UPROBES_FIX_SLEEPY if post_xol might sleep.

 arch/x86/Kconfig                   |    1 
 arch/x86/include/asm/thread_info.h |    2 
 arch/x86/include/asm/uprobes.h     |   43 +++
 arch/x86/kernel/Makefile           |    2 
 arch/x86/kernel/uprobes.c          |  547 ++++++++++++++++++++++++++++++++++++
 5 files changed, 595 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/include/asm/uprobes.h
 create mode 100644 arch/x86/kernel/uprobes.c


diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 3069a6d..8f2bdbf 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -54,6 +54,7 @@ config X86
 	select HAVE_KERNEL_LZO
 	select HAVE_HW_BREAKPOINT
 	select HAVE_MIXED_BREAKPOINTS_REGS
+	select ARCH_SUPPORTS_UPROBES
 	select PERF_EVENTS
 	select HAVE_PERF_EVENTS_NMI
 	select ANON_INODES
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index f0b6e5d..5b9c9f0 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -84,6 +84,7 @@ struct thread_info {
 #define TIF_SECCOMP		8	/* secure computing */
 #define TIF_MCE_NOTIFY		10	/* notify userspace of an MCE */
 #define ...
From: Srikar Dronamraju
Date: Tuesday, July 27, 2010 - 4:09 am

The uprobes infrastructure enables a user to dynamically establish
probepoints in user applications and collect information by executing
a handler function when a probepoint is hit.

The user specifies the virtual address and the pid of the process of
interest along with the action to be performed.  Uprobes uses the
execution out of line strategy and follows lazy slot allocation, i.e.,
on the first probe hit for that process, a new vma (to hold the probed
instructions for execution out of line) is allocated.  Once allocated,
this vma remains for the life of the process, and is reused as needed
for subsequent probes.  A slot in the vma is allocated for a
probepoint when it is first hit.

A slot is marked for reuse only when the probe gets unregistered and
there are no threads in the vicinity.
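
The slot lifetime described above can be modelled in user-space C along
the following lines. This is a sketch under stated assumptions, not the
uprobes implementation: slot_area stands in for the vma, every struct and
function name is hypothetical, the slot count of 256 is assumed, and
"no threads in the vicinity" is approximated by a simple counter of
threads currently stepping in the slot.

```c
#include <stdbool.h>
#include <stdlib.h>

#define NSLOTS 256  /* assumption: slots available in the per-process area */

/* Hypothetical stand-in for the vma that holds out-of-line copies. */
struct slot_area {
	bool used[NSLOTS];
};

struct process_state {
	struct slot_area *area;  /* NULL until the first probe hit */
};

struct probepoint {
	int slot;         /* -1 until this probepoint is first hit */
	int active_hits;  /* threads currently stepping in the slot */
};

/* Called on every probe hit; returns the slot index, or -1 on failure. */
static int on_probe_hit(struct process_state *proc, struct probepoint *pp)
{
	if (!proc->area) {
		/* First hit in this process: create the area lazily. */
		proc->area = calloc(1, sizeof(*proc->area));
		if (!proc->area)
			return -1;
	}
	if (pp->slot < 0) {
		/* First hit on this probepoint: claim a free slot. */
		for (int i = 0; i < NSLOTS; i++) {
			if (!proc->area->used[i]) {
				proc->area->used[i] = true;
				pp->slot = i;
				break;
			}
		}
	}
	if (pp->slot >= 0)
		pp->active_hits++;
	return pp->slot;
}

/* Called when a thread has finished single-stepping in the slot. */
static void on_step_done(struct probepoint *pp)
{
	pp->active_hits--;
}

/* Release the slot only if no thread is still using it. */
static bool on_unregister(struct process_state *proc, struct probepoint *pp)
{
	if (pp->slot < 0 || pp->active_hits > 0)
		return false;
	proc->area->used[pp->slot] = false;
	pp->slot = -1;
	return true;
}
```

Note that the area itself is never freed by on_unregister(); as stated
above, it remains for the life of the process.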

In a multithreaded process, a probepoint once registered is active for
all threads of a process. If a thread specific action for a probepoint
is required then the handler should be implemented to do the same.

If a breakpoint already exists at a particular address (irrespective
of who inserted the breakpoint including uprobes), uprobes will refuse
to register any more probes at that address.

You need to follow this up with the uprobes patch for your
architecture.

For more information, please refer to Documentation/uprobes.txt

TODO:
1. Allow multiple probes at a probepoint.
2. Booster probes.
3. Allow probes to be inherited across fork.
4. probing function returns.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
---

Changelog from V5:
    - Merged user_bkpt and user_bkpt_xol layers into uprobes.

Changelog from V2:
   - Introduce TIF_UPROBE flag.
   - uprobes hooks now in fork/exec/exit paths instead of tracehooks.
   - uprobe_process is now part of the mm struct and is shared between
     processes that share the mm.
   - per thread information is now allocated on the fly.
     * Hence allocation and ...
From: Srikar Dronamraju
Date: Tuesday, July 27, 2010 - 4:10 am

Provides x86 specific details for uprobes.
This includes interrupt notifier for uprobes, enabling/disabling
singlestep.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
---

Changelog from V5: Using local_irq_enable() instead of
    native_irq_enable and no more disabling irqs as suggested by Oleg
    Nesterov.

 arch/x86/kernel/signal.c  |   13 +++++++++++
 arch/x86/kernel/uprobes.c |   52 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+), 0 deletions(-)


diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 4fd173c..3657563 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -848,6 +848,19 @@ do_notify_resume(struct pt_regs *regs, void *unused, __u32 thread_info_flags)
 	if (thread_info_flags & _TIF_SIGPENDING)
 		do_signal(regs);
 
+	if (thread_info_flags & _TIF_UPROBE) {
+		clear_thread_flag(TIF_UPROBE);
+#ifdef CONFIG_X86_32
+		/*
+		 * On x86_32, do_notify_resume() gets called with
+		 * interrupts disabled. Hence enable interrupts if they
+		 * are still disabled.
+		 */
+		local_irq_enable();
+#endif
+		uprobe_notify_resume(regs);
+	}
+
 	if (thread_info_flags & _TIF_NOTIFY_RESUME) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
 		tracehook_notify_resume(regs);
diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
index 1eb85bb..9456328 100644
--- a/arch/x86/kernel/uprobes.c
+++ b/arch/x86/kernel/uprobes.c
@@ -26,6 +26,7 @@
 #include <linux/ptrace.h>
 #include <linux/uprobes.h>
 
+#include <linux/kdebug.h>
 #include <asm/insn.h>
 
 #ifdef CONFIG_X86_32
@@ -545,3 +546,54 @@ struct user_bkpt_arch_info user_bkpt_arch_info = {
 	.analyze_insn = analyze_insn,
 	.post_xol = post_xol,
 };
+
+/*
+ * Wrapper routine for handling exceptions.
+ */
+int uprobes_exception_notify(struct notifier_block *self,
+				       unsigned long val, void *data)
+{
+	struct die_args *args = data;
+	struct ...
From: Srikar Dronamraju
Date: Tuesday, July 27, 2010 - 4:10 am

Uprobes Documentation.

Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---

Changelog from V5: Removed references to Modules, Samples, and
   probe Overhead.

Changelog from v3: Updated measurements.

Changelog from v2: Updated measurements.

Changelog from v1: Addressed comments from Randy Dunlap.
		 : Updated measurements.

 Documentation/uprobes.txt |  193 +++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 193 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/uprobes.txt


diff --git a/Documentation/uprobes.txt b/Documentation/uprobes.txt
new file mode 100644
index 0000000..cf87136
--- /dev/null
+++ b/Documentation/uprobes.txt
@@ -0,0 +1,193 @@
+Title	: User-Space Probes (Uprobes)
+Authors	: Jim Keniston <jkenisto@us.ibm.com>
+	: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
+
+CONTENTS
+
+1. Concepts: Uprobes
+2. Architectures Supported
+3. API Reference
+4. Uprobes Features and Limitations
+5. TODO
+6. Uprobes Team
+
+1. Concepts: Uprobes
+
+Uprobes enables you to dynamically break into any routine in a
+user application and collect debugging and performance information
+non-disruptively. You can trap at any code address, specifying a
+kernel handler routine to be invoked when the breakpoint is hit.
+
+A uprobe can be inserted on any instruction in the application's
+virtual address space.  The registration function register_uprobe()
+specifies which process is to be probed, where the probe is to be
+inserted, and what handler is to be called when the probe is hit.
+
+Uprobes-based instrumentation can be packaged as a kernel
+module.  In the simplest case, the module's init function installs
+("registers") one or more probes, and the exit function unregisters
+them.
+
+1.1 How Does a Uprobe Work?
+
+When a uprobe is registered, Uprobes makes a copy of the probed
+instruction, stops the probed application, replaces the first byte(s)
+of the probed ...
From: Srikar Dronamraju
Date: Tuesday, July 27, 2010 - 4:10 am

Move common parts of trace_kprobe.c and trace_uprobe.c to
kernel/trace/trace_probe.h and kernel/trace/trace_probe.c, and adjust
kernel/trace/trace_kprobe.c accordingly.

TODO: Merge both events to a single probe event.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---

Changelog from v7: Merge changes due to string support in kprobes
	traceevent.

Changelog from v5: Addressed comments from Masami Hiramatsu
	and Steven Rostedt. Also shared a lot more code from kprobes
        traceevents.

 kernel/trace/Kconfig        |    4 
 kernel/trace/Makefile       |    1 
 kernel/trace/trace_kprobe.c |  742 +------------------------------------------
 kernel/trace/trace_probe.c  |  654 ++++++++++++++++++++++++++++++++++++++
 kernel/trace/trace_probe.h  |  155 +++++++++
 5 files changed, 823 insertions(+), 733 deletions(-)
 create mode 100644 kernel/trace/trace_probe.c
 create mode 100644 kernel/trace/trace_probe.h


diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index c7683fd..c681fa7 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -364,6 +364,7 @@ config KPROBE_EVENT
 	depends on HAVE_REGS_AND_STACK_ACCESS_API
 	bool "Enable kprobes-based dynamic events"
 	select TRACING
+	select PROBE_EVENTS
 	default y
 	help
 	  This allows the user to add tracing events (similar to tracepoints)
@@ -376,6 +377,9 @@ config KPROBE_EVENT
 	  This option is also required by perf-probe subcommand of perf tools.
 	  If you want to use perf tools, this option is strongly recommended.
 
+config PROBE_EVENTS
+	def_bool n
+
 config DYNAMIC_FTRACE
 	bool "enable/disable ftrace tracepoints dynamically"
 	depends on FUNCTION_TRACER
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index 438e84a..eb11f7d 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -53,5 +53,6 @@ endif
 obj-$(CONFIG_EVENT_TRACING) += trace_events_filter.o
 obj-$(CONFIG_KPROBE_EVENT) += trace_kprobe.o
 obj-$(CONFIG_EVENT_TRACING) += ...
From: Masami Hiramatsu
Date: Tuesday, July 27, 2010 - 6:22 am

Hi Srikar,


If you use "bool" for "is_kprobe", change "is_return" type too.

And, maybe you missed that the fetch function supports "string" type now,
which needs a bit different manner for storing fetched value. You can find
store_trace_args() function in trace_kprobe.c.

BTW, the current fetch functions don't support fetching "paged-out" user variables
because a kprobe can't sleep inside its handler.
However, user-space memory can be paged out, and I assume that uprobes allows
its handler to do I/O (and yield). If so, it can wait to access a paged-out
variable, can't it?

Thank you,

-- 
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com
--

From: Srikar Dronamraju
Date: Tuesday, July 27, 2010 - 7:03 am

Yes, I have seen your changes for supporting string type.  Though all
the fetch functions are in common code, uprobe-based probes for now
support register fetching only. We have to add support for other
types gradually. Please let me know if you see a reason to change the

Yes, the uprobes handler might sleep and hence we would have to
handle accessing the paged-out user variables. When perf-uprobes
starts supporting dwarf based probing, we should look into these
issues.  Currently it's a todo.

--
Thanks and Regards
Srikar
--

From: Masami Hiramatsu
Date: Wednesday, July 28, 2010 - 12:56 am

Great!

Thank you,

-- 
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com
--

From: Srikar Dronamraju
Date: Thursday, July 29, 2010 - 7:16 am

Masami, 
 Below patch should address the comments raised by you.
 Please let me know if this is fine with you.

--
Thanks and Regards
Srikar

---
From: Srikar Dronamraju <srikar@linux.vnet.ibm.com>


Move common parts of trace_kprobe.c and trace_uprobe.c to
kernel/trace/trace_probe.h and kernel/trace/trace_probe.c, and adjust
kernel/trace/trace_kprobe.c accordingly.

TODO: Merge both events to a single probe event.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---

Changelog from v7: Merge changes due to string support in kprobes
	traceevent.

Changelog from v5: Addressed comments from Masami Hiramatsu
	and Steven Rostedt. Also shared a lot more code from kprobes
        traceevents.

 kernel/trace/Kconfig        |    4 
 kernel/trace/Makefile       |    1 
 kernel/trace/trace_kprobe.c |  752 +------------------------------------------
 kernel/trace/trace_probe.c  |  654 +++++++++++++++++++++++++++++++++++++
 kernel/trace/trace_probe.h  |  155 +++++++++
 5 files changed, 828 insertions(+), 738 deletions(-)
 create mode 100644 kernel/trace/trace_probe.c
 create mode 100644 kernel/trace/trace_probe.h


diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index c7683fd..c681fa7 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -364,6 +364,7 @@ config KPROBE_EVENT
 	depends on HAVE_REGS_AND_STACK_ACCESS_API
 	bool "Enable kprobes-based dynamic events"
 	select TRACING
+	select PROBE_EVENTS
 	default y
 	help
 	  This allows the user to add tracing events (similar to tracepoints)
@@ -376,6 +377,9 @@ config KPROBE_EVENT
 	  This option is also required by perf-probe subcommand of perf tools.
 	  If you want to use perf tools, this option is strongly recommended.
 
+config PROBE_EVENTS
+	def_bool n
+
 config DYNAMIC_FTRACE
 	bool "enable/disable ftrace tracepoints dynamically"
 	depends on FUNCTION_TRACER
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index 438e84a..eb11f7d 100644
--- ...
From: Srikar Dronamraju
Date: Tuesday, July 27, 2010 - 4:10 am

Implements trace_event support for uprobes. In its
current form it can be used to put probes at a specified text address
in a process and dump the required registers when the code flow reaches
the probed address.

TODO: Documentation/trace/uprobetrace.txt

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---

Changelog from v5: Addressed comments from Masami Hiramatsu and Steven
      Rostedt. Some changes because of changes in common probe events.

Changelog from v4: (Merged to 2.6.35-rc3-tip)

Changelog from v2/v3: (Addressing comments from Steven Rostedt
					and Frederic Weisbecker)
	* removed pit field from uprobe_trace_entry.
	* share common parts with kprobe trace events.
	* use trace_create_file instead of debugfs_create_file.

The following example shows how to dump the instruction pointer and the %ax
register at the probed text address.

Start a process to trace. Get the address to trace.
  [Here pid is assumed to be 6016]
  [Address to trace is 0x0000000000446420]
  [Registers to be dumped are %ip and %ax]

# cd /sys/kernel/debug/tracing/
# echo 'p 6016:0x0000000000446420 %ip %ax' > uprobe_events
# cat uprobe_events
p:uprobes/p_6016_0x0000000000446420 6016:0x0000000000446420 %ip=%ip %ax=%ax
# cat events/uprobes/p_6016_0x0000000000446420/enable
0
[enable the event]
# echo 1 > events/uprobes/p_6016_0x0000000000446420/enable
# cat events/uprobes/p_6016_0x0000000000446420/enable
1
# #### do some activity on the program so that it hits the breakpoint
# cat uprobe_profile
  6016 p_6016_0x0000000000446420                                234
# cat trace
# tracer: nop
#
#           TASK-PID    CPU#    TIMESTAMP  FUNCTION
#              | |       |          |         |
             zsh-6016  [004] 227931.093579: p_6016_0x0000000000446420: (0x446420) %ip=446421 %ax=79
             zsh-6016  [005] 227931.097541: p_6016_0x0000000000446420: (0x446420) %ip=446421 %ax=79
             zsh-6016  [000] 227931.124909: p_6016_0x0000000000446420: (0x446420) %ip=446421 ...
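
The strings used in the session above can be built from a pid and an
address as sketched below. This is a hedged illustration only: the helper
names uprobe_event_name and uprobe_event_cmd are hypothetical, and the
"p_<pid>_0x<16 hex digits>" naming pattern is inferred from the single
example shown in this mail, not taken from the kernel source.

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Build the default event name seen in the session above,
 * e.g. "p_6016_0x0000000000446420". The pattern is an inference from
 * that one example. Returns the result of snprintf().
 */
static int uprobe_event_name(char *buf, size_t len, pid_t pid,
			     unsigned long long addr)
{
	return snprintf(buf, len, "p_%d_0x%016llx", (int)pid, addr);
}

/*
 * Build the line that is echoed into uprobe_events:
 * "p PID:ADDR ARG...", where args is a space-separated register list.
 */
static int uprobe_event_cmd(char *buf, size_t len, pid_t pid,
			    unsigned long long addr, const char *args)
{
	return snprintf(buf, len, "p %d:0x%016llx %s", (int)pid, addr, args);
}
```

With pid 6016, address 0x446420 and the argument string "%ip %ax", the
second helper produces the same line that is echoed into uprobe_events
in the session above.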
From: Masami Hiramatsu
Date: Wednesday, July 28, 2010 - 10:04 pm

Possible enhancement: Moving this config right after KPROBE_EVENT, because
 those two provide similar dynamic events.

Thank you,
-- 
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com
--

From: Srikar Dronamraju
Date: Wednesday, July 28, 2010 - 10:20 pm

Agree. 


--
Thanks and Regards
Srikar
--

From: Srikar Dronamraju
Date: Thursday, July 29, 2010 - 7:15 am

Masami, 
 Below patch should address the comments raised by you.

--
Thanks and Regards
Srikar

---
From: Srikar Dronamraju <srikar@linux.vnet.ibm.com>


Implements trace_event support for uprobes. In its
current form it can be used to put probes at a specified text address
in a process and dump the required registers when the code flow reaches
the probed address.

TODO: Documentation/trace/uprobetrace.txt

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---

Changelog from v5: Addressed comments from Masami Hiramatsu and Steven
      Rostedt. Some changes because of changes in common probe events.

Changelog from v4: (Merged to 2.6.35-rc3-tip)

Changelog from v2/v3: (Addressing comments from Steven Rostedt
					and Frederic Weisbecker)
	* removed pit field from uprobe_trace_entry.
	* share common parts with kprobe trace events.
	* use trace_create_file instead of debugfs_create_file.


The following example shows how to dump the instruction pointer and the %ax
register at the probed text address.

Start a process to trace. Get the address to trace.
  [Here pid is assumed to be 6016]
  [Address to trace is 0x0000000000446420]
  [Registers to be dumped are %ip and %ax]

# cd /sys/kernel/debug/tracing/
# echo 'p 6016:0x0000000000446420 %ip %ax' > uprobe_events
# cat uprobe_events
p:uprobes/p_6016_0x0000000000446420 6016:0x0000000000446420 %ip=%ip %ax=%ax
# cat events/uprobes/p_6016_0x0000000000446420/enable
0
[enable the event]
# echo 1 > events/uprobes/p_6016_0x0000000000446420/enable
# cat events/uprobes/p_6016_0x0000000000446420/enable
1
# #### do some activity on the program so that it hits the breakpoint
# cat uprobe_profile
  6016 p_6016_0x0000000000446420                                234
# cat trace
# tracer: nop
#
#           TASK-PID    CPU#    TIMESTAMP  FUNCTION
#              | |       |          |         |
             zsh-6016  [004] 227931.093579: p_6016_0x0000000000446420: (0x446420) %ip=446421 %ax=79
             zsh-6016  [005] ...
From: Frederic Weisbecker
Date: Sunday, August 1, 2010 - 7:28 pm

Why did you split CONFIG_UPROBES and CONFIG_UPROBE_EVENT?
Is there another kind of use of uprobes than through "trace events"?

Hmm, speaking about it, I think kprobes has the same problem. In fact
now I remember I noticed it in the past but we found another user of
kprobes, mostly unused I guess.

Anyway, uprobes config itself doesn't need to be split from uprobes events.

Also, what about the "Dynamic Probes" menu I proposed? (it's possibly a
crappy idea, I don't know).

--

From: Frederic Weisbecker
Date: Sunday, August 1, 2010 - 7:20 pm

In fact this could be a menu "Dynamic Probes", perhaps default off, inside
which Kprobes and Uprobes would be default on (but depend on "Dynamic Probes").

So that you can quickly enable them all in one.

--

From: Masami Hiramatsu
Date: Sunday, August 1, 2010 - 8:45 pm

Hmm, I disagree with it, because both Kprobes and Uprobes provide
APIs for modules too.

I'd like to suggest the config tree below

Kernel hacking
  - Kprobes
  - Uprobes
  - Tracing
     -- Dynamic Events
        depends on Kprobes || Uprobes
or
	select Kprobes && Uprobes

Thank you,

-- 
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com
--

From: Srikar Dronamraju
Date: Sunday, August 1, 2010 - 11:46 pm

I would agree with Masami since there could be people who might be
apprehensive about trying out Uprobes (which would still be experimental) but
would be interested in using kprobes only, since it's more mature.

One change I would suggest would be to select the respective events (i.e.,
kprobe_event, uprobe_event) instead of kprobes and uprobes.
	
-
Thanks and Regards
Srikar
--

From: Frederic Weisbecker
Date: Monday, August 2, 2010 - 12:58 am

Yeah sure. The goal was to still have both selectable independently, but
have a menu that can select all in one.

ie:

config DYNAMIC_PROBE
	depends on (KPROBES || UPROBES) && EVENT_TRACING
	default n

config KPROBE_EVENT
	depends on DYNAMIC_PROBE && KPROBES
	default y

config UPROBE_EVENT
	depends on DYNAMIC_PROBE && UPROBES
	default y


So that people who want dynamic probes just don't care and select dynamic probe.
Those who want more granularity can still unselect uprobes events or kprobes
events after that.

--

From: Frederic Weisbecker
Date: Monday, August 2, 2010 - 12:46 am

I'm not sure there is a point in maintaining a lightweight version
for out of tree code. These modules could just select kprobes/uprobes
events as well.



Yep, that version looks good.

--

From: Ingo Molnar
Date: Monday, August 2, 2010 - 12:56 am

The upstream policy always was that out of tree code does not exist as far as 
the kernel is concerned. So it is wrong to make the kernel crappier while 
helping out of tree code.

Thanks,

	Ingo
--

From: Christoph Hellwig
Date: Monday, August 2, 2010 - 1:00 am

Indeed.  In addition to that the current version of uprobes does not
actually have any exported symbols.

--

From: Masami Hiramatsu
Date: Monday, August 2, 2010 - 2:29 am

Ah, indeed. :-(

And then, it conflicts with the description about uprobes in

So, that could be a bug.

Anyway, at least kprobes has some sample modules under samples/kprobes/.
Aren't they in-tree consumers of kprobes?

Thank you,

-- 
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com
--

From: Christoph Hellwig
Date: Monday, August 2, 2010 - 2:36 am

We have quite a few kprobes users in the kernel, just no really useful
modular ones.  We have a couple of useful users of jprobes, which are
the more useful kprobes variants in modules.  All of them might better
be converted to trace events, but those weren't around when they were
added.

--

From: Srikar Dronamraju
Date: Tuesday, July 27, 2010 - 4:10 am

As a precursor for perf to support uprobes, rename fields/functions
that had kprobe in their name but can be shared across perf-kprobes
and perf-uprobes to probe.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---

 tools/perf/builtin-probe.c     |    1 
 tools/perf/util/probe-event.c  |  136 +++++++++++++++++++++-------------------
 tools/perf/util/probe-event.h  |   27 +++-----
 tools/perf/util/probe-finder.c |   34 +++++-----
 tools/perf/util/probe-finder.h |   10 +--
 5 files changed, 102 insertions(+), 106 deletions(-)


diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
index 5455186..199d5e1 100644
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -267,4 +267,3 @@ int cmd_probe(int argc, const char **argv, const char *prefix __used)
 	}
 	return 0;
 }
-
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 4445a1e..db3d619 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -1,5 +1,5 @@
 /*
- * probe-event.c : perf-probe definition to kprobe_events format converter
+ * probe-event.c : perf-probe definition to probe_events format converter
  *
  * Written by Masami Hiramatsu <mhiramat@redhat.com>
  *
@@ -120,8 +120,12 @@ static int open_vmlinux(void)
 	return open(machine.vmlinux_maps[MAP__FUNCTION]->dso->long_name, O_RDONLY);
 }
 
-/* Convert trace point to probe point with debuginfo */
-static int convert_to_perf_probe_point(struct kprobe_trace_point *tp,
+/*
+ * Convert trace point to probe point with debuginfo
+ * Currently only handles kprobes.
+ */
+
+static int kprobe_convert_to_perf_probe(struct probe_trace_point *tp,
 				       struct perf_probe_point *pp)
 {
 	struct symbol *sym;
@@ -151,8 +155,8 @@ static int convert_to_perf_probe_point(struct kprobe_trace_point *tp,
 }
 
 /* Try to find perf_probe_event with debuginfo */
-static int try_to_find_kprobe_trace_events(struct perf_probe_event *pev,
-					   struct kprobe_trace_event ...
From: Masami Hiramatsu
Date: Thursday, July 29, 2010 - 4:51 am

Yeah, renaming itself is OK for me. But please do it carefully,
I can see some gaps between 1st line and 2nd line like below
after applying this patch...

static int kprobe_convert_to_perf_probe(struct probe_trace_point *tp,
                                       struct perf_probe_point *pp)

But just a trivial style issue. :)

-- 
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com
--

From: Srikar Dronamraju
Date: Thursday, July 29, 2010 - 7:13 am

Masami, 
 Below patch should address the comments raised by you.

Ingo, Arnaldo, 
Please let me know if you want this to be sent as a new patchset.

--
Thanks and Regards
Srikar

---
From: Srikar Dronamraju <srikar@linux.vnet.ibm.com>

As a precursor for perf to support uprobes, rename fields/functions
that had kprobe in their name but can be shared across perf-kprobes
and perf-uprobes to probe.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---

 tools/perf/builtin-probe.c     |    1 
 tools/perf/util/probe-event.c  |  135 ++++++++++++++++++++--------------------
 tools/perf/util/probe-event.h  |   27 +++-----
 tools/perf/util/probe-finder.c |   34 +++++-----
 tools/perf/util/probe-finder.h |   10 +--
 5 files changed, 101 insertions(+), 106 deletions(-)


diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
index 5455186..199d5e1 100644
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -267,4 +267,3 @@ int cmd_probe(int argc, const char **argv, const char *prefix __used)
 	}
 	return 0;
 }
-
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 4445a1e..2e665cb 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -1,5 +1,5 @@
 /*
- * probe-event.c : perf-probe definition to kprobe_events format converter
+ * probe-event.c : perf-probe definition to probe_events format converter
  *
  * Written by Masami Hiramatsu <mhiramat@redhat.com>
  *
@@ -120,8 +120,11 @@ static int open_vmlinux(void)
 	return open(machine.vmlinux_maps[MAP__FUNCTION]->dso->long_name, O_RDONLY);
 }
 
-/* Convert trace point to probe point with debuginfo */
-static int convert_to_perf_probe_point(struct kprobe_trace_point *tp,
+/*
+ * Convert trace point to probe point with debuginfo
+ * Currently only handles kprobes.
+ */
+static int kprobe_convert_to_perf_probe(struct probe_trace_point *tp,
 				       struct perf_probe_point *pp)
 {
 	struct symbol *sym;
@@ -151,8 ...
From: Arnaldo Carvalho de Melo
Date: Thursday, July 29, 2010 - 12:42 pm

I'll try and cherrypick this one for perf/core,

Thanks,

- Arnaldo
--

From: tip-bot for Srikar Dronamraju
Date: Monday, August 2, 2010 - 12:53 am

Commit-ID:  0e60836bbd392300198c5c2d918c18845428a1fe
Gitweb:     http://git.kernel.org/tip/0e60836bbd392300198c5c2d918c18845428a1fe
Author:     Srikar Dronamraju <srikar@linux.vnet.ibm.com>
AuthorDate: Thu, 29 Jul 2010 19:43:51 +0530
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Fri, 30 Jul 2010 12:01:38 -0300

perf probe: Rename common fields/functions from kprobe to probe.

As a precursor for perf to support uprobes, rename fields/functions
that had kprobe in their name but can be shared across perf-kprobes
and perf-uprobes to probe.

Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Wielaard <mjw@redhat.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Naren A Devaiah <naren.devaiah@in.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <20100729141351.GG21723@linux.vnet.ibm.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-probe.c     |    1 -
 tools/perf/util/probe-event.c  |  135 ++++++++++++++++++++-------------------
 tools/perf/util/probe-event.h  |   27 +++-----
 tools/perf/util/probe-finder.c |   34 +++++-----
 tools/perf/util/probe-finder.h |   10 ++--
 5 files changed, 101 insertions(+), 106 deletions(-)

diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
index 5455186..199d5e1 100644
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -267,4 +267,3 @@ int cmd_probe(int argc, const char **argv, const ...
From: Srikar Dronamraju
Date: Tuesday, July 27, 2010 - 4:11 am

Enhances perf probe to accept pid and user vaddr.
Provides very basic support for uprobes.

TODO:
Update perf-probes.txt.
Global tracing.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---

Changelog from v9: Renaming common fields/functions to refer to
probe instead of kprobe. This was suggested by Arnaldo.

Changelog from v8: Fixed a build break reported by Christoph Hellwig.

Changelog from v6: Fixed a bug reported by Masami,
  i.e. throw an error message and exit if perf probe is for a dwarf
  based probe.

Changelog from v4: Merged to 2.6.35-rc3-tip.

Changelog from v3: (addressed comments from Masami Hiramatsu)
	* Every process id has a different group name.
	* event name starts with function name.
	* If vaddr is specified, event name has vaddr appended
	  along with function name, (this is to avoid subsequent probes
	  using same event name.)
	* warning if -p and --list options are used together.

	Also dso can either be a short name or absolute path.
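
As an illustration of the naming scheme listed in the changelog above, here
is a minimal C sketch. It is not code from the patch: format_uprobe_event()
and format_event_name() are hypothetical helpers, the command layout is
copied from the uprobe_events line in the snapshot that follows, and the
"_0x<vaddr>" suffix is only an assumed spelling for the "vaddr appended"
case.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: build "p:probe_<pid>/<event> <pid>:0x<vaddr>". */
static int format_uprobe_event(char *buf, size_t len, pid_t pid,
			       const char *event, unsigned long long vaddr)
{
	int n;

	assert(buf && event);
	/* One group per process id, so probes on different pids never clash. */
	n = snprintf(buf, len, "p:probe_%d/%s %d:0x%016llx",
		     (int)pid, event, (int)pid, vaddr);

	/* Treat truncation as an error rather than emitting a partial line. */
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: the event name starts with the function name.
 * When the user gave an explicit vaddr, append it (assumed "_0x..."
 * spelling) so that later probes do not reuse the same event name.
 */
static int format_event_name(char *buf, size_t len, const char *func,
			     unsigned long long vaddr, int vaddr_given)
{
	int n;

	assert(buf && func);
	if (vaddr_given)
		n = snprintf(buf, len, "%s_0x%llx", func, vaddr);
	else
		n = snprintf(buf, len, "%s", func);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Called with pid 3591, event "zfree" and vaddr 0x446420, the first helper
produces the same line that the snapshot shows in uprobe_events.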

Here is a terminal snapshot of placing, using and removing a probe on a
process with pid 3591 (corresponding to zsh)

[ Probing a function in the executable using function name  ]
-------------------------------------------------------------
[root@ABCD]# perf probe -p 3591 zfree@zsh
Added new event:
  probe_3591:zfree                       (on 0x446420)

You can now use it on all perf tools, such as:

	perf record -e probe_3591:zfree -a sleep 1
[root@ABCD]# perf probe --list
probe_3591:zfree                       (on 3591:0x0000000000446420)
[root@ABCD]# cat /sys/kernel/debug/tracing/uprobe_events
p:probe_3591/zfree 3591:0x0000000000446420
[root@ABCD]# perf record -f -e probe_3591:zfree -a sleep 10
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.039 MB perf.data (~1716 samples) ]
[root@ABCD]# perf probe -p 3591 --del probe_3591:zfree
Remove event: probe_3591:zfree
[root@ABCD]# perf report
# Samples: 447
#
# Overhead          ...
From: Masami Hiramatsu
Date: Thursday, July 29, 2010 - 5:01 am

Hi Srikar,

And I found a small bug.

# ./perf probe -vf -p 3199 "setenv %ax"
probe-definition(0): setenv %ax
symbol:setenv file:(null) line:0 offset:0 return:0 lazy:(null)
parsing arg: %ax into %ax
1 arguments
Opening /debug/tracing/uprobe_events write=1
Add new event:
Writing event: p:probe_3199/setenv 3199:0x47e680
 %ax
Failed to write event: Invalid argument
  Error: Failed to add events. (-1)

# ./perf probe -l
  probe_3199:setenv    (on 3199:0x000000000047e680)


Here is a "\n".
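
One way to avoid a stray newline between the probe address and its
arguments is to strip it before appending each argument. The C sketch
below only illustrates that idea; chomp() and append_probe_arg() are
hypothetical helpers and are not taken from perf or from the follow-up
patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: drop any trailing '\n' or '\r' from a command buffer. */
static void chomp(char *s)
{
	size_t n = strlen(s);

	while (n > 0 && (s[n - 1] == '\n' || s[n - 1] == '\r'))
		s[--n] = '\0';
}

/*
 * Hypothetical helper: append " <arg>" to the command in cmd.
 * Returns 0 on success, -1 if the result would not fit in len bytes.
 */
static int append_probe_arg(char *cmd, size_t len, const char *arg)
{
	size_t used;
	int n;

	assert(cmd && arg);
	chomp(cmd);		/* the address must be followed by a space, not '\n' */
	used = strlen(cmd);
	if (used >= len)
		return -1;

	n = snprintf(cmd + used, len - used, " %s", arg);
	return (n < 0 || (size_t)n >= len - used) ? -1 : 0;
}
```

With the failing command from the log above, this yields
"p:probe_3199/setenv 3199:0x47e680 %ax" on a single line.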


Thank you,

-- 
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com
--

From: Srikar Dronamraju
Date: Thursday, July 29, 2010 - 7:11 am

Masami, 
 Below patch should address the comments raised by you.

--
Thanks and Regards
Srikar

---
From: Srikar Dronamraju <srikar@linux.vnet.ibm.com>

Enhances perf probe to accept pid and user vaddr.
Provides very basic support for uprobes.

TODO:
Update perf-probes.txt.
Global tracing.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---

Changelog from v9: Renaming common fields/functions to refer to
probe instead of kprobe. This was suggested by Arnaldo.

Changelog from v8: Fixed a build break reported by Christoph Hellwig.

Changelog from v6: Fixed a bug reported by Masami,
  i.e. throw an error message and exit if perf probe is for a dwarf
  based probe.

Changelog from v4: Merged to 2.6.35-rc3-tip.

Changelog from v3: (addressed comments from Masami Hiramatsu)
	* Every process id has a different group name.
	* event name starts with function name.
	* If vaddr is specified, event name has vaddr appended
	  along with function name, (this is to avoid subsequent probes
	  using same event name.)
	* warning if -p and --list options are used together.

	Also dso can either be a short name or absolute path.

Here is a terminal snapshot of placing, using and removing a probe on a
process with pid 3591 (corresponding to zsh)

[ Probing a function in the executable using function name  ]
-------------------------------------------------------------
[root@ABCD]# perf probe -p 3591 zfree@zsh
Added new event:
  probe_3591:zfree                       (on 0x446420)

You can now use it on all perf tools, such as:

	perf record -e probe_3591:zfree -a sleep 1
[root@ABCD]# perf probe --list
probe_3591:zfree                       (on 3591:0x0000000000446420)
[root@ABCD]# cat /sys/kernel/debug/tracing/uprobe_events
p:probe_3591/zfree 3591:0x0000000000446420
[root@ABCD]# perf record -f -e probe_3591:zfree -a sleep 10
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.039 MB perf.data (~1716 samples) ...
From: Arnaldo Carvalho de Melo
Date: Friday, July 30, 2010 - 12:19 pm

"ubrobes based probe" could be made clear as:

"Specify pid of process where the probe will be added"

?

The following three hunks could be moved to a separate patch that I'd
apply on my perf/core branch, so to reduce this patchset size, like I
did with the s/kprobe/probe/g one that is already there:

--

From: Srikar Dronamraju
Date: Friday, July 30, 2010 - 7:57 pm

[snipped]


Actually tev->group gets checked where tev->event gets checked. 
However I had moved assigning tev->group to where group gets
assigned. This probably is causing the confusion. I will move the

This strdup(group) was moved ahead by few lines. I can move it back

Right, but I don't think this leak was introduced by my patch(s). I

--
Thanks and Regards
Srikar
--

From: Arnaldo Carvalho de Melo
Date: Saturday, July 31, 2010 - 12:30 pm

Sorry, will fix that then on a separate patch.

- Arnaldo
--

From: Masami Hiramatsu
Date: Sunday, August 1, 2010 - 6:51 pm

Yes, sorry, that was introduced by me...

Thanks for pointing it out.

-- 
Masami HIRAMATSU
2nd Research Dept.
Hitachi, Ltd., Systems Development Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com
--

From: Srikar Dronamraju
Date: Monday, August 2, 2010 - 5:27 am

I will soon send 2 patches instead of this patch: one splitting the 3 hunks
into a new patch, and the resulting patch after removing those 3 hunks.

-Srikar
--

From: Arnaldo Carvalho de Melo
Date: Monday, August 2, 2010 - 7:56 am

Thanks, I'm just trying to erode the patchset by merging the completely
uncontentious hunks. :)

- Arnaldo
--

From: Srikar Dronamraju
Date: Monday, August 2, 2010 - 5:38 am

event__process is useful in processing /proc/<pid>/maps.
All of the functions that are called from event__process are defined
in util/event.c. Though it is defined in builtin-top.c, it could be
reused by perf probe for uprobes. Hence move it to util/event.c
and export the function.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---

 tools/perf/builtin-top.c |   20 --------------------
 tools/perf/util/event.c  |   20 ++++++++++++++++++++
 tools/perf/util/event.h  |    1 +
 3 files changed, 21 insertions(+), 20 deletions(-)


diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 1e8e92e..b513e40 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1082,26 +1082,6 @@ static void event__process_sample(const event_t *self,
 	}
 }
 
-static int event__process(event_t *event, struct perf_session *session)
-{
-	switch (event->header.type) {
-	case PERF_RECORD_COMM:
-		event__process_comm(event, session);
-		break;
-	case PERF_RECORD_MMAP:
-		event__process_mmap(event, session);
-		break;
-	case PERF_RECORD_FORK:
-	case PERF_RECORD_EXIT:
-		event__process_task(event, session);
-		break;
-	default:
-		break;
-	}
-
-	return 0;
-}
-
 struct mmap_data {
 	int			counter;
 	void			*base;
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 6b0db55..08b6424 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -554,6 +554,26 @@ int event__process_task(event_t *self, struct perf_session *session)
 	return 0;
 }
 
+int event__process(event_t *event, struct perf_session *session)
+{
+	switch (event->header.type) {
+	case PERF_RECORD_COMM:
+		event__process_comm(event, session);
+		break;
+	case PERF_RECORD_MMAP:
+		event__process_mmap(event, session);
+		break;
+	case PERF_RECORD_FORK:
+	case PERF_RECORD_EXIT:
+		event__process_task(event, session);
+		break;
+	default:
+		break;
+	}
+
+	return 0;
+}
+
 void thread__find_addr_map(struct thread *self,
 			   struct ...
From: tip-bot for Srikar Dronamraju
Date: Thursday, August 5, 2010 - 1:01 am

Commit-ID:  b83f920e179101a54721e5ab1d6c3edfb9d4bcbb
Gitweb:     http://git.kernel.org/tip/b83f920e179101a54721e5ab1d6c3edfb9d4bcbb
Author:     Srikar Dronamraju <srikar@linux.vnet.ibm.com>
AuthorDate: Mon, 2 Aug 2010 18:08:51 +0530
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 4 Aug 2010 12:41:23 -0300

perf: expose event__process function

The event__process function is useful in processing /proc/<pid>/maps.  All of
the functions that are called from event__process are defined in util/event.c.
Though its defined in builtin-top.c, it could be reused for perf probe for
uprobes. Hence moving it to util/event.c and exporting the function.

LKML-Reference: <20100802123851.GD22812@linux.vnet.ibm.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-top.c |   20 --------------------
 tools/perf/util/event.c  |   20 ++++++++++++++++++++
 tools/perf/util/event.h  |    1 +
 3 files changed, 21 insertions(+), 20 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 1e8e92e..b513e40 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1082,26 +1082,6 @@ static void event__process_sample(const event_t *self,
 	}
 }
 
-static int event__process(event_t *event, struct perf_session *session)
-{
-	switch (event->header.type) {
-	case PERF_RECORD_COMM:
-		event__process_comm(event, session);
-		break;
-	case PERF_RECORD_MMAP:
-		event__process_mmap(event, session);
-		break;
-	case PERF_RECORD_FORK:
-	case PERF_RECORD_EXIT:
-		event__process_task(event, session);
-		break;
-	default:
-		break;
-	}
-
-	return 0;
-}
-
 struct mmap_data {
 	int			counter;
 	void			*base;
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index db8a1d4..dab9e75 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -548,6 +548,26 @@ int event__process_task(event_t *self, struct perf_session ...
From: Srikar Dronamraju
Date: Monday, August 2, 2010 - 5:41 am

Enhances perf probe to accept pid and user vaddr.
Provides very basic support for uprobes.

TODO:
Update perf-probes.txt.
Global tracing.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---

Changelog from v10: moving of event__process moved to a new patch.
	Also addressed other comments from Arnaldo.

Changelog from v9: Renaming common fields/functions to refer to
probe instead of kprobe. This was suggested by Arnaldo.

Changelog from v8: Fixed a build break reported by Christoph Hellwig.

Changelog from v6: Fixed a bug reported by Masami,
  i.e. throw an error message and exit if perf probe is for a dwarf
  based probe.

Changelog from v4: Merged to 2.6.35-rc3-tip.

Changelog from v3: (addressed comments from Masami Hiramatsu)
	* Every process id has a different group name.
	* event name starts with function name.
	* If vaddr is specified, event name has vaddr appended
	  along with function name, (this is to avoid subsequent probes
	  using same event name.)
	* warning if -p and --list options are used together.

	Also dso can either be a short name or absolute path.

Here is a terminal snapshot of placing, using and removing a probe on a
process with pid 3591 (corresponding to zsh)

[ Probing a function in the executable using function name  ]
-------------------------------------------------------------
[root@ABCD]# perf probe -p 3591 zfree@zsh
Added new event:
  probe_3591:zfree                       (on 0x446420)

You can now use it on all perf tools, such as:

	perf record -e probe_3591:zfree -a sleep 1
[root@ABCD]# perf probe --list
probe_3591:zfree                       (on 3591:0x0000000000446420)
[root@ABCD]# cat /sys/kernel/debug/tracing/uprobe_events
p:probe_3591/zfree 3591:0x0000000000446420
[root@ABCD]# perf record -f -e probe_3591:zfree -a sleep 10
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.039 MB perf.data (~1716 samples) ]
[root@ABCD]# perf probe -p 3591 --del ...
From: Srikar Dronamraju
Date: Tuesday, July 27, 2010 - 4:11 am

The GElf_Sym for a symbol is needed to filter symbols by binding,
type and value. This will be needed to list only global binding
functions when listing functions from perf probe.
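
As a rough illustration of filtering on binding and type, here is a small
C sketch. It uses the Elf64_Sym layout from plain <elf.h> rather than
libelf's GElf_Sym so that it builds without libelf. is_global_function()
is a hypothetical helper, and rejecting undefined symbols is an extra
assumption of this sketch, not something stated in the patch.

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>

/*
 * Hypothetical filter: accept only defined function symbols with
 * global binding. Local and weak symbols, and non-function types
 * such as objects, are rejected.
 */
static bool is_global_function(const Elf64_Sym *sym)
{
	assert(sym);
	return ELF64_ST_BIND(sym->st_info) == STB_GLOBAL &&
	       ELF64_ST_TYPE(sym->st_info) == STT_FUNC &&
	       sym->st_shndx != SHN_UNDEF;
}
```

Both properties are packed into the single st_info byte, which is why the
whole symbol entry, and not just its name and address, has to reach the
filter callback.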

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---

 tools/perf/builtin-test.c |    4 +++-
 tools/perf/builtin-top.c  |    6 +++++-
 tools/perf/util/map.h     |    4 +++-
 tools/perf/util/symbol.c  |   10 +++++-----
 4 files changed, 16 insertions(+), 8 deletions(-)


diff --git a/tools/perf/builtin-test.c b/tools/perf/builtin-test.c
index 035b9fa..31fb9e3 100644
--- a/tools/perf/builtin-test.c
+++ b/tools/perf/builtin-test.c
@@ -11,10 +11,12 @@
 #include "util/session.h"
 #include "util/symbol.h"
 #include "util/thread.h"
+#include <gelf.h>
 
 static long page_size;
 
-static int vmlinux_matches_kallsyms_filter(struct map *map __used, struct symbol *sym)
+static int vmlinux_matches_kallsyms_filter(struct map *map __used,
+		struct symbol *sym, GElf_Sym *gsym __used)
 {
 	bool *visited = symbol__priv(sym);
 	*visited = true;
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index b513e40..56989e8 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -54,6 +54,7 @@
 
 #include <linux/unistd.h>
 #include <linux/types.h>
+#include <gelf.h>
 
 static int			*fd[MAX_NR_CPUS][MAX_COUNTERS];
 
@@ -932,7 +933,10 @@ static const char *skip_symbols[] = {
 	NULL
 };
 
-static int symbol_filter(struct map *map, struct symbol *sym)
+#define __unused __attribute__((unused))
+
+static int symbol_filter(struct map *map, struct symbol *sym,
+					GElf_Sym *gsym __unused)
 {
 	struct sym_entry *syme;
 	const char *name = sym->name;
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index f391345..1fcde24 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -7,6 +7,7 @@
 #include <stdio.h>
 #include <stdbool.h>
 #include "types.h"
+#include <gelf.h>
 
 enum map_type {
 	MAP__FUNCTION = 0,
@@ -100,7 +101,8 @@ u64 ...
From: Arnaldo Carvalho de Melo
Date: Thursday, August 5, 2010 - 8:19 am

Humm, I think it is better to store the symbol binding in struct symbol,
so that we can have the full symtab and then, on a particular routine,
decide whether we should use it or not.

I'll cook up a patch now.
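
A minimal sketch of that idea in C, under stated assumptions: struct
sketch_symbol is a hypothetical, simplified stand-in for perf's struct
symbol, and count_global() is a made-up consumer. The point is only that
the table keeps every symbol and each caller applies its own policy.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for perf's struct symbol. */
struct sketch_symbol {
	const char *name;
	uint8_t binding;	/* STB_LOCAL, STB_GLOBAL or STB_WEAK, recorded at load time */
};

/*
 * Count the entries a lister would print. Nothing is dropped from the
 * table itself; the binding check happens only here, at the point of use.
 */
static size_t count_global(const struct sketch_symbol *syms, size_t n)
{
	size_t i, count = 0;

	assert(syms || n == 0);
	for (i = 0; i < n; i++)
		if (syms[i].binding == STB_GLOBAL)
			count++;

	return count;
}
```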

- Arnaldo
--

From: Arnaldo Carvalho de Melo
Date: Thursday, August 5, 2010 - 8:20 am

One extra reason is that when we read the symtab from kallsyms, we don't
--

From: Srikar Dronamraju
Date: Thursday, August 5, 2010 - 8:23 am

Okay, Please keep me in loop on that patch so that I can update this
patch accordingly.

--
Thanks and Regards
Srikar
--

From: Srikar Dronamraju
Date: Tuesday, July 27, 2010 - 4:11 am

Introduces an option to list potential probes to probe using perf probe
command. Also introduces an option to limit the dso to list the potential
probes. Listing of potential probes is sorted by dso and
alphabetical order.
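
The ordering described above can be sketched with qsort(). This is only an
illustration: struct probe_entry and both functions are hypothetical names,
and "alphabetical" is assumed here to mean byte-wise strcmp() order.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for one listable function. */
struct probe_entry {
	const char *dso;
	const char *func;
};

/* Order by dso name first, then by function name within the same dso. */
static int probe_entry_cmp(const void *a, const void *b)
{
	const struct probe_entry *x = a;
	const struct probe_entry *y = b;
	int ret = strcmp(x->dso, y->dso);

	return ret ? ret : strcmp(x->func, y->func);
}

static void sort_probe_entries(struct probe_entry *entries, size_t n)
{
	assert(entries || n == 0);
	if (n > 1)
		qsort(entries, n, sizeof(*entries), probe_entry_cmp);
}
```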

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---

Changelog from V9:
	Filter labels, weak, and local binding functions from listing
as suggested by Christoph Hellwig.
	Incorporated comments from Arnaldo on Version 9 of patchset.

Show all potential probes in the current running kernel and limit to
the last 10 functions.
# perf probe -S | tail
zlib_inflateInit2
zlib_inflateReset
zlib_inflate_blob
zlib_inflate_table
zlib_inflate_workspacesize
zone_pcp_update
zone_reclaim
zone_reclaimable_pages
zone_statistics
zone_watermark_ok

Show all potential probes in a process by pid 3104 across all dsos
and limit to the last 10 functions.
# perf probe -S -p 3104 | tail
_nss_files_setgrent
_nss_files_sethostent
_nss_files_setnetent
_nss_files_setnetgrent
_nss_files_setprotoent
_nss_files_setpwent
_nss_files_setrpcent
_nss_files_setservent
_nss_files_setspent
_nss_netgroup_parseline

Show all potential probes in a process by pid 3104 limit to zsh dso
and limit to the last 10 functions.
# perf probe -S -p 3104 -D zsh | tail
zstrtol
ztrcmp
ztrdup
ztrduppfx
ztrftime
ztrlen
ztrncpy
ztrsub
zwarn
zwarnnam

 tools/perf/builtin-probe.c    |   45 ++++++++++++++++-
 tools/perf/util/map.h         |   27 ++++++++++
 tools/perf/util/probe-event.c |  109 +++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/probe-event.h |    1 
 tools/perf/util/symbol.c      |   14 +++++
 tools/perf/util/symbol.h      |    1 
 6 files changed, 194 insertions(+), 3 deletions(-)


diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c
index cb915a5..e8af545 100644
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -50,9 +50,11 @@ static struct {
 	bool list_events;
 	bool force_add;
 	bool show_lines;
+	bool ...
From: Srikar Dronamraju
Date: Tuesday, July 27, 2010 - 4:11 am

Lists function names in a dso. The dso needs to be a filename;
passing a dso short name will not work.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---

Changelog from V9: Filter labels, weak, and local binding functions
from listing as suggested by Christoph Hellwig.

Show last 10 functions in /bin/zsh.

# perf probe -S -D /bin/zsh | tail
zstrtol
ztrcmp
ztrdup
ztrduppfx
ztrftime
ztrlen
ztrncpy
ztrsub
zwarn
zwarnnam

Show first 10 functions in /lib/libc.so.6

# perf probe -S -D /lib/libc.so.6 | head
_IO_adjust_column
_IO_adjust_wcolumn
_IO_default_doallocate
_IO_default_finish
_IO_default_pbackfail
_IO_default_uflow
_IO_default_xsgetn
_IO_default_xsputn
_IO_do_write@@GLIBC_2.2.5
_IO_doallocbuf

 tools/perf/util/probe-event.c |   75 ++++++++++++++++++++++-------------------
 tools/perf/util/symbol.c      |    8 ++++
 tools/perf/util/symbol.h      |    1 +
 3 files changed, 50 insertions(+), 34 deletions(-)


diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index d0636e1..a079ecc 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -2018,14 +2018,14 @@ static int print_list_available_symbols(struct map *map,
 
 int show_possible_probes(struct strlist *limitlist, pid_t pid)
 {
-	struct perf_session *session;
-	struct thread *thread;
+	struct perf_session *session = NULL;
+	struct thread *thread = NULL;
 	struct str_node *ent;
 	struct map *map = NULL;
 	char *name = NULL, *tmpname = NULL, *str;
 	int ret = -EINVAL;
 
-	if (!pid) {
+	if (!pid && !limitlist) { /* Show functions in kernel */
 		ret = init_vmlinux();
 		if (ret < 0)
 			return ret;
@@ -2036,26 +2036,28 @@ int show_possible_probes(struct strlist *limitlist, pid_t pid)
 		if (ret < 0)
 			return ret;
 	}
-	session = perf_session__new(NULL, O_WRONLY, false, false);
-	if (!session) {
-		ret = -ENOMEM;
-		goto out_delete;
-	}
-	event__synthesize_thread(pid, event__process, session);
-
-	thread = ...