Re: [PATCH 1/7] Simple Performance Counters: Core Piece

To: <linux-kernel@...>
Cc: Christoph Lameter <clameter@...>
Date: Tuesday, July 31, 2007 - 7:25 pm

Simple performance counters are a way to measure the performance of code
paths in the Linux kernel. Code must be instrumented with calls that mark
the start and the end of a measurement.

The beginning of the code path to be measured must have either:

INIT_PC(var)

or

struct pc var;

...

pc_start(&var);

and the end of the segment of code to be measured must have either:

pc_stop(&var, PC_xxx);

to measure just the time interval, or

pc_bytes(&var, bytes, PC_xxx);

to measure the amount of data that the code path can handle.
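
To make the calling pattern concrete, here is a minimal userspace sketch. It is not the code from this patch: struct pc, the totals table, the PC_DEMO_COPY counter ID and the use of clock_gettime() are all stand-ins invented for illustration, and it assumes that pc_bytes() records elapsed time as well as bytes (the alloc_pages row in the sample output shows both).

```c
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime() */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical counter IDs; the real PC_xxx names are not shown in the mail. */
enum pc_item { PC_DEMO_COPY, NR_PC_ITEMS };

/* Userspace stand-in for the patch's struct pc: only a start timestamp. */
struct pc { uint64_t start_ns; };

/* Assumed per-counter totals, for illustration only. */
struct pc_totals { uint64_t events, ns, bytes; };
struct pc_totals totals[NR_PC_ITEMS];

static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Mark the start of a measurement. */
void pc_start(struct pc *pc)
{
    pc->start_ns = now_ns();
}

/* End a measurement, recording elapsed time and the bytes handled. */
void pc_bytes(struct pc *pc, uint64_t bytes, enum pc_item item)
{
    uint64_t end = now_ns();

    assert(item < NR_PC_ITEMS);
    totals[item].events++;
    totals[item].ns += end - pc->start_ns;
    totals[item].bytes += bytes;
}

/* End a measurement that only cares about the time interval. */
void pc_stop(struct pc *pc, enum pc_item item)
{
    pc_bytes(pc, 0, item);
}

/* An instrumented code path: time a copy and record how much was copied. */
void copy_block(char *dst, const char *src, size_t n)
{
    struct pc pc;

    pc_start(&pc);
    memcpy(dst, src, n);
    pc_bytes(&pc, n, PC_DEMO_COPY);
}
```

Each call to copy_block() adds one event to the PC_DEMO_COPY counter along with its elapsed time and byte count. The patch itself keeps such counters in the kernel and exports them under /proc/perf.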

The data can then be viewed as the kernel runs via

cat /proc/perf/all

This shows timing and throughput statistics. The numbers in parentheses
are three values: (minimum/average/maximum)

update_process_times   21370   14.8ms(194ns/693ns/9us)
alloc_pages           297542  189.4ms(96ns/637ns/68.7us)  1.2gb(4.1kb/4.2kb/16.4kb)
kmem_cache_alloc      637116   71.7ms(10ns/113ns/60.8us)
kmem_cache_free       566426   39.2ms(19ns/69ns/7.1us)
kfree                  48622    4.1ms(19ns/84ns/3.7us)

update_process_times needed between 194ns and 9us. On average it needed 693 nanoseconds.
21370 measurements were performed.
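
As a check on the numbers above: 14.8 ms spread over 21370 events is about 693 ns per event, which matches the average shown. The sketch below shows one way such a minimum/average/maximum summary can be accumulated. The struct pc_stat layout and the function names are assumptions made for illustration, not the data structures in kernel/perf.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed running summary for one counter. */
struct pc_stat {
    uint64_t events;    /* number of measurements */
    uint64_t total_ns;  /* sum of all intervals */
    uint64_t min_ns;    /* shortest interval seen */
    uint64_t max_ns;    /* longest interval seen */
};

/* Fold one measured interval into the summary. */
void pc_stat_add(struct pc_stat *s, uint64_t ns)
{
    assert(s != NULL);
    if (s->events == 0 || ns < s->min_ns)
        s->min_ns = ns;
    if (ns > s->max_ns)
        s->max_ns = ns;
    s->events++;
    s->total_ns += ns;
}

/* Average interval, truncated to whole nanoseconds; 0 if nothing was measured. */
uint64_t pc_stat_avg(const struct pc_stat *s)
{
    assert(s != NULL);
    return s->events ? s->total_ns / s->events : 0;
}
```

Only four integers per counter are kept, so the cost of recording a measurement stays small compared to the code path being timed.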

Data can be zeroed by writing to /proc/perf/reset.

Typically one would zero the counters and then perform a kernel activity that
exercises the instrumented code path.

Data can be viewed in /proc/perf.

Special files:

/proc/perf/all     Shows a summary
/proc/perf/reset   Writing to this file resets counters
/proc/perf/0       Counters on processor 0

Signed-off-by: Christoph Lameter <clameter@sgi.com>
---
include/linux/perf.h | 55 ++++++++
init/Kconfig | 10 ++
kernel/Makefile | 1 +
kernel/perf.c | 368 ++++++++++++++++++++++++++++++++++++++++++++++++++
kernel/timer.c | 3 +
5 files changed, 437 insertions(+), 0 deletions(-)
create mode 100644 include/linux/perf.h
create mode 100644 kernel/perf.c

diff -...

To: Christoph Lameter <clameter@...>
Cc: <linux-kernel@...>
Date: Friday, August 17, 2007 - 12:42 pm

Hi Christoph,

Actually, get_cycles(), at least on some AMD CPUs, does not synchronize the
core, which can skew the results. You might want to use
get_cycles_sync() there.

Mathieu

--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
-

To: Mathieu Desnoyers <mathieu.desnoyers@...>
Cc: <linux-kernel@...>
Date: Friday, August 17, 2007 - 4:42 pm

get_cycles() results as used here are bound to a single processor. If we
end up on a different processor at the end of the measurement then the
result is discarded. So there is no need for get_cycles_sync().
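
The discard rule described here can be sketched as follows. This is an illustration under stated assumptions, not the patch's code: struct pc_sample and both function names are made up, and the CPU number and cycle count are passed in as plain parameters (in the kernel they would come from something like smp_processor_id() and get_cycles()).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed per-measurement state: where and when the measurement started. */
struct pc_sample {
    int cpu;
    uint64_t start_cycles;
};

/* Remember the starting CPU and its cycle counter value. */
void sample_start(struct pc_sample *s, int cpu, uint64_t cycles)
{
    assert(s != NULL);
    s->cpu = cpu;
    s->start_cycles = cycles;
}

/*
 * Finish a measurement. Returns true and stores the elapsed cycles only if
 * the stop happened on the CPU where the measurement started. Otherwise the
 * two counter values may come from different counters, so the sample is
 * dropped and *elapsed is left untouched.
 */
bool sample_stop(const struct pc_sample *s, int cpu, uint64_t cycles,
                 uint64_t *elapsed)
{
    assert(s != NULL && elapsed != NULL);
    if (cpu != s->cpu)
        return false;
    *elapsed = cycles - s->start_cycles;
    return true;
}
```

Dropping the occasional migrated sample avoids comparing counters across processors. Whether the counter read also needs to be serialized on a single processor is a separate question.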

-

To: Christoph Lameter <clameter@...>
Cc: <linux-kernel@...>
Date: Friday, August 17, 2007 - 11:25 pm

I may be wrong, but I think the UP case still needs to synchronize the
core to get a precise measurement. Synchronizing the core makes sure that
rdtsc is not executed speculatively, and therefore that it is not
misplaced in the instruction stream. So I don't think this is an
SMP special case.
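
For illustration, one common way to keep rdtsc from drifting in the instruction stream is to run a serializing instruction such as cpuid directly before it. The sketch below does that with GCC/Clang inline assembly on x86-64. It is not the kernel's get_cycles_sync(); the function names are made up, and on other architectures it falls back to clock() purely so that the example still builds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Read a cycle counter after serializing the instruction stream. */
static inline uint64_t cycles_serialized(void)
{
#if defined(__x86_64__)
    uint32_t lo, hi;

    /*
     * cpuid (leaf 0) is a serializing instruction: earlier instructions
     * complete before rdtsc executes. cpuid overwrites eax, ebx, ecx and
     * edx; rdtsc then returns the counter in edx:eax.
     */
    __asm__ __volatile__(
        "xorl %%eax, %%eax\n\t"
        "cpuid\n\t"
        "rdtsc"
        : "=a"(lo), "=d"(hi)
        :
        : "rbx", "rcx", "memory");
    return ((uint64_t)hi << 32) | lo;
#else
    /* Coarse stand-in so the sketch compiles on non-x86-64 targets. */
    return (uint64_t)clock();
#endif
}

/* Measure one call of fn in counter units, serializing before both reads. */
uint64_t elapsed_cycles(void (*fn)(void))
{
    uint64_t begin, end;

    assert(fn != NULL);
    begin = cycles_serialized();
    fn();
    end = cycles_serialized();
    return end - begin;
}
```

The cpuid itself costs cycles (considerably more under virtualization), so the serialized read adds overhead to every measurement. That is the trade-off against the plain read.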

Mathieu

-

To: Christoph Lameter <clameter@...>
Cc: <linux-kernel@...>
Date: Tuesday, July 31, 2007 - 9:30 pm

For what it's worth, this kind of measurement widget could be usefully
recast as a pure client of the lttng/systemtap markers that Mathieu is
still working on. Instead of the custom pc_start/pc_stop functions,
you would have generic markers to identify the start/end spots, and a
bit of callback code to compute/export the statistics to procfs.
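
A rough sketch of that split, with generic markers in the code path and the statistics living in attached callbacks, might look like this. Everything here is hypothetical: struct marker, marker_attach(), marker_hit() and the stats callbacks are invented for illustration and are not the lttng/systemtap marker API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical probe signature; NOT the real marker API. */
typedef void (*probe_fn)(uint64_t now, void *priv);

/* A generic marker placed at a start or end spot in the code. */
struct marker {
    const char *name;
    probe_fn probe;  /* NULL while no client is attached */
    void *priv;      /* client data handed back to the probe */
};

/* Attach a client callback to a marker. */
void marker_attach(struct marker *m, probe_fn probe, void *priv)
{
    assert(m != NULL);
    m->probe = probe;
    m->priv = priv;
}

/* Called from the instrumented code; does nothing without a client. */
void marker_hit(const struct marker *m, uint64_t now)
{
    assert(m != NULL);
    if (m->probe)
        m->probe(now, m->priv);
}

/* The statistics client: interval bookkeeping lives in the callbacks. */
struct interval_stats {
    uint64_t start, events, total;
};

void stats_on_start(uint64_t now, void *priv)
{
    struct interval_stats *s = priv;

    s->start = now;
}

void stats_on_stop(uint64_t now, void *priv)
{
    struct interval_stats *s = priv;

    s->events++;
    s->total += now - s->start;
}
```

With this shape the instrumented code only knows about markers. The timing statistics, and how they are exported, can be changed or removed without touching the code path.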

- FChE
-
