Re: [patch for 2.6.26 0/7] Architecture Independent Markers

Previous thread: [patch for 2.6.26 1/7] Markers - define non optimized marker by Mathieu Desnoyers on Thursday, March 27, 2008 - 9:20 am. (1 message)

Next thread: [patch for 2.6.26 3/7] LTTng instrumentation ipc by Mathieu Desnoyers on Thursday, March 27, 2008 - 9:21 am. (1 message)
To: <akpm@...>, <linux-kernel@...>
Date: Thursday, March 27, 2008 - 9:20 am

Hi Andrew,

After a few RFC rounds, I propose these markers for 2.6.26. They include
work done after comments from the memory management community. Most of them have
been used by the LTTng project for about 2 years.

The first patch in the patchset does a small addition to the markers API : it
allows marker sites to declare a _trace_mark marker. It forces use of a marker
which does not rely on an instruction modification mechanism to enable itself.
It's required in some kernel code paths (lockdep, printk, some traps, __init and
__exit code). I would prefer to get this in before the immediate values, since
the immediate values optimization, which depends on the rework of x86
alternatives, paravirt and kprobes currently being merged, takes longer than
expected.

I do not expect this marker set to cover every bit of the kernel (and this is
not its purpose). It's just a good start that has proven to be very useful to
the LTTng community in the past 2 years.

This patchset applies over 2.6.25-rc7 in this order :

markers-define-non-optimized-marker.patch
lttng-instrumentation-fs.patch
lttng-instrumentation-ipc.patch
lttng-instrumentation-kernel.patch
lttng-instrumentation-mm.patch
lttng-instrumentation-net.patch
lttng-instrumentation-lib.patch

Mathieu

--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
--

To: Mathieu Desnoyers <mathieu.desnoyers@...>
Cc: <akpm@...>, <linux-kernel@...>
Date: Thursday, March 27, 2008 - 8:01 pm

Not in this patch, but I noticed:

#define __trace_mark(name, call_private, format, args...) \
	do { \
		static const char __mstrtab_##name[] \
		__attribute__((section("__markers_strings"))) \
		= #name "\0" format; \
		static struct marker __mark_##name \
		__attribute__((section("__markers"), aligned(8))) = \
		{ __mstrtab_##name, &__mstrtab_##name[sizeof(#name)], \
		0, 0, marker_probe_cb, \
		{ __mark_empty_function, NULL}, NULL }; \
		__mark_check_format(format, ## args); \
		if (unlikely(__mark_##name.state)) { \
			(*__mark_##name.call) \
				(&__mark_##name, call_private, \
				format, ## args); \
		} \
	} while (0)

In this call:

(*__mark_##name.call) \
(&__mark_##name, call_private, \
format, ## args); \

you make gcc allocate a duplicate format string. You can use
&__mstrtab_##name[sizeof(#name)] instead since it holds the same string,
or drop ", format," above and "const char *fmt" from here:

void (*call)(const struct marker *mdata, /* Probe wrapper */
void *call_private, const char *fmt, ...);

since mdata->format is the same and all callees which need it can take it there.
--
vda
--
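
The string-table layout discussed above can be tried out in user space. The sketch below is only an illustration: it drops the section attributes and struct marker, and MARK_STRTAB, marker_name() and marker_format() are hypothetical names, not part of the markers API. It shows that the format is already reachable through the name table, which is why passing "format" again at the call site is redundant.

```c
/*
 * User-space imitation of __mstrtab_##name: the marker name, its
 * terminating NUL, then the format string, all in one array.
 * MARK_STRTAB is a hypothetical helper, not a kernel macro.
 */
#define MARK_STRTAB(name, format) \
	static const char mstrtab_##name[] = #name "\0" format

MARK_STRTAB(sched_switch, "prev %d next %d");

/* The marker name is simply the start of the table. */
static const char *marker_name(void)
{
	return mstrtab_sched_switch;
}

/*
 * sizeof("sched_switch") counts the terminating NUL, so this index is
 * the first character of the format. It mirrors the expression
 * &__mstrtab_##name[sizeof(#name)] quoted in the message.
 */
static const char *marker_format(void)
{
	return &mstrtab_sched_switch[sizeof("sched_switch")];
}
```

A caller holding only the table can therefore recover both strings; no second copy of the format is needed.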

To: Denys Vlasenko <vda.linux@...>
Cc: <akpm@...>, <linux-kernel@...>
Date: Thursday, March 27, 2008 - 9:04 pm

Markers - define non optimized marker

To support the forthcoming "immediate values" marker optimization, we must have
a way to declare markers in the few code paths that cannot use
instruction-modification-based enabling. This will be the case for printk(),
some traps and eventually lockdep instrumentation.

Changelog :
- Fix reversed boolean logic of "generic".

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
---
include/linux/marker.h | 29 ++++++++++++++++++++++++-----
1 file changed, 24 insertions(+), 5 deletions(-)

Index: linux-2.6-lttng/include/linux/marker.h
===================================================================
--- linux-2.6-lttng.orig/include/linux/marker.h 2008-03-27 20:47:44.000000000 -0400
+++ linux-2.6-lttng/include/linux/marker.h 2008-03-27 20:49:04.000000000 -0400
@@ -58,8 +58,12 @@ struct marker {
* Make sure the alignment of the structure in the __markers section will
* not add unwanted padding between the beginning of the section and the
* structure. Force alignment to the same alignment as the section start.
+ *
+ * The "generic" argument controls which marker enabling mechanism must be used.
+ * If generic is true, a variable read is used.
+ * If generic is false, immediate values are used.
*/
-#define __trace_mark(name, call_private, format, args...) \
+#define __trace_mark(generic, name, call_private, format, args...) \
do { \
static const char __mstrtab_##name[] \
__attribute__((section("__markers_strings"))) \
@@ -79,7 +83,7 @@ struct marker {
extern void marker_update_probe_range(struct marker *begin,
struct marker *end);
#else /* !CONFIG_MARKERS */
-#define __trace_mark(name, call_private, format, args...) \
+#define __trace_mark(generic, name, call_private, format, args...) \
__mark_check_format(format, ## args)
static inline void marker_update_probe_range(struct marker *begin,
struct marker *end)
@@ -87,15 +91,30 @@ static inline void marker_update_probe_r
#endif /* CONFI...

To: Denys Vlasenko <vda.linux@...>
Cc: <akpm@...>, <linux-kernel@...>
Date: Thursday, March 27, 2008 - 9:02 pm

Very good point. I actually thought about dropping it, since it would
remove an unnecessary argument from the stack. And actually, since I now
have the marker_probe_cb sitting between the marker site and the
callbacks, there is no API change required. Thanks :)

Mathieu

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Denys Vlasenko <vda.linux@googlemail.com>
---
include/linux/marker.h | 11 +++++------
kernel/marker.c | 30 ++++++++++++++----------------
2 files changed, 19 insertions(+), 22 deletions(-)

Index: linux-2.6-lttng/include/linux/marker.h
===================================================================
--- linux-2.6-lttng.orig/include/linux/marker.h 2008-03-27 20:51:34.000000000 -0400
+++ linux-2.6-lttng/include/linux/marker.h 2008-03-27 20:54:55.000000000 -0400
@@ -44,8 +44,8 @@ struct marker {
*/
char state; /* Marker state. */
char ptype; /* probe type : 0 : single, 1 : multi */
- void (*call)(const struct marker *mdata, /* Probe wrapper */
- void *call_private, const char *fmt, ...);
+ /* Probe wrapper */
+ void (*call)(const struct marker *mdata, void *call_private, ...);
struct marker_probe_closure single;
struct marker_probe_closure *multi;
} __attribute__((aligned(8)));
@@ -72,8 +72,7 @@ struct marker {
__mark_check_format(format, ## args); \
if (unlikely(__mark_##name.state)) { \
(*__mark_##name.call) \
- (&__mark_##name, call_private, \
- format, ## args); \
+ (&__mark_##name, call_private, ## args);\
} \
} while (0)

@@ -117,9 +116,9 @@ static inline void __printf(1, 2) ___mar
extern marker_probe_func __mark_empty_function;

extern void marker_probe_cb(const struct marker *mdata,
- void *call_private, const char *fmt, ...);
+ void *call_private, ...);
extern void marker_probe_cb_noarg(const struct marker *mdata,
- void *call_private, const char *fmt, ...);
+ void *call_private, ...);

/*
* Connect a probe to a marke...
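
For readers following the signature change, here is a minimal user-space sketch of a probe wrapper that no longer receives "fmt" and reads the format from mdata instead. The struct is a cut-down stand-in for struct marker, and demo_probe_cb(), demo_fire() and the use of call_private as an output buffer are assumptions made for the example, not what the kernel code does.

```c
#include <stdarg.h>
#include <stdio.h>

/* Cut-down stand-in for struct marker; only the fields used here. */
struct marker {
	const char *name;
	const char *format;
	void (*call)(const struct marker *mdata, void *call_private, ...);
};

/*
 * Probe wrapper without a "const char *fmt" parameter: the variadic
 * list now starts right after call_private, and the format comes from
 * mdata. In this sketch call_private is (hypothetically) a 64-byte
 * output buffer.
 */
static void demo_probe_cb(const struct marker *mdata, void *call_private, ...)
{
	va_list args;

	va_start(args, call_private);
	vsnprintf(call_private, 64, mdata->format, args);
	va_end(args);
}

/* Fire the demo marker and leave the formatted event text in buf. */
static void demo_fire(char buf[64], int prev_pid, int next_pid)
{
	static const struct marker m = {
		"sched_switch", "prev %d next %d", demo_probe_cb
	};

	m.call(&m, buf, prev_pid, next_pid);
}
```

The call site passes one argument fewer, and the callee still has everything it needs to decode the variadic list.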

To: Mathieu Desnoyers <mathieu.desnoyers@...>
Cc: Denys Vlasenko <vda.linux@...>, <akpm@...>, <linux-kernel@...>
Date: Friday, March 28, 2008 - 1:35 am

Hi Mathieu,

First of all, I'm very interested in markers, and your patches look
useful to me.

By the way, could you tell me what the call_private is for?

However, I think call_data can be passed as an argument.
If so, I think you can also drop it, like this.

void (*call)(const struct marker *mdata, ...);

Best regards,

--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com

--

To: Mathieu Desnoyers <mathieu.desnoyers@...>
Cc: <akpm@...>, <linux-kernel@...>, Linus Torvalds <torvalds@...>
Date: Thursday, March 27, 2008 - 11:40 am

very strong NACK. When markers went into 2.6.24 i initially believed
your claims that my earlier objections about markers have been resolved
and that it's a lightweight, useful facility.

so we optimistically used markers in ftrace (see sched-devel.git) for
the scheduler, and i was shocked about marker impact:

just 3 ftrace markers in the scheduler plus their support code bloated
the kernel by 5k (!), 288 bytes for only 3 markers in the scheduler
itself, the rest in support code to manage the markers - that's 96 bytes
added per every marker (44 (!) bytes of that in the fastpath!).

44 bytes per marker per fastpath is _NOT_ acceptable in any way, shape
or form. Those 3 limited markers have the same cache cost as adding
mcount callbacks for dyn-ftrace to the _whole_ scheduler ...

as i told you many, many moons ago, repeatedly: acceptable cost is a 5
bytes callout that is patched to a NOP, and _maybe_ a little bit more to
prepare parameters for the function calls. Not 44 bytes. Not 96 bytes.
Not 5K total cost. Paravirt ops are super-lightweight in comparison.

and this stuff _can_ be done sanely and cheaply and in fact we have done
it: see ftrace in sched-devel.git, and compare its cost.

see further details in the tongue-in-cheek commit below.

Ingo

--------------------->
Subject: sched: reintroduce markers
From: Ingo Molnar <mingo@elte.hu>

Scheduler markers are seriously bloated - so lets add them:

   text    data     bss     dec    hex filename
7209664  852020 1634872 9696556 93f52c vmlinux.before
7214679  852188 1634872 9701739 94096b vmlinux.after

5K more bloat!

but that's not just bloat in the slowpath and tracer bloat, it's
also fastpath bloat:

   text    data     bss     dec    hex filename
  37642    7014     384   45040   aff0 sched.o.before
  37930    7134     384   45448   b188 sched.o.after

288 bytes for only 3 markers in the scheduler - that's 96 bytes added
per every marker (44 bytes of that in the ...
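
The arithmetic behind these figures can be checked with a few lines of C. This is only a restatement of the size(1) numbers quoted above; the struct and function names are made up for the example.

```c
/* One row of size(1) output; only the columns used here. */
struct size_row {
	long text;
	long data;
};

/* Numbers copied from the two tables in the commit message. */
static const struct size_row vmlinux_before = { 7209664, 852020 };
static const struct size_row vmlinux_after  = { 7214679, 852188 };
static const struct size_row sched_before   = { 37642, 7014 };
static const struct size_row sched_after    = { 37930, 7134 };

/* Growth of the text section between two builds, in bytes. */
static long text_delta(struct size_row before, struct size_row after)
{
	return after.text - before.text;
}

/* Growth of text plus data between two builds, in bytes. */
static long text_data_delta(struct size_row before, struct size_row after)
{
	return (after.text + after.data) - (before.text + before.data);
}

/* Text bytes added to sched.o per marker, for the 3 markers cited. */
static long sched_text_per_marker(void)
{
	return text_delta(sched_before, sched_after) / 3;
}
```

The vmlinux text section grows by 5015 bytes (5183 including data), and sched.o text grows by 288 bytes, which is where the 96 bytes per marker comes from.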

To: Ingo Molnar <mingo@...>
Cc: Mathieu Desnoyers <mathieu.desnoyers@...>, <akpm@...>, <linux-kernel@...>, Linus Torvalds <torvalds@...>
Date: Thursday, March 27, 2008 - 5:49 pm

Do you have any *time* measurements?

- FChE
--

To: Ingo Molnar <mingo@...>
Cc: <akpm@...>, <linux-kernel@...>, Linus Torvalds <torvalds@...>
Date: Thursday, March 27, 2008 - 4:39 pm

Hi Ingo,

Let's compare one marker against one ftrace statement in sched.o on the
sched-dev tree on x86_32 and see where your "bloat" impression about markers
comes from. I think it's mostly due to the different metrics we use.

sched.o w/o CONFIG_CONTEXT_SWITCH_TRACER
   text    data     bss     dec    hex filename
  46564    2924     200   49688   c218 kernel/sched.o

Let's get an idea of CONFIG_CONTEXT_SWITCH_TRACER impact on sched.o :

sched.o with CONFIG_CONTEXT_SWITCH_TRACER

   text    data     bss     dec    hex filename
  46788    2924     200   49912   c2f8 kernel/sched.o

224 bytes added for 6 ftrace_*(). This is partly due to the helper function
ftrace_all_fair_tasks(). So let's be fair and not take it into account.

Only the cost for one ftrace_*(). All the others commented out, leaving this
one :

static inline void
context_switch(struct rq *rq, struct task_struct *prev,
	       struct task_struct *next)
{
	struct mm_struct *mm, *oldmm;

	prepare_task_switch(rq, prev, next);
	ftrace_ctx_switch(rq, prev, next);
	...

   text    data     bss     dec    hex filename
  46644    2924     200   49768   c268 kernel/sched.o

Commenting this one out :

   text    data     bss     dec    hex filename
  46628    2924     200   49752   c258 kernel/sched.o

For an extra 16 bytes (13 + alignment).

Due to this addition to schedule fast path :

movl %ebx, %ecx
movl -48(%ebp), %edx
movl -40(%ebp), %eax
call ftrace_ctx_switch

corresponding to :

38c: 89 d9 mov %ebx,%ecx
38e: 8b 55 d0 mov -0x30(%ebp),%edx
391: 8b 45 d8 mov -0x28(%ebp),%eax
394: e8 fc ff ff ff call 395 <schedule+0x12c>

Which adds 13 bytes to the fast path. It reads the stack to populate the
registers even when the code is dynamically disabled. The size of this code
directly depends on the number of parameters passed to the trace...
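
The 13-byte figure is just the sum of the encoded instruction lengths in the disassembly above. The sketch below adds them up and rounds to a 16-byte boundary; the 16-byte granularity is an assumption made to reproduce the "13 + alignment" remark, and the helper names are hypothetical.

```c
#include <stddef.h>

/* Encoded lengths of the four instructions in the listing above. */
static const unsigned int insn_len[] = {
	2,	/* 89 d9           mov  %ebx,%ecx        */
	3,	/* 8b 55 d0        mov  -0x30(%ebp),%edx */
	3,	/* 8b 45 d8        mov  -0x28(%ebp),%eax */
	5,	/* e8 fc ff ff ff  call ftrace_ctx_switch */
};

/* Bytes the disabled ftrace call site adds to the fast path. */
static unsigned int fastpath_bytes(void)
{
	unsigned int total = 0;
	size_t i;

	for (i = 0; i < sizeof(insn_len) / sizeof(insn_len[0]); i++)
		total += insn_len[i];
	return total;
}

/* Round n up to a multiple of align (align must be non-zero). */
static unsigned int round_up_to(unsigned int n, unsigned int align)
{
	return (n + align - 1) / align * align;
}
```

With these numbers the four instructions come to 13 bytes, and rounding to 16 matches the measured growth of sched.o.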

To: Mathieu Desnoyers <mathieu.desnoyers@...>
Cc: <akpm@...>, <linux-kernel@...>, Linus Torvalds <torvalds@...>
Date: Friday, March 28, 2008 - 9:33 am

you talk about 32-bit while i talk about 64-bit. All these costs go up
on 64-bit and you should know that. I measured 44 bytes in the fastpath
and 52 bytes in the slowpath, which gives 96 bytes. (with a distro
.config and likely with a different gcc)

96 bytes _per marker_ sprinkled throughout the kernel. This blows up the
cache footprint of the kernel quite substantially, because it's all
fragmented - even if this is in the 'slowpath'.

so yes, that is the bloat i'm talking about.

dont just compare it to ftrace-sched-switch, compare it to dyn-ftrace
which gives us more than 78,000 trace points in the kernel _here and
today_ at no measurable runtime cost, with a 5 byte NOP per trace point
and _zero_ instruction stream (register scheduling, etc.) intrusion. No
slowpath cost.

and the basic API approach of markers is flawed as well - the coupling to
the kernel is too strong. The correct and long-term maintainable
coupling is via ASCII symbol names, not via any binding built into the
kernel.

With dyn-ftrace (see sched-devel.git/latest) tracing filters can be
installed trivially by users, via function _symbols_, via:

/debugfs/tracing/available_filter_functions
/debugfs/tracing/set_ftrace_filter

wildcards are recognized as well, so if you do:

echo '*lock' > /debugfs/tracing/set_ftrace_filter

all functions that have 'lock' in their name will have their tracepoints
activated transparently from that point on.

even multiple names can be passed in at once:

echo 'schedule wake_up* *acpi*' > /debugfs/tracing/set_ftrace_filter

so it's trivial to use it, very powerful and we've only begun exposing
it towards users. I see no good reason why we'd patch any marker into
the kernel - it's a maintenance cost from that point on.

so yes, my argument is: tens of thousands of lightweight tracepoints in
the kernel here and today, which are configurable via function names,
each of which can be turned on and off individually, and none of which
needs...
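
To make the filter syntax concrete, here is a small user-space sketch of matching function names against a space-separated list of wildcard patterns. It uses POSIX fnmatch(3) as a stand-in; how the tracer itself matches names is not shown in this thread, so the semantics may differ (with fnmatch, '*lock' matches names that end in "lock"). filter_matches() is a hypothetical helper.

```c
#include <fnmatch.h>
#include <string.h>

/*
 * Return 1 if func matches any whitespace-separated glob in filter,
 * 0 otherwise. fnmatch(3) is a user-space stand-in for whatever
 * matching the tracer performs.
 */
static int filter_matches(const char *filter, const char *func)
{
	char pat[64];

	while (*filter) {
		size_t len;

		filter += strspn(filter, " \t\n");	/* skip separators */
		len = strcspn(filter, " \t\n");		/* next pattern length */
		if (len == 0)
			break;
		if (len < sizeof(pat)) {	/* over-long patterns are skipped */
			memcpy(pat, filter, len);
			pat[len] = '\0';
			if (fnmatch(pat, func, 0) == 0)
				return 1;
		}
		filter += len;
	}
	return 0;
}
```

For example, filter_matches("schedule wake_up* *acpi*", "wake_up_process") returns 1, while the same filter applied to "printk" returns 0.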

To: Ingo Molnar <mingo@...>
Cc: <akpm@...>, <linux-kernel@...>, Linus Torvalds <torvalds@...>
Date: Saturday, March 29, 2008 - 1:16 pm

I did some testing with gcc -Os and -O2 on x86_64 and noticed that -Os
behaves badly in that it does not use -freorder-blocks. This
optimization is required to have the unlikely branches moved out of the
critical path.

With -O2 :
mov $0,%al
movq %rsi, 1912(%rbx)
movq -96(%rbp), %rdi
incq (%rdi)
testb %al, %al
jne .L1785

4de: b0 00 mov $0x0,%al
4e0: 48 89 b3 78 07 00 00 mov %rsi,0x778(%rbx)
4e7: 48 8b 7d a0 mov 0xffffffffffffffa0(%rbp),%rdi
4eb: 48 ff 07 incq (%rdi)
4ee: 84 c0 test %al,%al
4f0: 0f 85 5f 03 00 00 jne 855 <thread_return+0x2b4>

So, as far as the assembly for the markers in the fast path is
concerned, it adds 10 bytes to the fast path, on x86_64. (I did not
count the %rdi stuff in this since I suppose it's unrelated to markers
and put there by the compiler which reorders instructions)

The block which contains the call is much lower at the end of
thread_return.

855: 49 89 f0 mov %rsi,%r8
858: 48 89 d1 mov %rdx,%rcx
85b: 31 f6 xor %esi,%esi
85d: 48 89 da mov %rbx,%rdx
860: 48 c7 c7 00 00 00 00 mov $0x0,%rdi
867: 31 c0 xor %eax,%eax
869: ff 15 00 00 00 00 callq *0(%rip) # 86f <thread_return+0x2ce>
86f: e9 82 fc ff ff jmpq 4f6 <schedule+0x166>

For an added 31 bytes.

I think the very different compiler options we use change the picture

Markers and dyn-ftrace do not fulfill the same purpose, so I don't see
why we should compare them. dyn-ftrace is good at tracing function
entry/exit, so let's keep it. However, it's not designed to extract
variables at specific locations in the kernel code.

Which slowpath cost are you talking about ? When markers are disabled,
their unused function call instructions are placed car...
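
The shape of the marker site measured above (a state byte test in the fast path, with the call placed on an unlikely branch) can be sketched in user space as follows. This is an illustration only: struct demo_marker, demo_probe() and demo_context_switch() are made-up names, and whether the compiler actually moves the call out of line depends on the optimization flags discussed in this message.

```c
/* GCC/Clang branch hint, the same builtin the kernel's unlikely() wraps. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Cut-down stand-in for a marker: an enable flag and a probe pointer. */
struct demo_marker {
	char state;			/* 0: disabled, 1: enabled */
	void (*call)(int prev, int next);
};

static int demo_hits;			/* counts probe invocations */

static void demo_probe(int prev, int next)
{
	(void)prev;
	(void)next;
	demo_hits++;
}

static struct demo_marker demo_mark = { 0, demo_probe };

/*
 * Marker site. When the marker is disabled, only the load, test and
 * branch on demo_mark.state are executed; argument setup and the
 * indirect call sit on the unlikely path.
 */
static void demo_context_switch(int prev, int next)
{
	if (unlikely(demo_mark.state))
		demo_mark.call(prev, next);
}
```

In the x86_64 listing above, the equivalent of the state test costs 10 bytes in the fast path and the out-of-line block 31 bytes, 41 bytes in total for that build.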

To: Mathieu Desnoyers <mathieu.desnoyers@...>
Cc: <akpm@...>, <linux-kernel@...>, Linus Torvalds <torvalds@...>
Date: Friday, March 28, 2008 - 5:43 am

it's not 6 ftrace calls, you forgot about kernel/sched_fair.c, so it's 9
tracepoints.

note that all but the 2 core trace hooks are temporary, i used them to
debug a specific scheduler problem. Especially one trace point:
ftrace_all_fair_tasks() is a totally ad-hoc trace-all-tasks-in-the-rq
heavy function.

if you want to compare apples to apples, try the patch below, which
removes the ad-hoc tracepoints.

Ingo

------------------------>
Subject: no: ad hoc ftrace points
From: Ingo Molnar <mingo@elte.hu>
Date: Fri Mar 28 10:30:37 CET 2008

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
kernel/sched.c | 47 -----------------------------------------------
kernel/sched_fair.c | 3 ---
2 files changed, 50 deletions(-)

Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -2005,53 +2005,6 @@ static int sched_balance_self(int cpu, i

#endif /* CONFIG_SMP */

-#ifdef CONFIG_CONTEXT_SWITCH_TRACER
-
-void ftrace_task(struct task_struct *p, void *__tr, void *__data)
-{
-#if 0
- /*
- * trace timeline tree
- */
- __trace_special(__tr, __data,
- p->pid, p->se.vruntime, p->se.sum_exec_runtime);
-#else
- /*
- * trace balance metrics
- */
- __trace_special(__tr, __data,
- p->pid, p->se.avg_overlap, 0);
-#endif
-}
-
-void ftrace_all_fair_tasks(void *__rq, void *__tr, void *__data)
-{
- struct task_struct *p;
- struct sched_entity *se;
- struct rb_node *curr;
- struct rq *rq = __rq;
-
- if (rq->cfs.curr) {
- p = task_of(rq->cfs.curr);
- ftrace_task(p, __tr, __data);
- }
- if (rq->cfs.next) {
- p = task_of(rq->cfs.next);
- ftrace_task(p, __tr, __data);
- }
-
- for (curr = first_fair(&rq->cfs); curr; curr = rb_next(curr)) {
- se = rb_entry(curr, struct sched_entity, run_node);
- if (!entity_is_task(se))
- continue;
-
- p = task_of(se);
- ftrace_task(p, __tr, __data);
- ...

To: Ingo Molnar <mingo@...>
Cc: <akpm@...>, <linux-kernel@...>, Linus Torvalds <torvalds@...>
Date: Friday, March 28, 2008 - 7:38 am

Hrm, you are only quoting my introduction, where I explain the reason
why I do a more in-depth analysis on a _single_ ftrace statement.

Ingo, if you care to read the rest of my email, you will discover that I
concentrated my effort on a single ftrace statement in context_switch().
Whether or not I removed the trace points from kernel/sched_fair.c does
not change the validity of the results that follow. I commented out your
ad-hoc tracepoints from sched.c by hand in my test cases, and
sched_fair.c trace points were there in every scenario, so they were
invariant and _not_ considered, except in this introduction you quoted.

--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
--

To: Mathieu Desnoyers <mathieu.desnoyers@...>
Cc: <akpm@...>, <linux-kernel@...>, Linus Torvalds <torvalds@...>
Date: Friday, March 28, 2008 - 7:22 am

... new version of that is in sched-devel.git.

Ingo
--

To: Ingo Molnar <mingo@...>
Cc: Mathieu Desnoyers <mathieu.desnoyers@...>, <akpm@...>, <linux-kernel@...>, Linus Torvalds <torvalds@...>
Date: Thursday, March 27, 2008 - 1:08 pm

I wonder why nobody has fixed this serious problem until now.
I am also interested in the difference between ftrace and markers.

If I have time next week, I will investigate the marker problem.
--

To: KOSAKI Motohiro <m-kosaki@...>
Cc: Mathieu Desnoyers <mathieu.desnoyers@...>, <akpm@...>, <linux-kernel@...>, Linus Torvalds <torvalds@...>, Frank Ch. Eigler <fche@...>
Date: Friday, March 28, 2008 - 6:15 am

i warned and moaned about it ad nauseam.

furthermore, and because it's Friday again, let me remind folks that
SystemTap has an even more significant bloat problem: the fact that it
needs a huge download:

Installing:
kernel-debuginfo x86_64 2.6.25-0.163.rc7.git1.fc9
development-debuginfo 198 M
Installing for dependencies:
kernel-debuginfo-common x86_64 2.6.25-0.163.rc7.git1.fc9
development-debuginfo 30 M

Total download size: 229 M

for _every_ updated kernel. That 229 MB size reduces the user base of
SystemTap (which is otherwise a totally brilliant and cool tool) to 1%
of its potential userbase, to those truly desperate persons who really
_need_ to get their problem debugged somehow. But it's nowhere near
usable as an easy, ad-hoc kernel instrumentation tool, just due to the
sheer size it brings.

for heaven's sake, we can have 3 years of _full Linux kernel history_,
with all 87875 commits, with full changelog and dependencies and full
source code included, packed into a rather tight, 180 MB git repository
...

[ and then i havent even begun about the on-disk footprint of this
monstrum, after the packages have been installed: an additional 850 MB.
Puh-lease ... ]

there's a huge disconnect with reality here.

Ingo
--

To: Ingo Molnar <mingo@...>
Cc: KOSAKI Motohiro <m-kosaki@...>, Mathieu Desnoyers <mathieu.desnoyers@...>, <akpm@...>, <linux-kernel@...>, Linus Torvalds <torvalds@...>
Date: Friday, March 28, 2008 - 9:40 am

Frankly, the "problem" has not yet been established. Ingo has failed
to respond to Mathieu's analysis of the actual instruction streams,
which cast doubt on Ingo's technique of simply counting bytes. So
far, neither Ingo nor Mathieu have published *time measurements* to

It's a problem, and there are a few improvements under way or being
contemplated:

- compression of the DWARF data to eliminate more duplication
- support for inserting probes only on function entry/exit points
without debugging data - relying on symbol tables and raw ABI
- a networked service to allow one machine to serve site-wide
systemtap needs

At the end of the day, lots of information will still need to be
around for those who need to put probes approximately anywhere and to

It may seem counterintuitive to you, but remember that object code is
recreated from scratch new every build, yet source code only gradually
(?) changes. Try checking in 3 years' builds of stripped object
files into a git repo.

- FChE
--

To: Frank Ch. Eigler <fche@...>
Cc: KOSAKI Motohiro <m-kosaki@...>, Mathieu Desnoyers <mathieu.desnoyers@...>, <akpm@...>, <linux-kernel@...>, Linus Torvalds <torvalds@...>
Date: Friday, March 28, 2008 - 10:41 am

it's been the primary usability problem ever since SystemTap's
inception 3 years ago. The 229+850 MB size numbers i cited were from a
bleeding-edge (Fedora 9 beta) distro. So whatever is being contemplated,
it's not here today.

Ingo
--

To: Ingo Molnar <mingo@...>
Cc: KOSAKI Motohiro <m-kosaki@...>, Mathieu Desnoyers <mathieu.desnoyers@...>, <akpm@...>, <linux-kernel@...>, Linus Torvalds <torvalds@...>
Date: Friday, March 28, 2008 - 11:31 am

Hi -

This complaint is particularly rich, seeing how you constantly try to
shoot down patches that would reduce systemtap's need for this, such
as markers. And no, dyn-ftrace is not a complete substitute. See how
many of the markers in mathieu's lttng suite occur at places *other*
than function entry points. See how many require parameters other
than what may already sit in registers.

We in systemtap land have always had to make do with whatever was in
the kernel. There was very little other than kprobes 3 years ago. Of
newer facilities such as markers, ftrace, perfmon2 (I hope), and a
bunch of other little tracer thingies, all those that provide a usable
callback style interface can be exposed to systemtap. The more the
merrier, and the less we'd have to resort to the lowest-level
kprobes/dwarf stuff. Please help rather than obstruct.

- FChE
--

To: Frank Ch. Eigler <fche@...>
Cc: KOSAKI Motohiro <m-kosaki@...>, Mathieu Desnoyers <mathieu.desnoyers@...>, <akpm@...>, <linux-kernel@...>, Linus Torvalds <torvalds@...>
Date: Friday, March 28, 2008 - 10:18 am

uhm no. The "difference" is due to Mathieu counting on the platform
that is most favorable to his argument: 32-bit, while i counted on
64-bit where most servers operate (and where instrumentation micro-cost
actually matters).

Ingo
--

To: Ingo Molnar <mingo@...>
Cc: KOSAKI Motohiro <m-kosaki@...>, Mathieu Desnoyers <mathieu.desnoyers@...>, <akpm@...>, <linux-kernel@...>, Linus Torvalds <torvalds@...>, Frank Ch. Eigler <fche@...>, systemtap-ml <systemtap@...>, Jim Keniston <jkenisto@...>
Date: Friday, March 28, 2008 - 9:34 am

Sure, it is a thorny issue, especially for users of distributions that
update the kernel daily.

However, when updating the kernel from a tarball, we can get a full set of
debuginfo just by building the kernel with CONFIG_DEBUG_INFO. I think that is
usable enough for kernel developers.

BTW, we already started working on the symbol-table based probing.

Thanks to Moore's law and all storage engineers, nowadays we can buy
a 1TB (=1000GB!) HDD for less than $300! (I was surprised too...)
Compressed filesystems can also be used to save disk space.

Best regards,

--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com

--

To: Masami Hiramatsu <mhiramat@...>
Cc: Ingo Molnar <mingo@...>, KOSAKI Motohiro <m-kosaki@...>, Mathieu Desnoyers <mathieu.desnoyers@...>, <akpm@...>, <linux-kernel@...>, Linus Torvalds <torvalds@...>, Frank Ch. Eigler <fche@...>, systemtap-ml <systemtap@...>, Jim Keniston <jkenisto@...>
Date: Monday, March 31, 2008 - 9:43 pm

This is not a valid technical reason for creating bloatware.

Bloatware's main problem is not the cost of storing it or
downloading it. The main problem is that over time it becomes
an unmaintainable monstrosity nobody is willing to deal with.

I would rather try to find and fix a bug in 2000 lines of C code
than in 200 000 lines.
--
vda
--

To: Denys Vlasenko <vda.linux@...>
Cc: Ingo Molnar <mingo@...>, KOSAKI Motohiro <m-kosaki@...>, Mathieu Desnoyers <mathieu.desnoyers@...>, <akpm@...>, <linux-kernel@...>, Linus Torvalds <torvalds@...>, Frank Ch. Eigler <fche@...>, systemtap-ml <systemtap@...>, Jim Keniston <jkenisto@...>
Date: Tuesday, April 1, 2008 - 10:30 am

Hi Denys,

If it is program code, you're right.
However, the debuginfo is just a set of data files generated from
the C source code by the compiler, so you don't need to maintain it.

--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com

--
