* Linus Torvalds (torvalds@linux-foundation.org) wrote:

vmstat "counter increments", blktrace instrumentation, and profile.c "profile_hits" calls could all be expressed as "generic markup" and then used for profiling and tracing. But that would imply creating a markup management mechanism that permits it without hurting performance.

If we have to put it that way, code markup can itself be seen as a user-visible interface. The marker name, if a particular analysis depends on it, will have to stay unchanged. The same applies to the arguments passed to it. Therefore, even though the scheduler code changed a lot over the past 10 years, its context switch marker could always be expressed as

    trace_mark(kernel_sched_schedule,
               "prev_pid %d next_pid %d prev_state %ld",
               prev->pid, next->pid, prev->state);

where kernel_sched_schedule and the format string field names are kept unchanged. Only its location and the names of the variables it touches would have to change to follow the kernel tree.

I am not a kprobe user myself, so I understand you completely. :)

What users expect when they try to fix that kind of issue, when oprofile and gdb are not sufficient, is to start a data collection mechanism that will tell them what is going on in their system at large, without requiring them to write kernel code. However, that involves marking up key kernel code that will call into a tracer to extract that information.

Other projects have done this in different ways. SystemTap, for instance, does it out of tree by keeping a separate list of addresses where kprobes must be installed. It does the job from a distribution kernel maintainer's perspective (Red Hat), since they freeze to a particular kernel version and update this list every time it breaks, but it will always be a source of frustration for vanilla kernel users and kernel developers.
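For illustration only, here is a minimal user-space sketch of the "stable name plus stable format string" idea described above. It is not the kernel marker implementation: marker_probe_register(), marker_hit(), text_probe(), struct task and context_switch() are hypothetical names made up for this example, and the trace_mark() macro here is a simplified stand-in. The point it tries to show is that a probe binds to the marker name and format string, not to the variables used at the instrumentation site, and that a disabled marker costs only one test.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical user-space sketch; NOT the kernel marker code. */
typedef void (*probe_fn)(const char *name, const char *fmt, va_list args);

struct probe_entry {
    const char *name;   /* marker name the probe depends on */
    const char *fmt;    /* format string the probe expects  */
    probe_fn fn;
};

#define MAX_PROBES 8
static struct probe_entry probes[MAX_PROBES];
static int nr_probes;   /* 0 means every marker is a cheap no-op */

/* Connect a probe to a marker by name and expected format. */
int marker_probe_register(const char *name, const char *fmt, probe_fn fn)
{
    if (nr_probes == MAX_PROBES)
        return -1;
    probes[nr_probes].name = name;
    probes[nr_probes].fmt = fmt;
    probes[nr_probes].fn = fn;
    nr_probes++;
    return 0;
}

/* Slow path: call each probe whose name AND format match the site. */
void marker_hit(const char *name, const char *fmt, ...)
{
    for (int i = 0; i < nr_probes; i++) {
        va_list args;

        if (strcmp(probes[i].name, name) || strcmp(probes[i].fmt, fmt))
            continue;   /* different marker, or the interface changed */
        va_start(args, fmt);
        probes[i].fn(name, fmt, args);
        va_end(args);
    }
}

/* Simplified stand-in for trace_mark(name, format, args...). */
#define trace_mark(id, ...) \
    do { if (nr_probes) marker_hit(#id, __VA_ARGS__); } while (0)

/* Example probe: renders the event as text into a buffer. */
static char last_event[128];

void text_probe(const char *name, const char *fmt, va_list args)
{
    char payload[96];

    vsnprintf(payload, sizeof(payload), fmt, args);
    snprintf(last_event, sizeof(last_event), "%s: %s", name, payload);
}

/* Hypothetical instrumented site, standing in for the scheduler. */
struct task { int pid; long state; };

void context_switch(struct task *prev, struct task *next)
{
    trace_mark(kernel_sched_schedule,
               "prev_pid %d next_pid %d prev_state %ld",
               prev->pid, next->pid, prev->state);
}
```

In this sketch, renaming prev or next inside context_switch() would not affect a registered probe, while changing the marker name or the format string would make the probe stop matching, which is exactly why those two parts have to be treated as the interface.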
I think the best way to follow the code flow is to add markup in the code itself: it would follow the kernel HEAD and let each subsystem maintainer identify the key instrumentation sites of their subsystem.

It's important to state that anyone who wants to keep their own marker set in a separate patchset can do so. I currently have my own set of markers to trace the most important kernel sites required to analyze and show a trace of the Linux kernel in my LTTng kernel tracer. It's derived from the set found in LTT, which did not change much in about 8 years. I could always submit that for comments to see how subsystem maintainers will react to the proposed instrumentation.

As for extending ptrace, I am sorry to say that this solution has the same downsides as kprobes: it is too slow for high performance applications, especially if turned on system-wide. It will also change the system behavior so much that it may hide the bugs and performance issues people are struggling to find. Ptrace is very good at what it does, looking inside _one_ application and tracing its system calls and signals, but the approach reaches its limits when we try to look at the interactions between multiple applications and the kernel more globally.

Absolutely. Let's do it.

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
