Re: [patch for 2.6.26 0/7] Architecture Independent Markers

!MAILaRCHIVE_VOTE_RePLACE
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
To: Ingo Molnar <mingo@...>
Cc: <akpm@...>, <linux-kernel@...>, Linus Torvalds <torvalds@...>
Date: Thursday, March 27, 2008 - 4:39 pm

* Ingo Molnar (mingo@elte.hu) wrote:

Hi Ingo,

Let's compare one marker against one ftrace statement in sched.o on the
sched-dev tree on x86_32 and see where your "bloat" impression about markers
comes from. I think it's mostly due to the different metrics we use.


sched.o w/o CONFIG_CONTEXT_SWITCH_TRACER
   text    data     bss     dec     hex filename
  46564    2924     200   49688    c218 kernel/sched.o

Let's get an idea of CONFIG_CONTEXT_SWITCH_TRACER impact on sched.o :

sched.o with CONFIG_CONTEXT_SWITCH_TRACER

  text    data     bss     dec     hex filename
  46788    2924     200   49912    c2f8 kernel/sched.o

224 bytes added for 6 ftrace_*(). This is partly due to the helper function
ftrace_all_fair_tasks(). So let's be fair and not take it in account.

Only the cost for one ftrace_*(). All the others commented out, leaving this
one :

static inline void
context_switch(struct rq *rq, struct task_struct *prev,
               struct task_struct *next)
{
        struct mm_struct *mm, *oldmm;

        prepare_task_switch(rq, prev, next);
        ftrace_ctx_switch(rq, prev, next);
...

   text    data     bss     dec     hex filename
  46644    2924     200   49768    c268 kernel/sched.o

Commenting this one out :

   text    data     bss     dec     hex filename
  46628    2924     200   49752    c258 kernel/sched.o

For an extra 16 bytes (13 + alignment).

Due to this addition to schedule fast path :

        movl    %ebx, %ecx
        movl    -48(%ebp), %edx
        movl    -40(%ebp), %eax
        call    ftrace_ctx_switch

corresponding to :

 38c:   89 d9                   mov    %ebx,%ecx
 38e:   8b 55 d0                mov    -0x30(%ebp),%edx
 391:   8b 45 d8                mov    -0x28(%ebp),%eax
 394:   e8 fc ff ff ff          call   395 <schedule+0x12c>

Which adds 13 bytes to the fast path. It reads the stack to populate the
registers even when the code is dynamically disabled. The size of this code
directly depends on the number of parameters passed to the tracer. It would also
have to dereference pointers from memory if there would happen to be some data
not present on the stack. All this when disabled. I suppose you patch a no-op
instead of the call to dynamically disable it.

Changing this for a trace_mark :

        trace_mark(ctx_switch, "rq %p prev %p next %p", rq, prev, next);

Adds this to schedule fast path :
(this is without immediate values)

        cmpb    $0, __mark_ctx_switch.33881+8
        jne     .L2164

corresponding to :

 38c:   80 3d 08 00 00 00 00    cmpb   $0x0,0x8
 393:   0f 85 0c 03 00 00       jne    6a5 <schedule+0x43c>

(13 bytes in the fast path, including a memory reference)

With immediate values optimization, we do better :

        mov $0,%al
        testb   %al, %al
        jne     .L2164

Corresponding to :

 389:   b0 00                   mov    $0x0,%al
 38b:   84 c0                   test   %al,%al
 38d:   0f 85 0c 03 00 00       jne    69f <schedule+0x436>

(10 bytes in the fast path instead of 13, and we remove any memory reference)


Near the end of schedule, we find the jump target :


.L2164:
        movl    %ebx, 20(%esp)
        movl    -48(%ebp), %edx
        movl    %edx, 16(%esp)
        movl    %ecx, 12(%esp)
        movl    $.LC108, 8(%esp)
        movl    $0, 4(%esp)
        movl    $__mark_ctx_switch.33881, (%esp)
        call    *__mark_ctx_switch.33881+12
        jmp     .L2126


 6a5:   89 5c 24 14             mov    %ebx,0x14(%esp)
 6a9:   8b 55 d0                mov    -0x30(%ebp),%edx
 6ac:   89 54 24 10             mov    %edx,0x10(%esp)
 6b0:   89 4c 24 0c             mov    %ecx,0xc(%esp)
 6b4:   c7 44 24 08 f7 04 00    movl   $0x4f7,0x8(%esp)
 6bb:   00 
 6bc:   c7 44 24 04 00 00 00    movl   $0x0,0x4(%esp)
 6c3:   00 
 6c4:   c7 04 24 00 00 00 00    movl   $0x0,(%esp)
 6cb:   ff 15 0c 00 00 00       call   *0xc
 6d1:   e9 c3 fc ff ff          jmp    399 <schedule+0x130>

Which adds an extra 50 bytes.

With immediate values optimization, we have a total size of 

   text    data     bss     dec     hex filename
  46767    2956     200   49923    c303 kernel/sched.o

(baseline)

   text    data     bss     dec     hex filename
  46638    2924     200   49762    c262 kernel/sched.o

We add 129 bytes of text here. Which does not balance. We should add 60 bytes. I
guess some code alignment is the cause. Let's look at the size of schedule()
instead, since this is the only code I touch :

With immediate values optimization, with the marker :
00000269 <schedule>
...
0000086c <cond_resched_softirq>
1539 bytes

And without the marker :
00000269 <schedule>
...
0000082d <cond_resched_softirq>
1476 bytes

For an added 63 bytes to schedule, which balances modulo some alignment.

If we look at the surrounding of the added 50 bytes (label .L2164) at the end of
schedule(), we see the assembly :

....
.L2103:
        movl    -32(%ebp), %eax
        testl   %eax, %eax
        je      .L2101
        movl    $0, 68(%esi)
        jmp     .L2089
.L2106:
        movl    $0, -32(%ebp)
        .p2align 4,,3
        jmp     .L2089
.L2164:
        movl    %ebx, 20(%esp)
        movl    -48(%ebp), %edx
        movl    %edx, 16(%esp)
        movl    %ecx, 12(%esp)
        movl    $.LC108, 8(%esp)
        movl    $0, 4(%esp)
        movl    $__mark_ctx_switch.33909, (%esp)
        call    *__mark_ctx_switch.33909+12
        jmp     .L2126
.L2124:
        movl    -40(%ebp), %eax
        call    _spin_unlock_irq
        .p2align 4,,6
        jmp     .L2141
.L2161:
        movl    $1, %ecx
        movl    $2, %eax
        call    profile_hits
        .p2align 4,,4
        jmp     .L2069
.L2160:
        movl    -48(%ebp), %edx
        movl    192(%edx), %eax
        testl   %eax, %eax
        jne     .L2066
        movl    %edx, %eax
        call    __schedule_bug
        jmp     .L2066
....

Which are all targets of "unlikely" branches. Therefore, it shares a cache line
with these targets on architectures with associative L1 i-cache. I don't see
how this could be considered as "fast path".

Therefore, on a 3 arguments marker (with immediate values), the marker seems
to outperform ftrace on the following items :

- Adds 10 bytes to the fast path instead of 13.
- No memory read is required on the fast path when the marker is dynamically
  disabled.
- The added fast path code size does not depend on the number of parameters
  passed.
- The runtime cost, when dynamically disabled, does not depend on the number of
  parameters passed.

However, you are right in that the _total_ code size of the ftrace statement is
smaller, but since it is all located and executed in the fast path, even when
dynamically disabled, I don't see this as an overall improvement.

About the cost of code size required to handle the data afterward : it will be
amortized by a common infrastructure such as LTTng, where the same code will
translate the data received as parameter into a trace.

Regards,

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
--
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
[patch for 2.6.26 0/7] Architecture Independent Markers, Mathieu Desnoyers, (Thu Mar 27, 9:20 am)
Re: [patch for 2.6.26 1/7] Markers - define non optimized ma..., Mathieu Desnoyers, (Thu Mar 27, 9:04 pm)
[PATCH] Markers - remove extra format argument, Mathieu Desnoyers, (Thu Mar 27, 9:02 pm)
Re: [PATCH] Markers - remove extra format argument, Masami Hiramatsu, (Fri Mar 28, 1:35 am)
Re: [patch for 2.6.26 0/7] Architecture Independent Markers, Frank Ch. Eigler, (Thu Mar 27, 5:49 pm)
Re: [patch for 2.6.26 0/7] Architecture Independent Markers, Mathieu Desnoyers, (Thu Mar 27, 4:39 pm)
Re: [patch for 2.6.26 0/7] Architecture Independent Markers, Mathieu Desnoyers, (Sat Mar 29, 1:16 pm)
Re: [patch for 2.6.26 0/7] Architecture Independent Markers, Mathieu Desnoyers, (Fri Mar 28, 7:38 am)
Re: [patch for 2.6.26 0/7] Architecture Independent Markers, KOSAKI Motohiro, (Thu Mar 27, 1:08 pm)
Re: [patch for 2.6.26 0/7] Architecture Independent Markers, Frank Ch. Eigler, (Fri Mar 28, 9:40 am)
Re: [patch for 2.6.26 0/7] Architecture Independent Markers, Frank Ch. Eigler, (Fri Mar 28, 11:31 am)