[PATCH, take 2] Speedup divides by cpu_power in scheduler

To: Ingo Molnar <mingo@...>, Andrew Morton <akpm@...>
Cc: <linux-kernel@...>
Date: Thursday, February 22, 2007 - 4:19 am

I noticed expensive divides in try_to_wake_up() and find_busiest_group() 
on a dual-processor, dual-core Opteron machine (4 cores total), moderately 
loaded (15,000 context switches per second).

oprofile numbers:

CPU: AMD64 processors, speed 2600.05 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit 
mask of 0x00 (No unit mask) count 50000
samples  %        symbol name
...
613914   1.0498   try_to_wake_up
   834   0.0013   :ffffffff80227ae1:   div    %rcx
 77513   0.1191   :ffffffff80227ae4:   mov    %rax,%r11

608893   1.0413   find_busiest_group
  1841   0.0031   :ffffffff802260bf:   div    %rdi
140109   0.2394   :ffffffff802260c2:   test   %sil,%sil


Some of these divides can use the reciprocal divides we introduced some time 
ago (currently used in slab, AFAIK).
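
For readers unfamiliar with the trick, here is a minimal user-space sketch of 
a reciprocal divide. It is modeled on the kernel's reciprocal_value() and 
reciprocal_divide() helpers but is not copied from the kernel tree; the 
assert() on the divisor is an addition for this sketch. The idea is to 
compute R = ceil(2^32 / k) once, when the divisor changes, and then replace 
each a / k with a multiply and a shift.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Precompute R = ceil(2^32 / k). Done once, whenever the divisor changes.
 * This sketch requires k >= 2: k == 0 is undefined, and for k == 1 the
 * value 2^32 does not fit in 32 bits.
 */
static inline uint32_t reciprocal_value(uint32_t k)
{
	assert(k >= 2);
	uint64_t val = (1ULL << 32) + (k - 1);
	return (uint32_t)(val / k);
}

/* Approximate a / k with one 32x32->64 multiply and a shift. */
static inline uint32_t reciprocal_divide(uint32_t a, uint32_t r)
{
	return (uint32_t)(((uint64_t)a * r) >> 32);
}
```

Because R is rounded up, the result matches a / k exactly as long as 
a * (k - 1) < 2^32; for larger dividends it can come out one too high, which 
should be acceptable for load-balancing heuristics but not for exact 
arithmetic.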

We can assume a load will fit in a 32-bit number: even with 
SCHED_LOAD_SCALE=128, that still leaves a theoretical limit of 33554432 
(2^32 / 128).

If we ever reach this limit, CPUs will probably have a fast hardware divide 
by then, and we can drop the reciprocal divide trick.

Ingo suggested renaming cpu_power to __cpu_power to make it clear that it 
should not be modified without also updating its reciprocal value.
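
To illustrate why the rename helps, here is a rough sketch of keeping the 
value and its reciprocal in sync behind accessors. The struct and function 
names (group_sketch, group_set_cpu_power, group_div_cpu_power) are 
hypothetical and chosen for this example only; they are not the names used 
in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a scheduler group; not the kernel's struct. */
struct group_sketch {
	uint32_t __cpu_power;          /* never written directly by callers */
	uint32_t reciprocal_cpu_power; /* ceil(2^32 / __cpu_power) */
};

/* The only writer of __cpu_power: refreshes the reciprocal in the same step. */
static inline void group_set_cpu_power(struct group_sketch *g, uint32_t power)
{
	assert(power >= 2); /* 0 is undefined; 1 would overflow 32 bits */
	g->__cpu_power = power;
	g->reciprocal_cpu_power =
		(uint32_t)(((1ULL << 32) + (power - 1)) / power);
}

/* load / __cpu_power via multiply and shift instead of a hardware div. */
static inline uint32_t group_div_cpu_power(const struct group_sketch *g,
					   uint32_t load)
{
	return (uint32_t)(((uint64_t)load * g->reciprocal_cpu_power) >> 32);
}
```

With this layout, any code that assigns __cpu_power directly stands out in 
review, since every legitimate update goes through the setter.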

I did not convert the divide in cpu_avg_load_per_task(), because tracking 
nr_running changes may not be worth it. We could use a static table of 32 
reciprocal values, but that would add a conditional branch and a table lookup.
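
For reference, the table-based alternative might look roughly like the 
sketch below. This is not part of the patch. The names (recip_table, 
recip_table_init, avg_load_per_task) are hypothetical, and returning 
total_load unchanged for 0 or 1 running tasks is an assumption made to keep 
the example small, not the kernel's behaviour.

```c
#include <stdint.h>

#define RECIP_TABLE_SIZE 32

/* Entries 0 and 1 stay unused: 2^32 / 1 does not fit in 32 bits. */
static uint32_t recip_table[RECIP_TABLE_SIZE];

/* Fill the table once at startup with ceil(2^32 / n) for n = 2..31. */
static void recip_table_init(void)
{
	for (uint32_t n = 2; n < RECIP_TABLE_SIZE; n++)
		recip_table[n] = (uint32_t)(((1ULL << 32) + (n - 1)) / n);
}

static uint32_t avg_load_per_task(uint32_t total_load, uint32_t nr_running)
{
	if (nr_running <= 1)            /* assumption: nothing to divide */
		return total_load;
	if (nr_running < RECIP_TABLE_SIZE) /* the extra branch and lookup */
		return (uint32_t)(((uint64_t)total_load *
				   recip_table[nr_running]) >> 32);
	return total_load / nr_running; /* fall back to a hardware divide */
}
```

The extra branch and the possible cache miss on the table are the costs 
mentioned above, which is why the plain divide was left in place.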

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>