Re: Efficient x86 and x86_64 NOP microbenchmarks

From: Mathieu Desnoyers
Date: Wednesday, August 13, 2008 - 12:16 pm

* Linus Torvalds (torvalds@linux-foundation.org) wrote:

Yup, I agree. Actually, the tests I ran show that using jumps as nops
does not seem to be the best solution, even cycle-wise.


Yes, I am aware of these "high locality" effects. I use these tests as a
starting point to find out which nops are good candidates; those
candidates can then be validated with more thorough testing on real
workloads, which will suffer from a higher standard deviation.

Interestingly enough, P6_NOPS seems to be a poor choice at both the
macro and micro levels on the Intel Xeon (see
http://lkml.org/lkml/2008/8/13/253 for the macro-benchmarks).


As long as the whole kernel agrees on which instructions should be used
for frequently used nops, the instruction trace cache should behave
properly.


I assume the effect of an I$ miss to be the same for all the tested
scenarios (except on P4, and maybe except for the jump cases), given
that in each case we load 5 bytes' worth of instructions. Even
considering this, the results I get show that the choices made in the
current kernel might not be the best ones.


Yep. I think it may make a difference if we use jumps, but I doubt it
will change anything for the other nops. Still, having that
information would be good.

Some more numbers follow for older architectures.

Intel Pentium 3, 550MHz

NR_TESTS                                    10000000
test empty cycles :                        510000254
test 2-bytes jump cycles :                 510000077
test 5-bytes jump cycles :                 510000101
test 3/2 nops cycles :                     500000072
test 5-bytes nop with long prefix cycles : 500000107
test 5-bytes P6 nop cycles :               500000069 (current choice ok)
test Generic 1/4 5-bytes nops cycles :     514687590
test K7 1/4 5-bytes nops cycles :          530000012

Intel Pentium 3, 933MHz

NR_TESTS                                    10000000
test empty cycles :                        510000565
test 2-bytes jump cycles :                 510000133
test 5-bytes jump cycles :                 510000363
test 3/2 nops cycles :                     500000358
test 5-bytes nop with long prefix cycles : 500000331
test 5-bytes P6 nop cycles :               500000625 (current choice ok)
test Generic 1/4 5-bytes nops cycles :     514687797
test K7 1/4 5-bytes nops cycles :          530000273


Intel Pentium M, 2GHz

NR_TESTS                                    10000000
test empty cycles :                        180000515
test 2-bytes jump cycles :                 180000386 (would be the best)
test 5-bytes jump cycles :                 205000435
test 3/2 nops cycles :                     193333517
test 5-bytes nop with long prefix cycles : 205000167
test 5-bytes P6 nop cycles :               205937652
test Generic 1/4 5-bytes nops cycles :     187500174
test K7 1/4 5-bytes nops cycles :          193750161
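
The totals above are easier to compare once divided by NR_TESTS and
expressed relative to the empty loop. The following is only a sketch of
that arithmetic using the Pentium M figures copied from the table; the
dictionary layout and the function names are made up for illustration
and are not part of the benchmark itself.

```python
# Cycle totals copied from the "Intel Pentium M, 2GHz" table above.
NR_TESTS = 10_000_000

pentium_m_cycles = {
    "empty": 180000515,
    "2-bytes jump": 180000386,
    "5-bytes jump": 205000435,
    "3/2 nops": 193333517,
    "5-bytes nop with long prefix": 205000167,
    "5-bytes P6 nop": 205937652,
    "Generic 1/4 5-bytes nops": 187500174,
    "K7 1/4 5-bytes nops": 193750161,
}


def per_iteration(total_cycles, nr_tests=NR_TESTS):
    """Average cycles spent in one loop iteration."""
    return total_cycles / nr_tests


def overhead_vs_empty(results, nr_tests=NR_TESTS):
    """Cycles per iteration of each variant minus the empty-loop baseline."""
    base = per_iteration(results["empty"], nr_tests)
    return {
        name: per_iteration(cycles, nr_tests) - base
        for name, cycles in results.items()
        if name != "empty"
    }


if __name__ == "__main__":
    deltas = overhead_vs_empty(pentium_m_cycles)
    # Print the variants from cheapest to most expensive.
    for name, delta in sorted(deltas.items(), key=lambda kv: kv[1]):
        print(f"{name:30s} {delta:+.4f} cycles/iteration")
```

A delta can come out slightly negative (as for the 2-bytes jump here, or
for several variants on the Pentium 3 tables). That only says the loop
with that variant measured fewer cycles than the empty loop in that run.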


Intel Pentium 3, 550MHz

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 7
model name	: Pentium III (Katmai)
stepping	: 3
cpu MHz		: 551.295
cache size	: 512 KB
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 2
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips	: 1103.44
clflush size	: 32

Intel Pentium 3, 933MHz

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 8
model name	: Pentium III (Coppermine)
stepping	: 6
cpu MHz		: 933.134
cache size	: 256 KB
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 2
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips	: 1868.22
clflush size	: 32

Intel Pentium M, 2GHz

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 13
model name	: Intel(R) Pentium(R) M processor 2.00GHz
stepping	: 8
cpu MHz		: 2000.000
cache size	: 2048 KB
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 2
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 ss tm pbe nx bts est tm2
bogomips	: 3994.64
clflush size	: 64
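
To compare machines with different clocks, the cycle counts can be
converted to time using the "cpu MHz" values listed above. This is a
rough sketch only: it assumes the cycle counter ticks at the reported
"cpu MHz" rate, which the message does not state (and which may not hold
on the Pentium M if frequency scaling is active). The function name and
the data layout are hypothetical.

```python
NR_TESTS = 10_000_000

# (cpu MHz from the cpuinfo dumps, "test empty cycles" from the tables)
machines = {
    "Pentium 3, 550MHz": (551.295, 510000254),
    "Pentium 3, 933MHz": (933.134, 510000565),
    "Pentium M, 2GHz": (2000.000, 180000515),
}


def ns_per_iteration(total_cycles, cpu_mhz, nr_tests=NR_TESTS):
    """Convert a cycle total to nanoseconds per loop iteration.

    Assumes one counted cycle lasts 1 / cpu_mhz microseconds.
    """
    cycles_per_iter = total_cycles / nr_tests
    return cycles_per_iter / cpu_mhz * 1000.0


if __name__ == "__main__":
    for name, (mhz, cycles) in machines.items():
        print(f"{name:20s} {ns_per_iteration(cycles, mhz):8.3f} ns/iteration")
```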

Mathieu





-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68