Con Kolivas, maintainer of the high-performance -ck patchsets [earlier story], contributed some interesting benchmark results that compare the performance of several Linux kernels under varying loads. To run the tests, he combined a kernel compile with a handy tool written by Bob Matthews of Red Hat called irman, described in the tool's README as an "Interactive Response Measurement and ANalysis thingy".
Con's tests compare four kernels head to head: 2.4.19, 2.4.19-ck7, 2.4.19-ck7-rmap, and 2.4.18-6mdk. The results compare four types of loads, showing in each how long kernel compilation took, and how much CPU remained available to the system.
I've been using a couple of Con's patch sets on my personal desktop computer, trying to get a feel for the differences between the -aa VM and the -rmap VM. What I found from normal usage is that both behave quite nicely until memory fills up, at which point the -rmap VM feels as if it offers much better performance. Con's recent benchmark results agree with what I "felt". As it stands, I am quite content with my Linux desktop server, currently running 2.4.19-ck7-rmap.
I came up with a very simple way of measuring responsiveness that gives me
numbers that are meaningful to me. What I've done is the old faithful kernel
compile and measured it under different loads to simulate the pc's ability to
perform under various loads. I have so far benchmarked 2.4.19 versus 2.4.19-ck7,
2.4.19-ck7-rmap and 2.4.18-6mdk(mandrake's kernel in 8.2). 2.5.34 has a dead
keyboard for me so I'm unable to test it as yet.
Here is the story so far:
No Load
Kernel            Time      %CPU
2.4.19            1:49.17   98%
2.4.19-ck7        1:47.66   97%
2.4.19-ck7-rmap   1:48.58   98%
2.4.18-6mdk       1:48.18   98%

Memory Load
Kernel            Time      %CPU
2.4.19            2:15.21   78%
2.4.19-ck7        1:55.88   92%
2.4.19-ck7-rmap   2:18.55   79%
2.4.18-6mdk       2:15.68   79%

IO Load
Kernel            Time      %CPU
2.4.19            3:00.76   58%
2.4.19-ck7        2:01.68   86%
2.4.19-ck7-rmap   2:05.95   83%
2.4.18-6mdk       3:01.48   58%

Process Load
Kernel            Time      %CPU
2.4.19            2:09.42   80%
2.4.19-ck7        1:53.52   92%
2.4.19-ck7-rmap   1:54.39   93%
2.4.18-6mdk       2:10.57   80%
Kernel compiles were done on the same config kernel, fresh boot etc.. on a
single PIII 1133 with make -j 4
The loads were taken from BMatthew's irman found here:
http://people.redhat.com/bmatthews/irman/
Unlike the original program I am not looking at average latencies (which by the
way are http://kernel.kolivas.net
I have yet to merge compressed cache fully without bugs, so -ck8 is still not
finished but R. De Castro is working hard to help me do it :)
Please send me your comments and please cc me to ensure I get your email.
Con Kolivas
From: Rik van Riel
Subject: Re: System response benchmarks in performance patches
Date: Fri, 13 Sep 2002 13:22:54 -0300 (BRT)
On Sat, 14 Sep 2002, Con Kolivas wrote:
> I came up with a very simple way of measuring responsiveness that gives
> me numbers that are meaningful to me. What I've done is the old faithful
> kernel compile and measured it under different loads to simulate the
> pc's ability to perform under various loads.
Absolutely wonderful. I'd love to see this easily scriptable
so we can just run it with one command, eg:
$ ./contest
Kernel            Time      %CPU
2.4.19            3:00.76   58%
2.4.19-ck7        2:01.68   86%
2.4.19-ck7-rmap   2:05.95   83%
2.4.18-6mdk       3:01.48   58%
Very interesting results. People benchmarking just one thing
at a time won't get variances anywhere near this big, while
real system workload is pretty much always multitasking.
I think I've finally found a benchmark that gives results which
are meaningful in the context of a multitasking system.
regards,
Rik
--
Bravely reimplemented by the knights who say "NIH".
http://www.surriel.com/ http://distro.conectiva.com/
From: Andrew Morton
Subject: Re: System response benchmarks in performance patches
Date: Fri, 13 Sep 2002 13:10:28 -0700
Con Kolivas wrote:
>
> I came up with a very simple way of measuring responsiveness that gives me
> numbers that are meaningful to me. What I've done is the old faithful kernel
> compile and measured it under different loads to simulate the pc's ability to
> perform under various loads. I have so far benchmarked 2.4.19 versus 2.4.19-ck7,
> 2.4.19-ck7-rmap and 2.4.18-6mdk(mandrake's kernel in 8.2). 2.5.34 has a dead
> keyboard for me so I'm unable to test it as yet.
Yes, this is a wonderful test. Very real-world, easy to do and it
tickles a few fairly serious performance problems which we have.
> ...
> The loads were taken from BMatthew's irman found here:
> http://people.redhat.com/bmatthews/irman/
I have issues with irman (I think - didn't read the code really
closely).
It appears to always perform file overwrites - seeking over files,
rewriting them.
This tends to cause best-case behaviour in the VM. The affected pages
are tucked up out of the way on the active list and we do quite well.
If instead the background application is writing _new_ files then
everything falls apart.
I'd suggest that you stick with the kernel compile as the workload,
and vary the background activity a bit. Try tiobench.
(oh, and try turning on everything in the `input' menu; that might
get the keyboard working again in 2.5)
/*******************************************
*
* Copyright 2001, Red Hat, Inc., all rights reserved.
*
* Author: Bob Matthews [email blocked]
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
*
*******************************************/
Q: What is irman?
A: Interactive Response Measurement and ANalysis thingy.
Q: How do I use irman?
A: Cd to /opt/irman and execute irman.
Q: How should I use irman?
A: You should use irman to compare the response times of otherwise
quiescent machines with identical hardware configurations running
different versions of software, or different hardware configurations
running identical versions of software.
Q: Can I use irman to compare two completely different hardware
configurations running two completely different versions of software?
A: Doing so will not generally produce any interesting results except
to tell you that different hardware configurations running different
software configurations tend to perform differently. It is typically
not necessary to perform tests to obtain such knowledge.
Q: How does irman work?
A: Irman works by measuring the IPC time between two processes under
various artificial loads. In the current implementation, we measure
the amount of time required for a process to write a character to a
pipe, have this character read by another process and then echoed back
to the original process over another pipe.
Q: What does the output of irman mean?
A: Irman measures the round-trip time described above several hundred
thousand times. It then computes the minimum, maximum and average of
these times, and outputs this information for each test, along with
the standard deviation of the observed times.
Q: Can I add different types of loads to irman?
A: Yes. Create a stand-alone process which induces the type of load
you are interested in. Ideally, the process should not require any
user interaction, since it is difficult to reproduce exact behavior
for these types of processes. The process should continue to generate
the load you are interested in until it receives a SIGHUP. At that
point, it should clean up and exit.
To make irman use your load-process, add the appropriate definitions
to irman.h and irman.c:fire_up_load().
Q: What types of loads does irman currently run?
A: 1 - Null load.
   2 - Memory load - Repeatedly reference 110% of RAM in a pattern
       designed to cause cache misses.
   3 - IO load - Read and write 1K chunks from random places in a file
       using multiple processes.
   4 - Process load - Fork and exec N processes, connected in a
       unidirectional ring by pipes. Insert M<<N chunks of data into
       the ring and pass them around.
Q: How do *you*, the author, use irman?
A: I, the author, use irman to compare the performance of experimental
kernels against known good ones. Typically, I install all the kernels
I am interested in on a single test box. I then sequentially reboot
the machine into each kernel, running irman immediately after the
reboot, and capturing the output of each run in a separate file.
After all the runs are complete, I massage all of the output files
using the Perl script "massage.pl", which is included in the irman
rpm. I then mail the results to my friends and loved ones.
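The round-trip measurement the README describes can be sketched in a few lines. This is an illustration only, not irman's code (irman's own source is not reproduced here): the function name measure_round_trips and the default sample count are assumptions made for the example, and it relies on os.fork, so it is Unix-only.

```python
import os
import statistics
import time


def measure_round_trips(samples=1000):
    """Time `samples` one-byte round trips between a parent and a forked child."""
    to_child_r, to_child_w = os.pipe()    # parent -> child
    to_parent_r, to_parent_w = os.pipe()  # child -> parent

    pid = os.fork()
    if pid == 0:
        # Child: echo every byte back until the parent closes its write end.
        os.close(to_child_w)
        os.close(to_parent_r)
        while True:
            byte = os.read(to_child_r, 1)
            if not byte:
                break
            os.write(to_parent_w, byte)
        os._exit(0)

    # Parent: send a byte, wait for the echo, record the elapsed time.
    os.close(to_child_r)
    os.close(to_parent_w)
    times = []
    for _ in range(samples):
        start = time.perf_counter()
        os.write(to_child_w, b"x")
        os.read(to_parent_r, 1)
        times.append(time.perf_counter() - start)

    os.close(to_child_w)  # EOF tells the child to exit
    os.waitpid(pid, 0)
    os.close(to_parent_r)

    return {
        "min": min(times),
        "max": max(times),
        "mean": statistics.mean(times),
        "stdev": statistics.stdev(times),
    }


if __name__ == "__main__":
    for name, seconds in measure_round_trips().items():
        print(f"{name:>5}: {seconds * 1e6:.1f} us")
```

Running the same script once on an idle machine and once with a background load gives the kind of comparison the README has in mind.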
irman was designed to test latency under different loads before a process starts. However, many of us agree that latency alone does not equate to system responsiveness. What I've done is extract the load-creating processes from irman and run a real-world test (a kernel compile) under those loads.
I've created a tarball which can be used to test kernels in the same way. However, because of /proc changes in 2.5, it can only test 2.4 kernels at the moment. Rik van Riel is helping me sort out this limitation. It can be downloaded from my website kernel.kolivas.net as contest-0.11 (first section of the FAQ).
Let me remind you that this test is designed to compare the system responsiveness of different kernels on the same machine; comparisons between machines are unhelpful.
The smaller the difference between the no-load time and the time under each load, the better, and the higher the CPU usage under load, the better.
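One way to put a number on "the smaller the difference, the better" is to divide each loaded compile time by the same kernel's no-load time. The sketch below does that for the No Load and IO Load tables in Con's first email. The ratio is an illustration chosen here, not something contest itself reports, and the helper names to_seconds and slowdown are made up for the example.

```python
def to_seconds(stamp):
    """Convert an 'M:SS.ss' time from the tables into seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)


# Times copied from the No Load and IO Load tables in Con's first email.
no_load = {
    "2.4.19": "1:49.17",
    "2.4.19-ck7": "1:47.66",
    "2.4.19-ck7-rmap": "1:48.58",
    "2.4.18-6mdk": "1:48.18",
}
io_load = {
    "2.4.19": "3:00.76",
    "2.4.19-ck7": "2:01.68",
    "2.4.19-ck7-rmap": "2:05.95",
    "2.4.18-6mdk": "3:01.48",
}


def slowdown(kernel):
    """Loaded compile time divided by unloaded compile time (1.0 is ideal)."""
    return to_seconds(io_load[kernel]) / to_seconds(no_load[kernel])


if __name__ == "__main__":
    for kernel in no_load:
        print(f"{kernel:<16} {slowdown(kernel):.2f}x")
```

By this measure, stock 2.4.19 takes about 1.66 times as long to compile under IO load as it does unloaded, while 2.4.19-ck7 takes about 1.13 times as long.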
Rmap felt better?
After reading Con's email to LKML, I was thinking it looked like the AA VM was more responsive than the rmap VM under memory load. But this is the opposite of Jeremy's observations, and he said his observations jibed with Con's tests. Now, after Con's comment on the story, I'm almost positive the results say the AA VM is more effective under memory load. Am I reading this wrong?
re: Rmap felt better?
Yes, you're reading it correctly. The rmap patch isn't really optimised yet; it just works, whereas -aa is about as optimised as it gets. 2.5.x promises a lot more from rmap.
New version and revealing benchmarks
Hi. I've just updated it to version 0.20 with substantial changes to the io load module, and the results are very revealing (check lkml). Oh and now there is data to support what Jeremy "feels" :)
for the lazy among us
Subject: Revealing benchmarks and new version of contest.
From: Con Kolivas
Date: Sep 15 2002
After my first incarnation of the responsiveness benchmark (contest), Rik helped
me get the memory load working for 2.5.x testing. Now Andrew Morton has helped
me improve the IO load. The previous IO load was "nice" to VM systems. Now for
IO load there are two separate tests. First it continually rewrites a file half
the size of the physical memory on the machine. Secondly it rewrites a file the
same size as the physical memory. Below are the new benchmarks with these loads:
Noload:
CPU Load:
Mem Load:
IO Load Half:
IO Load Full:
A quick reminder. Faster times are better, and higher cpu% is also better.
As you can see there are stark differences in these kernels, particularly the
mm4 changes. This time the -rmap VM shows us significant improvement under very
heavy IO load. Repeat tests show similar results.
The updated version of contest (v0.20) can be downloaded from my site:
http://kernel.kolivas.net (look under the FAQ).
Comments (and please cc me)?
Con.
P.S. !
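The two IO loads described above can be approximated as follows. This is a rough sketch, not contest's actual load code: it assumes each pass truncates the file and writes it again from the start (the email does not say whether contest truncates or overwrites in place), it assumes a Linux-style os.sysconf for reading the RAM size, and the names physical_memory_bytes and rewrite_file are invented for the example.

```python
import os


def physical_memory_bytes():
    """Physical RAM as reported by sysconf (available on Linux)."""
    return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")


def rewrite_file(path, size, passes, chunk=1 << 20):
    """Write `size` bytes to `path` from the start, `passes` times over."""
    block = b"\0" * chunk
    for _ in range(passes):
        # Assumption: "wb" truncates the file, so every pass allocates it afresh.
        with open(path, "wb") as f:
            remaining = size
            while remaining > 0:
                n = min(chunk, remaining)
                f.write(block[:n])
                remaining -= n
            f.flush()
            os.fsync(f.fileno())  # push the data to disk, not just the page cache
```

Passing physical_memory_bytes() // 2 as the size would correspond to the "half" load and physical_memory_bytes() to the "full" load. A real load process would keep looping until the benchmark tells it to stop rather than run a fixed number of passes.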
Subject: Re: Revealing benchmarks and new version of contest.
From: Paolo Ciarrocchi
Date: Sep 15 2002
> From: Con Kolivas
[...]
> Below are the new benchmarks with these loads:
Con,
I have different results:
HW is a HP omnibook6000, 256 MiB RAM, PIII@800
Ciao,
Paolo
re: New version and revealing benchmarks
> Oh and now there is data to support what Jeremy "feels" :)
Phew! ;^)
heh
Jeremy's existential quandary is narrowly abated. whew!
Stable version released
I've just released version 0.30 of this benchmark.
The tests are now unchanged from 0.22 and will hopefully remain that way unless other flaws in the load code are identified.
The code is now cleaner, the README more comprehensive and the times are in seconds only (for easier comparison).
It also now has a homepage http://contest.kolivas.net
How can I display CPU %??
'time'
try the 'time' command
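For reference, the percentage that GNU time prints is CPU time (user plus system) divided by elapsed wall-clock time. The sketch below computes the same figure for an arbitrary command. It is an illustration, not part of contest; the function name time_command is made up, and the resource module it uses is Unix-only.

```python
import resource
import subprocess
import sys
import time


def time_command(argv):
    """Run argv and return (elapsed_seconds, cpu_seconds, cpu_percent)."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    elapsed = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)

    # CPU seconds used by the command and any children it waited for,
    # in user mode plus kernel mode.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return elapsed, cpu, 100.0 * cpu / elapsed


if __name__ == "__main__":
    # A trivial stand-in workload; a kernel compile would be ["make", "-j", "4"].
    elapsed, cpu, percent = time_command([sys.executable, "-c", "sum(range(10**6))"])
    print(f"{elapsed:.2f}s elapsed, {cpu:.2f}s CPU, {percent:.0f}%")
```

A value well below 100% means the command spent much of its wall-clock time waiting, for example on disk IO or on other processes competing for the CPU.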
Question
Hi
I have a question please, which is: Does irman run on the SUSE operating system?