Mel Gorman recently announced the release of version 0.6 of VM Regress, "a regression, benchmarking and test tool for the Linux VM." Following this release, Mel posted the results of a series of four tests comparing the stock 2.4.19 VM with Rik van Riel's [earlier interview] 2.4.19-rmap14a VM [earlier story].
Regarding the results of these tests, Mel acknowledges that "It is hard to draw solid conclusions because large gaps still exist in the data but some can be drawn." His tests indicate no significant difference between the two VMs when there is sufficient memory (no swapping). However, when memory is low and memory references occur in a linear pattern, the stock VM appears to perform better. When memory references occur in other patterns with large anonymous page use, the rmap VM appears to perform better. Future versions of VM Regress will provide much more information, such as "the page age graphs [which] are on the way and will be available in VM Regress 0.7."
Building VM Regress requires the source for your running kernel, as it builds several modules using source and objects from that kernel tree. Once the modules are built and loaded into the running kernel, a new directory, /proc/vmregress, provides VM information and an interface for controlling VM Regress tests. The modules fall into two types: 'sense' and 'test'. The former offer a view into the kernel; the latter perform tests.
Project page: http://www.csn.ul.ie/~mel/projects/vmregress/
Download: http://www.csn.ul.ie/~mel/projects/vmregress/vmregress-0.6.tar.gz
This is the third public release of VM Regress. It is the beginnings of a
regression, benchmarking and test tool for the Linux VM. The web page has
an introduction and the project itself has quite comprehensive
documentation and commentary. It is still in its very early days but
there is a lot more in here than there was in 0.5.
This release had a lot of minor bug fixes in it including building with
highmem support and late 2.5 kernels. It has been heavily tested with both
2.4.19 and 2.4.19-rmap14a.
The first feature of note is that multiple instances of the same test can
now run but only one will output information to the proc entry. This will
allow 100 small instances of a test to run rather than one very large
instance.
The second item is the pagemap.o module. When read, it will print out all VMAs
of the reading process and print what pages are present/swapped in that region
in encoded format. A perl library is provided for decoding the information.
The third item is the introduction of the mapanon.o module. It exports four
proc interfaces for opening, reading, writing and closing memory mapped
regions. It is designed to be used by a benchmarking perl script
(bin/bench_mapanon.pl --man for details) for testing how quickly anonymous
pages are used within an mmaped region and illustrates what pages the
kernel decides to swap out. The report from the benchmark will show how
quickly pages were accessed, what pages were present/swapped in comparison
to how often a page was referenced and a graph of vmstat output. Tests are
currently running to measure the performance of 2.4.19 and 2.4.19-rmap14a.
The results will be posted when the tests complete.
The fourth item is a set of Perl libraries aimed at making the
development of new tests very easy. They cover a lot of the drudge work a test
has to do such as graphing, reading proc entries, decoding information and so
on. The manual has most of the details. All of VM Regress is designed to
be very easy to interface with so other tests can be easily developed.
The next step is to update mapanon to cover mmaped files as well as anonymous
memory. This is so that a simulated web server can be run, complete with bots
browsing web pages, similar to what Rik van Riel outlined in an email sent
to the list. This will help the tool serve both as a micro-analysis tool and
as an overall performance testing and benchmarking tool.
Further down the line is the development of statistical analysis tools for
examining different data sets, in particular the timing information the
bench_mapanon.pl script produces.
This is still very much in its early days and is expected to take a long time
to develop fully but it's at the point where it can produce useful figures.
Reasonably comprehensive documentation is available with the package and from
the webpage. Any feedback is appreciated.
Full changelog for 0.6
Version 0.6
-----------
o Allow multiple instances of tests to run. Only one will print to proc
o pagemap.o module will dump out address space with pages swapped/present
o mapanon.o benchmark, creates and references mmaped areas so that a script
  can simulate program behavior and see what the process space looks like
  after
o Created various benchmark perl scripts.
o Created various support perl modules for running tests in bin/lib/VMR
o Print out kernel messages
o Moved the pagemap decode perl routines to a library
o Fixed CONFIG_HIGHMEM compile error
o Fixed spinlock redefine errors
o Fixed use of KERNEL_VERSION macro
o Fixed various deadlocks
--
Mel Gorman
MSc Student, University of Limerick
http://www.csn.ul.ie/~mel
I ran a brief series of tests on a small crash box. The intention was to
see what sort of figures and conclusions could be gathered with VM Regress
in its current public release. VM Regress is the beginnings of a tool
that ultimately aims to answer questions about the VM by testing and
benchmarking individual parts of it. The conclusions drawn here are
extremely ad hoc, so take them with a very large grain of salt.
Four tests were run on each machine, each related to anonymous memory used in
an mmaped region. Two reference patterns were used: smooth_sin and
smooth_sin-random. Both sets show a sine curve when the number of times
each page is referenced is graphed (see the green line in the graph Pages
Present/Swapped). With smooth_sin, the pages are referred to in order.
With smooth_sin-random, the pages are referenced in a random order but the
number of times each page is referenced is the same.
Both patterns were tested with 2,000,000 page references made to an mmaped
region. The first memory mapped region was 25000 pages in size, about the
size of physical memory on the machine. The second was 50000 pages.
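As a rough sanity check on these sizes, the arithmetic can be sketched as below. The 4 KiB page size is an assumption (it is the usual page size on 32-bit x86, but the post does not state it), and the helper names are made up for illustration.

```python
# Assumption: 4 KiB pages; the post does not state the page size.
PAGE_SIZE = 4096
REFERENCES = 2_000_000  # page references per test, from the post


def region_mib(pages, page_size=PAGE_SIZE):
    """Size of a mapped region of `pages` pages, in MiB."""
    return pages * page_size / (1024 * 1024)


def mean_refs_per_page(references, pages):
    """Average number of references each page receives."""
    return references / pages


for pages in (25000, 50000):
    print(pages, "pages:",
          round(region_mib(pages), 1), "MiB,",
          mean_refs_per_page(REFERENCES, pages), "references per page on average")
```

Under that assumption the smaller region is a little under 100 MiB and the larger one roughly twice that, which fits the description of the first region being about the size of physical memory.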
Unfortunately, detailed statistical information is unavailable, but some
conclusions can still be drawn. The aim is to have statistical information
available by 0.9 at the latest.
Test 1 - smooth_sin_25000
http://www.csn.ul.ie/~mel/vmr/2.4.19/smooth_sin_25000/mapanon.html
http://www.csn.ul.ie/~mel/vmr/2.4.19-rmap14a/smooth_sin_25000/mapanon.html
Behaviour is pretty much comparable. The average page access times look
roughly the same, so at the very least the performance is similar. rmap14a
did perform faster but the test wasn't long enough to be conclusive. All
in all, when enough physical memory is available, rmap14a and stock will
perform roughly the same with a linear reference pattern.
Test 2 - smooth_sin-random_25000
http://www.csn.ul.ie/~mel/vmr/2.4.19/smooth_sin-random_25000/mapanon.html
http://www.csn.ul.ie/~mel/vmr/2.4.19-rmap14a/smooth_sin-random_25000/mapanon.html
Here, the average performance remains roughly the same. It is interesting
to note that rmap14a had periodic large access times to pages and it's
unclear why. Despite this, rmap14a still completed the test faster. So again,
with enough memory available, the performance remains roughly the same
even with a relatively random page reference pattern.
Test 3 - smooth_sin_50000
http://www.csn.ul.ie/~mel/vmr/2.4.19/smooth_sin_50000/mapanon.html
http://www.csn.ul.ie/~mel/vmr/2.4.19-rmap14a/smooth_sin_50000/mapanon.html
This test is interesting. Remember that the references are linear in
memory. At about the 1,000,000 page reference, physical memory is
exhausted. Both tests completed in the same time, so in "raw performance"
they would appear the same, but that is not so. The time access graph shows that
for most of the test, rmap14a performed much better on average except
for the occasional large spikes. At the end, it degrades very quickly but
is still faster than the stock kernel (about 300000 microseconds to
access a page), which the unscaled graphs show:
http://www.csn.ul.ie/~mel/vmr/2.4.19/smooth_sin_50000/mapanon-time-unscaled.png
http://www.csn.ul.ie/~mel/vmr/2.4.19-rmap14a/smooth_sin_50000/mapanon-time-unscaled.png
This would appear consistent with reports that the stock kernel degrades
slowly whereas rmap seems to fall apart really quickly in some situations.
It is suspected that the large periodic spikes are where the proper page
to select out is found but it's pure guesswork and VM Regress is not at
the point where it can investigate more.
The second point of note is the present pages at the end of the test.
Stock makes no attempt to keep certain pages in memory. When physical
memory is out, it swaps out entire processes unconditionally. rmap14a
tries to keep the proper pages in memory and the page reference vs
presence graph shows that it did. Stock has a large block of pages
present, whereas rmap14a had swapped out some pages from the beginning of the
test.
In this case, stock just happened to swap out correctly because the pages
removed were not going to be used again.
Test 4 - smooth_sin-random_50000
http://www.csn.ul.ie/~mel/vmr/2.4.19/smooth_sin-random_50000/mapanon.html
http://www.csn.ul.ie/~mel/vmr/2.4.19-rmap14a/smooth_sin-random_50000/mapanon.html
With this test, the page references are in random order so determining
which page to remove is much more difficult. rmap14a completed this test
almost 10 minutes quicker than stock.
The average time for the stock kernel is consistently bad. I am guessing
that this is because the kernel consistently ends up swapping out the
entire process. rmap has periods of quick accesses with unfortunately large
spikes because it is trying to keep the right pages in memory and a lot of
the time gets it right. This is better than the stock kernel, which never keeps
the right pages in memory.
Conclusion
It is hard to draw solid conclusions because large gaps still exist in the
data but some can be drawn. I am sure an experienced VM developer will be
able to draw much more reliable conclusions :-)
First, when enough physical memory is available, rmap and stock perform
more or less the same, so no appreciable overhead is introduced for
normal anonymous memory use.
Second, when memory is tight, the type of memory reference behaviour will
determine how well or badly the two will perform. With a strictly linear
pattern, stock will perform better because it just dumps all the old pages
en masse. I seriously doubt this reference pattern is common.
For other patterns with large anonymous page use, rmap is more likely to
perform better because it tries to keep anonymous pages in memory. Even
with a totally random pattern, it'll perform reasonably well.
Lastly, it is obvious from the tests that for deciding which page to swap,
age is more important than frequency but that is already known. The page
age graphs are on the way and will be available in VM Regress 0.7.
--
Mel Gorman
MSc Student, University of Limerick
http://www.csn.ul.ie/~mel
From: Daniel Phillips
Subject: Re: 2.4.19 Vs 2.4.19-rmap14a with anonymous mmaped memory
Date: Mon, 26 Aug 2002 14:08:53 +0200
On Monday 26 August 2002 00:22, Mel Gorman wrote:
> Four tests were run on each machine, each related to anonymous memory used in
> an mmaped region. Two reference patterns were used: smooth_sin and
> smooth_sin-random. Both sets show a sine curve when the number of times
> each page is referenced is graphed (see the green line in the graph Pages
> Present/Swapped). With smooth_sin, the pages are referred to in order.
> With smooth_sin-random, the pages are referenced in a random order but the
> number of times each page is referenced is the same.
Could you please provide pseudocode, to specify these reference patterns
more precisely?
--
Daniel
From: Mel Gorman
Subject: Re: 2.4.19 Vs 2.4.19-rmap14a with anonymous mmaped memory
Date: Mon, 26 Aug 2002 16:13:54 +0100 (IST)
On Mon, 26 Aug 2002, Daniel Phillips wrote:
> Could you please provide pseudocode, to specify these reference patterns
> more precisely?
>
Rather than providing pseudo code, here is a link to the actual function
that generates the smooth_sin references
http://www.csn.ul.ie/~mel/vmr/smooth_sin.html
It is really crude and written to generate any type of data until I
found the time to generate more realistic data which is a project in
itself. Anyone who wants to generate better data only has to edit the
References.pm file.
It takes three inputs:
references - number of references to generate
range - the size in pages of the region to reference
output - the output filename
The function has three parts:
part 1: Plot a sin wave so that the sum of all the integer values of each
part of it would generate enough references to satisfy at least
half of the requested number
part 2: Starting at the beginning of the range, reference each page in a
linear pattern until all the required references are generated
part 3: Dump all references to disk
Now that I think of it, it would have made more sense to begin with the
linear reference pattern and then generate the sin curve but seeing as
this pattern is nothing resembling real life, I didn't worry about it too
much. It is probably something I should change as it would illustrate
better what pages are kept in memory.
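For readers who cannot reach the linked function, the three parts can be sketched roughly as follows. This is an illustrative Python reconstruction from the description only, not the actual Perl code in References.pm: the half-wave sine shape, the integer amplitude search, the truncation on overshoot and the one-page-per-line output format are all assumptions, and the names smooth_sin and dump_references are simply borrowed for the example.

```python
import math


def smooth_sin(references, page_range):
    """Return a list of page indices following the described pattern."""
    # Part 1: scale a half sine wave until the integer counts per page
    # add up to at least half of the requested references (assumed shape).
    target = (references + 1) // 2
    shape = [math.sin(math.pi * (page + 0.5) / page_range)
             for page in range(page_range)]
    amplitude = 0
    counts = [0] * page_range
    while sum(counts) < target:
        amplitude += 1
        counts = [int(amplitude * s) for s in shape]

    refs = []
    for page, count in enumerate(counts):
        refs.extend([page] * count)   # pages are referenced in order
    del refs[references:]             # guard against overshooting the total

    # Part 2: walk the range linearly until all references are generated.
    page = 0
    while len(refs) < references:
        refs.append(page)
        page = (page + 1) % page_range
    return refs


def dump_references(refs, output):
    """Part 3: write one page index per line (assumed file format)."""
    with open(output, "w") as handle:
        for page in refs:
            handle.write(f"{page}\n")


refs = smooth_sin(1000, 50)
print(len(refs), "references generated")
```

A call such as dump_references(smooth_sin(2000000, 25000), "smooth_sin_25000") would correspond to the first test's parameters; the output filename there is only a placeholder.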
smooth_sin-random
http://www.csn.ul.ie/~mel/vmr/randomize_references.txt
This is a perl script for randomizing an input file. It takes an input
file generated by the smooth_sin function and outputs a randomized version
of it. It is pretty simple:
1. For each input reference, output a random number between 0 and range
followed by the input reference
2. Sort the file numerically with sort. This will effectively randomize the
input
3. Reread the randomized input and strip away the generated random number
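The three steps amount to a decorate, sort, undecorate shuffle. A minimal in-memory sketch in Python is shown below; it is not the linked Perl script, the function name is hypothetical, and it works on a list rather than on files passed through sort.

```python
import random


def randomize_references(refs, page_range, rng=None):
    """Shuffle refs by tagging each with a random key and sorting on it."""
    if rng is None:
        rng = random.Random()
    # Step 1: pair every reference with a random number between 0 and range.
    keyed = [(rng.randint(0, page_range), ref) for ref in refs]
    # Step 2: sort numerically on the random key only.
    keyed.sort(key=lambda pair: pair[0])
    # Step 3: strip the key again, leaving the reordered references.
    return [ref for _, ref in keyed]


original = [0, 0, 1, 2, 2, 2, 3]
shuffled = randomize_references(original, 4, random.Random(1))
print(shuffled)
```

The reordering changes only the order, so each page keeps its reference count. Note that with keys drawn from 0 to range and far more references than pages, many references share a key; Python's sort is stable and keeps such ties in input order, whereas an external sort may order ties differently, so the result here is not guaranteed to match the script's output.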
--
Mel Gorman
MSc Student, University of Limerick
http://www.csn.ul.ie/~mel
woohoo!
Wow. This is great news. Now that linux has a VM system (and jfs support!), linux can be used for "mission-critical" stuff, like downloading kde themes, and chatting in irc!
It looks like Xenix just got some serious competition!
Mod
This is why a moderation system is good.
re: Mod
Agreed. There's work being done on a new comment module for Drupal. When it's ready I'll be upgrading, restoring the ability for users to moderate comments.
re: Mod
Excellent :-)
Do you really care that much?
Nero,
Do you really need a mod system? Can't you just ignore that one post? If it was hard to separate the wheat from the chaff a mod system might be useful. But when a large story gets 15 comments is it really needed?
Yes
Unless this is you doing the trolling, I don't see why you're against a moderation system. If you don't want to use the moderation system when it is enabled, don't. Is it that hard to ignore? Myself, I like not having to mentally sift through these things, and to read real comments.
I agree with gncluster
I don't think this is a busy enough site to warrant a moderation system. So he's a troll, ignore him. By even responding we're lowering ourselves. For especially heinous replies like his perhaps it's best to simply delete rather than worry about moderation. At least that's my thoughts...
~Christopher
loki_x btw, until I register my name
deletion
deletion is fine, if it actually gets done.
I thought...
I thought it was quite funny