Re: rsync vs. cvsup benchmarks

Previous thread: wiki unbroken by Justin C. Sherrill on Tuesday, January 29, 2008 - 2:12 am. (11 messages)

Next thread: Re: rsync vs. cvsup benchmarks by Rahul Siddharthan on Wednesday, January 30, 2008 - 5:54 am. (5 messages)
To: <users@...>
Date: Wednesday, January 30, 2008 - 2:38 am

I have some benchmark test results comparing rsync to cvsup.  I did 12 client
side tests over the last week.  5 against TheShell.com, 3 against AllBSD.org,
and 4 against chlamydia.fs.ei.tum.de.  All tests were mirroring the DragonFly
BSD source repository.  The tests were done with various aged repositories at
different times of the day and night, some with compression on and some with it
off.  Each test was done by unpacking two identical copies of a given aged
repository, one to run the cvsup test on and one to run the rsync test on.
Then the rsync and cvsup tests were run back to back.
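The post does not say how each run was timed. As a rough sketch only, a back-to-back run could be measured with a small helper like the one below. The name time_command is hypothetical, and the choice to report wall time, child CPU time, and CPU percentage is an assumption, not the method used for these tests.

```python
import os
import subprocess
import time


def time_command(argv):
    """Run argv to completion; return (wall_seconds, cpu_seconds, cpu_percent).

    Hypothetical helper, not part of the original benchmark.
    cpu_seconds is user + system time of the child process, taken from
    the children_* counters of os.times() before and after the run.
    """
    before = os.times()
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    wall = time.perf_counter() - start
    after = os.times()
    cpu = (after.children_user - before.children_user) + (
        after.children_system - before.children_system
    )
    cpu_percent = 100.0 * cpu / wall if wall > 0 else 0.0
    return wall, cpu, cpu_percent
```

Calling it once with an rsync argument list and once with a cvsup argument list, each against its own unpacked copy of the repository, would give one pair of comparable numbers.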


===============
* Environment *
===============

All tests were done in the following environment:

DragonFly 1.10.1-RELEASE system with a 1.11.0-DEVELOPMENT kernel.
CPU:                Intel(R) Pentium(R) 4 CPU 1.60GHz
Internet connection: Cox cable
rsync version:      2.6.9
rsync directories:  CVSROOT, doc, and src
rsync flags:        --archive --hard-links --delete --verbose
                  + --compress (on tests with compression on)
cvsup version:      SNAP_16_1h
cvsup supfile:      DragonFly-cvs-supfile that comes with DragonFly BSD
cvsup file collections:
    dragonfly-cvs-root
    dragonfly-cvs-src
    dragonfly-cvs-doc


===========
* Summary *
===========

On average:
===========
    cvsup took 3.76 times as long as rsync, making rsync 276% faster.
    rsync consumed 13.5% of the cpu time on updates to an existing repository.
    cvsup consumed 33.7% of the cpu time on updates to an existing repository.

Minimum performance difference:
    cvsup took 1.34 times as long as rsync.
Maximum performance difference:
    cvsup took 9.1 times as long as rsync.
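
The "times as long" and "% faster" figures are two views of the same ratio. A minimal sketch of the conversion, assuming "N% faster" means (time ratio - 1) * 100, which matches the 3.76 and 276% pair above. The function name percent_faster is made up for illustration.

```python
def percent_faster(time_ratio):
    """Turn 'cvsup took time_ratio times as long' into 'rsync is N% faster'.

    Assumes N = (time_ratio - 1) * 100, which reproduces the
    3.76x / 276% pair quoted in the summary.
    """
    return (time_ratio - 1.0) * 100.0


# Ratios taken from the summary: average, minimum, maximum.
for label, ratio in [("average", 3.76), ("minimum", 1.34), ("maximum", 9.1)]:
    print(f"{label}: {ratio}x as long, rsync {percent_faster(ratio):.0f}% faster")
```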

rsync's best performance was on repositories that are less than a week old,
where there was not a large number of updates.  In these cases it was usually
from 4 to 7 times faster than cvsup, except for one test from allbsd.org where
rsync was only about 40% faster.  

On a Full download of the entire repository, ...
To: <users@...>
Date: Thursday, January 31, 2008 - 9:27 pm

...

okay.. so like:

you'd think with all of these repository copies flying around,
there'd be a lot less flaming and a lot more coding going on..

enough!

sheesh.. You people are making me want to write this email

IN EMACS

and,

WE ALL KNOW HOW HORRIBLE EMACS IS !!!

especially

WHEN COMPARED TO VI !!!

and also, just to finish this off (for now:)

People on each side are LIKE NAZIS

etc, etc, etc.
To: <users@...>
Date: Friday, February 1, 2008 - 1:46 am

LOL!

I think I'd rather have polio than either EMACS or VI....

But you are right - try for new and better solutions.

On which I say again HAMMER fs to HAMMER fs will probably have neat 
syncing features inbuilt.

Pulling from HAMMER fs to <other fs> OTOH, needs new tools if HAMMER's 
features are to be used to best advantage.

Eiffel should be good for coding that...

(ducks and waddles away....)

;-)

Bill
To: <users@...>
Date: Wednesday, January 30, 2008 - 5:55 pm

The additional testing is nice to see, but you're not thinking the
issues through far enough when it comes to scaling up a service like
this.  It's good to benchmark something which hasn't been tested in
some time, but you have to do pretty extensive benchmarks if you're
going to come to any sweeping conclusions for *all* uses of a program.

Let's say rsync takes 10% of the cpu on the client, and 10% of the
cpu on a server.  Let's say cvsup for the same update takes 15% CPU
on the client, and 7% on the server.  If your benchmarks ignore the
load on the server, then they can not possibly see problems which
could occur when scaling up to more clients.

With a single client connecting to the server, *neither* side is the
bottleneck.  It might be the disk-speed is the main bottleneck at that
point.  The update might take longer with cvsup due to the 15% CPU on
the client, but the CPU isn't much of a *bottleneck* at that point.

But with 10 connections in my fake scenario, rsync could be using 100%
of the CPU on the server.  It's at this point that rsync will see some
bottleneck, while cvsup would only be using 70% of a CPU.  Yes, cvsup
will be using much more on each client, but then each client shows up
with its own CPU(s) to take up whatever load is thrown at that client.
The server does not receive additional CPUs or network cards for
each connection that it accepts.
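
The fake scenario above can be written out as a toy model. This is only a sketch: it assumes server CPU use grows linearly with the number of simultaneous clients, and the 10% and 7% per-client figures are the made-up numbers from this message, not measurements. The function names are hypothetical.

```python
def server_cpu_percent(per_client_percent, clients):
    """Total server CPU use, assuming load adds up linearly per client."""
    return per_client_percent * clients


def max_clients_before_saturation(per_client_percent, capacity_percent=100):
    """Largest client count that stays at or below one full CPU."""
    return capacity_percent // per_client_percent


# Per-client server cost from the fake scenario (hypothetical values).
scenario = {"rsync": 10, "cvsup": 7}

for tool, cost in scenario.items():
    load = server_cpu_percent(cost, 10)
    limit = max_clients_before_saturation(cost)
    print(f"{tool}: 10 clients use {load}% CPU, saturates after {limit} clients")
```

Under these assumptions the client-side cost does not appear in the model at all, which is the point: only the per-client cost on the server decides when the server becomes the bottleneck.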

Again, my feeling is that rsync is almost certainly fine for using
with dragonfly's repository, given how much faster machines and
networks have gotten, and how many simultaneous connections are seen
for dragonfly repo-servers.

It makes plenty of sense to stick with rsync if your servers are not
overloaded.  But if you want to prove rsync is better than cvsup for
the loads that cvsup was *MEANT* to handle, then your tests are
not extensive enough.  Benchmarking a client/server setup like cvsup
is a lot of work to get a complete picture.

Also note that you don't need to "prove" that rsync is "better".  If ...
To: <users@...>
Date: Wednesday, January 30, 2008 - 9:46 am

The only minor thing I'd bring up is that I recall one reason for cvsup is
that rsync placed a relatively higher load per client on the server.

Of course, that complaint may date from when people only had 400MHz
CPUs and older versions of rsync, so I doubt it's a strong reason to stay
with cvsup any more.
To: <users@...>
Date: Wednesday, January 30, 2008 - 10:20 am

That needs to be established.  We already heard that cvsup - contrary to 
claims - is not competitive with rsync, on the client side.  So I can very 
well believe that this is also true for the server side.  I for myself 
always notice that when syncing from chlamydia, the server basically 
traverses all 60k files *instantly*, while it takes quite some time on my 
desktop.  So the load doesn't seem to be a problem once the directory 

Quick test on chlamydia:  rsync of already synced repo:

% time rsync --delete -aH chlamydia.fs.ei.tum.de::dragonfly-cvs .
rsync --delete -aH chlamydia.fs.ei.tum.de::dragonfly-cvs .  0.72s user 
1.43s system 36% cpu 5.865 total

considering that rsync spends half of the time on the local side, that's < 
3s of load on the server:

53331 nobody   161   0  4536K  3888K select   0:00 355.68% 33.89% rsync

nobody cares about that.  it might take some more cycles when transferring, 
but so what.  seriously.  I don't care, this is peanuts.
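
The arithmetic behind the numbers above, as a small sketch. It assumes the cpu figure printed by time is (user + system) / total, truncated to a whole percent, and it takes the "half of the time on the local side" split from this message as given rather than as something measured. The function names are hypothetical.

```python
def cpu_percent(user_s, system_s, total_s):
    """CPU share of wall-clock time, as a percentage."""
    return 100.0 * (user_s + system_s) / total_s


def server_side_seconds(total_s, local_fraction=0.5):
    """Wall time attributed to the server, given an assumed local share."""
    return total_s * (1.0 - local_fraction)


# Values from the time output quoted above.
user_s, system_s, total_s = 0.72, 1.43, 5.865

print(f"client cpu: {int(cpu_percent(user_s, system_s, total_s))}%")
print(f"server side: about {server_side_seconds(total_s):.2f}s")
```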

cheers
   simon

-- 
Serve - BSD     +++  RENT this banner advert  +++    ASCII Ribbon   /"\
Work - Mac      +++  space for low €€€ NOW!1  +++      Campaign     \ /
Party Enjoy Relax   |   http://dragonflybsd.org      Against  HTML   \
Dude 2c 2 the max   !   http://golden-apple.biz       Mail + News   / \
To: <users@...>
Date: Wednesday, January 30, 2008 - 7:54 am

Hello Vincent,


Thank you for these thorough tests!  We finally have some hard numbers to 
work with.  I think it is obvious that rsync should be the preferred 
update mechanism if you want to download the cvs repository.  Cvsup might 
still be better suited when only downloading the checked out sources.

To state it clearly for everybody:

=========================================================================

   Use rsync to sync your repos!  It is faster and can even be compiled!

=========================================================================

cheers
   simon

-- 
Serve - BSD     +++  RENT this banner advert  +++    ASCII Ribbon   /"\
Work - Mac      +++  space for low €€€ NOW!1  +++      Campaign     \ /
Party Enjoy Relax   |   http://dragonflybsd.org      Against  HTML   \
Dude 2c 2 the max   !   http://golden-apple.biz       Mail + News   / \
To: <users@...>
Date: Wednesday, January 30, 2008 - 10:45 am

To state it even MORE clearly...

" ...so long as you do not give a damn about the extra load you are 
placing on the source server...."


Think about it.

rsync predates CVSUP.

If rsync plus a bit of scripting or 'steering' code was better 'all around'?

- cvsup would never have seen the light of day in the first place.

- NOR been adopted so *very* widely.

- NOR have remained in service for so long on so many projects.

- NOR survived challenges from 'Mercurial' and several other similar tools.

A vast supposition about rsync, backed up by half-vast testing doesn't 
change any of that. Not even with a nicely done write-up.

It is all still one-ended.

Set up the repo you have mirrored as a source server.

Instrument that server's load with 100 simultaneous rsync clients and 
again with 100 simultaneous cvsup clients.

Post the results.

Bill
To: <users@...>
Date: Wednesday, January 30, 2008 - 11:20 am

wrong.  there is no extra load.  do you have numbers?  if not, don't state 
this like a fact.  I believe that cvsup will produce a higher load on the 

unix predates windows, yet windows is in wide use.  is it good?  who 

there are thousands of useless and stupid software products out there. 

very?  it's seemingly only used by the bsds, and of those only intensively 


which challenges?  mercurial is orthogonal.  if you're stuck with cvs, you 

oh come on, seriously.  you believe that?  how can you pull out a couple 
of shady arguments and shoot down a well-founded evaluation?  that's not 

I am running the server.  I checked the load while syncing one client. 

go ahead, make my day.  I for sure don't have time for such useless 
experiments.  the choice is rsync.  cvsup is dead.  it is written in an 
unmaintainable language and it performs slower on the client.  'nuff said, 
no more data needed.  the facts speak clearly, even if they don't cover 
the server part.

cheers
   simon

-- 
Serve - BSD     +++  RENT this banner advert  +++    ASCII Ribbon   /"\
Work - Mac      +++  space for low €€€ NOW!1  +++      Campaign     \ /
Party Enjoy Relax   |   http://dragonflybsd.org      Against  HTML   \
Dude 2c 2 the max   !   http://golden-apple.biz       Mail + News   / \
To: <users@...>
Date: Wednesday, January 30, 2008 - 1:35 pm

Simon 'corecode' Schubert wrote:

*snip*

Simon,

Your command of the *language* is superb.

But it isn't about debating skills.

Test 100 simultaneous connections.

Or Not.

IDGASEW


Bill