I have some benchmark results comparing rsync to cvsup. I ran 12 client-side
tests over the last week: 5 against TheShell.com, 3 against AllBSD.org,
and 4 against chlamydia.fs.ei.tum.de. All tests mirrored the DragonFly
BSD source repository. The tests were done with repositories of various ages at
different times of the day and night, some with compression on and some with it
off. Each test was done by unpacking two identical copies of a given aged
repository, one to run the cvsup test on and one to run the rsync test on.
Then the rsync and cvsup tests were run back to back.
===============
* Environment *
===============
All tests were done in the following environment:
DragonFly 1.10.1-RELEASE system with a 1.11.0-DEVELOPMENT kernel.
CPU: Intel(R) Pentium(R) 4 CPU 1.60GHz
Internet connection: Cox cable
rsync version: 2.6.9
rsync directories: CVSROOT, doc, and src
rsync flags: --archive --hard-links --delete --verbose
+ --compress (on tests with compression on)
cvsup version: SNAP_16_1h
cvsup supfile: DragonFly-cvs-supfile that comes with DragonFly BSD
cvsup file collections:
dragonfly-cvs-root
dragonfly-cvs-src
dragonfly-cvs-doc
===========
* Summary *
===========
On average:
===========
cvsup took 3.76 times as long as rsync, making rsync 276% faster.
rsync consumed 13.5% of the CPU time on updates to an existing repository.
cvsup consumed 33.7% of the CPU time on updates to an existing repository.
Minimum performance difference:
cvsup took 1.34 times as long as rsync.
Maximum performance difference:
cvsup took 9.1 times as long as rsync.
rsync's best performance was on repositories that are less than a week old,
where there was not a large number of updates. In these cases it was usually
from 4 to 7 times faster than cvsup, except for one test from allbsd.org where
rsync was only about 40% faster.
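
The "times as long" and "% faster" figures in the summary can be related with a
small calculation. This is only a sketch: it assumes "N% faster" means the
speed ratio minus one, which is the reading that turns 3.76 into 276%. The
function name is made up for illustration.

```python
def percent_faster(time_ratio: float) -> float:
    """Turn 'cvsup took N times as long as rsync' into 'rsync is P% faster'.

    Assumption: 'P% faster' is the speed ratio minus one, as a percentage.
    """
    return (time_ratio - 1.0) * 100.0


# Ratios quoted in the summary: average, minimum, and maximum.
for label, ratio in [("average", 3.76), ("minimum", 1.34), ("maximum", 9.1)]:
    print(f"{label}: cvsup took {ratio}x as long, "
          f"rsync about {percent_faster(ratio):.0f}% faster")
```

Under that reading, the minimum ratio of 1.34 comes out at 34%, in the same
range as the "about 40% faster" AllBSD.org test mentioned above.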
On a Full download of the entire repository, ......

okay.. so like: you'd think with all of these repository copies flying around, there'd be a lot less flaming and a lot more coding going on.. enough! sheesh.. You people are making me want to write this email IN EMACS and, WE ALL KNOW HOW HORRIBLE EMACS IS !!! especially WHEN COMPARED TO VI !!! and also, just to finish this off (for now:) People on each side are LIKE NAZIS etc, etc, etc.
LOL! I think I'd rather have polio than either EMACS or VI....

But you are right - try for new and better solutions. On which I say again HAMMER fs to HAMMER fs will probably have neat syncing features inbuilt. Pulling from HAMMER fs to <other fs> OTOH, needs new tools if HAMMER features are to be used to best advantage. Eiffel should be good for coding that... (ducks and waddles away....) ;-)

Bill
The additional testing is nice to see, but you're not thinking the issues through far enough when it comes to scaling up a service like this. It's good to benchmark something which hasn't been tested in some time, but you have to do pretty extensive benchmarks if you're going to come to any sweeping conclusions for *all* uses of a program.

Let's say rsync takes 10% of the CPU on the client, and 10% of the CPU on a server. Let's say cvsup for the same update takes 15% CPU on the client, and 7% on the server. If your benchmarks ignore the load on the server, then they cannot possibly see problems which could occur when scaling up to more clients.

With a single client connecting to the server, *neither* side is the bottleneck. It might be that disk speed is the main bottleneck at that point. The update might take longer with cvsup due to the 15% CPU on the client, but the CPU isn't much of a *bottleneck* at that point.

But with 10 connections in my fake scenario, rsync could be using 100% of the CPU on the server. It's at this point that rsync will see some bottleneck, while cvsup would only be using 70% of a CPU. Yes, cvsup will be using much more on each client, but then each client shows up with its own CPU(s) to take up whatever load is thrown at that client. The server does not receive additional CPUs or network cards for each connection that it accepts.

Again, my feeling is that rsync is almost certainly fine for using with dragonfly's repository, given how much faster machines and networks have gotten, and how many simultaneous connections are seen for dragonfly repo-servers. It makes plenty of sense to stick with rsync if your servers are not overloaded. But if you want to prove rsync is better than cvsup for the loads that cvsup was *MEANT* to solve, then your tests are not extensive enough. Benchmarking a client/server setup like cvsup is a lot of work to get a complete picture.

Also note that you don't need to "prove" that rsync is "better". If ...
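
The fake scenario above can be written out as a few lines of arithmetic. This is a sketch only: the per-client percentages are the made-up ones from the scenario, not measurements, and it assumes server CPU demand grows linearly with the number of clients, which real servers may not do. The function names are invented for the example.

```python
def server_cpu_demand(per_client_pct: float, clients: int) -> float:
    """Total server CPU demand in percent, assuming purely linear scaling."""
    return per_client_pct * clients


def max_clients_before_saturation(per_client_pct: float,
                                  capacity_pct: float = 100.0) -> int:
    """Largest client count whose combined demand fits within one CPU."""
    return int(capacity_pct // per_client_pct)


# Hypothetical per-client server load from the scenario: rsync 10%, cvsup 7%.
scenario = {"rsync": 10.0, "cvsup": 7.0}

for tool, pct in scenario.items():
    print(f"{tool}: 10 clients need {server_cpu_demand(pct, 10):.0f}% of a CPU, "
          f"saturation after {max_clients_before_saturation(pct)} clients")
```

With those invented numbers the server runs out of CPU at 10 rsync clients but only at 14 cvsup clients, which is the point of the scenario: a client-side benchmark cannot show where that limit sits.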
The only minor thing I'd bring up is that I recall one reason for cvsup is that rsync placed a relatively higher load per client on the server. Of course, that complaint may date from when people only had 400MHz CPUs and older versions of rsync, so I doubt it's a strong reason to stay with cvsup any more.
That needs to be established. We already heard that cvsup - contrary to claims - is not competitive with rsync, on the client side. So I can very well believe that this is also true for the server side.

I for myself always notice that when syncing from chlamydia, the server basically traverses all 60k files *instantly*, while it takes quite some time on my desktop. So the load doesn't seem to be a problem once the directory

Quick test on chlamydia: rsync of already synced repo:

    % time rsync --delete -aH chlamydia.fs.ei.tum.de::dragonfly-cvs .
    rsync --delete -aH chlamydia.fs.ei.tum.de::dragonfly-cvs .  0.72s user 1.43s system 36% cpu 5.865 total

considering that rsync spends half of the time on the local side, that's < 3s of load on the server:

    53331 nobody 161 0 4536K 3888K select 0:00 355.68% 33.89% rsync

nobody cares about that. it might take some more cycles when transferring, but so what. seriously. I don't care, this is peanuts.

cheers
  simon

--
Serve - BSD +++ RENT this banner advert +++ ASCII Ribbon /"\ Work - Mac +++ space for low €€€ NOW!1 +++ Campaign \ / Party Enjoy Relax | http://dragonflybsd.org Against HTML \ Dude 2c 2 the max ! http://golden-apple.biz Mail + News / \
Hello Vincent,

Thank you for these thorough tests! We finally have some hard numbers to work with. I think it is obvious that rsync should be the preferred update mechanism if you want to download the cvs repository. Cvsup might still be better suited when only downloading the checked out sources.

To state it clearly for everybody:

=========================================================================
Use rsync to sync your repos! It is faster and can even be compiled!
=========================================================================

cheers
  simon
To state it even MORE clearly...

" ...so long as you do not give a damn about the extra load you are placing on the source server...."

Think about it. rsync predates CVSUP. If rsync plus a bit of scripting or 'steering' code was better 'all around'?

- cvsup would never have seen the light of day in the first place.
- NOR been adopted so *very* widely.
- NOR have remained in service for so long on so many projects.
- NOR survived challenges from 'Mercurial' and several other similar tools.

A vast supposition about rsync, backed up by half-vast testing doesn't change any of that. Not even with a nicely done write-up. It is all still one-ended.

Set up the repo you have mirrored as a source server. Instrument that server's load with 100 simultaneous rsync clients and again with 100 simultaneous cvsup clients. Post the results.

Bill
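
The client half of the test proposed above could be driven by something like the following. It is a rough sketch, not a finished benchmark: it only starts N copies of a command at once and records wall-clock time and exit status. Measuring the load on the server, which is the actual point, has to be done separately on the server itself. The command shown is a harmless stand-in (a Python process that exits immediately); a real run would replace it with the actual rsync or cvsup invocation and give each client its own target directory.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def run_client(cmd):
    """Run one client command; return (elapsed seconds, exit status)."""
    start = time.monotonic()
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return time.monotonic() - start, result.returncode


def run_concurrent(cmd, clients):
    """Start `clients` copies of `cmd` at roughly the same time."""
    with ThreadPoolExecutor(max_workers=clients) as pool:
        return list(pool.map(run_client, [cmd] * clients))


# Stand-in command so the sketch is safe to run anywhere (an assumption,
# not part of the proposed test): swap in the real sync command here.
placeholder_cmd = [sys.executable, "-c", "pass"]

results = run_concurrent(placeholder_cmd, clients=4)
elapsed = [seconds for seconds, _ in results]
failures = sum(1 for _, status in results if status != 0)
print(f"clients: {len(results)}  failures: {failures}  "
      f"slowest: {max(elapsed):.2f}s  mean: {sum(elapsed) / len(elapsed):.2f}s")
```

Running the same harness once per tool, while watching the server with top or similar, would give the two-sided numbers being asked for.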
wrong. there is no extra load. do you have numbers? if not, don't state this like a fact. I believe that cvsup will produce a higher load on the

unix predates windows, yet windows is in wide use. is it good? who

there are thousands of useless and stupid software products out there.

very? it's seemingly only used by the bsds, and of those only intensively

which challenges? mercurial is orthogonal. if you're stuck with cvs, you

oh come on, seriously. you believe that? how can you pull out a couple of shady arguments and shoot down a well-founded evaluation? that's not

I am running the server. I checked the load while syncing one client.

go ahead, make my day. I for sure don't have time for such useless experiments.

the choice is rsync. cvsup is dead. it is written in an unmaintainable language and it performs slower on the client. 'nuff said, no more data needed. the facts speak clearly, even if they don't cover the server part.

cheers
  simon
Simon 'corecode' Schubert wrote:

*snip*

Simon,

Your command of the *language* is superb. But it isn't about debating skills.

Test 100 simultaneous connections. Or Not. IDGASEW

Bill
