On Thu, Jan 10, 2008 at 07:02:39AM +0000, Sam Vilain wrote:

Well, lzma is excellent for *big* chunks of data, but not that impressive for
small files:

$ ll git.c git.c.gz git.c.lzma git.c.lzop
-rw-r--r-- 1 madcoder madcoder 12915 2008-01-09 13:47 git.c
-rw-r--r-- 1 madcoder madcoder  4225 2008-01-10 10:00 git.c.gz
-rw-r--r-- 1 madcoder madcoder  4094 2008-01-10 10:00 git.c.lzma
-rw-r--r-- 1 madcoder madcoder  5068 2008-01-10 09:59 git.c.lzop

And lzma performs really badly if you have little memory available. The "big"
secret of lzma is that it basically works with a huge window to check for
repetitive data, and even decompression needs a fair amount of memory, making
it a really bad choice for git IMNSHO.

Though I don't agree with you (and some others) that gzip is fast enough. It's
clearly a bottleneck in many log-related commands where you would expect them
to be IO bound rather than CPU bound. LZO seems like a fairer choice,
especially since most of the gain comes from compressing the biggest blobs,
aka the heads of the delta chains. It's really unclear to me whether we gain
anything by compressing the deltas, trees, and other smallish pieces of
information.

And when it comes to times, for a file big enough to give numbers, here are
the decompression times (best of 10 runs, smaller is better; the second number
is the size of the packed data; the original data was 7.8 MB):

 * lzma: 0.374s (2.2 MB)
 * gzip: 0.127s (2.9 MB)
 * lzop: 0.053s (3.2 MB)

For a 300k original file:

 * lzma: 0.022s (124 KB)
 * gzip: 0.008s (144 KB)
 * lzop: 0.004s (156 KB) /* most of the samples were actually 0.005 */

What is obvious to me is that lzop seems to take about 10% more space than
gzip, while being around 1.5 to 2 times faster. Of course this is very sketchy
and a real test with git will be better.

-- 
·O·  Pierre Habouzit
··O  madcoder@debian.org
OOO  http://www.madism.org
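For anyone wanting to repeat the "best of 10 runs" decompression measurement, here is a minimal sketch in Python. It is not the procedure used for the numbers above, which presumably came from the command-line tools. It assumes only the standard library, so it compares zlib (the deflate algorithm behind gzip) with lzma and leaves LZO out, since Python ships no LZO module. Note also that `lzma.compress` writes the .xz container by default, not the legacy .lzma format. The helper names `best_of` and `compare`, and the sample input, are made up for illustration.

```python
import lzma
import time
import zlib


def best_of(fn, runs=10):
    """Return the smallest wall-clock time (seconds) over `runs` calls of fn()."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def compare(data, runs=10):
    """Map codec name -> (packed size in bytes, best decompression time)."""
    codecs = {
        "gzip (zlib)": (zlib.compress, zlib.decompress),
        "lzma": (lzma.compress, lzma.decompress),
    }
    results = {}
    for name, (pack, unpack) in codecs.items():
        packed = pack(data)
        # Sanity check: the round trip must give back the original bytes.
        assert unpack(packed) == data
        results[name] = (len(packed), best_of(lambda: unpack(packed), runs))
    return results


if __name__ == "__main__":
    # Hypothetical sample input: repetitive text standing in for a source file.
    sample = b"static int handle_alias(int *argcp, const char ***argv);\n" * 5000
    for name, (size, secs) in compare(sample).items():
        print(f"{name}: {secs:.6f}s ({size} bytes)")
```

Taking the minimum instead of the mean filters out scheduler and cache noise, which matters when single runs take only a few milliseconds, as with the 300k file above.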
