On 9/6/08, Linus Torvalds wrote:
When I was playing with those giant Mozilla packs, the speed of zlib
wasn't a big problem. The number one problem was the repack process
exceeding 3GB, which forced me to get 64-bit hardware and 8GB of
memory. If a repack starts swapping, kill it; otherwise it will
probably take a month to finish.
I'm forgetting the numbers now but on a quad core machine (with git
changes to use all cores) and 8GB I believe I was able to repack the
Mozilla repo in under an hour. At that point I believe I was being
limited by disk IO.
Size and speed are not unrelated. By reducing the pack size in half
you reduce the IO and memory demands (cache misses) a lot. For example,
if we went to no compression we'd be killed by memory and IO
consumption. It's not obvious to me what the best trade-off for git is
without trying several compression algorithms and comparing them. They
were feeding 100MB into PAQ on that site; I don't know what PAQ would
do with a bunch of 2K objects.
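
A rough way to start that kind of comparison is to compress many small
objects independently and total up the sizes. The sketch below uses
Python's zlib only; the synthetic 2K objects, the word list, and the
helper names make_objects and packed_size are made up for illustration
and are not git code or real Mozilla data. Real numbers would have to
come from real blobs and from whichever other compressors are being
considered.

```python
import random
import zlib


def make_objects(count=200, size=2048, seed=0):
    """Build synthetic ~2K text-like objects (hypothetical stand-ins for blobs)."""
    rng = random.Random(seed)
    words = [b"static", b"int", b"return", b"struct", b"void",
             b"if", b"else", b"nsresult", b"const", b"char"]
    objects = []
    for _ in range(count):
        buf = bytearray()
        while len(buf) < size:
            buf += rng.choice(words) + b" "
        objects.append(bytes(buf[:size]))
    return objects


def packed_size(objects, level):
    """Total bytes when every object is deflated on its own at `level`."""
    return sum(len(zlib.compress(obj, level)) for obj in objects)


objects = make_objects()
raw = sum(len(obj) for obj in objects)

# Level 0 stores the data uncompressed (plus framing); 1, 6 and 9 trade
# CPU time for size.
sizes = {level: packed_size(objects, level) for level in (0, 1, 6, 9)}

for level, total in sizes.items():
    print(f"level {level}: {total} bytes ({total / raw:.2f} of raw)")
```

Each object is compressed separately here, so a compressor that needs
a lot of input before it does well gets no chance to warm up, which is
the concern with small objects.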
Most delta chains in the Mozilla data were easy to process. There was
a single 2000-delta chain that consumed 15% of the total CPU time to
process. Something causes performance to fall apart on really long
chains.
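
I don't know the actual cause. As a toy model only, and not a
description of what git does, here is how the work grows if every
object in a chain is rebuilt from the full base with no cache of
intermediate results. The function name delta_applications and the
cache flag are invented for this sketch.

```python
def delta_applications(chain_length, cache=False):
    """Count delta applications needed to rebuild every object in one chain.

    Toy model: object 0 is stored whole, and object d (1..chain_length)
    is stored as a delta against object d-1.
    """
    if cache:
        # Each object is rebuilt from its already reconstructed parent.
        return chain_length
    # Without a cache, rebuilding object d starts at the base and
    # applies d deltas, so the total is 1 + 2 + ... + chain_length.
    return sum(range(1, chain_length + 1))


for depth in (50, 2000):
    print(depth, delta_applications(depth), delta_applications(depth, cache=True))
```

In this model the uncached cost grows with the square of the chain
length, so one very long chain can outweigh many short ones. Whether
that is what happened with the Mozilla chain is a guess.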
> > Turning a 500MB packfile into a 250MB has lots of advantages in IO
--
Jon Smirl
jonsmirl@gmail.com
