On 10/2/07, David Tweed <david.tweed@gmail.com> wrote:

Dictionary compression can be used without full-text indexes. It is just really easy to build the full-text index if the data is already dictionary compressed. Dictionary compression works for everything except binary or random data.

Git is already using a small-scale dictionary compressor via zip. I suspect doing a full-scale dictionary for a pack file and then using arithmetic encoding of the tokens would provide substantially more compression. The big win is having a single dictionary instead of a new dictionary each time zip is used.

When we were working on Mozilla, Mozilla changed licenses three times. The license text ended up taking about 30MB in the current scheme. With full dictionary compression this would reduce to a few KB.

More compression is good for git. It means we can keep more data in RAM and reduce download times. With current hardware it is almost always better to trade off CPU to reduce IO.

--
Jon Smirl
jonsmirl@gmail.com
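To make the "single dictionary instead of a new dictionary each time" point concrete, here is a rough sketch in Python. It is not git's pack format and not the full-scale dictionary plus arithmetic coding proposed above. It only uses zlib's preset-dictionary feature (the zdict argument) as a stand-in for a shared dictionary. zlib is the deflate library behind what the message calls zip. The license text, the helper names and the object count are all made up for illustration.

```python
import zlib

# Hypothetical stand-in for a license header repeated across many files.
LICENSE = (
    b"/* This program is free software; you can redistribute it and/or\n"
    b" * modify it under the terms of the GNU General Public License as\n"
    b" * published by the Free Software Foundation; either version 2 of\n"
    b" * the License, or (at your option) any later version. */\n"
)


def make_objects(count):
    """Build `count` small blobs that all start with the same license text."""
    return [LICENSE + b"int f%d(void) { return %d; }\n" % (i, i)
            for i in range(count)]


def size_independent(objects):
    """Total size when every object is deflated on its own (fresh state each time)."""
    return sum(len(zlib.compress(obj, 9)) for obj in objects)


def size_shared(objects, dictionary):
    """Total size when every object is deflated against one preset dictionary."""
    total = 0
    for obj in objects:
        comp = zlib.compressobj(level=9, zdict=dictionary)
        total += len(comp.compress(obj) + comp.flush())
    return total


def roundtrip_shared(obj, dictionary):
    """Compress and decompress one object using the same preset dictionary."""
    comp = zlib.compressobj(level=9, zdict=dictionary)
    packed = comp.compress(obj) + comp.flush()
    decomp = zlib.decompressobj(zdict=dictionary)
    return decomp.decompress(packed) + decomp.flush()


if __name__ == "__main__":
    objects = make_objects(1000)
    print("independent streams:", size_independent(objects), "bytes")
    print("shared dictionary:  ", size_shared(objects, LICENSE), "bytes")
```

With a shared dictionary, each object can refer back to the license text instead of encoding it again, so the repeated text costs roughly one copy (the dictionary itself) plus a short reference per object. The dictionary has to be stored once and be available to the reader, which is the trade-off a pack-wide dictionary would carry as well.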
