On Thu, Jan 10, 2008 at 09:30:59PM +0000, Nicolas Pitre wrote:

Well, that wasn't a random assertion. I made it because I assumed that a delta is usually less than a few hundred bytes, and since compression is applied to each delta on its own, without context, you end up compressing about 500 bytes at a time, which will seldom give excellent compression ratios.

Well, one could use the fact that deltas are not compressed to avoid copying them around, and that will _necessarily_ be a gain (you can read them where they have been mmapped, for instance). The numbers that were given for git annotate use a compression level of `0', which doesn't exploit that fact, and I wouldn't be surprised to see a noticeable gain if one does.

And actually, maybe it's not the deltas that we should leave uncompressed, but objects under a certain size (say, 512 bytes?), whatever their type, and have the code exploit that fact for real and avoid copies. With this criterion, I expect the repository not to grow much larger (I'd say quite a bit less than the 10% you had: even in the kernel there _are_ some larger deltas, and we definitely lose space on them; I'd expect less than a 5% size variation), and I _think_ it's worth investigating. At least I expect commands that go through a lot of small objects (like blame or even log[0]) to show a 10 to 20% speed increase (backed up by some experience I have in avoiding copies in not-so-similar cases, though, so it may be less, and I'll stand corrected -- and disappointed, a bit).

[0] If I'm correct, commit messages are "objects" on their own, and I don't expect them to be over 512 octets very often.

-- 
·O·  Pierre Habouzit
··O  madcoder@debian.org
OOO  http://www.madism.org
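
To make the size criterion above concrete, here is a minimal sketch in Python. It is not git's code or its pack format: the one-byte `R`/`Z` flag, the function names and the `RAW_THRESHOLD` constant are all hypothetical, and the 512-byte cutoff is only the guess from the mail. It shows the idea that a small object stored raw can be handed back as a view into the buffer it lives in, while a compressed one has to be inflated into a fresh buffer.

```python
import zlib

# Hypothetical cutoff, taken from the "say, 512 bytes" guess above.
RAW_THRESHOLD = 512


def encode_object(payload: bytes, threshold: int = RAW_THRESHOLD) -> bytes:
    """Store small payloads raw and larger ones deflated.

    The leading flag byte (b"R" raw, b"Z" zlib) is an invented layout,
    used here only so the reader can tell the two cases apart.
    """
    if len(payload) < threshold:
        return b"R" + payload
    return b"Z" + zlib.compress(payload)


def read_object(stored: memoryview):
    """Return the payload of one stored object.

    Raw objects come back as a slice of the same buffer (no copy);
    compressed objects are inflated into a newly allocated bytes object.
    """
    flag = bytes(stored[:1])
    if flag == b"R":
        return stored[1:]  # memoryview slice: shares the underlying memory
    if flag == b"Z":
        return zlib.decompress(stored[1:])
    raise ValueError("unknown storage flag")


if __name__ == "__main__":
    commit_msg = b"Fix a typo in the README\n"
    blob = b"some larger file content\n" * 200

    for payload in (commit_msg, blob):
        stored = encode_object(payload)
        result = read_object(memoryview(stored))
        kind = "zero-copy view" if isinstance(result, memoryview) else "inflated copy"
        print(len(payload), "bytes ->", len(stored), "stored,", kind)
```

In a real reader the buffer would be an mmapped pack file rather than a `bytes` object; a `memoryview` over an `mmap` object can be sliced the same way, so the raw path would touch only the mapped pages. Whether that saves the 10 to 20% guessed above is exactly what would need measuring.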
