Re: Decompression speed: zip vs lzo

To: Nicolas Pitre <nico@...>
Cc: Linus Torvalds <torvalds@...>, Sam Vilain <sam@...>, Git Mailing List <git@...>, Johannes Schindelin <Johannes.Schindelin@...>, Marco Costalba <mcostalba@...>, Junio C Hamano <gitster@...>
Date: Friday, January 11, 2008 - 4:57 am

On Thu, Jan 10, 2008 at 09:30:59PM +0000, Nicolas Pitre wrote:

  Well, that wasn't a random assertion. I made it because I assumed that
a delta is usually less than a few hundred bytes, and since compression
is applied to each delta on its own, without any shared context, you end
up compressing 500 bytes at a time, which will seldom yield an excellent
compression ratio.
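
  To illustrate the point about context, here is a minimal sketch that
deflates the same data once as a whole and once in independent 500-byte
pieces. The sample text and the helper names (`deflated_size`,
`chunked_size`) are assumptions made up for the example; this is not how
git stores deltas, only a way to see the effect of small, context-free
inputs on zlib.

```python
import zlib


def deflated_size(data: bytes, level: int = 6) -> int:
    """Size of `data` after one independent zlib stream."""
    return len(zlib.compress(data, level))


def chunked_size(data: bytes, chunk: int = 500, level: int = 6) -> int:
    """Total size when every `chunk`-byte piece is deflated on its own,
    so no piece can refer back to bytes seen in an earlier piece."""
    return sum(
        deflated_size(data[i:i + chunk], level)
        for i in range(0, len(data), chunk)
    )


# Synthetic, repetitive sample text (an assumption, not real git deltas).
sample = b"static int foo(struct bar *b) { return b->baz; }\n" * 200

whole = deflated_size(sample)
pieces = chunked_size(sample)
print(f"raw={len(sample)} whole={whole} in-500-byte-pieces={pieces}")
```

  Each piece pays for its own zlib header and checksum and starts with
an empty dictionary, so the total for the pieces comes out larger than
the single stream.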


  Well, one could use the fact that deltas are not packed to avoid
copying them around, and that will _necessarily_ be a gain (you can
read them right where they have been mmapped, for instance). The numbers
that were given for git annotate used a compression level of `0', which
doesn't exploit that fact, and I wouldn't be surprised to see a
noticeable gain if one did.
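
  A rough sketch of what "read them where they have been mmapped" could
look like, assuming a made-up file layout with one stored payload
followed by one deflated copy of it. The file name and layout are
hypothetical and have nothing to do with the real pack format; the only
point is that the stored bytes can be used through a view into the
mapping, while inflating always produces a new buffer.

```python
import mmap
import os
import tempfile
import zlib

payload = b"a small object body\n" * 4

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: raw payload first, then a deflated copy.
    path = os.path.join(tmp, "objects.bin")
    deflated = zlib.compress(payload)
    with open(path, "wb") as f:
        f.write(payload + deflated)

    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        with memoryview(mm) as whole:
            # Stored data: the slice is a view into the mapping, no copy.
            with whole[:len(payload)] as view:
                zero_copy = view.obj is mm
                same = view == payload
            # Deflated data: zlib has to allocate a fresh output buffer.
            with whole[len(payload):] as packed:
                inflated = zlib.decompress(packed)

print(zero_copy, same, inflated == payload)
```

  The views are released explicitly before the mapping is closed,
because Python refuses to close an mmap that still has exported buffers.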

  And actually, maybe it's not the deltas that we should avoid packing,
but objects under a certain size (say, 512 bytes?), whatever their type,
and then have the code exploit that fact for real and avoid copies.
With this criterion, I expect the repository not to grow much larger
(I'd say quite a bit less than the 10% you had: even in the kernel there
_are_ some larger deltas, and we definitely lose space on them, so I'd
expect less than a 5% size variation), and I _think_ it's worth
investigating. At least I expect commands that go through a lot of small
objects (like blame or even log[0]) to see a 10 to 20% speed increase.
That is backed up by some experience I have in avoiding copies, though
in not-so-similar cases, so it may be less, and I'll stand corrected --
and disappointed, a bit.

  [0] If I'm correct, commit messages are "objects" on their own, and I
      don't expect them to be over 512 octets very often.
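
  The size criterion could be sketched as below. Everything about the
encoding is an assumption for illustration: the one-byte flag, the
`encode`/`decode` names and the constant are hypothetical, and the
512-byte limit is just the figure floated above, not a tuned value.

```python
import zlib

SMALL_OBJECT_LIMIT = 512  # bytes; the figure floated above, not tuned


def encode(body: bytes, limit: int = SMALL_OBJECT_LIMIT) -> bytes:
    """Hypothetical on-disk form: one flag byte, then raw or deflated body."""
    if len(body) < limit:
        return b"\x00" + body            # small: stored as-is
    return b"\x01" + zlib.compress(body)  # large: deflated as usual


def decode(blob):
    """Return the body; small objects come back as a view, not a copy."""
    view = memoryview(blob)
    if view[0] == 0:
        return view[1:]                   # zero-copy slice of `blob`
    return zlib.decompress(view[1:])      # fresh buffer from zlib


small = b"Fix typo in README\n"
large = b"x" * 4096

print(type(decode(encode(small))).__name__, len(encode(large)))
```

  Callers that only read the body can use the returned view directly;
only the objects above the limit pay for an inflate and a new buffer.
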
-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org