Re: pack operation is thrashing my server

To: Geert Bosch <bosch@...>
Cc: Nicolas Pitre <nico@...>, Andi Kleen <andi@...>, Ken Pratt <ken@...>, Shawn O. Pearce <spearce@...>, <git@...>, <danahow@...>
Date: Wednesday, August 13, 2008 - 1:13 pm

Hi Geert,

Last year I wrote the blob-size-threshold patch to which
Jakub Narebski referred.

I think there will eventually be a way to better handle large
objects in Git.  Some possible elements:
* Loose objects have a format which can be streamed
  directly into or out of packs.  This avoids a round-trip through zlib,
  which is a big deal for big objects.  This was the effect of the "new"
  loose object format to which Shawn referred.  It was apparently
  removed because it was ugly and/or difficult to maintain, which I
  didn't understand since I didn't personally suffer from it.
* Loose objects actually _are_ singleton packs, but saved
  in .git/objects/xx.  Workable, but it would never happen because of
  the extra pack header it would add at the beginning.  This
  takes advantage of the existing pack-to-pack streaming.
* Large loose objects are never deltified and/or never packed.
  The latter was the focus of my patch.
* Large loose objects are placed in their own packs in .git/packs.
  Doesn't work for me since I have too many large objects,
  which would slow down _all_ pack operations.

All this is complicated by the dual nature of packfiles --
they are used as a "wire format" for serial transmission,
as well as a database format for random access.
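
A minimal sketch of the size-based policy in the third item above
(never deltify and/or never pack large blobs).  This is not the actual
patch: the function name, the Action values and both threshold values
are made up for illustration.

```python
from enum import Enum


class Action(Enum):
    DELTA_AND_PACK = "delta_and_pack"   # normal treatment
    PACK_NO_DELTA = "pack_no_delta"     # store in a pack, skip delta search
    KEEP_LOOSE = "keep_loose"           # never put into a pack at all


# Hypothetical thresholds, chosen only for this example.
NO_DELTA_BYTES = 1 * 1024 * 1024
NO_PACK_BYTES = 64 * 1024 * 1024


def plan_for_blob(size, no_delta=NO_DELTA_BYTES, no_pack=NO_PACK_BYTES):
    """Decide how to treat a blob from its size alone."""
    if size >= no_pack:
        return Action.KEEP_LOOSE
    if size >= no_delta:
        return Action.PACK_NO_DELTA
    return Action.DELTA_AND_PACK


if __name__ == "__main__":
    for size in (4096, 8 * 1024 * 1024, 512 * 1024 * 1024):
        print(size, plan_for_blob(size).value)
```

The decision needs only the object size, so it can be made without
reading or inflating the blob.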

The "magic" entropy detection idea is cute, but probably not
needed -- using the blob size should be sufficient.  The cost of
(re)compressing an incompressible _smallish_ blob is probably
not worth avoiding, while any computation on sufficiently large
blobs should be avoided.
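
To make the trade-off concrete, here is a rough sketch of the two
checks side by side.  The entropy check has to read and count bytes,
while the size check is a single comparison.  The sample size, the
7.5 bits/byte cutoff, the 1 MiB threshold and all function names are
assumptions for illustration, not anything proposed in this thread.

```python
import math
from collections import Counter


def byte_entropy(data):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def looks_incompressible(data, sample_size=4096, cutoff=7.5):
    """Entropy-style check: must touch the data (here, a leading sample)."""
    return byte_entropy(data[:sample_size]) > cutoff


def too_big_to_bother(size, threshold=1024 * 1024):
    """Size-style check: one comparison, no I/O on the blob contents."""
    return size >= threshold


if __name__ == "__main__":
    text = b"hello world " * 1000
    flat = bytes(range(256)) * 16   # every byte value equally frequent
    print(looks_incompressible(text), looks_incompressible(flat))
    print(too_big_to_bother(len(text)), too_big_to_bother(2 * 1024 * 1024))
```

A high byte-entropy reading only suggests that zlib will gain little;
it says nothing about whether a delta against another version would
be small.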

Hopefully I can return to this problem after New Year's.  And
perhaps with the expanding Git userbase,  more people will have
"large blob" problems ;-) and there will be more interest in
better addressing this usage pattern.

At the moment, I am thinking about how to better structure
git's handling of very large repositories for a team connected
entirely by a high-speed LAN.  It seems a method where
each user has a repository with deep history, but shallow
blobs, would be ideal, but that's also very different from
how git does things now.

Have fun,

Dana How

On Wed, Aug 13, 2008 at 9:01 AM, Geert Bosch <bosch@adacore.com> wrote:



-- 
Dana L. How danahow@gmail.com +1 650 804 5991 cell