Re: Packfile can't be mapped

From: Shawn Pearce
Date: Sunday, August 27, 2006 - 7:47 pm

Jon Smirl <jonsmirl@gmail.com> wrote:

I'm going to try to get tree deltas written to the pack sometime this
week. That should compact this intermediate pack down to something
that git-pack-objects would be able to successfully mmap into a
32 bit address space.  A complete repack with no delta reuse will
hopefully generate a pack closer to 400 MB in size.  But I know
Jon would like to get that pack even smaller.  :)

I should point out that the input stream to fast-import was 20 GB
(completely decompressed revisions from RCS) plus all commit data.
The original CVS ,v files are around 3 GB.  A .tar.gz archive of
the ,v files is around 550 MB.  Going to only 1.7 GB without tree
or commit deltas is certainly pretty good.  :)
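
To put those sizes side by side, here is a rough back-of-the-envelope
comparison.  It is only a sketch: the figures are the rounded numbers
quoted above, treated as decimal MB, and the 20 GB input figure leaves
out the commit data.  The dictionary keys and the ratio_to_pack helper
are names made up for this illustration.

```python
# Approximate sizes from the message, in MB (decimal units, rounded).
SIZES_MB = {
    "fast-import input stream": 20_000,
    "original ,v files": 3_000,
    "intermediate pack": 1_700,
    ".tar.gz of ,v files": 550,
    "hoped-for full repack": 400,
}


def ratio_to_pack(name):
    """Size of `name` divided by the size of the intermediate pack."""
    return SIZES_MB[name] / SIZES_MB["intermediate pack"]


for name, size in SIZES_MB.items():
    print(f"{name:28s} {size:>7,d} MB  {ratio_to_pack(name):6.2f}x the pack")
```

On these numbers the pack is roughly a twelfth of the decompressed
input but still about three times the .tar.gz, which is the gap the
tree deltas and a full repack are meant to close.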


All of that says that aside from the 1.7 GB output file, fast-import
ran extremely well.  About 1.9 million objects were written into
the output pack file, with 41k duplicate trees (duplicate blobs
were removed by cvs2svn prior to fast-import so they don't appear).
200k commits were created across 1600 branches.  And we did it in
only 67 MB of memory.

We also had ~8000 LRU cache misses related to our branch data;
this just means that cvs2svn likes to frequently jump around
between branches rather than import an entire branch at a time.
Boosting the size of the LRU cache (at the expense of needing more
memory) should reduce those cache misses as well as 'Pack remaps'.
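
As an illustration of why a larger cache helps when the importer hops
between branches, here is a minimal LRU sketch.  It is not the code in
fast-import (which is C); BranchCache, count_misses and the load
callback are hypothetical names, and the per-branch data is a
stand-in dictionary.

```python
from collections import OrderedDict


class BranchCache:
    """Tiny LRU cache keyed by branch name (illustrative only)."""

    def __init__(self, capacity, load):
        self._capacity = capacity
        self._load = load              # called to rebuild an entry on a miss
        self._entries = OrderedDict()  # oldest entry first, newest last
        self.misses = 0

    def get(self, name):
        if name in self._entries:
            # Hit: mark this branch as most recently used.
            self._entries.move_to_end(name)
            return self._entries[name]
        # Miss: reload the branch data, then evict the least recently
        # used entry if the cache is now over capacity.
        self.misses += 1
        value = self._load(name)
        self._entries[name] = value
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)
        return value


def count_misses(capacity, accesses):
    """Replay a sequence of branch names and return the miss count."""
    cache = BranchCache(capacity, load=lambda name: {"name": name})
    for name in accesses:
        cache.get(name)
    return cache.misses
```

Replaying a stream that cycles through three branches with room for
only two entries misses on every access, while a capacity of three
misses only on the first touch of each branch.  That is the trade
described above: more memory for fewer reloads.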

I'd also like to clean up that pack remapping code and move it
into sha1_file.c.  It's an implementation of partial pack mapping
and it is apparently working quite well for us in fast-import.
It may help GIT deal with very large packs (e.g. 1.7 GB) on smaller
address space systems (e.g. 32 bit).
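
For readers unfamiliar with the idea, partial mapping can be sketched
like this: map only one window of the file at a time and remap when a
read falls outside it.  This is a simplified illustration in Python,
not the fast-import code; WindowedFile, its window_size parameter and
the remaps counter are invented for the example, and it keeps just a
single window.

```python
import mmap
import os


class WindowedFile:
    """Read-only file access through one mmap window at a time."""

    def __init__(self, path, window_size=1 << 20):
        self._fd = os.open(path, os.O_RDONLY)
        self._size = os.fstat(self._fd).st_size
        self._window_size = window_size
        self._map = None
        self._start = 0     # file offset where the current window begins
        self._length = 0    # number of bytes currently mapped
        self.remaps = 0     # how often a new window had to be mapped

    def _remap(self, offset, length):
        # mmap offsets must be a multiple of the allocation granularity,
        # so round the window start down to the nearest boundary.
        gran = mmap.ALLOCATIONGRANULARITY
        start = (offset // gran) * gran
        # Make the window big enough for this read, but never map
        # past the end of the file.
        want = max(self._window_size, offset - start + length)
        map_len = min(want, self._size - start)
        if self._map is not None:
            self._map.close()
        self._map = mmap.mmap(self._fd, map_len,
                              access=mmap.ACCESS_READ, offset=start)
        self._start, self._length = start, map_len
        self.remaps += 1

    def read(self, offset, length):
        if offset < 0 or length < 0 or offset + length > self._size:
            raise ValueError("read outside file")
        if length == 0:
            return b""
        inside = (self._map is not None
                  and offset >= self._start
                  and offset + length <= self._start + self._length)
        if not inside:
            self._remap(offset, length)
        rel = offset - self._start
        return self._map[rel:rel + length]

    def close(self):
        if self._map is not None:
            self._map.close()
            self._map = None
        os.close(self._fd)
```

With a window much smaller than the file, the address space in use
stays bounded by the window size no matter how large the pack grows;
the cost is a remap whenever access jumps to a distant offset.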


We're not confident that this import is completely valid yet.
We have a few translation issues we're still working on.  But now
that we have a complete pack going from start to finish we can start
to focus on those issues.  Especially since this entire process
(,v to .pack) takes less than half a day to run.

-- 
Shawn.