[PATCH 0/3] git-svn and temporary file improvements

From: Marcus Griep
Date: Monday, August 11, 2008 - 8:53 am

This patch series concerns temp file usage within git-svn, with possible
extensions applicable to other Perl auxiliary functions.

The first patch allows a central "registry" of temp files to be maintained.
It offers both locking and non-locking constructs, depending on how much
complexity the caller wants to take on. The provided functions are also
documented for perldoc.
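
To make the registry idea concrete, here is a minimal sketch of a locking
registry. It is written in Python purely for illustration and is not the Perl
code in the patch; the class name TempFileRegistry, the methods acquire() and
release(), and the name "svn_delta" are all hypothetical. A non-locking
variant would simply skip the "already in use" check.

```python
import tempfile


class TempFileRegistry:
    """Hypothetical registry: one reusable temp file per name."""

    def __init__(self):
        self._files = {}      # name -> open temp file object
        self._locked = set()  # names currently handed out to a caller

    def acquire(self, name):
        """Return the temp file registered under `name` and lock it."""
        if name in self._locked:
            raise RuntimeError(f"temp file '{name}' is already in use")
        f = self._files.get(name)
        if f is None:
            # Created only on first request; later requests reuse it.
            f = tempfile.TemporaryFile(prefix=f"{name}_")
            self._files[name] = f
        self._locked.add(name)
        return f

    def release(self, name, truncate=True):
        """Unlock `name`; optionally empty the file so it can be reused."""
        if name not in self._locked:
            raise RuntimeError(f"temp file '{name}' is not in use")
        if truncate:
            f = self._files[name]
            f.seek(0)
            f.truncate()
        self._locked.discard(name)
```

With this shape, a caller that acquires and releases "svn_delta" once per
imported file touches a single temp file for the whole run, however many
files or deltas are processed.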

The second patch changes git-svn to use the central registry from the first
patch, reducing the number of temp files created and destroyed during a normal
run of git-svn. The asymptotic bound on the number of temp files needed drops
from O(n+m) to O(1), where n is the number of files imported and m is the
number of file deltas. In real terms, this change does not significantly reduce
the time required for an import, as other concerns, such as network and disk
i/o, dominate over inode/MFT changes. However, an incremental reduction of ~10%
in system time was found on large changesets; in a large repository of small
changesets, the reduction fell to approximately 3%.

The third patch modifies the way git-svn handles symlinks versus normal files
imported from svn. Currently, git-svn is very inefficient in this respect,
duplicating entire files solely to strip the first five bytes of the file if
it is a symlink. This causes a large amount of unnecessary disk i/o, even
considering that most of it takes place in in-memory buffers. By eliminating
the unnecessary duplication for normal files, a 48% reduction in system time
and a 33% reduction in user time were realized on large changesets. Over many
commits with small changesets, other operations dominate, but an incremental
6% reduction was still noted. In addition, both cases showed a 15-25%
reduction in maximum resident set size.
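
The idea behind the third patch can be sketched as follows. This is an
illustration in Python, not the Perl code in the patch: the function name
blob_source() and the is_symlink flag are hypothetical, and the sketch assumes
the five bytes in question are the "link " prefix that svn stores in front of
a symlink's target. Normal files are handed back untouched; only the small
symlink payload is copied.

```python
import io

# Assumption: the five-byte prefix svn places before a symlink target.
LINK_PREFIX = b"link "


def blob_source(fh, is_symlink):
    """Return a file object positioned at the content to store in git.

    Normal files are returned as-is (no copy). For symlinks, only the
    target path after the prefix is copied into a new in-memory buffer.
    """
    fh.seek(0)
    if not is_symlink:
        return fh
    prefix = fh.read(len(LINK_PREFIX))
    if prefix != LINK_PREFIX:
        raise ValueError("symlink content does not start with 'link '")
    # The read position is now just past the prefix; copy only the target.
    return io.BytesIO(fh.read())
```

Because a symlink target is a short path, the copy on the symlink branch is
cheap, while the common case of a normal file involves no duplication at all.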

Logs and results of the benchmarks along with the procedure used are available
at http://blog.xpdm.us/2008/08/git-svn-and-temporary-files.html.

Marcus Griep (3):
      Git.pm: Add faculties to allow temp files to be cached
      git-svn: Make it scream by minimizing temp files
      git-svn: Reduce temp file usage when dealing with non-links

 git-svn.perl |   84 ++++++++++++++++++++--------------
 perl/Git.pm  |  145 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 192 insertions(+), 37 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html