Excruciatingly slow git-svn imports

To: git@vger.kernel.org List <git@...>
Date: Thursday, April 24, 2008 - 2:54 pm

I'm trying to import a 9.7G, 130K-revision svn repository, but it seems
to import only about 6K revisions per day on fast hardware using a
recent git (1.5.5).

This means about 20 days, or more if things slow down as the repo gets
bigger. Are there any tips/tricks on how to most efficiently convert
large repos?

I'm using the svn+ssh protocol for accessing the repository, but the
slowness seems due to local inefficiency. An strace -fcp <pid> run for
a minute gives the following results:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  52.46   21.392640       17607      1215           clone
  47.47   19.358882        3983      4860      3645 execve
   0.05    0.019571          16      1216           wait4
   0.01    0.003944           0     14582      1215 open
   0.01    0.002458           0     14580     12150 access
   0.00    0.000797           0      8500           write
   0.00    0.000694           0     26013           read
   0.00    0.000574           0      3693           munmap
   0.00    0.000513           0     20659           close
   0.00    0.000452           0     21918           mmap
   0.00    0.000353           0      1215           stat
   0.00    0.000234           0     12158      1215 lseek
   0.00    0.000155           0     17013           fstat
   0.00    0.000077           0      6075           mprotect
   0.00    0.000076           0      8511           rt_sigaction
   0.00    0.000074           0      6078      6078 ioctl
   0.00    0.000049           0      2432           unlink
   0.00    0.000033           0      2430           dup2
   0.00    0.000033           0      7293           fcntl
   0.00    0.000022           0      3681           brk
   0.00    0.000022           0      1215           getppid
   0.00    0.000019           0      1215           uname
   0.00    0.000019           0      1215           arch_prctl
   0.00    0.000000           0      1215           lstat
   0.00    0.000000           0      1216           pipe
   0.00    0.000000           0        22           mremap
   0.00    0.000000           0      2431           dup
   0.00    0.000000           0      1215           getcwd
   0.00    0.000000           0      2430           getdents64
------ ----------- ----------- --------- --------- ----------------
100.00   40.781691                196296     24303 total

So 99.93% of the time seems to be spent in clone/execve
(including actual work done by the forked programs).
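
For anyone who wants to check the arithmetic, here is a small sketch that recomputes the figures from the strace summary above. The numbers are copied from the table; the 60-second sampling window is an assumption based on "a minute", and the function name is made up for illustration.

```python
# Figures copied from the strace -c summary above.
TOTAL_SECONDS = 40.781691
CLONE_SECONDS = 21.392640
EXECVE_SECONDS = 19.358882
CLONE_CALLS = 1215
EXECVE_CALLS = 4860
EXECVE_ERRORS = 3645

# Assumption: the trace covered roughly 60 seconds of wall-clock time.
SAMPLE_SECONDS = 60.0


def summarize():
    """Derive the clone/execve share and per-process ratios from the table."""
    share = 100.0 * (CLONE_SECONDS + EXECVE_SECONDS) / TOTAL_SECONDS
    return {
        "clone_execve_share_pct": round(share, 2),
        # execve attempts made for each process that was created
        "execve_per_clone": EXECVE_CALLS / CLONE_CALLS,
        # how many of those attempts failed
        "failed_execve_per_clone": EXECVE_ERRORS / CLONE_CALLS,
        "processes_per_second": CLONE_CALLS / SAMPLE_SECONDS,
    }


stats = summarize()
for name, value in stats.items():
    print(name, value)
```

The ratios come out to exactly four execve attempts per clone, three of which fail, at roughly 20 new processes per second.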

In another trace, I found the following execve calls were made:
      22 execve("/homes/bosch/x86_64-linux/bin/git",
       2 execve("/homes/bosch/x86_64-linux/bin/git-commit-tree",
    2842 execve("/homes/bosch/x86_64-linux/bin/git-hash-object",
      22 execve("/opt/gnu/bin/git",
       2 execve("/opt/gnu/bin/git-commit-tree",
    2842 execve("/opt/gnu/bin/git-hash-object",
      22 execve("/opt/local/bin/git",
       2 execve("/opt/local/bin/git-commit-tree",
    2842 execve("/opt/local/bin/git-hash-object",
      22 execve("/opt/local/sbin/git",
       2 execve("/opt/local/sbin/git-commit-tree",
    2842 execve("/opt/local/sbin/git-hash-object",

I don't have git installed in any of /opt/gnu/bin, /opt/local/bin
or /opt/local/sbin. These three directories just happen to come before
the one containing git in my path:

bosch:~/git$ echo $PATH
/opt/gnu/bin:/opt/local/bin:/opt/local/sbin:/homes/bosch/x86_64-linux/bin ...
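
A rough sketch of why this ordering produces three failed execve calls per command: an execvp-style lookup tries each PATH entry in order until one succeeds. The code below only simulates that search. The function name and the set of installed binaries are hypothetical, and only the four PATH entries shown above are used, since the rest of the path is elided.

```python
def count_exec_attempts(path, program, installed):
    """Simulate a PATH search: return (failed_attempts, first_hit).

    `installed` is a set of absolute paths that exist; it stands in for
    the filesystem and is purely illustrative.
    """
    failed = 0
    for directory in path.split(":"):
        candidate = directory.rstrip("/") + "/" + program
        if candidate in installed:
            return failed, candidate
        failed += 1  # this attempt would fail with ENOENT
    return failed, None


# The four PATH entries visible above (the remainder is elided).
PATH = "/opt/gnu/bin:/opt/local/bin:/opt/local/sbin:/homes/bosch/x86_64-linux/bin"
# Assumption: git-hash-object exists only in the last of these directories.
INSTALLED = {"/homes/bosch/x86_64-linux/bin/git-hash-object"}

failed, hit = count_exec_attempts(PATH, "git-hash-object", INSTALLED)
print(failed, hit)
```

With these inputs the search fails three times before finding the binary, which would be consistent with the 3645 errors out of 4860 execve calls in the first trace.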

Before trying to brush up my Perl and propose patches for this
(I doubt the extra execve calls take much time at all), I was wondering
why we don't open a single stream to git-fast-import and have it do
the heavy lifting. Are there fundamental issues with this?
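
To make the idea concrete, here is a minimal sketch of what such a stream could look like. It is not how git-svn works; it only builds a fast-import command stream in memory for a couple of made-up revisions. The function names, the revision data and the committer identity are all hypothetical, and paths are assumed to need no quoting.

```python
def data_block(payload: bytes) -> bytes:
    """Encode a payload in fast-import's exact-byte-count 'data' format."""
    return b"data %d\n" % len(payload) + payload + b"\n"


def build_stream(revisions, ref="refs/heads/master"):
    """Turn a list of revision dicts into one fast-import command stream."""
    out = []
    for mark, rev in enumerate(revisions, start=1):
        out.append(b"commit %s\n" % ref.encode())
        out.append(b"mark :%d\n" % mark)
        out.append(b"committer %s %d +0000\n"
                   % (rev["committer"].encode(), rev["when"]))
        out.append(data_block(rev["message"].encode()))
        if mark > 1:
            # Chain each commit onto the previous one by its mark.
            out.append(b"from :%d\n" % (mark - 1))
        for path, content in rev["files"].items():
            # File contents travel inline, so no separate hash-object call.
            out.append(b"M 100644 inline %s\n" % path.encode())
            out.append(data_block(content))
        out.append(b"\n")
    return b"".join(out)


# Hypothetical revisions standing in for data fetched from svn.
revisions = [
    {"committer": "Example Author <author@example.com>", "when": 1209063240,
     "message": "import r1", "files": {"README": b"hello\n"}},
    {"committer": "Example Author <author@example.com>", "when": 1209063300,
     "message": "import r2", "files": {"README": b"hello again\n"}},
]
stream = build_stream(revisions)
```

Writing the resulting bytes to the standard input of one long-running git fast-import process would create all commits without spawning a git-hash-object or git-commit-tree process per object, which is the saving being asked about.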

   -Geert
