Re: git and larger trees, not so fast?

To: moe <moe-git@...>
Cc: <git@...>
Date: Thursday, August 9, 2007 - 1:11 pm

On Thu, 9 Aug 2007, moe wrote:

Good catch. Definitely not acceptable performance.

We seem to spend a lot of our time in memcpy:

	samples  %        image name               app name                 symbol name
	200527   25.4551  libc-2.6.so              libc-2.6.so              _wordcopy_bwd_aligned
	104505   13.2660  libc-2.6.so              libc-2.6.so              _wordcopy_fwd_aligned
	99185    12.5907  libz.so.1.2.3            libz.so.1.2.3            (no symbols)
	83452    10.5935  libc-2.5.so              libc-2.5.so              (no symbols)
	54203     6.8806  git                      git                      assign_blame
	46153     5.8587  git                      git                      read_directory_recursive
	27665     3.5118  git                      git                      handle_split
	21385     2.7146  vmlinux                  vmlinux                  blk_complete_sgv4_hdr_rq
	20745     2.6334  git                      git                      read_packed_refs
	12709     1.6133  git                      git                      builtin_diffstat
	7829      0.9938  git                      git                      show_patch_diff
	...

but the silly thing is, this is only true if you give the filenames 
explicitly!

Lookie here:

	[torvalds@woody bummer]$ date >50/500
	[torvalds@woody bummer]$ time git commit -a -m 'expose the turtle'
	Created commit 25ca22d: expose the turtle
	 1 files changed, 1 insertions(+), 1 deletions(-)
	
	real    0m4.612s
	user    0m4.224s
	sys     0m0.412s

	[torvalds@woody bummer]$ date >50/500
	[torvalds@woody bummer]$ time git commit -m 'expose the turtle' 50/500
	Created commit 009f6b5: expose the turtle
	 1 files changed, 1 insertions(+), 1 deletions(-)
	
	real    0m12.464s
	user    0m12.129s
	sys     0m0.336s

i.e. we take almost three times longer (12.5s vs 4.6s) when explicitly 
naming the file than when just using "git commit -a". Oops.

That said, even the 4.6 seconds is really not acceptable: this is on a 
good 2.6GHz Core 2 Duo too, so on weaker hardware it would be quite 
painful.

I haven't looked at *why* it's that slow, but it's not anything really 
fundamental, since the basic operations are fast:

	[torvalds@woody bummer]$ time git add 50/500

	real    0m0.064s
	user    0m0.048s
	sys     0m0.016s

	[torvalds@woody bummer]$ time git write-tree
	7480230419e510c93082a4a19e23d928a426973a
	
	real    0m0.069s
	user    0m0.048s
	sys     0m0.024s

	[torvalds@woody bummer]$ time git diff
	
	real    0m0.127s
	user    0m0.000s
	sys     0m0.000s

so it's not the "lstat()" that we do on all files, or the write-tree 
(which are all O(n) in files, with a rather small constant), but some 
O(n**2) behaviour elsewhere.
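
The linear-versus-quadratic distinction can be checked empirically: time the same operation at two tree sizes and estimate the exponent k in t ~ n**k. What follows is a minimal sketch only. The helper name is hypothetical and the timings are made up for illustration; none of them were measured in this thread.

```python
import math


def growth_exponent(n1, t1, n2, t2):
    """Estimate k in t ~ n**k from two (size, seconds) measurements."""
    return math.log(t2 / t1) / math.log(n2 / n1)


if __name__ == "__main__":
    # Hypothetical timings: doubling the file count doubles the time (k near 1).
    print(growth_exponent(50_000, 0.06, 100_000, 0.12))
    # Hypothetical timings: doubling the file count quadruples it (k near 2).
    print(growth_exponent(50_000, 1.0, 100_000, 4.0))
```

An exponent near 1 fits the O(n) operations timed above, while one near 2 would point at the quadratic behaviour.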

And all the expense seems to be not in the commit itself, but in

	[torvalds@woody bummer]$ time git 'runstatus' '--nocolor'

	real    0m4.208s
	user    0m4.068s
	sys     0m0.140s

and that thing seems to suck really really hard.

Doing an ltrace on it shows tons and tons of:

	...
	strlen("35")
	strlen("349")
	calloc(1, 72)
	memcpy(0x73034e, "10/", 3)
	memcpy(0x730351, "349", 4)
	memmove(0x2ab637f41e80, 0x2ab637f41e78, 781768)
	...

but I haven't looked at where they come from yet.
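
One possible reading of the memmove line in that trace, offered as an assumption rather than a diagnosis: the source and destination are 8 bytes apart and the length is 781768 bytes, which is what opening one pointer-sized slot near the front of an array of about 97721 pointers would look like (781768 / 8 = 97721). If entries are inserted one at a time into a sorted array like that, the total bytes moved grow quadratically. The Python sketch below only models that cost. The function names and file names are hypothetical, and it is not git's code.

```python
import bisect

POINTER_SIZE = 8  # bytes; assumes a 64-bit build, matching the 8-byte shift in the trace


def bytes_moved_by_sorted_inserts(names):
    """Insert names one at a time into a sorted list and count the bytes a
    C implementation would memmove to open a slot for each new pointer."""
    table = []
    moved = 0
    for name in names:
        pos = bisect.bisect_left(table, name)
        # Every entry after pos shifts up by one pointer-sized slot.
        moved += (len(table) - pos) * POINTER_SIZE
        table.insert(pos, name)
    return moved


def worst_case_names(n):
    """Hypothetical names in descending order, so each insert lands at the
    front of the table and shifts everything already there."""
    return [f"{i:06d}" for i in range(n - 1, -1, -1)]


if __name__ == "__main__":
    for n in (1000, 2000, 4000):
        print(n, bytes_moved_by_sorted_inserts(worst_case_names(n)))
```

In this worst case the total is 4 * n * (n - 1) bytes, so doubling the number of entries roughly quadruples the copying. Names that arrive already sorted move nothing, which would fit a cost that depends on how the command is invoked.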

		Linus