Re: [PATCH] fast-export: ensure we traverse commits in topological order

Previous thread: none

Next thread: [PATCH] Fix file mark handling and sort side-effects in git.el by Brent Goodrick on Tuesday, February 10, 2009 - 11:12 pm. (2 messages)
From: newren
Date: Tuesday, February 10, 2009 - 11:03 pm

From: Elijah Newren <newren@gmail.com>

Signed-off-by: Elijah Newren <newren@gmail.com>
---
I'm not certain this testcase will actually trigger the bug for everyone,
since I don't know the order of commits used by git rev-list.  It does
trigger it for me.  What you need is a history that looks like
    B--D
   /  /
  A--C
with the master branch pointing at D, and have "git rev-list --all" show
these commits in the order
  C
  D
  B
  A
If you have such a repository, "git fast-export --all" will give
instructions to create a master branch that only contains D, B, and A.
 t/t9301-fast-export.sh |   29 +++++++++++++++++++++++++++++
 1 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/t/t9301-fast-export.sh b/t/t9301-fast-export.sh
index 9985721..a1ee400 100755
--- a/t/t9301-fast-export.sh
+++ b/t/t9301-fast-export.sh
@@ -259,4 +259,33 @@ test_expect_success 'cope with tagger-less tags' '
 
 '
 
+test_expect_success 'avoid squashing merges' '
+
+	mkdir repo &&
+	cd repo &&
+	git init &&
+	echo fee-fi-fo-fum > giant &&
+	git add giant &&
+	git commit -m "Initial commit" &&
+	git branch alternate_root &&
+	echo hello > world &&
+	git add world &&
+	git commit -m "Commit on master" &&
+	git checkout alternate_root &&
+	echo foo > bar &&
+	git add bar &&
+	git commit -m "Commit on alternate_root" &&
+	git checkout master &&
+	git merge alternate_root &&
+	MASTER=$(git rev-parse --verify refs/heads/master) &&
+	rm -rf ../new &&
+	mkdir ../new &&
+	git --git-dir=../new/.git init &&
+	git fast-export --all |
+	(cd ../new &&
+	 git fast-import &&
+	 test $MASTER = $(git rev-parse --verify refs/heads/master))
+
+'
+
 test_done
-- 
1.6.0.6

--

From: newren
Date: Tuesday, February 10, 2009 - 11:03 pm

From: Elijah Newren <newren@gmail.com>

fast-export will only list as parents those commits which have already
been traversed (making it appear as if merges have been squashed if not
all parents have been traversed).  To avoid this silent squashing of
merge commits, we request commits in topological order.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin-fast-export.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/builtin-fast-export.c b/builtin-fast-export.c
index e9ee2c7..fdf4ae9 100644
--- a/builtin-fast-export.c
+++ b/builtin-fast-export.c
@@ -514,6 +514,7 @@ int cmd_fast_export(int argc, const char **argv, const char *prefix)
 
 	get_tags_and_duplicates(&revs.pending, &extra_refs);
 
+	revs.topo_order = 1;
 	if (prepare_revision_walk(&revs))
 		die("revision walk setup failed");
 	revs.diffopt.format_callback = show_filemodify;
-- 
1.6.0.6

--

From: Johannes Schindelin
Date: Wednesday, February 11, 2009 - 3:25 am

Hi,


ACK.

Ciao,
Dscho
--

From: Mike Ralphson
Date: Wednesday, February 11, 2009 - 3:48 am

Any comparative timings? We don't need to rename this to 'git
reasonably-speedy-export'? 8-)

Mike
--

From: Johannes Schindelin
Date: Wednesday, February 11, 2009 - 3:59 am

Hi,


What we _could_ do is add proper error handling: fast-export should 
recognize that it did not see the commit yet, and error out suggesting 
using --topo-order with fast-export.

Ciao,
Dscho

--

From: Jeff King
Date: Wednesday, February 11, 2009 - 6:56 am

Hmm.

In git.git:

  $ time git fast-export --all --signed-tags=strip >/dev/null
  real    1m6.013s
  user    1m3.840s
  sys     0m2.140s

  $ time git fast-export --all --signed-tags=strip --topo-order >/dev/null
  real    0m49.018s
  user    0m47.987s
  sys     0m0.888s

I certainly didn't expect it to be _faster_.  More efficient use of the
delta cache, maybe?

-Peff
--

From: Johannes Schindelin
Date: Wednesday, February 11, 2009 - 8:18 am

Hi,


Or a warm against a cold cache?

Hides,
Dscho

--

From: Elijah Newren
Date: Wednesday, February 11, 2009 - 8:23 am

Yeah, I also saw a speed-up, though the difference was less pronounced
-- 19.7 seconds versus 19.5 (that was the average for 5 runs, after
first doing a run that I ignored to make sure I was comparing
hot-cache to hot-cache).

I'm a little surprised why my numbers were so much lower than yours,
though.  I don't think of my hard drive and CPU as being all that
quick -- is this a difference in packing, by chance, with your
repository not having recently been repacked?
--

From: Jeff King
Date: Wednesday, February 11, 2009 - 10:37 am

Nope. See how the CPU is more or less pegged in the numbers above? Cold
cache looks like this:

  $ echo 3 >/proc/sys/vm/drop_caches
  $ time git fast-export --all --signed-tags=strip >/dev/null
  real    1m11.148s
  user    1m3.952s
  sys     0m0.760s

So only 7 extra seconds paging in. Hooray for packing. :)

-Peff
--

From: Jeff King
Date: Wednesday, February 11, 2009 - 10:53 am

I don't think the hard drive matters much at all; the CPU is pegged the
whole time. I'm not fully packed, but almost so. I suspect it is a
combination of:

  1. I compile git with "-g" and no optimizations to make debugging
     easier.

  2. My system sucks. It is a 1.8GHz dual Athlon from around 2003.
     fast-export should be pretty memory intensive, which I'm sure is
     saturating my totally rocking 266MHz FSB.

Just for fun, I tried:

  - fully repacking; no change

  - compiling with -O2; shaved off about 2s. Not too surprising;
    previous profiling has shown that git spends a lot of time in zlib,
    and I am dynamically linking against an optimized version.

So I think it is mainly (2) above, assuming your machine sucks a lot
less than mine.

-Peff
--

Previous thread: none

Next thread: [PATCH] Fix file mark handling and sort side-effects in git.el by Brent Goodrick on Tuesday, February 10, 2009 - 11:12 pm. (2 messages)