I have tried to make --follow support finding copies among unmodified files. The first patch fixes a bug triggered by combining '--follow' with 'git log'.
We use the code:
else if (--p->one->rename_used > 0)
p->status = DIFF_STATUS_COPIED;
to detect copies and renames. So, if diffcore_std runs more than once, p->one->rename_used is decremented again and the pair degrades from 'C' to 'R'. This patch fixes that by allowing diffcore_std to run only once before each diff_flush, which seems reasonable for our code.
Bo Yang (2):
Make diffcore_std only can run once before a diff_flush.
Make git log --follow find copies among unmodified files.
Documentation/git-log.txt | 2 +-
diff.c | 21 ++++++++-----
diffcore-break.c | 6 +--
diffcore-pickaxe.c | 3 +-
diffcore-rename.c | 3 +-
diffcore.h | 6 ++++
t/t4205-log-follow-harder-copies.sh | 56 +++++++++++++++++++++++++++++++++++
tree-diff.c | 2 +-
8 files changed, 81 insertions(+), 18 deletions(-)
create mode 100755 t/t4205-log-follow-harder-copies.sh
--
'git log --follow <path>' doesn't track copies from unmodified
files, and this patch fixes that.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
---
Documentation/git-log.txt | 2 +-
t/t4205-log-follow-harder-copies.sh | 56 +++++++++++++++++++++++++++++++++++
tree-diff.c | 2 +-
3 files changed, 58 insertions(+), 2 deletions(-)
create mode 100755 t/t4205-log-follow-harder-copies.sh

diff --git a/Documentation/git-log.txt b/Documentation/git-log.txt
index fb184ba..0727818 100644
--- a/Documentation/git-log.txt
+++ b/Documentation/git-log.txt
@@ -56,7 +56,7 @@ include::diff-options.txt[]
 	commits, and doesn't limit diff for those commits.
 
 --follow::
-	Continue listing the history of a file beyond renames.
+	Continue listing the history of a file beyond renames/copies.
 
 --log-size::
 	Before the log message print out its size in bytes. Intended
diff --git a/t/t4205-log-follow-harder-copies.sh b/t/t4205-log-follow-harder-copies.sh
new file mode 100755
index 0000000..ad29e65
--- /dev/null
+++ b/t/t4205-log-follow-harder-copies.sh
@@ -0,0 +1,56 @@
+#!/bin/sh
+#
+# Copyright (c) 2010 Bo Yang
+#
+
+test_description='Test --follow should always find copies hard in git log.
+
+'
+. ./test-lib.sh
+. "$TEST_DIRECTORY"/diff-lib.sh
+
+echo >path0 'Line 1
+Line 2
+Line 3
+'
+
+test_expect_success \
+	'add a file path0 and commit.' \
+	'git add path0 &&
+	git commit -m "Add path0"'
+
+echo >path0 'New line 1
+New line 2
+New line 3
+'
+test_expect_success \
+	'Change path0.' \
+	'git add path0 &&
+	git commit -m "Change path0"'
+
+cat <path0 >path1
+test_expect_success \
+	'copy path0 to path1.' \
+	'git add path1 &&
+	git commit -m "Copy path1 from path0"'
+
+test_expect_success \
+	'find the copy path0 -> path1 harder' \
+	'git log --follow --name-status --pretty="format:%s" path1 > current'
+
+cat >expected <<\EOF
+Copy path1 from path0
+C100 path0 path1
+
+Change ...
When rename/copy detection is turned on, a second run of
diffcore_std will degrade a 'C' pair to an 'R' pair.
This can happen when we run 'git log --follow' with
hard copy finding: try_to_follow_renames() runs
diffcore_std to find the copies, and then 'git log'
issues another diffcore_std, which decrements
'src->rename_used' again and reports the copy as a rename.
This is not what we want.

So I think we really don't need to run diffcore_std more
than once.
Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
---
diff.c | 21 +++++++++++++--------
diffcore-break.c | 6 ++----
diffcore-pickaxe.c | 3 +--
diffcore-rename.c | 3 +--
diffcore.h | 6 ++++++
5 files changed, 23 insertions(+), 16 deletions(-)
diff --git a/diff.c b/diff.c
index d0ecbc3..d32fc68 100644
--- a/diff.c
+++ b/diff.c
@@ -2544,6 +2544,7 @@ static void run_checkdiff(struct diff_filepair *p, struct diff_options *o)
void diff_setup(struct diff_options *options)
{
memset(options, 0, sizeof(*options));
+ memset(&diff_queued_diff, 0, sizeof(diff_queued_diff));
options->file = stdout;
@@ -3462,8 +3463,7 @@ int diff_flush_patch_id(struct diff_options *options, unsigned char *sha1)
diff_free_filepair(q->queue[i]);
free(q->queue);
- q->queue = NULL;
- q->nr = q->alloc = 0;
+ DIFF_QUEUE_CLEAR(q);
return result;
}
@@ -3591,8 +3591,7 @@ void diff_flush(struct diff_options *options)
diff_free_filepair(q->queue[i]);
free_queue:
free(q->queue);
- q->queue = NULL;
- q->nr = q->alloc = 0;
+ DIFF_QUEUE_CLEAR(q);
if (options->close_file)
fclose(options->file);
@@ -3614,8 +3613,7 @@ static void diffcore_apply_filter(const char *filter)
int i;
struct diff_queue_struct *q = &diff_queued_diff;
struct diff_queue_struct outq;
- outq.queue = NULL;
- outq.nr = outq.alloc = 0;
+ DIFF_QUEUE_CLEAR(&outq);
if (!filter)
return;
@@ -3683,8 +3681,7 @@ static void diffcore_skip_stat_unmatch(struct ...

It actually is stronger than that; we should never run it more than
once, and it would be a bug if we did so. Which codepath tries to call
*_std() twice?

The standard calling sequence is:

 - start from an empty queue.

 - use diff_change() and diff_addremove() to populate the queue.

 - call diffcore_std().

If you need to use a non-standard chain of diffcore transformations,
you _could_ call the diffcore_* routines that diffcore_std() calls, if
you choose to, but as you found out, some of them are not idempotent
operations, and shouldn't be called twice.

Shouldn't this be a BUG() instead?

The trivial rewrite to use this macro is a good idea, but it probably
--
In the command 'git log --follow ...', log_tree_diff calls
diff_tree_sha1 and then log_tree_diff_flush. When '--follow' is
given, the former calls try_to_follow_renames, which calls
diffcore_std to detect renames. Then log_tree_diff_flush calls
diffcore_std again, unconditionally. (I tried to find a condition
under which to make that call, but failed, so I came up with this
patch.)
Breakpoint 1, diffcore_std (options=0xbf9cc044) at diff.c:3748
3748 if (diff_queued_diff.run)
(gdb) bt
#0 diffcore_std (options=0xbf9cc044) at diff.c:3748
#1 0x08124206 in try_to_follow_renames (t1=0xbf9cc130, t2=0xbf9cc11c,
base=0x81571c9 "", opt=0xbf9cc468) at tree-diff.c:358
#2 0x08124480 in diff_tree_sha1 (old=0x9c51d8c
"$\033\222T���\a\035\200T����\210;8\235i", new=0x9c51d2c
"\201�\017<�\v��n]\226{�+�\001\003\232\232\230",
base=0x81571c9 "", opt=0xbf9cc468) at tree-diff.c:418
#3 0x080e660e in log_tree_diff (opt=0xbf9cc220, commit=0x9c51d28,
log=0xbf9cc1ac) at log-tree.c:536
#4 0x080e668f in log_tree_commit (opt=0xbf9cc220, commit=0x9c51d28)
at log-tree.c:560
#5 0x0807faa1 in cmd_log_walk (rev=0xbf9cc220) at builtin/log.c:237
#6 0x080806e2 in cmd_log (argc=5, argv=0xbf9cc788, prefix=0x0) at
builtin/log.c:481
#7 0x0804b8eb in run_builtin (p=0x8161524, argc=5, argv=0xbf9cc788)
at git.c:260
#8 0x0804ba51 in handle_internal_command (argc=5, argv=0xbf9cc788) at git.c:416
#9 0x0804bb2c in run_argv (argcp=0xbf9cc700, argv=0xbf9cc704) at git.c:458
#10 0x0804bcbe in main (argc=5, argv=0xbf9cc788) at git.c:529
(gdb) c
Continuing.
Breakpoint 1, diffcore_std (options=0xbf9cc468) at diff.c:3748
3748 if (diff_queued_diff.run)
(gdb) bt
#0 diffcore_std (options=0xbf9cc468) at diff.c:3748
#1 0x080e6356 in log_tree_diff_flush (opt=0xbf9cc220) at log-tree.c:449
#2 0x080e6619 in log_tree_diff (opt=0xbf9cc220, commit=0x9c51d28,
log=0xbf9cc1ac) at log-tree.c:537
#3 0x080e668f in log_tree_commit (opt=0xbf9cc220, commit=0x9c51d28)
at log-tree.c:560
#4 0x0807faa1 in cmd_log_walk ...

Hi Junio,

I have not received any comments from you on this thread, but I think
it is worth a few words. I would like to get this patch series landed;
could you please give some more advice on it?

Regards!
Bo
--
