I get the above message all the time when pulling all the git trees. I'm frightened! y:/usr/src/git26> git --version git version 1.5.5.rc1 --
Rename detection is O(n^2), so when it looks like it will take a really long time, we skip it. This has been happening for a while, but 1.5.5 only recently started telling the user (based on some people wondering why renames weren't found during their enormous merges). The default rename limit is 100, but you can bump it via the diff.renamelimit config option. A few tests that I did imply that 200-400 is reasonable for logging, and 800-1000 for a merge: http://article.gmane.org/gmane.comp.version-control.git/73519 Are you running into actual problems with rename detection, or is the message just too scary and confusing? -Peff --
No observed problems, just scared! I don't use rename detection anyway - I use git to extract plain old diffs only. Perhaps the default should be bumped up a bit based on your measurements, dunno. --
git config --global diff.renamelimit 200 Ah, OK. If you are just doing a fast-forward merge, there is no rename detection going on as part of the merge. But the diffstat for a large is probably enough to trigger this behavior. So perhaps we should only Probably. I'll work up a patch for that, as well as suppressing the message on diffstat (where you really shouldn't care, and it serves only to scare users). -Peff --
Here's a patch series trying to improve rename limits, based on my discussion with Andrew and from some previous comments about the settings[1]. Patch 1 allows separate renamelimit values for diff vs merge. Patch 2 bumps up the default values based on some measurements I did in February. Patch 3 turns off the mostly cluttering warning message except for merges. [1]: http://permalink.gmane.org/gmane.comp.version-control.git/73470 -Peff --
The point of rename limiting is to bound the amount of time
we spend figuring out inexact renames. Currently we use a
single value, diff.renamelimit, for all situations. However,
it is probably the case that a user is willing to spend more
time finding renames during a merge than they are while
looking at git-log.
This patch provides a way of setting those values separately
(though for backwards compatibility, merge still falls back
on the diff renamelimit).
Signed-off-by: Jeff King <peff@peff.net>
---
Documentation/merge-config.txt | 5 +++
builtin-merge-recursive.c | 13 +++++--
t/t6032-merge-large-rename.sh | 73 ++++++++++++++++++++++++++++++++++++++++
3 files changed, 88 insertions(+), 3 deletions(-)
create mode 100755 t/t6032-merge-large-rename.sh
diff --git a/Documentation/merge-config.txt b/Documentation/merge-config.txt
index 9719311..48ce747 100644
--- a/Documentation/merge-config.txt
+++ b/Documentation/merge-config.txt
@@ -6,6 +6,11 @@ merge.log::
Whether to include summaries of merged commits in newly created
merge commit messages. False by default.
+merge.renameLimit::
+ The number of files to consider when performing rename detection
+ during a merge; if not specified, defaults to the value of
+ diff.renameLimit.
+
merge.tool::
Controls which merge resolution program is used by
linkgit:git-mergetool[1]. Valid built-in values are: "kdiff3",
diff --git a/builtin-merge-recursive.c b/builtin-merge-recursive.c
index 910c0d2..1293e3d 100644
--- a/builtin-merge-recursive.c
+++ b/builtin-merge-recursive.c
@@ -92,7 +92,8 @@ static struct path_list current_directory_set = {NULL, 0, 0, 1};
static int call_depth = 0;
static int verbosity = 2;
-static int rename_limit = -1;
+static int diff_rename_limit = -1;
+static int merge_rename_limit = -1;
static int buffer_output = 1;
static struct strbuf obuf = STRBUF_INIT;
@@ -361,7 +362,9 @@ static struct path_list *get_renames(struct tree *tree,
diff_setup(&opts);
...Aside: this is based on next due to the reorganization of the merge options into merge-config.txt. -Peff --
Makes one wonder where the magic 100 comes from. Wouldn't this
opts.rename_limit = (merge_rename_limit >= 0)
? merge_rename_limit
: diff_rename_limit;
be easier to maintain, with the same semantics?
--
Yes, though I was being a little sneaky in perparation for 2/3, which changes the defaults differently for diff versus merge. I could maybe have put that particular change into 2/3, but it looks like you have applied this to next as-is already. -Peff --
The current rename limit default of 100 was arbitrarily chosen. Testing[1] has shown that on modern hardware, a limit of 200 adds about a second of computation time, and a limit of 500 adds about 5 seconds of computation time. This patch bumps the default limit to 200 for viewing diffs, and to 500 for performing a merge. The limit for generating git-status templates is set independently; we bump it up to 200 here, as well, to match the diff limit. [1]: See <20080211113516.GB6344@coredump.intra.peff.net> Signed-off-by: Jeff King <peff@peff.net> --- builtin-merge-recursive.c | 2 +- diff.c | 2 +- wt-status.c | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/builtin-merge-recursive.c b/builtin-merge-recursive.c index 1293e3d..3902e91 100644 --- a/builtin-merge-recursive.c +++ b/builtin-merge-recursive.c @@ -364,7 +364,7 @@ static struct path_list *get_renames(struct tree *tree, opts.detect_rename = DIFF_DETECT_RENAME; opts.rename_limit = merge_rename_limit >= 0 ? merge_rename_limit : diff_rename_limit >= 0 ? diff_rename_limit : - 100; + 500; opts.output_format = DIFF_FORMAT_NO_OUTPUT; if (diff_setup_done(&opts) < 0) die("diff setup failed"); diff --git a/diff.c b/diff.c index 3632b55..f735519 100644 --- a/diff.c +++ b/diff.c @@ -19,7 +19,7 @@ #endif static int diff_detect_rename_default; -static int diff_rename_limit_default = 100; +static int diff_rename_limit_default = 200; int diff_use_color_default = -1; static const char *external_diff_cmd_cfg; int diff_auto_refresh_index = 1; diff --git a/wt-status.c b/wt-status.c index 532b4ea..a44c543 100644 --- a/wt-status.c +++ b/wt-status.c @@ -206,7 +206,7 @@ static void wt_status_print_updated(struct wt_status *s) rev.diffopt.format_callback = wt_status_print_updated_cb; rev.diffopt.format_callback_data = s; rev.diffopt.detect_rename = 1; - rev.diffopt.rename_limit = 100; + rev.diffopt.rename_limit = ...
In many cases, the warning ends up as clutter, because the
diff is being done "behind the scenes" from the user (e.g.,
when generating a commit diffstat), and whether we show
renames or not is not particularly interesting to the user.
However, in the case of a merge (which is what motivated the
warning in the first place), it is a useful hint as to why a
merge with renames might have failed.
This patch makes the warning optional based on the code
calling into diffcore. We default to not showing the
warning, but turn it on for merges.
Signed-off-by: Jeff King <peff@peff.net>
---
This neglects the case where the user specifically does a diff asking
for renames, but we turn it off. Maybe when "-M" is specified on the
commandline to git-diff, we should set this option as well.
builtin-merge-recursive.c | 1 +
diff.h | 1 +
diffcore-rename.c | 3 ++-
3 files changed, 4 insertions(+), 1 deletions(-)
diff --git a/builtin-merge-recursive.c b/builtin-merge-recursive.c
index 3902e91..46e636f 100644
--- a/builtin-merge-recursive.c
+++ b/builtin-merge-recursive.c
@@ -365,6 +365,7 @@ static struct path_list *get_renames(struct tree *tree,
opts.rename_limit = merge_rename_limit >= 0 ? merge_rename_limit :
diff_rename_limit >= 0 ? diff_rename_limit :
500;
+ opts.warn_on_too_large_rename = 1;
opts.output_format = DIFF_FORMAT_NO_OUTPUT;
if (diff_setup_done(&opts) < 0)
die("diff setup failed");
diff --git a/diff.h b/diff.h
index f2c7739..8931116 100644
--- a/diff.h
+++ b/diff.h
@@ -83,6 +83,7 @@ struct diff_options {
int pickaxe_opts;
int rename_score;
int rename_limit;
+ int warn_on_too_large_rename;
int dirstat_percent;
int setup;
int abbrev;
diff --git a/diffcore-rename.c b/diffcore-rename.c
index 1369a5e..1b2ebb4 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -492,7 +492,8 @@ void diffcore_rename(struct diff_options *options)
rename_limit = 32767;
if ((num_create > ...This will also fix the problem I had with gitk on cygwin; namely "gitk --all &" on my
"git" repo pops up an error dialog (see below), after which the diff display disappears.
The error dialog shows:
--- >8 ---
warning: too many files, skipping inexact rename detection
warning: too many files, skipping inexact rename detection
while executing
"close $bdf"
(procedure "getblobdiffline" line 89)
invoked from within
"getblobdiffline file102daa00 1d6aeb410dc19893adbc0209bcf859f35ff1c7d6"
("eval" body line 1)
invoked from within
"eval $script"
(procedure "dorunq" line 9)
invoked from within
"dorunq"
("after" script)
--- 8< ---
The git command issued by gitk appears to be:
git diff-tree -r -p -C --no-commit-id -U3 1d6aeb410dc19893adbc0209bcf859f35ff1c7d6
However, if I type the above into bash (and include --no-pager), then the error
message does not appear! Ho hum. Also, the same repo on Linux does not exhibit this
problem at all.
In any event, if the above warning() call is commented out, then gitk is quite happy
on cygwin as well.
All the best,
Ramsay Jones
--
Hrm. Is gitk on cygwin somehow squishing stderr and stdout together? Or does gitk in general look at what happens on stderr? Because while I am happy that removing this message fixes your problem, it is a little disconcerting to think that we can break gitk just by issuing a warning diagnostic on stderr. -Peff --
It's a more general Tcl thing - if you are reading from a process, and the process writes to stderr, and the script hasn't explicitly redirected stderr, the Tcl infrastructure assumes that the process is signalling an error, even if the exit status is 0. Gitk does redirect stderr (to stdout) when it does a git reset, but not for other commands. At the moment I don't think there is a good way in Tcl to get hold of the stderr output if a subcommand returns a non-zero exit status, but ignore it if the exit status is 0, other than by redirecting stderr to a temporary file, which has its own problems. Tcl can bundle stderr in with stdout, or ignore it, or take it as an error indication, or send it to a file. So if git commands can avoid writing non-error messages to stderr, that will make my life easier... Paul. --
In that case, Junio, perhaps we should restrict this particular warning
just to merge.
However, there a number of other warnings that can get printed on
stderr, many of them related to reflogs. So I suspect with some reflog
conditions (like a reflog that only goes back 1 day) you could end up
with a tcl error for "gitk HEAD@{2.days.ago}".
-Peff
--
Indeed, I tested this:
$ git clone git://git.kernel.org/pub/scm/git/git.git
$ cd git
$ sleep 10
$ gitk HEAD@{10.seconds.ago} ;# works fine
$ gitk HEAD@{10.minutes.ago} ;# barfs due to warning
-Peff
--
I am not sure if we really want to work around Tcl's braindamage like that. There is no stdwarn or stdinfo stream and I think it is a bug on the receiving end to assume that anything that comes to stderr is an error. --
If this is the warning about too many files to do inexact rename detection, I find that one annoying because I don't care about that (it's just for the diffstat when doing a pull AFAIK) and I don't know There are apparently some programs on some platforms for which this behaviour is necessary... Paul. --
Ah, OK. That explains it. Actually, I seem to have a vague recollection of having had this discussion before... All the Best, Ramsay Jones --
maybe. After further testing, the above "git diff-tree..." does indeed issue the
warning on stderr; on both cygwin and Linux.
[I forgot to type ./git-diff-tree as I had not installed the new git yet, since
it does not pass "make test". That is a different problem for another day...]
It appears that gitk on Linux is quite happy with that message on stderr, but on
Indeed.
Also note that the warning is issued twice, since gitk issues that same
command twice; viz:
$ GIT_TRACE=/home/ramsay/git-trace gitk --all & # exit asap
$ cat /home/ramsay/git-trace
trace: built-in: git 'config' '--get' 'i18n.commitencoding'
trace: built-in: git 'rev-parse' '--git-dir'
trace: built-in: git 'rev-parse' '--no-revs' '--no-flags' '--all'
trace: built-in: git 'rev-parse' '--is-inside-work-tree'
trace: built-in: git 'show-ref' '-d'
trace: built-in: git 'symbolic-ref' 'HEAD'
trace: built-in: git 'log' '--no-color' '-z' '--pretty=raw' '--topo-order' '--parents' '--boundary' '--all' '--'
trace: built-in: git 'diff-index' '--cached' 'HEAD'
trace: built-in: git 'rev-parse' '--git-dir'
trace: built-in: git 'diff-tree' '-r' '--no-commit-id' '1d6aeb410dc19893adbc0209bcf859f35ff1c7d6'
trace: built-in: git 'diff-files'
trace: built-in: git 'diff-tree' '-r' '-p' '-C' '--no-commit-id' '-U3' '1d6aeb410dc19893adbc0209bcf859f35ff1c7d6'
trace: built-in: git 'diff-tree' '-r' '-p' '-C' '--no-commit-id' '-U3' '1d6aeb410dc19893adbc0209bcf859f35ff1c7d6'
$
NOTE: I get exactly the same trace on Linux and cygwin.
As a quick-fix, I added a "-l300" parameter to the above git command in ~/bin/gitk.
[diff.renamelimit only affects git-diff]
All the Best,
Ramsay Jones
--
