These two patches are extremly useful for those users who chose submodules
because they wanted to reduce the time it takes for "git status" and "git
diff" to recurse through the whole work tree by putting sometimes large
amounts of files into submodules, which weren't scanned in the past.
Since 1.7.0 "git status" and "git diff" recurse through submodule work
trees, which has a - sometimes drastic - performance impact when large
submodules are present. Using the "ignore=dirty" config option restores
the former behaviour.
This option can be set in the .gitmodules file or in .git/config, any
settings found in the latter override those from .gitmodules. This
enables the distribution of a default setting for this option by simply
committing a modified .gitmodules file without each developer having to
call "git submodule sync". When using "git status" or "git diff" with
"--ignore-submodules=none" option the default behavior of scanning all
submodules work trees can be restored temporarily.
Comments?
Jens Lehmann (2):
Submodules: Add the new "ignore" config option for diff and status
Submodules: Use "ignore" settings from .gitmodules too for diff and
status
Documentation/config.txt | 13 ++++
Documentation/diff-options.txt | 6 ++-
Documentation/git-status.txt | 6 ++-
Documentation/gitmodules.txt | 15 ++++
builtin/commit.c | 2 +
builtin/diff-files.c | 2 +
builtin/diff-index.c | 2 +
builtin/diff-tree.c | 2 +
builtin/diff.c | 2 +
diff-lib.c | 15 +++--
diff.c | 35 ++++++++--
diff.h | 1 +
submodule.c | 63 ++++++++++++++++-
submodule.h | 4 +
t/t4027-diff-submodule.sh | 139 ++++++++++++++++++++++++++++++++++++
t/t7508-status.sh | 154 ++++++++++++++++++++++++++++++++++++++-
wt-status.c | 8 ++-
17 ...The new "ignore" config option controls the default behavior for "git status" and the diff family. It specifies under what circumstances they consider submodules as modified and can be set separately for each submodule. The command line option "--ignore-submodules=" has been extended to accept the new parameter "none" for both status and diff. Users that chose submodules to get rid of long work tree scanning times might want to set the "dirty" option for those submodules. This brings back the pre 1.7.0 behavior, where submodule work trees were never scanned for modifications. By using "--ignore-submodules=none" on the command line the status and diff commands can be told to do a full scan. This option can be set to the following values (which have the same name and meaning as for the "--ignore-submodules" option of status and diff): "all": All changes to the submodule will be ignored. "dirty": Only differences of the commit recorded in the superproject and the submodules HEAD will be considered modifications, all changes to the work tree of the submodule will be ignored. When using this value, the submodule will not be scanned for work tree changes at all, leading to a performance benefit on large submodules. "untracked": Only untracked files in the submodules work tree are ignored, a changed HEAD and/or modified files in the submodule will mark it as modified. "none" (which is the default): Either untracked or modified files in a submodules work tree or a difference between the subdmodules HEAD and the commit recorded in the superproject will make it show up as changed. This value is added as a new parameter for the "--ignore-submodules" option of the diff family and "git status" so the user can override the settings in the configuration. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> --- Documentation/config.txt | 10 ++++ Documentation/diff-options.txt | 6 ++- Documentation/git-status.txt | 6 ++- diff-lib.c ...
The .gitmodules file is parsed for "submodule.<name>.ignore" entries before looking for them in .git/config. Thus settings found in .git/config will override those from .gitmodules, thereby allowing the local developer to ignore settings given by the remote side while also letting upstream set defaults for those users who don't have special needs. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> --- Documentation/config.txt | 3 ++ Documentation/diff-options.txt | 2 +- Documentation/git-status.txt | 2 +- Documentation/gitmodules.txt | 15 +++++++++ builtin/commit.c | 2 + builtin/diff-files.c | 2 + builtin/diff-index.c | 2 + builtin/diff-tree.c | 2 + builtin/diff.c | 2 + submodule.c | 12 +++++++ submodule.h | 1 + t/t4027-diff-submodule.sh | 66 ++++++++++++++++++++++++++++++++++++++++ t/t7508-status.sh | 63 +++++++++++++++++++++++++++++++++++++- 13 files changed, 171 insertions(+), 3 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 783f264..06e70cb 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1756,6 +1756,9 @@ submodule.<submodulename>.ignore:: let submodules with modified tracked files in their work tree show up. Using "none" (the default when this option is not set) also shows submodules that have untracked files in their work tree as changed. + This setting overrides any setting made in .gitmodules for this submodule, + both settings can be overriden on the command line by using the + "--ignore-submodule" option. tar.umask:: This variable can be used to restrict the permission bits of diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt index 9cf7506..faa467b 100644 --- a/Documentation/diff-options.txt +++ b/Documentation/diff-options.txt @@ -334,7 +334,7 @@ endif::git-format-patch[] Using "none" ...
Hmph. The value of "submodule.<name>.path" does not have to be "<name>". There seems to be a bit of confusion here. I thought that [PATCH 1/2] used the path to the submodule, not its name, as the key. Even though the patch to Documentation/config.txt talks about "submodule.<submodulename>.ignore", what the code actually seems to do in set-from-submodule-config is to use path. But this patch seems to key off of <name>. I suspect that it would make more sense to use <name> so that once the user configures ignore in .git/config, it will persist across moving of the submodule in the superproject tree. In either case, we would want to be consistent between the two, no? --
Thanks for clearing up my confusion, this is v2 of this patch using the
path configured for the submodule instead of its name.
Jens Lehmann (2):
Submodules: Add the new "ignore" config option for diff and status
Submodules: Use "ignore" settings from .gitmodules too for diff and
status
Documentation/config.txt | 13 +++
Documentation/diff-options.txt | 6 +-
Documentation/git-status.txt | 6 +-
Documentation/gitmodules.txt | 15 ++++
builtin/commit.c | 2 +
builtin/diff-files.c | 2 +
builtin/diff-index.c | 2 +
builtin/diff-tree.c | 2 +
builtin/diff.c | 2 +
diff-lib.c | 15 +++-
diff.c | 35 +++++++--
diff.h | 1 +
submodule.c | 76 +++++++++++++++++-
submodule.h | 4 +
t/t4027-diff-submodule.sh | 152 ++++++++++++++++++++++++++++++++++
t/t7508-status.sh | 175 +++++++++++++++++++++++++++++++++++++++-
wt-status.c | 8 ++-
17 files changed, 496 insertions(+), 20 deletions(-)
--
1.7.2.1.54.g6bed1
--
The new "ignore" config option controls the default behavior for "git status" and the diff family. It specifies under what circumstances they consider submodules as modified and can be set separately for each submodule. The command line option "--ignore-submodules=" has been extended to accept the new parameter "none" for both status and diff. Users that chose submodules to get rid of long work tree scanning times might want to set the "dirty" option for those submodules. This brings back the pre 1.7.0 behavior, where submodule work trees were never scanned for modifications. By using "--ignore-submodules=none" on the command line the status and diff commands can be told to do a full scan. This option can be set to the following values (which have the same name and meaning as for the "--ignore-submodules" option of status and diff): "all": All changes to the submodule will be ignored. "dirty": Only differences of the commit recorded in the superproject and the submodules HEAD will be considered modifications, all changes to the work tree of the submodule will be ignored. When using this value, the submodule will not be scanned for work tree changes at all, leading to a performance benefit on large submodules. "untracked": Only untracked files in the submodules work tree are ignored, a changed HEAD and/or modified files in the submodule will mark it as modified. "none" (which is the default): Either untracked or modified files in a submodules work tree or a difference between the subdmodules HEAD and the commit recorded in the superproject will make it show up as changed. This value is added as a new parameter for the "--ignore-submodules" option of the diff family and "git status" so the user can override the settings in the configuration. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> --- Documentation/config.txt | 10 ++++ Documentation/diff-options.txt | 6 ++- Documentation/git-status.txt | 6 ++- diff-lib.c ...
The .gitmodules file is parsed for "submodule.<name>.ignore" entries before looking for them in .git/config. Thus settings found in .git/config will override those from .gitmodules, thereby allowing the local developer to ignore settings given by the remote side while also letting upstream set defaults for those users who don't have special needs. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> --- Documentation/config.txt | 3 + Documentation/diff-options.txt | 2 +- Documentation/git-status.txt | 2 +- Documentation/gitmodules.txt | 15 +++++++ builtin/commit.c | 2 + builtin/diff-files.c | 2 + builtin/diff-index.c | 2 + builtin/diff-tree.c | 2 + builtin/diff.c | 2 + submodule.c | 19 ++++++++ submodule.h | 1 + t/t4027-diff-submodule.sh | 76 +++++++++++++++++++++++++++++++++ t/t7508-status.sh | 91 ++++++++++++++++++++++++++++++++++++---- 13 files changed, 209 insertions(+), 10 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 06faa02..d97b986 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1756,6 +1756,9 @@ submodule.<name>.ignore:: let submodules with modified tracked files in their work tree show up. Using "none" (the default when this option is not set) also shows submodules that have untracked files in their work tree as changed. + This setting overrides any setting made in .gitmodules for this submodule, + both settings can be overriden on the command line by using the + "--ignore-submodule" option. tar.umask:: This variable can be used to restrict the permission bits of diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt index 9cf7506..faa467b 100644 --- a/Documentation/diff-options.txt +++ b/Documentation/diff-options.txt @@ -334,7 +334,7 @@ endif::git-format-patch[] Using "none" will consider ...
Hmph, I thought that it would make more sense to use <name> so that once the user configures ignore in .git/config, it will persist across moving of the submodule in the superproject tree. I wonder what others would think. Will queue, anyway. Thanks. --
I think that is just what this patch does, it uses the "config_name_for_path" string_list to get the submodules <name> for the path to then look up what is configured for the "ignore" option under that <name>. So the "ignore" option should survive renames, no? Or did I manage to confuse myself again? --
Yeah, what you said is all correct. I just misread "... this patch using the path ... instead of its name" to mean that the user-level interface is now keying off of a path, but that is not what you meant. You instead meant that the name is used as the key and it used as if it is a path in the previous round, but now it goes through name-to-path mapping. Thanks. --
Hi, This is incredibly useful stuff. Indeed, the time for one "git status" went down from > 50 sec to < 2 sec again in my project. (I have the impression that it still does a little too much, but I will have to have some time for strace()ing to find out for real.) Therefore, I applied it to 4msysgit.git's 'devel' branch. However, one thing is inconvenient: there are config options per submodule, but no single option to just turn the behavior back to before. I know the reasoning behind showing dirty submodules, so I am fine with that feature being opt-out, but for my real world scenario, it just slows me down, so I _need_ to turn it off. To put my money where my mouth is (since I need it myself, I also pushed it to 4msysgit.git's 'devel' branch): -- snipsnap -- From f7b1e98a0d850e5b1bc56a825fcfe5c87076dfa9 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Thu, 5 Aug 2010 10:49:55 +0200 Subject: [PATCH] Add the 'diff.ignoreSubmodules' config setting When you have a lot of submodules checked out, the time penalty to check for dirty submodules can easily imply a multiplication of the total time by the factor 20. This makes the difference between almost instantaneous (< 2 seconds) and unbearably slow (> 50 seconds) here, since the disk caches are constantly overloaded. To this end, the submodule.*.ignore config option was introduced, but it is per-submodule. This commit introduces a global config setting to set a default (porcelain) value for the --ignore-submodules option, keeping the default at 'none'. It can be overridden by the submodule.*.ignore setting and by the --ignore-submodules option. Incidentally, this commit fixes an issue with the overriding logic: multiple --ignore-submodules options would not clear the previously set flags. While at it, fix a typo in the documentation for submodule.*.ignore. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- Documentation/config.txt | 7 ...
Nice, at least from a cursory look. diff_setup() is shared by everybody both plumbing and Porcelain that use diff and log, but you placed the smudging of the default based on config is in diff_ui_config(), so this patch looks safe. And the default setting can be overriden with --ignore-submodules from the command line during diff_opt_parse(), which is also safe and sane. --
Yup, makes sense to me too. Acked-by: Jens Lehmann <Jens.Lehmann@web.de> --
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> --- And here are some test cases for this new option. t/t4027-diff-submodule.sh | 10 +++++++++- t/t7508-status.sh | 12 ++++++++++-- 2 files changed, 19 insertions(+), 3 deletions(-) diff --git a/t/t4027-diff-submodule.sh b/t/t4027-diff-submodule.sh index 1bc6e77..d99814a 100755 --- a/t/t4027-diff-submodule.sh +++ b/t/t4027-diff-submodule.sh @@ -115,6 +115,9 @@ test_expect_success 'git diff HEAD with dirty submodule (work tree, refs match)' ' test_expect_success 'git diff HEAD with dirty submodule (work tree, refs match) [.git/config]' ' + git config diff.ignoreSubmodules all && + git diff HEAD >actual && + ! test -s actual && git config submodule.subname.ignore none && git config submodule.subname.path sub && git diff HEAD >actual && @@ -136,10 +139,14 @@ test_expect_success 'git diff HEAD with dirty submodule (work tree, refs match) sed -e "1,/^@@/d" actual >actual.body && expect_from_to >expect.body $subprev $subprev-dirty && test_cmp expect.body actual.body && - git config --remove-section submodule.subname + git config --remove-section submodule.subname && + git config --unset diff.ignoreSubmodules ' test_expect_success 'git diff HEAD with dirty submodule (work tree, refs match) [.gitmodules]' ' + git config diff.ignoreSubmodules dirty && + git diff HEAD >actual && + ! test -s actual && git config --add -f .gitmodules submodule.subname.ignore none && git config --add -f .gitmodules submodule.subname.path sub && git diff HEAD >actual && @@ -166,6 +173,7 @@ test_expect_success 'git diff HEAD with dirty submodule (work tree, refs match) test_cmp expect.body actual.body && git config --remove-section submodule.subname && git config --remove-section -f .gitmodules submodule.subname && + git config --unset diff.ignoreSubmodules && rm .gitmodules ' diff --git a/t/t7508-status.sh b/t/t7508-status.sh index 1aae762..9c14b85 100755 --- a/t/t7508-status.sh +++ ...
