Re: [PATCHv2] git-mv: Keep moved index entries inact

Previous thread: none

Next thread: [PATCH] Reformat "your branch has diverged..." lines to reduce line length. by Avery Pennarun on Wednesday, July 16, 2008 - 12:19 pm. (4 messages)
From: Petr Baudis
Date: Wednesday, July 16, 2008 - 12:11 pm

The following series implements submodule support in git mv and git rm,
plus enhancing the submodules testsuite a bit. I'd appreciate comments,
especially on the git mv change, since the index_path_src_sha1 is
really a horrible hack.

The pinnacle of this series was supposed to be merge-recursive support
for submodule-somethingelese conflicts, however that seems a bit more
complicated than I expected, so I decided to first send the rest for
a review.

---

Petr Baudis (7):
      t7403: Submodule git mv, git rm testsuite
      git rm: Support for removing submodules
      git mv: Support moving submodules
      submodule.*: Introduce simple C interface for submodule lookup by path
      git submodule add: Fix naming clash handling
      t7400: Add short "git submodule add" testsuite
      git-mv: Remove dead code branch


 Documentation/git-rm.txt   |    6 +
 Makefile                   |    2 
 builtin-mv.c               |   67 ++++++++++--
 builtin-rm.c               |   65 ++++++++++--
 cache.h                    |    2 
 git-submodule.sh           |   15 ++-
 sha1_file.c                |   10 ++
 submodule.c                |   50 +++++++++
 submodule.h                |    8 +
 t/t7400-submodule-basic.sh |   39 +++++++
 t/t7403-submodule-mvrm.sh  |  242 ++++++++++++++++++++++++++++++++++++++++++++
 11 files changed, 476 insertions(+), 30 deletions(-)
 create mode 100644 submodule.c
 create mode 100644 submodule.h
 create mode 100755 t/t7403-submodule-mvrm.sh

--

From: Petr Baudis
Date: Wednesday, July 16, 2008 - 12:11 pm

The path list builder had a branch for the case the source is not in index, but
this can happen only if the source was a directory. However, in that case we
have already expanded the list to the directory contents and set mode
to WORKING_DIRECTORY, which is tested earlier.

The patch removes the superfluous branch and adds an assert() instead. git-mv
testsuite still passes.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 builtin-mv.c |   16 +++++++---------
 1 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/builtin-mv.c b/builtin-mv.c
index 5530e11..158a83d 100644
--- a/builtin-mv.c
+++ b/builtin-mv.c
@@ -227,15 +227,13 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		if (mode == WORKING_DIRECTORY)
 			continue;
 
-		if (cache_name_pos(src, strlen(src)) >= 0) {
-			path_list_insert(src, &deleted);
-
-			/* destination can be a directory with 1 file inside */
-			if (path_list_has_path(&overwritten, dst))
-				path_list_insert(dst, &changed);
-			else
-				path_list_insert(dst, &added);
-		} else
+		assert(cache_name_pos(src, strlen(src)) >= 0);
+
+		path_list_insert(src, &deleted);
+		/* destination can be a directory with 1 file inside */
+		if (path_list_has_path(&overwritten, dst))
+			path_list_insert(dst, &changed);
+		else
 			path_list_insert(dst, &added);
 	}
 

--

From: Petr Baudis
Date: Wednesday, July 16, 2008 - 12:11 pm

This patch introduces basic tests for

	git submodule add

covering the basic functionality and the -b parameter.

A trivial update --init test fix freeloads on this commit as well.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 t/t7400-submodule-basic.sh |   28 +++++++++++++++++++++++++++-
 1 files changed, 27 insertions(+), 1 deletions(-)

diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index 6c7b902..ab5eb1e 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -200,7 +200,7 @@ test_expect_success 'update --init' '
 
 	mv init init2 &&
 	git config -f .gitmodules submodule.example.url "$(pwd)/init2" &&
-	git config --remove-section submodule.example
+	git config --remove-section submodule.example &&
 	git submodule update init > update.out &&
 	grep "not initialized" update.out &&
 	test ! -d init/.git &&
@@ -209,4 +209,30 @@ test_expect_success 'update --init' '
 
 '
 
+test_expect_success 'submodule add' '
+
+	git submodule add "$(pwd)/init2" init-added &&
+	test -d init-added/.git &&
+	[ "$(git config -f .gitmodules submodule.init-added.url)" = "$(pwd)/init2" ] &&
+	[ "$(git config -f .gitmodules submodule.init-added.path)" = "init-added" ]
+
+'
+
+test_expect_success 'submodule add -b' '
+
+	(
+		cd init2 &&
+		git checkout -b branch &&
+		echo t >s &&
+		git add s &&
+		git commit -m "change branch" &&
+		git checkout master
+	) &&
+	git submodule add -b branch -- "$(pwd)/init2" init-added-b &&
+	test -d init-added-b/.git &&
+	[ "$(git config -f .gitmodules submodule.init-added-b.url)" = "$(pwd)/init2" ] &&
+	[ "$(cd init2 && git rev-parse branch)" = "$(cd init-added-b && git rev-parse HEAD)" ]
+
+'
+
 test_done

--

From: Petr Baudis
Date: Wednesday, July 16, 2008 - 12:11 pm

This patch fixes git submodule add behaviour when we add submodule
living at a same path as logical name of existing submodule. This
can happen e.g. in case the user git mv's the previous submodule away
and then git submodule add's another under the same name.

A test-case is obviously included.

This is not completely satisfactory since .git/config cross-commit
conflicts can still occur. A question is whether this is worth
handling, maybe it would be worth adding some kind of randomization
of the autogenerated submodule name, e.g. appending $$ or a timestamp.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 git-submodule.sh           |   15 ++++++++++++---
 t/t7400-submodule-basic.sh |   11 +++++++++++
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 9228f56..a93547b 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -192,10 +192,19 @@ cmd_add()
 	git add "$path" ||
 	die "Failed to add submodule '$path'"
 
-	git config -f .gitmodules submodule."$path".path "$path" &&
-	git config -f .gitmodules submodule."$path".url "$repo" &&
+	name="$path"
+	if git config -f .gitmodules submodule."$name".path; then
+		name="$path~"; i=1;
+		while git config -f .gitmodules submodule."$name".path; do
+			name="$path~$i"
+			i=$((i+1))
+		done
+	fi
+
+	git config -f .gitmodules submodule."$name".path "$path" &&
+	git config -f .gitmodules submodule."$name".url "$repo" &&
 	git add .gitmodules ||
-	die "Failed to register submodule '$path'"
+	die "Failed to register submodule '$path' (name '$name')"
 }
 
 #
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index ab5eb1e..092dffc 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -235,4 +235,15 @@ test_expect_success 'submodule add -b' '
 
 '
 
+test_expect_success 'submodule add auto-naming clash' '
+
+	git submodule add "$(pwd)/init2/.git" example &&
+	test -d example/.git &&
+	[ "$(git config -f .gitmodules ...
From: Petr Baudis
Date: Wednesday, July 16, 2008 - 12:11 pm

This patch adds support for moving submodules to 'git mv', including
rewriting of the .gitmodules file to reflect the movement.

The usage of struct path_list here is a bit abusive, but keeps the
code simple and hopefully still reasonably readable. The horrid
index_path_src_sha1 hack is unfortunately much worse, however the author
is currently unaware of any more reasonable solution of the problem.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 builtin-mv.c |   55 +++++++++++++++++++++++++++++++++++++++++++++++++------
 cache.h      |    2 ++
 sha1_file.c  |   10 ++++++++++
 3 files changed, 61 insertions(+), 6 deletions(-)

diff --git a/builtin-mv.c b/builtin-mv.c
index 158a83d..30c4e7d 100644
--- a/builtin-mv.c
+++ b/builtin-mv.c
@@ -9,6 +9,7 @@
 #include "cache-tree.h"
 #include "path-list.h"
 #include "parse-options.h"
+#include "submodule.h"
 
 static const char * const builtin_mv_usage[] = {
 	"git-mv [options] <source>... <destination>",
@@ -60,6 +61,24 @@ static const char *add_slash(const char *path)
 	return path;
 }
 
+static int ce_is_gitlink(int i)
+{
+	return i < 0 ? 0 : S_ISGITLINK(active_cache[i]->ce_mode);
+}
+
+static void rename_submodule(struct path_list_item *i)
+{
+	char *key = submodule_by_path(i->path);
+
+	config_exclusive_filename = ".gitmodules";
+	if (git_config_set(key, (const char *) i->util))
+		die("cannot update .gitmodules");
+	config_exclusive_filename = NULL;
+
+	free(key);
+}
+
+
 static struct lock_file lock_file;
 
 int cmd_mv(int argc, const char **argv, const char *prefix)
@@ -77,9 +96,12 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 	struct stat st;
 	struct path_list overwritten = {NULL, 0, 0, 0};
 	struct path_list src_for_dst = {NULL, 0, 0, 0};
+	/* .util contains sha1 for submodules */
 	struct path_list added = {NULL, 0, 0, 0};
 	struct path_list deleted = {NULL, 0, 0, 0};
 	struct path_list changed = {NULL, 0, 0, 0};
+	/* .path is source path, .util is destination path */
+	struct ...
From: Junio C Hamano
Date: Wednesday, July 16, 2008 - 7:37 pm

I do not think use of path_list->util is particularly bad.  It is a data
structure to store a bag of random pointers that can be indexed with

This one certainly is ugly beyond words.

By the way, why is it even necessary to rehash the contents when you run
"mv"?

IOW,

	$ >foo
        $ git add foo
        $ echo 1 >foo
        $ git mv foo bar

makes "foo" disappear from the index and adds "bar", but it makes the
local change added to the index at the same time.

	Side note. It actually is a lot worse than that.  When you move
	something to another existing entry, the code does not even add
	the object name of moved entry recorded in the index, nor rehashes
	the moved file.  This is totally inconsistent!

I personally think these buggy behaviours you are trying to inherit are
making this patch more complex than necessary.  Perhaps this needs to be
fixed up even further (you did some fix with the first patch) before
adding new features?  For example, I think it is a bug that the
"overwrite" codepath does not work with symlinks.

But let's assume that we do not want to "fix" any of these existing
(mis)behaviour, because doing so would change the semantics.

There are a few cases:

 (1) gitlink exists and so is directory but no repository (i.e. the user is
     not interested);

 (2) gitlink exists and there is a repository.  Its HEAD matches what
     gitlink records;

 (3) gitlink exists and there is a repository.  Its HEAD does not match what
     gitlink records;

The (mis)behaviour that automatically adds moved files to the index we saw
earlier suggests that in case (3) you would want to update the gitlink in
the index with whatever HEAD the submodule repository has to be
consistent.

So instead of adding a index_path_src_sha1 hack like that, how about

 * When you see ce_is_gitlink(j), you keep the original gitlink object
   name.  That is very good in order to preserve it for not just (1) but
   also for (3) once we decide to fix this "auto adding" ...
From: Petr Baudis
Date: Thursday, July 17, 2008 - 6:06 am

Hi!



  I agree that it would be much cleaner to fix this; I got puzzled about
this behaviour a bit, but I was afraid to break the traditional
behaviour. However, if you are feeling this brave, patch to follow up

  This is excellent hint, sort of what I hoped for, thanks! I forgot
about --cacheinfo completely, which is truly a shame especially when I
look at the history of this switch. ;-) (BTW, curiously, the commit
lists Linus as an author even though the patch is yours. Maybe this was
merely some imperfection of the early scripts around Git, though.)
I really did not touch git internals for way too long.

-- 
				Petr "Pasky" Baudis
GNU, n. An animal of South Africa, which in its domesticated state
resembles a horse, a buffalo and a stag. In its wild condition it is
something like a thunderbolt, an earthquake and a cyclone. -- A. Pierce
--

From: Petr Baudis
Date: Thursday, July 17, 2008 - 3:31 pm

The rewrite of git-mv from a shell script to a builtin was perhaps
a little too straightforward: the git add and git rm queues were
emulated directly, which resulted in a rather complicated code and
caused an inconsistent behaviour when moving dirty index entries;
git mv would update the entry based on working tree state,
except in case of overwrites, where the new entry would still have
sha1 of the old file.

This patch introduces rename_index_entry_at() into the index toolkit,
which will rename an entry while removing any entries the new entry
might render duplicate. This is then used in git mv instead
of all the file queues, resulting in a major simplification
of the code and an inevitable change in git mv -n output format.

A new test has been added to the testsuite to reflect this change.
Also, based on suggestion by Junio about desired symlink behaviour
of git mv, I have added two tests for that; however, I do not have
need or desire to spend time fixing this, so they are expected
to fail for now until someone gets around to fixing that.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

  This patch might depend on git-mv: Remove dead code branch.

 builtin-mv.c  |   62 ++++++++-------------------------------------------------
 cache.h       |    2 ++
 read-cache.c  |   15 ++++++++++++++
 t/t7001-mv.sh |   57 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 83 insertions(+), 53 deletions(-)

diff --git a/builtin-mv.c b/builtin-mv.c
index 33ad082..28ebc9c 100644
--- a/builtin-mv.c
+++ b/builtin-mv.c
@@ -36,17 +36,6 @@ static const char **copy_pathspec(const char *prefix, const char **pathspec,
 	return get_pathspec(prefix, result);
 }
 
-static void show_list(const char *label, struct path_list *list)
-{
-	if (list->nr > 0) {
-		int i;
-		printf("%s", label);
-		for (i = 0; i < list->nr; i++)
-			printf("%s%s", i > 0 ? ", " : "", list->items[i].path);
-		putchar('\n');
-	}
-}
-
 static const char *add_slash(const char *path)
 {
 ...
From: Petr Baudis
Date: Thursday, July 17, 2008 - 3:34 pm

This patch adds support for moving submodules to 'git mv', including
rewriting of the .gitmodules file to reflect the movement.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

  This is the updated version reflecting the recent rewrite. No other
  changes in my queue yet (except expanding the git rm explanation).

 builtin-mv.c |   37 +++++++++++++++++++++++++++++++++----
 1 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/builtin-mv.c b/builtin-mv.c
index 28ebc9c..62d0c95 100644
--- a/builtin-mv.c
+++ b/builtin-mv.c
@@ -9,6 +9,7 @@
 #include "cache-tree.h"
 #include "path-list.h"
 #include "parse-options.h"
+#include "submodule.h"
 
 static const char * const builtin_mv_usage[] = {
 	"git mv [options] <source>... <destination>",
@@ -49,6 +50,24 @@ static const char *add_slash(const char *path)
 	return path;
 }
 
+static int ce_is_gitlink(int i)
+{
+	return i < 0 ? 0 : S_ISGITLINK(active_cache[i]->ce_mode);
+}
+
+static void rename_submodule(struct path_list_item *i)
+{
+	char *key = submodule_by_path(i->path);
+
+	config_exclusive_filename = ".gitmodules";
+	if (git_config_set(key, (const char *) i->util))
+		die("cannot update .gitmodules");
+	config_exclusive_filename = NULL;
+
+	free(key);
+}
+
+
 static struct lock_file lock_file;
 
 int cmd_mv(int argc, const char **argv, const char *prefix)
@@ -65,6 +84,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 	enum update_mode { BOTH = 0, WORKING_DIRECTORY, INDEX } *modes;
 	struct stat st;
 	struct path_list src_for_dst = {NULL, 0, 0, 0};
+	/* .path is source path, .util is destination path */
+	struct path_list submodules = {NULL, 0, 0, 0};
 
 	git_config(git_default_config, NULL);
 
@@ -84,7 +105,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		/* special case: "." was normalized to "" */
 		destination = copy_pathspec(dest_path[0], argv, argc, 1);
 	else if (!lstat(dest_path[0], &st) &&
-			S_ISDIR(st.st_mode)) {
+			S_ISDIR(st.st_mode) ...
From: Junio C Hamano
Date: Saturday, July 19, 2008 - 4:54 pm

Hmm, would this use of copy_cache_entry() kosher, I have to wonder.  This
new copy of cache entry begins its life unhashed, doesn't it?  Shouldn't
we be not copying its hashed/unhashed bits from the old one?

Also setting of that ce_flags looks wrong when namelen does not fit within
the width of CE_NAMEMASK.  Shouldn't it be doing the same thing as


"rev-parse :dirty"?

--

From: Petr Baudis
Date: Sunday, July 20, 2008 - 5:23 pm

I have to admit that I don't clearly understand all the index entry
cannot simply use existing functions for this, but I want to keep a
different set of traits that either copy_cache_entry() or

Oh. Well, you sounded agitated in your original mail, but this actually

I want to make sure the whole index entry is intact, not just the sha1.

-- 
				Petr "Pasky" Baudis
As in certain cults it is possible to kill a process if you know
its true name.  -- Ken Thompson and Dennis M. Ritchie
--

From: Petr Baudis
Date: Sunday, July 20, 2008 - 5:25 pm

The rewrite of git-mv from a shell script to a builtin was perhaps
a little too straightforward: the git add and git rm queues were
emulated directly, which resulted in a rather complicated code and
caused an inconsistent behaviour when moving dirty index entries;
git mv would update the entry based on working tree state,
except in case of overwrites, where the new entry would still have
sha1 of the old file.

This patch introduces rename_index_entry_at() into the index toolkit,
which will rename an entry while removing any entries the new entry
might render duplicate. This is then used in git mv instead
of all the file queues, resulting in a major simplification
of the code and an inevitable change in git mv -n output format.

A new test has been added to the testsuite to reflect this change.
Also, based on suggestion by Junio about desired symlink behaviour
of git mv, I have added two tests for that; however, I do not have
need or desire to spend time fixing this, so they are expected
to fail for now until someone gets around to fixing that.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 builtin-mv.c  |   62 ++++++++-------------------------------------------------
 cache.h       |    2 ++
 read-cache.c  |   17 ++++++++++++++++
 t/t7001-mv.sh |   55 +++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 83 insertions(+), 53 deletions(-)

diff --git a/builtin-mv.c b/builtin-mv.c
index 33ad082..28ebc9c 100644
--- a/builtin-mv.c
+++ b/builtin-mv.c
@@ -36,17 +36,6 @@ static const char **copy_pathspec(const char *prefix, const char **pathspec,
 	return get_pathspec(prefix, result);
 }
 
-static void show_list(const char *label, struct path_list *list)
-{
-	if (list->nr > 0) {
-		int i;
-		printf("%s", label);
-		for (i = 0; i < list->nr; i++)
-			printf("%s%s", i > 0 ? ", " : "", list->items[i].path);
-		putchar('\n');
-	}
-}
-
 static const char *add_slash(const char *path)
 {
 	int len = strlen(path);
@@ -75,11 +64,7 @@ int cmd_mv(int argc, ...
From: Junio C Hamano
Date: Sunday, July 20, 2008 - 9:36 pm

I haven't looked into the builtin-mv changes to see how involved that fix
would be yet (and I do not use "mv" and this is somewhat lower priority
for me myself), but I did look at changes to read-cache.c.  I've queued a
fixed up version in 'pu' for now but I am hoping that we can see some
comments from more competent people on the patch and move the review
result to 'next' shortly.

This may be a change in semantics, but after thinking about it a bit more,
I think this can even go into maintenance series.  IOW, it is really a
fix.  Nobody sane should have been relying on the old behaviour that new
contents of dirty paths are staged in the index sometimes, which was
simply just crazy.
--

From: Junio C Hamano
Date: Friday, July 25, 2008 - 11:46 pm

Thanks.  I think I've managed to fix the rename_index_entry_at() in a
satisfactory way, and also made builtin-mv to allow "mv -f symlink file"
and "mv -f file symlink".


A symlink to us is just a different kind of blob, and by definition a blob
is the leaf level of a tree structure that represents the working tree in
the index. There won't be anything hanging below it, and when adding
things to the index we should not dereference the symlink to see where it
leads to.

Traditionally we have been loose about this check, and the normal "git
add" and "git update-index" codepath is still forever broken, and we
allow:

	$ mkdir dir
        $ >dir/file
        $ ln -s dir symlink
        $ git add symlink/file

but some codepaths that matter more (because they do larger damage
unattended, as opposed to the above command sequence that can be avoided
by user education a bit more easily), such as "git apply" and "git
read-tree", have been corrected using has_symlink_leading_path() since mid
2007.  We would need to follow through c40641b (Optimize symlink/directory
detection, 2008-05-09) and further fix "git add" and "git update-index"
codepaths to forbid the above command sequence.

So my take on the above test piece is that after:

	>moved
	mkdir dir
        >dir/file
        ln -s dir symlink
	git add moved dir symlink

This should fail, as it is an overwrite:

	git mv moved symlink

and with "-f", the command should simply remove symlink and replace it
with a regular file whose contents come from the original "moved".

IOW, what a symlink points at should not matter.
--

From: Petr Baudis
Date: Sunday, July 27, 2008 - 6:41 am

Oh, sorry, I didn't realize there were still problems with the original

You convinced me, yes. (Especially since I started actually using
symlinks in some of my projects very recently and this would be the
exact semantic I would eventually expect as well.)

-- 
				Petr "Pasky" Baudis
As in certain cults it is possible to kill a process if you know
its true name.  -- Ken Thompson and Dennis M. Ritchie
--

From: Petr Baudis
Date: Sunday, July 27, 2008 - 6:47 am

Currently, git-mv will declare "not under source control" on an attempt
to move a conflicted index entry. This patch adds an expect_failure
testcase for this case, since this is an artificial restriction. (However,
the scenario is not critical enough for the author to fix right now.)

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

I don't really know if it is ok to make "feature requests" like this by
adding failing testcases...

 t/t7001-mv.sh |   27 +++++++++++++++++++++++++++
 1 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/t/t7001-mv.sh b/t/t7001-mv.sh
index 7e47931..241e9a2 100755
--- a/t/t7001-mv.sh
+++ b/t/t7001-mv.sh
@@ -173,6 +173,33 @@ test_expect_success 'git mv should not change sha1 of moved cache entry' '
 
 rm -f dirty dirty2
 
+cat >multistage <<EOT
+100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 1	staged
+100755 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 2	staged
+100755 d00491fd7e5bb6fa28c517a0bb32b8b506539d4d 3	staged
+EOT
+
+# Rationale: I cannot git mv around a conflicted file. This is unnecessary
+# restriction in case another part of conflict resolution requires me to
+# move the file around.
+test_expect_failure 'git mv should move all stages of cache entry' '
+
+	rm -fr .git &&
+	git init &&
+	# git mv requires object to exist in working tree (bug?)
+	touch staged &&
+	git update-index --index-info <multistage &&
+	git ls-files --stage >lsf_output &&
+	test_cmp multistage lsf_output &&
+	git mv staged staged-mv &&
+	sed "s/staged/staged-mv/" <multistage >multistage-mv &&
+	git ls-files --stage >lsf_output &&
+	test_cmp multistage-mv lsf_output
+
+'
+
+rm -f multistage multistage-mv lsf_output staged
+
 test_expect_failure 'git mv should overwrite symlink to a file' '
 
 	rm -fr .git &&

--

From: Junio C Hamano
Date: Sunday, July 27, 2008 - 6:13 pm

Of course it depends on who you are.  It's not Ok for somebody like you,
who is known to be competent, and certainly it is not Ok to do it on a
command that you know you care more about than I do.

I've neglected builtin-mv.c since its introduction, mostly because I never
say "git mv" myself (in other words, I haven't cared very much how broken
it was.  For one thing, its indentation style is peculiar and is hard to

Yes, I would agree this is a reasonable thing to support.  Something like
this patch, perhaps.

---
 builtin-mv.c  |   21 ++++++++++++++-------
 t/t7001-mv.sh |   21 +++++++++++++++++++++
 2 files changed, 35 insertions(+), 7 deletions(-)

diff --git a/builtin-mv.c b/builtin-mv.c
index 4f65b5a..cc9e505 100644
--- a/builtin-mv.c
+++ b/builtin-mv.c
@@ -96,7 +96,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 	/* Checking */
 	for (i = 0; i < argc; i++) {
 		const char *src = source[i], *dst = destination[i];
-		int length, src_is_dir;
+		int length, src_is_dir, pos;
 		const char *bad = NULL;
 
 		if (show_only)
@@ -177,7 +177,8 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 				} else
 					bad = "Cannot overwrite";
 			}
-		} else if (cache_name_pos(src, length) < 0)
+		} else if (((pos = cache_name_pos(src, length)) < 0) &&
+			   strcmp(active_cache[-1 - pos]->name, src))
 			bad = "not under version control";
 		else if (string_list_has_string(&src_for_dst, dst))
 			bad = "multiple sources for the same target";
@@ -202,7 +203,6 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 	for (i = 0; i < argc; i++) {
 		const char *src = source[i], *dst = destination[i];
 		enum update_mode mode = modes[i];
-		int pos;
 		if (show_only || verbose)
 			printf("Renaming %s to %s\n", src, dst);
 		if (!show_only && mode != INDEX &&
@@ -212,10 +212,17 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		if (mode == WORKING_DIRECTORY)
 			continue;
 
-		pos = cache_name_pos(src, ...
From: Junio C Hamano
Date: Sunday, July 27, 2008 - 6:21 pm

Just in case if somebody is inclined to test the patch and polish it into

There is a bug here; "-1 - pos" needs to be checked against active_nr
before strcmp().

--

From: SZEDER
Date: Monday, July 28, 2008 - 7:20 am

Hi,

there is a race somewhere in these 'git-mv: Keep moved index entries
inact' changes.

The test cases 'git mv should overwrite symlink to a file' or 'git mv
should overwrite file with a symlink' fail occasionaly.  It's quite
non-deterministic:  I have run t7001-mv.sh in a loop (see below) and
one or the other usually fails around 50 runs (but sometimes only
after 150).  Adding some tracing echos to the tests shows that both
tests fail when running 'git diff-files' at the end.

Regards,
Gábor


---
#!/bin/bash

ret=0
i=0
while test $ret = 0 ; do
        GIT_TEST_OPTS='--verbose --debug' make t7001-mv.sh
        ret=$?
        i=$((i+1))
done
echo "Failed at ${i}th run"
--

From: Johannes Schindelin
Date: Monday, July 28, 2008 - 8:06 am

Hi,

On Mon, 28 Jul 2008, SZEDER G
From: Johannes Schindelin
Date: Monday, July 28, 2008 - 8:14 am

Hi,

On Mon, 28 Jul 2008, Johannes Schindelin wrote:

> On Mon, 28 Jul 2008, SZEDER G
From: Johannes Schindelin
Date: Monday, July 28, 2008 - 11:24 am

Hi,

> > On Mon, 28 Jul 2008, SZEDER G
From: Junio C Hamano
Date: Monday, July 28, 2008 - 12:19 pm

It's because we rename(2) but do not read back ctime, and reuse the cached
data from the old path that was renamed.  After the failed test that moves
a regular file "move" to "symlink":

$ stat symlink
  File: `symlink'
  Size: 2               Blocks: 8          IO Block: 4096   regular file
Device: 30ah/778d       Inode: 18104337    Links: 1
Access: (0664/-rw-rw-r--)  Uid: ( 1012/   junio)   Gid: (   40/     src)
Access: 2008-07-28 11:49:55.000000000 -0700
Modify: 2008-07-28 11:48:41.000000000 -0700
Change: 2008-07-28 11:48:42.000000000 -0700

But the cached stat information looks like this:

$ ../../git-ls-files --stat
ctime=1217270921, mtime=1217270921, ino=18104337, mode=100644, uid=1012, gid=40symlink

We need to refresh the entry to pick up potential ctime changes.


 read-cache.c       |    7 ++++++-
 builtin-ls-files.c |   21 +++++++++++++++------
 2 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index 1cae361..834096f 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -40,7 +40,7 @@ static void replace_index_entry(struct index_state *istate, int nr, struct cache
 
 void rename_index_entry_at(struct index_state *istate, int nr, const char *new_name)
 {
-	struct cache_entry *old = istate->cache[nr], *new;
+	struct cache_entry *old = istate->cache[nr], *new, *refreshed;
 	int namelen = strlen(new_name);
 
 	new = xmalloc(cache_entry_size(namelen));
@@ -51,6 +51,11 @@ void rename_index_entry_at(struct index_state *istate, int nr, const char *new_n
 
 	cache_tree_invalidate_path(istate->cache_tree, old->name);
 	remove_index_entry_at(istate, nr);
+
+	/* the renaming could have smudged the ctime */
+	refreshed = refresh_cache_entry(new, 0);
+	if (refreshed && refreshed != new)
+		new = refreshed;
 	add_index_entry(istate, new, ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE);
 }
 
diff --git a/builtin-ls-files.c b/builtin-ls-files.c
index e8d568e..a6b30c8 100644
--- a/builtin-ls-files.c
+++ b/builtin-ls-files.c
@@ ...
From: Johannes Schindelin
Date: Monday, July 28, 2008 - 4:41 pm

Hi,

> > On Mon, 28 Jul 2008, SZEDER G
From: Johannes Schindelin
Date: Monday, July 28, 2008 - 4:55 pm

Hi,


IOW something like this squashed into your patch:

-- snipsnap --

 t/t7001-mv.sh |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/t/t7001-mv.sh b/t/t7001-mv.sh
index b0fa407..c749059 100755
--- a/t/t7001-mv.sh
+++ b/t/t7001-mv.sh
@@ -179,6 +179,10 @@ test_expect_success 'git mv should overwrite symlink to a file' '
 	git init &&
 	echo 1 >moved &&
 	ln -s moved symlink &&
+	if test ! -z "$TEST_EXPENSIVE_CTIME"
+	then
+		sleep 1
+	fi &&
 	git add moved symlink &&
 	test_must_fail git mv moved symlink &&
 	git mv -f moved symlink &&
--

From: Petr Baudis
Date: Monday, July 28, 2008 - 5:17 pm

If you are going to apply this, you might be interested in squashing
this:

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index f43af41..2ead7af 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -53,7 +53,14 @@ OPTIONS
 
 -s::
 --stage::
-	Show stage files in the output
+	Show cached files in an extended format also listing the entry
+	mode, sha1 and stage number; see below for details.
+
+-S::
+--stat::
+	Show cached files in a special format listing stat information
+	stored in the index; this is useful probably only for Git
+	debugging purposes.
 
 --directory::
 	If a whole directory is classified as "other", show just its

(and even if you aren't, maybe the first part is useful, though it's
minor.)

-- 
				Petr "Pasky" Baudis
As in certain cults it is possible to kill a process if you know
its true name.  -- Ken Thompson and Dennis M. Ritchie
--

From: Junio C Hamano
Date: Monday, July 28, 2008 - 5:46 pm

I did think about ctime issues when I applied the patch.

rename(2) is hardlink to new name followed by unlink of old name, so
internally link count changes twice (and the first "link to new" can
exceed max links and is even allowed to make the whole thing fail); but I
do not think of any other reason for this change in ctime.
--

From: Junio C Hamano
Date: Monday, July 28, 2008 - 10:23 pm

Actually I changed my mind.

I think it is wrong for something as low-level as rename_index_entry_at()
to unconditionally look at the working tree and try refreshing the entry.
The next caller of this function may not even require a working tree.

I think Dscho is correct; expecting the saved cacheinfo to stay fresh
across rename does not make much sense.  What we care about with "git mv"
is that we keep what the user staged, i.e. kind of blob and the contents.
It cannot be guaranteed if the index entry is stat clean or not, as
st_ctime may or may not be updated across rename(2) according to POSIX.

So let's do this instead.

-- >8 --
t7001: fix "git mv" test 

The test assumed that we can keep the cached stat information fresh across
rename(2); many filesystems however update st_ctime (and POSIX allows them
to do so), so that assumption does not hold.  We can explicitly refresh
the index for the purpose of the test, as the only thing we are interested
in is the staged contents and the mode bits are preserved across "mv".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t7001-mv.sh |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/t/t7001-mv.sh b/t/t7001-mv.sh
index b0fa407..910a28c 100755
--- a/t/t7001-mv.sh
+++ b/t/t7001-mv.sh
@@ -185,6 +185,7 @@ test_expect_success 'git mv should overwrite symlink to a file' '
 	! test -e moved &&
 	test -f symlink &&
 	test "$(cat symlink)" = 1 &&
+	git update-index --refresh &&
 	git diff-files --quiet
 
 '
@@ -202,6 +203,7 @@ test_expect_success 'git mv should overwrite file with a symlink' '
 	git mv -f symlink moved &&
 	! test -e symlink &&
 	test -h moved &&
+	git update-index --refresh &&
 	git diff-files --quiet
 
 '

--

From: Junio C Hamano
Date: Monday, August 4, 2008 - 12:49 am

I started to revisit this issue and patched "git update-index --add"
and "git add" so far.  Patches follow.

I think we should also check the following and fix them if any of them is
broken.  Under the same "dir/file exists but symlink points at dir"
scenario:

 * "git rm symlink/file" --- should fail without removing dir/file;

 * "git ls-tree $commit -- symlink/file" should *not* fail if $commit does
   have "symlink/file" in it (iow, we cannot add the logic to get_pathspec());

 * "git ls-files --exclude-standard -o -- symlink/file" should not talk
   about "symlink/file".

If we reword the paragraph I quoted at the beginning of this message
slightly:

	A gitlink to is is just a leaf level of a tree structure that
	represents the working tree in the index.  There won't be anything
	hanging below it.

We would need a similar check to stop at module boundary, just like the
helper function we use for these patches, has_symlink_leading_path(),
stops at a symlink.

So without further ado...
--

From: Junio C Hamano
Date: Monday, August 4, 2008 - 12:51 am

When "sym" is a symbolic link that is inside the working tree, and it
points at a directory "dir" that has "path" in it, "update-index --add
sym/path" used to mistakenly add "sym/path" as if "sym" were a normal
directory.

"git apply", "git diff" and "git merge" have been taught about this issue
some time ago, but "update-index" and "add" have been left ignorant for
too long.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-update-index.c     |    5 ++++-
 t/t0055-beyond-symlinks.sh |   20 ++++++++++++++++++++
 2 files changed, 24 insertions(+), 1 deletions(-)
 create mode 100755 t/t0055-beyond-symlinks.sh

diff --git a/builtin-update-index.c b/builtin-update-index.c
index 38eb53c..434cb8e 100644
--- a/builtin-update-index.c
+++ b/builtin-update-index.c
@@ -194,6 +194,10 @@ static int process_path(const char *path)
 	int len;
 	struct stat st;
 
+	len = strlen(path);
+	if (has_symlink_leading_path(len, path))
+		return error("'%s' is beyond a symbolic link", path);
+
 	/*
 	 * First things first: get the stat information, to decide
 	 * what to do about the pathname!
@@ -201,7 +205,6 @@ static int process_path(const char *path)
 	if (lstat(path, &st) < 0)
 		return process_lstat_error(path, errno);
 
-	len = strlen(path);
 	if (S_ISDIR(st.st_mode))
 		return process_directory(path, len, &st);
 
diff --git a/t/t0055-beyond-symlinks.sh b/t/t0055-beyond-symlinks.sh
new file mode 100755
index 0000000..eb11dd7
--- /dev/null
+++ b/t/t0055-beyond-symlinks.sh
@@ -0,0 +1,20 @@
+#!/bin/sh
+
+test_description='update-index refuses to add beyond symlinks'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+	>a &&
+	mkdir b &&
+	ln -s b c &&
+	>c/d &&
+	git update-index --add a b/d
+'
+
+test_expect_success 'update-index --add beyond symlinks' '
+	test_must_fail git update-index --add c/d &&
+	! ( git ls-files | grep c/d )
+'
+
+test_done
-- 
1.6.0.rc1.64.g61192

--

From: Junio C Hamano
Date: Monday, August 4, 2008 - 12:52 am

This is the same fix for the issue of adding "sym/path" when "sym" is a
symblic link that points at a directory "dir" with "path" in it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-add.c              |   12 +++++++++++-
 dir.c                      |    6 +++++-
 t/t0055-beyond-symlinks.sh |    7 ++++++-
 3 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/builtin-add.c b/builtin-add.c
index fc3f96e..81b64d7 100644
--- a/builtin-add.c
+++ b/builtin-add.c
@@ -153,6 +153,16 @@ static const char **validate_pathspec(int argc, const char **argv, const char *p
 {
 	const char **pathspec = get_pathspec(prefix, argv);
 
+	if (pathspec) {
+		const char **p;
+		for (p = pathspec; *p; p++) {
+			if (has_symlink_leading_path(strlen(*p), *p)) {
+				int len = prefix ? strlen(prefix) : 0;
+				die("'%s' is beyond a symbolic link", *p + len);
+			}
+		}
+	}
+
 	return pathspec;
 }
 
@@ -278,7 +288,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 		fprintf(stderr, "Maybe you wanted to say 'git add .'?\n");
 		return 0;
 	}
-	pathspec = get_pathspec(prefix, argv);
+	pathspec = validate_pathspec(argc, argv, prefix);
 
 	/*
 	 * If we are adding new files, we need to scan the working
diff --git a/dir.c b/dir.c
index 29d1d5b..ae7046f 100644
--- a/dir.c
+++ b/dir.c
@@ -727,8 +727,12 @@ static void free_simplify(struct path_simplify *simplify)
 
 int read_directory(struct dir_struct *dir, const char *path, const char *base, int baselen, const char **pathspec)
 {
-	struct path_simplify *simplify = create_simplify(pathspec);
+	struct path_simplify *simplify;
 
+	if (has_symlink_leading_path(strlen(path), path))
+		return dir->nr;
+
+	simplify = create_simplify(pathspec);
 	read_directory_recursive(dir, path, base, baselen, 0, simplify);
 	free_simplify(simplify);
 	qsort(dir->entries, dir->nr, sizeof(struct dir_entry *), cmp_name);
diff --git a/t/t0055-beyond-symlinks.sh b/t/t0055-beyond-symlinks.sh
index eb11dd7..b29c37a ...
From: Linus Torvalds
Date: Monday, August 4, 2008 - 5:21 pm

Patches look good to me, but did you check the performance impact?

The rewritten 'has_symlink_leading_path()' should do ok, but it migth 
still be a huge performance downside to check all the paths for things 
like "git add -u".

I didn't test, though.

		Linus
--

From: Junio C Hamano
Date: Monday, August 4, 2008 - 5:54 pm

Not yet.

I think this is a necessary "correctness" thing to do regardless of the
performance impact, and adding the logic to stop at submodule boundary
(aka gitlinks) should come before optimization.

--

From: Linus Torvalds
Date: Monday, August 4, 2008 - 6:43 pm

Well, "performance" is a feature too, and it's not correct to say that "X 
should be fixed before optimization". If "X" slows things down, the 
question should be whether it really needs fixing..

Yes, we find symlinks when we do _new_ files, but is it really so bad to 
assume that existing directories that we have already added to the index 
are stable? It can easily be seen as a feature too that you can force git 
to ignore the symlink and see it as a real directory.

So that's why it would be interesting to hear about the performance 
impact. Because this is definitely not a black-and-white "one behavior is 
wrong and one behavior is right".

			Linus
--

From: Johannes Schindelin
Date: Monday, August 4, 2008 - 6:59 pm

Hi,


Actually, whatever you want, it needs fixing.

I vividly remember being quite pissed by Git replacing a symbolic link in 
my working directory with a directory, and instead of updating the files 
which were technically outside of the repository, Git populated that newly 
created directory.

However, please note that Junio's patch affects git-add, AFAIR, not 
git-update-index.

Ciao,
Dscho

--

From: Linus Torvalds
Date: Monday, August 4, 2008 - 7:28 pm

Well, that can cut both ways. For example, I vividly remember a time in 
the distant past when harddisks were tiny, and I didn't have insanely 
high-end hardware, and I was building the X server, but had to split 
things up over two partitions because each individual partition was 
too full.

IOW, sometimes you may _want_ to use symlinks that way, even within one 
project - with a symlink allowing you to move parts of it around 
"transparently".

Of course, these days under Linux we can just use bind mounts, so the use 
of symlinks to stitch together two or more different trees is fairly 
old-fashioned, but is still the only option on some systems or if you 
don't have root.

(These days harddisks are also generally so big that it never happens. But 
on my EeePC laptop, I still end up with two filesystems, 4GB  and 8GB 
each. So it's not inconceivable to be in that kind of situation even 
today).

			Linus
--

From: Junio C Hamano
Date: Monday, August 4, 2008 - 9:44 pm

Really?  I thought I added a test case to cover the plumbing as well...
--

From: Johannes Schindelin
Date: Tuesday, August 5, 2008 - 4:23 am

Hi,


Right, I missed that one.  I kinda hoped that only porcelain is affected.

Ciao,
Dscho

--

From: Junio C Hamano
Date: Wednesday, August 6, 2008 - 11:52 pm

I couldn't quite measure much meaningful performance impact.

My test repository was the kernel tree at v2.6.27-rc2-20-g685d87f, without
any build products nor editor temporary crufts.

By the way, if anybody wants to reproduce this, be careful with the tests
that run "rm -f .git/index" before adding everything.  After doing so, if
you check the result with "git diff --stat HEAD", you will notice many
missing files --- I almost got a heart attack before inspecting this file:

	$ cat arch/powerpc/.gitignore
        include

Yes, it excludes 261 already tracked files.  Is it intended?  I dunno.

The file is there since 06f2138 ([POWERPC] Add files build to .gitignore,
2006-11-26).  Not that having tracked files in an entirely ignored
directory is a bug, but the .gitignore entry seems to me an Oops waiting
to happen.  There are three commits that affect this directory that is
entirely ignored after that entry is added:

    b5b9309 (remove unnecessary <linux/hdreg.h> includes, 2008-08-05)
    9c4cb82 (powerpc: Remove use of CONFIG_PPC_MERGE, 2008-08-02)
    b8b572e (powerpc: Move include files to arch/powerpc/include/asm, 2008-08-01)

Anyhow, back on topic.  Here are the numbers.

Test #1: With fully up-to-date .git/index, "git add .", best of 5 runs

* v1.6.0-rc2
0.20user 0.20system 0:00.41elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+3622minor)pagefaults 0swaps

* v1.6.0-rc1-73-g725b060 (with patch)
0.22user 0.19system 0:00.41elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+3635minor)pagefaults 0swaps


Test #2: After "rm -f .git/index", "git add .", best of 5 runs

* v1.6.0-rc2
1.81user 0.55system 0:02.51elapsed 93%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+92855minor)pagefaults 0swaps

* v1.6.0-rc1-73-g725b060 (with patch)
1.76user 0.56system 0:02.58elapsed 89%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+92852minor)pagefaults 0swaps


Test #3: same as #2, after ...
From: Junio C Hamano
Date: Friday, August 8, 2008 - 1:55 pm

Seeing that you applied my arch/powerpc/.gitignore patch to the kernel
(Yaay, I now have a short-log entry in the kernel history ;-), you have
seen the message with some benchmarks I am replying to as well?

I actually was expecting to see measurable slowdown by checking leading
symbolic link more often, but I couldn't, which means either there is not
much downside that the bugfix would hurt real-life usage, or I was too
incompetent to come up with an example to show that the bugfix actually
hurts the performance.

If the former, I think it is a very good news (If the latter, well, we
have a bigger problem :-<), as the affected codepaths are exactly the ones
that need to be fixed when somebody tries to add files in a submodule
directory by mistake, which is an issue (iirc) Dscho had a separate patch
to fix earlier.  Tracked symbolic links may be rare, but it seems that
many people are using submodules, and that case cannot be hand-waved
around by calling it is a feature.


--

From: Linus Torvalds
Date: Friday, August 8, 2008 - 4:45 pm

Yes, I just felt that closed the discussion.

I only brought up the issue in the first place because I worried about the 
performance impact. And I was unhappy about how that worry was dismissed 
as being less important than some specious "correctness" issue (for the 
last 3+ years, performance has mattered a _lot_, and the claimed big 
"correctness" issue has not mattered one whit).

The thing is, sometimes "pi = 3.14" is (a) infinitely faster than the 
"correct" answer and (b) the difference between the "correct" and the 
"wrong" answer is meaningless. And this is why I get upset when somebody 
dismisses performance issues based on "correctness".

The thing is, some specious value of "correctness" is often irrelevant 
because it doesn't matter. While performance almost _always_ matters. And 
I absolutely _detest_ the fact that people so often dismiss performance 
concerns so readily.

But once the performance numbers are in and they don't show any issues, I 
think that simply settles the original query, and I'm happy.

		Linus
--

From: Johannes Schindelin
Date: Sunday, July 20, 2008 - 6:20 pm

Hi,


"rev-parse :dirty" will have to read the index to get at the object name 
of "dirty".  So there you have your index validation for you.

Ciao,
Dscho

--

From: Petr Baudis
Date: Monday, July 21, 2008 - 12:18 am

Hi,


  it will test if the entry stays _valid_, but not whether it stays the
_same_.

				Petr "Pasky" Baudis
--

From: Junio C Hamano
Date: Monday, July 21, 2008 - 12:38 am

You are right.  You would want to catch breakages like a file losing its
executable bits, which you cannot detect by grabbing "rev-parse :dirty"
and comparing it with the expected value.
--

From: Petr Baudis
Date: Wednesday, July 16, 2008 - 12:11 pm

The interface will be used for git-mv and git-rm submodule support.
So far, only the submodule_by_path() function is defined, however more
can be probably expected in the future if/when the git-submodule command
is ported from shell.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 Makefile    |    2 ++
 submodule.c |   50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 submodule.h |    8 ++++++++
 3 files changed, 60 insertions(+), 0 deletions(-)
 create mode 100644 submodule.c
 create mode 100644 submodule.h

diff --git a/Makefile b/Makefile
index 9b52071..742b5cb 100644
--- a/Makefile
+++ b/Makefile
@@ -368,6 +368,7 @@ LIB_H += run-command.h
 LIB_H += sha1-lookup.h
 LIB_H += sideband.h
 LIB_H += strbuf.h
+LIB_H += submodule.h
 LIB_H += tag.h
 LIB_H += transport.h
 LIB_H += tree.h
@@ -458,6 +459,7 @@ LIB_OBJS += sha1_name.o
 LIB_OBJS += shallow.o
 LIB_OBJS += sideband.o
 LIB_OBJS += strbuf.o
+LIB_OBJS += submodule.o
 LIB_OBJS += symlinks.o
 LIB_OBJS += tag.o
 LIB_OBJS += trace.o
diff --git a/submodule.c b/submodule.c
new file mode 100644
index 0000000..2883ae6
--- /dev/null
+++ b/submodule.c
@@ -0,0 +1,50 @@
+#include "cache.h"
+#include "submodule.h"
+
+
+struct gitmodules_info {
+	const char *path;
+	char *key;
+};
+
+static int gitmodules_worker(const char *key, const char *value, void *info_)
+{
+	struct gitmodules_info *info = info_;
+	const char *subkey;
+
+	if (prefixcmp(key, "submodule."))
+		return 0;
+
+	subkey = strrchr(key, '.');
+	if (!subkey)
+		return 0;
+
+	if (strcmp(subkey, ".path"))
+		return 0;
+
+	if (strcmp(value, info->path))
+		return 0;
+
+	/* Found the key to change. */
+	if (info->key) {
+		error("multiple submodules live at path `%s'", info->path);
+		/* The last one is supposed to win. */
+		free(info->key);
+	}
+	info->key = xstrdup(key);
+	return 0;
+}
+
+char *submodule_by_path(const char *path)
+{
+	struct gitmodules_info info = { path, NULL };
+
+	config_exclusive_filename = ...
From: Petr Baudis
Date: Wednesday, July 16, 2008 - 12:11 pm

This patch adds support for removing submodules to 'git rm', including
removing the appropriate sections from the .gitmodules file to reflect this

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 Documentation/git-rm.txt |    6 +++-
 builtin-rm.c             |   65 ++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 58 insertions(+), 13 deletions(-)

diff --git a/Documentation/git-rm.txt b/Documentation/git-rm.txt
index 4d0c495..bfc3dfa 100644
--- a/Documentation/git-rm.txt
+++ b/Documentation/git-rm.txt
@@ -20,7 +20,8 @@ and no updates to their contents can be staged in the index,
 though that default behavior can be overridden with the `-f` option.
 When '--cached' is given, the staged content has to
 match either the tip of the branch or the file on disk,
-allowing the file to be removed from just the index.
+allowing the file to be removed from just the index;
+this is always the case when removing submodules.
 
 
 OPTIONS
@@ -56,7 +57,8 @@ OPTIONS
 --cached::
 	Use this option to unstage and remove paths only from the index.
 	Working tree files, whether modified or not, will be
-	left alone.
+	left alone.  Note that this is always assumed when removing
+	a checked-out submodule.
 
 --ignore-unmatch::
 	Exit with a zero status even if no files matched.
diff --git a/builtin-rm.c b/builtin-rm.c
index 22c9bd1..363d1fa 100644
--- a/builtin-rm.c
+++ b/builtin-rm.c
@@ -9,6 +9,7 @@
 #include "cache-tree.h"
 #include "tree-walk.h"
 #include "parse-options.h"
+#include "submodule.h"
 
 static const char * const builtin_rm_usage[] = {
 	"git-rm [options] [--] <file>...",
@@ -17,16 +18,21 @@ static const char * const builtin_rm_usage[] = {
 
 static struct {
 	int nr, alloc;
-	const char **name;
+	struct {
+		const char *name;
+		int is_gitlink;
+	} *info;
 } list;
 
-static void add_list(const char *name)
+static void add_list(const char *name, int is_gitlink)
 {
 	if (list.nr >= list.alloc) {
 		list.alloc = ...
From: Johannes Schindelin
Date: Wednesday, July 16, 2008 - 3:41 pm

Hi,


I have no time to look at the whole series, or even at the patch, but I 
have concerns.  Do you really remove the whole directory?  Because if you 
do, you remove more than what can be possibly reconstructed from the 
object store.

That is much more dangerous than what "git rm" does.

For example, you can have local branches, remote settings, untracked 
files, etc. in the subdirectory.  And that cannot be recovered once 
deleted.

I wonder if it really makes sense to integrate that into git-rm, and not 
git-submodule, if only to introduce another level of consideration for the 
user before committing what is potentially a big mistake.

Ciao,
Dscho


--

From: Petr Baudis
Date: Thursday, July 17, 2008 - 5:35 am

Hi,


  no; I remove only the index entry and the empty directory made for
non-checked-out submodules (I just try rmdir() and ignore ENOTEMPTY
error).  I make that clear in git rm documentation, but not in the patch

  That is good question and I forgot to elaborate on this in the cover
letter. However, I believe that integrating this functionality into the
basic commands makes for a smoother user experience, following the
principle of the least surprise. It is difficult for me to argue
extensively in further favor of this choice, though. :-)

-- 
				Petr "Pasky" Baudis
GNU, n. An animal of South Africa, which in its domesticated state
resembles a horse, a buffalo and a stag. In its wild condition it is
something like a thunderbolt, an earthquake and a cyclone. -- A. Pierce
--

From: Johannes Schindelin
Date: Thursday, July 17, 2008 - 5:59 am

Hi,



Makes sense,
Dscho

--

From: Petr Baudis
Date: Wednesday, July 16, 2008 - 12:11 pm

The testsuite for newly added submodule support in git mv, git rm.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---

 t/t7403-submodule-mvrm.sh |  242 +++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 242 insertions(+), 0 deletions(-)
 create mode 100755 t/t7403-submodule-mvrm.sh

diff --git a/t/t7403-submodule-mvrm.sh b/t/t7403-submodule-mvrm.sh
new file mode 100755
index 0000000..9b50d6a
--- /dev/null
+++ b/t/t7403-submodule-mvrm.sh
@@ -0,0 +1,242 @@
+#!/bin/sh
+#
+# Copyright (c) 2008 Johannes Schindelin
+#
+
+test_description='Test submodules support in git mv and git rm'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+
+	(mkdir sub-repo &&
+	 cd sub-repo &&
+	 git init &&
+	 echo file > file &&
+	 git add file &&
+	 git commit -m "sub initial") &&
+	(cp -r sub-repo sub2-repo &&
+	 cd sub2-repo &&
+	 echo file2 > file &&
+	 git add file &&
+	 git commit -m "sub commit2") &&
+	git submodule add "$(pwd)/sub-repo" sub &&
+	git submodule add "$(pwd)/sub2-repo" sub2 &&
+	git commit -m initial &&
+	test "$(git config -f .gitmodules submodule.sub.path)" = "sub" &&
+	test "$(git config -f .gitmodules submodule.sub2.path)" = "sub2"
+
+'
+
+test_expect_success 'git mv of a submodule' '
+
+	git mv sub sub.moved &&
+	! test -d sub &&
+	test -d sub.moved/.git &&
+	! git ls-files --error-unmatch sub &&
+	test "$(git ls-files --stage --error-unmatch sub.moved | cut -d " " -f 1)" = 160000 &&
+	test "$(git config -f .gitmodules submodule.sub.path)" = "sub.moved" &&
+	! git config -f .gitmodules submodule.sub.moved.path
+
+'
+
+test_expect_success 'git submodule add vs. git mv' '
+
+	! git submodule add "$(pwd)/sub2-repo" sub.moved &&
+	git submodule add "$(pwd)/sub2-repo" sub &&
+	test -d sub/.git &&
+	test "$(git config -f .gitmodules submodule.sub.url)" = "$(pwd)/sub-repo" &&
+	test "$(git config -f .gitmodules submodule.sub.path)" = "sub.moved" &&
+	test "$(git config -f .gitmodules submodule.sub~.path)" = ...
Previous thread: none

Next thread: [PATCH] Reformat "your branch has diverged..." lines to reduce line length. by Avery Pennarun on Wednesday, July 16, 2008 - 12:19 pm. (4 messages)