Re: [PATCH 2/7] git-submodule: Extract absolute_url & move absolute url logic to module_clone

From: Ping Yin
Date: Wednesday, April 16, 2008 - 7:19 am

This is a resend of the RFC patches from some days ago, with only minor
code modifications and log refinements. The order of the last two
patches is also swapped.

Since there has been little feedback these days, I don't know what you
guys think of this patch series. However, I have used this series for a
long time and personally think it is useful when there are many
submodules.

So I am resending it, and look forward to its acceptance.

This patch series has the following functional improvements for submodule:

 - Fall back on .gitmodules if info not found in $GIT_DIR/config
 - multi-level module definition
 - Don't die when subcommand fails for one module

Actually, they seem to be three independent improvements. But the first
two improvements both depend on the first two refactoring patches, and
the 3rd improvement depends on the implementation of the first two
improvements. So I have to send them as a batch.

Patches 1, 2 and 4 are mainly code refactoring, but the second one also
has some semantic change.

The other patches do the real functional changes.

Ping Yin (7):
      git-submodule: Extract functions module_info and module_url
      git-submodule: Extract absolute_url & move absolute url logic to module_clone
      git-submodule: Fall back on .gitmodules if info not found in $GIT_DIR/config
      git-submodule: Extract module_add from cmd_add
      git-submodule: multi-level module definition
      git-submodule: "update --force" to enforce cloning non-submodule
      git-submodule: Don't die when command fails for one submodule

 git-submodule.sh           |  325 ++++++++++++++++++++++++++++++++------------
 t/t7400-submodule-basic.sh |   31 ++++-
 2 files changed, 266 insertions(+), 90 deletions(-)


Following is the diff with former RFC patch series

diff --git a/git-submodule.sh b/git-submodule.sh
index 8bea97a..0ecc4ff 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -354,7 +354,7 @@ cmd_init()
 		exit_status=1 &&
 		continue
 		# Skip already registered paths
-		git ...
From: Ping Yin
Date: Wednesday, April 16, 2008 - 7:19 am

module_info is extracted to remove the redundant logic that looks up
module names and urls by path filter in several places.

module_url is also extracted to prepare for an alternative way of getting
the url by module name.

Signed-off-by: Ping Yin <pkufranky@gmail.com>
---
 git-submodule.sh |   40 ++++++++++++++++++++++++++++------------
 1 files changed, 28 insertions(+), 12 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index a745e42..0d82ec1 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -82,6 +82,25 @@ module_name()
        echo "$name"
 }
 
+module_url() {
+	git config submodule.$1.url
+}
+
+module_info() {
+	git ls-files --stage -- "$@" | grep -e '^160000 ' |
+	while read mode sha1 stage path
+	do
+		name=$(module_name "$path")
+		if test -n "$name"
+		then
+			url=$(module_url "$name")
+			echo "$sha1	$path	$name	$url"
+		else
+			echo "$sha1	$path		"
+		fi
+	done
+}
+
 #
 # Clone a submodule
 #
@@ -232,12 +251,11 @@ cmd_init()
 		shift
 	done
 
-	git ls-files --stage -- "$@" | grep '^160000 ' |
-	while read mode sha1 stage path
+	module_info "$@" |
+	while read sha1 path name url
 	do
+		test -n "$name" || exit
 		# Skip already registered paths
-		name=$(module_name "$path") || exit
-		url=$(git config submodule."$name".url)
 		test -z "$url" || continue
 
 		url=$(GIT_CONFIG=.gitmodules git config submodule."$name".url)
@@ -286,11 +304,10 @@ cmd_update()
 		shift
 	done
 
-	git ls-files --stage -- "$@" | grep '^160000 ' |
-	while read mode sha1 stage path
+	module_info "$@" |
+	while read sha1 path name url
 	do
-		name=$(module_name "$path") || exit
-		url=$(git config submodule."$name".url)
+		test -n "$name" || exit
 		if test -z "$url"
 		then
 			# Only mention uninitialized submodules when its
@@ -538,11 +555,10 @@ cmd_status()
 		shift
 	done
 
-	git ls-files --stage -- "$@" | grep '^160000 ' |
-	while read mode sha1 stage path
+	module_info "$@" |
+	while read sha1 path name url
 ...
From: Ping Yin
Date: Wednesday, April 16, 2008 - 7:19 am

Extract function absolute_url to remove code redundancy and inconsistency
in cmd_init and cmd_add when resolving a relative url/path to an absolute
one.

Also move the absolute url resolution logic from cmd_add to module_clone,
which results in a little behaviour change: cmd_update originally didn't
resolve absolute urls but now it will.

This behaviour change breaks t7400, which uses the relative url
'./.subrepo'. However, this test was never meant to test relative urls
with './', so change the url to '.subrepo'.

Signed-off-by: Ping Yin <pkufranky@gmail.com>
---
 git-submodule.sh           |   41 ++++++++++++++++++-----------------------
 t/t7400-submodule-basic.sh |    2 +-
 2 files changed, 19 insertions(+), 24 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 0d82ec1..d3ae1e4 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -65,6 +65,21 @@ resolve_relative_url ()
 	echo "$remoteurl/$url"
 }
 
+# Resolve relative url/path to absolute one
+absolute_url () {
+	case "$1" in
+	./*|../*)
+		# dereference source url relative to parent's url
+		url="$(resolve_relative_url $1)" ;;
+	*)
+		# Turn the source into an absolute path if it is local
+		url=$(get_repo_base "$1") ||
+		url=$1
+		;;
+	esac
+	echo "$url"
+}
+
 #
 # Map submodule path to submodule name
 #
@@ -112,7 +127,7 @@ module_info() {
 module_clone()
 {
 	path=$1
-	url=$2
+	url=$(absolute_url "$2")
 
 	# If there already is a directory at the submodule path,
 	# expect it to be empty (since that is the default checkout
@@ -195,21 +210,7 @@ cmd_add()
 			die "'$path' already exists and is not a valid git repo"
 		fi
 	else
-		case "$repo" in
-		./*|../*)
-			# dereference source url relative to parent's url
-			realrepo="$(resolve_relative_url $repo)" ;;
-		*)
-			# Turn the source into an absolute path if
-			# it is local
-			if base=$(get_repo_base "$repo"); then
-				repo="$base"
-			fi
-			realrepo=$repo
-			;;
-		esac
-
-		module_clone "$path" "$realrepo" || ...
From: Ping Yin
Date: Wednesday, April 16, 2008 - 7:19 am

Originally, the submodule workflow requires running 'git submodule init'
at the beginning, which copies submodule config info from .gitmodules to
$GIT_DIR/config. After that, all subcommands except 'init' and 'add' fetch
submodule info from $GIT_DIR/config, and .gitmodules can be discarded.

However, always running 'git submodule init' first can leave
$GIT_DIR/config and .gitmodules inconsistent. If the upstream .gitmodules
changes, it is not easy to sync the changes to $GIT_DIR/config.

Running 'git submodule init' again may not help much in this case. Since
$GIT_DIR/config has a whole copy of .gitmodules, the user has no easy way
to know which entries should follow the upstream changes and which
entries shouldn't.

Actually, .gitmodules, which formerly only acted as a source of hints,
can and should play a more important and essential role.

By analogy with .gitignore and .git/info/exclude, which express shared
and individual wishes respectively, .gitmodules is for common
requirements and $GIT_DIR/config is for special requirements.

This patch implements a fallback strategy to satisfy both common and
special requirements as follows.

$GIT_DIR/config only keeps submodule info that differs from .gitmodules,
and the info from $GIT_DIR/config takes higher precedence. The code first
consults $GIT_DIR/config and then falls back on the in-tree .gitmodules
file.

With this patch, the init subcommand is no longer mandatory and becomes
less meaningful. It is now just a tool to help users copy info to
$GIT_DIR/config (which they may modify later) only when they need to.
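
As a rough illustration of the lookup order described above (a sketch,
not the code from this patch): the helper names lookup and
lookup_with_fallback are hypothetical, and the two files are plain
key=value stand-ins for $GIT_DIR/config and .gitmodules so that the
example runs without a repository. The patch itself chains 'git config'
lookups with '||' in module_url.

```shell
#!/bin/sh
# Hypothetical stand-ins for $GIT_DIR/config and .gitmodules: plain
# "key=value" files, so the sketch runs without git.

# Print the value stored for key $2 in file $1; fail if it is absent.
lookup() {
	while IFS='=' read -r key value
	do
		if test "$key" = "$2"
		then
			printf '%s\n' "$value"
			return 0
		fi
	done <"$1"
	return 1
}

# Consult the local file ($1) first, then fall back on the shared one ($2).
lookup_with_fallback() {
	lookup "$1" "$3" || lookup "$2" "$3"
}
```

With a local file that overrides only submodule.util.url, a call such as
'lookup_with_fallback config gitmodules submodule.util.url' prints the
local value, while keys present only in the shared file are taken from
there.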

Signed-off-by: Ping Yin <pkufranky@gmail.com>
---
 git-submodule.sh           |    9 ++++-----
 t/t7400-submodule-basic.sh |   29 +++++++++++++++++++++++++++++
 2 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index d3ae1e4..2276f6b 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -98,7 +98,8 @@ module_name()
 }
 
 module_url() {
-	git config submodule.$1.url
+	git config submodule.$1.url ||
+	GIT_CONFIG=.gitmodules ...
From: Ping Yin
Date: Wednesday, April 16, 2008 - 7:19 am

cmd_add will later handle adding multiple modules, so extract
module_add to add a single module.

Signed-off-by: Ping Yin <pkufranky@gmail.com>
---
 git-submodule.sh |   67 +++++++++++++++++++++++++++++++----------------------
 1 files changed, 39 insertions(+), 28 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 2276f6b..f3a1213 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -155,34 +155,7 @@ module_clone()
 #
 # optional branch is stored in global branch variable
 #
-cmd_add()
-{
-	# parse $args after "submodule ... add".
-	while test $# -ne 0
-	do
-		case "$1" in
-		-b | --branch)
-			case "$2" in '') usage ;; esac
-			branch=$2
-			shift
-			;;
-		-q|--quiet)
-			quiet=1
-			;;
-		--)
-			shift
-			break
-			;;
-		-*)
-			usage
-			;;
-		*)
-			break
-			;;
-		esac
-		shift
-	done
-
+module_add() {
 	repo=$1
 	path=$2
 
@@ -226,6 +199,44 @@ cmd_add()
 }
 
 #
+# Add a new submodule to the working tree, .gitmodules and the index
+#
+# $@ = repo [path]
+#
+# optional branch is stored in global branch variable
+#
+cmd_add()
+{
+	# parse $args after "submodule ... add".
+	while test $# -ne 0
+	do
+		case "$1" in
+		-b | --branch)
+			case "$2" in '') usage ;; esac
+			branch=$2
+			shift
+			;;
+		-q|--quiet)
+			quiet=1
+			;;
+		--)
+			shift
+			break
+			;;
+		-*)
+			usage
+			;;
+		*)
+			break
+			;;
+		esac
+		shift
+	done
+
+	module_add "$1" "$2"
+}
+
+#
 # Register submodules in .git/config
 #
 # $@ = requested paths (default to all)
-- 
1.5.5.70.gd68a

--

From: Ping Yin
Date: Wednesday, April 16, 2008 - 7:19 am

This patch introduces multi-level module definition and the
'--module-name' option to designate submodules by logical names instead
of path filters. The init/update/status/add subcommands are enhanced to
work with this option.

The multi-level module definition in .gitmodules was first suggested by
Linus and others in the thread "Let .git/config specify the url for
submodules"
(http://article.gmane.org/gmane.comp.version-control.git/48939).

The following is an example of such a .gitmodules. Its notation derives
from the group notation of 'git remote', as suggested by Johannes
Schindelin.

.gitmodules with multiple levels of indirection
------------------------------------------------------
[submodules]
	service = crawler search
	crawler = util imcrawler
	search = util imsearch
[submodule "util"]
	url = git://xyzzy/util.git
[submodule "imsearch"]
	path = search/imsearch
	url = git://xyzzy/imsearch.git
[submodule "imcrawler"]
	path = crawler/imcrawler
	url = git://xyzzy/imcrawler.git
------------------------------------------------------

By adding the 'submodules' section, we can define multi-level modules
with arbitrarily many levels of indirection.

The "-m|--module-name" option is introduced, with which submodules are
designated by logical names instead of real paths, as shown below.

Identical command forms with/without "--module-name"
---------------------------------------------------
$ git submodule XXX util imcrawler              (1)
$ git submodule XXX -m crawler                  (2)
$ git submodule XXX util imcrawler imsearch     (3)
$ git submodule XXX -m service                  (4)
$ git submodule XXX -m crawler search           (5)
---------------------------------------------------
* XXX represents status, update or init, but not add
* (1) and (2) are identical conditionally (explained below)
* (3), (4) and (5) are identical conditionally
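
One way to picture how logical names could expand to the same module
list is the sketch below. It is an illustration under stated
assumptions, not the implementation in this patch: group_members,
expand_raw and expand_modules are hypothetical names, the group table is
hard-coded from the example .gitmodules (with the module spelled
imcrawler, as in the command listing) instead of being read with
'git config', and the groups are assumed to contain no cycles.

```shell
#!/bin/sh
# Hypothetical group table, hard-coded from the example [submodules]
# section; real code would read it from .gitmodules instead.
group_members() {
	case "$1" in
	service) echo "crawler search" ;;
	crawler) echo "util imcrawler" ;;
	search)  echo "util imsearch" ;;
	*)       return 1 ;;
	esac
}

# Recursively print the leaf names behind each logical name.
# Assumes the groups contain no cycles.
expand_raw() {
	for name in "$@"
	do
		if members=$(group_members "$name")
		then
			# Unquoted on purpose: split the member list into words.
			expand_raw $members
		else
			echo "$name"
		fi
	done
}

# Same as expand_raw, but keep only the first occurrence of each name.
expand_modules() {
	expand_raw "$@" | awk '!seen[$0]++'
}
```

In this sketch, 'expand_modules service' and 'expand_modules crawler
search' both print util, imcrawler and imsearch, one per line, which
corresponds to forms (3), (4) and (5) above naming the same modules.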

There are still minor differences between these two forms.

In the no "--module-name" form, the path parameter may be ...
From: Ping Yin
Date: Wednesday, April 16, 2008 - 7:19 am

If the update subcommand is combined with --force, then instead of
issuing a "Not a submodule" warning for non-submodules, non-submodules
(i.e. modules that exist in .gitmodules or $GIT_DIR/config but have not
been added to the super module) will also be cloned and the master
branch will be checked out.

However, if a non-submodule has already been cloned before, the update
will be rejected since we don't know what the update means.

Signed-off-by: Ping Yin <pkufranky@gmail.com>
---
 git-submodule.sh |   13 ++++++++++++-
 1 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 87d84fa..ed6f698 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -361,6 +361,9 @@ cmd_update()
 		-q|--quiet)
 			quiet=1
 			;;
+		-f | --force)
+			force="$1"
+			;;
 		--)
 			shift
 			break
@@ -381,7 +384,8 @@ cmd_update()
 		test -n "$name" || exit
 		if test $sha1 = 0000000000000000000000000000000000000000
 		then
-			say "Not a submodule: $name @ $path"
+			test -z "$force" &&
+			say "Not a submodule: $name @ $path" &&
 			continue
 		elif test -z "$url"
 		then
@@ -395,8 +399,15 @@ cmd_update()
 		if ! test -d "$path"/.git
 		then
 			module_clone "$path" "$url" || exit
+			test "$sha1" = 0000000000000000000000000000000000000000 &&
+			(unset GIT_DIR; cd "$path" && git checkout -q master) &&
+			say "non-submodule cloned and master checked out: $name @ $path" &&
+			continue
 			subsha1=
 		else
+			test "$sha1" = 0000000000000000000000000000000000000000 &&
+			say "non-submodule already cloned: $name @ $path" &&
+			continue
 			subsha1=$(unset GIT_DIR; cd "$path" &&
 				git rev-parse --verify HEAD) ||
 			die "Unable to find current revision in submodule path '$path'"
-- 
1.5.5.70.gd68a

--

From: Ping Yin
Date: Wednesday, April 16, 2008 - 7:19 am

When handling multiple modules, the init/update/status/add subcommands
exit as soon as one submodule fails. This patch makes the subcommand
continue past the failure while still returning the right exit status.
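
The "continue but remember the failure" pattern can be sketched in
isolation as follows. This is a minimal illustration rather than the
code in this patch: handle_module and for_each_module are hypothetical
names, and the per-module action is a dummy that fails for names
starting with "bad".

```shell
#!/bin/sh
# Hypothetical per-module action: fails for names starting with "bad".
handle_module() {
	case "$1" in
	bad*)
		echo "failed: $1" >&2
		return 1
		;;
	*)
		echo "ok: $1"
		;;
	esac
}

# Run the action for every argument. A failure is recorded in
# exit_status, but the loop carries on with the remaining modules.
for_each_module() {
	exit_status=0
	for name in "$@"
	do
		handle_module "$name" || exit_status=1
	done
	return $exit_status
}
```

The sketch uses a for loop on purpose. In most shells a 'while read'
loop fed by a pipeline runs in a subshell, so a variable such as
exit_status set inside it is not visible after the loop; the status then
has to leave the subshell some other way, for example through its exit
code.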

Signed-off-by: Ping Yin <pkufranky@gmail.com>
---
 git-submodule.sh |   87 +++++++++++++++++++++++++++++++++++++++---------------
 1 files changed, 63 insertions(+), 24 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index ed6f698..0ecc4ff 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -193,15 +193,19 @@ module_clone()
 	# succeed but the rmdir will fail. We might want to fix this.
 	if test -d "$path"
 	then
-		rmdir "$path" 2>/dev/null ||
-		die "Directory '$path' exist, but is neither empty nor a git repository"
+		! rmdir "$path" 2>/dev/null &&
+		say "Directory '$path' exist, but is neither empty nor a git repository" &&
+		return 1
 	fi
 
 	test -e "$path" &&
-	die "A file already exist at path '$path'"
+	say "A file already exist at path '$path'" &&
+	return 1
 
-	git-clone -n "$url" "$path" ||
-	die "Clone of '$url' into submodule path '$path' failed"
+	! git-clone -n "$url" "$path" &&
+	say "Clone of '$url' into submodule path '$path' failed" &&
+	return 1
+	:
 }
 
 #
@@ -227,7 +231,8 @@ module_add() {
 	fi
 
 	git ls-files --error-unmatch "$path" > /dev/null 2>&1 &&
-	die "'$path' already exists in the index"
+	say "'$path' already exists in the index" &&
+	return 1
 
 	# perhaps the path exists and is already a git repo, else clone it
 	if test -e "$path"
@@ -237,21 +242,26 @@ module_add() {
 		then
 			echo "Adding existing repo at '$path' to the index"
 		else
-			die "'$path' already exists and is not a valid git repo"
+			say "'$path' already exists and is not a valid git repo"
+			return 1
 		fi
 	else
-		module_clone "$path" "$repo" || exit
-		(unset GIT_DIR; cd "$path" && git checkout -q ${branch:+-b "$branch" "origin/$branch"}) ||
-		die "Unable to checkout submodule '$path'"
+		module_clone "$path" "$repo" || ...
From: Junio C Hamano
Date: Monday, April 21, 2008 - 11:10 pm

Isn't ".subrepo" a relative URL that says "subdirectory of the current
one, whose name is .subrepo", exactly the same way as "./.subrepo" is?
Shouldn't they behave the same?

If the test found they do not behave the same, perhaps the new code is
broken in some way and isn't "fixing" the test simply hiding a bug?


Why does this call-site matter?  The URL is given to "git-clone" which I


Ok.
--

From: Ping Yin
Date: Monday, April 21, 2008 - 11:50 pm

I just want to unify the handling of relative urls.

'git submodule add' treats './foo' and 'foo' as different urls. The
1st one is relative to remote.origin.url, while the 2nd one is
relative to the current directory. I think this kind of behaviour is
better for submodules, so I unify the handling of relative urls this
way.

With this kind of behaviour, I can set 'submodule.foo.url=./foo' in
.gitmodules or $GIT_DIR/config. And when remote.origin.url changes, I
do not have to change submodule.foo.url if the super project and
As said above.



-- 
Ping Yin
--

From: Junio C Hamano
Date: Tuesday, April 22, 2008 - 12:57 am

Please have that kind of justification in the proposed commit log message.
When these changes are made into history, people cannot ask you questions
like I did and expect the history to produce such answer on demand ;-)
--

From: Ping Yin
Date: Tuesday, April 22, 2008 - 2:09 am

OK, i'll resend this patch tonight.

-- 
Ping Yin
--

From: Ping Yin
Date: Tuesday, April 22, 2008 - 7:38 am

See attached patch


-- 
Ping Yin
From: Ping Yin
Date: Tuesday, April 22, 2008 - 7:41 am

Only the commit message changes.


git-submodule: Fix inconsistent handling of relative urls with './' prefix

There is a little inconsistency in the current handling of relative urls
with a "./" prefix:

- "git submodule add ./foo" will clone the submodule with url
  "${remote.origin.url}/foo" and init an entry "submodule.foo.url=./foo"
  in .gitmodules

- "git submodule init" will init an entry in $GIT_DIR/config as
  "submodule.foo.url=${remote.origin.url}/foo"

- However, if there is an entry "submodule.foo.url=./foo" in
  $GIT_DIR/config, "git submodule update" will not expand
  "./foo" with remote.origin.url

This patch unifies the handling of relative urls with a './' prefix.
Now "git submodule init" copies urls from .gitmodules to
$GIT_DIR/config as is, without expanding them. The url expansion happens
only at runtime, i.e. during "git submodule add" or "git submodule
update".

absolute_url is extracted to remove code redundancy and fix the
inconsistency in cmd_init and cmd_add when resolving a relative url/path
to an absolute one.

Also move the absolute url resolution logic from cmd_add to module_clone,
which results in the expected behaviour change: cmd_update will resolve
the url './foo' in $GIT_DIR/config as "${remote.origin.url}/foo" instead
of "$(pwd)/foo".

This behaviour change breaks t7400, which uses the relative url
'./.subrepo'. However, this test was never meant to test relative urls
with a './' prefix, so change the url to '.subrepo'.
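
The expansion rule for './' urls described above can be sketched as
follows. This is only an illustration under stated assumptions:
expand_url is a hypothetical helper, not absolute_url or
resolve_relative_url from git-submodule.sh; the base url is passed in as
an argument instead of being read from remote.origin.url; and the
handling of '../' (dropping one trailing component of the base per
prefix) is an assumption of this sketch, since the message above only
discusses './'.

```shell
#!/bin/sh
# Hypothetical helper: expand url $2 against base url $1.
# Urls without a "./" or "../" prefix are printed unchanged.
expand_url() {
	base=${1%/}
	url=$2
	case "$url" in
	./*|../*)
		;;
	*)
		printf '%s\n' "$url"
		return 0
		;;
	esac
	# Consume leading "./" and "../" components one at a time.
	while :
	do
		case "$url" in
		./*)
			url=${url#./}
			;;
		../*)
			# Assumption: "../" drops the last component of the base.
			url=${url#../}
			base=${base%/*}
			;;
		*)
			break
			;;
		esac
	done
	printf '%s/%s\n' "$base" "$url"
}
```

For example, 'expand_url git://xyzzy/super.git ./foo' prints
git://xyzzy/super.git/foo, while a plain 'foo' is printed unchanged.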



-- 
Ping Yin
--

From: Ping Yin
Date: Tuesday, April 22, 2008 - 12:00 am

The following was my answer from some days ago:

There is a little inconsistency in the current logic:

1. git submodule add ./foo will expand foo with remote.origin.url and
    init an entry in .gitmodules as "submodule.foo.url=$remoteoriginurl/foo"
2. git submodule update will not expand ./foo if there is an entry
    "submodule.foo.url=./foo" in $GIT_DIR/config

I tend to add the url as is on "git submodule add", and then expand
the url when running "git submodule update". As a result, the second
case expands './foo' as "$remoteoriginurl/foo" instead of "foo".

And this is the reason I expand './foo' in module_clone.

-- 
Ping Yin
--
