First, I re-sent the patch series last night; it now uses core.origin to
avoid touching the remotes.* namespace.
The changes *do* fix a nit when on a non-tracking branch. With this,
fetch / merge / pull will now honor that the user said (via git clone -o
frotz) "my upstream is nicknamed frotz" and not try to use origin when
origin was never defined.
So, while fixing this minor aggravation wasn't my motivation, I view
this as a nice side-benefit :^).
The driving issues:
1) I deal with too many servers for "origin" to be a useful nickname,
and we have an agreed set of nickname / server pairings across my project.
2) Therefore, we always do git clone -o frotz frotz.foo.bar/path_to_git.
3) Because of 2, for top-level, "origin" is not defined, tracking
branches set up via git branch --track point to the correct remote, and
we basically understand branch names as <nickname>/branch. In other
words, we *are* aware of what server we are using.
4) git-submodule update breaks the above:
a) it invokes git clone frotz.foo.bar/path_to_git thus defining
"origin" as the nickname for frotz.foo.bar.
b) it invokes bare git-fetch on a detached head, so the upstream *has*
to be origin.
Nope, we did it with git clone -o frotz git://frotz.foo.bar/toplevel.git
We *never* define origin; frotz.foo.bar is *always* frotz.
Good. We are making (some) progress. :^)
Actually,
1) We don't use origin so that we avoid having to wonder "Is
frotz.foo.bar named "origin" or "frotz" on this client, and thus how do
I get data from frotz?"
2) I submitted the change allowing submodules to be recorded into
.gitmodules with a relative url (e.g., ./path_from_parent_to_submodule)
rather than an absolute, so we record the relative path only.
3) Thus, git submodule has set up the submodules to point at the parent
project's default remote. However, in the parent the server is nicknamed
"frotz", but now in the submodule the server is nicknamed "origin". Oops.
With my patches, parent and submodule both refer to frotz.foo.bar as frotz.
Again, the relative-url patch was to address this so that a project that
is mirrored to another server remains valid on the new server without
modifying the .gitmodules in-tree. (Yes, I know you *can* modify
information in a given clone's .git/config, but I'm trying to avoid such
manual per clone/checkout modifications where it can reasonably be done.)
Basically, I think an important (but not complete) test of the design is
that
    git clone -o frotz git://frotz.foo.bar/myproject.git
    cd myproject
    git submodule init
    git submodule update
work, with origin = frotz throughout the submodules, and with the whole
project correctly checked out even if the entire project was rehosted
onto a different server. With relative urls and my latest patch series
last night, this all works, and of course upstream can still be "origin"
if that is what is desired.
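
For anyone who wants to check the top-level half of that test without
network access, here is a minimal sketch. It assumes a throwaway local
bare repository as a stand-in for git://frotz.foo.bar/myproject.git (the
temporary paths are made up for illustration), and it does not exercise
the submodule steps, which depend on the patch series.

```shell
# Stand-in "server": an empty bare repository in a temp directory.
work=$(mktemp -d)
git init -q --bare "$work/server.git"

# Clone with an explicit nickname instead of the default "origin".
git clone -q -o frotz "$work/server.git" "$work/client" 2>/dev/null

# List the remotes defined in the clone; only "frotz" should appear.
remotes=$(git -C "$work/client" remote)
echo "$remotes"    # prints: frotz

rm -rf "$work"
```

The point is simply that after clone -o frotz there is no remote called
"origin" in the top-level repository at all.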
While our overall project exists on many servers, mirroring is an
incorrect term. Rather, only certain branches of various parts exist
everywhere; many other branches are specific to a given server, so we
really name branches using servername/branchname. It is this aspect of
the project that causes us to be aware of the server in use, and thus
makes "origin" unhelpful as a generic upstream.
git-submodule right now supports two different layouts (urls relative to
the parent, and absolute urls such that each sub-module is on an
independent server). The management approaches to these are going to be
different.
I also suspect there are two basic use cases here: accumulation of a
number of independently managed projects vs. splitting a single major
project into a number of smaller pieces to allow some decoupling, but
still managing the set as a composite whole.
There may be some direct correlation of use-case and submodule layout,
don't know. My project uses relative-urls, and I am managing a large
project that has been split into a number of components. So, my
suggestions are focused entirely upon this design and use-case, and I
don't expect I am addressing the others at all. (As usual, this requires
someone who needs the other model(s) to step up and drive).
For *my* uses (relative urls, single logical project):
1) There are times when the parent's branch.<name>.remote should be
flowed down to all subprojects for git submodule update; of course, this
would require that the remote be defined for all.
2) Thus, there needs to be a way to define a new remote globally for the
project, and have it be correctly interpreted by each submodule (e.g., a
repeat of the relative-url dereferencing now done by submodule init, but
applied later to all submodules to define a new remote). Yes, this could
be accomplished by going into each submodule independently and issuing
appropriate commands, but administration would be much easier given a
top-level command that could recurse and "do the right thing" per
sub-project.
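
To make point 2 a bit more concrete, here is a rough sketch of the url
dereferencing such a top-level command would need to repeat for each
submodule. resolve_relative is a hypothetical helper written for this
mail, not the code in git-submodule, and the nickname "nitfol" and the
paths are invented for illustration. It only handles leading ./ and ../
components.

```shell
# Hypothetical helper: resolve a relative submodule url ($2) against the
# url of a remote in the parent project ($1).
resolve_relative() {
    base=${1%/}          # drop one trailing slash, if any
    rel=$2
    while :; do
        case "$rel" in
        ./*)  rel=${rel#./} ;;                      # "./x"  -> "x"
        ../*) rel=${rel#../}; base=${base%/*} ;;    # "../x" -> go up one level
        *)    break ;;
        esac
    done
    echo "$base/$rel"
}

# A new remote "nitfol" for the parent, and a submodule recorded as ./libs/foo.git
resolve_relative git://nitfol.foo.bar/myproject.git ./libs/foo.git
```

A top-level command could then run git remote add with the nickname and
the resolved url inside each submodule, which is the "do the right
thing" per sub-project I have in mind.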
I *suspect* that origin is a much more useful concept for the alternate
construct (absolute urls, loose alliance of separately managed
projects), but as I said that is not my problem so please ask folks who
have that model to define what works for them.
(Actually, I thought I was the one arguing that using origin when it
means different things to different folks is not good. That's the root
of my problems. :^) )
Anyway, I have not found any use of "origin" on my project really
useful. We have to be and *are* aware of the server/branchname in use,
not just the branch. Partly this is because different subgroups have
different natural gathering points (we tend to exchange data via ad hoc
"mob" branches on whatever server is most accessible to the particular
group), and partly because some information simply cannot be allowed on
some servers, but basically the more accessible a server is, the less
information that server can have. I believe "origin" is really useful
only when it has just one meaning, or when all values are effectively
identical (e.g., you have several mirrors for load balancing, etc., but
all are identical modulo mirroring delays).
OTOH, a reasonable change to the semantics of "origin" might be to have:
1) core.origin name the remote that is the "normal" upstream.
2) Reserve and allow use of the name "origin" to mean $core.origin,
e.g., in shell scripts replace all references to remote "origin" with
$(git config core.origin). Of course, if core.origin = origin, then no
user visible change occurs.
In this way, git would not record the same remote's branches in two
ways (as origin/master and as frotz/master), but rather dereference
origin -> frotz and then get frotz/master. Dunno, no matter how you
slice it, having more than one way to refer to the same remote is going
to be confusing, and that's why we don't use origin.
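
As a sketch of what the dereferencing in 2) could look like in a shell
script: resolve_remote below is a hypothetical function, and core.origin
is the key proposed in my series, not something stock git acts on. The
demo repository and the name "nitfol" are made up for illustration.

```shell
# Hypothetical: map the reserved name "origin" to whatever core.origin
# names, and pass every other remote name through unchanged.
resolve_remote() {
    if [ "$1" = "origin" ]; then
        # Fall back to the literal name when core.origin is unset.
        git config --get core.origin || echo origin
    else
        echo "$1"
    fi
}

# Throwaway repository to try it in.
demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"

before=$(resolve_remote origin)    # core.origin not set yet
git config core.origin frotz
after=$(resolve_remote origin)     # now dereferences to frotz
other=$(resolve_remote nitfol)     # other names are untouched
echo "$before $after $other"
```

With core.origin unset or set to "origin", the function returns "origin"
and nothing changes for the user, which is the compatibility property
described above.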
Mark