Re: How-to combine several separate git repos?

Previous thread: [PATCH] Restore ls-remote reference pattern matching by Daniel Barkalow on Saturday, December 8, 2007 - 7:35 pm. (9 messages)

Next thread: [PATCH 0/2] [RFT] git-svn: more efficient revision -> commit mapping by Eric Wong on Sunday, December 9, 2007 - 12:27 am. (8 messages)
From: Wink Saville
Date: Saturday, December 8, 2007 - 11:34 pm

I've got several git repositories of different projects and was thinking
I should combine into one repository, but my googling around didn't turn up
any simple way of doing it.

Any advice?

Wink Saville

-

From: Alex Riesen
Date: Sunday, December 9, 2007 - 3:43 am

Should they both be visible in one working tree as directories?
Should these be independent branches?

For instance, you can fetch one into another:

    $ cd project1
    $ git config remote.project2.url /path/to/project2
    $ git config remote.project2.fetch 'refs/heads/*:refs/project2/*'
    $ git fetch project2

That will give you two (or more) branches, containing history of the
project1 and project2. They are still completely independent, just use
the same object store.
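A minimal sketch of the fetch described above, run against two throwaway repositories. The temporary directory, the "demo" identity and the file names are assumptions made only for this illustration.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d) && cd "$work"

# Two unrelated stand-in repositories, one commit each.
for p in project1 project2; do
    git init -q "$p"
    git -C "$p" symbolic-ref HEAD refs/heads/master   # force the branch name "master"
    echo "$p" > "$p/$p.txt"
    git -C "$p" add .
    git -C "$p" commit -q -m "start $p"
done

cd project1
git config remote.project2.url "$work/project2"
git config remote.project2.fetch 'refs/heads/*:refs/project2/*'
git fetch -q project2

# project2's branches now live under refs/project2/ in project1's object store.
git for-each-ref --format='%(refname)' refs/heads refs/project2

# The two histories are independent: merge-base finds no common commit.
git merge-base master project2/master || echo "no common ancestor"
```

Nothing in project1's working tree changes at this point; only the extra ref and its objects are added.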

You can merge them, for example:

    $ cd project1
    $ git merge project2/master

Assuming that there are no filename collisions you'll get a repo with
two merged histories (and two starting points). In case you get
conflicts you can either resolve them by editing or just move the
problematic project into a subdirectory:

    $ git merge -s ours --no-commit project2/master
    Automatic merge went well; stopped before committing as requested

There will be no conflicts. Merge strategy "ours" (-s ours) does not
take anything from the branch to be merged. The coolest strategy ever.
"--no-commit" stops the operation just before committing.

    $ git read-tree --prefix=project2/ project2/master
    $ git checkout-index -a
    $ git commit

That's it. The histories are merged and the files of project2 are
placed in the directory "project2". It is a wee bit harder to browse
the history of the files: you have to give both new and "old" name of
the project2's files, as if you renamed them (that's what read-tree
with --prefix did).
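The subdirectory merge above can be tried end to end with the sketch below. It assumes two throwaway repositories that both contain a file named README (so a plain merge would collide), and it adds --allow-unrelated-histories, which Git 2.9 and later require for merging unrelated histories; that option did not exist when this thread was written.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d) && cd "$work"

# Two stand-in repositories with a colliding file name.
for p in project1 project2; do
    git init -q "$p"
    git -C "$p" symbolic-ref HEAD refs/heads/master
    echo "$p" > "$p/README"
    git -C "$p" add README
    git -C "$p" commit -q -m "start $p"
done

cd project1
git fetch -q ../project2 'refs/heads/*:refs/project2/*'

# Record a merge of both histories but keep only project1's tree for now.
git merge -q -s ours --no-commit --allow-unrelated-histories project2/master
# Stage project2's files under project2/ and write them to the working tree.
git read-tree --prefix=project2/ project2/master
git checkout-index -a
git commit -q -m "Merge project2 into subdirectory project2/"

git ls-files
# Give both the new and the old path to browse project2's file history.
git log --oneline -- project2/README README
```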



-

From: Wink Saville
Date: Sunday, December 9, 2007 - 12:12 pm

Yes, I currently have:

  ~/prgs/android/StateMachine
  ~/prgs/android/test2
  ~/prgs/android/test-annotation

with 3 git repos in StateMachine, test2 & test-annotation.
Ideally I was thinking of using submodules as they ...
So the above isn't what I want, but for grins I tried it, and got to
this point:

wink@ic2d1:$ cd StateMachine
wink@ic2d1:$ cat .git/config
[core]
        repositoryformatversion = 0
        filemode = true
        bare = false
        logallrefupdates = true
wink@ic2d1:$ git config remote.test2.url ../test2
wink@ic2d1:$ git config remote.test2.fetch 'refs/heads/*:refs/test2/*'
wink@ic2d1:$ cat .git/config
[core]
        repositoryformatversion = 0
        filemode = true
        bare = false
        logallrefupdates = true
[remote "test2"]
        url = ../test2
        fetch = refs/heads/*:refs/test2/*
wink@ic2d1:$ git fetch test2
warning: no common commits
remote: Counting objects: 24, done.
remote: Compressing objects: 100% (16/16), done.
remote: Total 24 (delta 3), reused 0 (delta 0)
Unpacking objects: 100% (24/24), done.
From ../test2
 * [new branch]      master -> refs/test2/master
wink@ic2d1:$ git checkout -b test2 test2/master
Branch test2 set up to track remote branch refs/test2/master.
Switched to a new branch "test2"
wink@ic2d1:$ git status
# On branch test2
nothing to commit (working directory clean)
wink@ic2d1:$ git status
# On branch test2
nothing to commit (working directory clean)
wink@ic2d1:$ git branch -a
  master
* test2
wink@ic2d1:$ git checkout master
Switched to branch "master"
wink@ic2d1:$ git checkout test2
Switched to branch "test2"

So that worked, but as I mentioned it's not quite what I want.
Starting over (restoring the original from a tar backup),
this didn't work; I get:

wink@ic2d1:$ cd StateMachine
wink@ic2d1:$ git merge ../test2/master
So the first suggestion works, but I don't want them as
separate branches as I want to work on them simultaneously
and they'll share common ...

-

From: Daniel Barkalow
Date: Sunday, December 9, 2007 - 12:55 pm

I think that submodules do what you want. And they may not be ready for 
neophytes to just use, but they're ready for neophytes to try using and 
tell us where things get confused. That is, I don't think git developers 
know of anything still wrong with submodules, so we need people to try 
using them and say "I can't use submodules now because...". Or, of course, 
the code might be fine now and the discussion just too intimidating.

In any case, if you create a new repository, and fetch stuff into it, and 
don't push stuff back, you can't screw up the original (obviously), so it 
should be safe and possibly informative to play around with submodules.

	-Daniel
*This .sig left intentionally blank*
-

From: Wink Saville
Date: Sunday, December 9, 2007 - 4:44 pm

I'll try submodules, and I'll start by reading the man page.
I got to "add" and did the following:

wink@ic2d1:$ mkdir x
wink@ic2d1:$ cd x
wink@ic2d1:$ git init
Initialized empty Git repository in .git/
wink@ic2d1:$ git-submodule add ../android/StateMachine
remote (origin) does not have a url in .git/config
you must specify a repository to clone.
Clone of '' into submodule path 'StateMachine' failed

The documentation for "submodule add" says:

git-submodule [--quiet] [-b branch] add <repository> [<path>]
Add the given repository as a submodule at the given path to the 
changeset to be committed next.

From the above, <path> is ambiguous to me: is it referring to the
source or the destination? I continue reading and in the options section we
find:

<path>

    Path to submodule(s). When specified this will restrict the command
    to only operate on the submodules found at the specified paths.


So it seems <path> is for the source, but somehow it can specify
multiple paths. This seems to imply that I could add my three
repositories with one command. But I have no idea how and there are no
examples, but I can guess for my three repositories maybe:

wink@ic2d1:$ rm -rf *
wink@ic2d1:$ git init
Initialized empty Git repository in .git/
wink@ic2d1:$ git add submodule ../android StateMachine test2 test-annotation
fatal: '../android' is outside repository

Nope that didn't work, continue reading, ah I probably need to 
"submodule init" first.
It says:

init

    Initialize the submodules, i.e. register in .git/config each
    submodule name and url found in .gitmodules. The key used in
    .git/config is submodule.$name.url. This command does not alter
    existing information in .git/config.

This is garbally-gok to me (remember, neophyte) but it does leave a clue,
apparently I need to create .gitmodules so I do:

wink@ic2d1:$ cat .gitmodules
[submodule "StateMachine"]
        path = ...

-

From: Daniel Barkalow
Date: Sunday, December 9, 2007 - 6:11 pm

It looks like git-submodule does something really wrong for URLs that 
start with ./ or ../. (By "wrong", I mean that it's different from the 
handling of the same URL anywhere else.) I think there's a general problem 
(beyond submodule support) that git doesn't do a good job with 
intermediate repositories: if you clone a repository, and then clone the 
clone, the second clone sees the local-to-the-first aspects and largely 
ignores the actual origin's properties.

On the other hand, you probably don't really want the canonical URL for 
the submodule to be a local relative path. If what you have are local 
clones of public repositories, you want to use the public remote 
repositories' URLs there. If you really want it to be a clone of the local 
repository, you can use an absolute path and weird things won't happen.

I think it should be possible to store the submodule data in the 'x' 
repository as well; have some branches which are the three different 
projects, and then pull them out of '.', but this doesn't seem to be 

This is actually the destination. The projects that are under discussion 
are only submodules in the context of the destination. For "git submodule 
add" it's the path in the superproject where the submodule will appear.
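To illustrate the source/destination distinction, here is a rough sketch using an absolute path for the repository, as suggested above. The destination "libs/StateMachine" and the temporary directories are made up for the example. The protocol.file.allow=always setting is an assumption about the reader's environment: recent Git releases refuse local-path submodule clones by default, and older releases simply ignore the setting.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d) && cd "$work"

# Stand-in for an existing project repository.
git init -q StateMachine
echo "state machine" > StateMachine/README
git -C StateMachine add README
git -C StateMachine commit -q -m "initial commit"

# The superproject.
git init -q x && cd x

# <repository> is the source (an absolute path here);
# <path> is where the submodule appears inside the superproject.
git -c protocol.file.allow=always submodule --quiet add \
    "$work/StateMachine" libs/StateMachine
git commit -q -m "Add StateMachine as a submodule"

cat .gitmodules
```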

	-Daniel
*This .sig left intentionally blank*
-

From: Wink Saville
Date: Sunday, December 9, 2007 - 7:29 pm

I got submodule working by following the tutorial here:
http://git.or.cz/gitwiki/GitSubmoduleTutorial#preview.

As well as looking at:
http://jonathan.tron.name/articles/2007/09/15/git-1-5-3-submodule
http://en.wikibooks.org/wiki/Source_Control_Management_With_Git/Submodules_and_Superpr...
http://kerneltrap.org/mailarchive/git/2007/10/19/348810
http://kerneltrap.org/mailarchive/git/2007/10/19/348829

I'd say the current documentation in git needs to improve at least
for neophytes. The first step would be to include the GitSubmoduleTutorial.
Also, I think the second paragraph of the tutorial is very important and
something like it should probably be the first paragraph of the
git-submodule man page:

"Submodules maintain their own identity; the submodule support just 
stores the submodule repository location and commit ID, so other 
developers who clone the superproject can easily clone all the 
submodules at the same revision."

My interpretation of the paragraph and how submodules might be used
is that the "supermodule" provides tags for a set of repositories.
I see that as important, especially for large projects which could use
multiple repositories for sub-projects and then use a "supermodule"
for test and release management.
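The "location and commit ID" idea can be seen directly in the superproject's tree. The sketch below is only an illustration under assumed names (a throwaway StateMachine repository and a superproject called "super"); protocol.file.allow=always is there because recent Git releases block local-path submodule clones by default.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d) && cd "$work"

git init -q StateMachine
echo "v1" > StateMachine/README
git -C StateMachine add README
git -C StateMachine commit -q -m "v1"
pinned=$(git -C StateMachine rev-parse HEAD)

git init -q super && cd super
git -c protocol.file.allow=always submodule --quiet add \
    "$work/StateMachine" StateMachine
git commit -q -m "Pin StateMachine"

# The project moves on, but the superproject still records the old commit.
echo "v2" > "$work/StateMachine/README"
git -C "$work/StateMachine" commit -q -a -m "v2"

# The entry has mode 160000 and type "commit": a pointer to one commit ID.
git ls-tree HEAD StateMachine
echo "pinned at: $pinned"
```

The superproject only moves to the newer commit when someone updates the submodule checkout and commits the changed entry.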

That isn't what I really wanted to do. As a one man band I was looking
to actually "combine" several repositories into one logical repository
to simplify commits, pushes, pulls to my own backup repositories.
I now see that that wasn't the purpose of submodule.

Anyway, that is the perspective of this neophyte and I learned something
new which is a primary goal of mine.

Finally, I'm back to my original question: how to combine repositories? As
I said in my response to Alex, I got the multiple branches working, but that
isn't what I want.

What I think I now want is to create a new repository which contains my
other repositories as sub-directories (with their histories). After combining
them I would delete the old ...

-

From: Daniel Barkalow
Date: Sunday, December 9, 2007 - 8:01 pm

Ah, okay. I was assuming that you wanted them to maintain their original 
identities (so you'd send stuff off for each of them separately, for 
example).

I think you can do what you want by doing:

# Set up the new line:
$ mkdir x; cd x
$ git init
$ touch README
$ git add README
$ git commit

# Add a project "foo"
$ git fetch ../foo refs/heads/master:refs/heads/foo
$ git merge --no-commit foo
$ git read-tree --reset -u HEAD
$ git read-tree -u --prefix=foo/ foo
$ git commit

And repeat for all of the other projects.

What's going on here is that you're merging in each project, except that 
you're moving all of the files from that project into a subdirectory as 
you pull in the content. The resulting repository has one recent dull 
initial commit, and then merges in each of the other projects with their 
history, with only the slight oddity that they don't go back to the same 
initial commit, and the merge renames all of the project's files.

I think there may be a more obvious way of doing this (it's essentially 
how gitweb and git-gui got into the git history), but I'm not sure what it 
is, if there is one.
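As a sketch, the recipe above can be repeated in a loop for the three repositories named earlier in the thread. The repositories here are throwaway stand-ins, each with a single notes.txt (a made-up file name chosen so nothing collides with README), and --allow-unrelated-histories is added because Git 2.9 and later refuse to merge unrelated histories without it.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d) && cd "$work"

# Stand-ins for the existing projects.
for p in StateMachine test2 test-annotation; do
    git init -q "$p"
    git -C "$p" symbolic-ref HEAD refs/heads/master
    echo "$p" > "$p/notes.txt"
    git -C "$p" add notes.txt
    git -C "$p" commit -q -m "initial commit of $p"
done

# Set up the new line.
git init -q x && cd x
touch README
git add README
git commit -q -m "Start combined repository"

for p in StateMachine test2 test-annotation; do
    git fetch -q "../$p" "refs/heads/master:refs/heads/$p"
    git merge -q --no-commit --allow-unrelated-histories "$p"
    git read-tree --reset -u HEAD          # back to the pre-merge tree
    git read-tree -u --prefix="$p/" "$p"   # project files go under $p/
    git commit -q -m "Merge $p into subdirectory $p/"
done

git ls-files
```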

	-Daniel
*This .sig left intentionally blank*
-

From: Wink Saville
Date: Sunday, December 9, 2007 - 11:36 pm

Daniel,

Worked like a charm, someday maybe I'll understand why it works:)

Wink
-

From: Daniel Barkalow
Date: Sunday, December 9, 2007 - 11:51 pm

Get the history and data, as an extra branch, so we can deal with it 

We want to generate a merge commit, but we want to mess with it, so we use 
merge to set it up, but --no-commit to not actually commit it. (The commit 

Replace the index (--reset), updating the working directory (-u), with 
HEAD (the commit on the local side before the merge we're in the middle 

Read into the index, updating the working directory, "foo" (the branch we 
fetched above), but with the prefix "foo/" so everything is stuffed into a 

And commit the result, which has the tree as constructed above (everything 
from the local parent, and everything from the branch in a subdirectory), 
and has the parents as for the merge we started: first parent is the local 
line without the additional branch, and the second parent is the added 
repository.

And git doesn't care about global structure, so the fact that this commit 
is obviously just what you'd ideally like, while the history as a whole is 
a bit odd (like, it doesn't have a unique start, and development didn't 
start in subdirectories, and...) doesn't matter.
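One way to check the parent structure and the "no unique start" remark is to inspect the merge commit afterwards. This is a sketch on a throwaway "foo" repository; the file name foo.txt is an assumption, and --allow-unrelated-histories is needed on Git 2.9 and later.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d) && cd "$work"

git init -q foo
git -C foo symbolic-ref HEAD refs/heads/master
echo "foo" > foo/foo.txt
git -C foo add foo.txt
git -C foo commit -q -m "initial commit of foo"

git init -q x && cd x
touch README
git add README
git commit -q -m "Start combined repository"
before=$(git rev-parse HEAD)

git fetch -q ../foo refs/heads/master:refs/heads/foo
git merge -q --no-commit --allow-unrelated-histories foo
git read-tree --reset -u HEAD
git read-tree -u --prefix=foo/ foo
git commit -q -m "Merge foo into subdirectory foo/"

# First parent: the local line before the merge.
# Second parent: the tip of the added repository.
git rev-parse HEAD^1 HEAD^2

# One root commit per combined history.
git rev-list --max-parents=0 HEAD
```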

	-Daniel
*This .sig left intentionally blank*
-

From: Wink Saville
Date: Monday, December 10, 2007 - 12:01 am

The history does look odd as I look at it in gitk, but
it is there and I've expanded my knowledge a little.

Thanks again,

Wink

-

From: Alex Riesen
Date: Monday, December 10, 2007 - 12:52 am

This is *not* what I suggested. It should be:

    $ git config ... (as suggested before)
    $ git fetch test2
    $ git merge test2/master

Here test2/master is *NOT* a path. It is the name of the branch
where the local repository stores reference to the commit
corresponding to the master of remote repo (that is: the "master"
branch of "test2", as seen from the repository where you do the


They are ready for some (dunno if they'd like to be called neophytes).
I just don't think you need them (keywords on your explanations being
"share common code", understanding them as "the modules use the common
code simultaneously").
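The difference between the ref name test2/master and the path ../test2/master can be checked with git rev-parse. The sketch below assumes two throwaway repositories laid out side by side, like the ones in the earlier transcript.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d) && cd "$work"

for p in StateMachine test2; do
    git init -q "$p"
    git -C "$p" symbolic-ref HEAD refs/heads/master
    echo "$p" > "$p/$p.txt"
    git -C "$p" add .
    git -C "$p" commit -q -m "start $p"
done

cd StateMachine
git config remote.test2.url ../test2
git config remote.test2.fetch 'refs/heads/*:refs/test2/*'
git fetch -q test2

# test2/master is shorthand for the local ref refs/test2/master:
# both lines print the same commit ID.
git rev-parse --verify test2/master
git rev-parse --verify refs/test2/master

# ../test2/master is a filesystem-style string, not a ref name.
if git rev-parse -q --verify ../test2/master >/dev/null; then
    echo "unexpectedly resolved"
else
    echo "../test2/master does not name a commit"
fi
```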

-

From: Wink Saville
Date: Monday, December 10, 2007 - 10:55 am

Thanks, it was my mistake.

Wink

-
