hoi :)
On Sat, Dec 16, 2006 at 10:32:36AM -0800, Junio C Hamano wrote:
Most of the things you described are already implemented in
http://git.admingilde.org/tali/git.git/module2
If there is interest in it, I can generate some nice patches out of it.
However, given Linus' concerns about scalability, I'm not sure it is
ready yet. But if you prefer patches for discussion I'll send them here.
In contrast to your description, my implementation does not
introduce a new "link" type but instead adds the reference to the
submodule commit directly to the parent tree object and to its
index.
yes, this is essential.
There may be links in this particular instance of the submodule, i.e.
the repository/working directory that is checked out by the
supermodule may be coupled to the supermodule, but it must always
be possible to clone/push/pull the submodule alone.
this is essential, too.
yes.
However, I decided to read in not HEAD but one specific branch.
This may sound arbitrary and I did not really like to make
"master" (the branch I chose) even more special, but you will
understand it when looking at the checkout below.
Hmm, I never cared about cache_tree up to now. I guess I should learn
about it to understand the influence on submodules.
Where do you want to write the link to?
What I do here is update one branch ("master") of the submodule to
the new commit which was stored in the parent index.
If this branch is currently checked out, the working directory will
be updated, too. If there is no working directory for the submodule
yet, it will be created.
Updating one special branch instead of HEAD is because the submodule
commits which are stored in the supermodule really can be considered
as a special branch which happens to not be stored in an ordinary ref.
In order to make it visible to the user the commit is copied to a
normal ref.
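To make the checkout rule above concrete, here is a rough sketch as a
toy Python model. It is not the code from my branch; the Submodule
class, its fields, and checkout_submodule are names made up purely for
illustration, and commits are just strings.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# The one branch that follows the supermodule ("master" in the text).
TRACKED_BRANCH = "master"


@dataclass
class Submodule:
    """Toy stand-in for a submodule repository (hypothetical model)."""
    refs: Dict[str, str] = field(default_factory=dict)  # branch -> commit id
    head: Optional[str] = None      # name of the checked-out branch
    worktree: Optional[str] = None  # commit in the working dir, None if absent


def checkout_submodule(sub: Submodule, commit_from_parent_index: str) -> None:
    """Apply a supermodule checkout to one submodule."""
    # Copy the commit recorded in the parent index to an ordinary ref,
    # so that it becomes visible to the user.
    sub.refs[TRACKED_BRANCH] = commit_from_parent_index
    if sub.worktree is None:
        # No working directory yet: create it on the tracked branch.
        sub.head = TRACKED_BRANCH
        sub.worktree = commit_from_parent_index
    elif sub.head == TRACKED_BRANCH:
        # Tracked branch is checked out: the working directory follows.
        sub.worktree = commit_from_parent_index
    # Otherwise the user is on an independent branch; leave it alone.
```

With this model, a submodule sitting on some other branch keeps its
working directory untouched while "master" still moves to the commit
recorded in the supermodule.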
This approach also integrates better with branches in the submodule.
When you want to start parallel development in a branch you either
want to do this in the complete supermodule scope -- then you have
to branch the supermodule --, or you want to do it independently of the
version stored in the supermodule -- then you don't want a supermodule
checkout to mess with your branch.
So it makes sense to have one branch which is tracked by the parent
and other branches which are independent from the parent.
Where exactly do you see the layering violation?
Well I think it makes sense to use read-tree -m <old> <new> in the
submodule instead of a hard reset, but when the supermodule is checked
out the submodule really should move to its new version.
(At least the branch which is tracked by the parent should do so.)
All the diff stuff is what is still missing in my implementation.
If you ask for a diff in the parent, it will happily diff the
submodules' commit objects ;-)
Well, a simple and dumb version (i.e. my current implementation) can
just do the same for commits as it does for trees: just recursively
descend. Of course this is prohibitive in anything but toy projects.
A better approach is to put all the submodule commits on the pending
list and do the normal ancestry walk for them again. But this would
also need all reachable objects from all modules to be known to one
process.
This could be solved by having one pending list per submodule and
then flush all objects before moving to the next submodule, or
just processing the submodule in a different process.
But when the SEEN information is not shared between submodules then
rev-list could output the same object twice if a blob or tree is
used by several submodules. This may not be a problem if all the
code which processes rev-list output is idempotent, but I haven't
looked into this in detail.
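The duplicate-output issue can be shown with a small Python sketch.
This is only a toy model under my own assumptions: the object graph is
a plain dict, the function names are hypothetical, and the walk is a
simple reachability traversal rather than the real ancestry walk in
rev-list.

```python
from typing import Dict, Iterable, List, Set

Graph = Dict[str, List[str]]  # object id -> ids of objects it references


def walk(graph: Graph, roots: Iterable[str], seen: Set[str]) -> List[str]:
    """List every object reachable from roots that is not yet in seen."""
    out: List[str] = []
    pending = list(roots)  # the "pending list" for this walk
    while pending:
        obj = pending.pop()
        if obj in seen:
            continue
        seen.add(obj)  # mark the object as SEEN
        out.append(obj)
        pending.extend(graph.get(obj, ()))
    return out


def list_shared_seen(graph: Graph, module_roots: List[List[str]]) -> List[str]:
    """One SEEN set for all submodules: each object is listed once."""
    seen: Set[str] = set()
    out: List[str] = []
    for roots in module_roots:
        out += walk(graph, roots, seen)
    return out


def list_separate_seen(graph: Graph, module_roots: List[List[str]]) -> List[str]:
    """A fresh SEEN set per submodule: shared objects may repeat."""
    out: List[str] = []
    for roots in module_roots:
        out += walk(graph, roots, set())
    return out
```

If two submodules both contain the same blob, list_shared_seen reports
it once while list_separate_seen reports it once per submodule. Both
variants list the same set of objects, so the difference only matters
to consumers that are not idempotent.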
Of course, when rev-list for submodules is already split out there
is the valid question if it really makes sense to descend into
submodules when doing rev-list.
Not doing so would naturally decouple sub- from supermodule but then
a lot of operations that depend on rev-list (clone, push, pull)
have to be heavily modified.
Getting this straight in an efficient way is the next challenge.
-- 
Martin Waitz