"Whole repository hierarchy (snvroot) snapshots" are useless without
extra work; Git needs "whole project" snapshots for its commits.
But the whole long description of the "branching" model in Subversion
was meant as an introduction to explaining why there can be mishandled
commits in Subversion, which make it impossible to have a 1-to-1 mapping
from SVN revisions to Git commits.
Actually, as Stephen Bash wrote in his response, creating branches in
Subversion generates 'copy' operations in svndump... we have to filter
out 'copy' operations which do not create new branches, though.
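To make that filtering concrete, here is a minimal Python sketch. It
assumes the dump has already been parsed into (destination, copy source)
path pairs and that the repository follows the conventional
'<project>/trunk' and '<project>/branches/<name>' layout. The function
name and the layout pattern are my assumptions for illustration, not
something svndump provides.

```python
import re

# Assumed layout (hypothetical): <project>/trunk or <project>/branches/<name>.
BRANCH_ROOT = re.compile(r"^(?P<project>[^/]+)/(?:trunk|branches/[^/]+)$")


def creates_branch(dest_path, copyfrom_path):
    """Return True if a 'copy' node looks like a branch creation.

    A copy counts as a branch creation only when both its source and its
    destination are branch roots of the same project. Copies of files or
    subdirectories inside a branch are ordinary content changes.
    """
    src = BRANCH_ROOT.match(copyfrom_path.strip("/"))
    dst = BRANCH_ROOT.match(dest_path.strip("/"))
    return bool(src and dst and src.group("project") == dst.group("project"))
```

A repository with a non-standard layout would need a different pattern,
which is exactly the kind of per-repository knowledge a converter has to
be told about.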
We would have to ensure that the commits on branch 'foo' in Git are the
same as the history of the 'project/branches/foo' subtree of svnroot in
Subversion. Otherwise we would either have different histories in Git
and in Subversion, or we would have a screwed-up DAG of revisions in Git.
I don't think the most common "sane" Subversion merge case would be
difficult to translate into a merge commit in Git: the svn:mergeinfo
property would have common revisions for all affected files/directories.
The problem is that, just as it is possible to mishandle a commit in the
way Stephen Bash described (by creating an all-branches revision), it is
also possible to mishandle a merge in Subversion, creating a revision
where different files are merged from different branches. Such a thing
has no easy translation to Git, which tracks merges at the commit level
rather than at the file level.
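A rough way to tell the two cases apart, sketched in Python. This
assumes we already have, for one SVN revision, the svn:mergeinfo text
that was newly added on each changed path (lines of the form
"/source/path:revisions"), and the same hypothetical
'/<project>/branches/<name>' layout. It only compares source branches
and ignores the revision ranges, so it is a simplification rather than
a full check.

```python
import re

# Assumed layout (hypothetical): /<project>/trunk or /<project>/branches/<name>.
BRANCH_PREFIX = re.compile(r"^(/[^/]+/(?:trunk|branches/[^/]+))(?:/|$)")


def merge_sources(mergeinfo):
    """Return the set of source branches named in one svn:mergeinfo value."""
    sources = set()
    for line in mergeinfo.splitlines():
        if not line.strip():
            continue
        # Each line is "<source path>:<revision list>"; split at the last colon.
        source_path, _, _revisions = line.rpartition(":")
        match = BRANCH_PREFIX.match(source_path)
        sources.add(match.group(1) if match else source_path)
    return sources


def commit_level_source(added_mergeinfo_by_path):
    """Return the single source branch if every path agrees, else None."""
    per_path = [merge_sources(v) for v in added_mergeinfo_by_path.values()]
    if not per_path:
        return None
    common = per_path[0]
    if len(common) == 1 and all(s == common for s in per_path):
        return next(iter(common))
    return None
```

A result of None marks a revision that cannot be expressed as one Git
merge commit with a single extra parent, i.e. the mishandled case.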
[...]
If I remember correctly, some of the discussion was about whether there
can truly be an irrecoverable situation where a single SVN revision
*must* be mapped to more than one Git commit (one-to-many mapping).
Note that possibly changing svn:log, svn:author and svn:date revision
properties are a problem only when there is ongoing interaction
between a Subversion repository (or mirror) and a Git repository (or
mirror). There is no problem with this issue when doing a one-shot
conversion.
The major problem is that svn:log etc. are _unversioned_ properties (see
http://svnbook.red-bean.com/en/1.5/svn.ref.properties.html), so I am not
sure if there is a way for the Subversion server to tell that some
svn:log properties changed. Perhaps there is a log, even if the
properties are unversioned... otherwise we would have to detect somehow
that properties changed.
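One brute-force way of detecting such changes on the Git side, as a
minimal Python sketch: record a fingerprint of the three properties for
every revision at fetch time, and compare on the next fetch. How the
current values are read from the server is left out; the function names
and the dictionary shapes are assumptions made for the example.

```python
import hashlib

REVPROPS = ("svn:log", "svn:author", "svn:date")


def fingerprint(props):
    """Hash the three unversioned properties of one revision."""
    digest = hashlib.sha1()
    for name in REVPROPS:
        value = props.get(name, "")
        # NUL separators keep ("ab", "c") distinct from ("a", "bc").
        digest.update(name.encode("utf-8") + b"\0")
        digest.update(value.encode("utf-8") + b"\0")
    return digest.hexdigest()


def changed_revisions(recorded, current):
    """Return revisions whose properties differ from the recorded state.

    recorded: {revision: fingerprint} saved at the previous fetch.
    current:  {revision: {property name: value}} as read from SVN now.
    Revisions that were not recorded before are new, not changed.
    """
    return sorted(
        rev for rev, props in current.items()
        if rev in recorded and recorded[rev] != fingerprint(props)
    )
```

The obvious cost is that every fetch has to re-read the properties of
all already converted revisions, which is why a server-side log of
property changes would be much nicer.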
But let's assume that we have a way of notifying or noticing that e.g.
svn:log property changed.
Say that the svn:log property for revision 'n' was A at the time Git
fetched from the SVN repository, and SVN revision 'n' is mapped to
commit AA with commit message A.
Later we fetch again from the SVN repository, and besides new revisions
to be converted we notice somehow that the svn:log property for
revision 'n' changed from A to B.
We now create a replacement commit BB in Git, with the same Git parent
as commit AA, and with the commit message changed to B. Then we add
commit BB as a replacement for AA:
$ git replace -f AA BB
(or its low level equivalent, or its batch equivalent when it exists).
This replacement is saved as a ref in the 'refs/replace/*' namespace.
All git commands (except some plumbing perhaps, and unless you pass the
'--no-replace-objects' option to the git wrapper) would then work as if
commit AA was replaced by commit BB; in particular 'git show AA' and
'git log' would show the BB version.
Because replacements are stored as refs in the 'refs/replace/*'
namespace, it is simple to transfer them. Each repository that fetches
those refs (+refs/replace/*:refs/replace/*) would see the replaced
contents. Those that do not fetch them would see the old contents (and
perhaps would have problems, e.g. when interacting with the SVN
repository).
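The whole replacement dance could be driven from a script. Below is a
minimal Python sketch that only builds the command lines (each list can
be handed to subprocess.run). It assumes that AA has exactly one parent
and that the remote is called 'origin'; the helper names are made up.
Note that 'git commit-tree' takes author and committer identity and
dates from the environment (GIT_AUTHOR_NAME, GIT_AUTHOR_DATE, etc.), so
a real converter would have to set those to reproduce the metadata of
AA. Git itself keeps replacement refs under 'refs/replace/'.

```python
def create_replacement_argv(old_commit, new_message):
    """Command that writes a commit like old_commit with a new message.

    It reuses the tree and the first parent of old_commit and prints the
    new commit's object name on stdout. Assumes a single-parent commit.
    """
    return [
        "git", "commit-tree", f"{old_commit}^{{tree}}",
        "-p", f"{old_commit}^",
        "-m", new_message,
    ]


def register_replacement_argv(old_commit, new_commit):
    """Command that records new_commit as the replacement for old_commit."""
    return ["git", "replace", "-f", old_commit, new_commit]


def fetch_replacements_argv(remote="origin"):
    """Command that transfers replacement refs from an assumed remote."""
    return ["git", "fetch", remote, "+refs/replace/*:refs/replace/*"]
```

For the example from this mail one would run the first command for AA
with message B, take the printed object name as BB, and pass both to
the second command.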
An alternate solution, though not as natively nice, would be to have an
empty or placeholder commit message, and store the true commit message
in notes for commit AA, i.e. the message A would be in a git note for
AA. Changing the commit message would mean changing the note: after the
change commit AA would have a commit-message note with contents B.
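In the same spirit, a minimal Python sketch of the notes variant, again
only building command lines for subprocess.run. The notes ref name
'commit-message' is purely hypothetical; any name under 'refs/notes/'
would do.

```python
NOTES_REF = "commit-message"  # hypothetical notes ref, i.e. refs/notes/commit-message


def set_message_note_argv(commit, message, notes_ref=NOTES_REF):
    """Command that stores (or overwrites, because of -f) the message note."""
    return ["git", "notes", f"--ref={notes_ref}", "add", "-f", "-m", message, commit]


def log_with_notes_argv(notes_ref=NOTES_REF):
    """Command that shows history together with the notes from notes_ref."""
    return ["git", "log", f"--notes={notes_ref}"]
```

Changing the message of revision 'n' from A to B is then just a second
call of the first command with the same commit and the new text.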
If changes to unversioned revision properties are rare, then the
replacement technique is much superior to using notes, which generates
an unnatural git repository. When changes to commit messages (svn:log)
and the like are frequent, which would result in a great many
replacements, the notes technique could be better for performance
reasons.
Heh.
Again: svn:log, svn:author and svn:date are Unversioned Properties, but
perhaps the Subversion repository stores a log of changes somewhere
(similarly to git reflog, though hopefully not expired too early).
P.S. The further into this thread, the more I see how utterly wrong
the Subversion model of version control is (branches, tags, merges).
--
Jakub Narebski
Poland