Re: Comments on "Understanding Version Control" by Eric S. Raymond

From: Jakub Narebski
Date: Thursday, February 5, 2009 - 4:23 am

On Tue, Feb 05, 2009, Theodore Tso wrote:


That evaluation is very important, since Git, Mercurial (hg) and Bazaar
(bzr), along with Subversion (svn), dominate the field of open-source
version control systems, with Darcs and Monotone each having its own niche.


There is progress and evolution...
 * locking -> update-then-commit (or merge-then-commit) ->
   commit-then-merge (with the alternate/additional workflow of rebase,
   a.k.a. commit-merge-recommit-push)
 * local -> client / server -> distributed

Perhaps also
 ? per-file history -> whole tree commits

But there are still controversial issues, such as the one discussed
here: _how_ to deal with renames.


There I think everybody would agree.  Modern VCSes rarely, if ever,
support the locking model.


First, as I have stressed already, the issue of 'container identities'
for dealing with renames is totally ORTHOGONAL to the issue of whether
an SCM is snapshot based or changeset based.  Case in point: Bazaar (bzr).
Bazaar uses file-ids and directory-ids to deal with renames (here it
is a spiritual child of Arch), but the Bazaar wiki (http://bazaar-vcs.org)
mentions in passing that it is _snapshot based_. I think
that it had those file-ids even when it used 'weave' as its repository
format (not deltas / changesets).

Second, as I also wrote already, the article cited as an argument
for changeset based SCM (which you don't have in the above excerpt) is not
to the point, and moreover is totally, utterly _wrong_. The troubles
with merging in CVS and Subversion are not caused by the fact that they
are snapshot based (CVS isn't, by the way), but by the fact that they
don't (or in the case of Subversion didn't) track merges.
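
To make the merge-tracking point concrete, here is a minimal sketch. It
is not the code of any real VCS; the commit names and the helpers
ancestors() and merge_bases() are made up for illustration. It shows
that when a merge is recorded as a two-parent commit, the next merge can
start from the already-merged branch commit, while without that record
the only common ancestor is the original fork point, so changes that
were already merged get merged (and may conflict) again.

```python
def ancestors(parents, commit):
    """Return the commit itself plus everything reachable via parent links."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def merge_bases(parents, a, b):
    """Common ancestors of a and b that are not ancestors of another common one."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {
        c for c in common
        if not any(c != d and c in ancestors(parents, d) for d in common)
    }


# Hypothetical history: trunk is root -> t1 -> m1 -> t2, branch is
# root -> b1 -> b2, and m1 merged b1 into trunk.
tracked = {
    "root": [], "t1": ["root"], "b1": ["root"],
    "m1": ["t1", "b1"],  # merge recorded with both parents
    "t2": ["m1"], "b2": ["b1"],
}
# Same history, but the merge was stored as an ordinary one-parent commit.
untracked = dict(tracked, m1=["t1"])

print(merge_bases(tracked, "t2", "b2"))    # base is the merged branch commit
print(merge_bases(untracked, "t2", "b2"))  # base falls back to the fork point
```

Note that nothing here depends on whether the commits are stored as
snapshots or as changesets; only the parent links matter.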


What I'd like to see in the next version of "Understanding Version
Control Systems" is more concentration on the _issue_ of managing
renames than on a specific solution to this problem.  And I would very
much like to see 'rename detection' mentioned...
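
As a rough illustration of what 'rename detection' means (as opposed to
recording file-ids), here is a small sketch. It is not Git's actual
algorithm: the function name detect_renames(), the 0.5 threshold and the
use of Python's difflib for the similarity score are all assumptions
made for the example. The idea is only that renames are inferred after
the fact, by pairing deleted paths with added paths whose contents are
similar enough.

```python
from difflib import SequenceMatcher


def detect_renames(old_tree, new_tree, threshold=0.5):
    """Pair deleted paths with added paths by content similarity.

    old_tree and new_tree map path -> file content (str).
    Returns a dict old_path -> new_path.
    """
    deleted = {p: c for p, c in old_tree.items() if p not in new_tree}
    added = {p: c for p, c in new_tree.items() if p not in old_tree}

    # Score every deleted/added pair and keep the ones above the threshold.
    candidates = []
    for old_path, old_content in deleted.items():
        for new_path, new_content in added.items():
            score = SequenceMatcher(None, old_content, new_content).ratio()
            if score >= threshold:
                candidates.append((score, old_path, new_path))

    # Greedily accept the best-scoring pairs, using each path at most once.
    renames, used_old, used_new = {}, set(), set()
    for score, old_path, new_path in sorted(candidates, reverse=True):
        if old_path not in used_old and new_path not in used_new:
            renames[old_path] = new_path
            used_old.add(old_path)
            used_new.add(new_path)
    return renames


old = {"a.c": "int main(void) { return 0; }\n", "README": "docs\n"}
new = {"src/main.c": "int main(void) { return 0; }\n/* moved */\n",
       "README": "docs\n"}
print(detect_renames(old, new))
```

Because nothing is stored at commit time, the same comparison can be
rerun later with a different threshold or a better heuristic.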

But I think that the issue of renames is not the main point. The main
point is that in a modern VCS _merging_ has to be easy[1], from which
it naturally follows that a VCS needs an intelligent merge which can
deal well with file renames.  Managing renames is needed for easy
merging; all else is glitter.

Or, from another point of view, the important thing is _branching_:
both creating branches and merging branches (and having a large number
of branches, and being able to delete branches, and having local
(unpublished) and global (published) branches, etc.).


BTW, here is an excerpt from Junio C. Hamano's blog post "FLOSS weekly #19
follow-up (3)" http://gitster.livejournal.com/9970.html

  By the time the basic structure as we currently know has stabilized,
  we had help from literally dozens of contributors to add many things
  on top of the very original version:

  [...]
  * We did not envision that multiple branches in a single repository
    would turn out to be such a useful way to work, and did not have
    support for switching branches.

[...]

I think, and I hope, that Eric will manage to keep proper scientific
decorum[2], balancing or at least mentioning all problems and all
possible solutions, even if he is biased, and even if this bias shows
(hopefully a little).

[2] The thing that distinguishes true science from cargo-cult science
    (pseudo-science), which shows only arguments "for".


I thought more about the fact that having 'use case' examples would
be cleaner. It would also make it possible to test against them...


You can find this example (or at least a similar one) in the 'Tests for
"Understanding Version Control" by Eric S. Raymond' subthread...


Errr... I think that you confused branch 'B' (with innodb-experimental)
with branch 'A' (with innodb only) here.


I think that they should change their filesystem hierarchy naming
conventions and/or use branches more. But that is not terribly 
relevant...


Well, I think it would be a bit simpler: for each _new_ file in a merge
you have to see where the other files from the directory it was created
in are now.  But I agree that it would be costly; perhaps it should be
triggered by a separate option / config, like diff.renames = copies?
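
A rough sketch of that idea, under stated assumptions: the per-file
renames on the other side of the merge are already known (detected or
recorded), a directory counts as moved to wherever most of its renamed
files went, and the function names and paths are invented for the
example. This is not how any existing merge driver is implemented.

```python
from collections import Counter
import posixpath


def infer_directory_renames(file_renames):
    """Map each old directory to the directory most of its files moved to."""
    votes = {}
    for old_path, new_path in file_renames.items():
        old_dir = posixpath.dirname(old_path)
        new_dir = posixpath.dirname(new_path)
        if old_dir != new_dir:
            votes.setdefault(old_dir, Counter())[new_dir] += 1
    return {old_dir: counts.most_common(1)[0][0]
            for old_dir, counts in votes.items()}


def place_new_files(new_files, file_renames):
    """For each newly added path, follow its directory if that directory moved."""
    dir_renames = infer_directory_renames(file_renames)
    placed = {}
    for path in new_files:
        directory, name = posixpath.split(path)
        target = dir_renames.get(directory, directory)
        placed[path] = posixpath.join(target, name)
    return placed


# Hypothetical merge: one side moved lib/ to core/, the other side added
# lib/c.c and doc/x.txt.
other_side_renames = {"lib/a.c": "core/a.c", "lib/b.c": "core/b.c"}
print(place_new_files(["lib/c.c", "doc/x.txt"], other_side_renames))
```

The cost comes from having to know where every sibling file went before
a single new file can be placed, which is why an opt-in switch seems
reasonable.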


On the other hand I think that 'container identity' solutions
fundamentally cannot deal with the case of splitting contents (or the
reverse, joining contents), e.g. splitting a file into smaller files, or
splitting a directory into a few directories (grouping files).  And from
what I understand, at least the current implementations of the 'file-id'
solution have problems with repeated merging in the case of an
independently added file.


Final words: there is no race. We aren't here to achieve world
domination. Sometimes one SCM, with its different choices, might
be a better solution than another.  For example if you have large
media files then a centralized SCM with partial checkout support might
be the best choice.  Another example is how we on #git (sic!) pointed
people from IPsec, who wanted each commit to be signed or equivalent,
towards Monotone.

-- 
Jakub Narebski
Poland
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html