It is quite obvious that a comparison of programs of a given type (SCM)
on the site of one such program (Bazaar-NG) is usually biased towards that
program, perhaps unconsciously: by emphasizing the features which were important

For example, simple namespace for git: you can use a shortened sha1
(even down to only 6 characters, although usually 8 are used), you can
use tags, you can use the ref^m~n syntax.

I'm not sure about "No" in "Supports Repository". Git supports multiple
branches in one repository, and what's better, supports development using
multiple branches, but cannot for example do a diff or a cherry-pick
between repositories (well, you can use git-format-patch/git-am to
cherry-pick changes between repositories...).

About "checkouts", i.e. working directories with the repository elsewhere:
you can use the GIT_DIR environment variable or the "git --git-dir" option,
or symlinks, and if the Nguyen Thai Ngoc D proposal to have a .gitdir/.git
"symref"-like file pointing to the repository passes, we can use that.

Partial checkouts are only partially supported as of now; it means
you have to do some low-level stuff to do a partial checkout, and be
careful when committing. BTW it depends on what you mean by partial
checkout, but they are somewhat incompatible with atomic commits
to a snapshot-based repository.

Git supports renames in its own way; it doesn't use file ids, nor does it
remember renames (the new "note" header for use e.g. by porcelains
didn't pass if I remember correctly). But it does *detect* moving
_contents_, and even *copying* _contents_ when requested. And of
course it detects renames in merges.

Git doesn't have a "plugin framework", but because it has many
"plumbing" commands, it is easy to add new commands, and also new
merge strategies, using shell scripts, Perl, Python and of course C.
So the answer would be "Somewhat", as git has pluggable merge strategies,

Gaah, subscribe-to-post mailing list!
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
-
Bazaar's namespace is "simple" because all branches can be named by a
URL, and all revisions can be named by a URL + a number.

If that's true of Git, then it certainly has a simple namespace. Using

That sounds right. So those branches are persistent, and can be worked

It sounds like the .gitdir/.git proposal would give Git "checkouts", by

Yes, I'm very much aware of that tension. It will be fun when Bazaar

You'll note we referred to that behavior on the page. We don't think
what Git does is the same as supporting renames. AIUI, some Git users

It sounds like you're saying it's extensible, not that it supports
plugins. Plugins have very simple installation requirements. They can
provide merge strategies, repository types, internet protocols, new
commands, etc., all seamlessly integrated.

What you're describing actually sounds like the Arch approach to
extensibility: provide a whole bunch of basic commands and let users
build an RCS on top of that.

As the author of two different Arch front-ends, I can say I haven't
found that approach satisfactory. Invoking multiple commands tends to
re-invoke the same validation routines over and over, killing
efficiency, and diagnostics tend to be pretty poorly integrated.

Aaron
-
Hi Aaron,

How should this cope with a distributed project? IOW how does it deal with
"this revision and that revision are exactly the same"?

If I understand you correctly, you are claiming that you are not really
identifying a revision, but a revision _at a certain place with a
place-dependent number_. This conflicts with my understanding of a

It depends on your usage. If you want to do anything interesting, like
assure that you have the correct version, or assure that two different
persons' tags actually tag the same revision, there is no simpler

Of course! Persistence (and reliability) are the number one goal of git.
Performance is the next one.

As an example of completely independent branches, look at the "next" and
the "todo" branch of git. They are _completely_ independent, i.e. not even

Oh, we start another flamewar again?

Honestly, if you want to record renames, why don't you also support (with
a command for each of those purposes) code copying? And refactoring? And
copyright year bumps? _put your favourite here_

If you really, really think about it: it makes much more sense to record
your intention in the commit message. So, instead of recording for _every_
_single_ file in folder1/ that it was moved to folder2/, it is better to
say that you moved folder1/ to folder2/ _because of some special reason_!

Same goes for all other thinkable examples.

If you want to track code, then let the tracker do its work, i.e. let
git-pickaxe figure out where your code came from. It is likely being more

It is more like the Unix way. Let each command do _one_ thing, but let it

Welcome to git! Git's commands are very efficient, and you can even pipe
them efficiently! And now that we have GIT_TRACE, diagnostics are no
concern.

Ciao,
Dscho
-
Just a small nit here: bzr does /not/ record the move of every file: it
records the rename of folder1 to folder2. One piece of data is all that's
recorded - no new manifest for the subdirectory is needed.

Of course, a user can choose to move all the contents of a folder and
not the folder itself - it's up to the user.

By recording the folder rename rather than the contents rename, we get
merges of new files added to folder1 in other branches coming into folder2
automatically, without needing to do arbitrarily deep history processing
to determine that.

This also does not prevent us from doing history analysis as well, to
determine other interesting things - such as cross-file 'blame', as has
been mentioned in this thread.

-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.
There are two answers here. One is that the URL + number is UI, not
internals. A unique ID is used internally, so that can be compared.

But to fully ensure that there are no differences, i.e. that no one has

No, I am claiming that a revision at a certain place with a
place-dependent number is one name for a revision, but it may have other

I can use the 'bzr missing' command to check whether my branch is in
sync with a remote branch. Or I can use the 'pull' command to update my

You'd be surprised. When we last spoke to the Mercurial team, Mercurial
didn't support multiple persistent branches in one repository. Pulling
from a remote repository could join two branches into one. I'm told

I'd hope not. It sounds as though you feel that supporting renames in
the data representation is *wrong*, and therefore it should be an insult
to you if we said that Git fully supported renames.

Aaron
-
On Tue, 17 Oct 2006 01:08:59 -0400
The "bzr missing" command sounds like a handy one.
Someone on the xorg mailing list was recently lamenting that git does not
have an easy way to compare a local branch to a remote one. While this
turns out to not be a big problem in git, it might be nice to have such
a command.

Sean
-
Not recording and not supporting are quite different things.
What we don't do is to _record_ renames in the data structure.
I personally would not use a word as strong as _wrong_ (and
Linus may disagree), but (1) we can support renames without
recording them just fine, (2) recording renames would not help
to tell users about line movements across files which we would
want to do, and (3) we are getting closer to coming up with a way
to even do (2) without recording renames. Given these, perhaps
I might say recording renames is _pointless_ when I am in a good
mood.
-
Yes. There's a risk of confusing a feature with an implementation
detail. From http://bazaar-vcs.org/RcsComparisons:

"If a user can rename a file in the RCS without loosing the RCS
history for a file, then renames are considered supported. If
the operation resultes in a delete/add (aka "DA pair"), then
renames are not considered supported. If the operation results
in a copy/delete pair, renames are considered "somewhat"
supported. The problem with copy support is that it is hard to
define sane merge semantics for copies."

The first sentence sounds like a description of a user-visible feature.
The rest of it sounds like implementation.

And git probably has some deficiencies here, but it'd be more useful to
identify them in terms of things a user can't do.

--b.
-
It would seem that the majority of folks on the Git list feel that
way, myself among them. I don't know that we'd find it an insult
to say Git fully supports renames but I do think we have had better
results from *not* recording them and looking for them after the
fact with smart tools.

Junio's recent work with git-pickaxe (or whatever its name finally
settles out to be) is a perfect example of this. Despite not having
"recorded renames" git-pickaxe is able to fairly accurately detect
blocks of code moving between files, of which renaming files is just
a special case. This provides some fairly accurate blame reporting
pointing to exactly which commit/author/datetime put a given line
of code into the project.

No additional metadata required. All existing repositories can
immediately benefit from the new tool. Rather slick if you ask me.

--
Shawn.
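
To make the "no recorded renames" idea concrete, here is a minimal sketch of finding renames after the fact by comparing two snapshots. It only handles the exact-content case and is not how git-pickaxe works; the function names and the two example trees are made up for illustration.

```python
import hashlib

def content_id(data: bytes) -> str:
    """Identify file content by a hash of its bytes; the path plays no role."""
    return hashlib.sha1(data).hexdigest()

def detect_exact_renames(old: dict, new: dict) -> dict:
    """Return {old_path: new_path} for content that left one path and
    appeared unchanged at another path between two snapshots."""
    removed = {p: content_id(d) for p, d in old.items() if p not in new}
    added = {}
    for path, data in new.items():
        if path not in old:
            added.setdefault(content_id(data), path)  # first new path wins
    return {p: added[h] for p, h in removed.items() if h in added}

# Hypothetical snapshots: only path -> content, no rename metadata at all.
old_tree = {"folder1/a.c": b"int main(void) { return 0; }\n", "README": b"hello\n"}
new_tree = {"folder2/a.c": b"int main(void) { return 0; }\n", "README": b"hello\n"}

renames = detect_exact_renames(old_tree, new_tree)
print(renames)
```

Because the comparison needs nothing beyond the two snapshots, it applies equally to history that was committed before such a tool existed. A real detector would also have to score partially modified content, which this sketch deliberately leaves out.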
-
I think you missed the simplicity of the git naming here. With git, I
can receive a bug report that specifies a bug that appears in a
revision such as:

    71037f3612da9d11431567c05c17807499ab1746

And since I have a commit object in my repository with that same name
I have a strong assurance that I am testing the identical software as
the bug reporter without me ever needing any access to pull from the
reporter's repository.

And this works in an entirely distributed fashion. Any two users can
be certain they are working with identical software on both ends by
exchanging and comparing a few bytes, (in email, irc, bugzilla, what
have you), without any need to refer to a common repository which both
users have access to.

-Carl
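
The reason two users can compare names without sharing a repository is that the name is computed from the content itself. As a small sketch, the following reproduces how git names a blob: SHA-1 over a short header plus the bytes. It uses a blob for brevity; a commit is named the same way, over text that refers to its tree and parents. The helper name `git_object_id` is chosen here for illustration.

```python
import hashlib

def git_object_id(data: bytes, obj_type: str = "blob") -> str:
    """Name an object the way git does: sha1("<type> <size>\\0" + content)."""
    header = f"{obj_type} {len(data)}".encode() + b"\0"
    return hashlib.sha1(header + data).hexdigest()

# Two people hashing the same bytes on different machines get the same name.
alice = git_object_id(b"hello\n")
bob = git_object_id(b"hello\n")
print(alice == bob)

# Any difference in content yields a different name.
print(git_object_id(b"hello\n") != git_object_id(b"hello!\n"))
```

Exchanging the hex string by email or IRC is therefore enough to confirm that both sides hold identical content.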
Hi!
Dear diary, on Tue, Oct 17, 2006 at 01:45:34AM CEST, I got a letter
I think Aaron rather meant that in case of an error, the error messages
may seem incoherent from the perspective of a porcelain user if it's
been generated by the plumbing. And I had that problem in Cogito as well
a few times in the past, but I think most of those are reasonable now (I
can't think of a counter-example off the top of my head).

Calling multiple git commands _is_ a problem, especially in a loop, but
I think it's more the inherent fork()+execve() overhead than whatever
happens over and over when main() takes over. Many git commands got
adjusted so that you can call them just once and then feed from/to them
over a longer time period.

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
-
Hey, "simple" is in the eye of the beholder. You can always just define
Bazaar's naming convention to be simple.

I pretty much _guarantee_ that a "number" is not a valid way to uniquely
name a revision in a distributed environment, though. I bet the "number"
really only names a revision in one _single_ repository, right?

Which means that it's actually not a "name" of the revision at all. It's
just a local shorthand that has no meaning, and the exact same revision
will be called something different when in somebody else's repository.

I wouldn't call that "simple". I'd call it "insane".

In contrast, in git, a revision is a revision is a revision. If you give
the SHA1 name, it's well-defined even between different repositories, and
you can tell somebody that "revision XYZ is when the problem started", and
they'll know _exactly_ which revision it is, even if they don't have your
particular repository.

Now _that_ is true simplicity. It does automatically mean that the names
are a bit longer, but in this case, "longer" really _does_ mean "simpler".

If you want a short, human-readable name, you _tag_ it. It takes all of a

Well, in the git world, it's really just one shared repository that has
separate branch-namespaces, and separate working trees (aka "checkouts").
So yes, it probably matches what bazaar would call a checkout.

Almost nobody seems to actually use it that way in git - it's mostly more
efficient to just have five different branches in the same working tree,
and switch between them. When you switch between branches in git, git only
rewrites the part of your working tree that actually changed, so switching
is extremely efficient even with a large repo.

So there is seldom any real need or reason to actually have multiple

The fact is, git supports renames better than just about anybody else. It
just does them technically differently. The fact that it happens to be the
_right_ way, and everybody else is incompetent, is not my fault ;)
...
Right. That's why I said all revisions can be named by a URL + a
number, because it's the combination of the URL + a number that is

I agree that a revision is a revision, but I don't think that's a

When two people have copies of the same revision, it's usually because
they are each pulling from a common branch, and so the revision in that
branch can be named. Bazaar does use unique ids internally, but it's

But tags have local meaning only, unless someone has access to your

The key thing about a checkout is that it's stored in a different
location from its repository. This provides a few benefits:

- you can publish a repository without publishing its working tree,
  possibly using standard mirroring tools like rsync.

- you can have working trees on local systems while having the
  repository on a remote system. This makes it easy to work on one
  logical branch from multiple locations, without getting out of sync.

- you can use a checkout to maintain a local mirror of a read-only

You can operate that way in bzr too, but I find it nicer to have one
checkout for each active branch, plus a checkout of bzr.dev. Our switch
command also rewrites only the changed part of the working tree.

Aaron
-
Ehh. Exactly like the bzr numbers? You have to have access to the original
repo to name it.

So your point is?

If you do

    git log v2.6.17

in a kernel repository, you'll see exactly what I see - because you'll
have gotten the tags, aka the "easy revision names".

Now, I'm obviously biased, but the thing is, git really does do this
right. No meaningless numbers. You give _meaningful_ revision names, and
they can be extremely powerful.

And no, it's not just tags or the raw SHA1 numbers. You can do
relationships like

    git log HEAD~5..

which means "show the log for everything since five parents ago" (which is
_not_ the same as "show the last five revisions", because one of them may
have been a merge, and brought in a lot more new commits).

Or, you can say

    git diff mybranch@{2.days.ago}..nextbranch

which says exactly what you'd read it as: show the diff between what
"mybranch" looked like 2 days ago and what "nextbranch" looks like right
now.

Or, since the namespace is the same for commit history _and_ for actual
file contents, and since some commands don't need commits, you can decide
to name not a revision, but a specific file or subdirectory in a revision,
and do things like

    git -p grep -1 request_irq v2.6.17~2:drivers/char

where the "revision" is not a commit revision at all, it's a _tree_
revision, because we've looked up the revision for "v2.6.17~2" (which
means "the grandparent of the tag 2.6.17"), and then within that commit we
looked up the tree "drivers/char", and then we grepped (recursively) for
the string "request_irq" within that subtree (with one line of context),
and then we paginated the output through "less" (or whatever your pager is
set to).

In other words, yes, the above does _exactly_ what you'd expect it to do.
The fact is, nobody ever uses the SHA1 names directly in their normal
work. You'd use the branch names, tag-names, or some relationship operator
like "this long ago" or "the parent of" or simil...
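
The remark that "HEAD~5.." is not the same as "the last five revisions" can be illustrated with a toy commit graph. This is only a sketch of the idea, not git's implementation: the graph, the commit names and the helper functions are invented for the example, and it uses two steps instead of five to stay small.

```python
def nth_first_parent(parents, commit, n):
    """Follow the first parent n times, which is what the ~n suffix means."""
    for _ in range(n):
        commit = parents[commit][0]
    return commit

def reachable(parents, start):
    """All commits reachable from start by following every parent link."""
    seen, stack = set(), [start]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen

def commits_since(parents, head, n):
    """Commits reachable from head but not from head~n (the 'head~n..' range)."""
    base = nth_first_parent(parents, head, n)
    return reachable(parents, head) - reachable(parents, base)

# Hypothetical history: mainline A-B-C, side branch A-X-Y, merge M, then D.
parents = {
    "A": [], "B": ["A"], "C": ["B"],
    "X": ["A"], "Y": ["X"],
    "M": ["C", "Y"],   # merge commit: first parent is the mainline
    "D": ["M"],
}

print(nth_first_parent(parents, "D", 2))       # two first-parent steps back
print(sorted(commits_since(parents, "D", 2)))  # more than two commits
```

Here the range starting two first-parent steps back contains four commits, because the merge brought in the whole side branch.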
On Tue, 17 Oct 2006 00:24:15 -0400
Yeah, even in git you typically don't publish your working tree when
making it available for cloning. In fact the native git network

That is a very nice feature. Git would be improved if it could

I'm not sure what you mean here. A bzr checkout doesn't have any history
does it? So it's not a mirror of a branch, but just a checkout of the
branch head?

If so, Git can export a tarball of a branch (actually a snapshot as at
any given commit) which can be mirrored out.

Sean
-
In bzr there are two different kinds of checkouts. One is called a
lightweight checkout and that's really a "normal" checkout in the way
svn for example does it. In this mode, you have the branch remotely
and only the working tree locally. So it's just a checkout of the
branch head (or any other revision if using -r when doing the
checkout).

Then there are non-lightweight checkouts, i.e. heavyweight checkouts.
These are the default type. A heavyweight checkout is in fact a full
branch locally, but it is "bound" to the remote branch. What this
means is that all commands such as diff/status/log/etc can be done
locally. So it's really quick.

It acts the same as a lightweight checkout in most regards, so when I
run "bzr update" it actually pulls from the remote branch, and when I
run "bzr commit" it commits the same revision in both the remote
branch and the local branch. It does this in one transaction so one
can't work and the other fail (they would both fail in that case).

What this also gives you is that when you want to clone the branch,
you don't need to go to the remote branch to get the revisions and
also, when being offline, you can commit locally.

Committing locally is a very cool feature in my mind. If you work in
a centralized manner with checkouts, you normally commit directly to
the central branch, but when you are offline, that will fail (of
course :) ). So what you can do then is to run "bzr commit --local"
to commit only to your local checkout branch, then when you get online
again you can run "bzr update". In this case the update will take any
new commits that have been done while you were away, pull them into
your local branch, and make your local commits into something that has
been merged into the "checkout".

I find this REALLY useful.

Don't know if that made sense, here it is in commands.

$ bzr checkout t p
$ cd p
$ echo hej >> hosts
$ bzr commit --local -m 'offline'
$ echo hej >> hosts
$ bzr commit --local -m 'offline 2'

Now...
There are two forms of checkout: a normal checkout which contains the
complete history of the branch, and a lightweight checkout, which just
has a pointer back to the original location of the history.

In both cases, a "bzr commit" invocation will commit changes to the
remote location. In general, you only want to use a lightweight
checkout when there is a fast, reliable connection to the branch (e.g.
if it is on the local file system, or local network).

Aaron would be talking about a normal (heavyweight) checkout here.
With a heavyweight checkout, you can do pretty much anything without
access to the branch. In contrast, almost all operations on a
lightweight checkout need access to the branch.

James.
-
So the "lightweight checkout" is the equivalent of the "lazy clone" we have
discussed at length on the git mailing list (without any resulting code,
unfortunately). The sticking point was how to do this fast, without
needing a fast, reliable connection to the repository it was cloned from.
For example, whether to leave fetched objects in some kind of cache, or even
in the "lightweight checkout"/"lazy clone" repository database.

If the repository we do the "lightweight checkout"/"lazy clone" from is on
a local file system (perhaps a network file system), then we can use the
alternates mechanism (git clone -l -s). That's why "lazy clone" was

We have a terminology conflict here. Bazaar-NG "pull" and "merge" vs.
GIT "fetch", "pull" and "merge"; Bazaar-NG "checkout" vs. GIT "clone"
and "checkout".

In GIT "clone" is what is used to copy a whole repository, "checkout"
is what is used to extract a given/current branch to a [given] working area.
--
Jakub Narebski
Poland
-
Sure, and so can bzr. But using a checkout of the branch head means:
- No one has to do anything special to provide a working tree of a given
  revision
- I can still run any readonly operations I desire
- I can update to the latest version of bzr.dev with one command.

Aaron
-
If I can add some clarification: There is a lightweight checkout and
a heavyweight checkout. The former contains no history and does everything
(except status and I am not sure about diff) by accessing the remote
data. The latter contains a mirror of the history data and does
write-through on commit (and otherwise behaves like a normal branch with
a repository).

What would be really useful would be a checkout, or even a branch (i.e.
with the ability to commit locally), that would only contain history data
since some point. This would allow downloading very little data when
branching, but then working locally as with a normal repository clone.

In bzr this was already discussed and the storage supports so-called
"ghost" revisions, whose existence is known, but not their data. There
are even repositories around that contain them (created by converting
data from arch), but to the best of my knowledge there is no user interface
to create branches or checkouts with partial data.

--------------------------------------------------------------------------------
 - Jan Hudec `Bulb' <bulb@ucw.cz>
-
On Sat, 21 Oct 2006 20:58:25 +0200
In Git the same functionality can be achieved with so-called shallow
clones. Unfortunately, they've only been discussed and not yet
implemented.

Sean
-
Hi,
It would also make things slow as hell. How do you deal with something
like annotate in such a setup?

Ciao,
Dscho
-
For the particular case of annotate, bzr is designed to store
annotations at commit time. So annotate should require remote access to
a small amount of data from two files -- not a great cost.

But our default form of checkout contains a local copy of all history
data, so that readonly operations happen at local speed.

Aaron
-
Hi,
You'd probably have to do all processing server-side (git log, blame,
merges... like in subversion, where you can merge and rename/move files
remotely, IIRC). Of course, all the things which make git really useful
for me (gitk, git log with all its arguments etc.) would not be
available. Cheap checkouts would be made possible easily that way at the
cost of higher server load and an abstraction layer over network for
object access.

I don't know if that sounds reasonable at all.

Matthias
-
On Tue, 17 Oct 2006 12:30:27 +0200 (CEST)
Some commands like annotate might not make any sense in such a set up.
But one way to get the same (perhaps even better) feature into git
would be to support shallow clones, in which case even annotate would
continue to work even if somewhat crippled by the lack of a complete
history.

Sean
-
Tags are propagated during clone, and during fetch/pull (getting changes
from repository). So in that sense they are global.

If you don't publish your repository, then neither tags, nor <URL>+<rev no>

In git we usually use "git clone --local" (with repository database
hardlinked) or "git clone --shared"/"git clone --reference <repository>"
(which automatically sets alternates, i.e. file pointing to alternate
repository database) for that. This way one gets his/her own refs
namespace, so two people can work on different branches simultaneously.

An alternate solution would be to symlink .git, or .git/objects (i.e.

In git you can access contents _without_ checkout/working area.
For example gitweb (one of git's web interfaces) uses only repository

Luben (IIRC) works this way.
--
Jakub Narebski
Poland
-
Bazaar can do this too. For example,
"bzr cat http://something -r some-revision" gets the content of a file
at a given revision. But that's not what Aaron was referring to.

In Bazaar, checkouts can be two things:

1) a working tree without any history information, pointing to some
   other location for the history itself (a la svn/CVS/...).
   (this is "light checkout")

2) a bound branch. It's not _very_ different from a normal branch, but
   mostly "commit" behaves differently:
   - it commits both on the local and the remote branch (equivalent to
     "commit" + "push", but in a transactional way).
   - it refuses to commit if you're out of date with the branch you're
     bound to.
   (this is "heavy checkout")

In both cases, this has the side effect that you can't commit if the
"upstream" branch is read-only. That's not fundamental, but handy.

I use it for example to have several "checkouts" of the same branch on
different machines. When I commit, bzr tells me "hey, boss, you're out
of date, why don't you update first" if I'm out of date. And if commit
succeeds, I'm sure it is already committed to the main branch. I'm sure
I won't pollute my history with merges which would only be the result
of forgetting to update.

Once more, that's not fundamental, but handy.
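
The two commit rules of a bound branch described above (commit lands in both places, and is refused when out of date) can be modelled in a few lines. This is a toy model under simplifying assumptions, not Bazaar's code: history is a plain list, "transactional" is reduced to checking before writing, and the class names are hypothetical.

```python
class OutOfDate(Exception):
    """Raised when a bound branch has not seen the master's latest revisions."""

class Branch:
    def __init__(self, revisions=None):
        self.revisions = list(revisions or [])

class BoundBranch(Branch):
    """A local branch whose commits are written through to a master branch."""

    def __init__(self, master):
        super().__init__(master.revisions)
        self.master = master

    def update(self):
        # Bring the local history up to date with the master.
        self.revisions = list(self.master.revisions)

    def commit(self, rev):
        # Refuse when out of date, so no accidental merge is ever needed.
        if self.revisions != self.master.revisions:
            raise OutOfDate("update first")
        # Check first, then write both sides from the same new history.
        new_history = self.revisions + [rev]
        self.master.revisions = list(new_history)
        self.revisions = new_history

master = Branch(["r1"])
laptop = BoundBranch(master)
desktop = BoundBranch(master)

laptop.commit("r2")            # lands in laptop and master together
try:
    desktop.commit("r3")       # desktop has not seen r2 yet
    refused = False
except OutOfDate:
    refused = True
desktop.update()
desktop.commit("r3")
print(refused, master.revisions)
```

The second machine is forced to update before committing, which is exactly the linear-history behaviour being described, and also the update-before-commit constraint that is optional in this workflow.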
The more fundamental thing I suppose is that it allows people to work
in a centralized way (checkout/commit/update/...), and Bazaar was
designed to allow several different workflows, including the
centralized one.

--
Matthieu
-
Dear diary, on Tue, Oct 17, 2006 at 01:19:08PM CEST, I got a letter
It isn't very nice because it enforces the update-before-commit
workflow, which was a complaint of many CVS users and I can remember it
being one of the selling points of the distributed VCSes in 2001 or so,
although it is not so emphasized lately. (I understand that this is
something optional in Bazaar.)

BTW, merge commits aren't bad. They reflect what really happened,
explicitly record the merge resolution taken, if there was any, and
protect you from accidentally losing or damaging [any portion of] your
changes. And they aren't cluttery either since we hide them from
non-graphical history listings by default.

Still, I can recognize that in some scenarios, people might find it
useful, and I can remember some people asking for it in the past. So I
couldn't resist and implemented it in Cogito as cg-commit --push. Pushed
out now. Took me about 5 minutes implementing it and 10 minutes documenting
it. ;-)

P.S.: A general note for bleeding-edge Cogito users, I've rewritten the
local changes handling so that we always do three-way merge now instead
of that braindead patches diffing/applying, but it's not completely
stable yet, some testcases still fail. So be a bit careful when
updating/uncommitting/switching/... with uncommitted changes in the
working tree.

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
-
You're not telling us bzr still follows the utterly stupid
update-before-commit model, right? Right?

OG.
-
One last time:
bzr _CAN_ follow the utterly stupid update-before-commit model.
It doesn't force you to do so, obviously.
--
Matthieu
-
What about
3) getting the repo with all the history while still not having to be
online to actually commit to *your* copy of the repo. When you later get
online, you can send all your changes in a big hunk, or let bazaar email

It appears we have different ideas of what's handy. Perhaps it's just a
difference in workflow, or lack of "email-commits-as-patches" tools in
bazaar, but the ability to commit to whatever branch I like in my local
repo and then just send the diffs by email or please-pull requests to
upstream authors is what makes git work so well for me. I can of course
also pull the changes to another branch, or cherrypick them one by one,
or...

OTOH, if by "commit" you mean "send your changes back to central
server", and bazaar'ish for "register my current set of changes in the
local clone of the repo" is called something else, it sounds very

Centralized works in git too after a fashion. Most projects have a
master repo hidden somewhere that frequently gets pushed out for
publishing and which most (all?) contributors sync against from time to
time, but it's by no means a certainty. What *is* a certainty is that
the published branches are exactly identical to the ones in the master
repo, and all the downstream authors will get a history where they can
easily track master's development.

For git, I suppose Junio has the hidden master repo which he publishes
at kernel.org. Linus does the same with the Linux repo.

On a side-note, it sounds as though the "bound branch" scenario
encourages making a big change as one mega-diff, so long as it
implements one feature, whereas the git workflow with topic-branches
that eventually get merged to master allows changes to sort of
accumulate up to a feature in the steps one actually has to take to make
the feature work.

Side-note 2: Three really great things that have made work a lot easier
and more enjoyable since we changed from cvs to git and that aren't
mentioned in the comparison table:
* Depen...
Well, the discussion was about checkouts, so I was talking about
checkouts ;-).

What you mention is the default behavior of Bazaar when you use
"bzr branch" or "bzr get". BTW, it's also possible to do this with a

You have "bzr bundle" in Bazaar, and there was work to have it
actually send the email ( http://bazaar-vcs.org/SubmitByMail ), but I
don't think it's finished yet.

And yes, this is a great feature, the first time I used it was with
Darcs, and I was impressed how easily I could submit a patch without any
setup and with a 5-line tutorial. Even wiki seems complex after

Sure. Once again, Bazaar does it this way too. There's an _additional
feature_ called checkout which allows you to work in another way,
though. Like most "features", it's not useful to everybody.

Sure. And regarding this, hopefully, most modern VCSes go in the same
direction.

> * Dependency/history graph display tools
Differences in nomenclature are really messing this discussion up. In
git, a "checkout" is the act of pulling objects from the object database

Now I'm really confused. Does bazaar have both "clone" (git-style
fetching a full repo and all the branches) and "checkout" (cvs-style

Yes, it has both. That's "bzr branch" (git clone) and "bzr checkout"
(cvs checkout).

The difference between "bzr branch" and "git clone" is that bzr doesn't
fetch all the branches. It fetches one "branch" (succession of
revisions) with all the ancestors of the revisions of the branch.

--
Matthieu
-
> > * Dependency/history graph display tools
You did. The plugin is largely based on my experiences with the git
version, and explicitly gives credit in the comments.
-
In bzr, the "bundle" appears like a patch, but it actually contains the
same information as the revision(s) it contains (I believe this
applies to hg and Darcs too). A bundle can be used almost like a
branch. That's a key point, since revision identity is not based on
content's hash, so applying a patch is very different from merging a

That's the key point, but patch review for non-accidental developers

Bazaar's bundles use base64 encoding for binaries. I don't think that's
an efficient binary diff (xdelta-like) though. Aaron has been fighting
quite a lot with MUAs and MTAs mixing up the patches (line endings in
particular) ...

--
Matthieu
-
The patch generated by git-format-patch has author information (in
"From:" header), original commit date (in "Date:" header), commit
message (first line in "Subject:", rest in message body), place for
comments which are not to be included in commit message, diffstat for
easier patch review, and git extended diff (with information about
rename detection, mode changes, 7-character-wide abbreviations of file
content identifiers). It does not record parent information, the original
committer and committer date, which branch we are on, etc. You can quite
easily provide ordering of patches.

Sending patches via email prohibits the first line of the commit message
from being enclosed in brackets (the subject usually is "[PATCH] Commit
description" or "[PATCH n/m] Commit description") and enforces the git
convention that a commit message consists of a first line describing the
commit briefly, separated by an empty line from the longer description
and signoff lines.

If I remember correctly, the git binary diff format is xdiff based, and
uses a kind of ascii85 encoding (PostScript).

--
Jakub Narebski
Poland
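
The mail layout described in the message above can be sketched in a few
lines of Python. This is only an illustration under stated assumptions: the
sample mail, the author name and the helper parse_patch_mail are made up,
and the parsing is far simpler than what git-am really does.

```python
import re
from email import message_from_string

# Hypothetical sample mail; every name, date and file in it is made up.
SAMPLE = """\
From: A U Thor <author@example.com>
Date: Tue, 17 Oct 2006 16:41:02 +0200
Subject: [PATCH 2/6] Teach frobnicate about packed refs

Longer explanation of the change.

Signed-off-by: A U Thor <author@example.com>
---
Comments here are not part of the commit message.

 frob.c |    2 +-
"""

# Matches "[PATCH]" or "[PATCH n/m]" at the start of the subject.
PREFIX = re.compile(r"^\[PATCH(?:\s+(\d+)/(\d+))?\]\s*")


def parse_patch_mail(text):
    """Split a patch mail into author, date, series position and message."""
    mail = message_from_string(text)
    subject = mail["Subject"]
    match = PREFIX.match(subject)
    position = None
    if match and match.group(1):
        position = (int(match.group(1)), int(match.group(2)))
    # The bracketed prefix is dropped, which is why a commit summary
    # cannot itself start with bracketed text.
    summary = PREFIX.sub("", subject)
    # Everything after the first "---" line is commentary, diffstat and diff.
    body = mail.get_payload().partition("\n---\n")[0].strip()
    return {
        "author": mail["From"],
        "date": mail["Date"],
        "position": position,
        "message": summary + "\n\n" + body,
    }
```

The series position (n, m) is what lets a receiver put the patches back in
order; parent and committer information is simply not in the mail.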
-
Dear diary, on Tue, Oct 17, 2006 at 04:41:02PM CEST, I got a letter
It should be noted that there's no user interface for sending/receiving
that and I suspect no reasonably usable user interface for creating it.

How frequently are the bundles used in practice?
It's a cultural difference, I suspect. Git comes from an environment
based on intensive exchanges of patches and patch series and an
environment not mandating developers to use any tool besides diff/patch,
so Git is very focused on good support for applying patches and there
simply has been no big conscious demand for bundles support given this.

Another aspect of this is that Git (Linus ;) is very focused on getting
the history right, nice and clean (though it does not _mandate_ it and
you can just wildly do one commit after another; it just provides tools
to easily do it). This means that the downstream maintainers have to
rebase patches, possibly reorder them, and update the changesets with
bugfixes instead of stacking the bugfixes upon them in separate changes
- then Linus merges the patches and only at that point they are "etched"
forever. This means that the history will contain neatly laid out way
of how $FEATURE was achieved, but of course also more work for
downstream maintainers.

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
-
Many times each day. Most submissions to the bzr mainline are done with

Yes, rebasing is very uncommon in the bzr community. We would rather
evaluate the complete change than walk through its history. (Bundles
only show the changes you made, not the changes you merged from the
mainline.)

In an earlier form, bundles contained a patch for every revision, and
people *hated* reading them. So there's definitely a cultural
difference there.

Aaron
-
Take for example
"[PATCH 0/6] ref deletion and D/F conflict avoidance with packed-refs."

Isn't it easier to review than a "bundle", a.k.a. a mega-patch?
--
Jakub Narebski
Poland
-
There are even more important reasons to prefer a series of
micro-commits over a mega-patch than just ease of merging.

In the cairo project, I've often reviewed a single patch and said:

    "This all looks like perfectly good code and I'd be happy to
    have it all in the tree. But please rebuild this as a series
    of independent patches (perhaps along the lines of a, b, c,
    ...)"

I do that not just to make the history "look nice" but because code
history is something we _use_ a lot and separate commits for separate
actions just make the history so much more usable.

We have great tools like bisect to identify commits that introduce
bugs. I know that I'd be delighted to see bisect come back pointing
at some minimal commit as causing a bug, (which would make finding the
bug so much easier).

But it's also been my experience that the largest commits are also the
most likely to be the things returned by bisect. Big commits really do
introduce bugs more frequently than small commits.

Finally, if someone had gone through the useful work to create small,
independent changes, (and likely finding and fixing bugs in the
process), what a horrible shame it would be to throw away that work
and merge it as a single patch, (welcome to the pain of CVS branch
merging).

Now, I do admit that it is often useful to take the overall view of a
patch series being submitted. This is often the case when a patch
series is in some sub-module of the code for which I don't have as
much direct involvement. In cases like that I will often do review
only of the diff between the tips of the mainline and the branch of
interest, (or if I trust the maintainer enough, perhaps just the
diffstat between the two). But I'm still very glad that what lands in
the history is the series of independent changes, and not one mega
commit.

-Carl
A bundle isn't a mega-patch. It contains all the source revisions. So
when you merge or pull it, you get all the original revisions in your

Bisect should work equally well with revisions pulled or merged from a

The number of changes shown in the diff has nothing to do with the

So the difference here is that bundles preserve the original commits the
changes came from, so even though it's presented as an overview, you
still have a series of independent changes in your history.

Aaron
-
But what the patch reviewer sees is a mega-patch showing the changeset
of a whole "bundle", isn't it?

I think it is much better to review a series of patches commit by commit;
besides, it allows correcting some inner patches before applying the whole
series, or dropping one of the patches in the series (and it has happened
from time to time on the git mailing list).

So if git introduces bundles, I think they would take the form of a series
of "patch" mails + an introductory email with the series description
(currently it is not saved anywhere), shortlog, diffstat and perhaps more
metainfo like bundle parent (which I think should be email form of branch
really), tags introduced etc.
--
Jakub Narebski
Poland
-
Yes. Carl was saying that, aside from the issue of what a reviewer
sees, a bundle is bad for other reasons. I am saying those other
reasons don't apply. I wasn't addressing the issue of what a reviewer sees.

To me, seeing the individual patches is like reading a book where every
page has a different word on it, and so it's hard to put it together
into a full sentence. I'm not saying my way is The Right Way, just my
personal preference.

For larger pieces of work, we try to split them up into logical units,
and merge those units independently.

The Bundle format can also support a patch-by-patch output, but we don't

It's important to remember that bundles represent revisions, not
patches. When you merge a bundle, you

1. install those revisions into your repository. These revisions are
   latent, as though they were on another branch.
2. merge the head revision of the bundle into your branch.

Virtually any merge selection process that works with branches would
also work with bundles. So tweaking before merging is really a matter

The parent in a bundle revision is the revision-id of the parent of that
revision in the branch. I don't think it's possible to change that
parent id into something else, without changing the meaning of a bundle.

Aaron
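
The two steps of merging a bundle (install the revisions, then merge the
head) can be modelled roughly as below. This is a toy sketch, not bzr code:
the repository is assumed to be a plain dict mapping revision ids to parent
ids, and merge_bundle and the bundle layout are hypothetical names chosen
for the illustration.

```python
def merge_bundle(repo, branch_tip, bundle):
    """Install a bundle's revisions into repo, then merge its head.

    repo maps revision id -> tuple of parent ids (toy model).
    bundle is {"revisions": {id: parents}, "head": id} (assumed layout).
    """
    # Every parent must already be in the repository or come with the bundle.
    known = set(repo) | set(bundle["revisions"])
    for rev_id, parents in bundle["revisions"].items():
        for parent in parents:
            if parent not in known:
                raise ValueError(f"missing parent {parent!r} of {rev_id!r}")

    # Step 1: install the revisions. They are latent: no branch points at
    # them yet, as though they lived on another branch.
    for rev_id, parents in bundle["revisions"].items():
        repo.setdefault(rev_id, tuple(parents))

    # Step 2: merge the head revision of the bundle into the branch.
    merge_id = f"merge({branch_tip}+{bundle['head']})"
    repo[merge_id] = (branch_tip, bundle["head"])
    return merge_id
```

Because the original revisions are installed unchanged, the history keeps
the individual commits even though the reviewer saw one overview diff.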
-
As for what the reviewer wants to see, I think it depends on what kind
of code it is. Kernel code is complex and does not have (at least I have
not heard of) unit-tests, so short patches are preferable for review.
And since C is one of the more verbose languages, short patches mean
splitting them up into several pieces.

On the other hand bzr has unit-tests and python is less verbose, so the
single patch for a feature is not so big and is manageable. The patches
to bzr still come in logical steps, but usually one step per feature is
enough.

Also programmers usually don't develop even the single logical step as a
single commit. Instead they also commit to back up their work,
when they try something they think they may return to in the future, when
they need to continue on another computer and so on. And these commits are
generally not logical steps. Also the steps are often not in a logical
order. Therefore showing a diff for each commit in the bundle often does
not make sense.

So there is one bundle per logical step, and it therefore has a summary
diff. Individual bundles for individual steps are preferable anyway,
since the maintainer may decide to accept just some of them. A tool to
generate a series of bundles (either each with just one commit or each
with several commits) would be possible, just no one was interested
enough to do it yet.

--------------------------------------------------------------------------------
- Jan Hudec `Bulb' <bulb@ucw.cz>
-
In git you can back up your work on a temporary branch; besides there

That is why before sending a patch series based on some feature branch,
you should at least rebase the branch on top of current work, to ensure
that the series would apply cleanly.

If a feature branch/patch series needs cleanup (going from "answer" to
"solution" http://lkml.org/lkml/2005/4/7/176), i.e. patch (commit)
reordering, joining two patches into one, patch splitting, you can
use git-cherry-pick, git-cherry-pick --no-commit and git commit --amend
combination, or git-format-patch, patch editing and reordering, and git-am.
Or just use StGit or pg.

--
Jakub Narebski
Poland
-
Dear diary, on Wed, Oct 18, 2006 at 02:30:14AM CEST, I got a letter
BTW, I think what describes the Git's (kernel's) stance very nicely is
what I call the Al Viro's "homework problem":

    http://lkml.org/lkml/2005/4/7/176

If I understand you right, the bzr approach is what's described as "the
dumbest kind" there? (No offense meant!)

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
-
Yes and no. The bundle includes both the full final thing, and each
step along the way. Each step along the way is something you'll get
when you merge it.

Once merged, it will be "next one" in the description above. It would
typically look something like this in "bzr log" (shortened). In this
example, doing C requires doing A and B as well...

committer: foobar@foobar.com
message: merged in C
-------
committer: bar@bar.com
message: opps, fix bug in A
-------
committer: bar@bar.com
message: implement B
-------
committer: bar@bar.com
message: implement A

So, you'll get full history, including errors made :) You can also
see who approved it to this branch (foobar) and who did the actual
work (bar)

/Erik
-
Dear diary, on Wed, Oct 18, 2006 at 11:28:32AM CEST, I got a letter
I see, that's what I've been missing, thanks. So it's the middle path
(as any other commonly used VCS for that matter, except maybe darcs?;
patch queues and rebasing count but it's a hack, not something properly
supported by the design of Git, since at this point the development
cannot be fully distributed).

I also assume that given this is the case, the big diff does really not
serve any purpose besides human review?

But somewhere else in the thread it's been said that bundles can also
contain merges. Does that mean that bundles can look like:

   1
  / \
 2   4
 |   |   _
 3   5    |
  \ /     | a bundle
   6      |
          ~

In that case, against what is the big diff from 6 done? 2? 4? Or even 1?
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
-
When you run the "bundle" command, you can tell it what you want the
bundle to be created against. So, if I just committed 5, I can run
"bzr bundle -r-1" to get the bundle against 4, or I can do "bzr bundle
path/to/other/branch" to get a bundle that relates to it.

To merge a bundle into a branch, the parent of the first revision in
the bundle has to exist in the branch it's being merged into. (well,
unless you use patch, but that's outside of bzr, and bzr wouldn't know
about each revision in them)

This command will find a common root and create a bundle that
corresponds to it. The "big diff" as you call it, would be the
changes between the point where the branch was created, and the last
commit.

In the case of just committing 5, and you want to create a bundle that
can be merged back at point 6, the "big diff" would be against 1 since
that's the branch point.

/Erik
--
google talk/jabber. zindar@gmail.com
SIP-phones: sip:erik_bagfors@gizmoproject.com
sip:17476714687@proxy01.sipphone.com
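
The "find a common root" step described above can be illustrated on the
numbered graph from the question. This is a rough sketch under assumptions:
history is a dict of revision -> parents, and bundle_against is a
hypothetical helper; it is not how bzr itself picks the base.

```python
def ancestors(parents, rev):
    """Return rev plus everything reachable through parent links."""
    seen, stack = set(), [rev]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def bundle_against(parents, tip, target):
    """Return (bases, revisions) for a bundle of tip relative to target."""
    mine = ancestors(parents, tip)
    theirs = ancestors(parents, target)
    common = mine & theirs
    # Keep only the latest common ancestors: those that are not themselves
    # an ancestor of another common ancestor.
    bases = {
        c for c in common
        if not any(c != other and c in ancestors(parents, other)
                   for other in common)
    }
    # The bundle carries the revisions the target does not have yet.
    return sorted(bases), sorted(mine - theirs)


# Revisions 1..5 from the diagram, before the merge 6 exists.
PARENTS = {1: (), 2: (1,), 3: (2,), 4: (1,), 5: (4,)}
```

With this model, bundling 5 against the branch ending in 3 gives base 1 and
revisions 4 and 5, while bundling 5 against 4 gives base 4 and only
revision 5.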
-
>
Git cannot do that remotely (with exception of git-tar-tree/git-archive
which has --remote option), yet. But you can get contents of a file
(with "git cat-file -p [<revision>:|:<stage>:]<filename>"), list
directory (with "git ls-tree <tree-ish>") and compare files or
directories (git diff family of commands) without need for working
directory.

AFAICT working area is required _only_ to resolve conflicts during

In git by default in the top directory of working area you have .git
directory which contains whole repository (object database, refs (i.e.
branches and tags), information which branch is current, index aka.
gitcache, configuration, etc.). You can share object database locally
(which includes network filesystem).

You can have .git (usually <project>.git then) directory without working
area.

There was a proposal to allow for tracking branches to be marked
read-only, but it was not implemented yet.

But git has a reverse check: it forbids (unless forced by user) fetching
into a branch which has local changes (does not fast-forward). This makes
sure that no information is lost.

The idea is that you fetch changes into tracking branch (e.g. 'master'
branch of some parent remote repository into 'origin' or
'remotes/<repository name>/master' branch); you don't commit changes to
such branch. You do your own work either on 'master' branch, then merge
(typically using "git pull") corresponding 'origin' tracking branch, or
use separate private feature branch and use rebase after fetch.

Git is designed for distributed workflows, not for centralized one.
All repositories are created equal :-)

--
Jakub Narebski
ShadeHawk on #git and #revctl
Poland
-
Same as bzr then I believe. "bzr pull" will suggest you to use "merge"

Note that "bound branches" and "other branches" in bzr are not so
different. The "master" (the one you make a checkout of) doesn't have
to know it has checkouts, and the "checkout" just has one file
pointing to the "master", and you can switch from one flow to the
other with "bzr bind/unbind".

So, in Bazaar, all repositories are /almost/ created equal ;-).
--
Matthieu
-
On Tue, 17 Oct 2006 13:45:31 +0200
Interesting, I didn't know about the --remote option. So in fact as long
as the remote has enabled upload-tar then anyone can do a "light checkout".
However, it appears that kernel.org for instance doesn't enable this feature.

Sean
-
And you can use GIT_DIR environmental variable or --git-dir option
to git wrapper.
--
Jakub Narebski
Poland
-
On Tue, 17 Oct 2006 13:19:08 +0200
Git can do this from a local repository, it just can't do it from
a remote repo (at least over the git native protocol). However,
over gitweb you can grab and unpack a tarball from a remote repo.

This doesn't sound right, at least in the spirit of git. Git really
wants to have a local commit which you may or may not push to a
remote repo at a later time. There is no upside to forcing it all to
happen in one step, and a lot of downsides. Git's focus is to support
distributed offline development, not requiring a remote repo to be

Again this seems really anti-git. There is no reason for your local
branch to be marked read only just because some upstream branch is

This is exactly the same in Git. You really only ever push upstream
when your local changes fast forward the remote, (ie. you're up to date).

While Git really isn't meant to work in a centralized way there's nothing
preventing such a work flow. It just requires the use of some surrounding
infrastructure.

Sean
-
Anyway, given the price of disk space today, this only makes sense if
you have a fast access to the repository (otherwise, you consider your
local repository as a cache, and you're ready to pay the disk space
price to save your bandwidth). In this case, it's often in your

I lied in my above description ;-).
I should have said "by default" ... but you have "commit --local" if
you want to have a local commit on a bound branch (at this point, I
should remind that not all branches are "bound branches". "bzr branch"

Well, take the example of my bzr setup.
I have one repository, say, $repo.
In it, I have one branch "$repo/bzr.dev" which is an exact mirror of
http://bazaar-vcs.org's branch.

I also have branches for patches (occasional in my case) that I'll
send to upstream. Say $repo/feature1, $repo/feature2, ...

If, by mistake, I start hacking on bzr.dev itself, I'll be warned at
commit time, create a branch, and commit in this new branch. I believe
git manages this in a different way, allowing you to commit in this
branch, and creating the branch next time you pull. But you know this

Yes, but you will have to do a merge at some point, right? While I'm
keeping a purely linear history (not that it is good in the general
case, but for "projects" on which I'm the only developer, I find it
good. For example, my ${HOME}/etc/).

But don't get me wrong, I also prefer the decentralized way in most
cases. And I'm happy that bzr and git work like this by default. Just
that at least *I* have cases where a centralized approach suits me
better, and then I'm happy with that particular feature of bzr.

--
Matthieu
-
Dear diary, on Tue, Oct 17, 2006 at 02:03:21PM CEST, I got a letter
In fact, in Git the branch is actually created at the moment you clone.
For simplicity's sake, let's say you cloned just a single branch, not the
whole repository (or imagine a repository with a single branch). Then,
in your local repository, two branches will be created: 'origin' and
'master'. The origin branch is considered readonly (though Git does
not enforce it) and only mirrors the branch in the remote repository.
The master branch is the branch you do your work on, and it corresponds
to the contents of your working tree.

Thus, when you are "updating" your repository (we also call that
"pull"), what happens is that new commits are _fetched_ from the remote
repository to your 'origin' branch and then the 'origin' branch is
_merged_ to the 'master' branch. (You can even separate those two steps
and do them manually. So you can e.g. periodically fetch but just check
diffs with your master branch and never actually merge, or whatever.)

If you never do any local commits on the repository, every time you
merge, the 'master' branch is an ancestor of the 'origin' branch and only
a so-called fast-forward merge happens - the 'master' branch is updated to
point at the same commit as the 'origin' branch.

If you _did_ do some local commits, a real merge of the two branches
happens and a new merge commit tying the current master and origin
history together is recorded on the merge branch.

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
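
The "pull = fetch + merge" description above can be sketched as a toy
model. Everything here is an assumption made for illustration: history is a
dict of commit -> parents, branches is a dict of branch name -> tip, and
pull is a hypothetical function, not git's implementation.

```python
def is_ancestor(parents, old, new):
    """True if old is reachable from new through parent links."""
    stack, seen = [new], set()
    while stack:
        current = stack.pop()
        if current == old:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return False


def pull(parents, branches, remote_tip):
    """Fetch remote_tip into 'origin', then merge 'origin' into 'master'."""
    # Fetch: the tracking branch just mirrors the remote branch.
    branches["origin"] = remote_tip
    master = branches["master"]
    if is_ancestor(parents, remote_tip, master):
        return "up to date"
    if is_ancestor(parents, master, remote_tip):
        # No local commits: move 'master' to the same commit as 'origin'.
        branches["master"] = remote_tip
        return "fast-forward"
    # Local commits exist: record a merge commit with two parents.
    merge_id = f"merge({master}+{remote_tip})"
    parents[merge_id] = (master, remote_tip)
    branches["master"] = merge_id
    return "merge"
```

Splitting fetch from merge, as the message suggests, would simply mean
stopping after the first assignment and inspecting 'origin' by hand.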
-
Out of curiosity, what happens if you accidentally commit to it?
--
Matthieu
-
It will quietly accept the commit.
Later when you attempt to run `git fetch` to download any changes
from the remote repository to your local origin branch the fetch
command will fail as it won't be a strict fast-forward due to
there being changes in origin which aren't in the remote repository
being downloaded.

The user can force those changes to be thrown away with `git fetch
--force`, though they probably would want to first examine the
branch with `git log origin` to see what commits (if any) should
be saved, and either extract them to patches for reapplication or
create a holder branch via `git branch holder origin` to allow them
to later merge the holder branch (or parts thereof) after the fetch
has forced origin to match the remote repository.

So in short by default Git stops and tells the user something fishy
is going on, but the error message isn't obvious about what that
is and how they can resolve it easily.

There has been discussion about marking these branches that we
know the user fetches into as read-only, to prevent `git commit`
from actually committing to such a branch (we also have the same
case with the special bisect branch), but I don't think anyone has
stepped forward with the complete implementation of that yet.

Like anything I think people get used to the idea that those branches
are strictly for fetching and shouldn't be used for anything else.
There's really no reason to checkout a fetched into branch anyway;
temporary branches are less than 1 second away with
`git checkout -b tmp origin` (for example).

--
Shawn.
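
The fetch behaviour described above (refuse a non-fast-forward update
unless forced, after parking the stray commits on a holder branch) can be
modelled as follows. This is a sketch with made-up names: the dict-based
history, the fetch function and its force parameter only mimic the idea
behind `git fetch --force`, they are not git's code.

```python
def is_ancestor(parents, old, new):
    """True if old is reachable from new through parent links."""
    stack, seen = [new], set()
    while stack:
        current = stack.pop()
        if current == old:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return False


def fetch(parents, branches, name, remote_tip, force=False):
    """Update tracking branch name to remote_tip, guarding local commits."""
    local_tip = branches[name]
    if not force and not is_ancestor(parents, local_tip, remote_tip):
        # The branch holds commits the remote does not have; stop here
        # rather than silently lose them.
        raise RuntimeError(f"{name}: not a fast-forward")
    branches[name] = remote_tip


# "oops" was committed on origin by accident; the remote moved on to "c".
parents = {"a": (), "b": ("a",), "c": ("b",), "oops": ("b",)}
branches = {"origin": "oops"}
try:
    fetch(parents, branches, "origin", "c")
except RuntimeError:
    branches["holder"] = branches["origin"]   # like `git branch holder origin`
    fetch(parents, branches, "origin", "c", force=True)
```

After this runs, 'origin' matches the remote again and the accidental
commit is still reachable through 'holder'.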
-
Dear diary, on Tue, Oct 17, 2006 at 02:03:21PM CEST, I got a letter
(In rich countries. This may still be very different in poorer
countries. E.g. some actual mplayer developer(s) from Turkey opposed
transition to a distributed version control system simply because they
have trouble affording the required additional diskspace for the full
history. SVN is already very space-hungry for them. (It stores
basically two complete checkouts in parallel.))

But the much bigger practical problem is bandwidth, plenty of people
still have internet connections where downloading several tens/hundreds
of megabytes of the complete history is quite a big thing, and the
servers ain't gonna be happy from that either, nor those paying the
bandwidth bills. ;-) And this is one of the big problems the Mozilla
guys have - having everyone download 450M worth of the full CVS-imported
history (and I'll bet no other VCS will beat that size) seems to be not

So how is the light checkout actually implemented? Do you grab the
complete new snapshot each time the remote repository is updated? Do all
the (at least read-only, like "log" and "diff", perhaps "status")
commands work on such a light checkout?

This is something sorely missing in Git but if it's really only "we just
provide bandwidth-expensive way to keep your tree up-to-date and that's
all," that would not be hard at all to implement in Git too, using
git-archive --remote.

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
-
No, the lightweight checkouts store very little. They have
- a copy of tree shape (filenames, paths, sha1 sums) from the last
  commit.
- a copy of tree shape for the current working directory

Yes. And if you check out from a read-write branch, all write commands
work, too.

Aaron
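
A "tree shape" of the kind described above (paths plus sha1 sums) is enough
to answer status-like questions without any file contents from the
repository. The sketch below is an assumption-laden illustration, not bzr's
format: tree_shape, status and demo are hypothetical helpers, and the shape
is just a dict of relative path -> sha1 hex digest.

```python
import hashlib
import os
import tempfile


def tree_shape(root):
    """Map each file's path (relative to root) to the sha1 of its content."""
    shape = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as handle:
                digest = hashlib.sha1(handle.read()).hexdigest()
            shape[os.path.relpath(full, root)] = digest
    return shape


def status(committed, working):
    """Compare the last-commit shape with the working-directory shape."""
    both = set(committed) & set(working)
    return {
        "added": sorted(set(working) - set(committed)),
        "removed": sorted(set(committed) - set(working)),
        "modified": sorted(p for p in both if committed[p] != working[p]),
    }


def demo():
    """Record a shape, change the directory, and report the difference."""
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "a.txt"), "w") as handle:
            handle.write("one\n")
        committed = tree_shape(root)          # shape at the "last commit"
        with open(os.path.join(root, "a.txt"), "w") as handle:
            handle.write("two\n")
        with open(os.path.join(root, "b.txt"), "w") as handle:
            handle.write("new\n")
        return status(committed, tree_shape(root))
```

Anything that needs actual old file contents, such as a diff, would still
have to ask the branch the checkout points at.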
-
Ah. So in git terminology it stores index and working directory
(and perhaps the name of branch).

--
Jakub Narebski
Poland
-
Dear diary, on Wed, Oct 18, 2006 at 02:38:37AM CEST, I got a letter
I see, I guess that means "the index file and tree objects for the last

Ok, one last question - do you do most of the work locally, fetching
bits of data as you need, or remotely, only taking input/producing
output over the network (the pserver model)?

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
-
Personally, I do not do remote commits over slow links. At home, I use
a single machine, and mirror my repository to a public machine using
rsync. At work, I store my repository on an NFS server, and push my
repository to a public machine using rsync.

Aaron
-
Dear diary, on Wed, Oct 18, 2006 at 02:50:54AM CEST, I got a letter
I meant the work of the commands (bzr log and such), not your personal
workflow. :-) Sorry for being unclear.

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
-
When using the native network protocol, work can happen remotely. (But
the native protocol is quite new, and support for "smart" operations is
currently limited.) When using the dumb protocols, data is fetched from
the remote system and processed locally. Light checkouts are not
recommended when the server is on a slow link, but heavyweight checkouts
are quite suitable in that situation.

Aaron
-
On Tue, 17 Oct 2006 14:03:21 +0200
This is most likely the reason that people using Git don't clamor
more for the ability to work without a local repository. Disk is cheap
and it just makes sense the vast majority of the time to have a complete
copy of the repository yourself. There are a lot of powerful things
you can do once you have all that information in your repo. Not the least
of which is performing any and all operations while flying on a plane

Well, with Git the default is to only commit locally. Of course, you
could set your post commit hook to always push it to a remote if

Well, it's just a slight difference in perspective rather than any
big issue here. Git treats all repositories as peers, so it would never
assume that just because one other particular repo has a branch marked
as read only that it should be marked read only locally. It lets you
commit to it, and then push to say a third and fourth repo that are
writable as well. In practice this doesn't really cause any

Well if you're committing changes from multiple different machines,
how is that different from having say 3 different developers committing
changes to the central repo? How does bzr avoid a merge when you're
pushing changes from 3 separate machines?

You mentioned that if you try to push and you're not up to date you'll
be prompted to update (ie. pull from the upstream repo). When you do such
a pull do your local changes get rebased on top or is there a merge? By
your comments I guess you're saying they're rebased rather than merged, and
this is how you keep a linear history. Git can do this easily, but it's
not done by default.

Sean
-
The workflow is different.
If I commit broken changes on a repository shared by multiple
developers, they'll insult me, and they'll be right. While I find
nothing wrong in committing broken changes to my ${HOME}/etc/ when

Err, the same way people have been doing for years ;-). If you don't
have local commits, "bzr update" will work in the same way as "cvs
update", it keeps your local changes, without recording history. Like
"git pull" does if you have uncommitted changes I think.

--
Matthieu
-
On Tue, 17 Oct 2006 15:44:36 +0200
Ah, okay. Well Git can definitely manage this. Just means you have to
rebase any local changes before pushing. This will keep the history
linear and make sure that no merges are needed in the case you were asking
about.

So far, it sounds to me like bazaar and git are more alike than they are
different. Each has a few commands the other doesn't but all in all
they sound very similar. But I'm a Git fanboy so I ain't switching
now ;o)

Sean
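
The point about rebasing local changes before pushing, so that history
stays linear, can be illustrated with a toy model. The assumptions are
loud: history is a dict of commit -> parents, the local commits form a
simple chain that shares history with upstream, and rebase is a
hypothetical helper that only copies commit ids, not real changes.

```python
def ancestors(parents, rev):
    """Return rev plus everything reachable through parent links."""
    seen, stack = set(), [rev]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def rebase(parents, local_tip, upstream_tip):
    """Replay local-only commits on top of upstream_tip; return the new tip."""
    upstream = ancestors(parents, upstream_tip)
    local_only = []
    rev = local_tip
    # Walk back along first parents until we reach shared history.
    while rev not in upstream:
        local_only.append(rev)
        rev = parents[rev][0]
    new_tip = upstream_tip
    # Re-create each local commit, oldest first, with a new single parent.
    for old in reversed(local_only):
        copy = old + "'"
        parents[copy] = (new_tip,)
        new_tip = copy
    return new_tip
```

Since every replayed commit has exactly one parent, pushing the new tip is
a fast-forward for the upstream branch and no merge commit is needed.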
-
Sure. As I said before, the little add-on of checkouts is that you say
once "I don't want to do local commit here", and bzr reminds you this
each time you commit. Well, where it can make a difference is that it
does it in a transactional way, that is, you don't have that little
window between the time you pull and the time you push your next

Sure. And at least, if you want to prove that your decentralized SCM
is the best, you'd better look at features other than the ability to
commit on a local branch ;-). If you want a _real_ flamewar, better
talk about rename management or revision identity.

The thing is that most people migrated from CVS/svn, so they found
their new SCM to be incredibly better than the existing one. But it's
generally not _so_ much better than the other modern alternatives ;-).
(and don't forget to thank Darcs and Monotone who brought most of the good

Probably not going to switch either, but that might happen.
--
Matthieu
-
On Tue, 17 Oct 2006 16:19:46 +0200
Yeah, it would be bad luck, but Git wouldn't actually let the push
succeed if someone had changed the upstream repo in that small window.
It would complain that your push wasn't a fast forward and ask you

Heh, true enough. And the fact is they're all "borrowing" the
best ideas from one another. All of a sudden the others are all
getting git-like bisect and gitk guis. And of course Linus has
said that he got quite a bit of inspiration from Monotone
originally.

Beyond the distributed offline nature of using Git, the killer
"feature" for me is its raw speed and flexibility[1]. It's
really nice to be able to branch in under a second and try
out a line of development etc. Maybe this is just as easy
in Bazaar but it's not true of say Mercurial. Honestly, I
just can't imagine any other SCM meeting my needs better than
Git. So I have a hard time taking complaints about rename
management or revision identity seriously.

While they don't affect my usage, IMHO the two biggest failings
of Git are its lack of a shallow clone and its reliance on shell
and other scripting languages so there is no native Windows version.
I'm sure both of these areas are handled better by Bazaar and/or
some of the other new SCMs where they'd be a better choice than
Git.Sean
[1] As an aside, I don't understand why bazaar pushes the idea
of "plugins". For instance someone mentioned that bazaar has
a bisect "plugin". Well Git was able to add a bisect "command"
without needing a plugin architecture.. so i'm at a loss as
to why plugins are seen as an advantage.-
Matthieu Moy wrote:
The revision will change between different repos though, so
random-contributor A that doesn't have his repo publicised needs to send
patches and can't log his exact problem revision somewhere, which makes
it hard for random contributor B that runs into a similar problem but on
a different project sometime later to find the offending code. I prefer
the git way, but I'm a git user and probably biased.

That said, it shouldn't be impossible to add fixed, user-friendly
bazaar-like revision numbers for git. We just have to reverse the
<committish>[^~]<number> syntax to also accept <committish>+<number>.

This would work marvelously with serial development but breaks horribly
with merges unless the first (or last) commit on each new branch gets
given a tag or some such.

Either way, I'm fairly certain both bazaar and git need to distribute
information to the user in need of finding the revision (which url and
which number vs which sha). I also imagine that the bazaar users, just
like the git users, are sufficiently apt copy-paste people to never

Well, if two people have the same revision in git, you *know* they have
pulled from each other, because ALL objects are immutable. The point of

I imagine the bazaar-names with url+number only have local meaning unless
someone has access to your repository too. One of the great benefits of
git is that each revision is *always exactly the same* no matter in
which repository it appears. This includes file-content, filesystem

This I'm not so sure about. Anyone wanna fill out how shallow clones and

Check. Well, actually, you just clone it as usual but with the --bare

Works in git as well, but each "checkout" (actually, locally referenced
repository clone) gets a separate branch/tag namespace.

--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
-
No, you don't. They may have each pulled from a different repository.
Take revision 00aabbcc, created by Linus. Linus has it because he
committed it. I have it because I pulled Linus' repository. You have
it because Andrew Morton pulled Linus' repository, and you pulled Andrew

In Bazaar, a revision id always refers to the same logical entity, but

With most SCMs that store the repository in the root of the tree,
disentangling the tree and repository requires care. OTOH, this is just

In our terminology, if it can diverge from the original, it's a branch,
not a checkout.

Aaron
-
I realized it as I read it now. What I meant was that you know you have
This I don't understand. Let's say Alice has revision-154 in her repo,
located at alice.example.com. Let's say that commit is accessible with
the url "alice.example.com:revision-154". Bob pulls from her repo into
his own, which is located at bob.example.com.

Lots of questions here, so I'll split them up. Feel free to delete the
non-applicable ones.

Will the commit in Bob's repo be accessible at
"bob.example.com:revision-154"?

If it's not, how can you backtrack from old bugreports and find the
error being discussed?

If it is, how does that work if Bob suddenly wants to commit things
before Alice is done working with her changes?

Also, suppose they both push to a master-repo where Caesar has pushed
his changes and nicked the slot for revision-154. Does the master repo
re-organize everything and then invalidate Bob's and Alice's changes, or
does it tell Alice and Bob that they need to update and then reorganize
their repos before they're allowed to push?

I really can't get my head around the usefulness of revision-numbers
hopping around which is probably why I'm having such trouble grokking

You get the working tree files by default. Use --bare if you don't want
them to be checked out (i.e. written to the working tree) after the

This clears things up immensely. bazaar checkout != git checkout.
I still fail to see how a local copy you can't commit to is useful, but
it doesn't really matter to me as I've already found a tool that does
everything I want wrt scm needs.

--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
-
bzr differentiates between pull and merge. Pull is a mirroring command.
So with pull, yes revision-154 will be accessible at
bob.example.com:revision-154.

With merge, it won't. Bob can refer to it as "154:alice.example.com",

I don't see how this applies. You can always commit in a branch. If
alice and bob both commit, then they are diverged and can't pull. If

My bzr is run from a local copy I can't commit to. To get the latest
changes from http://bazaar-vcs.org, I can run "bzr update ~/bzr/dev".
To merge the latest changes into my branch, I can run
"bzr merge ~/bzr/dev". It's also convenient for applying other people's
patches to.

Aaron
-
> My bzr is run from a local copy I can't commit to.
Sure.
Aaron
-
Dear diary, on Tue, Oct 17, 2006 at 09:44:37PM CEST, I got a letter
The question is, why is it useful to enforce the "no commit" rule? Git
can work exactly the same, it just doesn't _enforce_ the rule. And is
the capability of enforcing such a rule important enough to warrant its
own column in the comparison table?

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
-
Another equation can help.
Revision Identity != Revision Number.
$ bzr log --show-ids
------------------------------------------------------------
revno: 1
revision-id: Matthieu.Moy@imag.fr-20061017152029-4c5a2861bcf23b7d
committer: Matthieu Moy <Matthieu.Moy@imag.fr>
branch nick: foo
timestamp: Tue 2006-10-17 17:20:29 +0200
message:
some message

See, bzr has this unique revision identifier (not based on a hashsum).
The design choice of bzr is to hide it as much as possible from the
user interface.

Then, if I'm in the branch in which I typed this command, I can refer
to this revision with simply

bzr whatever -r 1

In the general case, I can access it with

bzr whatever -r revid:Matthieu.Moy@imag.fr-20061017152029-4c5a2861bcf23b7d

(There's currently a lack in the UI to specify a remote revision-id,
but that's not a problem in the model itself)

bzr's internals use revision IDs almost exclusively (ancestry
information is all about revision ids), and revnos are a UI layered on
top of them.

I don't have strong needs in revision control, but I actually never
encountered a case where I had to access a revision by providing its
ID. So, for people like me, revision numbers are sufficient, and they
are simple (for example, I can tell without running any command that
revision 42 is older than revision 56 in a particular branch).

--
Matthieu
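
To illustrate the "Revision Identity != Revision Number" point, here is a minimal Python sketch. It assumes a toy model, not bzr's actual implementation: every revision has a globally unique id, and the simple revno is just the position of a revision on the first-parent chain (the "mainline") of one particular branch. The ids (a1, b2, m3, ...) and the names PARENTS, mainline and revno are hypothetical.

```python
# Toy history: revision id -> list of parent ids (first parent first).
PARENTS = {
    "a1": [],
    "a2": ["a1"],
    "a3": ["a2"],          # tip of Alice's branch
    "b2": ["a1"],
    "m3": ["b2", "a3"],    # tip of Bob's branch: he merged Alice's tip
}

def mainline(tip, parents=PARENTS):
    """Return the first-parent chain of a branch, oldest revision first."""
    chain = []
    rev = tip
    while rev is not None:
        chain.append(rev)
        ps = parents[rev]
        rev = ps[0] if ps else None
    chain.reverse()
    return chain

def revno(tip, revision_id, parents=PARENTS):
    """1-based number of revision_id in the branch ending at tip, or None
    if the revision is not on that branch's mainline."""
    line = mainline(tip, parents)
    return line.index(revision_id) + 1 if revision_id in line else None

if __name__ == "__main__":
    print(revno("a3", "a3"))  # number of a3 in Alice's branch
    print(revno("m3", "a3"))  # same revision id, looked up in Bob's branch
```

In this model the id "a3" names the same revision everywhere, while its number depends on which branch you ask.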
-
On Tue, 17 Oct 2006 10:05:41 -0400
Well his point was that they have pulled from each other directly or
indirectly. You can safely say that rev 00aabbcc.. in _any_ repository
is the same rev. This discussion started because of doubt expressed
by some here on the list that the "simple" numbering scheme used by
bzr can offer the same guarantee. That is, rev 1.2.1 may be completely

Why? Uncommitted changes shouldn't be propagated. Once you have cloned
the repo, you can check out your own copy of the working tree files.

Sean
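
The guarantee that a given id names the same revision in _any_ repository follows from how git derives object names from content. The Python sketch below hashes the header "<type> <size>\0" followed by the object body with SHA-1, which is how git names its objects. The helper toy_commit_body is a simplified, hypothetical stand-in for a commit: real commits also carry author and committer lines.

```python
import hashlib

def git_object_id(obj_type: str, body: bytes) -> str:
    """SHA-1 over '<type> <size>' + NUL + body, as a hex string."""
    header = f"{obj_type} {len(body)}\0".encode("ascii")
    return hashlib.sha1(header + body).hexdigest()

def toy_commit_body(tree_id: str, parent_ids, message: str) -> bytes:
    """Simplified commit text (assumption: no author/committer lines)."""
    lines = [f"tree {tree_id}"]
    lines += [f"parent {p}" for p in parent_ids]
    lines += ["", message]
    return "\n".join(lines).encode("utf-8")

if __name__ == "__main__":
    tree = git_object_id("tree", b"")
    root = git_object_id("commit", toy_commit_body(tree, [], "initial"))
    child = git_object_id("commit", toy_commit_body(tree, [root], "second"))
    # The child's id depends on the parent's id, so an id pins down the
    # whole history behind it.
    print(root, child)
```

Because the parent ids are part of the hashed text, two repositories can only agree on a commit id if they agree on everything that commit points to.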
-
It does work, very well at that.
I have a directory for each separate branch and simply use
cd(1) to change the current working directory to that branch.
So, instead of "git checkout <branch>", I do "cd ../<branch>".

One only needs to watch out when one updates the repository.
If there had been updates in those branches, then one needs
to git-reset the "branch" directory... (you know what I mean)
(For example when I come to work in the morning and sync up
with home from my usb key...)

The script is called:

Usage: git-mkdir-of-branch <original-directory> <branch> <new-directory>
where <branch> is the name of an existing branch in <original-directory>/.git/refs/heads

and uses simple symbolic links and some git plumbing to do the
job. It can be found in my git trees. I never bothered to send
it out to Junio, since it could be considered heretic. ;-)

Luben
-
Unless you have branch(es) with totally different contents, like git.git
But without .git being either a symlink, or a .git/.gitdir "symref"-link,
you have to remember what to set GIT_DIR to, or the parameter for the
--git-dir option.

I'd like to mention once again that in Git branches and tags have
a namespace totally separate from the repository namespace.
--
Jakub Narebski
Poland
-
Yes. I have to say, that's likely a fairly odd case, and I wouldn't be
surprised if other VCS's don't support that mode of operation at _all_.

The fact that git branches can be independent of each other is very

I'd strongly suggest that people who do this should actually do

git clone -l

instead of actually playing games with symlinking .git/ itself or using
GIT_DIR. It means that the two checkouts get separate branch namespaces,
but that's really what you'd want most of the time.

You _can_ share the whole branch namespace and do the symlink of .git (or
just set GIT_DIR - but that's pretty inconvenient), and it might end up
being "closer" to what some other VCS would do. But the natural thing to
do with git is to just share some of the objects through local "slaving"
of the repositories, and consider them otherwise entirely independent.

Linus
-
Bazaar also supports multiple unrelated branches in a repository, as
does CVS, SVN (depending how you squint), Arch, and probably Monotone.

Aaron
-
Hi,
But I _do_ work with it! I just don't need to "checkout" it! Example:
git -p cat-file -p todo:TODO
You'd just use alternates for that.
But as Linus mentioned in another email, you mostly can use the _same_
working directory. If you want to work on another branch, which is not all
that different from the current branch (say, you have a bug fix branch on
top of an upstream branch), you just _switch_ to it. Git recognizes those
files which are changed, and updates only these. Therefore, if you have
something like a Makefile system to build the project, you actually save
(compile) time as compared to the multiple-checkout scenario.

I use this system a lot, since I maintain a few bugfixes for a few
projects until the bugfixes are applied upstream. BTW the
multiple-branches-in-one-working-directory workflow was propagated by Jeff
a long time ago, and it really changed my way of working. Thanks, Jeff!

Ciao,
Dscho
-
Ok, if there ever was an example of a strange git command-line, that was
Well, you can just add

[alias]
cat=-p cat-file -p

to your ~/.gitconfig file, and you're there.
[ For all the non-git people here: the first "-p" is shorthand for
"--paginate", and means that git will automatically start a pager for
the output. The second "-p" is shorthand for "pretty" (there's no
long-format command line switch for it, though), and means that git
cat-file will show the result in a human-readable way, regardless of
whether it's just a text-file, or a git directory ]

So then you can do just
git cat todo:TODO
and you're done.
[ So for the non-git people, what that will actually _do_ is to show the
TODO file in the "todo" branch - regardless of whether it is checked out
or not, and start a pager for you. ]

I actually do this sometimes, but I've never done it for branches (and I
do it seldom enough that I haven't added the alias). I do it for things
like

git cat v2.6.16:Makefile

to see what a file looked like in a certain tagged release.
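
As a rough illustration of what such an alias amounts to, here is a small Python sketch of alias expansion. It is only a model under stated assumptions: the function expand_alias and the ALIASES table are hypothetical, and git's own alias handling is not implemented this way in detail. The sketch just shows the substitution of the alias text for the first word of the command line.

```python
import shlex

# Hypothetical alias table mirroring the [alias] entry discussed above.
ALIASES = {"cat": "-p cat-file -p"}

def expand_alias(argv, aliases=ALIASES):
    """Replace a leading alias name by its definition and prepend 'git'."""
    argv = list(argv)
    if argv and argv[0] in aliases:
        argv = shlex.split(aliases[argv[0]]) + argv[1:]
    return ["git"] + argv

if __name__ == "__main__":
    # Shows the full command line that "git cat todo:TODO" stands for.
    print(" ".join(expand_alias(["cat", "todo:TODO"])))
```

Words that are not in the alias table pass through unchanged, so ordinary commands are unaffected.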
People sometimes find the git command line confusing, but I have to say,
the thing is _damn_ expressive. I've never seen anybody else do things
like the above that git does really naturally, with not that much
confusion really.

Even that "alias" file is quite readable, although I'd suggest writing out
the switches in full, ie

[alias]
cat=--paginate cat-file -p

instead. That kind of helps explain what the alias does and avoids the
question of why there are two "-p" switches.

Linus
-
_WONDERFUL_. Really :)
--
Christian
-
This very useful syntax (<ent>:<path>) didn't get documented
"officially" anywhere. It was actually documented in commit log
v1.4.1^0~255^2. Maybe someone should copy and paste it to git
documentation? Maybe core-tutorial.txt or git-rev-parse.txt, is there
any better place?
--
Duy
-
Hi,
Ha! I have that for a long time! Although I named it "s", since "git s
todo:TODO" is two letters shorter...

Ciao,
Dscho

P.S.: BTW a certain person complained about ~/.gitconfig not being
documented, but evidently the itch was not big enough for that person to
document it himself...
-
Well, all refs (branches and tags) are named by [relative] path. So for
example we can have 'master', 'next', 'jc/diff' branches, 'v1.4.0' and
'examples/tag' tags. Cogito for example uses <repository URL>#<branch>

Well, <ref>~<n> means <n>-th _parent_ of a given ref, which for branches
(which constantly change) is a moving target.

There was a proposal to add some kind of serial number to git (like
Subversion revision numbers) and even a solution for how to do this...
but one must realize that any serial number must be _local_ to the
repository. One cannot have universally valid revision numbers (even
only per branch) in distributed development. Subversion can do that only
because it is a centralized SCM. Global numbering and a distributed nature
don't mix... hence content-based sha1 as commit identifiers.

But this doesn't matter much, because you can have really lightweight
tags in git (especially now with packed refs support). So you can have

Branches are persistent, have a _separate_ (!) namespace (are not
incorporated in repository URL according to some kind of convention
like in Subversion), can be worked on independently, and you can easily
switch between branches in one working directory. Branches are cheap
in git (notion of topic branches).

I wonder if any SCM other than git has an easy way to "rebase" a branch,
i.e. cut a branch at its branching point, and transplant it to the tip
of another branch. For example you work on the 'xx/topic' topic branch,
and want to have the changes in that branch applied to current work,
not to the version from some time ago when you started working on
said feature.

What your comparison matrix lacks for example is whether a given SCM
saves information about branching points and merges, so you can
get where two branches diverged, and when one branch was merged into

Actually it is better to work with a clone of the repository, perhaps either
symlinking object database, or by alternates mechanism (with alternates
repositories would share old h...
Precisely how does this rebase operate in git?
Does it preserve revision ids for the existing work, or do they all
change?

bzr has a graft plugin which walks one branch applying all its changes
to another preserving the user's metadata but changing the uuids for
revisions.

-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.
On Tue, 17 Oct 2006 19:37:45 +1000
git rebase does exactly the same as you describe, including changing
the sha1 for each commit it moves.

Sean
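
Why the ids have to change can be shown with a short Python sketch. It assumes a toy commit format in which the id is a SHA-1 over the parent id plus the change; the functions commit_id, build_chain and rebase are hypothetical and much simpler than real git objects. Replaying the same changes on a different base gives every replayed commit a different parent, and therefore a different id.

```python
import hashlib

def commit_id(parent_id, change):
    """Toy commit id: hash of the parent id and the change text."""
    payload = f"parent {parent_id}\n\n{change}".encode("utf-8")
    return hashlib.sha1(payload).hexdigest()

def build_chain(base_id, changes):
    """Commit the changes one after another on top of base_id."""
    ids, parent = [], base_id
    for change in changes:
        parent = commit_id(parent, change)
        ids.append(parent)
    return ids

def rebase(changes, new_base_id):
    """Replay the same changes, in order, on top of new_base_id."""
    return build_chain(new_base_id, changes)

if __name__ == "__main__":
    changes = ["A", "B", "C"]
    topic = build_chain("E", changes)      # topic forked from E
    rebased = rebase(changes, "G")         # same changes replayed on G
    for old, new in zip(topic, rebased):
        print(old[:8], "->", new[:8])
```

The changes and their order are preserved; only the identities differ, which is the sense in which rebase rewrites history.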
-
Ah. Bazaar uses negative numbers to refer to <n>th parents, and
positive numbers to refer to the number of commits that have been made

Sure. Our UI approach is that unique identifiers can usefully be
abstracted away with a combination of URL + number, in the vast majority

The nice thing about revision numbers is that they're implicit-- no one
If I understand correctly, in Bazaar, you'd just merge the current work
I'm not sure what you mean about divergence. For example, Bazaar
records the complete ancestry of each branch, and determining the point
of divergence is as simple as finding the last common ancestor. But are
you considering only the initial divergence? Or if the branches merge
and then diverge again, would you consider that the point of divergence?

merge-point tracking is a prerequisite for Smart Merge, which does
I'm not sure what you mean by API, unless you mean the commandline. If
that's what you mean, surely all unix commands are extensible in that
regard.

Aaron
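
The "last common ancestor" idea mentioned above can be sketched in a few lines of Python. This is a toy model with hypothetical names (PARENTS, ancestors, last_common_ancestors), not the algorithm of either tool. It also illustrates the question raised about repeated divergence: once the branches have merged, the most recent common ancestor moves forward to the merged revision.

```python
# Toy history: commit -> list of parents.
PARENTS = {
    "D": [], "E": ["D"], "F": ["E"], "G": ["F"],   # master
    "A": ["E"], "B": ["A"], "C": ["B"],            # topic, forked at E
    "M": ["G", "C"],                               # master merges topic
    "G2": ["M"],                                   # later work on master
    "C2": ["C"],                                   # later work on topic
}

def ancestors(commit, parents=PARENTS):
    """Return the set holding commit and everything reachable from it."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen

def last_common_ancestors(a, b, parents=PARENTS):
    """Common ancestors that are not ancestors of another common ancestor."""
    common = ancestors(a, parents) & ancestors(b, parents)
    return {c for c in common
            if not any(c != d and c in ancestors(d, parents) for d in common)}

if __name__ == "__main__":
    print(last_common_ancestors("G", "C"))    # before the merge
    print(last_common_ancestors("G2", "C2"))  # after merging and diverging again
```

The function returns a set because criss-cross merges can leave more than one candidate.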
-
How does that work with branching points, and with merges? For example
in the case depicted below, how do you refer to the commit marked by X?

        ---- time --->

--*--*--*--*--*--*--*--*--*-- <branch>
      \            /
       \-*--X--*--/

The branch it used to be on is gone...
Besides, in git commit object has pointers (in the form of sha1 ids)
to all its parents. So <ref>^ (parent of <ref>), or <ref>^<m> (m-th
parent of <ref>), or <ref>~<n> (n-th parent in 1st-parent lineage
of <ref>) are natural, and fast. <ref>+<n> (which would add yet another
character as forbidden in branch name) would need either serial number
(per repository or per branch) to commit id database, or getting full
history and looking it up in full history.

Branches in git are remembered not by their starting points, but by
Git could do that too, by having file (files) with serial number
or branch/tag+serial number to commit id mapping. But this would
have to be local matter. And this would take some disk space, and
would seriously affect fetch performance (now git just downloads
what it doesn't have and dumps it into repository database).

BTW. what if repository is moved from one URL to another, for example
moving to different host? All "abstracted away" identifiers get

Two words: post-commit hook. You can automate the action of adding tags
(especially now with packed refs, which means that we can have a huge number

That is the alternate solution, but this would mean that merge would be
recorded (unless you squash it). And for published branches (like 'next'
for example) it is a better solution, because rebase is in fact rewriting
history.

But rebase means that you had

      A---B---C topic
     /
D---E---F---G master

Rebasing 'topic' branch on top of master would mean that you would get

              A'--B'--C' topic
             /
D---E---F---...
Yup. The new command will also automagically appear in the "git help -a"
output. Those two functions have been available since the C wrapper was
born, although "git help -a" was the only available output for "command
not found" until someone introduced the more newbie-friendly list that
pops up nowadays.

--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
-
In bzr 0.12 this is:

2.1.2

(assuming the first * is numbered '1'.)
These numbers are fairly stable, in particular everything's number in
the mainline will be the same number in all the branches created from it
at that point in time, but a branch that initially creates a revision or
obtains it before the mainline will have a different number until they
synchronise with the mainline via pull.

-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.
And here, by "fairly stable", you really mean "totally idiotic", don't
you?

Guys, let's be blunt here, and just say you're wrong. The fact is, I've
used a system that uses the same naming bzr does, and I've used it likely
longer and with a bigger project than anybody has likely _ever_ used bzr
for.

It sounds like bzr is doing _exactly_ what bitkeeper did.
Those "simple" numbers are totally idiotic. And when I say "totally
idiotic", please go back up a few sentences, and read those again. I know
what I'm talking about. I know probably better than anybody in the bzr
camp.

Those "simple" numbers are anything but. They may be short, most of the
time, but when you bandy things like "-r 56" around, what you're ignoring
is that for a _real_ project you actually get numbers like "1.517.3.57",
which isn't really any simpler or shorter than saying "7786ce19". You
still want to cut-and-paste it.

And the "simple" numbers have a real downside, which is that THEY CHANGE.
What happens is that somebody else started _another_ branch at revision 2,
and did important work, and they also had a "2.1.2" revision, and then
they merged your work, and you merged their merge back, that "simple"
revision number changed, didn't it? Suddenly "2.1.2" means something
different for one of the users.

We had people in the bitkeeper world that _never_ actually understood that
the numbers changed. The "simple" numbers were stable enough that a lot of
people thought they were real revisions, and then they were really
_really_ confused when a number like "1.517.3.57" suddenly went away after
a merge, and became something else instead.

And yes, bitkeeper had a "real key" internally too. If you actually wanted
to give a real revision, you had to give something that looked a lot like
what the bzr internal revision numbers look like.

Of course, most users didn't even _know_ or understand those revision
numbers, so as a result, you had tons of people who used the "simple"
...
Be as blunt as you want. You're expressing an opinion, and that's fine. I
happen to think that we're right: users appear to really appreciate
this bit of the UI, and I've not yet seen any evidence of confusion
about it - though I will admit there is the possibility of that
occurring.

I think it's completely ok that git and bzr have made different choices
in this regard, but I *don't* think our choice is in any regard 'totally
idiotic'.

[snip examples that are clearly predicated on how bk worked, not on how
bzr works].

-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.
On Wed, 18 Oct 2006 08:27:58 +1000
Yeah, but it's an opinion that is based on a huge real world project with
hundreds of developers. If Bazaar is ever used in a project of that
size it may just see the same type of issues as Bk. As has been mentioned
elsewhere, Git users really appreciate the short forms it provides for
referencing commits, so much so that there is no reason to invent a
new (unstable) numbering system or attempt to hide the true underlying
commit identities.

Just out of curiosity is there a Bazaar repo of the Linux kernel available
somewhere?

Sean
-
So basically anyone can pull/push from/to each other but only so long as
they decide upon a common master that handles synchronizing of the
number part of the url+number revision short-hands?

One thing that's been nagging me is how you actually find out the
url+number where the desired revision exists. That is, after you've
synced with master, or merged the mothership's master-branch into one of
your experimental branches where you've done some work that went before
mothership's master's current tip, do you have to have access to the
mothership's repo (as in, do you have to be online) to find out the
number part of url+number shorthand, or can you determine it solely from
what you have on your laptop?

--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
-
Anyone can push and pull from each other - full stop. Whenever they
'pull' in bzr terms, they get fast-forward happening (if I understand
the git fast-forward behaviour correctly). After a fast-forward, the
dotted decimal revision numbers in the two branches are identical - and
they remain immutable until another fast forward occurs. Push always
fast forwards, so the public copy of one's own repository that others
pull or merge from is identical to your own. In a 'collection of
branches with no mainline' scenario, people usually have fast forward
occur from time to time, keeping the numbers consistent from the point

You can determine it locally - if you know any of the mothership's
revisions locally, we can generate the dotted-revnos that the
mothership's master-branch would have from the local data - and the last
merge of mothership you did will have given you those details. I don't
think we have a ui command to spit this out just yet, but it will be
trivial to whip one up.

More commonly though, like git users have 'origin' and 'master'
branches, bzr users tend to have a branch that is the 'origin' (for bzr
itself this is usually called bzr.dev), as well as N other branches for
their own work, which is probably why we haven't seen the need to have a
ui command to spit out the revnos for an arbitrary branch.

-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.
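
The fast-forward condition that this exchange keeps returning to can be sketched as follows. This is a minimal Python model with hypothetical names (PARENTS, is_ancestor, try_push), not code from git or bzr: an update is a fast-forward when the old tip is an ancestor of the new tip, so accepting it never discards anything the receiving side already had.

```python
# Toy history: commit -> list of parents.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],   # local work built on top of B
    "X": ["A"],   # work that never saw B
}

def is_ancestor(old, new, parents=PARENTS):
    """True if old can be reached from new by following parent links."""
    seen, stack = set(), [new]
    while stack:
        c = stack.pop()
        if c == old:
            return True
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return False

def try_push(remote_tip, local_tip, parents=PARENTS):
    """Return the new remote tip, or raise if this is not a fast-forward."""
    if not is_ancestor(remote_tip, local_tip, parents):
        raise ValueError("not a fast-forward: fetch and merge (or rebase) first")
    return local_tip

if __name__ == "__main__":
    print(try_push("B", "C"))     # C contains B, so the push is accepted
    try:
        try_push("B", "X")        # X does not contain B
    except ValueError as err:
        print("rejected:", err)
```

A rejected push is the case where someone else updated the remote in the window between your last fetch and your push.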
You mis-understand.
git doesn't have a "ui command to spit out the revnos for an arbitrary
branch" either.

Normally, you'd just use the branch-name. Nobody ever uses the SHA1's
directly.

What git does (and does very well) is to be _scriptable_. It was designed
that way. I'm a UNIX guy. I think piping is very powerful. And when you
script things, your scripts pass SHA1's around internally.

So for example, to repack a git archive, you'd normally do

git repack -a -d

and you don't have any "UI" with SHA1 numbers. But internally, this used
to be

git-rev-list --all --objects |
git-pack-objects

where "git-rev-list" is the one that lists all object names (which are the
SHA1 numbers), and "git-pack-objects" is the one that takes a list of
objects and packs them.

(These days, since our internal C libraries have become so much better,
the object traversal is done internally to packing, so we don't actually
use the pipe any more for repacking an archive, but that's just an
implementation detail)

You seem to think that we use SHA1 names as _humans_. We don't. The SHA1
names are used internally, and humans just use the branch names.

The only case you'd (as a human) use the SHA1 name is when you want to
pass it on to another person that may have a different archive (ie you
mail somebody a revision that is problematic). It would obviously be
totally unworkable to say "it's the grand-parent of my current HEAD
commit", since that's a local description. So instead, you'd say "it's
commit 9550e59c4587f637d9aa34689e32eea460e6f50c".

So I think people (totally incorrectly) think that git users use a lot of
SHA1 names, just because they see the git users on the kernel mailing list
sending each others SHA1 names. But that's because you see only the case
where you _want_ to communicate a stable revision name to another side.
Sending a number like 1.57.8.312 to describe what commit broke would be a
_bug_, because a person who has a differently shaped...
With the exception of having sometimes commit-ids in the commit messages,
for example "Fixes bug introduced by aabbcc00" (although usually you just
write "Fixes bug in some_function in some_file"), and automatically
generated

This reverts d119e3de13ea1493107bd57381d0ce9c9dd90976 commit.

(in addition to 'Revert "<Commit title>"') for git-revert generated
commit messages.

And it is true that you usually use branchname, or branchname~n syntax.
Git even has git-name-rev to convert from sha1 to temporary, local
ref^m~n... syntax.

By the way, git has a very powerful syntax to get revisions, and
revision lists. For example "git-rev-list foo bar ^baz" means
"list all the commits which are included in foo and bar lineage,
but not in baz", or more useful "git log origin..next".

How's that in bzr?
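
The set semantics behind "foo bar ^baz" and "origin..next" can be written down in a few lines. The Python sketch below works on a toy history with hypothetical names (PARENTS, reachable, rev_list, dotdot); it models only which commits are selected, not the order in which git-rev-list prints them.

```python
# Toy history: commit -> list of parents.
PARENTS = {
    "r": [],
    "o1": ["r"], "o2": ["o1"],    # 'origin' ends at o2
    "n1": ["o1"], "n2": ["n1"],   # 'next' forked from o1, ends at n2
    "f": ["r"],                   # an unrelated side branch
}

def reachable(tips, parents=PARENTS):
    """All commits reachable from any of the given tips, tips included."""
    seen, stack = set(), list(tips)
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen

def rev_list(include, exclude, parents=PARENTS):
    """Commits reachable from 'include' but from none of 'exclude'."""
    return reachable(include, parents) - reachable(exclude, parents)

def dotdot(a, b, parents=PARENTS):
    """Toy version of 'a..b': reachable from b but not from a."""
    return rev_list([b], [a], parents)

if __name__ == "__main__":
    print(sorted(dotdot("o2", "n2")))            # like "origin..next"
    print(sorted(rev_list(["n2", "f"], ["o2"])))  # like "next f ^origin"
```

Both forms reduce to one set difference, which is why they compose so easily on the command line.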
--
Jakub Narebski
Poland
-
Yes. But in both cases, that's usually because you literally ended up
having the commit name because somebody else (which _can_ be you) searched
for it (with something like "bisect") and gave it to you.

So even that case is really about communicating a stable name from one
place (the "find the bug") to another (the "revert the buggy commit").

So yes, _communication_ should always happen by full SHA1's, because those
are the only thing that always remain stable.

(The fact that "gitk" and I think "gitweb" can then turn them into
hyperlinks in the commit message is obviously one reason we then tend to
give them such prominent visibility - they actually end up being very
useful later on).

In bzr, either you don't get the hyperlinks, or you need to use the
non-simple name in the commit messages, since the simple names don't
actually work. Either way, it's an inferior setup.

Linus
-
This is where it breaks down for me. "until another fast forward occurs"
To me, this means bazaar isn't distributed at all and I could achieve
much the same distributedness(?) by rsyncing an SVN repo, working
against that and then rsyncing it back with some fancy merging. In other
words, bazaar requires there to be one Lord of the Code, or some of the
key features break down.

--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
-
Dear diary, on Wed, Oct 18, 2006 at 10:53:16AM CEST, I got a letter
Well as far as I understand, the Lord of the Code is whoever you pulled
from the last time.

It's just a different focus here. If I understood everything in this
thread correctly, both Git and Bazaar have persistent (SHA1, UUID) and
volatile (revspec, revision number) revision ids. The only difference is
that Git primarily presents the user with the SHA1 ids while Bazaar
primarily presents the user with a revision number (and that revspecs
change after every commit while revision numbers change only after a
merge).

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
-
I can't say for bzr versions > 0.12, which do not exist ;-)
For previous versions, it didn't have that "simple" number, and you
had to use the rev-id.

--
Matthieu
-
Dnia wtorek 17. pa
What do you do once a branch has been thrown away, or has had 20 other
branches merged into it? Does the offset-number change for the revision

merge != rebase though, although they are indeed similar. Let's take the
example of a 'master' branch and topic branch topicA. If you rebase
topicA onto 'master', development will appear to have been serial. If
you instead merge them, it will either register as a real merge or, if
the branch tip of 'master' is the branch start-point of topicA, it will
result in a "fast-forward" where 'master' is just updated to the

I'm fairly certain he's talking about the API in the sense it's being
talked about in every other application. Extensive work has been made to
libify a lot of the git code, which means that most git commands are
made up of less than 400 lines of C code, where roughly 80% of the code
is command-specific (i.e., argument parsing and presentation).

--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
-
We always track the number of parents since the initial commit in the
Ah, now I see what you mean, and the "graft" plugin mentioned by others
Ah, okay.
So it sounds to me like git is extensible, though not as thoroughly as bzr.
Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.orgiD8DBQFFNTat0F+nu1YWqI0RAn9aAJ9WzMrM72be+3SlwCpvJXQ/X2Y3nQCfeYk3
NTIJuZSze9URUaAsiO4Hu5o=
=9nvr
-----END PGP SIGNATURE-----
-
While this I think is quite reliable (there was an idea to store a "generation
number" with each commit, e.g. using the not implemented "note" header, or
a commit-id to generation number "database", as a better heuristic than
timestamp for revision ordering in git-rev-list output), and probably
independent of the repository (it is a global property of commit history,
and commit history is included in sha1 of its parents), numbering branching

Very useful as a kind of poor-man's-Quilt (or StGit). You develop some
feature step by step, commit by commit, in your repository, cooking it
in a topic branch. Then, before sending it to the mailing list or maintainer
as a series of patches (using git-format-patch and git-send-email),
you rebase it on top of current work (current state), to ensure that

Fast-forward is a really good idea. Perhaps you could implement it,

I think having a good API for C, shell and Perl (and to a lesser extent for any
scripting language) means that it is more extensible. Git is not yet
libified; once it is, we could think about bindings for other
programming languages (there is a preliminary Java binding/interface).
--
Jakub Narebski
Poland
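
The "generation number" idea mentioned above can be sketched roughly as follows. This is only an illustration under an assumed definition (a root commit has generation 1, any other commit has one more than the largest generation among its parents); the commit names and the PARENTS table are made up, and the thread itself says git does not store such a number.

```python
from functools import lru_cache

# Hypothetical history: each commit id maps to the list of its parents.
PARENTS = {
    "a": [],
    "b": ["a"],
    "c": ["b"],
    "d": ["b"],
    "e": ["d", "c"],  # merge of d and c
}


@lru_cache(maxsize=None)
def generation(commit):
    """Return 1 for a root commit, else 1 + the largest parent generation."""
    parents = PARENTS[commit]
    if not parents:
        return 1
    return 1 + max(generation(p) for p in parents)


def ordered(commits):
    """Sort commits so that every commit comes after all of its ancestors."""
    # An ancestor always has a strictly smaller generation than its
    # descendants, so sorting by generation never needs a timestamp.
    return sorted(commits, key=lambda c: (generation(c), c))


if __name__ == "__main__":
    for name in ordered(PARENTS):
        print(name, generation(name))
```

Because the number depends only on the shape of the history, two repositories holding the same commit would compute the same value for it, which is the "global property" argued for above.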
-
Dnia wtorek 17. pa
We support it as 'pull', but merge doesn't do it automatically, because
we'd rather have merge behave the same all the time, and because 'pull'

I guess it's a value judgement on which is more important to extensibility:
Git has more language support.
Bzr has plugin autoloading, Protocol plugins, Repository format plugins,
and more. Because Python supports monkey-patching, a plugin can change
absolutely anything.

Aaron
-
Excuse me? What does that "throws away your local commit ordering" mean?
A fast-forward does no such thing. It leaves the local commit ordering
alone, it just appends other things on top of it. It's the only sane thing
you can do, since the work you merged was already based on your top
commit.

So generating an extra "merge" commit would be actively wrong, and adds
"history" that is not history at all.

It also means that if people merge back and forth from each other, you get
into an endless loop of useless merge commits. What's the point? They only
clutter up the history, and they mean that you can never agree on a common
state.

There's no reason _ever_ to not just fast-forward if one repository is a
strict superset of the other.

You must be doing something wrong. Is it just that people want to pee in
the snow and leave their mark?

Linus
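
The fast-forward argument can be made concrete with a toy model. The sketch below is not how git or bzr is implemented; the graph representation, the function names and the commit ids are all assumptions made for illustration. It shows that when fast-forwarding is allowed two repositories converge on the same tip, while always recording a merge commit keeps producing new commits on every exchange.

```python
def ancestors(parents, tip):
    """Return the set containing tip and every commit reachable from it."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def merge(parents, ours, theirs, new_id, allow_fast_forward=True):
    """Merge 'theirs' into 'ours' and return the resulting branch tip."""
    if allow_fast_forward and theirs in ancestors(parents, ours):
        return ours  # already up to date, nothing to record
    if allow_fast_forward and ours in ancestors(parents, theirs):
        return theirs  # fast-forward: just move the tip
    parents[new_id] = [ours, theirs]  # record a merge commit
    return new_id


def sync_rounds(allow_fast_forward, rounds=3):
    """Let repositories A and B merge from each other repeatedly."""
    parents = {"c": [], "d": ["c"]}
    tip_a, tip_b = "c", "d"  # B has one commit that A lacks
    for i in range(rounds):
        tip_a = merge(parents, tip_a, tip_b, f"mA{i}", allow_fast_forward)
        tip_b = merge(parents, tip_b, tip_a, f"mB{i}", allow_fast_forward)
    return tip_a, tip_b, len(parents)


if __name__ == "__main__":
    print("with fast-forward:   ", sync_rounds(True))
    print("always merge commit: ", sync_rounds(False))
```

In the first run both tips end up on the same commit and no new commits appear; in the second run each side keeps adding a merge commit on top of the other's, so the two tips never agree.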
-
Say this is the ordering in branch A:
a
|
b
|
c

Say this is the ordering in branch B:
a
|
b
|\
d c
|/
e

When A pulls B, it gets the same ordering as B has. If B did not have e
It's not a tree change, but it records the fact that one branch merged
You can pull if you don't want that. We haven't found that people are
Maybe not in Git.
Aaron
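
The two orderings above can be modelled to see what happens to the simple revision numbers. This is a minimal sketch that assumes numbers are assigned by walking the leftmost (first) parent chain, as described elsewhere in this thread; the function name and the dictionaries are illustrative, not bzr's API.

```python
def mainline_revnos(parents, tip):
    """Number the commits on the first-parent ("leftmost") chain from 1."""
    chain = []
    commit = tip
    while commit is not None:
        chain.append(commit)
        first = parents[commit][:1]
        commit = first[0] if first else None
    chain.reverse()
    return {commit: number for number, commit in enumerate(chain, start=1)}


# Branch A: a - b - c
branch_a = {"a": [], "b": ["a"], "c": ["b"]}
# Branch B: a - b - d - e, where e merges c and its leftmost parent is d
branch_b = {"a": [], "b": ["a"], "c": ["b"], "d": ["b"], "e": ["d", "c"]}

if __name__ == "__main__":
    print("A before pull:", mainline_revnos(branch_a, "c"))
    print("A after pull: ", mainline_revnos(branch_b, "e"))
```

Under this assumption, the number that named c in branch A names d once A has pulled B, and c no longer sits on the numbered mainline at all.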
-
Aaron Bentley wrote:
> You can pull if you don't want that.
For non-git people (and maybe even git people who didn't follow some of
the "reflog" work):

- git does actually have "local view" support, but it is very much
  _defined_ to be local. It does not pollute any history as seen by
  anybody else. It's called "reflog" (where "ref" is just the git name
  for any reference into a tree, and the "log" part is hopefully obvious)

So each git repository can have (if you enable it) a full log of all the
changes to each branch. But it's not in the core git datastructures that
get replicated - because the local view of how the branches have changed
really _is_ just a local view. It's just a local log to each repository
(actually, one per branch).

It's what allows a git person to say

    git diff "master@{5.hours.ago}"

because while "5 hours ago" is _not_ well-defined in a distributed
environment (five hours ago for _whom_?) it's perfectly well-defined in a
purely _local_ sense of one particular branch.

So there's no need for a fakey "merge" that isn't a real merge and that
doesn't make sense for anybody else because it doesn't actually add any
real knowledge about the _history_ of the tree (only about a single
repository). If you want to see how the history of a particular repository
has evolved, you can just look at the reflog (although admittedly, common
tools like "gitk" don't even show it - the data is there if they would
want to, but the most common usage is the above kind of "show me what
happened in the last five hours in my current branch").

Linus
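
As a rough illustration of why "5 hours ago" is well-defined locally, the sketch below models a per-branch log of tip updates and looks up where the branch pointed at a given local time. The Reflog class, its methods and the commit ids are hypothetical; this is not git's storage format or code.

```python
import bisect
from datetime import datetime, timedelta


class Reflog:
    """A purely local, per-branch log of where the branch tip has pointed."""

    def __init__(self):
        self._times = []  # timestamps of tip updates, in increasing order
        self._tips = []   # commit id the branch pointed to from that time on

    def record(self, when, commit_id):
        """Append an update; entries are assumed to arrive in time order."""
        self._times.append(when)
        self._tips.append(commit_id)

    def tip_at(self, when):
        """Return the commit the branch pointed to at the given local time."""
        index = bisect.bisect_right(self._times, when) - 1
        if index < 0:
            raise LookupError("the log does not go back that far")
        return self._tips[index]


if __name__ == "__main__":
    now = datetime(2006, 10, 18, 17, 0)
    master = Reflog()
    master.record(now - timedelta(hours=8), "1111111")
    master.record(now - timedelta(hours=6), "2222222")
    master.record(now - timedelta(hours=1), "3333333")
    # Where did this repository's 'master' point five hours ago?
    print(master.tip_at(now - timedelta(hours=5)))
```

Nothing in this log is part of the commit graph, so it never has to be replicated or agreed upon by anybody else.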
-
Sure. But that doesn't throw away any local commit ordering. The original
order (a->b->c) is still very much there. The fact that there was a branch
off 'b' and there is also (a->b->d) and a merge of the two at 'e' doesn't

But that's a totally specious "record". It has no meaning in a distributed
SCM. There is absolutely zero semantic information in it.

The fact that you _locally_ want to remember where you were is a total
non-issue for a true distributed system. You shouldn't force everybody
else to see your local view - since it has no relevance to them, and

I don't think there is any in bzr either. Can you explain?

In other words, the empty merge is totally semantically empty even in the
bazaar world. Why does it exist?

Linus
-
After the pull, it's no longer the mainline ordering for the branch. c
is represented as a revision that was merged into the branch, while d is

It means the order that revisions are shown in log commands changes,
It records the committer, the date, the commit message, the parent
It exists because it is useful. Because it makes the behavior of bzr
merge uniform. Because in some workflows, commits show that a person
has signed off on a change.

It's not something special-- it's just another commit, like regular
commits, and merge commits. It would be harder to forbid than it is to
permit.

Aaron
-
In the Git world that happens via "git tag -s", i.e., a
cryptographically strong "signoff".
(There's also the secondary convention of appending Signed-off-by: to
email-applied patches, but that's something that would translate
effectively to any other system, since it's outside the SCM.)
-
Well, that is another example while generation number is/can be global,
...but that means that revision numbers are totally, absolutely useless.
Unless by some miracle of engineering, or adding namespace, they can be

All totally empty information. What should the commit message be? "I have
fetched changes from remote repository"? You can remove one of the parents
(the one pointing to the state before the fast-forward "merge") without
changing reachability.

---------
/ \

But if you record a "fast-forward merge", you force all people pulling
from your repository to have this purely local and without any significant

Signing off the fact of fetching changes? For a true merge you are signing
off the fact that there were no conflicts, or you sign off your conflict

Actually the check is very easy. And you have to do a similar check when
fetching/pushing to ensure that you don't clobber your changes.
--
Jakub Narebski
Poland
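
The reachability remark above can be checked on a small made-up history. In this sketch (commit names and helper functions are assumptions for illustration only) a recorded "fast-forward merge" has two parents, one of which is already an ancestor of the other; dropping that parent leaves the set of reachable commits unchanged.

```python
def reachable(parents, tip):
    """Return every commit reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def redundant_parents(parents, commit):
    """List parents of commit that are already ancestors of another parent."""
    result = []
    for candidate in parents[commit]:
        others = [p for p in parents[commit] if p != candidate]
        if any(candidate in reachable(parents, other) for other in others):
            result.append(candidate)
    return result


# 'local' was the tip before the fetch; 'remote' already contains it.
history = {
    "base": [],
    "local": ["base"],
    "remote": ["local"],
    "ff_merge": ["local", "remote"],  # the recorded "fast-forward merge"
}

if __name__ == "__main__":
    print(redundant_parents(history, "ff_merge"))
    stripped = dict(history, ff_merge=["remote"])
    print(reachable(history, "ff_merge") == reachable(stripped, "ff_merge"))
```

The same ancestor test is what decides whether a fast-forward is possible in the first place, which is why the check is described as easy.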
-
No. The numbering always follows the leftmost parent. So each revision
No, because no one pulls unless they're trying to maintain a mirror of
Even if I agreed that the revision was meaningless, the cost of such a
You sign off on the contents of the revision you fetched. You say "I
Agreed. It's just that not checking is easier still.
Aaron
-
Right. You have to do it your way, because of the "simple revision
numbers".

Which gets us back to where we started: "simple" is in the eye of the
beholder. I personally think that git revision naming is a lot simpler,
exactly because it doesn't impose arbitrary rules on users.

For example, what happens is that:
- you like the simple revision numbers
- that in turn means that you can never allow a mainline-merge to be done
  by anybody else than the main maintainer
- that in turn means that the whole situation is no longer distributed,
  it's more like a "disconnected access to a central repository"

The "main trunk matters" mentality (which has deep roots in CVS - don't
get me wrong, I don't think you're the first one to do this) is
fundamentally antithetical to a truly distributed system, because it
basically assumes that some maintainer is "more important" than others.

That special maintainer is the maintainer whose merge-trunk is followed,
and whose revision numbers don't change when they are merged back.

That may even be _true_ in many cases. But please do realize that it's a
real issue, and that it has real impact - it does two things:

- it impacts the technology and workflow directly itself ("pull" and
  "merge" are different: a central maintainer would tend to do a "merge",
  and one more in the outskirts would tend to do more of a "pull",
  expecting his work to then be merged back to the "trunk" at some later
  point)

- it will result in _psychological_ damage, in the sense that there's
  always one group that is the "trunk" group, and while you can pass the
  baton around (like the perl people do), it's always clear who sits
  centrally.

Maybe this is fine. It's certainly how most projects tend to work.
I'll just point out that one of my design goals for git was to make every
single repository 100% equal. That means that there MUST NOT be a "trunk",
or a special line of development. There is no "vendor branch". It's...
That's not true of bzr development. The "main maintainer" that runs the
bzr.dev is an email bot. It's not an integrator-- its work is purely
mechanical. It can't resolve merge conflicts.

Most of the merge work is done in integration branches run by the core
developers. Although Martin is our project leader, lays out ground
rules, and makes design decisions, he doesn't have to be involved in any

Linus, if you got hit by a bus, it would still be a shock, and it would
still take time for the Linux world to recover. Your insights and
talent, both technical and social, make you the most important kernel
developer. And it stays that way because you deserve it. Projects with
good leadership don't fork, or if they do, the fork withers and dies
pretty quickly.

It is fine to say all branches are equal from a technical perspective.
From a social perspective, it's just not true.

The scale of Bazaar development is much smaller than the scale of kernel
development, so it doesn't make sense to maintain long-term divergent
branches like the mm tree. We do occasionally have long-lived feature

As I mentioned earlier, there are four people who each run their own

I think you're implying that on a technical level, bzr doesn't support
this. But it does. Every published repository has unique identifiers
for every revision on its mainline, and it's exceedingly uncommon for
these to change. There are special procedures to maintain bzr.dev, but
there's nothing technically unique about it. People develop against
bzr.dev rather than my integration branch, because they have
non-technical reasons for wanting their changes to be merged into

On an actively-developed bzr branch, the first parent *is* special:
- it's a revision that you committed
- the diff between a revision and its first parent is the same as the

I don't think your analysis holds together completely, because all
actively-maintained branches have very stable rev...
So? It makes no sense to me to cater only to "successful projects"... most
Yes, but what matters here is the principle... if branches aren't equal, it
makes some things unnecessarily hard (i.e., forking, passing maintainership
over, ...). Sure, they aren't activities that should be actively

"Very rare" != "never". The "very rare" cases /will/ come back to bite you,
once you grow accustomed to "hasn't ever happened"

What makes a "published repository" special, as opposed to my local
Are they different among repositories, even though they came from another
OK.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 2654431
Universidad Tecnica Federico Santa Maria +56 32 2654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 2797513
-
funny. I actually read another post from Linus, and when I
"merge" with your post (understand: bisect), the following
comes out:

- git is the fastest scm around
- git has the smallest scm footprint
- git is also aimed at small(ish) projects

my personal proof of concept on the last point is that I'm an
IC design engineer who threw away other scm in favor of git
since git-1.4.2 and regret now the years wasted on _other_
scm. But your mileage may vary.

--
Christian
-
I'd like to point out that the same thing has happened in bzr-land.
Back in the "pre-bot" days, only Martin put things in "his branch",
where most people got bzr from (same as Linus' git branch), but he was
away for a few weeks and during this time there were 3 (or perhaps 4)
other branches, called integration branches, that were being used.
They were all maintained by different people.

Everyone learned really quickly to use them instead of Martin's
branch. When Martin came back, he just pulled/merged these branches
and everything was back to normal.

I'd say in this case, bzr was even more "without a trunk" than in the
example Linus gives above.

What seems to be one interesting thing in this discussion is that,
because people use bzr and git in slightly different ways, they think
that one or the other cannot be used in another way.

bzr's use of revision numbers doesn't mean it hasn't got unique
revision identifiers, and I can't see any reason why it couldn't be
used in the same way as git. Both are excellent tools, and since git
is more specialized (built to support the exact workflow used in
kernel development), it's more suited for that exact use.

bzr tries to take a broader view, for example, it does support a
centralized workflow if you want one. Most people don't, but a few
might. Because of this, it probably fits kernel development less
well than git. That's fine I think! It happens to fit my workflow
better than git does :)

Regards,
Erik
-
Dear diary, on Thu, Oct 19, 2006 at 09:02:16AM CEST, I got a letter
There is perhaps no "technical" reason, but it's also what the user
interface is designed around - most probably, using UUIDs instead of
revnos would be a lot less convenient for bzr people because you
probably primarily show revnos everywhere and UUIDs only in a few special
places and/or when asked specifically through a command (correct me if
I'm wrong). Also, do you support "UUID autocompletion" so that you can

I think they are in fact just as flexible (+-epsilon). Git can support
a centralized workflow as well - you have some central repository
somewhere and all the developers clone it, then pull from it and push to
it in basically the same way they would use CVS. And it is perhaps
even more used in practice than the "single-man" workflow
nowadays, as more projects are using Git.

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
-
[ trim back CC a bit ]
On Thu, Oct 19, 2006 at 01:37:31PM +0200 I heard the voice of
The primary place you'd see either is in 'log'. To show the UUID,
you'd add a "--show-ids" arg to it (and via per-user config aliasing,
you could just alias 'log' to 'log --show-ids' if you always wanted to
see them, so you wouldn't have to type it). The output looks something
like:

revno: 1
revision-id: fullermd@over-yonder.net-20061019151437-5b99dff6ed1d76cd
committer: Matthew Fuller <fullermd@over-yonder.net>
branch nick: a
timestamp: Thu 2006-10-19 10:14:37 -0500
message:
  Foo

(without --show-ids, it's the same, except not showing the

With the form of bzr UUID's, that's not particularly useful, since
you're probably into the minutes/seconds of the timestamp before it
becomes unique, at which point you're close to 2/3 of the way through
the whole string.

--
Matthew Fuller (MF4839) | fullermd@over-yonder.net
Systems/Network Administrator | http://www.over-yonder.net/~fullermd/
On the Internet, nobody can hear you scream.
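
The point about prefixes of bzr revision ids can be tried out with a small sketch. It assumes, based only on the example shown above, that an id has the form committer-timestamp-suffix; the second id in the list is invented for the comparison, and the helper names are not part of bzr.

```python
from datetime import datetime


def parse_revision_id(revision_id):
    """Split an id of the assumed form <email>-<YYYYMMDDHHMMSS>-<suffix>."""
    # Split from the right, because the email part may itself contain '-'.
    committer, stamp, suffix = revision_id.rsplit("-", 2)
    return committer, datetime.strptime(stamp, "%Y%m%d%H%M%S"), suffix


def shortest_unique_prefix(target, all_ids):
    """Return the shortest prefix of target that no other id starts with."""
    others = [i for i in all_ids if i != target]
    for length in range(1, len(target) + 1):
        prefix = target[:length]
        if not any(other.startswith(prefix) for other in others):
            return prefix
    return target


ids = [
    # Taken from the log output quoted above.
    "fullermd@over-yonder.net-20061019151437-5b99dff6ed1d76cd",
    # Hypothetical second commit by the same person 25 seconds later.
    "fullermd@over-yonder.net-20061019151502-0123456789abcdef",
]

if __name__ == "__main__":
    print(parse_revision_id(ids[0]))
    prefix = shortest_unique_prefix(ids[0], ids)
    print(prefix, len(prefix), "of", len(ids[0]), "characters")
```

For two commits made by the same person within the same hour, the prefix has to run through the committer and date and into the minutes before it stops being ambiguous.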
-
close to 200 posts on the bzr-git war!
is this the right place (git mailing list) to discuss future
features of bzr?

--
Christian
-
Perhaps not, but the tone is friendly (mostly), the patience of the
bazaar people seems infinite and lots of people seem to be having fun
while at the same time learning a thing or two about a different SCM.
Best case scenario, both git and bazaar come out of the discussion as
better tools. If there would never be any cross-pollination, git
wouldn't have half the features it has today.

--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
-
I fully agree with Andreas: I am just a bzr user (not even a bzr
developer) and when looking for a decentralized VCS I also looked at
git and a few others. I think I am learning quite a bit about bzr,
git, and VCS in general.

--
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz
-
I second this.
I'm a bzr user and occasional developer, and I learnt a lot about git
in the discussion. I hope I also could explain well some of the
features of bzr to some git guys, it's always interesting to
understand why other people do things in a different way, or why they
do it in the same way.

--
Matthieu
-
Thanks everyone for taking time to explain details.
However, I don't use SCM for code development. I use it for collaborative
documentation, white boarding and tracking configurations.
In fact in my company no one uses SCM for code development.
Everyone here uses it for collaborative documentation and white boarding.
Only I use SCM for tracking configurations.

I think of SCMs in terms of an SCM core and SCM tools.
First I want to say every SCM I know of sucks when it comes to tracking
configurations, simply because they don't record or restore file metadata,
like perms, ownership, and acl. I don't see recording or restoring
file metadata as part of the SCM core. I do however feel an SCM core needs to
have provisions for extended file inventory information. The problem
with extended file inventory information is that it is fs specific. For this
reason I feel it is essential that the SCM core allow multiple sets of extended
file inventory information. The SCM tools are responsible, based on the local
config, for recording metadata and creating the extended file inventory,
translating file metadata of one file system. When tracking configurations,
octopus merges are surprisingly common. If a configuration change is
not signed off by a responsible person, it cannot be accepted. Doing
otherwise is simply an invitation to attackers and makes troubleshooting
far too difficult. Also, configuration files in one directory will most often
not be members of the same repo. For example, each file in the etc directory
would be a member of a different repo according to its associated
application/pkg.

Some things I would like the SCM tools to handle: personally I would like the
SCM tools to be platform independent. This would ensure that the correct
things happen on ext3 mounted on windows.
I don't think the execute bit belongs in the basic file inventory information.
Instead I would like to see it replaced by a filter in the extended
file inventory indicating what file metadata, if any, should be recorded or
restored.
Wh...
That's not a simple matter.
Tracking ownership hardly makes sense as soon as you have two
developers on the same project. What does it mean to checkout a file
belonging to user foo and group bar on a system not having such user
and group?

Just restoring the complete user/group/other rwx permission is already
a mess. In my experience (GNU Arch did this):

1) It sucks ;-). Me working with umask 022 so that my colleagues can
   "cp -r" from me, working on a project with people having umask 077,
   I got some files not readable, some yes, well, a mess. *I* have set
   my umask, and *I* want my tools to obey.

2) It's a security hole. If you work with people having umask=002 (not
   indecent if your default group contains just you), you end up with
   group-writable files in your ${HOME}.

That said, it can be interesting to have it, but disabled by default.
The 'x' bit, OTOH, is definitely useful.
--
Matthieu
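
The difference between restoring recorded permissions verbatim and tracking only the 'x' bit can be sketched as two small functions. This is an illustration of the argument above, not the behaviour of any particular SCM; the function names are made up.

```python
import stat


def checkout_mode(recorded_mode, umask):
    """Mode for a checked-out file when only the execute bit is tracked.

    The recorded mode contributes nothing except "is it executable?";
    every other permission bit is decided by the local user's umask.
    """
    executable = bool(recorded_mode & stat.S_IXUSR)
    base = 0o777 if executable else 0o666
    return base & ~umask


def restore_verbatim(recorded_mode, umask):
    """Restore the full recorded permissions, ignoring the local umask."""
    return recorded_mode & 0o777


if __name__ == "__main__":
    # A file committed by somebody working with umask 002 ...
    recorded = 0o664
    # ... checked out by somebody working with umask 022.
    print(oct(restore_verbatim(recorded, 0o022)))
    print(oct(checkout_mode(recorded, 0o022)))
```

With verbatim restoring, the committer's umask leaks into the other person's checkout; with execute-bit-only tracking, the local umask is obeyed and scripts still stay executable.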
-
Yes I agree it should be disabled by default. And enabled based on the
local settings.
-
Arch supports that kind of metadata.
I believe SVN supports recording arbitrary file properties, so it's just
Our choices have been predicated on producing the best SCM we can for
the purpose of developing software. We find that the execute bit is
very useful for build scripts and other incidental scripts.

The other attributes didn't seem useful for software development, so
An XML diff/patch or merge will not handle ODF properly. There's too
The bzr "webserve" plugin provides rss feeds.
Aaron
-
Yes, svn has arbitrary properties which can be manipulated.
They are not really intended for permissions, ownership, and acl.
To use the svn properties for this requires adding scm tools.
Also svn does not allow files in the same directory to live in

I have only experimented with xml diffs on odf files.
From my experience xml diffs work fine on svg files.
For more information, please refer to

Yes, multiple merge sources are handy for collaborative document editing
-
Agreed. I think it's okay to require extra work to set the scm up to
It would surprise me if many SCMs that support atomic commit also
That's something I'd like for software development, too.
Aaron
-
In fact I think svk would. You would have to switch them by setting
an environment variable, but it's probably doable. That is because
unlike other version control systems, it does not store the information
about checkout in the checkout, but in the central directory and that
can be set. I don't know git well enough to tell whether git could do
the same by setting GIT_DIR.

--------------------------------------------------------------------------------
- Jan Hudec `Bulb' <bulb@ucw.cz>
-
The point here is that, because of using the bot, the revnos on bzr.dev
are indeed stable (and many of the merges are in fact pointless merges,
i.e. merges of a revision and its ancestor). But if you don't use the
bot, then doing:

    bzr merge mainline
    bzr push mainline

makes your revision the leftmost parent, not the one
from "mainline". The fact that bzr treats the leftmost parent somewhat
specially makes people replace the above with

    bzr branch mainline
    cd mainline
    bzr merge feature-branch
    bzr push

which is, well, more complicated (but you see it's not about main
maintainer -- anybody with write access can push).

--------------------------------------------------------------------------------
- Jan Hudec `Bulb' <bulb@ucw.cz>
-
That's actually a very important insight, but supporting the wrong
conclusion.

In a healthy situation, the only thing that makes a branch special are
social issues, such as you describe. That's how it should be.

But think about your favorite example of an unhealthy social situation
around a software project and a big, nasty fork. Every example I can
think of involves some technical distinction that makes one branch
more special than another.

Now, those situations also involve social problems, and those are even
more significant. But the technical blessing of one branch does not
help. And I think it contributes to the social problems in many cases.

So, I think the technical thing that is distributed version control is
an extremely important thing for us to use to help maintain healthy
social software projects. Reducing the technical hurdle of a fork, (to
where continual forking is actually a totally expected part of the
process), is a very healthy thing.

Now, both bzr and git are distributed systems, and either one will
help a great deal in the respects I'm talking about compared to
something like cvs.

As far as the revision numbers, my impression is that the numbers
would be confusing or worthless if I were to use bzr the way I'm

Which just says to me that the bzr developers really are sticking to a
centralized model. That's fine, but it does have impacts, and the tool

Every argument you make for the number change being uncommon just
strengthens the argument that it will be all that more
confusing/frustrating when the numbers do change.

In cairo, for example, we've made a habit of including a revision
identifier in our bug tracking system for every commit that resolves a
bug. I like having the assurance that those numbers will survive
forever. And it doesn't matter if the repository moves, or the project
is forked, or anything else. Those numbers cannot change.

I understand that bzr also has unique identifiers, but it sounds like
the tools try to hide them, an...
bzr seems to use the classic UUID format, and it's funny how much it looks
like a real BK ChangeSet revision number ("key").

Here's the quoted bzr "true" revision ID:

    Matthieu.Moy@imag.fr-20061017152029-4c5a2861bcf23b7d

and here's a BK "ChangeSet Key":

    adi@zaphod.bitmover.com|ChangeSet|20031031183805|57296

(I don't have BK installed anywhere, so I had to google for changeset
keys, and this was just some random key in the BK bugzilla ;)

Look very similar, don't they? And yes, the true revision ID is stable
over time (at least it was in BK, and I assume it is in bzr too).

The biggest difference seems to be that in bzr, the final checksum is
64-bit, while for BK, it was just a 16-bit checksum/unique number (the
rest is just user-name/machine-name and date: I assume that the bzr commit
was done at 10/17/2006 3:20:29PM, and the example BK ChangeSet was created
10/31/2003 6:38:05PM - it looks like _exactly_ the same date format).

With BK, you can also use a "md5 key", and I don't actually know how they
work. They may just be the md5 hash of the ChangeSet key, I think that may
be how those things are indexed. So in bkcvs, you'll see a line like this:

    BKrev: 42516681VmgTWL0bkLcltPGiI6Yk5Q

which is the BK md5 key for my last kernel revision in BK (2.6.12-rc2).
Again, these numbers are stable, unlike the simple revisions.

Note that from a usability standpoint, the UUID's look more readable to a
human, but are actually much worse than the md5 keys (or the SHA1's that
git uses). At least with a hash, the first few digits are likely to be
unique, so you can do things like auto-completion (or just short names).
With the email+date+random number kind of UUID, you don't have that.

(Pure hashes obviously also tend to just all have the same length, and are
easier to parse automatically, so from a programmatic standpoint they are
a lot easier too - but the surprising thing is how they are actually
easier on humans too, even if the UUID's look more reada...
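
The "short names" property of hash ids can be illustrated with a tiny resolver. The sketch below is an assumption-laden toy: the ids are SHA-1 digests of made-up strings rather than real git objects, and resolve is a hypothetical helper, not a git function.

```python
import hashlib


def resolve(prefix, object_ids):
    """Expand an abbreviated hex id to the single full id it matches."""
    matches = [oid for oid in object_ids if oid.startswith(prefix)]
    if not matches:
        raise KeyError(f"no object matches {prefix!r}")
    if len(matches) > 1:
        raise KeyError(f"{prefix!r} is ambiguous ({len(matches)} matches)")
    return matches[0]


# Stand-in ids: SHA-1 digests of invented strings, all 40 hex digits long.
object_ids = [
    hashlib.sha1(f"commit {n}".encode()).hexdigest() for n in range(1000)
]

if __name__ == "__main__":
    full = object_ids[42]
    short = full[:10]
    print(short, "->", resolve(short, object_ids))
```

Because the digits of a hash are spread evenly, a short leading slice is almost always enough to single out one object, and an ambiguous slice can simply be rejected with a request for more digits.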
On Thu, Oct 19, 2006 at 08:25:26AM -0700 I heard the voice of
Actually, as best I know, it's not a checksum, just random bits (a
This I agree with, at least in part.
--
Matthew Fuller (MF4839) | fullermd@over-yonder.net
Systems/Network Administrator | http://www.over-yonder.net/~fullermd/
On the Internet, nobody can hear you scream.
-
Ahh. They may be that even in BK. I know BK had various 16-bit CRC
checksums, but they were probably on the actual _file_ contents, not in
the key itself.

Linus
-
Btw, I do believe that bzr seems to be acting a lot like BK, at least when
it comes to versioning. I suspect that is not entirely random either, and
I suspect it's been a conscious effort to some degree.

Which is fine, in the sense that there are certainly much worse things to
try to copy.

That said, at least BK was up-front about the versions changing, and
didn't try to do anything to hinder it. It still confused some people, and
it wasn't a great naming system, but it did work.

In the big picture, the version naming between BK and git hasn't been an
issue for anybody in practice, I suspect.

So if you want to look at features that actually matter more, try out
something like

    gitk drivers/scsi include/scsi

on the kernel archive (I assume that somebody has tried importing the
kernel git tree into bzr - quite frankly, if bzr cannot handle that size
tree without problems, you have much bigger issues!).

In other words, being able to look at history of more than a single file
has been a _huge_ bonus.

The other big difference is being able to do merges in seconds. The
biggest cost of doing a big merge these days seems to literally be
generating the diffstat of the changes at the end (which is purely a UI
issue, but one that I find so important that I'll happily take the extra
few seconds for that, even if it sometimes effectively doubles the
overhead).

Looking at the dates of the merges yesterday, they're literally half a
minute apart, and that's not me _scripting_ them - that's me actually
looking up the emails, typing in the "git pull " and pasting the source
repository, and git fetching the data over the network and merging it, and
checking out the result (and me verifying that the resulting diffstat
matches what the email says). Doing four of those in a row in less than
two minutes is actually a really big deal.

At some point, "performance" is just more than a question of how fast
things are, it becomes a big part of usability....
An interesting effect on this is when people have a column for
merge performance in a SCM comparison table, they would include
time to run the diffstat as part of the time spent for merging
when they fill in the number for git, but not for any other SCM.

I know you won't misunderstand me but for the sake of others, I
should add this: I am not saying diffstat should be optional.
-
Out of curiosity, how would you compare git and Bitkeeper, on a purely
technical basis? (not asking for a detailed comparison, but an "X is
globally/much/terribly/not better than Y" kind of statement ;-) )

--
Matthieu
-
Having used both in a past job setting (simultaneously even),
BitKeeper was a huge win over CVS, but after a while, some of its
tools were just very frustrating in comparison with comparable Git
interfaces, and I had actually written a terribly slow BK -> Git
converter just so I could incrementally import our BK tree, then use
Git's history-viewing because it was so much more pleasant to work
with.

For small projects (~5 people), they weren't hugely different, but Git
just felt more comfortable after a while. (It was actually possible
to do a commit from the command line in a single command, without
getting annoyed by the interface, for a trivial example.)
-
I think git is better for kernel work these days, but a large portion of
that is that a lot of the features have literally been tweaked for us (for
very obvious reasons).

For example, the whole "rebase" thing (or explicitly making cherry-picking
easy) is something that a number of kernel people do, and even if I have
to admit to not liking the practice very much (it kind of hides the "true"
development history), it does have huge advantages, and it makes history a
lot easier to read.

Similarly, I often used the single-file graphical history viewing in BK
("revtool"), but being able to follow the history of multiple files as one
"entity" really is something that once you get used to, it's really really
hard going back, and "gitk" does generate a much more readable graph.

And I think the git way of doing branches is just simply superior. Git
always did branches in the sense that the way merges happened you _always_
had several heads, but actually making them available and switching
between them was something that wasn't my idea, and that I even was a bit
apprehensive about. I was wrong. Git branches are branches done right. I
just don't see how you _could_ do them better.

That said, a lot of the features I like and _I_ consider really important
are possibly not that important to others. For example, maybe nobody else
really cares about viewing the history of a particular subsystem, the way
I do. For a lot of people, single-file is probably ok.

For example, while git now does "annotate" (or "blame"), it's not
lightning fast, and I simply don't care. Doing a

	git blame kernel/sched.c

takes about three seconds for me, and that's on a pretty good machine (and
on the kernel tree, which for me is always in the cache ;). Quite frankly,
if I cared deeply about that kind of annotation, I'd probably be upset
about it. There are basically _no_ other git operations that take that
long. I can get the _full_ log of the last 18 months of the kernel much
fas...
ll.6041-6091 of that file is blamed to arch/ia64/kernel/domain.c
by pickaxe -C (attributed to commit 2.6.12-rc2) while blame says
they are brought in by commit 9c1cfa, which says "Move the ia64
domain setup code to the generic code". I am slowly realizing
that comparing the output from blame and pickaxe might be a good
way to study the project history.
-
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I'm not as familiar with those details. The one fork that I know a lot
about, when Baz (the old Bazaar architecture) forked off from Arch,
showed me that for each developer branch, one branch must be special.

This is just because it is hard to maintain a branch that applies
cleanly to two diverging codebases. So each developer must develop
against the fork that they want to merge their code into. If they want
their code to be applied to the other fork, someone must port it.

So I really do feel that special branches are inescapable.
With bzr, you have the freedom to choose which branch you consider
special, and change your mind at any time. There are no technical

They would remain stable if you only used pull to update your origin

I don't see why you're reaching that conclusion. I'd like to understand
that better, because Linus seems to be concluding the same thing, and it

That doesn't follow. Just because something is arguably true doesn't
make it bad. And in this case, I'm not arguing that it's true, I'm

We do it the other way around: we put a bug number in the commit
message. And I personally have been developing a bugtracker that is
distributed in the same way bzr is; it stores bug data in the source

Yes, we put revnos in our bug trackers. No, we can't prove that they
will always be valid. But there are significant disincentives to
changing them, so I am quite comfortable assuming they will not change.
And the older a revno gets, the less likely it is to change.

On the other hand, I think your revision identifiers are not as
permanent as you think.

In the first place, it seems fairly common in the Git community to
rebase. This process throws away old revisions and creates new
revisions that are morally equivalent[1]. I don't know whether Git
fetches unreferenced revisions, but bzr's policy is to fetch only
revisions referenced in the ancestry DAG of the branch.

In the second place, one must consi...
First, I want to point out that I think we're having a delightfully
enlightening conversation here, and I'm glad for that.

Let me provide a couple of hypothetical situations to try to
demonstrate my thinking here. The first is far-fetched but perhaps
easier to understand the implications. But the second is the real,
everyday situation that is much more important.

Far-fetched
-----------
Let's imagine there's a complete fork in the bzr codebase tomorrow. We
need not suppose any acrimony, just an amiable split as two subsets of
the team start taking the code in different directions.

Now, at the time of the fork, all published revision numbers apply
equally well to either team's codebase, (obviously, since they are
identical). But as the projects diverge they each start publishing
revision numbers with respect to their own repositories in their own
bug trackers, etc. Obviously, each project has its own "mainline" so
these new revision numbers are only unique within each project and not
between the two.

Time passes...
Finally the two teams (who had remained good friends after the
breakup) find a unifying theory that will let them work on a single
tool that will meet the needs of both user bases. So they want to
merge their code together.

After the merge, there can be only one mainline, so one team or the
other will have to concede to give up the numbers they had generated
and published during the fork. That is, the numbers will not be usable
within the new, merged repository.

Everyday
--------
Now, the above scenario is just silly. It's not likely to ever happen,
so it's really not worth considering as a motivating case.

But, what does (and should) happen everyday is exactly the same. So
here's a realistic situation that is worth considering:

An individual takes the bzr codebase and starts working on it. It's
experimental stuff, so it's not pushed back into the central
repository yet. But our coder isn't a total recluse, so his friends
help him with the code he'...
Note that the id's are still permanent in this case; they will never
(modulo some assumptions about the crypto) be reused. So a given id
points at one and only one object, for all time; it's just that we may

So in this case you can certainly lose the launch codes. But you have
forever granted everyone a way to determine whether a given guess at the
launch codes is correct. (Again, assuming some stuff about SHA1).

--b.
-
In what sense? Yes, you can make a guess if you have stored the SHA1
that contained the launch codes. But the point is that that particular
SHA1 is no longer part of the repository. Keeping that SHA1 is no easier
than just keeping the launch codes in the first place.

-Peff
-
Well, I thought the discussion was about what meaning references have
after branches were modified or removed. In which case the interesting
situation is one where an object is gone but someone somewhere still
holds a reference (because the SHA1 was mentioned in a bug report or an

Could be.
Anyway, the important difference between the SHA1 references and small
integers is that there's no aliasing in the former case. Which is
important--I'd rather have a reference to nothing than a reference to
the wrong thing....

--b.
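
The aliasing point can be illustrated with a toy model. This is only a
sketch: the Fork class, the revision strings and the way ids are derived
from content alone are assumptions made for illustration, not how bzr or
git actually store history. It shows two forks handing out the same small
integer for different revisions, while a content-derived id published by
one fork resolves to nothing (rather than to the wrong thing) in the other.

```python
import hashlib


class Fork:
    """Toy history: a list of revision contents, numbered 1..n (hypothetical)."""

    def __init__(self, history=None):
        self.history = list(history or [])

    def commit(self, content):
        self.history.append(content)
        return len(self.history)  # sequential, per-fork revision number

    def by_number(self, revno):
        return self.history[revno - 1]

    def by_hash(self, hex_id):
        # A content-derived name finds its own content or nothing at all.
        for content in self.history:
            if hashlib.sha1(content.encode()).hexdigest() == hex_id:
                return content
        return None


base = ["initial import", "add parser"]
team_a, team_b = Fork(base), Fork(base)

n_a = team_a.commit("team A: new backend")
n_b = team_b.commit("team B: new frontend")

# Both forks hand out the same number for different revisions.
same_number = (n_a == n_b)
different_content = team_a.by_number(n_a) != team_b.by_number(n_b)

# An id published by team A is simply unknown in team B's fork.
id_a = hashlib.sha1("team A: new backend".encode()).hexdigest()
lookup_in_b = team_b.by_hash(id_a)
lookup_in_a = team_a.by_hash(id_a)
```
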
-
Git tries very hard to make sure you don't have a reference to something
that doesn't exist. But yes, you could have a reference to the SHA1 in
another, non-git source, and try to guess the data from it. However,
there's a bit of a two-step procedure, since the SHA1 will likely be of
the commit. You have to guess the commit author, date, message, and
the contents of the rest of the tree to make a correct guess.

In practice I think most "launch code" scenarios are less about
guessable confidentiality, and more about ceasing to publish things you
shouldn't be (like copyright or patent encumbered code).

-Peff
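
To make the "two-step" guess concrete, here is a minimal sketch of how a
git object name is derived: the SHA-1 of a "<type> <size>\0" header
followed by the object body. The author, timestamp and message values are
made up for illustration, and commit_body is a hypothetical helper that
only handles a parentless commit. The point is that a guess is confirmed
only when every field of the commit matches exactly.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """SHA-1 over '<type> <size>', a NUL byte, then the body."""
    header = f"{obj_type} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree: str, author: str, when: str, message: str) -> bytes:
    """Text of a parentless commit (hypothetical helper, made-up values)."""
    lines = [
        f"tree {tree}",
        f"author {author} {when}",
        f"committer {author} {when}",
        "",
        message,
    ]
    return ("\n".join(lines) + "\n").encode()


# Illustrative values only.
tree = git_object_id("tree", b"")  # name of the empty tree
who = "A U Thor <author@example.com>"
when = "1161300000 +0200"

published = git_object_id(
    "commit", commit_body(tree, who, when, "add launch codes"))

# Same fields give the same name; one second of difference does not.
right_guess = git_object_id(
    "commit", commit_body(tree, who, when, "add launch codes"))
wrong_guess = git_object_id(
    "commit", commit_body(tree, who, "1161300001 +0200", "add launch codes"))

print(published == right_guess)
print(published == wrong_guess)
```
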
-
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I don't think this is true. The abandoned mainline does not need to be
destroyed. It can be kept at the same location that it always was, with
the numbers that it always had. So the number + URL combo stays
meaningful. Additionally, the new mainline can keep a mirror of the
abandoned mainline in its repository, because there are virtually no

They certainly can.
The coder says "I've put up a branch at http://example.com/bzr/feature.
In revision 5, I started work on feature A. I finished work in
revision 6. But then I had to fix a related bug in revision 7."

As long as that coder is active, they'll keep their repository at the
same location. And because branches are cheap (even cheaper than
delta-compressed revisions), there's no reason to delete old branches.

This is true, but his code is likely to all land in the mainline at
once. Since his own revnos are more fine-grained, he's not likely want

I felt that you were mischaracterizing my _statement_ that "it's
exceedingly uncommon for [revnos] to change" as an _argument_ "it's
exceedingly uncommon for [revnos] to change". The reality is that we
keep saying revnos don't change because git users keep saying "but what

If you're interested, it's called "Bugs Everywhere" and it's available here:
http://panoramicfeedback.com/opensource/

So actually, not all branches are treated equally by Git users. Public
branches are treated as append-only, but private branches are treated as

Same here.
Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFOAPm0F+nu1YWqI0RAhkdAJ9InxuEjbToGQU2AOJmfZw124Lb2wCeMmDC
9w08eZbmL19FfVQmtpPcYkQ=
=AmGo
-----END PGP SIGNATURE-----
-
Sure that's possible, but it gets rather unwieldy the more
repositories you have involved. I've been arguing that bzr really does
encourage centralized, not distributed development, and you were having
trouble seeing how I came to that conclusion. Do you see how "maintain
an independent URL namespace for every distributed branch" doesn't

And this part I don't understand. I can understand the mainline
storing the revisions, but I don't understand how it could make them
accessible by the published revision numbers of the "abandoned"

...which is what you just said there yourself.
On the other hand, git names really do live forever, regardless of
where the code is hosted or how it moves around. When I'm talking
about historical stability, I'm talking about being able to publish
numbers that live forever.

It sounds like bzr has numbers like this inside it, (but not nearly as
simple as the ones that git has), but that users aren't in the
practice of communicating with them. Instead, users communicate with
the unstable numbers. And that's a shame from an historical

What I'd like to be able to do, is advertise a temporary repository,
and while using it, publish names for revisions that will still be
valid when the code gets pushed out to the mainline. That is
supporting distributed development, and everything I'm hearing says

OK.
The original claim that sparked the discussion was that bzr has a
"simple namespace" while git does not. We've been talking for quite a
while here, and I still don't fully understand how these numbers are
generated or what I can expect to happen to the numbers associated
with a given revision as that revision moves from one repository to
another. It's really not a simple scheme.

Meanwhile, I have been arguing that the "simple" revision numbers that
bzr advertises have restrictions on their utility, (they can only be
used with reference to a specific repository, or with reference to
another that treats it as canonical). I _think_ I understand the...
With this sort of setup, I would publish my branches in a directory
tree like this:

  /repo
    /branch1
    /branch2

I make "/repo" a Bazaar repository so that it stores the revision data
for all branches contained in the directory (the tree contents,
revision meta data, etc).

The "/repo/branch1" essentially just contains a list of mainline
revision IDs that identify the branch. This could probably just
store the head revision ID, but there are some optimisations that make
use of the linear history here.

If the ancestry of "/repo/branch2" is a subset of branch1 (as it might
be in the case of forked then merged projects), then all its
revision data will already be in the repository when branch1 was
imported. The only cost of keeping the branch around (and publishing
it) is the list of revision IDs in its mainline history.

For similar reasons, the cost of publishing 20 related Bazaar branches
on my web server is generally not 20 times the cost of publishing a
single branch.

I understand that you get similar benefits by a GIT repository with

With the repository structure mentioned above, the cost of publishing
multiple branches is quite low. If I continue to work on the project,
then there are no particular bandwidth or disk space reasons for me to
cut off access to my old branches.

For similar reasons, it doesn't cost me much to mirror other people's

If you need that level of stability then you want the revision
identifier in both the GIT and Bazaar cases.

As for simplicity, note that Bazaar doesn't extract any special
meaning from the "$email-$date-$random" format of the revision
identifiers. The only property it cares about is that they are
globally unique. For example, revision identifiers generated by the
Arch -> Bazaar importer have a different format and are handled the

That is correct. The revision numbers assigned to particular
revisions in the context of one branch won't necessarily be the same

I can't say anything ab...
And here we have a feature which is as far as I see unique to git,
namely to have persistent branches with _separate namespace_. It means
that we can have hierarchical branch names (including names like
"remotes/<remotename>/<branch of remote>", or "jc/diff"), and we don't
have to guess where repository name ends and branch name begins.

The idea of "branches (and tags) as directories" was if I understand
it correctly introduced by Subversion, and from what can be seen from
troubles with git-svn (stemming from the fact that division between
project name and branch name is the matter of _convention_) at least

You can get similar benefits by a GIT repository with shared object
database using alternates mechanism. And that is usually preferred
over storing unrelated branches, i.e. branches pointing to disconnected
DAG (separate trees in BK terminology) of revisions, if that is what you
mean by multiple head revisions (because in GIT there is no notion of "mainline"

But the revision number in this case _changes_. It is from 7 to
branch:7 but still it changes somewhat.

Emphasis is on _potential_. SHA1 id abbreviated to 6 characters might
not be unique in a larger project, but for example the chance that
SHA1 id abbreviated to 7 or 8 characters is not unique is really low.

Yet another analogy:
SHA1 identifiers of commits (and not only commits) can be compared
to Message-Ids of Usenet messages, while revision numbers can be compared
to Xref number of Usenet message which if I understand correctly is unique
only for given news server. But Message-Ids cannot be shortened
meaningfully like SHA1 ids can; nevertheless they are used in communication
without any problems. Even if namespace is not simple ;-)

--
Jakub Narebski
Poland
-
With the above layout, I would just type:
  bzr branch http://server/repo/branch1

This command behaves identically whether the repository data is in
/repo or in /repo/branch1. Someone pulling from the branch doesn't
have to care what the repository structure is. Having a separate
namespace for branch names only really makes sense if the user needs
to care about it.

As for hierarchical names, there is nothing stopping you from using
deeper directory structures with Bazaar too. Bazaar just checks each

I think you are a bit confused about how Bazaar works here. A Bazaar
repository is a store of trees and revision metadata. A Bazaar branch
is just a pointer to a head revision in the repository. As you can
probably guess, the data for the branch is a lot smaller than the data
for the repository.

You can store the repository and branch in the same directory to get a
standalone branch. The layout I described above has a repository in a
parent directory, shared by multiple branches.

If you are comparing Subversion and Bazaar, a Bazaar branch shares
more properties with a full Subversion repository rather than a

I may have got the git terminology wrong. I was trying to draw
parallels between the .git/refs/... files in a git repository and the
way multiple branches can be stored in a Bazaar repository.

I am not claiming that you'll get bandwidth or disk space benefits for
storing unrelated branches in a single Bazaar repository. But if the
branches are related, then there will be space savings (which is what

A revision number only has meaning in the context of a branch. If
I mirror a branch, the revision numbers in the context of each will
refer to the same revision IDs.

My point was that by shortening the IDs with GIT, you are trading
global uniqueness (i.e. the identifier may clash with one found in a
different context) for the convenience of shorter identifiers.

Provided you know that the tradeoff is being made, it isn't generally
much of a problem. ...
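
The size of that tradeoff can be estimated with a short calculation. The
sketch below assumes object ids behave like uniformly random hex strings
(an idealisation), and the function name and the figure of 100,000 other
objects are chosen purely for illustration. It estimates the chance that
one particular abbreviated id shares its prefix with at least one other
object in the same repository.

```python
import math


def prefix_clash_probability(other_objects: int, hex_digits: int) -> float:
    """Chance that a given prefix of hex_digits is shared by any other object.

    Computes 1 - (1 - 16**-hex_digits) ** other_objects, using log1p/expm1
    so that very small probabilities are not rounded away to zero.
    """
    space = 16 ** hex_digits
    return -math.expm1(other_objects * math.log1p(-1.0 / space))


# Illustrative repository size; not a measurement of any real project.
for digits in (6, 7, 8, 40):
    p = prefix_clash_probability(100_000, digits)
    print(f"{digits:2d} hex digits: {p:.3g}")
```

Each extra hex digit divides the estimate by roughly sixteen, which is why
going from 6 to 8 characters makes such a difference.
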
With Cogito (you can think of it either as alternate Git UI, or as SCM
built on top of Git) you would use

  $ cg clone http://server/repo#branch

for example

  $ cg clone git://git.kernel.org/pub/scm/git/git.git#next

to clone _single_ branch (in bzr terminology, "heavy checkout" of branch).
But you can also clone _whole_ repository, _all_ published branches with

  $ cg clone git://git.kernel.org/pub/scm/git/git.git

With core Git it is the same, but we don't have the above shortcut
for checking out only one branch; branches to checkout are in separate
arguments to git-clone.

In bzr it seems that you cannot distinguish (at least not only
from URL) where repository ends and branch begins.

*Sidenote:* In current version of gitweb you can get file
in given repository in given branch using the following
notation:

  http://path/to/gitweb.cgi/repo/sitory/branch/name:file/name

gitweb can detect where repository name ends and branch name
begins; usually (by convention) "bare" git repositories use
<project>.git name, "clothed" git repositories use
<project>/.git

Oh, that explained yet another difference between Bazaar-NG (and other
SCM which uses similar model) and Git.

In Git branch is just a pointer to head (top) commit (hence they are stored
under .git/refs/heads/) in given line of development. Git also stores
information (in .git/HEAD) about which branch we are currently on, which
means on which branch git puts new commits. Nothing more (well, there
can be log of changes to head in .git/logs/refs/heads/ but that is optional
and purely local information). In Bazaar-NG you have to store (if I
understand it correctly) mapping from revnos to revisions.

By default (it means for example default behavior of git-clone, if we don't
use --bare option) git repository is _embedded_ in working area. We have

  .git/
  .git/HEAD
  ...
  .git/refs/heads/
  ...
  <working area files, e.g.>

So repo/branch wouldn't work, because 'branch' wo...
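
The "branch is just a pointer" description above can be sketched in a few
lines. This is an illustration under stated assumptions: HEAD is taken to
be a text file of the form "ref: refs/heads/<name>", each branch is a
loose file under refs/heads/ holding one commit id, and packed refs or a
symlinked HEAD are ignored. The function name and the commit id are made
up, and the directory is a throwaway mock-up rather than a real repository.

```python
import os
import tempfile


def current_branch(git_dir: str):
    """Return (ref name, commit id) for the branch that HEAD points at."""
    with open(os.path.join(git_dir, "HEAD")) as f:
        head = f.read().strip()
    if not head.startswith("ref: "):
        return None, head  # HEAD holds a commit id directly
    ref = head[len("ref: "):]
    with open(os.path.join(git_dir, ref)) as f:
        return ref, f.read().strip()


# Mock-up of the layout described above; the commit id is made up.
git_dir = os.path.join(tempfile.mkdtemp(), ".git")
os.makedirs(os.path.join(git_dir, "refs", "heads"))
fake_id = "0123456789abcdef0123456789abcdef01234567"
with open(os.path.join(git_dir, "refs", "heads", "master"), "w") as f:
    f.write(fake_id + "\n")
with open(os.path.join(git_dir, "HEAD"), "w") as f:
    f.write("ref: refs/heads/master\n")

print(current_branch(git_dir))
```
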
My understanding of git is that this would be equivalent to the "bzr
branch" command. A checkout (heavy or lightweight) has the property

I suppose that'd be useful if you want a copy of all the branches at

Two points:
(1) if we are publishing branches, we wouldn't include working trees
-- they are not needed to pull or merge from such a branch.
(2) if we did have working trees, they'd be rooted at /repo/branch1
and /repo/branch2 -- not at /repo (since /repo is not a branch).

In case (2) there is a potential for conflicts if you nest branches,
but people don't generally trigger this problem with the way they use

That is fairly similar to the default mode of operation with Bazaar:
you have a repository, branch and working tree all rooted in the same
directory. If you have separated working trees and branches, then

The layout of a standalone branch would be:
  .bzr/repository/  -- storage of trees and metadata
  .bzr/branch/      -- branch metadata (e.g. pointer to the head revision)
  .bzr/checkout/    -- working tree book-keeping files
  source code

If we use a shared repository, the contained branches would lack the
.bzr/repository/ directory. The parent directory would instead have a
.bzr/repository/, but usually wouldn't have .bzr/branch/ (unless there
is a branch rooted at the base of the repository).

If we are publishing a branch to a web server, we'd skip the working
tree, so the source code and .bzr/checkout/ directory would be
missing.

In the case of a checkout, the .bzr/branch/ directory has a special
format and acts as a pointer to the original branch. If the checkout
is lightweight, the .bzr/repository/ directory would be missing, and

Okay. So using Bazaar terminology, this seems to be an issue of the
working tree being associated with the repository rather than the
branch?

Well, a branch can easily have multiple URLs even if there is only one
copy of it. I might write to it via local file access or sftp (which
would be a file: or sftp: URL).
...
Not exactly (my mistake in explaining it). "cg clone git://host/repo@branch"
clones only part of history DAG of commits reachable from given branch.
Still it is full repository. You can add branches to it later with

That is _very_ useful. And that is default option for Git. For
example with git.git repository I'm interested both in 'master'
branch (main line of development), and in 'next' branch (development
branch). For example, if I send some patches based on 'master', they
get accepted but into 'next' (to cook for a while, for example), and
I want to do further work in this direction, I have to base my
new work on the 'next' branch.

It looks like the Bazaar-NG "branches" are equivalent of the
one-branch-clone of Git.

And if there is no command to clone whole repository, how
do you do a public repository?

See below.
Same with Git. Public repositories are usually "bare" clones, i.e.
without working directory. We can clone/fetch from "clothed" repo

There is no problem in Git to have git repository nested within
working area: of course you better ignore .git directory; you can
ignore files in this embedded repository or not.

A git repository (git clone, as it is the equivalent of bzr branch)
has the following layout:
  .git/objects/  -- repository objects database
  .git/refs/     -- heads (branches) and tags
  .git/index     -- staging area for commit (adding files, merge resolving)
  .git/HEAD      -- which branch is current branch

The equivalent of shared repository would be having .git/objects/
to be symlink to some directory which would serve as common area
to store object database.

You can use alternates file: .git/objects/info/alternates can have
list of absolute pathnames (one per line) where objects can be found
instead. If I understand correctly, new objects get committed to the current
repository object database, therefore to have equivalent of symlinking
.git/objects directory you would have for every repository which you
want to share object database to have in al...
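
The lookup order that the alternates file implies can be sketched as
follows. This is a simplified illustration, not git's implementation: it
assumes only loose objects (stored as <objects dir>/<first two hex
digits>/<remaining digits>), absolute paths in the alternates file, and
no chained alternates or pack files. The function names, directory names
and object id are all made up.

```python
import os
import tempfile


def object_dirs(objects_dir: str):
    """Yield objects_dir, then every directory listed in info/alternates."""
    yield objects_dir
    alternates = os.path.join(objects_dir, "info", "alternates")
    if os.path.exists(alternates):
        with open(alternates) as f:
            for line in f:
                path = line.strip()
                if path:
                    yield path


def find_loose_object(objects_dir: str, hex_id: str):
    """Return the path of a loose object, falling back to the alternates."""
    for directory in object_dirs(objects_dir):
        candidate = os.path.join(directory, hex_id[:2], hex_id[2:])
        if os.path.exists(candidate):
            return candidate
    return None


# Mock-up: a clone whose object database borrows from a shared one.
root = tempfile.mkdtemp()
shared = os.path.join(root, "shared", "objects")
local = os.path.join(root, "clone", ".git", "objects")
os.makedirs(os.path.join(local, "info"))
with open(os.path.join(local, "info", "alternates"), "w") as f:
    f.write(shared + "\n")

fake_id = "ab" + "0" * 38  # made-up object name
os.makedirs(os.path.join(shared, "ab"))
open(os.path.join(shared, "ab", "0" * 38), "w").close()

found = find_loose_object(local, fake_id)
missing = find_loose_object(local, "cd" + "0" * 38)
print(found)
```
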
Dear diary, on Sat, Oct 21, 2006 at 12:50:31AM CEST, I got a letter
It's not exactly convenient, but you can do

	xpasky@machine[0:0]~/git$ GIT_ALTERNATE_OBJECT_DIRECTORIES=../cogito/.git/objects cg-diff -r `GIT_DIR=../cogito/.git cg-object-id -c HEAD`..HEAD

I don't personally think it's worth a special UI, but there're no
boundaries for initiative... :-)

--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)
-
Dear diary, on Fri, Oct 20, 2006 at 03:17:26PM CEST, I got a letter
Nope, cg clone will in this case clone the master branch (or whatever
the remote HEAD points at). cg clone -a is planned but not implemented

You don't need to, you can switch your working tree between various
branches. I think Linus said he does that (or was it Junio?), and I do that
as well, as well as many others.

A good question would be "when to create another branch and when to
clone the repository". And I don't think there's any good answer, except
"when you are comfortable with it". :-) Both approaches have pros/cons.

--
Petr "Pasky" Baudis
-
That's probably because Cogito still uses obsolete branches/

  $ git clone git://git.kernel.org/pub/scm/git/git.git

clones _whole_ repository, all the branches and tags, and saves information

I should have said: bring working area to state given by some revision
(instead of "populate working area").

--
Jakub Narebski
Poland
-
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I understand your argument now. It's nothing to do with numbers per se,

I meant that the active branch and a mirror of the abandoned branch
could be stored in the same repository, for ease of access.

Bazaar encourages you to stick lots and lots of branches in your
repository. They don't even have to be related. For example, my repo

I can see where you're coming from, but to me, the trade-off seems
worthwhile. Because historical data gets less and less valuable the
older it gets. By the time the URL for a branch goes dark, there's

When you create a new branch from scratch, the number starts at zero.
If you copy a branch, you copy its number, too.

Every time you commit, the number is incremented. If you pull, your
numbers are adjusted to be identical to those of the branch you pulled from.

Sure. It's the "favors centralization" thing that I don't agree with,

In my experience, users who don't understand distributed systems don't

What's nice is being able to see the revno 753 and knowing that "diff -r
752..753" will show the changes it introduced. Checking the revno on a
branch mirror and knowing how out-of-date it is.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFOCEf0F+nu1YWqI0RAhgtAJwK4jkWFjjF2iHJb1VyXqgszsHElACff2U7
olZJiAED80tIS6kgkqFsJps=
=BkRZ
-----END PGP SIGNATURE-----
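
The numbering rules described in the message above (a new branch starts
at zero, a copy keeps its number, each commit increments it, a pull makes
the numbers match the source) can be modelled in a few lines. ToyBranch
is a hypothetical model written for this illustration and is not bzr's
code; in particular, refusing to pull from a diverged branch is a
simplifying assumption of the model.

```python
class ToyBranch:
    """Hypothetical model of the revno rules above; not bzr's real code."""

    def __init__(self, mainline=None):
        # Ordered revision ids; the revno of the tip is the list length.
        self.mainline = list(mainline or [])

    @property
    def revno(self):
        return len(self.mainline)  # a new branch starts at zero

    def copy(self):
        return ToyBranch(self.mainline)  # copying a branch copies its number

    def commit(self, revision_id):
        self.mainline.append(revision_id)  # each commit increments the number
        return self.revno

    def pull(self, other):
        # Model assumption: only fast-forward pulls are allowed.
        if other.mainline[: len(self.mainline)] != self.mainline:
            raise ValueError("branches have diverged")
        self.mainline = list(other.mainline)

    def lookup(self, revno):
        return self.mainline[revno - 1]


trunk = ToyBranch()
trunk.commit("rev-a")
trunk.commit("rev-b")

mirror = trunk.copy()    # same numbers as trunk at this point
feature = trunk.copy()

trunk.commit("rev-c")    # revno 3 on trunk
feature.commit("rev-x")  # revno 3 on feature, a different revision

behind = trunk.revno - mirror.revno  # how out-of-date the mirror is
mirror.pull(trunk)                   # mirror's numbers now match trunk's
```

In this model the same revno names different revisions on trunk and on
feature, which is the case the rest of the thread argues about.
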
-
Well, I'm glad to know we each feel like we are communicating at
The entire discussion is about how to name things in a distributed
system. The premise that Linus has put forth in a very compelling way,
is that attempting to use sequential numbers for names in a
distributed system will break down. The breakdown could be that the
names are not stable, or that the system is used in a centralized way
to avoid the instability of the names.

Now, that causality might not accurately describe the way bzr has
developed. It may be that the centralization bias was determined by
other reasons, and that given those, using sequential numbers for
names makes perfect sense.

But it really is fundamental and unavoidable that sequential numbers

Granted, everything can be stored in one repository. But that still
doesn't change what I was trying to say with my example. One of the
repositories would "win" (the names it published during the fork would
still be valid). And the other repository would "lose" (the names it
published would be not valid anymore). Right?

Now, maybe there's some "simple" mapping from old names to new names
for the losing repository, (something like adding a prefix of
"losers/" to the beginning of the names or something or adding a "15."
prefix or whatever). The point is that the old names are
invalidated. And there's no way to guarantee this kind of change won't
happen in the future, (no matter how old a project is).

I constructed that example to show that the naming has a social impact
in forcing a distinction between winners and losers in the merge, (or
mainline and side branch, or whatever you want to name the
distinction). The two re-joining projects could be really amiable,
create a new virgin mainline and treat both histories as side
branches. In this version, everyone loses as all the old names are

Git allows this just fine. And lots of branches belonging to a single
project is definitely the common usage. It is not common (nor
encouraged) for unrelated pro...
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

So I'd say that revnos without the context of a location can only refer
to the current branch that the user is working on. They don't refer to
the mainline, which typically has its own numbers that don't match the
user's.

If you're saying that bzr is "centralized" in that the user's current

Right. You need something guaranteed to be unique. It's the revno +
url combo that is unique. That may not be permanent, but anyone can

No. It would be silly for the losing side to publish a mirror of the
winning branch at the same location where they had previously published
their own branch. So the old number + URL combination would remain valid.

If the losing faction decided to maintain their own branch after the
merge, they'd have two options:

1. continue to develop against the losing "branch", without updating its
   numbers from the "winning" branch. It would be hard to tell who had won
   or lost in this case.

2. create a new mirror of the "winning" branch and develop against that.
   I'm not sure what the point of this would be.

I think the most realistic thing in this scenario is that they leave the
"losing" branch exactly where it was, and develop against the "winning"

Right. This is a difference between Bazaar and Git that I'd
characterize as being "branch-oriented" vs "repository-oriented". We'll

I got the impression there was also a local ordering of revisions. Is
that wrong?

A Bazaar branch is a directory inside a repository that contains:
- a name referencing a particular revision
- (optional) the location of the default branch to pull/merge from
- (optional) the location of the default branch to push to
- (optional) the policy for GPG signing
- (optional) an alternate committer-id to use for this branch
- (optional) a nickname for the branch
- other configuration options

A Bazaar branch doesn't contain any commit objects ("revisions" in
Bazaar parlance). Those are retrieved from the containing r...
But it is *not* *distributed*. The definition of a distributed system
requires, among other things, that resource identifiers are independent of
the location of the resources. So only using the revision-ids is really

I regularly use bzr and I never used git. But I'd not hesitate a second
to pull --overwrite over the old location. Because the url has a meaning
"the base I develop against" for me and I'd want to preserve that

This is one of the things I, on the other hand, like better in bzr than in git.
Because it is really branches and not repositories that I usually care
about.

--------------------------------------------------------------------------------
- Jan Hudec `Bulb' <bulb@ucw.cz>
-
Why not? I think it really does. And due to the fact that merges are
merges and will show up as such, I think it's very suitable for
feature branches.

In fact, in the development of bzr itself, all commits are done
in feature branches and then merged into bzr.dev (the main "trunk" of
bzr) when they are considered stable.

Consider the following:

  bzr branch mainline featureA
  cd featureA
  hack hack; bzr commit -m 'f1'; hack hack; bzr commit -m f2; etc

Now I want to merge in mainline again:

  bzr merge ../mainline; bzr commit -m merge
  hack hack; bzr commit -m f3; hack hack; bzr commit -m f4; etc

Right now, I would have something like this in the branch log:
-----------------------------------------------------------------
committer: Erik B
At Sun, 22 Oct 2006 11:56:32 +0200, Erik Bågfors wrote:
Thanks for sharing this example. I think when we look at concrete
things that the tools actually let you do, we have a better
conversation. Plus, this example highlights some very interesting
differences between the tools.

So here is a complete sequence of git commands to construct the
scenario (even the extra hacking in mainline):

  mkdir gittest; cd gittest
  git init-db
  touch mainline; git add mainline; git commit -m "Initial commit of mainline"
  git checkout -b featureA
  touch f1; git add f1; git commit -m f1
  touch f2; git add f2; git commit -m f2
  git checkout -b mainline master
  touch sd; git add sd; git commit -m "something done in mainline"
  touch se; git add se; git commit -m "something else done in mainline"
  git checkout featureA
  git pull . mainline
  touch f3; git add f3; git commit -m f3
  touch f4; git add f4; git commit -m f4

For reference, here's the same with bzr:

  mkdir bzrtest; cd bzrtest
  bzr init-repo . --trees
  bzr init mainline; cd mainline
  touch mainline; bzr add mainline; bzr commit -m "Initial commit of mainline"
  cd ..; bzr branch mainline featureA; cd featureA
  touch f1; bzr add f1; bzr commit -m f1
  touch f2; bzr add f2; bzr commit -m f2
  cd ../mainline/
  touch sd; bzr add sd; bzr commit -m "something done in mainline"
  touch se; bzr add se; bzr commit -m "something else done in mainline"
  cd ../featureA
  bzr merge ../mainline/; bzr commit -m "merge"
  touch f3; bzr add f3; bzr commit -m f3
  touch f4; bzr add f4; bzr commit -m f4

[As has recently been pointed out, the tools really are more the same
OK. So here is a difference in the tools. With git, you don't get the
indentation for the "non-mainline" commits. This is because git
doesn't recognize any branch in the DAG to be more significant than
any other. Instead, git provides a flat, and (heuristically)
time-sorted view of the commits. (It's heuristic in that git just uses
the time stamps in the...
On Sun, Oct 22, 2006 at 07:25:41AM -0700 I heard the voice of
This throws me a little. I'd expect it to Just Do It when it's
fast-forwarding, but if it's doing a merge, I'd prefer it to stop and
wait before creating the commit, even if there are no textual
conflicts. I realize you can just look at it afterward and back out

Every branch has a nickname, settable with 'bzr nick' (defaulting to
the name of the directory it's in), and that's stored as a text field
in each commit. It's mostly cosmetic, but it's handy to see at a

From what I can gather from this, though, that means that when I merge
stuff from featureA into mainline (and keep on with other stuff in
featureA), I'll no longer be able to see those older commits from this
command. And I'll see merged revisions from branches other than
mainline (until they themselves get merged into mainline), correct?
It sounds more like a 'bzr missing --mine-only' than looking down a

The branch: (head) and ancestor: (latest common rev) revspecs let you
refer to the respective bits of other branches, which I think would

Well, what would be the fun in that? 8-}
--
Matthew Fuller (MF4839) | fullermd@over-yonder.net
Systems/Network Administrator | http://www.over-yonder.net/~fullermd/
On the Internet, nobody can hear you scream.
-
The thing that the bzr people don't seem to realize is that their choice
of revision naming has serious side effects, some of them really
technical, and limiting.

I already brought this up once, and I suspect that the bzr people simply
DID NOT UNDERSTAND the question:

 - how do you do the git equivalent of "gitk --all"

which is just another reason why "branch-local" revision naming is simply
stupid and has real _technical_ problems.

I really suspect that a lot of people can't see further than their own
feet, and don't understand the subtle indirect problems that branch-local
naming causes.

For example, how long does it take to do an arbitrary "undo" (ie forcing a
branch to an earlier state) in a project with tens of thousands of
commits? That's actually a really important operation, and yes,
performance does matter. It's something that you do a lot when you do
things like "bisect" (which I used to approximate with BK by hand, and
yes, re-weaving the branch history was apparently a big part of why it
took _minutes_ to do sometimes).

Again, this is something that people don't expect to have _anything_ to do
with revision numbering, but the fact is, it's a big part of the picture.
If you have branch-local revision numbering, you need to renumber all
revisions on events like this, and even if it is "just" re-creating the
revno->"real ID" cache, it's actually an expensive operation exactly
because it's going to be at least linear in history.

One of the git design requirements was that no operation should _ever_
need to be linear in history size, because it becomes a serious limiter of
scalability at some point. We were seeing some of those issues with BK,
which is why I cared.

So in git, doing things like jumping back and forth in history is O(1).
Always (with a really low constant cost too). Of course, checking out the
end result is then roughly O(n), but even there "n" is the size of the
_changes_, not number of revisions or number of ...
On Mon, Oct 23, 2006 at 10:29:53AM -0700 I heard the voice of
I for one simply DO NOT UNDERSTAND the question, because I don't know
what that is or what I'd be trying to accomplish by doing it. The

I don't understand the thrust of this, either. As I understand the
operation you're talking about, it doesn't have anything to do with a
branch; you'd just be whipping the working tree around to different

I agree, and I currently find a number of places bzr doesn't hit the
level of performance I think it should. I'm not convinced, however,
that any notable proportion of that has to do with the abstract model
behind it. And insofar as it has to do with the physical storage
model, that can easily be (and I'm confident will be, considering it's

I consider it a _technical_ sign of a way of thinking about branches I
prefer 8-}

--
Matthew Fuller (MF4839) | fullermd@over-yonder.net
Systems/Network Administrator | http://www.over-yonder.net/~fullermd/
On the Internet, nobody can hear you scream.
-
There are two things to do:

* Mark the tree as corresponding to a different revision in the past.
  This is roughly "echo 'revision@id-123' > .bzr/checkout/last-revision"
  in bzr. Obviously, writing the file is O(1), but to compute the
  revision identifier when you say "bzr switch -r 42" (I'm not sure
  switch accepts this BTW), you have to load the revision history.
  Indeed, bzr would load it anyway to make sure that the revision you
  switch to is in the revision history.

  In bzr, you have .bzr/branch/revision-history for each branch, which
  is a newline-separated list of revision-identifiers. In the case of
  bzr.dev, for example, this file is 112KB as of now. Reading it is
  O(history), with "history" being the length of the path from HEAD to
  the initial commit, following the leftmost ancestor (i.e. number of
  revisions in a centralized workflow, and less than this otherwise).
  That said, the constant factor is very small. For example, on
  bzr.dev, I did "grep -n some-rev-id" (which does revid-to-revno), and it
  takes 0.004 seconds (vs 0.003 seconds to grep in /dev/null
  instead ;-) ), so you'd need many orders of magnitude before this
  becomes a limitation.

  Linus's point AIUI is that this will _never_ be a limitation of git.

* Then, do the "merge" to make your tree up to date. You can hardly do
  faster than git and its unpacked format, but this is at the cost of
  disk space. But as you say, in almost any modern VCS, that's
  O(diff). In a space-efficient format, that's just the tradeoff you
  make between full copies of a file and delta-compression.

--
Matthieu
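
To make the lookup cost above concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not bzr's actual code: the list stands in for the newline-separated revision-history file, and the revision ids and function names are made up for illustration.

```python
# Hypothetical stand-in for a per-branch revision-history file:
# a list of revision ids, oldest first. Revno N is the N-th entry (1-based).
history = ["rev-a", "rev-b", "rev-c", "rev-d"]


def revid_to_revno(history, revid):
    """Linear scan, like grepping the file: O(length of history)."""
    for revno, candidate in enumerate(history, start=1):
        if candidate == revid:
            return revno
    raise KeyError(revid)


def revno_to_revid(history, revno):
    """Index lookup once the list is loaded; loading it is still O(history)."""
    return history[revno - 1]


def undo_to(history, revno):
    """Force the branch back to an earlier revno by dropping newer entries."""
    return history[:revno]


print(revid_to_revno(history, "rev-c"))  # 3
print(revno_to_revid(history, 2))        # rev-b
print(undo_to(history, 2))               # ['rev-a', 'rev-b']
```

In this model every revno operation has to touch the whole list at least once, which is the O(history) term being argued about; whether that ever matters in practice is exactly the disagreement in the thread.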
-
gitk (and all other logging functions) can take as its argument a set of
arbitrary revision expressions.

That means, for example, that you can give it a list of branches and tags,
and it will generate the combined log for all of them. "--all" is just
shorthand for that, but it's really just a special case of the generic
facility.

This is _invaluable_ when you want to actually look at how the branches
are related. The whole _point_ of having branches is that they tend to
have common state.

For example, let's say that you have a branch called "development", and a
branch called "experimental", and a branch called "mainline". Now,
_obviously_ all of these are related, but if you want to see how, what
would you do?

In git, one natural thing would be, for example, to do

    gitk development experimental ^mainline

(where instead of "gitk" you can use any of the history listing
things - gitk is just the visually more clear one) which will show you
what exists in the branches "development" and "experimental", but it will
_subtract_ out anything in "mainline" (which is sensible - you may want to
see _just_ the stuff that is getting worked on - and the stuff in mainline
is thus uninteresting).

See? When you visualize multiple branches together, HAVING PER-BRANCH
REVISION NUMBERS IS INSANE! Yet, clearly, it's a valid and interesting
operation to do.

An equally interesting thing to ask is: I've got two branches, show me the
differences between them, but not the stuff in common. Again, very simple.
In git, you'd literally just write

    gitk a...b

(where "..." is "symmetric difference"). Or, if you want to see what is in
"a" but _not_ in "b", you'd do

    gitk b..a

(now ".." is regular set difference, and the above is really identical to
the "a ^b" syntax).

And trust me, these are all very valid things to do, even though you're
talking about different branches.

No. If you "undo", you'd undo the whole history too. And if you undo to a
point tha...
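
The revision expressions described above ("a b ^c", "b..a", "a...b") boil down to set operations on what is reachable from each tip. Here is a small Python sketch of that idea. It is not how git implements it, and the commit names and the tiny DAG are invented for the example.

```python
# Toy commit DAG: each commit maps to its list of parents (hypothetical names).
parents = {
    "root": [],
    "m1": ["root"],
    "m2": ["m1"],   # tip of mainline
    "d1": ["m1"],
    "d2": ["d1"],   # tip of development
    "e1": ["d1"],   # tip of experimental
}
branches = {"mainline": "m2", "development": "d2", "experimental": "e1"}


def reachable(tip):
    """All commits reachable from tip by following parent links."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def log(include, exclude=()):
    """Commits reachable from any 'include' branch but from no 'exclude' branch."""
    wanted = set().union(*(reachable(branches[b]) for b in include))
    unwanted = set().union(*(reachable(branches[b]) for b in exclude))
    return wanted - unwanted


def symmetric_difference(a, b):
    """Commits reachable from exactly one of the two branches (like a...b)."""
    return reachable(branches[a]) ^ reachable(branches[b])


# like: development experimental ^mainline
print(sorted(log(["development", "experimental"], ["mainline"])))  # ['d1', 'd2', 'e1']
# like: experimental..development
print(sorted(log(["development"], ["experimental"])))              # ['d2']
# like: development...experimental
print(sorted(symmetric_difference("development", "experimental")))  # ['d2', 'e1']
```

Nothing in these functions needs a per-branch number; the only names involved are the commit identifiers and the branch tips.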
On Mon, Oct 23, 2006 at 03:44:13PM -0700 I heard the voice of
I have zero problem believing that. It seems from all accounts a
wonderful swiss-army chainsaw, and while none of that power is useful
to me personally in anything I'm VCS'ing at the moment, I'd feel awful
shiny knowing it was sitting there waiting for me. All else being
equal, I'd think more highly of a VCS with those capabilities than one
without.

bzr-the-program doesn't have a lot of that capability, and what it
does have is rather more verbose to access. Perhaps some attribute of
bzr-the-current-storage-model would make some bit of that
significantly more expensive than it has to be (I don't know of any,
and can't think offhand of anywhere it might hide, but that's way off
my turf).

But I don't understand how bzr-the-abstract-data-model makes such
things impossible, or even significantly different than doing so in
git. In git, you're just chopping off one DAG where another one
intersects it (or similar operations). To do it in bzr, you'd do...
exactly the same thing. The revnos, or the mainline, are completely
useless in such an operation of course, but they don't hurt it; the
tool would just ignore them like it does the SHA-1 of files in

I wouldn't be so absolutist about it, but certainly they're of
extremely limited utility if of any at all in such cases. And yes, it
can be an interesting operation. But what does that have to do with
using revnos in other cases? You keep saying "having" where I would

Well, I guess in this particular case I still don't see why you'd
generally undo big hunks of a branch versus just flipping your working
tree to different versions. But contrived examples are still
examples, and even if so, truncate()'ing a list of numbers is a
constant time operation. And even if you had to renumber totally...
my $DEITY, I'd expect my old 200MHz PPro to renumber a hundred

Quite frankly, I just don't think you understand that I WANT to care
about first parents. No, really. ...
one key difference is that with bzr you have to do this chopping by creating the
branches at the time changes are done, with git you do this chopping after the
fact when you are displaying the results.

As such you can chop and compare things in ways that were never contemplated by

nobody is saying that the bzr approach is invalid for your workflow.
what people are saying is that it doesn't easily support a truly distributed
workflow. this is a very different statement.

your workflow isn't truly distributed so bzr's model works well for you. no
problem, just don't claim that because you haven't run into any problems with
your workflow that there are no problems with bzr with other workflows.

David Lang
-
On Tue, Oct 24, 2006 at 08:58:56AM -0700 I heard the voice of
HUH? Why on earth do you think that?
To do this in a git data model, you point at 2 (or 3, or 4, or...)
revisions, anywhere in the revision-space universe. You derive back a
DAG of the history from each of them by recursing over parent links.
You figure out where (if anywhere) those DAG's intersect. And based
on that, you alter what and how you display; including or excluding
certain revs, changing the angles of lines or columnation of dots in a
graph, etc.

To do it in a bzr data model, you would follow *EXACTLY* the same

And it's one that carries around a lot of unstated assumptions about
what "truly distributed" means, which *I*'m certainly not
understanding, because any meaning I can apply to the term doesn't
lead me to the conclusions it does you. Certainly, depending on your
workflow, certain parts of the UI are of lesser utility than they are
in mine, down to and including zero. And it's probably certain that
some parts of the UI aren't up to handling various workflows, too,
including OUR workflow. That's kinda what "in development" means...

But that's a very different statement from the claim that they CAN'T
be without changes to the conceptual model underneath. Just because a
UI is built around maintaining the fiction of a mainline doesn't mean
the system requires it. All you'd have to do to abandon it is write a
different log formatter that didn't show revnos and didn't nest merge
commits, and change (or add an option to) 'merge' to fast-forward if
possible. The difference between the views on how the pieces should
fit together really IS just that fine.

--
Matthew Fuller (MF4839) | fullermd@over-yonder.net
Systems/Network Administrator | http://www.over-yonder.net/~fullermd/
On the Internet, nobody can hear you scream.
-
it sounded like you were saying that the way to get the slices of the DAG was to
use branches in bzr. to do this you need to create the branches with the correct
info on each branch. this is only practical if the branches are created as the
changes are made, if you try to do this after the fact you need to create the
changes in the branch before you do the slicing.

with git you can look at the DAG and pick any arbitrary points in it as points

the claim isn't that bzr can't be modified to support these other workflows (it
sounds as if just changing the tools to use the internal revid's rather than the
current revno's would come very close to solving this problem), it's that the
current revno's (use of which is strongly encouraged by the current UI) cannot
support some workflows, and therefore the claim that it supports fully
distributed workflows as well as git is false

remember that this entire thing started with a feature comparison checklist,
the definitions of some of the items on the checklist are being questioned.

after that there's the issue of if the VCS in question has the feature.
this discussion started with two topologies

1. Centralized: all commits must go to one repository, connectivity required to check-in
2. Distributed: everything else

since then one additional topology has been defined, and one has been redefined

1. Centralized: all commits must go to one repository, connectivity required to check-in
2. Star: one repository is 'special' or 'primary' and all other repositories
sync to this, but development can take place against local repositories,
connectivity is only required when syncing the repositories. as updates take
place the history is defined by the primary repository, and can overwrite or
change the history as defined by local repositories.

3. Distributed: all repositories are equal (any definition of 'primary' is a
matter of convention, not a requirement of the tool) development can take place
against local repositories, co...
On Tue, Oct 24, 2006 at 11:03:20AM -0700 I heard the voice of
I'm not entirely sure I understand what you mean here, but I think
you're saying "Nobody's written the code in bzr to show arbitrary

I think this statement arouses so much grumbling because (a) bzr does
support such a lot better than often seems implied, (b) where it
doesn't, the changes needed to do so are relatively minor (often
merely cosmetic), and (c) disagreement over whether some of the

I think there's a real intent for bzr TO support at least all common
topologies. I'll buy that current development has focused more on
[relatively] simple topologies than the more wildly complex ones. I
look forward to more addressing of the less common cases as the tool
matures, and I think a lot of this thread will be good material to
work with as that happens. It's just the suggestion that providing
fruit for simple topologies _necessarily_ prejudices against complex

That's a good enough reason for me. Before this thread, I wasn't
interested in using git. I'm still not, but now I understand much
better /why/ I'm not. And when (I'm sure it'll happen sooner or
later) some project I follow picks up using git, I'll have enough
grounding in the tool's mental model to work with it when I have to.

--
Matthew Fuller (MF4839) | fullermd@over-yonder.net
Systems/Network Administrator | http://www.over-yonder.net/~fullermd/
On the Internet, nobody can hear you scream.
-
I think we are talking past each other here.
what I think was said was

G 'one feature of git is that you can view arbitrary slices trivially'
B 'bzr can do this too, you just use branches to define the slices'
G 'but this limits you because branches are defined as code is developed, git
lets you define slices at viewing time'

by the way, I think it's more than just saying 'well, the code could be written
to do this in $VCS' some decisions and standard ways of doing things can impact
how hard it is to implement a feature, and some decisions can make it

one concern that the git people are voicing is that the things that work for
simple topologies (revno's) can't be used with the more complex ones (where you
need the revid's). especially the fact that users need to do things
significantly different when there are fairly subtle changes to the topology.

the scenario that came up elsewhere today where you have

      Master
      /    \
   dev1    dev2

and then dev1 and dev2 both start working on the same thing (without knowing
it), then discover they are working on the same thing. they now have three
options

1. merge their stuff up to the master so that they can both pull it down.
   but this puts broken, experimental stuff up in the master

2. declare one of the dev trees to be the master
   this changes the topology to

   Master--dev1--dev2

3. pull from each other frequently to keep in sync.
   this changes the topology to

      Master
      /    \
   dev1----dev2

if they do this with bzr then the revno's break, they each get extra commits
showing up (so they can never show the same history).

in git this is a non-issue, they can pull back and forth and the only new
history to show up will be changes.

this is the situation that the kernel developers are in frequently. it sounds as
if you haven't needed to do this yet, so you haven't encountered the problems.

David Lang
-
Since bzr branch is, and is ONLY, a pointer to a revision, I don't see
any design decision that would make this harder in bzr. The UI was only

The more I read this thread, the more I actually think bzr does support
distributed topology as well as git.

The whole difference is that bzr makes a distinction between the first
and other parents of a revision, while git does not. This distinction is
made in two places:

1. The log shows the first parent and then, as an indented subsection, the
   ancestry of other parents until the point where the ancestries meet
   again. This actually captures a pattern people usually use. When you
   merge, you usually put in the log something along the lines:

   "merged X, which bars and fixes foo."

   when you actually merge M, which you consider a "mainline" and
   therefore not worth mentioning, and X. Linus does it this way too --
   he actually posted a log message as an example, that showed exactly
   this.

2. Assigns revision aliases in this same order (except the "major"
   number for the subsection is based on the common ancestor, not on the
   merge point). They are not a special thing that is generated at commit
   time; they are inferred from the shape of the DAG (and cached for
   performance reasons).

And the only issue I think is, that the bzr UI and documentation push
forward these aliases (revnos) more than appropriate for the fully
distributed case and hide the real revision names (revids) too much for

That's a deficiency of merge not telling that a merge is pointless.
Actually I think that bzr merge *should* reduce to pull in all cases:

- If the common ancestor is on the leftmost path of the other branch,
then the existing revnos as seen on this branch will not change in any
case, only more than one is added. I think it's safe for merge to
reduce to pull in this case and consider it a bug in bzr that it does
not.
- If the common ancestor is not on the leftmost path on the other
branch, than it is because the branch was mer...
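
The point that revnos are inferred from the shape of the DAG, rather than stored at commit time, can be sketched in a few lines of Python. This is a simplified illustration, not bzr's algorithm: it numbers only the leftmost (first-parent) path and ignores the dotted numbers given to merged revisions, and the commit names are invented.

```python
# Toy DAG: commit -> list of parents, first (leftmost) parent first.
parents = {
    "A": [],
    "B": ["A"],        # committed by dev1
    "C": ["A"],        # committed by dev2
    "M1": ["B", "C"],  # dev1 merges dev2: B is the first parent
    "M2": ["C", "B"],  # dev2 merges dev1: C is the first parent
}


def mainline(tip):
    """Follow first parents from tip back to the initial commit, oldest first."""
    path = []
    commit = tip
    while commit is not None:
        path.append(commit)
        commit = parents[commit][0] if parents[commit] else None
    return list(reversed(path))


def revnos(tip):
    """Map each revision on the leftmost path to its 1-based position."""
    return {rev: n for n, rev in enumerate(mainline(tip), start=1)}


print(revnos("M1"))  # {'A': 1, 'B': 2, 'M1': 3}
print(revnos("M2"))  # {'A': 1, 'C': 2, 'M2': 3}
```

Both developers end up holding the same set of changes, yet "revno 2" means B for one and C for the other. That is why a revno is only meaningful together with the branch it was computed from.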
On Wed, Oct 25, 2006 at 03:40:00PM -0700 I heard the voice of
Ah. This is more like "bzr [mostly] only does this now in terms of a
single branch (or some point back along it)". The slices that go
between branches are very limited ('missing' gives you one view;
'branch:' and 'ancestor:' revision specifications give you another).
bzrk/'visualize' gives an interface similar to gitk, but also only in
the context of a single branch/head looking backward through its
previous tree AFAIK. Any random DAG-slicing of what you have in the
revision store can be done, somebody would just have to write the code
for it. Nothing about 'the workflow preserves parents' would make
that any harder than writing the code for git was.

Much of this is probably a result of the 'branch'-centric (rather than
'repository'-centric) view of the world; similarly to the fact that
branches are referred to by location (local ../otherbranch, or remote
http/sftp/etc) rather than by a name. This is one of the bits of bzr

These two are either/or, not and; either they pull (in which case
their old mainline is no longer meaningful), or they merge (in which

In git, this is a non-issue because you don't get to CHOOSE which way
to work. You always (if you can) pull and obliterate your local
mainline. In bzr, it's only an 'issue' because you CAN choose, and
CAN maintain your local mainline. You CAN choose, right now, to do as
git does, pulling back and forth with only new history showing up as
changes, by creating a 'bzr-pull' shell script that does a
'bzr pull || bzr merge' (though you'd be a lot better off adding a
'--fast-forward-if-you-can' option to merge and aliasing that over).

More basically, though, I don't think that "histories become exactly
equivalent" is a necessary pass-word to enter the Hallowed City of
Truly Distributed Development. And I certainly see no reason to
believe we'll agree on it this time any more than We (in broad) have
the last 6 times it came up in the thread.

--
Matthew Fuller ...
Yes they do. They can (and in this case probably will) create a
topic-branch named "the-other-dev/featureX" and keep it solely for
tracking the other peers changes, keeping their own topic-branch for
their own changes, and another branch where they merge both changes in,
or cherry-pick from each branch to get to the desired result fast. This
works easily because in git
a) branches are as cheap as I can ever imagine an SCM making them.
b) the "slice the DAG and view anything you like from any branch you
like any time you like and mix them however you want" approach of the
visualizers makes it trivial for a 10-year old fledgling programmer to
see what changes what, and where, and by whom, and why.

The "b" above was a feature I didn't know I needed until it became
available to me. Thanks to Paul Mackerras (spelling?) for creating the
wonderful gitk tool, and to Marco Costalba for making a faster and, imo,

Git puts emphasis on code. Bazaar puts emphasis on developers and
branch-structure. Depending on your preference, I imagine one suits
some people better. I really, really, really don't care if my branch-tip
gets moved because I hadn't made any changes to it while the other dev
hacked away or if it causes a merge because we had decided to work on
different parts of the feature. Perhaps this is a result of the insanely
good visualizers (kudos again to Paul and Marco) that easily lets me see
who did what when and where anyways. What I *do* care about is being
able to easily make sure all the devs have the same code to work and

The only issue I have with bzr's revno's and truly distributed setup is
that, by looking at the table, it seems to claim that you have found
some miraculous way to make revnos work without a central server. Since
everyone agrees that they don't, this should IMO be listed as mutually
exclusive features.

On a side-note, git has made my life easier, so I childishly want to
defend it and see it on top of every list in the world. Someth...
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

The "simple namespace" is both a URL and a revno.
And therefore, it's just as distributed and decentralized as the web.
There is very little difference between this:
http://example.com/mywebpage#5
And this:
In fact, we've been planning to unify them into one identifier.
Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFQLxr0F+nu1YWqI0RAiVrAJ9rb+uylIuxqMo2VMelI3Qm6oNQOwCfeTAb
kOkp9kOkRl1YEVEP+G3y2SU=
=Zgsg
-----END PGP SIGNATURE-----
-
On Thu, Oct 26, 2006 at 12:13:39PM +0200 I heard the voice of
Not where I was going with that section of the mail; I was looking at
just the merge vs fast-forward distinction. In git, you don't get to
choose; in bzr you do.

--
Matthew Fuller (MF4839) | fullermd@over-yonder.net
Systems/Network Administrator | http://www.over-yonder.net/~fullermd/
On the Internet, nobody can hear you scream.
-
Haha, I feel the same way about bzr. Some of the features that bazaar
has, such as how it preserves the leftmost parent and treats that
specially in some cases, are things that I REALLY love and don't want
to live without.

All in all, I feel that git and bazaar are both excellent products,
what will happen in the future will be interesting to see.

/Erik
--
google talk/jabber. zindar@gmail.com
SIP-phones: sip:erik_bagfors@gizmoproject.com
sip:17476714687@proxy01.sipphone.com
-
on many modern VCS systems it's O(n) on the number of changes (start from where
you are and apply the patch to change it to rev -1, then apply the patch to
change it to rev -2, etc)

on git it's O(1) (write the new files into place)
David Lang
-
one thing you are missing: 'mainline' in this git command is not saying
'everything that's in the 'main' published branch'. it's saying 'everything
reachable by the tag 'mainline''

so when you branched off for your feature development you could set a tag that
says 'branchpoint' and no matter what gets merged in mainline after that you can
always do branchpoint..featureA and find what you've done.

that being said, mainline..featureA is also extremely useful, it tells you what
development stuff you have done that has not yet been merged into mainline

David Lang
-
Or you can use --no-commit option to git pull, and commit later.
But it is true that you can always amend the commit with

If I remember correctly Linus argued against it, because branch
name is something local to repository (most common example is
"my 'master' is your 'origin'").

There was a proposal for a "note" header for notes like merge algorithm
used, or branch name, visible only in 'raw' mode, but it wasn't

That's true. That is what history viewers (gitk, qgit, tig,
gitview, git-show-branch, git-browser) are for.

And there is always reflog (if you enable it, of course).
--
Jakub Narebski
Poland
-
Thanks for this mail, this makes me happy to see. The tools are pretty
much the same but have some different views on how to do things.

If I understand you correctly, you'll get the same thing with "bzr missing".
$ bzr missing ../mainline/
You have 1 extra revision(s):
------------------------------------------------------------
revno: 2
committer: Erik B
Erik B
Carl Worth wrote:
I think I haven't properly explained what "feature branch" means.
A "feature branch" is a short (or medium) lived branch, created for
the development of one isolated feature. When the feature is in a stable
stage, we merge the feature branch and forget about it. We are not
interested in the fact that a given feature was developed on a given
branch. BTW, in the published git.git repository, for example, feature
branches are only available in the form of the "digest" 'pu' (proposed
updates) branch.

I guess what you are talking about are long lived "development
branches" (like git.git 'maint', 'master', 'next' and 'pu' branches),
or perhaps a long lived clone of a given git repository by another user.

Git considers clones of a given repository totally equivalent,
and considers having the fast-forward property more important than
remembering "which branch (which clone) did this commit come from" or at
least "this commit is from this (current) branch-clone".

You have graphical history viewers (bzr has its own: bzr-gtk),
committer and author info, and reflog if enabled if you really,

Which if I remember correctly (at least by default) needs and generates

As was clarified during this long discussion, bzr "branches" are
something between git branches and one-branch [local] clones.
Can you for example create a branch starting from an arbitrary revision,
not only the tip of a branch?

The above sequence of operations can be done in (at least) two different
ways in git.

Less used:
$ cd /somewhere/else
$ git clone -l -s <mainrepo>/.git featureA
$ cd featureA
$ hack; hack; git commit -a -m "f1"; hack; hack; git commit -a -m "f2"; etc
$ cd <mainrepo>
$ git pull /somewhere/else/featureA/.git
(this does commit and merge)

But the more commonly used way is:
$ git branch featureA mainline
$ git checkout featureA
$ hack; hack; git commit -a -m "f1"; hack; hack; git commit -a -m "f2"; etc
$ git checkout mainline
$ git pull . featureA
The automatic merge message takes care of this, if we enable
merge.summary config option. For example...
On Sat, 21 Oct 2006 16:05:18 -0400
Of course it works as long as you accept the implicit requirements of
supporting them and ignore the cases where they change out from
underneath the user. But as soon as users want to embrace distributive
models where there isn't a central shared repo, at best revno's are
unhelpful and at worst they are counterproductive. The proof of this
is that if revno's were sufficient bzr wouldn't need revid's.

Since the utility provided by revno's seems so minimal even in the
case where they do work, Git simply doesn't bother with them. And
"our" experience is that Git really does work well without them.

Sean
-
Yes. This really is what it boils down to.
The _only_ time you actually use revision numbers (as opposed to
branch-names or tag-names) is when you want a _stable_ number.

It's that simple. You never really need a revision number otherwise. In
other situations, you do things like

    git log --since=2.days.ago
    gitk v2.6.18..
    git diff --stat --summary ORIG_HEAD..

or whatever. It's clearly not "stable", but it's also clearly not a
revision number from a UI perspective.

When you want a revision number is _exactly_ when you're moving things
between branches, or reporting a bug to somebody else, or similar. And
that's also _exactly_ when you want the number to be stable and meaningful
(ie the other end should be able to rely on the number).

And if you need to refer to a central repository to do that, it's clearly not
distributed. Not needing such a central reference point is what the word
"distributed" _means_ in computer science for chrissake!

Linus
-
No, there is no such thing as a local ordering of revisions.
Each revision (commit) has a link to its parent(s). A branch technically
is just a reference to a particular commit object. The commit itself
gives us a sub-DAG of the DAG of the whole history: the DAG of all parents of
said commit. Such a lineage of the commit pointed to by a branch is conceptually
a branch; i.e. a branch is a DAG of development (not a line of development,
as there is no special meaning of the first parent).

You can also have (in a git repository) a reflog, which records values
of branch-as-reference, or the branch tip of branch-as-named-lineage.
But for example fetch and fast-forward 5 commits in history is

Erm, wasn't the revno to revid mapping also part of a bzr "branch"?

We store configuration per repository, not per branch, although
there is some branch specific configuration.

Workingtree:
~/

Gaah, it's even more inconvenient. Certainly more than using name

Is there a command to list all branches in bzr? Is there a command

That's the opposite of the git view. In git, the working area is associated with
a repository (a clone of a repository), not a branch. We copy whole repositories

Which shells? If I understand it, '^' was chosen (for example as the
NOT operator to specify a sub-DAG, instead of '!') because it causes no problems
with shell expansion. And considering that many git commands are/were
written in shell, one certainly would notice that.

--
Jakub Narebski
Poland
-
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

It's not part of the conceptual model. The revno-to-revid mapping is
done using the DAG. The branch just tracks the head.

The .bzr/branch/revision-history file is from an earlier model in which
branches had a local ordering. Nowadays, it can be treated as:
- a reference to the head revision

The notation was that ~/repo would contain the .git directory for the

Of course if you have a copy of bzr.dev on your computer, you don't need
to type the full URL. It's just like the 'merge ../b' above.

But how can you use the branch name of a branch that isn't on your
computer? I suspect git requires a separate 'clone' step to get it onto

Sorry, it's been quite a long time since people complained at me for
using ^, so I don't remember. Perhaps Edgar is right about it being the
pipe character in old shells.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD4DBQFFOq+80F+nu1YWqI0RAp/KAJ9Bw1q9/nd3gUAjcX3c+24aoEifeQCYlbD0
tUZ01ra11vkQ7V3RzarXeg==
=oFIC
-----END PGP SIGNATURE-----
-
In git the DAG is a DAG of parents. There are no "child" links. So it is natural
to refer to the n-th ancestor of a given commit (in git <ref>~<n>, in bzr -<m>).

To have incrementing (from 1 for the first revision on a given branch) revision
numbers you either have to have links to "children", which automatically
means that revisions cannot be immutable if you want to allow branching at
an arbitrary revision, or to traverse the DAG there and back again (perhaps
with a cache of the revno-to-revid mapping to help performance).

Additionally, to have incrementing revision numbers you have to remember
which part of the DAG is our branch, i.e. which parent of a merge to choose to follow.
Bazaar-NG decides here to distinguish the first parent; to have the first parent
immutable it doesn't use fast-forward and always uses merge, sometimes

The default layout of a "clothed" repository is

Repository:
~/repo/.git/

Branches:
~/repo/.git/refs/heads/

Workingtree:
No, as was said in other messages in this thread, you can fetch
a branch (or branches), even from a repository other than the one you cloned
from, into a given branch (or branches). For git it would be
$ git fetch <URL> <remotebranch>:<localbranch>
You would probably want to save the above info in a remotes file or in the config.
For cg (Cogito) it would be
$ cg branch-add <localbranch> <URL>#<remotebranch>
$ cg fetch <localbranch>

In git you always use names like 'master', 'next', 'HEAD' (meaning the current
branch) and also HEAD^, next~5 when comparing branches, viewing history,
merging branches, switching to a branch etc. Not '../master'...

--
Jakub Narebski
Poland
-
No. You can merge a branch from a remote repository in a single step:
git pull http://example.com/git/repo branch-of-interest
But if you want to do something besides (or before) a merge, (for
example, just explore its history, do some diffs etc.) then you would
fetch it instead, assigning it a local branch name in the process:
git fetch http://example.com/git/repo branch-of-interest:local-name
After which "local-name" is all one would need to use. So after a
fetch like the above, the equivalent of "bzr missing --theirs-only"
would be:
git log ..local-name
[This shows some of the expressive power of git revision
specifications. There's no need for a separate "missing" command. It's
just one case of viewing a particular subset of the DAG. And the
specification language makes almost all interesting subsets easy. The
--mine-only specification would be "local-name.."]

And beyond what bzr missing does (I believe), it's easy to also see the
patch content of each commit with:
git log -p ..local-name
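
The "..local-name" notation above can be read as a set difference over the
commit DAG. The following is only a minimal sketch under that reading, not
git's implementation: the commit names and the PARENTS table are made up for
illustration, and `ancestors` and `rev_range` are hypothetical helpers.

```python
# Toy commit DAG: each commit lists its parents (git stores only parent links).
# All commit names here are invented for the example.
PARENTS = {
    "a": [],
    "b": ["a"],
    "c": ["b"],   # tip of the current branch (HEAD)
    "x": ["b"],
    "y": ["x"],   # tip of the fetched branch "local-name"
}


def ancestors(commit, parents=PARENTS):
    """Return every commit reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(parents[cur])
    return seen


def rev_range(exclude, include, parents=PARENTS):
    """Commits reachable from `include` but not from `exclude`."""
    return ancestors(include, parents) - ancestors(exclude, parents)


theirs_only = rev_range("c", "y")  # the idea behind `git log ..local-name`
mine_only = rev_range("y", "c")    # the idea behind `git log local-name..`
print(sorted(theirs_only), sorted(mine_only))
```

Seen this way, "missing --theirs-only" and "--mine-only" are the same
operation with the two tips swapped.
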
And then if everything is happy, one could merge that branch in:
git pull . local-name
(And, yes, it is the case that "pull" with a repository URL of "." is
how merging is done. It's bizarre to me that this is not "git merge
local-name" instead. There actually _is_ a "git merge" command that
could be used here, but it is somewhat awkward to use, (requiring both
a commit message (without the -m of git-commit(!)) and an explicit
mention of the current branch). So using it would be something like:
git merge "merge of local-name" HEAD local-name
I've never claimed that git is completely free of its UI
warts---though there are fewer now than when I started using it.)

But, yes, the notion in git is to bring things in to the current
repository and then work with them locally. This has an advantage that
network traffic is spent only once if doing multiple operations, (say
the three steps shown above: 1) investigate commit messages, 2)
investigate patch content, 3) pe...
In the traditional Bourne shell ^ is an alias for the pipe symbol |.
Ciao, ET.
-
On Fri, Oct 20, 2006 at 02:48:52PM -0700 I heard the voice of
I think we're getting into scratched-record-mode on this.
Git: Revnos aren't globally unique or persistent.
Bzr: Yes, we know.
G: Therefore they're useless.
B: No, they're very useful in [situation] and [situation], and we deal
   with [situation] all the time, and they work great for that.
G: But they fall apart totally in [situation].
B: Yes, so use revids there.
G: So use revids everywhere.
B: Revnos are handier tools for [situation] and [situation] for
   [reason] and [reason].

*brrrrrrrrrrrrrrrrip!!!*  *skip back to start*

I'm not sure there's any unturned stone left along this line, so I'm
not sure how productive it really is to keep walking down it. So, to
make something productive of it, I'm going to put it onto my todo list
to spend some time with bzr trying to use revids for stuff. I'm
fairly certain that, due to the bzr cultural tendency to use revnos
where possible, there are some rough edges in the UI when using revids
that should be filed down (though I think it much less likely to turn

I think it's more accurately describable as a branch-identity bias.
The git claim seems to be that the two statements are identical, but I

The term is somewhat overloaded, which is why it's causing you trouble
(and did me). It refers both to the conceptual entity ("a line of
development" roughly, much like what 'branch' means in git and VCS in
general), and to the physical location (directory, URL) where that
branch is stored, and where it'll often have a working tree. Branches

Then all branches stored under that 'bzrtest' dir will use the
bzrtest/.bzr/ dir for storing the revisions, and shared revisions will
only exist once, saving the space/time for multiple copies.

Probably, you'd actually want 'init-repo --trees' in this case,
because repos default to being [working]tree-less. In a tree-less
setup, you'd create a [lightweight] checkout of the branch(es) you
wanted to work on els...
This is new to me. At work, we merge our toy repositories back and forth
between devs only. There is no central repo at all. Does this mean that
each merge would add one extra commit per time the one I'm merging with
has merged with me?

--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
-
Two things differ in bzr and git, here:

* bzr doesn't do "autocommit" after a merge. So, new revisions are
  created only if you use "commit".

* bzr has two commands, "pull" and "merge". "pull" just does what the
  git people call "fast-forward", and only this (it refuses to do
  anything if the branches diverged). In particular, you never have to
  commit after a pull (well, except if you had some local, uncommitted
  changes). "merge" changes your working directory, and you have to
  commit after. "merge" will never do fast-forward, it will never
  change the revision to which your working tree refers, and it's
  your option to commit or not after (if you see that it introduces no
  changes, you might not want to commit).

The final rule in bzr would be "you create an extra commit each time
you commit" ;-).

As a side-note, it could be interesting to have a git-like merge
command (choosing automatically between merge and pull), probably not
in the core, but as a plugin.

--
Matthieu
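
The "git-like merge" idea above boils down to one ancestry test. What follows
is a rough sketch of that decision only, assuming a toy parent table; it does
not use any bzr or git API, and `plan_merge` and the revision names are
hypothetical.

```python
def ancestors(commit, parents):
    """Return every revision reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(parents[cur])
    return seen


def plan_merge(mine, theirs, parents):
    """Decide what a combined pull-or-merge command would have to do."""
    if theirs in ancestors(mine, parents):
        return "up-to-date"      # nothing new on their side
    if mine in ancestors(theirs, parents):
        return "fast-forward"    # what the mail calls "pull"
    return "merge"               # diverged: merge, then commit


# Invented history: r1 <- r2 <- r3 on one side, r1 <- s2 on the other.
parents = {"r1": [], "r2": ["r1"], "r3": ["r2"], "s2": ["r1"]}

print(plan_merge("r2", "r3", parents))
print(plan_merge("r3", "r2", parents))
print(plan_merge("r3", "s2", parents))
```

A plugin along these lines would call the existing "pull" in the fast-forward
case and "merge" otherwise, leaving the commit step to the user as bzr does
today.
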
-
From what I understand, "bzr merge" will create one extra commit to
preserve the "first parent is my branch" feature. "bzr pull" will do a
fast-forward if your DAG is a proper subset of the pulled branch/repository
DAG, but at the cost that it would change your revno to revision mapping
to that of the pulled repository.

That's a consequence of preserving a branch as "my work", i.e. as a path
through the "branch DAG" in the DAG, using the first parent as special, instead
of saving it outside the DAG.

--
Jakub Narebski
Poland
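
To make the point about the revno mapping concrete, here is a small sketch
under a deliberately simplified model: revnos are assigned by walking the
first-parent chain back from the tip. This is an assumption made for
illustration, not bzr's actual numbering code, and `revno_map` and the
revision names are hypothetical.

```python
def revno_map(tip, parents):
    """Number revisions along the first-parent chain; 1 is the oldest."""
    chain = []
    cur = tip
    while cur is not None:
        chain.append(cur)
        ps = parents[cur]
        cur = ps[0] if ps else None   # follow only the first parent
    chain.reverse()
    return {number: rev for number, rev in enumerate(chain, start=1)}


parents = {
    "r1": [],
    "m1": ["r1"],         # my commit on top of r1
    "t1": ["r1"],         # their commit on top of r1
    "t2": ["t1", "m1"],   # their merge of my work; their line is first parent
}

before = revno_map("m1", parents)  # my branch before pulling
after = revno_map("t2", parents)   # my branch after fast-forwarding to t2
print(before)
print(after)
```

In this toy model revno 2 names my commit before the pull and their commit
after it, which is the cost described above.
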
-
Actually, "bzr merge" does not create any commits on the branch -- you
need to run "bzr commit" afterwards (possibly after resolving
conflicts). The control files for the working tree record a pending
merge, which gets recorded when you get round to the commit.

So you can easily check if there were any tree changes resulting from the merge.
If there aren't, or you made the merge by mistake, you can make a call
to "bzr revert" to clean things up without ever having created a new
revision.

James.
-
One result of this approach is that developers of different trees
don't necessarily have common revision IDs to compare. Imagine a
question like:

    When you ran that test did you have the same code I've got?

In git, the answer would be determined by comparing revision IDs.
In bzr, the only answer I'm hearing is attempting a merge to see if it
introduces any changes. (I'm deliberately avoiding "pull" since we're
talking about distributed cases here).

And to comment on something mentioned earlier in the thread, there's
no need for "wildly complex" distributed scenarios. All of these
issues are present with developers working together as peers, (and
each considering their own repository as canonical).

A harder question (for bzr) is:

    Do you have all of the history I've got?

(The problem being that when one developer is missing some history and
merges it in, she necessarily creates new history, so there's never a
stable point for both sides to agree on.)

-Carl
Can you really just rely on equal revision IDs meaning you have the
same code though?

Let's say that I clone your git repository, and then we both merge the
same diverged branch. Will our head revision IDs match? From a quick
look at the logs of cairo, it seems that the commits generated for
such a merge include the date and author, so the two commits would
have different SHA1 sums (and hence different revision IDs).

So I'd have a revision you don't have and vice versa, even though the

Or run "bzr missing". If the sole missing revision is a merge (and
not the revisions introduced by the merge), you could assume that you

Why does it matter if they create a new revision? They can still tell
if they've got all the history you had.

James.
-
Yes. Because each commit contains parent revision id's, which in turn
contain *their* parent revision id's, which in turn..., you know you
have exactly the same revision, code, and history leading up to that
revision. You may have other revisions on top or on other branches, but
all commits, including merge-points and whatnot, leading to that

Merges preserve author and commit info. You may need to create a new
branch (a git branch, the cheap kind which is a 41-byte file) and fetch
"his" into "yours". This will be very cheap if you both have the same
code but not the same history, as everything but a few commit-objects
will be shared. A more likely scenario though is this:

Bob writes a feature that doesn't work as per spec. He doesn't know why.
He asks Alice to have a look, so he communicates the commits to her by
"please pull this branch from here", or by sending patches and telling
Alice the branch-point revision to apply them to.
Alice creates the "bobs-bugs/nr1232" branch at the branch-point and fetches
Bob's branch into that or applies the patches on top of that (in the
fetch scenario she wouldn't need to know the branch point, since git
would figure this out for her).
She knows this should create a revision named 00123989aaddeddad39, so if
it doesn't, she doesn't have the same code.

I imagine this works roughly the same in bazaar, although the original
case where tests have already been done and the testers wanted to know

"assume" != "know", or was that just sloppy phrasing?
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
-
If you two have the same commit that is a guarantee that you two
have identical trees. The reverse is not true as logic 101
would teach ;-).

Doing fast-forward instead of doing "useless" merges helps
somewhat but not in cases like two people merging the same
branches the same way or two people applying the same patch on
top of the same commit. You need to compare tree object IDs for

Is it "you could assume" or "it is guaranteed"? If the former, what
kind of corner cases could invalidate that assumption?

-
That was the point I was trying to make. Carl asserted that in git
you could tell if you had the same tree as someone else based on
revision IDs, which doesn't seem to be the case all the time.

The reverse assertion (that if you have the same revision ID, you have

Sure, you can do the same in Bazaar by comparing the inventories for

The merge revision will also include any manual conflict resolution.
If the other person resolved the conflicts differently.

James.
-
If you have the same revision (commit IDs), you have the same tree (at
the same time, by the same committer, etc).

If you have a different revision (commit), you may or may not have the
same tree. You can then check the tree id, which will either be the same
(you have the same tree) or differ (you don't).

Thus, in the converse, if you have the same tree, you _will_ have the
same tree id. You may or may not have the same commit id.

-Peff
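
The commit-id versus tree-id distinction can be reproduced with a few lines of
hashing. This is a minimal sketch, assuming git's loose-object naming (SHA-1
over a "<type> <size>" header, a NUL byte, then the body) and a flat directory
of files with ASCII names; the people, timestamps and file are invented, and
the helper names are hypothetical.

```python
import hashlib


def git_object_id(kind, body):
    """SHA-1 over '<kind> <size>', a NUL byte, then the body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def blob_id(content):
    return git_object_id("blob", content)


def tree_id(entries):
    """entries: (mode, name, hex_id) tuples for a flat directory of files."""
    body = b""
    for mode, name, hex_id in sorted(entries, key=lambda e: e[1]):
        body += f"{mode} {name}".encode() + b"\0" + bytes.fromhex(hex_id)
    return git_object_id("tree", body)


def commit_id(tree, parents, author, committer, message):
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {committer}", "", message]
    return git_object_id("commit", ("\n".join(lines) + "\n").encode())


# Invented identities and timestamps, for illustration only.
alice = "Alice <alice@example.com> 1161858000 +0000"
bob = "Bob <bob@example.com> 1161859000 +0000"

root = commit_id(tree_id([]), [], alice, alice, "initial")
tree = tree_id([("100644", "hello.txt", blob_id(b"hello\n"))])

# Both apply the same change on top of the same parent.
alice_commit = commit_id(tree, [root], alice, alice, "add hello")
bob_commit = commit_id(tree, [root], alice, bob, "add hello")

print(alice_commit == bob_commit)  # the commit ids differ
print(tree)                        # but both commits name this one tree
```

The two commits differ only in the committer line, which is enough to change
the commit id, while the tree id they both record is identical.
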
-
>>>>> "Jeff" == Jeff King <peff@peff.net> writes:
Jeff> On Thu, Oct 26, 2006 at 05:57:20PM +0800, James Henstridge wrote:
>> >If you two have the same commit that is a guarantee that you two
>> >have identical trees. The reverse is not true as logic 101
>> >would teach ;-).
>>
>> That was the point I was trying to make. Carl asserted that in git
>> you could tell if you had the same tree as someone else based on
>> revision IDs, which doesn't seem to be the case all the time.

Jeff> If you have the same revision (commit IDs), you have
Jeff> the same tree (at the same time, by the same committer,
Jeff> etc).

Jeff> If you have a different revision (commit), you may or
Jeff> may not have the same tree. You can then check the tree
Jeff> id, which will either be the same (you have the same
Jeff> tree) or differ (you don't).

Jeff> Thus, in the converse, if you have the same tree, you
Jeff> _will_ have the same tree id. You may or may not have
Jeff> the same commit id.

Ok, so git makes a distinction between the commit (code created by
someone) and the tree (code only).

Commits are defined by their parents.
Trees are defined by their content only?
If that's the case, how do you proceed?
Calculate a sha1 representing the content (or the content of the
diff from parent) of all the files and dirs in the tree? Or
from the sha1s of the files and dirs themselves, recursively based
on sha1s of the files and dirs they contain?

I ask because the latter seems to provide some nice effects
similar to what makes BDD
(http://en.wikipedia.org/wiki/Binary_decision_diagram) so
efficient: you can compare graphs of any complexity or size in
O(1) by just comparing their signatures.

Vincent
-
Commits are defined by a _combination_ of:
- the tree they commit (which is recursive, so the commit name indirectly
includes information about EVERY SINGLE BIT in the whole tree, in every
single file)
- the parent(s) if any (which is also recursive, so the commit name
indirectly includes information about EVERY SINGLE BIT in not just the
current tree, but every tree in the history, and every commit that is
reachable from it)
- the author, committer, and dates of each (and committer is actually
very often different from author)
- the actual commit message

So a commit really names - uniquely and authoritatively - not just the

Where "contents" does include names and permissions/types (eg execute bit

If you compare the commit name, and they are equal, you automatically know
- the trees are 100% identical
- the histories are 100% identical

If you only care about the actual tree, you compare the tree name for
equality, ie you can do

    git-rev-parse commit1^{tree} commit2^{tree}

and compare the two: if and only if they are equal are the actual contents

This is exactly what git does. You can compare entire trees (and
subdirectories are just other trees) by just comparing 20 bytes of
information.

How do you think we can do a diff between two arbitrary kernel revisions
so fast? Why do you think we can afford to do a

    git log drivers/usb include/linux/usb*

that literally picks out the history (by comparing state) for every commit
in the tree?

I can do the above log-generation in less than ten _seconds_ for the last
year and a half of the kernel. That's 20k+ lines of logs of commits that
only touch those files and directories. And I _need_ it to be fast,
because that's literally one of the most common operations I do.

And the reason it's fast is that we can compare 20,000 files (names,
contents, permissions) by just comparing a _single_ 20-byte SHA1.

In git, revision names (and _everything_ has a revision name: commits,
t...
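
The recursive naming described in this message is what lets a comparison skip
whole subdirectories. Below is a toy sketch of that idea only: the hash format
is simplified and is not git's object format, the directory contents are
invented, and `tree_hash` and `changed_paths` are hypothetical helpers. Real
git stores the ids, so equality is a lookup rather than a recomputation.

```python
import hashlib


def tree_hash(node):
    """Hash a file (bytes) or a directory (dict of name -> node) recursively."""
    if isinstance(node, bytes):
        return hashlib.sha1(b"blob\0" + node).hexdigest()
    h = hashlib.sha1(b"tree\0")
    for name in sorted(node):
        child = tree_hash(node[name])
        h.update(name.encode() + b"\0" + child.encode() + b"\n")
    return h.hexdigest()


def changed_paths(a, b, prefix=""):
    """Yield differing file paths, skipping any subtree whose hashes match."""
    if tree_hash(a) == tree_hash(b):
        return                       # identical subtree: never descend
    if isinstance(a, bytes) or isinstance(b, bytes):
        yield prefix.rstrip("/")
        return
    for name in sorted(set(a) | set(b)):
        if name not in a or name not in b:
            yield prefix + name
        else:
            yield from changed_paths(a[name], b[name], prefix + name + "/")


# Two invented snapshots that differ in a single file.
old = {"drivers": {"usb": {"core.c": b"v1"}, "net": {"eth.c": b"v1"}},
       "README": b"hi"}
new = {"drivers": {"usb": {"core.c": b"v2"}, "net": {"eth.c": b"v1"}},
       "README": b"hi"}

print(list(changed_paths(old, new)))
```

Only the hashes on the path from the root to the changed file differ, so the
untouched "net" directory is dismissed with a single comparison.
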
Hello all,
Following the very interesting debate about the differences between bzr
and git, I thought it was about time I tried to learn properly about git
and how to use it. I've been using bzr for a good while now, although
since I'm not a serious developer I only use it for simple purposes,
keeping track of code I write on my own for academic projects.

So, a few questions about differences I don't understand...

First off a really dumb one: how do I identify myself to git, i.e. give
it a name and email address? Currently it uses my system identity,
My Name <username@computer.(none)>. I haven't found any equivalent of
the bzr whoami command.

Now to more serious business. One of the main operational differences I
see as a new user is that bzr defaults to setting up branches in
different locations, whereas git by default creates a repository where
branches are different versions of the directory contents and switching
branches *changes* the directory contents. bzr branch seems to be
closer to git-clone than git-branch (N.B. I have never used bzr repos so
might not be making a fair comparison).

With this in mind, is there any significance to the "master" branch (is
it intended e.g. to indicate a git repository's "stable" version
according to the owner?), or is this just a convenient default name?
Could I delete or rename it? Using bzr I would normally give the
central branch(*) the name of the project.

(* Central or main on my own system. Not intended to be central in the
sense of a CVS-style version control setup:-)

Any other useful comments that can be made to a bzr user about working
with this difference, positive or negative aspects of it?

Next question ... one of the reasons I started seriously thinking about
git was that in the VCS comparison discussion, it was noted that git is
a lot more flexible than bzr in terms of how it can track data (e.g. the
git pickaxe command, although I understand that's not in the released
version [1.4.4.1] yet?). A fru...
I also have a basic question about git regarding its content tracking
and merging.

Does this mean if I have, for example, a large C++ file with a bunch of
methods in it and I move one of the methods from the bottom of the file
to the top and in another branch someone makes a change to that method
that when I merge their changes git will merge their changes into the
method at the top of the file where I have moved it?

If so that would be really quite impressive!

Cheers,
Nick
-
Right now (and in the near future), nope. "git blame" will track the
changes (so the pure movement wasn't just an addition of new code, but
you'll see it track it all the way down to the original), but "git merge"
is still file-based.

In other words, "git merge" does use a data similarity analysis that
could be used for smaller chunks than a whole file, but at least for now
it does it on a file granularity only (and then passes it off to the
standard RCS three-way merge on a file-by-file basis).

That said, if the movement happens _within_ a file, then just about any
SCM could do what you ask for, by just using something smarter than the
standard 3-way merge. So that part isn't even about tracking data across
files - it's just about a per-file merge strategy.

The "track data, not files" thing becomes more interesting when you factor
out a file into two or more files, and can continue to merge across such a
code re-filing event. Git can do it for "annotate", but doesn't do it for

Indeed, and it's one of the potential future goals that was discussed very
early in the git design phase. The point of _not_ doing file ID tracking
is exactly that you can actually do better than that by just tracking the
data.

So some day, we may do it. And not just within one file, but even between
files. Because file renames really is just a very specific special case of
data movement, and I don't think it's even the most common case.

That said, there are several reasons why you might not actually _ever_
want it in practice, and why I say "potential future goal" and "we may do
it". I think this is going to be both a matter of not just writing the
code (which we haven't done), but also deciding if it's really worth it.

Because merges are things where you may not want too much smarts:
- Quite often, a failed merge that needs manual fixup may even be
_preferable_ to a successful merge that did the merge "technically
correctly", but in an unexpected way.

- ...
Hi,
As for now, no, it does not. This is a shortcoming of RCS merge which does
the heavy-lifting.

Having said that, stay tuned for new developments: the functionality of
merge is being integrated in git. This opens the door to make use of the
code tracking support in git, to do exactly what you just proposed.

Ciao,
Dscho
-
usage: bzr annotate FILENAME
aliases: ann, blame, praise

Show the origin of each line in a file.

/Erik
-
Depending on whether you like editing config files by hand or not, you
would either just edit your ~/.gitconfig file and add a section like:

    [user]
        name = My Name Goes Here
        email = myemail@work.com

or you would use "git repo-config" to do it for you. Personally, I find it
easier to just edit the .gitconfig file directly, since the config file
syntax is actually rather pleasant, but if you want to do it with a git
command, you'd do

    git repo-config --global user.name "Joseph Wakeling"
    git repo-config --global user.email joseph.wakeling@webdrake.net

(where the "--global" just tells repo-config to use the user-global
~/.gitconfig file - you can also do this on a per-repository basis in the
repository .git/config file if you want to have different identities for

You can do either, it's almost purely a matter of taste.
Using a local branch and switching between them in place has some
advantages once you get used to it: most notably you can trivially use git
commands that work on data from different branches at the same time. So
with that kind of setup it's very natural to do things like "show me
everything that is in branch 'x', but _not_ in branch 'y'", and once you
get used to that, you really appreciate it.

But at the same time, if you want to actually keep several branches
checked out at the same time, and prefer to work on them that way, just
use "git clone" to create the other branch instead. It really is just a
matter of taste.

I suspect that most people tend to end up using the "multiple branches in
the same directory and switching between them" approach after a time, but
that's really just an unsubstantiated feeling, and it certainly isn't

It's just a convenient default name, and it has no real meaning otherwise.
Feel free to rename it any way you want (just make sure to edit HEAD to

There should be no difference, although since everybody seems to use
"master" by default, the documentation is probably geared towards it, and
...
Thanks to everyone for your very detailed responses. :-)
On the subject of blame and pulling patches from unrelated branches,
So ... if I understand correctly, I can get patches from somewhere else,
but in the branch history, I will not be able to tell the difference
from having simply newly created them?

With regards to git blame/pickaxe/annotate, the idea of tracking *code*
rather than files was one thing that really excited me when I read about
it in the earlier discussion, and is probably the main reason I'm trying
out git. I'd like to understand this properly so is there a simple
exercise I can do to demonstrate its capabilities? I tried an
experiment where I created one file with two lines, then cut one of the
lines, pasted it into a new file, and committed both changes at the same
time. But git blame -C on the second file just gives me the
time/date/sha1 of its creation, and no indication that the line was
taken from elsewhere.

Back to the more basic queries ... one more difference I've observed
from bzr, after playing around for a while, involves the commands to
undo changes and commits. It looks like git reset combines the
capabilities of both bzr uncommit and bzr revert: I can undo changes
since the last commit by resetting to HEAD, and I can undo commits by
resetting to HEAD^ or earlier.

Some things here I'm not quite sure about:
(1) the difference between git reset --soft and git reset --mixed,
probably because I don't understand the way the index works, the
difference between changed, updated and committed.
(2) How to remove changes made to an individual file since the last commit.

Last, could someone explain the git merge command? git pull seems to do
many things which I would need to use bzr merge for---I can "pull"
between branches which have diverged, for example. I don't understand
quite what git merge does that's different, and when to use one or the
other.

Many thanks again to everyone,
-- Joe
-
Think of it this way: if the _patch_ looks like it's a code movement, then
"git blame" will show it as a code movement. Ie, if the patch (to a human)
looks like it's moving a function from one file into another (which in a
patch will obviously be a question of removing it from one file, and
adding it to another), then git will also see it that way, and then "git
blame" will also follow its history as it moved.

But if somebody sends you a patch that just adds a new function that
didn't exist in that context at all, then "git blame" won't ever realize

Actually, I think you found a bug.
Now, with small changes, "git blame -C" will just ignore copies entirely,
so your particular test might not have even been supposed to work, but
trying with a new git repo with two bigger files checked in at the initial
commit, I'm actually not seeing "git blame -C" do the right thing even for
real code movement.

And the problem seems to go to the "root commit": if the file existed in
the root, the logic in "git blame" to diff against the (nonexistent)
parent of the root commit won't do the right thing, and that just confuses
git blame entirely.

I think Junio screwed up at some point. I'll send him a bug-report once
I've triaged this a bit more, but I can recreate your breakage if I start
a new git database and create two files in the root, and move data between
them in the second commit (but if I instead create the second file in the
second commit, and do the movement in the third commit, git blame -C works

I'm not quite sure what "bzr revert" does. Git does have a "revert" too,
but it will append a _new_ commit that actually undoes the commit you're
asking to revert. If you want to just "undo history" (whether it's one
commit or many - I don't see why it would be different) then yes, "git
reset" is the thing to use.

I _suspect_ that bzr people use "uncommit" to undo a commit in order to
fix it up. In git, you could do that with "git reset" and a new commit,
...
Obvious when I think about it, otherwise every 'int i;' in the kernel
Actually my setup was like the latter situation you describe, so blame
was probably working fine and just ignoring the small change. But
serendipity is a wonderful thing. :-)

-- Joe
-
Indeed. We didn't do that heuristic originally, and the most common
sequence that was "blamed" on being copied from somewhere else was
something like the string "<tab><tab><tab>}<nl><tab><tab>}<nl><tab>}<nl>"
which is obviously very common in C, especially when you have coding
Yeah. As it turns out, the bug was really that "git blame" ended up just
not showing the filenames (that it had followed correctly), because it had
decided (incorrectly) that they weren't interesting because it all came
from the same commit, and it had already shown that commit (just not that
_file_ in that commit).

So it's fixed now, and probably would never trigger except for the stupid
special case that was "let's just show an example of this" ;)

Linus
-
I'm very happy my stupidity could help. ;-)
On a related note ...
I do think that bzr has quite an intuitive set of commands, and it is
easy to learn, though at this point I don't feel git is really *that*
much more difficult in itself. Although the terminal output for some
problems could be improved, most of my difficulties are stemming from
overlap of command names when the commands themselves do different
things, and the fact that git's documentation is somewhat more technical
than bzr's.

What would be nice would be to have in the documentation a whole bunch
of stupid examples for the main commands, something where someone can
create a repo from scratch, create and modify some simple files
according to instructions, and see the particular command in action.
The tutorials do this, of course, but only for a few cases, when to be
honest it's the more complex commands that most need such explanation.
For beginners, especially less technically skilled ones, it would be
good to have a lot more of, "Do this, here's what git will respond, this
is what it means, here's how to fix it...."

As a relatively non-technical user, perhaps I should keep track of my
difficulties (and others') and try to write something up.

-- Joe
-
100% agreed. A lot of the man-pages etc have been written to be about the
technology, not about the _use_ of it.

I encouraged people at some point to add an "Examples" section to some of
the functions to show what it all _means_, so for "man git-log", I think
some of the most useful stuff is that examples section that shows the
combination of revision naming and path-name limiting, for example. I
personally think that that is a much better way of teaching people what
the commands actually do than by mentioning the arguments one by one.

But that only exists for a couple of man-pages, and mostly for the simple
ones at that. And a lot of the real examples would need "real data" to
work on, so it can't easily be done as a trivial example in a man-page, it
really needs a tutorial to "build up" to the situation where you can then

Yeah. The git "tutorial.txt" should be extended, and preferably be a whole
nice set of "follow along with the bouncing ball" kind of web-page
sequence.

So I absolutely agree. It's just that at least me personally, I just can't
write documentation. I wrote some of the original tutorial, I've written
some of the original tech docs, but I just can't get into the whole
"document it" mindset, especially not from a user perspective. It doesn't
float my boat, and judging by a lot of the discussions, I obviously also
don't even see why something could _possibly_ cause confusion.

To make things worse, a lot of the docs (and by that I also mean some of
the error messages and helpful hints) tend to be old.

The whole fact that "git commit" mentions "git update-index" is exactly
that kind of thing: it's largely a legacy message. You'd almost never
actually _use_ git-update-index itself these days, and it's much more
convenient to just list the files you want to commit to "git commit"
directly (or just use the -a flag, if that is what you want to do).

But that message exists, because it was written in an earlier age.

Linus
-
Here's a crazy idea. How about a "git tutorial" builtin or "git
example" or something that would create a repository into some useful
state for demonstrating something.

I know that I'm regularly putting stuff into emails like:
mkdir gittest
cd gittest
git init-db
echo hello > hello
git add hello
git commit -m "add hello"
git checkout -b other
echo other > other
git add other
git commit -m "add other"
git checkout master

# OK, that was just setup, here's what I want to demonstrate
git pull . other
...

So maybe if there was a command to set up a standard example
repository, ("git boilerplate", "git sandbox", "git playground" ?),
then the documentation could use that to have full-fledged examples
without having to duplicate similar setup each time.

And then there could be a way for this command to also spit out the
commands it is using to reach some state so it could even serve as a
sort of self-documenting tutorial of some sort.

Anyone interested in exploring something like that?
-Carl
Hi,
That sounds fine! Actually, it should be very simple to turn the tutorial
into such a script, displaying the command with an explanation, and
executing the command. It could even call gitk from time to time, so the
user can form a mental model of the ancestor graph.

Ciao,
Dscho
-
Doesn't one of our existing t/ scripts do that?
-
Hi,
;-) I did not forget... t1200-tutorial.sh
But it serves a different purpose: it makes sure that we did not break the
commands in the tutorial. (I fear that the script and the tutorial have
diverged a little bit, though).

git-tutorial should not test that, rather it should show the user what is
possible, and encourage playing with git.

Ciao,
Dscho
-
Currently tutorial.txt doesn't work like that--there are places where it
just tells the user to edit a file, or make a few commits, without
listing commands to do so. It also isn't linear. That could all be
"fixed", but I think the result would just make it more tedious.

But I agree that a "git tutorial" command to set up a canonical example
repository might be fun.

--b.
-
On Tue, 28 Nov 2006 01:01:46 +0100
Assuming you have a recent version of git, then:
$ git repo-config --global user.email "you@email.com"
$ git repo-config --global user.name "Your Name"

Will set up a ~/.gitconfig in your home directory; these settings
will apply in any repo you use. Drop the "--global" to set them

It's just a common convention and carries no special significance;
Don't be afraid to git-clone your local repo, especially with the -l
and -s options. That will get you a separate repo/working directory
while not taking up much extra disk space (objects from your first
repo will be shared with the second).

Once you get comfortable with multiple branches in a single repo/
working directory, it often is much better than the alternatives.

The Git cherry-pick command lets you grab specific commits from
other branches in your repo. But cherry-pick works at the commit
level, there is no easy way to grab a single function for instance
and merge just its history into another branch.

However, you can merge an entire separate project into yours even
though they don't share a base commit. This has been done several
times in the history of Git itself. For instance you can see two
separate "initial" commits in the Git repo with a command like
"gitk README gitk" which gives a graphical history of the "gitk"
and "README" files and shows each started life in a separate
initial commit. Use "git show 5569b" to see Linus bragging on

Don't think a direct bridge between the two has been written yet.
Cheers,
Sean
-
>>>>> "Linus" == Linus Torvalds <torvalds@osdl.org> writes:
Linus> On Thu, 26 Oct 2006, Vincent Ladeuil wrote:
>>
>> Ok, so git makes a distinction between the commit (code created by
>> someone) and the tree (code only).
>>
>> Commits are defined by their parents.

Linus> Commits are defined by a _combination_ of:
Linus> - the tree they commit (which is recursive, so the
Linus> commit name indirectly includes information about EVERY
Linus> SINGLE BIT in the whole tree, in every single file)

And here you keep that separate from any SCM related info,
right ?

Linus> - the parent(s) if any (which is also recursive, so
Linus> the commit name indirectly includes information about
Linus> EVERY SINGLE BIT in not just the current tree, but
Linus> every tree in the history, and every commit that is
Linus> reachable from it)

Linus> - the author, committer, and dates of each (and
Linus> committer is actually very often different from
Linus> author)

Linus> - the actual commit message
Linus> So a commit really names - uniquely and authoritatively
Linus> - not just the commit itself, but everything ever
Linus> associated with it.

Thanks for the clarification. But no need to shout about EVERY
SINGLE BIT, the pointer to BDDs was already talking a bit about
bits :)

But I agree, this is the important point that may be missed.

>> Trees are defined by their content only ?
Linus> Where "contents" does include names and
Linus> permissions/types (eg execute bit and symlink etc).

Which can also be expressed as: "Everything the user can
manipulate outside the SCM context", right ?

>> If that's the case, how do you proceed ?
Linus> If you compare the commit name, and they are equal,
Linus> you automatically know

Linus> - th...
I don't understand that question.
The commits contain the tree information. A raw commit in git (this is the
true contents of the current top commit in my kernel tree, just added
indentation and an empty line between the command I used to generate it
and the output, to make it stand out better in the email) looks something
like this:

    [torvalds@g5 linux]$ git-cat-file commit HEAD

    tree ba1ed8c744654ca91ee2b71b7cdee149c8edbef1
    parent 2a4f739dfc59edd52eaa37d63af1bd830ea42318
    parent 012d64ff68f304df1c35ce5902f5023dc14b643f
    author Linus Torvalds <torvalds@g5.osdl.org> 1161873881 -0700
    committer Linus Torvalds <torvalds@g5.osdl.org> 1161873881 -0700

    Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6

    * master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
    [SPARC64]: Fix memory corruption in pci_4u_free_consistent().
    [SPARC64]: Fix central/FHC bus handling on Ex000 systems.

where the _name_ of the commit is

    [torvalds@g5 linux]$ git-rev-parse HEAD
    e80391500078b524083ba51c3df01bbaaecc94bb

ie the commit itself contains the exact tree name (and the name of the
parents), and the name of the commit is literally the SHA1 of the contents

Again, I'm not sure what you mean by that. The SCM does not track
_everything_. It does not track user names and inode numbers, so in a
sense a developer can change things that the SCM simply doesn't _care_
about and never tracks. But yes, the tree contents uniquely identify the

No, there is ordering there too. But yes, the ordering is not in the name
itself, you have to go look at the actual commit history to see it.

No.
If the signatures are equal, the contents are equal, and vice versa. It
No. Don't even think that way. That just confuses you. The hash is
cryptographic, and large enough, that you really can equate the contents
with the hash. Anything else is just not even interesting.

Linus
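
To make the "name is the SHA1 of the contents" point concrete, here is a minimal Python sketch; it is not git's own code. It assumes git's usual object encoding, in which the hashed bytes are a short header (object type, a space, the byte length, a NUL) followed by the raw contents. The helper names git_object_id and commit_body, the identity line and the shortened messages are made up for illustration; the tree and parent ids are copied from the commit shown above.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Return the hex SHA-1 name of an object: header, NUL, then the body."""
    header = f"{obj_type} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree: str, parents: list, author: str, message: str) -> bytes:
    """Assemble raw commit contents in the layout shown by git-cat-file.

    For brevity the same identity line is reused for author and committer.
    """
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {author}", "", message]
    return "\n".join(lines).encode("utf-8")


# Ids taken from the commit above; identity line and messages are illustrative.
TREE = "ba1ed8c744654ca91ee2b71b7cdee149c8edbef1"
PARENTS = [
    "2a4f739dfc59edd52eaa37d63af1bd830ea42318",
    "012d64ff68f304df1c35ce5902f5023dc14b643f",
]
WHO = "A U Thor <author@example.com> 1161873881 -0700"

first = git_object_id("commit", commit_body(TREE, PARENTS, WHO, "Merge sparc-2.6\n"))
second = git_object_id("commit", commit_body(TREE, PARENTS, WHO, "Merge sparc-2.6!\n"))
reordered = git_object_id("commit", commit_body(TREE, PARENTS[::-1], WHO, "Merge sparc-2.6\n"))

print(first != second)     # changing one character of the message changes the name
print(first != reordered)  # so does swapping the order of the parents
print(git_object_id("blob", b""))  # id of an empty blob
```

Because the tree id and the parent ids are part of the hashed text, and those ids are themselves hashes, a change anywhere in the tree or in the history ends up changing the commit name. The last line should print e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, the id git reports for an empty file.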
-
Yes (a commit is a tree, zero or more parents, commit message, and
Recursively. Each tree is an ordered list of 4-tuples: pathname, type,
sha1, mode. If the type is "blob" then the sha1 is the hash of the file
contents. If the type is "tree" then the sha1 is the id of a sub-tree.

Yes, if two trees' hashes compare equal, they contain the same data. I
believe we are not currently using this optimization to find merge
differences, but there was some discussion earlier this week about doing
so.

-Peff
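
A rough Python sketch of that recursion, for anyone who wants to see it run. It assumes the usual stored form of a tree entry (mode, a space, the name, a NUL, then the 20 raw bytes of the sha1), where the entry type is implied by the mode rather than stored separately. The names object_id and tree_id are hypothetical, and the sketch simplifies: every file is mode 100644, and entries are sorted by plain name, which only matches git's ordering in simple cases.

```python
import hashlib


def object_id(obj_type: str, body: bytes) -> bytes:
    """Raw 20-byte SHA-1 of an object: header, NUL, then the body."""
    header = f"{obj_type} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).digest()


def tree_id(entries: dict) -> bytes:
    """Hash a directory given as {name: bytes (file) or dict (sub-directory)}."""
    body = b""
    for name in sorted(entries):
        value = entries[name]
        if isinstance(value, dict):
            mode, sha = b"40000", tree_id(value)              # id of the sub-tree
        else:
            mode, sha = b"100644", object_id("blob", value)   # hash of file contents
        body += mode + b" " + name.encode("utf-8") + b"\0" + sha
    return object_id("tree", body)


# Hypothetical directory contents, listed in two different orders.
a = {"README": b"hello\n", "src": {"main.c": b"int main(void) { return 0; }\n"}}
b = {"src": {"main.c": b"int main(void) { return 0; }\n"}, "README": b"hello\n"}
c = {"README": b"hello\n", "src": {"main.c": b"int main(void) { return 1; }\n"}}

print(tree_id(a) == tree_id(b))  # same data, same id
print(tree_id(a) == tree_id(c))  # one changed byte in a sub-tree changes the top id
print(tree_id({}).hex())         # id of the empty tree
```

This is why comparing two tree ids is enough to know whether two whole directories are identical, and why an unchanged sub-directory can be skipped without looking inside it. The last line should print 4b825dc642cb6eb9a060e54bf8d69288fbee4904, git's id for the empty tree.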
-
Sorry, I should clarify: a commit is a _tree id_, zero or more _parent
ids_, commit message, etc.

-Peff
-
I apologize if I've come across as beating a dead horse on this. I've
really tried to only respond where I'm still confused, or there are
explicit indications that the reader hasn't understood what I was
saying, ("I don't understand how you've come to that conclusion",
etc.). I'll be even more careful about that below, labeling paragraphs

I'm missing something:
I still haven't seen strong examples for this last claim. When are
they handier? I asked a couple of messages back and two people replied
that given one revno it's trivial to compute the revno of its
parent. But that's no win over git's revision specifications,

Maybe I wasn't clear:
There's no doubt that there has been semantic confusion over the term
branch that has been confounding communication on both sides. Here's
my attempt to describe the situation, (which only became this clear
recently as I started playing with bzr more). This is not an attempt
at a complete description, but is hopefully accurate, neutral, and
sufficient for the current discussion:

Abstract: In a distributed VCS we are using a distributed process to
create a DAG, (nodes are associated with revisions and point to parent
nodes). The distributed nature means that the collective DAG will have
multiple source nodes, (often termed heads or tips).

Git: A subset of the DAG is stored in a "repository". The DAG in the
repository may have many source nodes. A "branch" is a named reference
to a node (whether or not a source). Multiple local repositories may
share storage for common objects. There are inter-repository commands
for copying revisions and adjusting branch references, but basically
all other operations act within a single repository.

Bzr: A subset of the DAG is stored in a "branch". The DAG in the
branch has a single source node. Multiple local branches may share
storage for common objects through a "repository". Basically all
operations (where applicable) can act between branches.

Let me know if I botched ...
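
The description above can be modelled in a few lines of Python. This is only a toy sketch of the two layouts, not how either tool stores anything: the DAG is a dict from revision id to parent ids, and the revision names, branch names and the helpers heads and ancestry are all invented for illustration.

```python
def heads(parents: dict) -> set:
    """Source nodes of the DAG: revisions that are nobody's parent."""
    referenced = {p for ps in parents.values() for p in ps}
    return set(parents) - referenced


def ancestry(parents: dict, tip: str) -> set:
    """All revisions reachable from tip by following parent pointers."""
    seen, todo = set(), [tip]
    while todo:
        rev = todo.pop()
        if rev not in seen:
            seen.add(rev)
            todo.extend(parents[rev])
    return seen


# Hypothetical DAG with two source nodes, d and e.
dag = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"], "e": ["c"]}

# Git-style: one store holds the whole DAG; a branch is just a name for a node,
# and "old" shows that the node need not be a source node.
git_branches = {"master": "d", "topic": "e", "old": "b"}

# Bzr-style: each branch holds the sub-DAG reachable from its single head.
bzr_branches = {
    name: {rev: dag[rev] for rev in ancestry(dag, tip)}
    for name, tip in {"trunk": "d", "feature": "e"}.items()
}

print(sorted(heads(dag)))  # ['d', 'e']
print({name: sorted(heads(sub)) for name, sub in bzr_branches.items()})
# {'trunk': ['d'], 'feature': ['e']}
```

In the git-style layout the repository as a whole has several source nodes and the branch names are separate from the storage; in the bzr-style layout each branch, taken alone, has exactly one.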
I would say that: revnos are handier tools than revids...etc
I think that since G: was making a statement about revids, B: was making
an implicit comparison with them.

bzr log -r before:1
being handier than
bzr log -r before:revid:david@zettazebra.com-20061022175244-4b85cb5f0cbc79ad-davidc
--
gpg-key: http://www.zettazebra.com/files/key.gpg
[ Time to trim up CC's a bit ]
On Sat, Oct 21, 2006 at 01:47:08PM -0700 I heard the voice of
Oh, I don't mean the whole topic in general. It's just that there are
only so many ways one can say "revnos are only valid in certain
situations", and I really think we must have hit them all by now. We
all agree on that; we just disagree (probably highly based on

This seems correct; at least, it's correct enough to work from until
Rather, unless you can one way or another access the branch the number
I think it's using that 'c' word there that's causing contention here;
we're ascribing different meanings to it.

Revnos only apply to a specific "branch" (in this usage, I'm talking
about branch abstractly and somewhat specifically; more in a moment),
and so except by wild coincidence are only useful in talking about
that branch. One of the two cases (the second discussed later) where
that's useful is when you have long-lived branches. In git,
apparently, you don't have long-lived "branches" in this particular
meaning of the word, but the way people use bzr they do. Perhaps this
is what you mean by 'centralization'.

That long-lived branch doesn't have to be any sort of "trunk", though
it usually is; it could as easily be something totally peripheral.

Now, details of that use of "branch". In mathematical terms, a branch
may be defined purely by its head rev (and the graph built up by
recursing through all the parents), but in [bzr] UI and mental model
terms, a "branch" is that plus its mainline[0]; the left-most or first
line of descent, which colloquially is the difference between 'things
I commit' and 'things I merge'.

Let me try flexing my git-expression muscles here. Given a branch at
a specific point in time, you point at the head rev, and there's a
subset we call 'mainline' of the whole set of parents, which is
expressed by following the 'first' parent pointers back to a single
origin (there can be 50 origins in the whole graph, of course, but
only one of the...
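
The "follow the first parent pointers" idea can be sketched in Python as below. This is an illustration of the description above, not bzr's or git's implementation; the toy history, the helper names mainline and revnos, and numbering from 1 at the origin are assumptions made for the example.

```python
def mainline(parents: dict, head: str) -> list:
    """Follow first-parent pointers from head back to an origin.

    parents maps each revision id to its ordered list of parent ids; the
    first entry is the 'things I commit' line, the rest are merged-in lines.
    Returns the mainline oldest-first.
    """
    line = [head]
    while parents[line[-1]]:
        line.append(parents[line[-1]][0])
    line.reverse()
    return line


def revnos(parents: dict, head: str) -> dict:
    """Number the mainline revisions 1, 2, 3, ... starting at the origin."""
    return {rev: n for n, rev in enumerate(mainline(parents, head), start=1)}


# Hypothetical history: a, b, d committed in sequence, with c merged into d.
history = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
print(mainline(history, "d"))  # ['a', 'b', 'd']
print(revnos(history, "d"))    # {'a': 1, 'b': 2, 'd': 3}

# The same graph with d's parents swapped has a different mainline.
swapped = dict(history, d=["c", "b"])
print(mainline(swapped, "d"))  # ['a', 'c', 'd']
```

The two histories contain the same set of revisions, so as graphs they differ only in parent order, yet the numbering comes out differently. That is the sense in which a revno only means something relative to one particular branch and its mainline.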
Having used both (though my familiarity with git is less), in my opinion
the biggest win is the obvious one: sequential numbers work in the head
better than SHA1 checksums.

"But it's not a problem in practice!" is a good retort, except that I
wonder whether the set of "practices" you're using includes anyone who
decided to pass on git in favor of something else--perhaps because they
saw a few SHAs float by and ran in terror. Beware of self-selection
bias.

Put another way, "strength" of example is often in the eye of the
beholder. That we continue to give you the same "weak" examples may be
evidence that we have a different impression of their strengths, and
that your analysis of their strengths isn't convincing to us.

I suppose this line of conversation still has value if you don't see any
benefit at all, but OTOH if you really don't see how sequential numbers
are easier to work with in the head than SHA sums with modifiers, I'm

I wonder if part of the problem is that the revno scheme we've been
talking about (the x.y.z... format) doesn't technically exist in any
released version of bzr that I know of.

Previous to 0.12, bzr revnos were absolutely a local thing; revisions
from merges didn't even have revnos (except for the merge commit
itself). If you merged a branch and you later wanted to recreate that
branch, or see a diff from that branch, etc., you had to use revids.

So when you talk of a "centralization bias" in bzr, a lot of us get
confused, defensive, etc., because from our perspective, bzr and git
weren't all that much different until just recently.

Now it may be that you're right that "global" revnos like bzr has now
introduce a bias in favor of centralization. If that's true, I'm not
sure that totally vindicates the git model. We have to ask if the bias
is a good thing, but so do you; after all, we may have done so because
of user demand, and if our users want it, maybe yours will want it too
someday.

(I say "may" because I haven't been paying clos...
