The git interface refactoring should be I think the cause for git 2.0.0 release... -- Jakub Narebski Warsaw, Poland ShadeHawk on #git -
Good idea indeed. Nicolas -
We need to avoid user confusion, so making a command that used to do one thing to suddenly do something completely different is a no-no. However, I do not think it needs to wait for 2.0.0. We can start with a separate namespace (or even a separate "Improved git UI project") and introduce the "improved UI set" in 1.5.0 timeframe. If managed properly, the "improved git UI" can coexist with the current set of tools and over time we can give an option not to even install the older Porcelain-ish commands. -
Dunno. I feel this is a bit overboard. Actually the naming problem is rather localized to one command, namely git-pull. In my opinion going with yet another namespace would add to the confusion rather than clear it. The best way to avoid user confusion is to remove the source of the confusion, not let it live. In other words I think we should _fix_ git-pull instead of replacing it. People are already confused about it so simply fixing this command will have a net confusion reduction. Yet we're not talking about "suddenly doing something completely different" either. If git-pull doesn't merge automatically anymore it is easy to tell people to use git-merge after a pull. "You pull the remote changes with 'git-pull upstream', then you can merge them in your current branch with 'git-merge upstream'." Isn't it much simpler to understand (and to teach) that way? Also I don't think using git-upload and git-download is much better. This adds yet more commands that do almost the same as existing ones but with a different name which is still not necessarily fully adequate. I for example would think that "download" is more like git-clone than git-fetch or git-pull. Let's face it: HG got it right with pull and push and newbies have much less difficulty grokking it. We screwed it up by not using the most intuitive semantics of a pull, and locking the word "pull" away is not the better solution, all things considered. Why not just admit it and avoid being different from HG just for the sake of it? Nicolas -
If it were "you download the remote changes with 'git download upstream' and then merge with 'git merge'", then perhaps, but if you used the word "pull" or "fetch", I do not think so. I would be all for changing the semantics of "pull" from one thing to another, if the new semantics were (1) what everybody welcomed, (2) what "pull" traditionally meant everywhere else. In that case, we have been misusing it to be confusing to outsiders and I agree it makes a lot of sense to remove the source of confusion. But I do not think CVS nor SVN ever used the term, and I was told that BK was what introduced the term, and the word meant something different from what you are proposing. You have to admit both pull and fetch have been contaminated with loaded meanings from different backgrounds. I was talking about killing the source of confusion in the longer term by removing fetch/pull/push, so we are still on the same page. That's where my "you download from the upstream and merge" comes from. -
How was/is fetch contaminated? -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ #!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj $/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1 lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/) -
I think "fetch" is sane. Its only problem is a missing symmetrical counterpart verb, like "get" and "put". Nicolas -
If you're a dog owner, the obvious counterpart for "fetch" is "throw" ;) I think "get" and "put" would be bad, just because of confusion with "sccs get" (ie it has that "get this file" connotation). Maybe "fetch" and "push" aren't totally diametrically opposite, but really, I don't think they are that hard to understand either. We do have the BK legacy of "pull" implying a merge, and that's fairly fundamental. It's also true that in a lot of usage scenarios, what people actually _use_ is "pull" and "push", and no, they aren't mirror images (since push will _not_ do the merge), but at the same time, from a _usage_ standpoint they really _are_ each other's opposites. You "pull" to get other people's data into your branch (and once you've internalized local branches and the merge thing, you know what this means), and you "push" to push your changes out. It really _is_ the usage scenario, and using "opposite" words really _does_ make sense. It's true that _technically_ "fetch" is the opposite of "push", but at the same time, that really is about technology, not about usage models. You normally wouldn't do a "git fetch + git push" pair. You _can_ do so, but it's not the natural way to work - unless you're just doing a mirror service. Linus -
Yeah. You could always throw a branch to your dog. Or maybe we should introduce the concept of "bones" to GIT in place of Has SCCS really had a similar level of influence than BK or CVS in that The problem is the "usage standpoint" distinction that has to be made. Exactly because in GIT it is a bit distorted from what most people But that's exactly why newbies have problems. Instead of simply understanding the bare operation (fetch data in a branch _then_ merge it) they sort of need to abstract the concept of branch away because a "pull" does it all automagically. Which is fine as long as you're willing to ignore branch concepts altogether. But once branches are back in the picture for more involved operations then the "pull" word simply feels odd. Even more so with the local merge syntax. When I say to someone "just merge branch weezee with your current branch" the most intuitive command would be: git merge weezee But because "pull" mixes two concepts together this makes the thing more esoteric. Unless, of course, you get used to the mental model you outlined above, but IMHO simply needing a mental model to explain the tool is a sign that something is mapped wrong. Nicolas -
But the fact is that HG (which has a growing crowd of happy campers, maybe even larger than the BK crowd now) did work with and got used to a sensible definition of what a "pull" is. This means that their definition is becoming rather more relevant with time than what it used to, and because it is a saner definition than what GIT has for the same word which HG users really have no issue with, I think we really should leverage the "common wisdom" and consider aligning ourselves with them in this case rather than trying to go into a totally different direction. We simply won't gain anything trying to teach people "a pull in HG is a download in GIT". If a pull becomes the same thing for both then it's one less oddball in the GIT interface. Nicolas -
Guys, before you start thinking this way, the fact is, there's a lot of happy git users. So the reason for using "git pull" is - bk did it that way, and like it or not, bk was the first usable distributed system. hg is totally uninteresting. - git itself has now done it that way for the last 18 months, and the fact is, the people _complaining_ are a small subset of the people who actually use git on a daily basis and don't complain. So don't fall for the classic "second system syndrome". The classic reason for getting the second system wrong is that you focus on the issues people complain about, and not on the issues that work well (because the issues that work fine are obviously not getting a lot of attention). If you think "pull" is confusing, I can guarantee you that _changing_ the name is a hell of a lot more confusing. In fact, I think a lot of the confusion comes from cogito, not from git - the fact that cogito used different names and different syntax was a mistake, I think. And that '#' for branch naming in particular was (and is) total braindamage. The native git branch naming convention is just fundamentally much better, and allows you to very naturally fetch multiple branches at once, in a way that cogito's syntax does not. So when I see suggestions of using that brain-damaged cogito syntax as an "improvement", I know for a fact that somebody hasn't thought things through, and only thinks it's a better syntax because of totally bogus reasons. I do agree that we probably could/should re-use the "git merge" name. The current "git merge" is an esoteric internal routine, and I doubt a lot of people use it as-is. I don't think it would be a mistake to make "git merge" basically be an alias for "git pull", for example, and I doubt many people would really even notice. But the fact is, git isn't really that hard to work out, and the commands aren't that complicated. There's no reason to rename them. We do have other pro...
I would agree that having "pull" mean something different in Cogito than in Git was a bad idea (explanation: historically, for some period of time Cogito had cg-pull which meant the same as cg-fetch or hg pull; later it got renamed to cg-fetch). But I'm also happy that Cogito just does not use the "pull" expression at all currently: "updating" seems to be a clear and unloaded enough concept for new people. Pull is really _very_ confusing, with it meaning something different (but not different enough) in _all_ other systems but BK (which is basically irrelevant nowadays). That said, I agree with your argument that changing it in Git now might just result in more confusion. I'm just trying to explain Cogito's choice here, and I believe it does no good nor harm to Core Git if it just uses different name for the concept and avoids the original name at all (except explaining in the docs that updating in Cogito is what pulling is in Git). -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ #!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj $/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1 lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/) -
Yes, "bk pull" had an implied merge. But, the reason why bk pull was
never really a problem with Bitkeeper is because it didn't really have
support for multiple branches active within the same repository ---
what Larry called "lines of development". Or rather, Larry started
down the path of implementing lines of development, and then never
fully supported it, mainly because making it easy for people to use
was the tricky part.
So with Bitkeeper, with "bk pull" there was never any question about
which branch ("line of development") you would be merging into after
doing a "bk pull", since there was only one LOD, and given that BK had
the rule that within a LOD only one tip was allowed, a "bk pull"
_had_ to do a merge operation.
The moment you start supporting multiple unmerged tips in a repository
i.e., branches, it raises the question, "which branch should the pull
operation merge onto"? And git's answer, "the current branch", is
often not the right one. *That's* why always doing a merge isn't
always the right answer, and so in the git world, people are told, use
"git fetch" instead, and in the hg world, "hg pull" doesn't do the
merge. IMO, it's a fundamental result of the fact that both git and
hg have chosen to support multiple LOD's, whereas BK punted on the
concept.
If you are operating on your local development branch, the reality is
that merging is probably not the right answer in the general case,
which is why the hg world have omitted doing the merge. And by
telling people, use "git fetch" instead, that's also an implicit
admission that merging onto the current branch is often not the Right
Thing.
The problem is that "pull" is a very evocative word, especially given
the existence of "push", and so in the git world we are reduced to
telling people, "you really don't want to use pull, trust me".
Is this a major issue? Not really; I can think of a number of other
issues that make git hard to learn, and why hg has a more gentle
learning curve, and th...
I agree, but I wonder why you are pulling/fetching (with or without merge) if you are operating on your local development I would rather say "use 'git branch' to make sure if you are I have to disagree with this. In the simplest CVS-like central repository with single branch setup that many "novice users" start out with, there is almost no need for "git fetch" or tracking branches. You pull, resolve conflicts, attempt to push back, perhaps get "oh, no fast forward somebody pushed first", pull again, then push back. So I am not sure where "you really do not want to use pull. trust me" comes from. It is a different story for people who _know_ git enough to know what is going on. They may be using multiple branches and interacting with multiple remote branches, and there are times you would want fetch and there are other times you would want pull. But for them, I do think the suggestion would never end with "trust me" -- they would understand what the differences are. -
Well, when I was using BitKeeper, I never would. Bitkeeper has what
Linus calls the broken "repository == branch" model. So normally I
would have one repository where I would track the upstream branch, and
only do bk pull into that branch. I would do my hacking in another
repository (i.e., branch), and periodically keep track of what was going
on in mainline by cd'ing to the mainline repository and doing the bk
pull there.
The challenge when you put multiple branches into a single repository,
is you have to keep track of which branch you happen to be in. In the
BK world, this was obvious because it would show up in my shell
prompt:
<tytso@candygram> {/usr/src/linux-2.6}
2%
(OK, obviously I'm in the Linux 2.6 upstream repository)
In a system where you need to keep track of what branch you are in via
an SCM-specific local state information, it's easy to get confused and
do a pull when you are in the "wrong" branch, or while you have local
state in your working directory.
What I currently do (and I'm sure I'm being really horrible and need
to say 100 "Hail, Linus"es for penance for not staying in
the one true distributed state of grace) is that I keep an entirely
separate Linux 2.6 git repository just to make sure I never get
confused about what branch I might happen to be in when I do the "git
pull" --- and yeah, I could have used "git fetch", but 3+ years of BK
usage plus Hg usage is hard to get away from. I'm sure this is where
Linus would say that use of BK and Hg, causes permanent brain damage,
a la Dijkstra's oft-quoted comment about use of Basic inducing
I think the problem is the people who have had years of BK or Hg
experience. Maybe it's more of a documentation problem; perhaps a
"git for BK" or "git for Hg" users is what's needed. The problem
though is that while use of BK is definitely legacy, there are going
Well, I think this is where git's learning curve challenges are. Yes,
for users that are doing the stupidest, mo...
As an 80%-hg/20%-git user, I'm curious what features of git you had in
mind, so I know where to look as I wander up the git learning curve.
My experience with the git user interface, for what it's worth, is
that I never quite get the conceptual model crystal clear enough in my
head. So it won't stay for long enough for me to progress up the
learning curve and retain the gains. I move up a bit, but the gain
soon evaporates and I backslide, and then just hack my way through it.
I found hg's conceptual model very easy to learn, almost as if I don't
have to remember anything. Maybe that simplicity comes at a price,
whence my question at the start about the extreme-power features of
git.
-Sanjoy
`Never underestimate the evil of which men of power are capable.'
--Bertrand Russell, _War Crimes in Vietnam_, chapter 1.
-Err, what I meant to say is that there are going to be a lot of people who will need to simultaneously use both git and Hg. - Ted -
We do that for Wine. The problem is that we recommend using git-rebase to make it easier for occasional developers to keep a clean history, and rebase and pull interfere badly. The result is that we recommend always using fetch+rebase to keep up to date, but this is confusing many people too, because git-fetch appears to do a lot of work yet leaves the working tree completely unchanged, and git-rebase doesn't do anything (since in most cases they don't have commits to rebase) but has an apparently magical side-effect of updating the working tree. Ideally it should be possible to have git-rebase do the right thing even if the branch has been merged into; then we could tell people to always use git-pull, and when they get confused by seeing merges in their history have them do a git-rebase to clean things up. -- Alexandre Julliard julliard@winehq.org -
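The fetch + rebase update cycle described above can be sketched with two throwaway local repositories. This is only an illustration in present-day git syntax, which differs from the tools as they existed when this thread was written; the repository names (upstream, dev), the file name and the demo identity are all made up for the example.

```shell
#!/bin/sh
# Sketch: "git fetch" leaves the working tree alone, "git rebase" moves it.
set -e
# Hypothetical identity so that commits work in a clean environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"

# "upstream" stands in for the project's public repository.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo one > upstream/file.txt
git -C upstream add file.txt
git -C upstream commit -q -m "first"

# An occasional developer clones it, and then upstream moves on.
git clone -q upstream dev
echo two >> upstream/file.txt
git -C upstream commit -q -a -m "second"

cd dev
git fetch -q origin            # updates origin/master only
before=$(cat file.txt)         # the working tree still has the old content
git rebase -q origin/master    # no local commits, so this simply fast-forwards
after=$(cat file.txt)          # now the working tree has caught up
```

After the fetch, the variable before still holds the old file content; only the rebase brings the working tree forward, which matches the "magical side-effect" that confuses newcomers.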
How do those developers submit their changes? Do they push? If they do, git-rebase can be saving one merge at most, and the merge is actually a good thing (someone should write some nice standalone writeup about that). If they don't have push access and maintain their patches locally until they get accepted, perhaps it would be far simpler for them to use StGIT? -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ #!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj $/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1 lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/) -
For regular developers, sure. But regular developers will need to properly understand the git model anyway, and then they will be able to make sense even of the standard git commands ;-) The problem is that there isn't a smooth progression to that point. At first, a user will simply want to download and build the code, and for that git-pull works great, it's a one-stop command to update their tree. Then after a while the user will fix a bug here and there, and at that point git-rebase is IMO the best tool, it's reasonably easy to use, doesn't require learning other commands, and once the patch is accepted upstream it nicely gets the tree back to the state that the user is familiar with. The problem is that rebase doesn't work with pull, so the user needs to un-learn git-pull and start using git-fetch; it's to avoid this that we recommend using git-fetch from the start, which is unfortunate since it makes things harder for beginners. -- Alexandre Julliard julliard@winehq.org -
that's not a good argument; the set of git users is a small subset of
those that looked at git, and dismissed it because they couldn't wrap
their heads around it. It's worth trying to get those on board by
fixing the annoying little issues that have popped up in this thread.
The technical base for GIT is excellent, and the only reason for not
using it is its arcane interface.
A version control system is often only tangentially related to the real
work that needs to be done, so the incentive to learn it well is small,
and a steep learning curve only makes it worse.
FWIW, I regularly mess up with the differences between fetching, pulling
and merging. In particular, having to do a two step process to get
remote changes in,
git pull url-to-server master:master
..error message about not being a fast-forward..
git pull --update-head-ok url-to-server master:master
..still an error message about update not being a fast-forward..
(sigh)
git pull url-to-server master:scrap-branch
git pull . scrap-branch:my-current-branch
(make mental note of deleting scrap-branch)
--
Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen
-And I've said this before, and I'll say it once more: that has basically _nothing_ to do with whether you spell "pull" as "pull" or "merge". The reason people have trouble wrapping their heads around git is because they have been braindamaged by CVS and SVN, and just don't understand the fairly fundamental new concepts and workflow. That's totally different from then arguing about stupid naming issues. People seem to believe that changing a few names or doing other totally _minimal_ UI changes would somehow magically make things understandable. I claim that isn't so at all. The fact is, git is different from CVS and SVN, and git _has_ to be different from CVS and SVN. It has to be different because the whole model of CVS and SVN is simply fundamentally I claim that those "annoying little issues" are totally made up by people who had trouble wrapping their minds about git, and then make up reasons that have nothing to do with reality for why that might be so. Let's face it, you could just alias "merge" to "pull", and it wouldn't really change ANYTHING. You'd still have to learn the new model. Linus -
Hi, Never ever underestimate pet peeves. If we give many people an obvious reason (however trivial and bike-shed-coloured) to complain, they will complain. If we pull (pun intended) that reason away under their collective backsides, they will have to find another reason to complain. But by the time they found something, they will already be happy git users! But since you just provided a patch to make life easier on non-gitters, I guess you agree with that already. And hopefully you also agree that enhancing the syntax of git-merge to grok "git-merge [-m message] <branch>" and "git-merge [-m message] <url-or-remote> <branch>" would be a lovely thing, luring even more people into using git. Maybe they even start complaining about subversion and CVS calling a merge "update", who knows? Ciao, Dscho -
I do actually think that this discussion has been informative, partly because I never even realized that some people would ever think to do "init-db" + "pull". Making things like that work is easy enough, it's just that I never saw any point until people complained. And when they complained, the initial complaint wasn't actually obvious. Only when Han-Wen actually gave something that didn't work, was it clear that the real issue wasn't so I definitely think we can make "git merge" have a more pleasant syntax. I'm just still not sure that people should actually use it ;) My real point was/is that usually it's really not the "naming details" that people _really_ have problems with. The real problems tend to be in learning a new workflow. We can make some of those workflows easier, but I would heartily recommend that people not worry about naming of "pull" vs "fetch", because that's almost certainly not really the issue. Instead, if you have a problem, rather than concentrating on the names of the programs, say: - what do you want to get done. Most likely it's _trivial_ to do with git, it's just that somebody used the wrong approach, and then it didn't work at all. - give actual examples of a workflow that didn't work or was complex. (again, the "init-db" + "pull" example). And yes, in many cases, it might well be a case of "sure, we can make that _other_ workflow work too". But somebody like me, who has used git for a year and a half, and used BK before it, probably simply uses a different workflow than somebody who comes from CVS. For example, I suspect that your gripe with "git fetch" was just from using it in a really awkward manner. Maybe we could make your workflow work with git too, but maybe it really already (and always) did, you just used a particular tool in a way that made the use be really really painful. Sometimes it's just a question of "ok, use it like _this_, and now it's actually really simple"...
I agree that discussions on naming may cloud the issue, but "learning the workflow" implies that people should adapt to the limitations of their tools. That's only a viable stance when the tools are finished and completely perfect. Until that time, it would be a good goal to remove all idiosyncrasies, all gratuitous asymmetries and needless limitations in the commands of git, e.g. - clone but not a put-clone, - pull = merge + fetch, but no command for merge + throw - clone for getting all branches of a repo, but no command for updating all branches of a repo. Of course, when all warts are fixed, backward compatibility will force us to choose some new names. At that point, a discussion on naming is in place. -- Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen -
Han-Wen Nienhuys <hanwen@xs4all.nl> wrote: throw + merge (at the remote end, that is)? -- Dr. Horst H. von Brand User #22616 counter.li.org Departamento de Informatica Fono: +56 32 2654431 Universidad Tecnica Federico Santa Maria +56 32 2654239 Casilla 110-V, Valparaiso, Chile Fax: +56 32 2797513 -
I think there's a fundamental assumption built into the design of git that most programmers accustomed to a corporate environment don't understand. Namely, that each programmer owns his or her entire "repository", and can do whatever he or she darn well pleases with it at any time. Go ahead and create hundreds of transient branches as part of a scripted "merge complexity metric" calculation. Try three different refactoring strategies on different branches, abandon two of them, and prune them months later. And generally use the power of the SCM to juggle a lot of things at once, because there's no sysadmin gatekeeper stopping you, and the thing is designed and coded scalably so it doesn't grind to a halt as soon as everyone has dozens of private branches. Even if you do find a way to push git in a direction that it doesn't scale, it's no one's problem but your own -- people who pull from you are pulling the _content_ on the branches they care about, not the structure of your repository. One person's gratuitous asymmetry is another's minimalism. (If the symmetric thing doesn't make any sense or can't be implemented scalably, leave it out.) It is more important that git continue to pull = fetch + merge. It is (almost?) always followed by a judgment call based on the merge results. merge + throw doesn't make any sense in terms of the job at hand, which is facilitating human judgments clone is shorthand for the steps involved in setting up a new repository with content similar to an existing one. There isn't any merge involved, and no scope for human judgment, so it's simplest to clone the whole state of the remote repository (including tags and branches) and let the user blow away any branches he doesn't need. But once the clone is done, all of those branches are _truly_ _local_ -- they don't retain any reference to the remote branches, and you can commit to all of them. The only entry placed in .git/remotes is the "origin" of the new clone, which is the "master" of...
Actually, this "origin" entry does contain "Pull:" lines for all of the branches that were cloned, so that "git pull" fetches and merges updates to all of these branches. (If upstream is in the habit of reverting things, you may need "git pull -f"; I just did that on the git repo to handle a failure to fast-forward on the "pu" branch.) Presumably "git branch -D" should inspect everything under .git/remotes to see whether one or more Pull: lines need to be deleted along with the branch. Currently, it looks like "remotes" entries are created only by "git clone" or by hand. Junio, are there any plans to manage the contents of "remotes" through the tool instead of by hand? Cheers, - Michael -
I am not sure what you mean. .git/remotes files do not describe any relationship between local branches (and that is where one of the problems raised in a recent thread comes from -- pull does not notice which branch you are on and change its behaviour accordingly), so I do not think there is anything gained for "git branch I muttered something in a near-by thread Message-ID: <7vr6w78b4x.fsf@assigned-by-dhcp.cox.net> I am reasonably sure a separate tool (what I tentatively called "maint-remote" in the message) is necessary, because, while it would be relatively easy to make "git fetch" and friends add new mappings in the default way under a new option, people with different workflows would want different "default mappings", and adding new mappings for _all_ remote branches is useful only for people who work in one particular way (namely, the CVS-style "the central distribution point is where everybody meets" model). The tool, under "interactive" mode, would probably take one parameter, the short name of a remote ($name), and would give you a form to update its URL:, show ls-remote output against that repository and would let you: - update the URL: which would probably cause the ls-remote to be re-run; - remove existing mappings; - add mappings for a remote branch for which you do not have a corresponding tracking branch, with a straightforward default mapping: refs/heads/$branch:refs/remotes/$name/$branch But I haven't thought things through yet. -
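A minimal sketch of the default mapping mentioned above, written as a fetch refspec in present-day git configuration. This is not the "maint-remote" tool, which is only an idea in this message; the wildcard form of the refspec, the remote name "upstream", the branch name "topic" and the temporary repositories are assumptions made for the illustration.

```shell
#!/bin/sh
# Sketch: map every remote branch to refs/remotes/$name/<branch>.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"

# A stand-in remote repository with two branches.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo hello > upstream/README
git -C upstream add README
git -C upstream commit -q -m "initial"
git -C upstream branch topic

# The local repository that wants tracking branches for that remote.
git init -q local
cd local
name=upstream    # hypothetical short name of the remote
git config "remote.$name.url" "$work/upstream"
# Wildcard spelling of refs/heads/$branch:refs/remotes/$name/$branch
git config "remote.$name.fetch" "refs/heads/*:refs/remotes/$name/*"
git fetch -q "$name"

# List the tracking refs that the mapping produced.
git for-each-ref --format='%(refname)' "refs/remotes/$name"
```

Removing or narrowing that single configuration line is the kind of "remove existing mappings" operation the proposed tool would offer interactively.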
.git/remotes/foo does contain Pull: lines which indicate the local branch onto which to _fetch_ remote changes. It's the subsequent _merge_ that doesn't notice which branch you have checked out. Cheers, - Michael -
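The split between the fetch step and the subsequent merge step can be made visible with a small experiment. The sketch below uses present-day git syntax and throwaway repositories (the names upstream, a and b are made up); it only illustrates that a pull gives the same result as a fetch followed by a merge of FETCH_HEAD into whatever branch is checked out.

```shell
#!/bin/sh
# Sketch: "pull" in one step versus "fetch" + "merge" in two steps.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
cd "$work"

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo one > upstream/file.txt
git -C upstream add file.txt
git -C upstream commit -q -m "first"

# Two identical clones, then upstream gains a commit.
git clone -q upstream a
git clone -q upstream b
echo two >> upstream/file.txt
git -C upstream commit -q -a -m "second"

# Clone a: one combined step.
git -C a pull -q --no-rebase origin master

# Clone b: the same thing spelled out.
git -C b fetch -q origin master    # records the fetched tip in FETCH_HEAD
git -C b merge -q FETCH_HEAD       # merges into the branch that is checked out

head_a=$(git -C a rev-parse HEAD)
head_b=$(git -C b rev-parse HEAD)
```

In the two-step form the merge target is chosen only at the second command, by whichever branch happens to be checked out at that moment.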
As mentioned, in order to "put-clone", you generally have to "create" first, so the "put-clone" really makes no sense. The _true_ reverse is really your - "git init-db" on both sides - "git pull" (your workflow ;) on receiving - "git push" on sending. The fact that we can do "git clone" on the _receiving_ side is an asymmetry, but it's not gratuitous: when receiving we don't need any extra permissions or setup to create a new archive. In contrast, when sending, Again, this is not gratuitous, and the reason is very similar: when you pull, you're pulling into something that _you_ control and _you_ have access to, namely your working directory. In order to merge you have to have the ability to fix up conflicts (whether automatically or manually), and this is something that you _fundamentally_ can only do when you own the repo space. Again, when you do "push", the reason you can't merge is not a "gratuitous asymmetry", but a _fundamental_ asymmetry: by definition, you're pushing to a _remote_ thing, and as such you can't merge, because you can't fix up any merge problems. See? In many ways, if you want _symmetry_, you need to make sure that the _cases_ are symmetrical. If you have ssh shell access, you can often do that, and the "reverse" of a "git pull" is actually just another "git pull" from the other side: ssh other-side "cd repo ; git pull back" Now they really _are_ symmetrical: "git pull" is really in many ways ITS OWN reverse operation. But "push" and "pull" _fundamentally_ aren't symmetric operations, and you simply cannot possibly make them symmetric. Any system that tries would be absolutely horrible to use, exactly because it would be either: - making local/remote operations totally equivalent This sounds like a "good" thing, but from a real user perspective it's actually horribly horribly bad. Knowing the difference between local and remote is what allows a lot of performance optimizations, and a lot...
Point taken; thank you. In that case, we're full circle with the command naming issues. Push and pull are fundamentally asymmetric operations, but then a consistent UI would dictate that they wouldn't be named symmetrically, as they are now. -- Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen -
This one I can understand, but how would you propose to "update all branches", in other words what's your design for mapping remote branch names to local branch namespaces? It would be nice if the design does not straightjacket different repository layouts different people seem to like, but I think it would be Ok to limit ourselves only to support the straight one-to-one mapping and support only separate-remote layout. -
What I want here is a command "git update" that fetches and fast-forwards all the branches which are designated as "tracking" a branch in some known remote repository. And git-clone would set up all branches appropriately so that they would be updated by git-update. Additionally, it would be nice if git-update would also create new tracking branches for all remote repositories that had been designated as being tracked (and git-clone would do this as well). There should also be a mechanism to easily create new tracking support for specific branches or all branches of a repository (could be "git fetch URL branch" or "git fetch --all URL", for example). With this kind of setup, I would use "git update" regularly, and only ever merge locally. And by definition merging with any local tracking branch would have just as much information available as "pull URL branch" so the message would be the same. I've been using git for 10-11 months, so I think I understand the models fairly well, and I'd be really happy with a setup like that. I also have talked with a fair number of (non-git-using) users who think git is confusing, but who I think would find the above scenario just fine. In this scenario, git pull would still work just fine, but it would also be much easier to teach a workflow that didn't use pull at all, so if there's any git-pull confusion that's an actual problem, it could be avoided. Junio, what do you think of a setup something like that? I really don't want to create a command other than "git" to implement it. -Carl
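One way such an update step could look is sketched below as a shell function. No "git update" command is assumed to exist; the function name git_update, the repository names and the branch names are hypothetical, and the plumbing used is present-day git rather than what was available when this was written. It fetches from every configured remote and then fast-forwards each local branch that has an upstream, skipping any branch that has diverged.

```shell
#!/bin/sh
# Sketch: fetch everything, then fast-forward every tracking branch.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

git_update() {
    git fetch -q --all
    current=$(git symbolic-ref -q --short HEAD || true)
    git for-each-ref --format='%(refname:short) %(upstream:short)' refs/heads |
    while read -r branch upstream; do
        [ -n "$upstream" ] || continue      # branch tracks nothing: skip it
        if [ "$branch" = "$current" ]; then
            # Checked-out branch: let merge update the working tree too.
            git merge -q --ff-only "$upstream" ||
                echo "skipped $branch (not a fast-forward)"
        elif git merge-base --is-ancestor "$branch" "$upstream"; then
            # Other branches: just move the ref forward.
            git update-ref "refs/heads/$branch" "$(git rev-parse "$upstream")"
        else
            echo "skipped $branch (not a fast-forward)"
        fi
    done
}

# Demonstration with throwaway repositories.
work=$(mktemp -d)
cd "$work"
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo 1 > upstream/f
git -C upstream add f
git -C upstream commit -q -m "one"
git -C upstream branch stable

git clone -q upstream mine
git -C mine branch -q --track stable origin/stable

# Upstream advances both of its branches.
echo 2 >> upstream/f
git -C upstream commit -q -a -m "two"
git -C upstream checkout -q stable
echo 3 >> upstream/f
git -C upstream commit -q -a -m "three"

cd mine
git_update
```

Because only fast-forwards are performed, no merge decisions are made during the update; any real merging is left as a separate, purely local step, as the message suggests.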
As long as it's consistent with "clone" I'll be happy (I think as part of a separate topic we need to fix the mappings in clone; see --use-separate-remote as default and related). The current case is really annoying: I have to clone into a new repository just to get everything, rather than just being able to fetch everything into the repository I already have. -Carl
put clone would be the putative inverse of clone, ie. make a clone of
throw is the hypothetical opposite of fetch. I agree that this is
academical, because it's logical to only allow fast-forwards for
I think the whole clone design is a bit broken, in that the "master"
branch gets renamed or copied to "origin", but all of the other
branches remain unchanged in their names.
It's more logical for clone to either
* leave all names unchanged
* put all remote branches into a subdirectory. This would also make
it easier to track branches from multiple servers.
At present, I have in my build-daemon the following branches,
cvs-head-repo.or.cz-lilypond.git
hanwen-repo.or.cz-lilypond.git
hwn-jcn-repo.or.cz-lilypond.git
lilypond_1_0-repo.or.cz-lilypond.git
lilypond_1_2-repo.or.cz-lilypond.git
lilypond_1_4-repo.or.cz-lilypond.git
lilypond_1_6-repo.or.cz-lilypond.git
lilypond_1_8-repo.or.cz-lilypond.git
lilypond_2_0-repo.or.cz-lilypond.git
lilypond_2_2-repo.or.cz-lilypond.git
lilypond_2_3_2b-repo.or.cz-lilypond.git
lilypond_2_3_5b-repo.or.cz-lilypond.git
lilypond_2_4-repo.or.cz-lilypond.git
lilypond_2_6-repo.or.cz-lilypond.git
lilypond_2_8-repo.or.cz-lilypond.git
master-git.sv.gnu.org-lilypond.git
master-hanwen
master-repo.or.cz-lilypond.git
origin-repo.or.cz-lilypond.git
stable
stable-2.10
stable--2.10-git.sv.gnu.org-lilypond.git
It would solve lots of problems for me if cloning and fetching would
put branches into a subdirectory, ie.
git clone git://repo.or.cz/lilypond.git
leads to branches
repo.or.cz/lilypond_2_8
repo.or.cz/lilypond_2_6
repo.or.cz/lilypond_2_4
repo.or.cz/master
(etc..)
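The naming scheme above can be sketched in plain shell. The helper name remote_branch_name is invented for this illustration and is not part of git; it only shows how a clone URL and a branch name would combine into the "host/branch" form listed above.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): derive "host/branch" from a clone URL
# and a branch name.
remote_branch_name() {
	url=$1
	branch=$2
	host=${url#*://}	# drop the scheme prefix, e.g. "git://"
	host=${host%%/*}	# keep only the host part
	printf '%s/%s\n' "$host" "$branch"
}

for b in lilypond_2_8 lilypond_2_6 lilypond_2_4 master
do
	remote_branch_name git://repo.or.cz/lilypond.git "$b"
done
```

Keeping the host as a directory prefix is what would let branches from several servers sit side by side without the long suffixed names in the listing above.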
--
Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen
-So effectively to tell git push not to unpack on the remote side, and to push all branches and relevant tags. That's basically exactly what git clone --use-separate-remote should do. Now only if it would become the default... :-) -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ #!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj $/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1 lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/) -
> I claim that those "annoying little issues" are totally made up by people
> who had trouble wrapping their minds about git, and then make up reasons
> that have nothing to do with reality for why that might be so.
Let me put this more personally: I continue to be bitten by stupid naming issues, and the myriad of little mostly non-orthogonal commands. My head is doing just fine otherwise, and has no problems wrapping it around the core of GIT. I've also used Darcs for almost a year. Darcs, which is much less overwhelming. This is not about CVS or SVN, so don't put them up as a strawman. If you want to argue that my brain is warped, use other distributed VCs as an example. The following
mkdir x y
cd x
hg init
echo hoi > bla
hg add
hg commit -m 'yes, I am also too stupid to refuse explicit empty commit messages'
cd ../y
hg init
hg pull ../x
pretty much works the same in Darcs, bzr and mercurial. With GIT, this is what happens
[hanwen@haring y]$ git pull ../x
fatal: Needed a single revision
Pulling into a black hole?
[hanwen@haring y]$ git fetch ../x
warning: no common commits
remote: Generating pack...
Done counting 3 objects.
Deltifying 3 objects.
100% (3/3) done
Total 3, wremote: ritten 3 (delta 0), reused 0 (delta 0)
Unpacking 3 objects
100% (3/3) done
[hanwen@haring y]$ git checkout
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions
fatal: Not a valid object name HEAD
[hanwen@haring y]$ git branch master
fatal: Needed a single revision
at this point, I resort to adding a bogus commit and/or editing .git/HEAD by hand. I'm sure there is a saner way of doing it, but I still haven't found out what it is. This might not be typical GIT use, but it does show the typical GIT user experience, at least mine. If you want to have another example of how not to design a I don't want ANYTHING to really c...
Your example has nothing at all to do with "pull" vs "fetch", though. Your example is about something totally _different_, namely that under
Bzzt. This is where you went wrong, and you blamed "pull". The way you do this in git is to NOT do "git init". Instead, you replace all the
mkdir y
cd ../y
hg init
hg pull ../x
with a simple
git clone x y
and YOU ARE DONE. Now, we could certainly _make_ "git pull" work on an empty git project, but that has _nothing_ to do with what people have been talking about. In fact, the fact that "git fetch" kind of works is not exactly accidental (because "git fetch" _is_ meant to add new local branches too), but all the problems you have with it are due to the SAME issue. You started without any branch at all, because you started with an empty git repo, and you're simply not _supposed_ to do that. So current rule (and this is not new, it's always been true): the ONLY time you use "git init-db" is when you are going to start a totally new project. Never _ever_ otherwise. If you want to track another project, use
It's not that it isn't typical, it's that you are using the wrong model. Maybe it's not well documented, I can easily give you that, but ALL your problems come from that fundamental starting point: you shouldn't have used "git init-db" in the first place. Somebody want to document it? Alternatively, we certainly _could_ make "git pull" just accept an empty git repo, and make it basically create the current branch. The sane interface _exists_. It's called "git clone". Linus -
Actually, only 2 weeks ago, you suggested that I share the website and main source code for my project in a single repository for reasons of organization. In this setup I find it logical to do
git init-db
git pull ..url.. website/master
to wind up with just the 5mb website, instead of the complete 70mb
Yes, I would like that. -- Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen -
I don't disagree per se. It should be easy to support, it's just that it's
not traditionally been something we've ever done.
So the way you'd normally set up a single repo that contains multiple
other existing repositories is to basically start with one ("git clone")
and then add the other branches and "git fetch" them.
So again, instead of "git init-db" + "git pull", you'd just use "git
clone" instead.
Note that there _is_ another difference between "git pull" and
"fetch+merge". The difference being that "git pull" implicitly does the
checkout for you (I say "implicitly", because that's the way the git
merge conceptually works: we always merge in the working tree. That's not
the only way it _could_ be done, though - for trivial merges, we could do
them without any working tree at all, but we don't support that).
And that "git pull" semantic actually means that if you want a _bare_
repository, I think "git --bare init-db" + "git --bare fetch" actually
does exactly the right thing right now too. But "git pull" would not be
the right thing to use.
Btw, another normal way to generate a central "multi-headed repo" is
to not use "pull" or "fetch" or "clone" at ALL, but I would likely do
something like
mkdir central-repo
cd central-repo
git --bare init-db
and that's it. You now have a central repository, and you _never_ touch it
again in the central place except to repack it and do other "maintenance"
(eg pruning, fsck, whatever).
Instead, from the _outside_, you'd probably just do
git push central-repo mybranch:refs/heads/central-branch-name
(actually, you'd probably set up that branch-name translation of
"mybranch:refs/heads/central-branch-name" in your remote description, but
I'm writing it out in full as an example).
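The central-repository workflow described above can be tried end to end with throwaway directories. This is only a sketch using present-day git: it spells the setup as "git init --bare" rather than "git --bare init-db", and the paths, the file name, the commit identity and the helper variable names are all made up for the example.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# The central place: an empty bare repository that is never touched by hand.
git init --quiet --bare "$work/central-repo"

# The "outside": an ordinary repository with one commit on mybranch.
git init --quiet "$work/dev"
cd "$work/dev"
git symbolic-ref HEAD refs/heads/mybranch
echo hello >file
git add file
git -c user.name=Example -c user.email=example@example.invalid commit --quiet -m "first commit"

# Push with the branch-name translation written out in full.
git push --quiet "$work/central-repo" mybranch:refs/heads/central-branch-name

pushed=$(git --git-dir="$work/central-repo" rev-parse --verify refs/heads/central-branch-name)
local_tip=$(git rev-parse --verify mybranch)
echo "central-branch-name is at $pushed"
```

The central repository ends up with a branch named central-branch-name even though the local branch is called mybranch, which is the translation the refspec expresses.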
So there are many ways to do it. It just happens that "git init-db"
followed by "git pull" is not one of them ;)
(And the real reason for that is simple: "git pull" simply wants to have
something to _start_ with...
For that we'd also need a way for clone to be able to fetch just a
single branch, and not all of them as well.
There is some clone vs. fetch asymmetry here that has annoyed me for a
while, and that I don't think has been mentioned in this thread
yet. Namely:
clone: can only be executed once, fetches all branches, "remembers"
URLs for later simplified use
fetch: can be executed many times, fetches only named branches,
doesn't remember anything for later
I've often been in the situation where I cloned a long time ago, but
I'd like to be able to fetch everything that I would get if I were to
start a fresh clone.
-Carl
Here's a very lightly tested patch that allows you to use "git pull" to populate an empty repository. I'm not at all sure this is necessarily the nicest way to do it, but it's fairly straightforward. Junio, what do you think? Linus
---
diff --git a/git-pull.sh b/git-pull.sh
index ed04e7d..7e5cee2 100755
--- a/git-pull.sh
+++ b/git-pull.sh
@@ -44,10 +44,10 @@ do
 	shift
 done
 
-orig_head=$(git-rev-parse --verify HEAD) || die "Pulling into a black hole?"
+orig_head=$(git-rev-parse --verify HEAD 2> /dev/null)
 git-fetch --update-head-ok --reflog-action=pull "$@" || exit 1
 
-curr_head=$(git-rev-parse --verify HEAD)
+curr_head=$(git-rev-parse --verify HEAD 2> /dev/null)
 if test "$curr_head" != "$orig_head"
 then
 	# The fetch involved updating the current branch.
@@ -80,6 +80,11 @@ case "$merge_head" in
 	exit 0
 	;;
 ?*' '?*)
+	if test -z "$orig_head"
+	then
+		echo >&2 "Cannot merge multiple branches into empty head"
+		exit 1
+	fi
 	var=`git-repo-config --get pull.octopus`
 	if test -n "$var"
 	then
@@ -95,6 +100,12 @@ case "$merge_head" in
 	;;
 esac
 
+if test -z "$orig_head"
+then
+	git-update-ref -m "initial pull" HEAD $merge_head "" || exit 1
+	exit
+fi
+
 case "$strategy_args" in
 '')
 	strategy_args=$strategy_default_args
-
So this is the place that probably wants a "git-checkout" before the exit, otherwise you'd (illogically) have to do it by hand for that particular case. Of course, we should _not_ do it if the "--bare" flag has been set, so you might want to tweak the exact logic here. Linus -
As you said, pull inherently involves a merge, which implies the existence of an associated working tree, so I do not think there is any room for --bare to get in the picture. We already do the checkout when we recover from a fetch that is used incorrectly and updated the current branch head underneath us. To give the list a summary of the discussion so far, here is a consolidated patch.
-- >8 --
From: Linus Torvalds <torvalds@osdl.org>
Subject: git-pull: allow pulling into an empty repository

We used to complain that we cannot merge anything we fetched with a local branch that does not exist yet. Just treat the case as a natural extension of fast forwarding and make the local branch's tip point at the same commit we just fetched. After all an empty repository without an initial commit is an ancestor of any commit.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
diff --git a/git-pull.sh b/git-pull.sh
index ed04e7d..e23beb6 100755
--- a/git-pull.sh
+++ b/git-pull.sh
@@ -44,10 +44,10 @@ do
 	shift
 done
 
-orig_head=$(git-rev-parse --verify HEAD) || die "Pulling into a black hole?"
+orig_head=$(git-rev-parse --verify HEAD 2>/dev/null)
 git-fetch --update-head-ok --reflog-action=pull "$@" || exit 1
 
-curr_head=$(git-rev-parse --verify HEAD)
+curr_head=$(git-rev-parse --verify HEAD 2>/dev/null)
 if test "$curr_head" != "$orig_head"
 then
 	# The fetch involved updating the current branch.
@@ -80,6 +80,11 @@ case "$merge_head" in
 	exit 0
 	;;
 ?*' '?*)
+	if test -z "$orig_head"
+	then
+		echo >&2 "Cannot merge multiple branches into empty head"
+		exit 1
+	fi
 	var=`git-repo-config --get pull.octopus`
 	if test -n "$var"
 	then
@@ -95,6 +100,13 @@ case "$merge_head" in
 	;;
 esac
 
+if test -z "$orig_head"
+then
+	git-update-ref -m "initial pull" HEAD $merge_head "" &&
+	git-read-tree --reset -u HEAD || exit 1
+	exit
+fi
+
 case "$strategy_args" in
 '')
 	strategy_args=$strategy_default_args
-
Fair enough. Feel free to add the signed-off-by from me too, Linus -
Yeah, I talked about making "merge" treat missing HEAD as a special case of fast forward, but I like yours better. It is a lot cleaner and to the point. -
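For readers who want to see the behaviour the patch aims at, here is a small reproduction of the earlier x/y example, assuming a present-day git in which pulling into a freshly initialized repository is supported. The directory names follow the example in this thread; the commit identity and variable names are placeholders.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Repository x: one commit on master.
git init --quiet "$work/x"
cd "$work/x"
git symbolic-ref HEAD refs/heads/master
echo hoi >bla
git add bla
git -c user.name=Example -c user.email=example@example.invalid commit --quiet -m "initial commit"

# Repository y: freshly initialized, no commits, then pull into it.
git init --quiet "$work/y"
cd "$work/y"
git pull --quiet ../x master

source_tip=$(git --git-dir="$work/x/.git" rev-parse --verify HEAD)
pulled_tip=$(git rev-parse --verify HEAD)
echo "y is now at $pulled_tip"
```

The pull is treated like a fast-forward from nothing: the current branch in y ends up at the fetched commit and the working tree is checked out, which is what the update-ref plus read-tree step in the consolidated patch does.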
You're misunderstanding me: the multi-repo at git.sv.gnu.org is the remote one. The example I gave was about locally creating a single project repo from a remote multiproject repo.
On a tangent: why is there no reverse-clone? I have no shell access to the machine, so when I created the remote repo, I had to push, and ended up putting 1.2 Gb data on the server. <looks at manpage> is this send-pack? From UI perspective it would be nice if this could also be done with clone,
yes, this works. Two remarks:
* it needs website/master:master otherwise you still don't have a branch.
* why are objects downloaded twice? If I do
git --bare fetch git://git.sv.gnu.org/lilypond.git web/master
it downloads stuff, but I don't get a branch. If I then do
git --bare fetch git://git.sv.gnu.org/lilypond.git web/master:master
it downloads the same stuff again.
-- Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen -
Ahh.
Ok, try the patch I just sent out, and see if it works for you. It
_should_ allow you to do exactly that
mkdir new-repo
cd new-repo
git init-db
git pull <remote> <onehead>
and now your "master" branch should be initialized to "onehead".
Oh, except I just realized that I forgot to do a "git checkout" in my
patch, so you'd need to add that (or do it by hand, but you really
shouldn't need to, since the checkout is implied by the "pull").
The downside with this is that it does NOT populate your "remotes"
information (like "git clone" would have done), so either we'd need to
teach "git pull" to do that too, or you just have to do it by hand (so
Yeah, you're supposed to "init-db" and "push". Right now, that tends to
unpack everything (which is bad), although that is hopefully getting fixed
"git push" uses send-pack internally, you shouldn't ever need to use it
The creation of a new archive tends to need special rights (with _real_
ssh access and a shell you could do it, but "ssh+git" really means "git
protocol over a connection that was opened with ssh, but doesn't
necessarily have a real shell at the other end").
So for most protocols, you simply cannot (and shouldn't) do it. Think
about services like the one that Pasky has set up, that allow you to set
up a new git repo - the setup phase really _has_ to be separate (because
you need to set up your keys etc).
So I think the above syntax is actually not a good one, because it cannot
work in the general case. It's much better to get used to setting up a
repo first, and then pushing into it, and just accepting that it's a
two-phase thing.
Also, from a bandwidth standpoint, you can often (although obviously not
always) make the setup start with something that is _closer_ to what you
want to do. So, for example, you'd often do something like:
(a) ssh to central repository
(b) create the new repository by cloning it _locally_ at the central
place from some other ...
What happens on savannah is that the sysadmins set up an empty GIT repo with access, and leave it to you to push the stuff. Of course, Perhaps ; from a UI viewpoint, it would be nice though, even if it were aliased to a simple push. (Darcs has a get command analogous to No, I don't understand. In the fetch all the objects with their SHA1s were already downloaded. I'd expect that the fetch with a refspec would simply write a HEAD and a refs/heads/master, and notice that all the actual data was already downloaded, and doesn't download it again. -- Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen -
Hi, This is actually a perfect example for - a script that is porcelain as well as plumbing (you are supposed to use it directly, or via pull), and for - a terrible UI. _If_ you use git-fetch directly you virtually always want to store the result. I was tempted quite often to submit a patch which adds a command line switch --no-warn, which is passed to git-fetch by git-pull, and without which git-fetch complains if the branch-to-be-fetched is not stored right away (and refuses to go along). _Also_, git-pull not storing the fetched branches at least temporarily often annoyed me: the pull did not work, and the SHA1 was so far away I could not even scroll to it. The result: I had to pull (and fetch!) the whole darned objects again. Again, I was tempted quite often to submit a patch which makes git-pull fetch the branches into refs/fetch-temp/* and only throw them away when the merge succeeded. Ciao, Dscho -
Again, why didn't you use FETCH_HEAD? If the user doesn't give us a head to write to, we clearly MUST NOT write to any long-term branch. That would be a _horrible_ mistake. So all your complaints seem totally misplaced. The UI is both usable and practical, and your complaint that git pull doesn't store the fetched branches is just NOT TRUE. And your "solution" is obviously totally unusable. git ABSOLUTELY MUST NOT overwrite any existing branches unless explicitly told to do so by the user. So I really don't see your point. A lot of the complaints seem to not be about the interfaces, but about people not _understanding_ and knowing what the interfaces do. If you were confused about something (like not realizing that FETCH_HEAD is there and very much usable), how about sending in a patch to make FETCH_HEAD use clearer in whatever docs you looked at and didn't find it mentioned in. Now, there is no question that some of the interfaces can get a bit "interesting" to use. For example, if you really don't want to re-fetch for some reason, FETCH_HEAD actually does contain enough information that you should be able to just re-do a failed merge, for example, including the message generation. But at that point it really _does_ get a bit complicated, and you end up doing something like git merge "$(git fmt-merge-msg < .git/FETCH_HEAD)" HEAD FETCH_HEAD which should _work_, but I'm not going to claim that it's all that easy to understand. (That said, read that one-liner a few times, and suddenly it doesn't seem _that_ complicated any more, now does it? You can probably even guess what it's really going to do, even if you don't know git all that well. It's not unreadable line noise, is it?) Of course, if I had a merge that failed (the most common reason being that I had some uncommitted patch in a file that wanted to be updated by the merge), I'd never actually do the above one-liner. I'd just re-do the pull. But if networking was _really_ sl...
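The FETCH_HEAD workflow mentioned above can be tried in isolation. This is a sketch against present-day git: it uses the plain "git merge FETCH_HEAD" form rather than the older three-argument merge syntax quoted in the message, and the repository names, commit identity and variable names are invented for the example.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
commit() { git -c user.name=Example -c user.email=example@example.invalid commit --quiet "$@"; }

# x gets two commits; y is cloned after the first one, so it is one behind.
git init --quiet "$work/x"
cd "$work/x"
git symbolic-ref HEAD refs/heads/master
echo one >file
git add file
commit -m "first"
git clone --quiet "$work/x" "$work/y"
echo two >>file
commit -a -m "second"

cd "$work/y"
# Fetch without naming a local branch: the result is recorded in .git/FETCH_HEAD.
git fetch --quiet ../x master
fetched=$(git rev-parse --verify FETCH_HEAD)
# The merge message can be regenerated from FETCH_HEAD.
git fmt-merge-msg <.git/FETCH_HEAD
# Merge the fetched commit without going back to the network.
git merge --quiet FETCH_HEAD
merged=$(git rev-parse --verify HEAD)
echo "merged $fetched"
```

Here the merge is a fast-forward, so no merge commit is created; with diverged histories the same FETCH_HEAD would be merged with a generated message like the one fmt-merge-msg prints.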
From the point of view of a user, there is not really a difference between the two. As a user, you form a mental model of how things work by looking at the interface. If the interface is bad, the user creates a faulty model in his head, and starts doing things that are perfectly logical in the faulty model, but stupid and silly when you consider the actual internals. A nice book about this is "The Design of Everyday Things" by Donald Norman. -- Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen -
Hi, It is a terrible UI, because it was not that obvious to me. And I consider myself not a git newbie. Besides, it is not really a temporary branch. If it was, the pull would I was _not_ suggesting a long-term branch. Just a way to do-what-i-want Guess three times why I did not post the patches. But the real problem is not necessarily the behaviour; it is the obscure fashion of the behaviour. You may not understand that problem, because you were there from the beginning. You saw the big-bang and how all the quarks formed all of a sudden, and how matter and eventually planets and suns came into being. But others (me included) were not there. Or they did not really watch. And now they see all these creatures, and plants, and bacteria, and they do not understand how these are all connected, because of that. And now they think "wow that must have been some intelligent design, and really a miracle, and I cannot understand how it works." But that is not true (the latter part of course). There is something to be said about the simplicity of Mercurial. Its inner workings may suck, but people get easily attracted by it. I do not claim we should imitate Mercurial, or even hide the index (even if I sometimes wonder if the index is not just a clever way to accelerate But the interfaces should be usable interfaces! They should _explain_ what I find that quite easy to understand. Why? Because I happen to _know_ the syntax of -merge and -fmt-merge-msg. For similar reasons I _understand_ why -pull behaves like it does. But others don't; they will shudder and then run. Maybe it is not important that -pull fetches all objects all over again. But it _is_ important to make things like merging branches (local or remote) trivial. It _is_ important to make the user experience be fun. Ciao, Dscho -
Heh. The "temporary branches" are actually the _original_ branches as far as git is concerned. The long-term branches only came later. So in many ways, HEAD, FETCH_HEAD, MERGE_HEAD and ORIG_HEAD are more fundamental than any long-term branch has ever been, and maybe they should be taught first as such. So you're newbie enough that you've only seen those new-fangled "real" branches. When I was young, we had to walk to school up-hill in three feet of snow Well, exactly because they are temporary, we can't actually trust the objects they point to. They have no "real" long-term life, so no, I'm afraid that we always will have to re-fetch the objects, because fetching them is the only way to know that we still have them. That said, we could certainly _make_ them be honored by things like "git prune" and friends. But yes, they really _are_ temporary branches right now, and part of the meaning of that "temporary" is exactly the fact that git fetch will not trust that you still have the objects. For example, if you used one of the old-fashioned commit walkers, maybe we got the initial commit, but we may not have gotten the whole _chain_. See? Well, we clearly should document them better. Anybody? Linus -
Hi, Nonono. We made _sure_ that FETCH_HEAD is only written once _all_ the objects were received. So, actually, we _can_, and we _should_ trust the objects they point to! Huh? I am quite certain that FETCH_HEAD is not updated in that case. If it is, that's a bug. Ciao, Dscho -
