Hi,

I've read the manual, and I believe I have a correct understanding of how the index works, technically speaking. Still, I'm not clear about the rationale for such a design.

Almost any other decent system has an equivalent to cache the stat information (bzr calls this stat-cache, hg calls it dirstate IIRC). That is, if you run "$vcs diff" twice, the second run will only need to stat all files, never diff them.

But the fact that git actually remembers the _content_ of files in the index, and that the default behavior for "commit" is to commit only the content that is explicitly "git add"ed, is something I've never seen outside git.

At first sight, I find it rather annoying. My usual workflow is

  <hack hack hack>
  % $vcs status
  % $vcs commit -m "describe whatever I did"
  <hack hack hack>
  ...

With git, I'd do

  <hack hack hack>
  % git status
  % git add X
  % git add Y
  % git status
  % git commit

or

  <hack hack hack>
  % git status -a
  % git commit -a -m "..."

In the former case, I have more commands to type, and in the second case, I lose part of the stat-cache benefit: if I run "git status -a" twice, the second run will actually diff all the files touched since the last run, since "git status -a" updates a temporary index and discards it afterwards, so it doesn't update the stat information in the index (while "git status" would have).

In both cases, I can't really see the benefit. I'm pretty sure this is a FAQ, and I'm also pretty sure there are good arguments for it, but I can't find them anywhere.

Thanks for your answers,
-- Matthieu
-
Hi, As a newbie, I agree with Matthieu: Git's index is surprising for people coming from the CVS/SVN (mindless?) world. So good documentation about this, even in tutorials, is really important. In order to improve my productivity with Git, and to avoid traps when moving from SVN to Git, I often use the Git Emacs mode. It is really useful for beginners as it works similarly for CVS, SVN and Git: synthetic view of all modifications, easy selection of what will be committed... The biggest drawback of this "porcelain": using it, you do not come to understand Git's index philosophy. -- Guilhem BONNEFILLE -=- #UIN: 15146515 JID: guyou@im.apinc.org MSN: guilhem_bonnefille@hotmail.com -=- mailto:guilhem.bonnefille@gmail.com -=- http://nathguil.free.fr/ -
I think that the confusing thing isn't really the index, but the fact that git, by default, will make commits where the content in the commit is different from the content in the working directory. (In fact, you can use git-hash-object --stdin and git-update-index --cacheinfo to do a commit which shares no content at all with any present or past state of the working directory!) In other version control systems, you have to use some option or argument to make that kind of non-matching commit (and you're generally limited in how your commits can fail to match the working directory). I think the confusion is that git requires an option to say that you want the commit to match the working directory, as opposed to creating a non-matching commit, which is generally the more advanced and more unusual case. I think this is why people mostly get to understand the index by way of using it to resolve a conflicted merge: in that case, you have to make the index match the working directory before committing, and the index tracks your progress in reaching this state, which is the intuitive use of the index in normal situations. -Daniel *This .sig left intentionally blank* -
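The parenthetical about git-hash-object and git-update-index can be tried in a throwaway repository. What follows is only a sketch: the path "ghost.txt", the blob text and the committer identity are made up for illustration, and it uses the three-argument form of --cacheinfo.

```shell
set -e
# Throwaway repository; ghost.txt and the identity below are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

# Write a blob straight into the object database; no file is created on disk.
blob=$(printf 'never on disk\n' | git hash-object -w --stdin)

# Register that blob in the index under a path of our choosing.
git update-index --add --cacheinfo 100644 "$blob" ghost.txt

# Commit whatever the index holds.
git commit -q -m "content that never existed in the working tree"

committed=$(git ls-tree --name-only HEAD)
echo "committed: $committed"
ls -A    # shows only .git: the committed file is not in the working directory
```

The commit contains ghost.txt even though the working directory never did, which is the extreme case of a commit that does not match the working directory.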
Hi, So, you are not only a newbie, but you have to unlearn some CVS braindamage. I don't know how to make it even more prominent that CVS users should read a special introduction first. AFAICT such a hint is in all the appropriate places. (I mean, you would not expect to be able to fly a plane just because you have learnt to drive a car, would you?) Ciao, Dscho -
Hi, http://www.kernel.org/pub/software/scm/git/docs/tutorial.html does not talk about anything like that (it links to "Git for CVS users" but that's really just about importing from CVS and the shared repository workflow). On the other hand, I think the tutorial linked above gives quite a clear explanation of git commit -a, git add etc. Guilhem, what do you find missing in the tutorial about this topic? -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ Ever try. Ever fail. No matter. // Try again. Fail again. Fail better. -- Samuel Beckett -
Well, people worried that documentation and command set before 1.5.0 exposed index too much, making learning curve too steep by having one extra thing people need to learn before starting to be productive with git. Now post 1.5.0 people are confused, quite rightly, that they are not told about index early enough. Let alone flying. Just taxiing straight was hard for me until I shook the habit I picked up from driving a car. -
git-gui is a good tool here (so good, in fact, that this is the second time today I spam the list about it). It shows very pedagogically the diff between HEAD and index, and the diff between index and working dir, and allows you to point and click your way to committing precisely the subset of changes you intended to commit. As an added bonus, it's perfectly usable even if you don't know anything about emacs. -- Karl Hasselstr
Yeah. You'd better get used to it, because it's fundamental.
Here's the rationale list:
- It's fundamentally the only sane thing to do.
Git tracks content at _all_ levels, not "files". So this is more than
an implementation issue, it's a fundamental "how the world works"
issue. The fact that everybody else gets it wrong is _their_ problem.
[ Corollary: the fact that your brain has rotted from using those
broken systems is obviously your problem, and sadly, there's nothing
else we can do than try to show the right way and hope that the
neurons re-generate. CVS has caused endless suffering, this is just
one small example of it ]
- You fundamentally cannot do it any other way.
Not doing it the way git does it (point to the content) means that the
index-replacement has to point to something else, namely a "file ID".
That's so broken as to be really really sad. In CVS, for example, there
obviously isn't any "file ID", so what does the "index" in CVS point
at?
Right. The "index" in CVS is the Entries file, and it not only lacks
stat information, it also lacks any other information, which means that
the "file ID" is _literally_ just the pathname itself. That causes
obvious problems, so nobody sane would ever suggest that this is a good
idea.
So what do other people use? They tend to not have understood the
"content is king" thing (which is what git uses), so they add something
*else* to the "index" file than the content. What can I say? People are
morons. I'm constantly amazed at just how stupid SCM people seem to be.
In most systems, that "something else" is a "file ID". That just means
that they are fundamentally broken whenever they do any trivial merge
with renames. Just don't do it. I've talked before about why tracking
file ID's is wrong - it's just as wrong as thinking that the "ID" of a
file is the path.
- Tracking content in the index is f...
Thanks a lot for the detailed explanations.
Note that I'm not "complaining", but just not understanding something.
(I would actually complain about the documentation not being clear
enough, but I'll try to complain with a contribution instead ;-) I'll
Well, git's index still tells more than "the content FOOBAR exists,
Of course, I don't have a strong argument against it. The biggest
annoyance is that my fingers are used to "commit -m message", and now
type "commit -a message", but ...
The reason why I'm posting this is that I was wondering whether
"commit -a" not being the default was supposed to be a message like
"you shouldn't use it too often".
It seems it isn't. I'll just get used to "commit -a" (and probably
alias it), and discover the actual benefits of the index little by
^^^
Not sure this was intentional, but your spelling of "as" when used to
talk about CVS seems to reveal something about your state of mind ;-).
Thanks,
--
Matthieu
-
As promised, here's a FAQ entry on the wiki: http://git.or.cz/gitwiki/GitFaq#head-3aa45c7d75d40068e07231a5bf8a1a0db9a8b717 Feel free to correct it. Anyway, thanks for the interesting discussion. -- Matthieu -
Heh. Making the index very visible makes sense when you are merging, and Linus and Junio are both integrators who spend a lot of time merging. Hence the default is for git-commit to observe the index. I agree with Linus' other points too, but at the end of the day, it makes life easier and saner mainly when merging, at the expense of having to pay a bit more attention in common commits. The tradeoff makes sense _especially_ if you are the integrator. So I do git-commit -a, and typing that '-a' is a small price to pay for the best SCM I've ever used ;-) cheers, martin -
Hi, You're saying that the main use of the index is to help merging. I have to disagree strongly. When I have been chasing a bug all over the place, and finally found it, my working tree is a mess. Lots of assertions, lots of debugging statements, some of them commented out. So, now it is cleanup time, right? The problem is that more often than not, I broke my fix while cleaning up. Therefore, I now put all changed files into the index (git add -u), and clean up the files one by one, always checking with "git diff" and "git diff HEAD" what I still have to do. Yes, very often I can just take the original version of a file (git reset --soft <file...> would be handy here), but it helped me quite a number of times to have my messed-up-but-working state in the index. In a sense, I am using the index as the stash commit we talked about every once in a while. Ciao, Dscho -
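A rough sketch of that "park the messy-but-working state in the index" routine, in a scratch repository. The file name fix.c and its contents are invented for the example; only git add -u, git diff, git diff HEAD and git checkout -- <file> are the point.

```shell
set -e
# Scratch repository; fix.c and its contents are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

printf 'int main(void) { return 1; }\n' > fix.c
git add fix.c
git commit -q -m "initial (buggy)"

# After bug hunting: the fix works, but debugging crud is still in the file.
printf 'int main(void) { /* DEBUG */ return 0; }\n' > fix.c

# Park the messy-but-working state in the index.
git add -u

# Clean up in the working tree only.
printf 'int main(void) { return 0; }\n' > fix.c

# Index vs. working tree: what the cleanup changed since the parked state.
git --no-pager diff
# HEAD vs. working tree: everything that would go into the commit.
git --no-pager diff HEAD

# If the cleanup broke the fix, the parked version comes back from the index.
git checkout -- fix.c
cat fix.c
```

The last command prints the version with the DEBUG comment again, i.e. the state that was staged before the cleanup started.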
It is definitely true that some of the advantages of the way git does the index really start shining when merging and you have content conflicts. What we've done to "git diff" really makes things a lot easier (and anybody who hasn't used "gitk --merge" after a content conflict really hasn't realized how *helpful* git is when merging content conflicts). However, in all honesty, while the whole "index for merges" comes from pretty damn early in git history (the whole "stage number" thing appeared on April 15th 2005 - so it was about a week after the first release), it wasn't the original impetus of the way git works. Git used explicit index updates from day 1, even before it did the first merge. It's simply how I've always worked. I tend to have dirty trees, with some random patch in my tree that I do *not* want to commit, because it's just a Makefile update for the next version (to remind me - I've released kernel versions too many times with an old version number, just because I forgot to update the Makefile). Or other things like that - I have small test-patches in my tree that I want to build, but that I don't want to commit, and I end up doing big merges and whole patch-application sequences with such a dirty tree (obviously if the patch or merge wants to change that file, I then need to do something about that dirty state, but it happens surprisingly seldom). So the whole "update stuff to be committed explicitly" ends up _really_ shining during a merge, but it actually is how I do non-merge development too. Linus -
Hmm, does this really work so well for you guys? Because thanks to Mr. Murphy, in my case, when I have some custom Makefile tweak, I always need to commit some unrelated changes involving Makefile more often than usual, and so on; so in general case, file-level changes exclusion doesn't really work so well for me. So this use of index seems to me really as a workaround for more fine-grained change control (in a similar way that rename following would be a workaround for lack of more fine-grained content moves tracking). I will have to look into git-gui's hunk-level control and maybe reimplement it in tig. -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ Ever try. Ever fail. No matter. // Try again. Fail again. Fail better. -- Samuel Beckett -
Well, one thing is that I obviously mainly work on a relatively large project, and one that has been carefully de-centralized over a long long time, so the source code I work with - the kernel - may be more amenable to my workflow than most.

For example, we have long long since tried to avoid having central files that everybody changes - because it's such a pain to manage, even with good automated merging (and even more with central people still using just series of patches).

In other words, in well-maintained larger projects, you simply don't see those kinds of conflicts very often: people don't work on the same files very much. I regularly go for days, and easily merging hundreds of thousands of lines of changes, with a dirty tree, and the merges don't affect it at all.

And if I happen to hit a dirty file, the pull will just say "cannot merge", and I can stash away my changes, and just re-do. So the cost of a

Many people seem to enjoy per-hunk commits, but I seldom do that. Maybe it's just because I'm *so* comfortable with diffs, that when I clean up an ugly sequence of commits, what I do is literally:

 - I make sure that my ugly sequence of commits is on some temporary branch, but that the _end_result_ is good and clean (ie I will have tested the end result fairly well, and made sure that there are no debug statements etc crud left). I would call this branch something like "target", because the end result of that branch is what I'm looking for - even if the commits in the sequence that gets me there are individually ugly!

 - I just switch back to my starting point (and now I'm usually on "master"), and do

	git diff -R target > diff

   to create a diff of my current tree (which is initially the starting point) to the good result.

 - I actually edit the "diff" file by hand, and edit it down to the part I actually want to commit as the first in the series. And then I just do a "git-apply diff" to actu...
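The first round of that "diff against the target, trim, apply" loop can be sketched roughly as below. Since hand-editing cannot be scripted, the trimming is replaced here by restricting the diff to one path; the file names a.txt, b.txt and step1.diff are hypothetical, and the branch name "target" follows the message.

```shell
set -e
# Scratch repository; a.txt, b.txt and step1.diff are hypothetical names.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

printf 'one\n' > a.txt
printf 'one\n' > b.txt
git add a.txt b.txt
git commit -q -m "starting point"

# The ugly sequence lives on a temporary branch whose end result is good.
git checkout -q -b target
printf 'two\n' > a.txt
git commit -q -am "ugly step 1"
printf 'two\n' > b.txt
git commit -q -am "ugly step 2"

# Back to the starting point.
git checkout -q -

# Diff from the current tree to the good result. Limiting it to a.txt
# stands in for editing the diff down by hand.
git diff -R target -- a.txt > step1.diff

# Apply the trimmed diff to the working tree and commit it as the first
# clean commit of the series.
git apply step1.diff
git commit -q -am "first clean step"

git --no-pager diff --stat target    # what is still missing: b.txt
```

Repeating the diff, trim, apply and commit steps until "git diff target" is empty gives a clean series ending at the same tree as the ugly branch.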
I obviously agree with this. As I said a few times I regret introducing "add -i" --- it encourages a wrong workflow, in that what you commit in steps never matches what you had in the working tree and could have tested until the very end. -
Which is why I'm considering shelving support (of some kind) in git-gui... but I'm probably not going to take away the current index view, nor am I going to take away the current hunk selection. But I would like to make it easier for non-patching-editing gods (Linus) to pull hunks in from a shelf, test them, and commit them. Said shelf probably would be another branch, much as Linus' nicely documented workflow does... -- Shawn. -
FWIW, Cogito supports shelving of uncommitted changes when switching a branch (so that they are not retained through the switch but restored when you switch back to the original branch) by committing the local changes to refs/shelves/HEADNAME. -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ Ever try. Ever fail. No matter. // Try again. Fail again. Fail better. -- Samuel Beckett -
On the other hand, not all changes require any testing at all. For example, if you're using git to manage documentation, it is totally reasonable to commit a fix for a simple spelling error in one part of a file while not committing an in-progress rewrite of another part. -Steve -
Yeah, I don't think "git add -i" is a horrible flow - it just shouldn't be the only or the primary one (ie apparently it *is* the primary one for darcs, and that's a mistake!) Of course, whether "git add -i" is a nice interface or not, I dunno. Personally, if I wanted to do hunk selection, I think I'd stick to something graphical where I can just click on the hunks. But that's just me. Linus -
Note that darcs has a way to test before commit even for partial commits. It re-creates your working tree, hardlinking unmodified files, and runs a command there as a precommit hook. I still prefer the old good "you commit what's in the tree, and run whatever you want before commit", but their approach seems interesting also in this case. -- Matthieu -
It only sounds like a complicated sequence because you didn't write a script to do it...

  $ git checkout -b clean origin
  $ git-refine target
  (edit the patch in the editor that pops up)
  $ git-refine
  Test changes and commit
  $ make test
  ...
  $ git commit
  (write message)
  $ git-refine
  (edit the patch, etc)
  ...
  $ git commit
  $ git-refine
  All done.

I actually wrote it years ago, but I couldn't describe my workflow well enough, so I didn't submit it. If everybody seems to be doing the same thing, I can submit my script... -Daniel *This .sig left intentionally blank* -
Well, I actually think it sounds like a complicated sequence because I tried to explain what I do. The "script" parts don't really end up being any smaller, and not scripting it actually means that I can (and often do) things outside of a strict scripting environment.

As mentioned, I not only mix it up with "git cherry-pick", but since I just use "git diff", I can - and do - things like pick only a certain set of files to diff and edit the patch on. So it's an iterative process at several levels (the "outer" level is the act of actually committing each change, and iterating to the next one, while the "inner" level is often a sequence of "git diff" exploration), it's not very fixed.

For example, when I said that I do

	git diff -R target > diff

that's not strictly true. The "git diff -R" is useful for comparing the current working tree to another commit, but quite often I actually end up doing it differently, and doing it as

	git diff ..target file > diff
	.. edit ..
	git apply diff

or, if I don't need the edit (ie just the fact that I limit it to a single file is a sufficient "edit" in itself), I might just do

	git checkout target file

instead, which will fetch the whole file from the "target" branch (and also update it in the index, which may or may not actually be what I want, but that's a different issue).

So the "process" as far as I'm concerned is actually much more fluid than necessarily always working with diffs. Git gives you so many ways to do things like this, and I'm pretty comfortable with lots of them. Linus -
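The difference between the two variants (diff plus apply touches only the working tree, while checking a file out of "target" also updates the index) can be seen in a small sketch. File names a.txt, b.txt and step.diff are hypothetical, and an explicit "--" is added before the paths to keep them unambiguous.

```shell
set -e
# Scratch repository; a.txt, b.txt and step.diff are hypothetical names.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

printf 'one\n' > a.txt
printf 'one\n' > b.txt
git add a.txt b.txt
git commit -q -m "starting point"

git checkout -q -b target
printf 'two\n' > a.txt
printf 'two\n' > b.txt
git commit -q -am "end result"
git checkout -q -

# Variant 1: diff from HEAD to target for one file, then apply it.
# Only the working tree changes; the index still has the old a.txt.
git diff ..target -- a.txt > step.diff
git apply step.diff

# Variant 2: fetch the whole file from target.
# This updates both the working tree and the index for b.txt.
git checkout target -- b.txt

unstaged=$(git diff --name-only)
staged=$(git diff --cached --name-only)
echo "changed in working tree only: $unstaged"
echo "already staged:               $staged"
```

Whether the staged side effect of the second variant is wanted depends on what the next commit should contain, as the message notes.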
Geez, this is similar [in nature, not scale] to what I've been doing. After reading about people "right-clicking on hunks in git-gui", I was convinced I needed to force myself to do more manipulations inside git itself. Hmm... Maybe, in addition to [or in] the User Manual, git should have some workflow examples, which have been cribbed from various emails on this list? Thanks, -- Dana L. How danahow@gmail.com +1 650 804 5991 cell -
That's something several people have asked for, and I think it's a great idea--I just haven't personally had much time to get to it. But I'd happily take even very rough patches and help get them into shape. The way I'd thought of doing it was having an "examples" section at the end of each chapter, with subsections for each individual example; see the one at the end of the "exploring git history" chapter: http://www.kernel.org/pub/software/scm/git/docs/user-manual.html#history-examples They should use the material introduced in the associated chapter, but it's also OK to introduce new commands (with references to the man pages) when their use in the example is pretty self-explanatory. (In fact, this is a great way to introduce more commands and options--git has so many that it would be tedious to try to be comprehensive, but they'd fit well in examples.) The patch-editing stuff discussed above might fit best at the end of "rewriting history and maintaining patch series". --b. -
There is some workflow-related discussion accumulated over the years in Documentation/howto/, some of it already suffering from quite a bit of bitrot. :-( -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ Ever try. Ever fail. No matter. // Try again. Fail again. Fail better. -- Samuel Beckett -
Yup. I think we should one-by-one update those and suck them into the manual. (Patches accepted!) --b. -
On Wed, 9 May 2007 08:52:09 -0700 (PDT), Linus Torvalds wrote:
[Snip good description of rebuilding a branch to meet some "target" state.]

That's all really good stuff. And as you mentioned you sometimes use cherry-pick during this rebuilding, one can also use "git add -i" to help with splitting up an ugly commit that should have been multiple commits. For example, a sequence might look like this (I always use "desired" where you use target):

	git diff HEAD desired | git apply
	git add -i
	git commit
	git reset --hard
	# test here and commit --amend as needed

And repeat that as needed. It's really no different than your "edit the diff" approach. It's just using "add -i" instead of a text editor. But I do admit that the commit;reset;test;--amend sequence

This reminds me of a confusing semantic issue that came about with the "new" add. It can be quite natural to commit a single file in one step with:

	git commit some-file.c

or to do that in two steps with:

	git add some-file.c
	git commit

(which is particularly useful if one wants to add multiple files).

I recently found myself wanting to do a similar thing with a directory path. I can commit a path with:

	git commit path/

but I don't get anything at all like the same semantics if I do:

	git add path/
	git commit

(since "git add" will recursively add all untracked files under path/).

Now the "recursively add all files" behavior is older, and has been an essential part of git-add forever. But I found it to be not at all what I wanted in this case (where I'm now trained to say "git add" to stage things into the index).

I don't know of any good fix for the problem now. Maybe I'll just need to remember to break out that old "git update-index" for a situation like this, but that sure feels clunky. -Carl
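The asymmetry with a directory path can be reproduced in a few lines. This sketch also shows "git add -u path/", which in current git versions stages only changes to already-tracked files under the path; that option is not something the message itself proposes, and the file names are hypothetical.

```shell
set -e
# Scratch repository; the files under path/ are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

mkdir path
printf 'v1\n' > path/tracked.txt
git add path/tracked.txt
git commit -q -m "initial"

# One tracked file modified, one brand-new scratch file next to it.
printf 'v2\n' > path/tracked.txt
printf 'scratch\n' > path/untracked.txt

# -u limits staging to files git already tracks under path/.
git add -u path/
staged_tracked_only=$(git diff --cached --name-only)
echo "after 'git add -u path/':"
echo "$staged_tracked_only"

# Plain 'git add path/' recurses and picks up the untracked file as well.
git add path/
staged_all=$(git diff --cached --name-only)
echo "after 'git add path/':"
echo "$staged_all"
```

The first listing contains only path/tracked.txt; the second also contains path/untracked.txt, which is the surprise described above.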
Totally, when merging git's approach is incredibly useful. gitk --merge and the resolved conflicts not appearing in the default git diff is great stuff. For small, simpleminded and mostly-linear development it's not that important. Of course, I use git on projects large and small, so I can understand it. For someone using it with a small mostly-linear project, the whole index thing is overkill, and the explanations On a large project it's always a good idea to commit with explicit paths -- regardless of your SCM. As it happens, I have to use explicit paths with CVS, or it'll punish me by taking solid minutes to do a 2 file commit. I am sure that the mozilla and OpenOffice developers using CVS also commit with explicit paths. Life's too short to waste an hour. (The times are from working on Moodle, hosted on SF.net with ~4K files, 700 directories.). cheers, m -
Hi, As you pointed out yourself, the index _has_ an idea of the content of that file. So, arguably, it does not point to _that_ file, but rather to Just another reason to hate CVS. Because it trained people to do that. If it was not for the training by CVS, I would have strongly opposed to the introduction of the "-m" switch to commit. It _encourages_ bad commit messages. Now, with Git I usually let git-commit start up the editor. Because then I am actually encouraged to make up my mind, and put down a meaningful message, which might not only help _others_ to understand why I did it, IMHO yes, that is the message. In addition to being nice to people used to the behaviour of "git commit" _without_ other arguments. Ciao, Dscho -
Well, this really depends on the use-case, size of commit, ... I often use a version control system for very low importance stuff. I don't want to type a 3-lines long message to describe a 2-lines long change in my ~/.emacs.el for example. I also work with people using (sorry) svn to work collaboratively, but they don't even provide a log message: the version control system here is just a replacement for unison/NFS/whatever other way to have people edit files from different machines. For sure, in a context where code quality and review is important, -m "xxx" isn't the way (except if you prefer your shell's line editor to your actual editor). -- Matthieu -
Hi, I positively _hate_ empty commit messages. There is _always_ something to I also find it very useful for my own pleasure when reviewing some logs. I track config files, small scripts, documents, etc. with Git, and I found myself looking for something in _all_ of them. The commit messages helped. Commit messages, BTW, are somewhat of an artform. You cannot imagine how slow I am writing them, because they should be helpful not only for the reviewer, but also for the casual git-blame user, who wants to find out the rationale of a change. Ciao, Dscho -
Hi, I'm maybe somewhat standing out of the crowd, but I sometimes use -m for *very* long commit messages - just using separate -m parameters for paragraphs and writing on; I tend to find it much more natural than spawning an editor. Only when I find later that I've made an ugly typo in the middle of 250-characters commandline or I figure out that I should add some figure to the message, I throw in -e at the end and add the final touches. But I agree that commit messages are somewhat of an artform, and just finding a good headline can be quite difficult sometime. :-) -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ Ever try. Ever fail. No matter. // Try again. Fail again. Fail better. -- Samuel Beckett -
Well, personally I practically never use it; I find having a separation between what the current state of my tree is and what will be committed to be one of the really "oh wow, why doesn't everything else do this?" features. However, I tend to be working on more than one thing at once, and switch between them - so I commit work on A while work on B is still unfinished, then start C, finish B some point later and commit it, and then I can finish C. Git is the first VCS that supports a butterfly "git add -i" - this is a feature I have wanted since I started using version control ... -- Julian --- Your good nature will bring you unbounded happiness. -
git-gui is really handy for adding/committing a subset of the changes in your working tree. Especially for those of us with goldfish memory, since it's so easy to see exactly what's happening: what's going to be I thought "git add -i" was the best thing since sliced bread -- until I found the same feature in git-gui, but with a _much_ better interface. Just right-click on a hunk in a diff, and you have the option of staging/unstaging that hunk. Pure magic. -- Karl Hasselstr
"git add -i" has a hunk splitting feature that git-gui lacks. I'm thinking of adding features to git-gui to let you select a region of a hunk using the text selection, and then stage only that selection. I also want to let you revert hunks from the working directory copy. But after reading Junio's comments about "git add -i" being a possibly bad idea and instead letting you park everything into a shelf, reset --hard your working directory to HEAD and then pull things back off the shelf to be staged, I might want to do that differently in git-gui... like use a shelf. ;-) But I'm glad someone else finds the hunk feature useful in git-gui. I use it far too often myself. -- Shawn. -
On 2007-05-07 21:41:14 -0400, Shawn O. Pearce wrote: > Karl Hasselstr
Yea, I've played that game before too (reduce content lines) to try and simulate a hunk splitter. ;-) Doesn't always work. Right now I feel like a huge chunk of the git-gui code is simply not maintainable. The 0.7.0 release is really more about refactoring the code to make it more maintainable, than it is about actual features (though there are some new things, like vi-keys). The hunk selection stuff is just one part of the 2,000 lines still left in git-gui.sh itself, and that still uses a lot of messy globals. I want to get the code better organized before True, but that beats the tar out of copying the - lines to your clipboard and pasting them into your text editor, then deleting the - prefix. Especially if it's a couple of hunks that you want to revert. Which I find myself doing all too often. Actually I work around it today by staging what I care about, then reverting the file. Since the revert comes out of the index, I get (mostly) the same action as reverting a particular hunk. But it does mean that I lose my index state, if that happened to I haven't looked at StGIT in a while. I've seen noise on the list about nifty features being added, but I haven't kept up with what those features actually are. I think you are right about this and maybe git-gui should try to be compatible with StGIT's unapplied Indeed; I was thinking that this very morning. Making an index that you stage things into, but then also saying you cannot really do that and instead have to shelve what you don't want - that's just evil. I'll have to think about it more. The blame interface in git-gui needs help more than the index staging features. The colors suck. ;-) -- Shawn. -
Indeed. Git's index is basically very much defined as

 - sufficient to contain the total "content" of the tree (and this includes all metadata: the filename, the mode, and the file contents are all *parts* of the "content", and they are all meaningless on their own!)

 - additional "stat" information to allow the obvious and trivial (but hugely important!) filesystem comparison optimizations.

So you really should see it as *being* the content. The content is not the "file name" or the "file content" as separate parts. You really cannot separate the two. Filenames on their own make no sense (they have to have file content too), and file content on its own is similarly senseless (you have to know how to reach it).

What I'm trying to say is that git fundamentally doesn't _allow_ you to see a filename without its content. The whole notion is insane and not valid. It has no relevance for "reality".

Also, you should realize that when you do

	git add X

you are *not* adding the filename X. No, "X" is literally a "content path pattern", the same way it is when you do something like

	gitk X

and it's worth always keeping in mind that in neither case is "X" necessarily a single file, but literally a pathname pattern that is used as a "filter" on all the possible patterns.

(Of course, the filtering rules are different for "git add" and "gitk": in the "git add" example, you filter the working tree files, while in "gitk" you filter the files that git already knows about, so they are different, but in both cases you really should think of them as filters, not as "filenames", even though one _trivial_ filter is to give a filter that

No, "git commit -a" is undoubtedly _convenient_. You can use it as often as you like. So as long as you see it as a convenience feature, and realize that "git commit" is actually a lot more powerful than just being able to always do the convenient, go on and use "git commit -a" all the time. When you h...
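To make the "X is a filter, not a filename" point concrete, here is a small sketch. The directory layout is invented, and "git rev-list -- src" is used as a non-graphical stand-in for "gitk src", on the assumption that both limit history by the same kind of path argument.

```shell
set -e
# Scratch repository; src/ and doc/ are a hypothetical layout.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

mkdir -p src doc
printf 'a\n' > src/a.c
printf 'b\n' > src/b.c
printf 'd\n' > doc/readme.txt

# "src" is not one file: it filters the working tree, and everything that
# matches goes into the index.
git add src
staged=$(git ls-files)
echo "$staged"
git commit -q -m "add sources"

git add doc
git commit -q -m "add docs"

# The same path used as a history filter: only commits touching src/ count.
src_commits=$(git rev-list HEAD -- src | wc -l)
all_commits=$(git rev-list HEAD | wc -l)
echo "commits touching src: $src_commits of $all_commits"
```

In the first use the pattern selects working tree files to stage; in the second it selects commits among what git already knows about.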
Hi, The benefit is a clear distinction between DWIM and low level. The index contains _exactly_ what you told it to contain. By forcing users to use "-a" with "git commit", you make it clear that a separate update step is involved, and if you made an error (which you see from the file list), you can abort, and start over with the original index. Hth, Dscho -
Reading my message (including the last 5 words of the sentence you're

In other systems, commit commits _exactly_ the content of files on

Well, with that kind of argument, I could have my web browser not do DNS resolution for me, because it would make it clear that a separate step from the HTTP request is involved. Still, this low-level thing brings no benefit to the user, and I know no web browser forcing the user to

You don't necessarily see your error from the file list:

  % vi foo.c
  % git add foo.c
  % vi foo.c
  % git commit -m foo
  [...]
  create mode 100644 foo.c
  %

This committed the old content of foo.c, and I can hardly see any scenario where this is the expected behavior.

Then, being able to repair the error if I made it is interesting, but I don't get the reason why the error could not just be avoided.

Well, indeed, I just found a thread talking about this: http://lists-archives.org/git/196050-making-git-commit-to-mean-git-commit-a.html I'll go through it, I might understand better after that ;-).

Thanks,
-- Matthieu
-
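The foo.c session can be replayed non-interactively to see which content ends up in the commit. In this sketch the two "vi" edits are replaced by printf, and the contents "first" and "second" are made up.

```shell
set -e
# Scratch repository; the two versions of foo.c are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

printf 'first\n' > foo.c      # stands in for the first "vi foo.c"
git add foo.c                 # the index now holds "first"
printf 'second\n' > foo.c     # stands in for the second "vi foo.c"
git commit -q -m foo          # commits the index, not the working tree

committed=$(git show HEAD:foo.c)
echo "committed content: $committed"
git status --short            # foo.c is still reported as modified
```

The commit holds the content as of "git add", and the later edit stays behind as an uncommitted modification, which "git status" does report afterwards.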
Hi Matthieu,

Okay, I rephrase the (badly worded) question:

Because they do not realize that the file _names_ are actually only a key, not the value.

With Git, it is possible to stage changes, but also to have a dirty stage. Think, for example, about debugging a program. Many programs have Makefiles, which define CFLAGS without "-g". Now you want to debug. Since gdb acquired the bad habit of not working properly at all without that flag (which is especially apparent when single stepping jumps around wildly in the source code), you _have_ to change the Makefile to include "-g" with the CFLAGS.

But you don't want to commit _that_. It is no useful change for the project. Submitting such a patch makes you look foolish. So, you leave it out of the commit. And to make you _aware_ that it is a real possibility, and often a desirable one, git-commit makes you specify "-a" when you are _sure_ that you want to commit _all_ of your changes to the tracked files.

With CVS (which has been bashed on a lot on this list, and rightfully so), after a mistaken "cvs commit" _with_ irrelevant changes (like the change to the Makefile I illustrated above), you have two options:

 - leave it as it is (possibly undoing the change in a subsequent commit), or

 - edit the files, which often leads to an inconsistent repository. Yeah, sure, you can checkout the newest state, but you cannot

Well, I use it quite a lot. But 30% of the time, I prefer to commit with specific filenames, so I can be sure _what_ I commit. FWIW, I picked up on that practice when using CVS...

There are even about 20% of the time, when I use "git commit" _without_ any parameters, because I used "git add" to tell Git that I resolved some conflicts, or that I want this file to be committed, while other files

No. _You_ never need to tell the browser _not_ to resolve via DNS. But _you_ sometimes _need_ to commit with _different_ parameters than "-a". You might not realize that _now_. But a...
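The Makefile scenario described in this message can be sketched in a scratch repository as follows. Everything here (the function name demo_partial_commit, the file contents, the placeholder identity) is a hypothetical illustration rather than something taken from the thread.

```shell
#!/bin/sh
# Sketch: commit the real fix while keeping a local "-g" tweak to the
# Makefile out of the history.
demo_partial_commit() {
    repo=$(mktemp -d) || return 1
    (
        cd "$repo" || exit 1
        # Hypothetical helper: commit with a placeholder identity.
        commit() {
            git -c user.name=Example -c user.email=example@example.invalid \
                commit -q "$@"
        }
        git init -q .
        printf 'CFLAGS = -O2\n' > Makefile
        printf 'int main(void) { return 42; }\n' > main.c
        git add Makefile main.c
        commit -m 'initial import'

        printf 'CFLAGS = -O2 -g\n' > Makefile             # debugging tweak only
        printf 'int main(void) { return 0; }\n' > main.c  # the change worth keeping
        git add main.c              # stage only the real change
        commit -m 'return 0 from main'

        # List what is still modified relative to the index.
        git --no-pager diff --name-only
    )
    status=$?
    rm -rf "$repo"
    return $status
}
```

After the second commit only Makefile is still reported as modified, so the "-g" change stays local while the fix to main.c is recorded.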
You might find it useful to break your question into 2 pieces. One is what information should be in the index, which essentially is what Linus addresses. The way I look at this, at the moment, is that the index contains whatever's required to make git-write-tree work without collecting information elsewhere. I suspect this is the correct historical way to look at this, but I wasn't on this list then. The other is how to get information into the index. I think this is the original thing that seemed strange to you? It did to me. But, in part, since git has both "git-commit" and "git-commit -a", this is somewhat recognized. I've wondered if there's a way to improve this, but I don't have any coherent ideas right now. Thanks for finding and posting that thread; that was helpful. Also, the idea of an index isn't all that strange. I need to use perforce at work, and it has an index (called "db.have"). But it is stored on the server and has everyone's state mixed together, uses the type of file IDs Linus complains about, and is more difficult to manipulate (hence less useful). Being on the server is a great performance bottleneck as well. Dana -- Dana L. How danahow@gmail.com +1 650 804 5991 cell -
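One way to convince yourself that the index alone is enough for git-write-tree is the sketch below, run in a scratch repository. The function name demo_write_tree and the file name are made up for the example.

```shell
#!/bin/sh
# Sketch: build a tree object purely from the index, after the working
# tree copy of the file has been deleted.
demo_write_tree() {
    repo=$(mktemp -d) || return 1
    (
        cd "$repo" || exit 1
        git init -q .
        echo 'hello' > greeting.txt
        git add greeting.txt        # content and path are now in the index
        rm greeting.txt             # the working tree copy is gone
        tree=$(git write-tree) || exit 1   # tree is built from the index only
        git ls-tree --name-only "$tree"    # lists the paths in that tree
    )
    status=$?
    rm -rf "$repo"
    return $status
}
```

The tree still lists greeting.txt even though the file no longer exists on disk, which matches the view that the index carries everything write-tree needs.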
One reason is that you are using "-m foo" (a very non-descriptive commit message that would not help anybody including yourself in the future). Try the above without giving such a bogus log message with "-m" to commit, but instead let it spawn your editor --- you would be doing that in real life when you are doing anything nontrivial. Then notice what appears in the file list of the "Changed but not updated" section. A single-liner "-m" is handy for an "Oops, typofix in foo.c" kind of commit, but in such a case you literally would be changing only the typofix and won't have an "edit foo.c; git add foo.c; edit foo.c; git commit" sequence anyway. I think Linus explained things quite well to address the doubts in your original message, and I do not have anything to add. -
I don't get this argument - I frequently write quite long descriptions inside the -m argument(s), since I just find it more convenient than having to edit it in an editor, for various reasons. So there is really no reason why the "-m is only for short single-liner commit messages" hypothesis could hold true. -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ Ever try. Ever fail. No matter. // Try again. Fail again. Fail better. -- Samuel Beckett -
Hi, Another reason is that you can see what the end result will look like in an editor. For example, you'll have a hard time making sure on the command line that the lines are no longer than 76 characters. Ciao, Dscho -
Hi, oh, indeed - good point. cg-commit uses fmt to format the message, I think git-commit should do the same; let's see how controversial such a change would be.

---

This makes git-commit filter log messages provided on commandline by fmt, thus making nice paragraphs from them. This makes it possible to specify even long commit messages on command line without worrying about this, akin to cg-commit.

Signed-off-by: Petr Baudis <pasky@suse.cz>
---
 git-commit.sh | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/git-commit.sh b/git-commit.sh
index f28fc24..28cbb55 100755
--- a/git-commit.sh
+++ b/git-commit.sh
@@ -432,7 +432,7 @@ fi
 if test "$log_message" != ''
 then
-	echo "$log_message"
+	echo "$log_message" | fmt
 elif test "$logfile" != ""
 then
 	if test "$logfile" = -

-- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ Ever try. Ever fail. No matter. // Try again. Fail again. Fail better. -- Samuel Beckett -
Two points.

 * You would not want to wrap the first line;

 * 75-column is not ideal for every project, so this needs to be customizable;

 * If we were to munge the given message, we would probably also want to enforce the "single-liner summary, empty line, and then the rest" convention.

Well, I have three there, but I suspect the first two somebody else may have said already, so...

This is slightly related, but I have been wondering about the interaction between the "single-liner summary, empty line and then the rest" convention and various commands in the log family. Currently, --pretty=oneline and --pretty=email (hence format-patch) take and use only the first line. I think we could change it to:

 - take the first paragraph, where the definition of the first paragraph is "skip all blank lines from the beginning, and then grab everything up to the next empty line".

 - replace all line breaks with a whitespace.

This change would not affect well-behaved commit messages that adhere to the convention, as their first paragraph always consists of a single line. On the other hand, people from different cultures can get frustrated by their commit message being chomped at the first linebreak in the middle of a sentence right now, which would be helped by this change. Their Subject: and --pretty=oneline output would become very long and unsightly, but their commit messages are already ugly anyway, and such a change at least avoids the loss of information.

If we were to do this, the Subject: line would most likely use RFC2822 line folding at the places where line breaks were in the original, but that goes without saying.

What do people think? -
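The proposed "first paragraph" rule can be tried out with a small filter like the one below. This is only a sketch of the rule as worded above, not what any git command does; the function name first_paragraph_subject is invented, and treating whitespace-only lines as blank is an assumption.

```shell
#!/bin/sh
# Sketch: read a commit message on stdin, skip leading blank lines, take
# everything up to the next blank line, and join it into one line.
first_paragraph_subject() {
    awk '
        # Blank (or whitespace-only) line: skip it before the paragraph
        # starts, stop reading once the paragraph has started.
        /^[ \t]*$/ {
            if (started) exit
            next
        }
        {
            # Join paragraph lines with a single space.
            if (started) subject = subject " " $0
            else subject = $0
            started = 1
        }
        END { if (started) print subject }
    '
}
```

For a message that follows the convention the output is just the summary line; for a message whose first sentence is broken across two lines, the two lines come out joined by a space instead of being cut at the first line break.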
Hi, FWIW, I have a builtin git-fmt in my local repo, which uses the (slightly enhanced) functions in utf8.c... Maybe after 1.5.2 I dare to submit this... Ciao, Dscho -
I wouldn't do that for the first line of the message.
Someone typing
$ git commit -m "a very very very very very very very very very very very very long summary" \
-m "a longer description of the above summary"
probably doesn't want his first line to be broken (otherwise,
git-format-patch and other tools would be confused).
So, that would be more like
printf '%s\n' "$log_message" | (IFS= read -r first_line; printf '%s\n' "$first_line"; fmt)
Perhaps another option would be to provide, say, a -M option, doing
log_message="$log_message
$(echo "$1" | fmt)"
to allow people to explicitly say whether they want reformatting. But
that's probably overkill.
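
To make the "keep the first line, reformat the rest" idea easy to try outside git-commit.sh, here is a small standalone sketch. The function name wrap_keeping_summary is hypothetical, and the wrap width is whatever the local fmt uses by default.

```shell
#!/bin/sh
# Sketch: print a commit message with its first line untouched and the
# remainder re-wrapped by fmt.
# $1: the full commit message.
wrap_keeping_summary() {
    printf '%s\n' "$1" | {
        # read consumes exactly one line from the pipe; fmt gets the rest.
        IFS= read -r first_line
        printf '%s\n' "$first_line"
        fmt
    }
}
```

With a message built from two -m arguments (summary, empty line, description), the summary comes out as typed and only the description is re-wrapped.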
--
Matthieu
Hmm, I don't really know if it's more evil to split an extra-long line in two or keep it longer than the maximum sane width. Since I'm torn, I'd prefer to go for the version that's simpler (also, avoids weird results for those who for some reason chose not to follow the usual convention, but that's a minor point). I don't really care, but if no one else does either, I'd stay with the current simple version. :) -- Petr "Pasky" Baudis Stuff: http://pasky.or.cz/ Ever try. Ever fail. No matter. // Try again. Fail again. Fail better. -- Samuel Beckett -
The evil already happened several times in git's repository ;-).

$ git log --all --pretty=oneline | grep \
' ................................................................................' \
| wc -l
81
$

When I encounter such a long line, I often just don't care, since my terminal or tool (gitk ...) is often more than 80 chars wide. And in the cases I care, the fix is just to enlarge the window or to scroll (only people using a text-mode console would _really_ be disturbed).

With the other solution (breaking the line automatically), I have no easy fix. In gitk, I have the beginning of a sentence in the summary field; in a mailed patch, I have the sentence split between the Subject: header and the body.

(but we agree that both cases are evil. Perhaps just "ERROR: you're doing evil" would be better ...)

-- Matthieu -
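
If counting dots in a grep pattern gets tedious, the same kind of check can be written with an explicit length threshold. The helper below is only a sketch: the name count_long_subjects is invented, and it counts lines strictly longer than the threshold, so its result will not exactly match the grep above (which looks for a space followed by at least 80 characters).

```shell
#!/bin/sh
# Sketch: count the lines on stdin that are longer than a given width.
# $1: maximum acceptable length (defaults to 80).
count_long_subjects() {
    awk -v max="${1:-80}" '
        length($0) > max { n++ }
        END { print n + 0 }     # n + 0 prints 0 when nothing matched
    '
}
```

A possible invocation is "git log --all --pretty=format:%s | count_long_subjects 80", which feeds only the subject lines to the counter.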
