Re: git: avoiding merges, rebasing

Previous thread: Re: git: avoiding merges, rebasing by Benoit SIGOURE on Monday, October 8, 2007 - 6:16 am. (1 message)

Next thread: [TIG PATCH] Add missing = for comparison in obsolete actions check. by James Bowes on Monday, October 8, 2007 - 10:30 am. (2 messages)
From: Benoit SIGOURE
Date: Monday, October 8, 2007 - 6:17 am


[as usual, I forgot the attachment...]


I finally found some time to hack something together, and here is my
git-merge-changelog Perl script.  I tested it quickly and it seems to
work fine, with one known problem: it will mess up the ChangeLog entries
when a commit modifies an existing ChangeLog entry.  It needs more
testing, but I'm throwing it out in the wild so that interested people
can review it.  I will eventually come up with a fix for commits that
modify existing entries, and with a testsuite.

I'm CC-ing the Git ML just in case it interests some more people over  
there.

In order to use it, add the following to your ~/.gitconfig:

[merge "cl-merge"]
        name = GNU-style ChangeLog merge driver
        driver = /path/to/git-merge-changelog %O %A %B

(You can also enable it for a specific working copy by adding it to
.git/config instead.)

Then, in the directory where the ChangeLog is, add a .gitattributes  
file with the following content:
ChangeLog       merge=cl-merge

For more information, see man gitattributes.
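
To make the %O %A %B contract concrete, here is a minimal sketch of a merge driver skeleton in Python (not the attached Perl script). It relies on what gitattributes(5) documents: the driver must leave the merge result in the file named by %A and exit with status zero if the merge was clean, non-zero otherwise. The function names and the placeholder merge logic are hypothetical and only there for illustration.

```python
#!/usr/bin/env python3
"""Skeleton of a Git merge driver invoked as: driver %O %A %B."""
import sys


def merge_texts(ancestor, ours, theirs):
    """Placeholder merge, returning (merged_text, clean).

    Hypothetical logic for illustration only: if just one side changed,
    take that side; otherwise keep our version and report a conflict.
    """
    if ours == ancestor:
        return theirs, True
    if theirs == ancestor or ours == theirs:
        return ours, True
    return ours, False


def read_text(path):
    with open(path, encoding="utf-8") as handle:
        return handle.read()


def main(argv):
    if len(argv) != 4:
        print("usage: driver ANCESTOR CURRENT OTHER", file=sys.stderr)
        return 2
    ancestor_path, current_path, other_path = argv[1:]
    merged, clean = merge_texts(
        read_text(ancestor_path),
        read_text(current_path),
        read_text(other_path),
    )
    # Git expects the result to overwrite the file passed as %A.
    with open(current_path, "w", encoding="utf-8") as handle:
        handle.write(merged)
    # Exit status 0 means a clean merge; non-zero signals conflicts.
    return 0 if clean else 1
```

A real script would end with `sys.exit(main(sys.argv))` and replace `merge_texts` with ChangeLog-aware logic.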

Let me know if something goes wrong.

Cheers,

-- 
Benoit Sigoure aka Tsuna
EPITA Research and Development Laboratory


[Attachment: git-merge-changelog]

#!/usr/bin/env perl
# Define a merge driver to use with Git to merge GNU-style ChangeLog entries.
# Copyright (C) 2007  Benoit Sigoure.

my $VERSION = '2007-10-08 12:33'; # UTC

# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it ...
From: Johannes Schindelin
Date: Tuesday, October 9, 2007 - 3:43 am

Hi,


Two comments:

- By not inlining the script you made it hard to review.  Therefore I
  will not review it.

- Try to avoid naming the script git-merge-*; programs with that prefix are
  merge _strategies_, not merge _drivers_ (and yes, we already have two
  programs violating this rule -- merge-base and merge-file -- but that
  does not mean that you are free to add to the pile).

Ciao,
Dscho


From: Benoit SIGOURE
Date: Tuesday, October 9, 2007 - 11:06 am

I'm open to better suggestions.

Cheers,

-- 
Benoit Sigoure aka Tsuna
EPITA Research and Development Laboratory


From: Bruno Haible
Date: Tuesday, October 9, 2007 - 5:03 am

Hello Benoit,

Thanks for working on this. But this merge driver has a few major nits.


1) Although my ChangeLog file was locally unmodified and some pulled-in
   commits were supposed to modify it, "git pull" stopped and said:

ChangeLog: needs update
fatal: Entry 'ChangeLog' not uptodate. Cannot merge.

[I cannot swear to this, because I did not do a "git status" before the
"git pull", but this is in a directory where I usually have no pending diffs.]

The ChangeLog in question is the one from gnulib
(git clone git://git.sv.gnu.org/gnulib).


2) This "merge driver" did much more than sorting in a merge: it sorted the
entire file! In doing so,
  - It changed the order of ChangeLog entries in a way that does not represent
    the historical commit order.
  - For ChangeLog entries with multiple contributors, it shuffled around these
    extra contributors to other ChangeLog entries.
  - Near the end of the file, it made a change that I cannot explain.

Find attached a context diff of all the bad changes that it made.


In my opinion, a merge driver should not make changes to the file except
in the range of lines where the conflict occurred. For a ChangeLog driver,
all uncommitted entries should be collected at the top of the file, because
1. ChangeLogs are kept in the order of historical commit in the central
   repository,
2. other developers always look at the top of the ChangeLog; if a ChangeLog
   entry is inserted second or third after some already present entries,
   the danger is too high that the change goes unnoticed.

So "git-merge-changelog OLD CURRENT OTHERS" should IMO do the following:
1) Collect the changes between OLD and HEAD (I don't know if that is CURRENT
   or OTHERS?), in two categories:
     - added entries,
     - changed and removed entries.
2) Back out the added entries, keeping only the changed and removed entries
   as modifications.
3) Do a normal merge between this and the pulled in remote branch (I don't
   know if that is OTHERS or CURRENT?). If ...
From: Benoit SIGOURE
Date: Tuesday, October 9, 2007 - 11:19 am

I'll check but I'm afraid that Git bails out before actually trying  

Yes, it's pretty stupid, I hacked this in a hurry.  I'll try to  

OK I'll try to rework the driver so that it implements this.  It will  
take some time though, I'm quite busy these days.
Akim Demaille would also like it to squash the commits added by the  
merge (the new commits in OTHERS):

YYYY-MM-DD  Author  <who@where.com>

	Merge whatever:

	YYYY-MM-DD  Someone Else  <foo@bar.com>
	Some change.
	* FileChanged.c: Whatever.

	YYYY-MM-DD  Who Cares  <who@cares.com>
	Some other change.
	* OtherFile.c: Do it.

I thought this was mandated by the GNU Coding Standards, but I
checked and they don't say anything about merges.  Would this sort of
strategy be useful to you?  Should it be the default, or enabled by some
--squash option?

Cheers,

-- 
Benoit Sigoure aka Tsuna
EPITA Research and Development Laboratory


From: Bruno Haible
Date: Tuesday, October 9, 2007 - 12:38 pm

This merge is occurring in a different situation:

The situation where we need ChangeLog merging most often is when a developer
has made changes on his own and pulls in the changes from the remote repository
(via "git stash; git pull; git stash apply").

The situation that Akim is describing is that he pulls changes from the
repository of Someone Else and Who Cares, and then pushes them into the
central repository, under his responsibility.

For the first situation, the non-remote ChangeLog entries should be moved
to the top, without modification or indentation.
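
A rough sketch of this first situation in Python, assuming that every entry starts with a `YYYY-MM-DD` date at column 0 and that entries can be compared as plain text. The function names are hypothetical. The sketch only handles entries that were added locally; local changes to, or removals of, existing entries are ignored.

```python
import re

# Assumption: an entry header is a line starting with an ISO date at column 0.
HEADER = re.compile(r"^\d{4}-\d{2}-\d{2}\s")


def split_entries(text):
    """Split a ChangeLog into a list of entries, in file order."""
    entries, current = [], []
    for line in text.splitlines():
        if HEADER.match(line) and current:
            entries.append("\n".join(current).rstrip())
            current = []
        current.append(line)
    if current:
        entries.append("\n".join(current).rstrip())
    return [entry for entry in entries if entry]


def local_entries_on_top(ancestor, ours, theirs):
    """Put entries added locally (in ours, not in ancestor) above theirs."""
    known = set(split_entries(ancestor))
    their_entries = split_entries(theirs)
    already_there = set(their_entries)
    added = [
        entry
        for entry in split_entries(ours)
        if entry not in known and entry not in already_there
    ]
    # Local entries are copied verbatim: no rewording, no indentation.
    return "\n\n".join(added + their_entries) + "\n"
```

The remote file keeps its own order untouched, so the historical commit order of the central repository is preserved below the local entries.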

For the second situation, three different styles are in use at GNU
(because they don't use "Signed-off-by" lines):

1) unmodified copying of the ChangeLog entries:

YYYY-MM-DD  Someone Else  <foo@bar.com>
	Some change.
	* FileChanged.c: Whatever.

YYYY-MM-DD  Who Cares  <who@cares.com>
	Some other change.
	* OtherFile.c: Do it.

2) copying with lieutenant's email address, like Akim described it:

YYYY-MM-DD  Lieu Tenant  <who@where.com>

	YYYY-MM-DD  Someone Else  <foo@bar.com>
	Some change.
	* FileChanged.c: Whatever.

	YYYY-MM-DD  Who Cares  <who@cares.com>
	Some other change.
	* OtherFile.c: Do it.

3) similar, but with indentation of the entire copied-in ChangeLog entries:

YYYY-MM-DD  Lieu Tenant  <who@where.com>

	YYYY-MM-DD  Someone Else  <foo@bar.com>
		Some change.
		* FileChanged.c: Whatever.

	YYYY-MM-DD  Who Cares  <who@cares.com>
		Some other change.
		* OtherFile.c: Do it.
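
The three styles above could be produced by something like the following Python sketch. It assumes each pulled-in entry is a string whose first line is the header and whose remaining lines are the tab-indented body; `format_pulled_entries` and its parameters are hypothetical names, not part of any existing driver.

```python
def format_pulled_entries(entries, style, lieutenant_header=None):
    """Render pulled-in ChangeLog entries in style 1, 2 or 3.

    entries: list of strings, header on the first line, body below it.
    lieutenant_header: e.g. "YYYY-MM-DD  Lieu Tenant  <who@where.com>",
    required for styles 2 and 3.
    """
    if style not in (1, 2, 3):
        raise ValueError("style must be 1, 2 or 3")
    if style == 1:
        # Style 1: copy the entries unmodified.
        return "\n\n".join(entries) + "\n"
    if lieutenant_header is None:
        raise ValueError("styles 2 and 3 need a lieutenant header")
    blocks = []
    for entry in entries:
        header, *body = entry.splitlines()
        if style == 3:
            # Style 3: indent the whole copied-in entry by one more tab.
            body = ["\t" + line if line else line for line in body]
        # Styles 2 and 3 both indent the original header line.
        blocks.append("\n".join(["\t" + header] + body))
    return lieutenant_header + "\n\n" + "\n\n".join(blocks) + "\n"
```

The `style` argument is the kind of value that could come from a driver flag or a git config setting.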

First of all, your merge driver could try to guess whether we're in the
first or second situation (maybe by testing whether the names in the
ChangeLog entry match the [user]name from the git config).

Then, for the second situation, there can be some flag in the driver or in
the git config that describes which of the 3 styles to apply.
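
One possible way to make that guess, sketched in Python under the assumption that entry headers look like `YYYY-MM-DD  Name  <email>`. The function names and the two labels returned are hypothetical; the only Git command used is `git config --get user.name`.

```python
import re
import subprocess

# Assumption: header lines have the form "YYYY-MM-DD  Name  <email>".
HEADER = re.compile(r"^\d{4}-\d{2}-\d{2}\s+(?P<name>.+?)\s+<[^>]*>\s*$")


def entry_author(entry):
    """Return the author name from an entry's header line, or None."""
    first_line = entry.splitlines()[0] if entry else ""
    match = HEADER.match(first_line)
    return match.group("name") if match else None


def git_user_name():
    """Read [user]name from the git config (requires git on PATH)."""
    result = subprocess.run(
        ["git", "config", "--get", "user.name"],
        capture_output=True,
        text=True,
    )
    return result.stdout.strip() or None


def guess_situation(new_entries, user_name):
    """Guess whether the new entries are the user's own or pulled in."""
    authors = {entry_author(entry) for entry in new_entries}
    if authors and authors <= {user_name}:
        return "own-changes"
    return "pulled-from-others"
```

In a driver this would be called as `guess_situation(added_entries, git_user_name())`; any entry with a different or unparseable author makes the guess fall back to the second situation.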

Bruno
