Re: [PATCH] Add `git diff2`, a GNU diff workalike

Previous thread: Re: Dangers of working on a tracking branch by Jakub Narebski on Thursday, February 15, 2007 - 7:00 pm. (8 messages)

Next thread: [PATCH] git-blame: prevent argument parsing segfault by Tommi Kyntola on Friday, February 16, 2007 - 12:38 am. (2 messages)
From: Johannes Schindelin
Date: Thursday, February 15, 2007 - 9:01 pm

Git does have a wonderful diff engine. For example, colored diffs
really shine, and there are other useful options like --check,
--patch-with-stat, etc. I always dreamt of using this diff engine
also outside of a git repository.

With this commit, you can say

	git diff2 file1 file2

to compare the (possibly untracked) files "file1" and "file2", and

	git diff2 dir1 dir2

to compare the directories "dir1" and "dir2".

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 .gitignore      |    1 +
 Makefile        |    1 +
 builtin-diff2.c |  163 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 builtin.h       |    1 +
 diff.c          |    3 +-
 git.c           |    1 +
 6 files changed, 169 insertions(+), 1 deletions(-)
 create mode 100644 builtin-diff2.c

diff --git a/.gitignore b/.gitignore
index 9b5502b..9809a7d 100644
--- a/.gitignore
+++ b/.gitignore
@@ -32,6 +32,7 @@ git-cvsserver
 git-daemon
 git-describe
 git-diff
+git-diff2
 git-diff-files
 git-diff-index
 git-diff-tree
diff --git a/Makefile b/Makefile
index ea2b4f5..94b1c75 100644
--- a/Makefile
+++ b/Makefile
@@ -284,6 +284,7 @@ BUILTIN_OBJS = \
 	builtin-count-objects.o \
 	builtin-describe.o \
 	builtin-diff.o \
+	builtin-diff2.o \
 	builtin-diff-files.o \
 	builtin-diff-index.o \
 	builtin-diff-tree.o \
diff --git a/builtin-diff2.c b/builtin-diff2.c
new file mode 100644
index 0000000..1de82c1
--- /dev/null
+++ b/builtin-diff2.c
@@ -0,0 +1,163 @@
+#include "cache.h"
+#include "diff.h"
+#include "diffcore.h"
+#include "path-list.h"
+
+static const char diff2_usage[] = "git diff2 [diff-opts] file1 file2";
+
+static int read_directory(const char *path, struct path_list *list)
+{
+	DIR *dir;
+	struct dirent *e;
+
+	if (!(dir = opendir(path)))
+		return error("Could not open directory %s", path);
+
+	while ((e = readdir(dir)))
+		if (strcmp(".", e->d_name) && strcmp("..", e->d_name))
+			path_list_insert(xstrdup(e->d_name), ...
From: Junio C Hamano
Date: Friday, February 16, 2007 - 1:06 am

Why diff2?  I would have expected this to be a new low-level
sibling of diff-files and diff-index, which in turn hook into

I am still debating myself if these should be lstat(2) instead

If a/frotz is a file and b/frotz/nitfol is there, I do not think
we show an error; we say "a/frotz" was removed (see notes below,

I suspect your favorite path-list might not be optimal for this
kind of codeflow; wouldn't reading everything in an expanding
array and sorting them with a single qsort() after reading one

I think you would want to be consistent with how git sorts paths
here.  In particular, you can always pretend that a path that is
a directory has '/' at the end.  Which (as a side effect) means
that you do not have to worry about a/frotz and b/frotz/nitfol,
because element from p1 will be "frotz" and the one from p2 will
be "frotz/" in this case.  You will never feed queue_diff() with

The rest looks quite straightforward use of diffcore API, done

I didn't look, but you might also need to teach diffcore-rename
that two objects both with null object names are not equal.

-

From: Johannes Schindelin
Date: Friday, February 16, 2007 - 7:01 am

Hi,


I planned this. However, I could not think of any sane way to handle the 
case in an intuitive manner where you want to compare two _tracked_ files. 
How would you do that in a consistent manner?

BTW I also realized that `git diff bla` will _not_ complain when the file 
bla exists, but is untracked. Neither will it show a diff. I did not come 

Well, it should probably be lstat(), with an addremove when one or both 


You are right. But for the moment, I'll leave it, since it is one of the 
least performance critical parts of diff2, _and_ the use of path-list is 

At that point, the entries in p1 and p2 are from readdir(), so I do not 
know the type (e->d_type is non-portable according to my man-page here).



But would not every diff be marked as a rename then?

Ciao,
Dscho

-

From: Johannes Schindelin
Date: Friday, February 16, 2007 - 7:20 am

Hi,


Thinking about this again, I do not know of any patch implementation which 
removes a directory which just became empty, so dir->file is a real 
problem. Also, if dir is empty, another problem looms.

So at least the dir->non-dir case should be an error.

BTW I just realized that diff2 as-is will output a diff header for _every_ 
file pair, even if they do compare equally. Actually, I like it, so I 
don't want to imitate GNU diff behaviour here.

Ciao,
Dscho

P.S.: Here is a quick-fix patch on top of my original (tested with 
dir->file (error!), file->dir, link->link, link->null, file->link):

 builtin-diff2.c |   16 +++++++++++-----
 1 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/builtin-diff2.c b/builtin-diff2.c
index 1de82c1..234dabb 100644
--- a/builtin-diff2.c
+++ b/builtin-diff2.c
@@ -15,7 +15,7 @@ static int read_directory(const char *path, struct path_list *list)
 
 	while ((e = readdir(dir)))
 		if (strcmp(".", e->d_name) && strcmp("..", e->d_name))
-			path_list_insert(xstrdup(e->d_name), list);
+			path_list_insert(e->d_name, list);
 
 	closedir(dir);
 	return 0;
@@ -28,18 +28,24 @@ static int queue_diff(struct diff_options *o,
 	int mode1 = 0, mode2 = 0;
 
 	if (name1) {
-		if (stat(name1, &st))
+		if (lstat(name1, &st))
 			return error("Could not access '%s'", name1);
 		mode1 = st.st_mode;
 	}
 	if (name2) {
-		if (stat(name2, &st))
+		if (lstat(name2, &st))
 			return error("Could not access '%s'", name1);
 		mode2 = st.st_mode;
 	}
 
-	if (mode1 && mode2 && S_ISDIR(mode1) != S_ISDIR(mode2))
-		return error("file/directory conflict: %s, %s", name1, name2);
+	if (mode1 && mode2) {
+		if (S_ISDIR(mode1) && !S_ISDIR(mode2))
+			return error("Cannot handle dir->file: %s -> %s",
+				name1, name2);
+		if (!S_ISDIR(mode1) && S_ISDIR(mode2))
+			return queue_diff(o, name1, NULL) ||
+				queue_diff(o, NULL, name2);
+	}
 
 	if (S_ISDIR(mode1) || S_ISDIR(mode2)) {
 		char buffer1[PATH_MAX], buffer2[PATH_MAX];
-

Previous thread: Re: Dangers of working on a tracking branch by Jakub Narebski on Thursday, February 15, 2007 - 7:00 pm. (8 messages)

Next thread: [PATCH] git-blame: prevent argument parsing segfault by Tommi Kyntola on Friday, February 16, 2007 - 12:38 am. (2 messages)