Re: [PATCH] Also use unpack_trees() in do_diff_cache()

!MAILaRCHIVE_VOTE_RePLACE
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
To: Johannes Schindelin <Johannes.Schindelin@...>
Cc: Steffen Prohaska <prohaska@...>, Junio C Hamano <gitster@...>, Git Mailing List <git@...>
Date: Sunday, January 20, 2008 - 5:53 pm

On Sun, 20 Jan 2008, Linus Torvalds wrote:

This cleanup patch really looks big and fairly ugly, but it splits up the 
magic rules for "unpack_trees()" and adds a lot of comments about what is 
going on, and in the presense lays the groundwork for doing a much better 
job on unmerged entries - it now keeps track of how many index entries we 
have that match the tree entry we found, so it should be pretty trivial 
to now add the logic to do a combined diff..

It replaces the previous patch, and should fix both of the issues I 
mention above. It adds a lot more lines than it removes, but much of that 
is due to some comments and a lot cleaner abstractions, so that the 
resulting code is, I think, quite a bit more obvious.

The unpack_trees() interface really isn't trivial (it can't be: the things 
we need to do with the index for a merge - at the same time as traversing 
it! - are quite involved). So clarifying all that is definitely worth it.

With this patch in place, I think Dscho's two other patches are now good 
to go, and we now generate the same output we used to. In addition, we now 
have the infrastructure to generate *better* output, so if we want to make 
"git diff HEAD" generate a nice combined diff, this sets the stage for 
that.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 diff-lib.c |  112 ++++++++++++++++++++++++++++++++++++++++++++++++------------
 1 files changed, 90 insertions(+), 22 deletions(-)

diff --git a/diff-lib.c b/diff-lib.c
index 23d0fa6..2a3a9ff 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -668,45 +668,113 @@ static void mark_merge_entries(void)
 	}
 }
 
-static int oneway_diff(struct cache_entry **src,
-	struct unpack_trees_options *o,
-	int remove)
+/*
+ * This gets a mix of an existing index and a tree, one pathname entry
+ * at a time. The index entry may be a single stage-0 one, but it could
+ * also be multiple unmerged entries (in which case you idx_pos/idx_nr
+ * will give you the position and number of entries in the index).
+ */
+static void do_oneway_diff(struct unpack_trees_options *o,
+	struct cache_entry *idx,
+	struct cache_entry *tree,
+	int idx_pos, int idx_nr)
 {
-	struct cache_entry *idx = src[0];
-	struct cache_entry *tree = src[1];
 	struct rev_info *revs = o->unpack_data;
 
-	/*
-	 * Unpack-trees generates a DF/conflict entry if
-	 * there was a directory in the index and a tree
-	 * in the tree. From a diff standpoint, that's a
-	 * delete of the tree and a create of the file.
-	 */
-	if (tree == o->df_conflict_entry)
-		tree = NULL;
+	if (o->index_only && idx && ce_stage(idx)) {
+		if (tree)
+			diff_unmerge(&revs->diffopt, idx->name, idx->ce_mode, idx->sha1);
+		return;
+	}
 
 	/*
 	 * Something added to the tree?
 	 */
 	if (!tree) {
-		if (ce_path_match(idx, revs->prune_data))
-			show_new_file(revs, idx, o->index_only, 0);
-		return 1;
+		show_new_file(revs, idx, o->index_only, 0);
+		return;
 	}
 
 	/*
 	 * Something removed from the tree?
 	 */
 	if (!idx) {
-		if (ce_path_match(tree, revs->prune_data))
-			diff_index_show_file(revs, "-", tree, tree->sha1, tree->ce_mode);
-		return 0;
+		diff_index_show_file(revs, "-", tree, tree->sha1, tree->ce_mode);
+		return;
 	}
 
 	/* Show difference between old and new */
-	if (ce_path_match(idx, revs->prune_data))
-		show_modified(revs, tree, idx, 1, o->index_only, 0);
-	return 1;
+	show_modified(revs, tree, idx, 1, o->index_only, 0);
+}
+
+/*
+ * Count how many index entries go with the first one
+ */
+static inline int count_skip(const struct cache_entry *src, int pos)
+{
+	int skip = 1;
+
+	/* We can only have multiple entries if the first one is not stage-0 */
+	if (ce_stage(src)) {
+		struct cache_entry **p = active_cache + pos;
+		int namelen = ce_namelen(src);
+
+		for (;;) {
+			const struct cache_entry *ce;
+			pos++;
+			if (pos >= active_nr)
+				break;
+			ce = *++p;
+			if (ce_namelen(ce) != namelen)
+				break;
+			if (memcmp(ce->name, src->name, namelen))
+				break;
+			skip++;
+		}
+	}
+	return skip;
+}
+
+/*
+ * The unpack_trees() interface is designed for merging, so
+ * the different source entries are designed primarily for
+ * the source trees, with the old index being really mainly
+ * used for being replaced by the result.
+ *
+ * For diffing, the index is more important, and we only have a
+ * single tree.  
+ *
+ * We're supposed to return how many index entries we want to skip.
+ *
+ * This wrapper makes it all more readable, and takes care of all
+ * the fairly complex unpack_trees() semantic requirements, including
+ * the skipping, the path matching, the type conflict cases etc.
+ */
+static int oneway_diff(struct cache_entry **src,
+	struct unpack_trees_options *o,
+	int index_pos)
+{
+	int skip = 0;
+	struct cache_entry *idx = src[0];
+	struct cache_entry *tree = src[1];
+	struct rev_info *revs = o->unpack_data;
+
+	if (index_pos >= 0)
+		skip = count_skip(idx, index_pos);
+
+	/*
+	 * Unpack-trees generates a DF/conflict entry if
+	 * there was a directory in the index and a tree
+	 * in the tree. From a diff standpoint, that's a
+	 * delete of the tree and a create of the file.
+	 */
+	if (tree == o->df_conflict_entry)
+		tree = NULL;
+
+	if (ce_path_match(idx ? idx : tree, revs->prune_data))
+		do_oneway_diff(o, idx, tree, index_pos, skip);
+
+	return skip;
 }
 
 int run_diff_index(struct rev_info *revs, int cached)
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
Updated in-memory index cleanup, Linus Torvalds, (Fri Jan 18, 11:25 pm)
[PATCH] Avoid running lstat(2) on the same cache entry., Junio C Hamano, (Sat Jan 19, 3:45 am)
Re: [PATCH] Avoid running lstat(2) on the same cache entry., Johannes Schindelin, (Sat Jan 19, 10:42 pm)
Re: [PATCH] Avoid running lstat(2) on the same cache entry., Johannes Schindelin, (Sat Jan 19, 9:48 pm)
Re: [PATCH] Avoid running lstat(2) on the same cache entry., Steffen Prohaska, (Sun Jan 20, 11:10 am)
Re: [PATCH] Avoid running lstat(2) on the same cache entry., Steffen Prohaska, (Sun Jan 20, 5:40 pm)
Re: [PATCH] Avoid running lstat(2) on the same cache entry., Johannes Schindelin, (Sun Jan 20, 11:18 am)
[PATCH] Try to resurrect the handling for 'diff-index -m', Johannes Schindelin, (Sun Jan 20, 11:21 am)
[PATCH] Also use unpack_trees() in do_diff_cache(), Johannes Schindelin, (Sun Jan 20, 11:19 am)
Re: [PATCH] Also use unpack_trees() in do_diff_cache(), Linus Torvalds, (Sun Jan 20, 4:32 pm)
Re: [PATCH] Also use unpack_trees() in do_diff_cache(), Linus Torvalds, (Sun Jan 20, 5:53 pm)
Re: [PATCH] Also use unpack_trees() in do_diff_cache(), Johannes Schindelin, (Sun Jan 20, 7:34 pm)
Re: [PATCH] Also use unpack_trees() in do_diff_cache(), Linus Torvalds, (Sun Jan 20, 7:58 pm)
Re: [PATCH] Also use unpack_trees() in do_diff_cache(), Johannes Schindelin, (Sun Jan 20, 8:19 pm)
Re: [PATCH] Avoid running lstat(2) on the same cache entry., Steffen Prohaska, (Sun Jan 20, 6:33 am)
Re: [PATCH] Avoid running lstat(2) on the same cache entry., Johannes Schindelin, (Sun Jan 20, 10:15 am)
Re: [PATCH] Avoid running lstat(2) on the same cache entry., Linus Torvalds, (Sat Jan 19, 10:02 pm)
[PATCH] index: be careful when handling long names, Junio C Hamano, (Sat Jan 19, 3:42 am)