Re: detecting rename->commit->modify->commit

From: Jeff King
To: Teemu Likonen <tlikonen@...>
Cc: Junio C Hamano <gitster@...>, Ittay Dror <ittayd@...>, <git@...>
Date: Thursday, May 1, 2008 - 7:09 pm

[cc'd Junio for comments on this rename optimization]

On Thu, May 01, 2008 at 11:39:40PM +0300, Teemu Likonen wrote:


Ah, OK. The problem comes because the toy example is so tiny. It hits
this code chunk:

  if (base_size * (MAX_SCORE-minimum_score) < delta_size * MAX_SCORE)
          return 0;

where base_size is the size of the smaller file in bytes, and delta_size
is the difference between the sizes of the two files. This is an
optimization so that we don't even have to look at the contents.

But it bases the percentage on the smaller file. So even though file B
("hello\nworld\n") is 50% made up of file A ("hello\n"), the check sees
that the content added to make B (6 bytes) is as large as A itself (6
bytes) and gives up. IOW, the "percentage similarity" is measured
against the smaller file for this optimization, not the larger one.
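
To make the arithmetic concrete, here is a minimal standalone sketch of
that size-only check. The wrapper name passes_size_check is made up for
illustration, and the values of MAX_SCORE and DEFAULT_RENAME_SCORE (a
50% minimum) are assumed from diffcore.h rather than taken from this
thread:

```c
#include <stdio.h>

/* Assumed values, believed to mirror git's diffcore.h; treat as illustrative. */
#define MAX_SCORE 60000.0
#define DEFAULT_RENAME_SCORE 30000 /* 50% minimum similarity */

/*
 * Hypothetical wrapper around the size-only early exit.
 * Returns 1 if the pair survives the check (contents would be compared next),
 * 0 if it is rejected on size alone.
 */
static int passes_size_check(unsigned long size_a, unsigned long size_b,
                             int minimum_score)
{
    unsigned long max_size = size_a > size_b ? size_a : size_b;
    unsigned long base_size = size_a > size_b ? size_b : size_a;
    unsigned long delta_size = max_size - base_size;

    /* Same comparison as the chunk quoted in the message. */
    if (base_size * (MAX_SCORE - minimum_score) < delta_size * MAX_SCORE)
        return 0;
    return 1;
}
```

Under those assumptions, passes_size_check(6, 12, DEFAULT_RENAME_SCORE)
returns 0 for the "hello\n" / "hello\nworld\n" pair, while a 6-byte
file against a 9-byte one still passes.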

Obviously this is a toy case, but I wonder if there are other larger
cases where you end up with a file which has substantial copied content,
but also _grows_ a lot (not just changes). For example, consider the
file:

  1
  2
  3
  4
  5
  6
  7
  8
  9

that is, nine lines each with a number. Now rename it, and start adding
more numbers. We detect the addition of 10, 11, 12. But adding 13 means
we no longer match. So even with only 4 lines added, we fail to match.

But again, this is a bit of a toy case. It relies on the line length
being a significant factor compared to the number of lines.
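
The cutoff in the numbers example can be checked the same way. This
sketch repeats the size check (with MAX_SCORE and a 50% default assumed
from diffcore.h, not from this thread) and appends "10\n", "11\n", ...
to a file holding 1 through 9 until the check rejects the pair.
passes_size_check, line_bytes and first_rejected_number are hypothetical
helpers, not git functions:

```c
#include <stdio.h>

/* Assumed values, believed to mirror git's diffcore.h; treat as illustrative. */
#define MAX_SCORE 60000.0
#define DEFAULT_RENAME_SCORE 30000 /* 50% minimum similarity */

/* Hypothetical wrapper: 1 if the pair survives the size-only check, else 0. */
static int passes_size_check(unsigned long size_a, unsigned long size_b,
                             int minimum_score)
{
    unsigned long max_size = size_a > size_b ? size_a : size_b;
    unsigned long base_size = size_a > size_b ? size_b : size_a;
    unsigned long delta_size = max_size - base_size;

    if (base_size * (MAX_SCORE - minimum_score) < delta_size * MAX_SCORE)
        return 0;
    return 1;
}

/* Number of bytes in the line "<n>\n". */
static unsigned long line_bytes(int n)
{
    return (unsigned long)snprintf(NULL, 0, "%d\n", n);
}

/*
 * The original file holds the lines 1..last_original. Keep appending the
 * next number and return the first one whose addition makes the size
 * check reject the (original, grown) pair.
 */
static int first_rejected_number(int last_original, int minimum_score)
{
    unsigned long orig_size = 0, grown_size;
    int n;

    for (n = 1; n <= last_original; n++)
        orig_size += line_bytes(n);

    grown_size = orig_size;
    for (n = last_original + 1; ; n++) {
        grown_size += line_bytes(n);
        if (!passes_size_check(orig_size, grown_size, minimum_score))
            return n;
    }
}
```

With the assumed 50% minimum, first_rejected_number(9,
DEFAULT_RENAME_SCORE) returns 13: the original is 18 bytes, the check
tolerates up to 9 added bytes, and each new two-digit line adds 3.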

-Peff