These are a few gitweb issues and features I'm currently working on
(or plan working on).1. New patchset view (commitdiff, blobdiff)
In "old" gitweb commitdiff view was generated by iterating over lines of
git-diff-tree raw format output, and generating diffs using
git-cat-file and external diff utility (/usr/bin/diff). This required
having temporary directory for diff generation, and of course diffs
didn't have extended git headers.The "new" commitdiff view is generated from single git-diff-tree
--raw-with-path output. But I have made incorrect assumption that one
line from "raw" diff-tree output always corresponds to only one patch
in the patchset part of output. This is not the case. I'm not sure if
those are the only cases when patch is broken, but changing file into
symlink or symlink into file ('T' status), and explicit breaking ('B'
status) generates two patches to one line of raw difftree output. The
second is not of much importance for gitweb, unless yoy add -B to
@diff_opts, but the first is important; it is currently broken, see
http://tinyurl.com/y3cfop
(commit 4c52c0d31f0f7142d81a465c40789befc2e86548 on
gitweb-test-funny-char branch in git.git repository).I have thought of the following (mutually exclusive) ways to fix this
a. Change core git git-diff-tree command to not break (some?) of T
changes into two patches. From what gitster said on #git this
feature is for git-diff patches to be patch(1) compatibile; but
-M already causes patches to be incompatibile with patch. I'm
thinking here about adding some kind of -s/--single-patch
--do-not-break-patch-into-two-please command line option for
git-diffb. Check the raw difftree line for status and perhaps other info
to know if the line generates more than one patch. It needs detailed
knowledge about _when_ git-diff generates more than one patch to one
"raw" format line, and would break if core diff changes in that
detail. Simplest to implement, I t...
This is the most appropriate. Right now it is not independently
controllable but it is not so inplausible for somebody to want
to get non recursive view of 'raw' part along with textual diffs
when running "--raw -p" diff and your solution c. is robust
against even such changes.I would probably not call that "caching" but keeping track of
where you are, and you would need to know in which filepair you
are in anyway when you want to implement links to blob view fromI am not sure what's more useful to show there. The part of the
output shows "the list of files that have changed by this
commit" for non-merge commits, so "the list of files that have
changed in this merge" is what you would want to show, but there
are three ways you can define "what changed" for a merge (see
the message that explains --cc to linux@horizontal I sent
yesterday). I find "diff --name-status $it^1 $it" the most
natural semantics for the most of the time, and that is what we
do for --stat output.If you want to treat all the parents equally, you could read
from "diff-tree -c --raw $it" which would show list of files
that needed file-level merges, along with the blob object names
from all parents.We might want to give that when "diff-tree --cc --patch-with-raw
I think that is a minor implementation detail; I think the
latter is probably less painful to maintain in the longer run
because if you encode things in earlier stages, postprocessing
stages need to know which part of the result from the earlier
processing needs decoding before further processing, but I
suspect you will be the one primarily hacking on that part, so
it is not my preference, but just a suggestion to make your life
less painful.-
What about the fact that git-diff -M is _not_ patch-compatibile;
does creating two patches for one difftree raw format line for
mode change/'T' status (I guess only for "type" mode changes, i.e.
file to/from symlink, file to/from directory) helps understanding
the change? If not, perhaps it would be better to introduce optionI'd say "buffering" rather than "caching". The problem is that
you have to read up to the "index <hash>..<hash>[ <mode>]" to
check if you have to go to the next raw difftree line or not.
And the info from difftree line is needed to hyperlink parts
of extended git diff header (and also to hyperlink parts ofWe should have whatchanged part corresponding to the patchset
part at least in "commitdiff" view, which means '-c' (and for
the time being perhaps mean '-c' also in patchset part). '--cc'
which uses '-c' for the raw part would be nice..."Commit" view could use "diff --name-status $hash^1 $hash",
(i.e. I think the same what it does now), even if it is not
compatibile with "commitdiff" view.--
Jakub Narebski
Poland
-
What about it? I've never said patch compatibility is an issue.
We have something patch cannot represent or understand and you
should admit it. The point is to make it easier to massage by
hand, when the recipient does not have git handy.With -M, the recipient can read and understand the patch text
better than "remove this oldfile and create this newfile that
the diff output does not tell you is related" diff. And we say
"rename" in plain language so the recipient _can_ do "mv A B"
then "patch -p1". Similarly, with -T that changes a symlink
into a real file, if we do not do the current "remove the old
and then create the new" and did instead "show the textual diff
that can be applied", a non-git tool that does not understand
the typechange can mistakenly muck with the target of the
symlink, which is a disaster. "Remove the target and then
create this" at least would have lesser damage -- the object
left as the result is incorrect nevertheless, but reading the
contents and creating a symlink that has that contents by handI am not sure what you mean by patchset part, but if you are
talking about the multiway diff text, I think most of the time
output from "-c -p" is much less interesting than "--cc".-
So we split patch for "type change" mode change for patch -p1
safety. But for gitweb more important than producing safe patch
is producing _readable_ patch[*1*]. Hence request for -T option
(or under other, better name) to git-diff which would _not_
split patch (not create two patches for one raw difftree line).Sorry, perhaps I was not clear. I'd like for git-diff-tree --cc
--raw -p to output both raw part (perhaps taken from -c) and
patch part[*2*]. Gitweb needs raw part for both whatchanged
and also for creating hyperlinks and gitweb quoted filenames
in the patch part (although with git diff headers buffering
and git diff extended headers parsing we could get the information
needed for hyperlinks and gitweb quoting of filenames for patch
part from patch part).[*2*] I'm wondering why we don't have --patch long version of -p
option to git-diff[-tree].--
Jakub Narebski
Poland
-
Honestly, I do not have strong feeling either way. As you say,
I suspect diff to change between symlink and regular file is not
readable no matter how you present it, and it is a corner case
that is not very interesting. It happens in real life but it is
rare enough that split patches or a single patch would not make
much difference either way.So I would not oppose to a patch to add an option to update
git-diff to produce either format, but I doubt it is worth the
effort required to make sure that the change does not break
anything else and also to make matching adjustment to git-apply,I can see why somebody might want to do this, to exactly the
same degree that I can see why somebody might want to use the
combination of "--raw -p".I think this would work in git.git itself:
git diff-tree --cc --numstat --raw -p v1.0.0
-
