[RFC PATCH v2 0/3] Documentation: refactor config variable descriptions

Previous thread: [PATCH] make pack-objects a bit more resilient to repo corruption by Nicolas Pitre on Thursday, October 21, 2010 - 9:53 pm. (11 messages)

Next thread: Buglet in i18n? by Johannes Sixt on Friday, October 22, 2010 - 12:18 am. (27 messages)
From: Thomas Rast
Date: Thursday, October 21, 2010 - 10:02 pm

This resurrects (finally) the earlier attempt at

  http://mid.gmane.org/cover.1280169048.git.trast@student.ethz.ch

It tries the inverse approach: teaching the script how to find config
variable blocks in each manpage, and then linking them from the main
list.  (Obviously just inserting them into the main list could also
work.)

In other words, it attempts to push out the "original" documentation
of each variable from the main list to the individual manpage, which
is exactly opposite from v1.

This has the advantage that it does not make the source .txt for each
manpage "unreadable by themselves", as the earlier approach did (and
Jonathan noticed).

It also has the advantage that it will shrink config-vars.txt over
time.

Instead of going through all manpages again, I just tacked on a sample
patch as 3/3 that shows what the use/effect could be.  Once we settle
on the direction (v1 or v2) of the refactoring, I'll make more patches
or resurrect the old ones.

I'm afraid 1/3 (semantically unchanged from the equivalent patch in
v1) will again not make it through, so I again pushed this out:

  git://repo.or.cz/git/trast.git t/doc-config-extraction-v2

(incidentally v1 is still there as t/doc-config-extraction).  I based
it on master; 1/3 is prone to conflicts but can easily be recreated
from scratch.


Thomas Rast (3):
  Documentation: Move variables from config.txt to separate file
  Documentation: complete config list from other manpages
  Documentation: move format.* documentation to format-patch

 Documentation/Makefile              |    9 +-
 Documentation/config-vars-src.txt   | 1691 +++++++++++++++++++++++++++++++++
 Documentation/config.txt            | 1748 +----------------------------------
 Documentation/git-format-patch.txt  |   65 ++-
 Documentation/make-config-list.perl |  131 +++
 5 files changed, 1892 insertions(+), 1752 deletions(-)
 create mode 100644 Documentation/config-vars-src.txt
 create mode 100755 Documentation/make-config-list.perl

-- ...
From: Thomas Rast
Date: Thursday, October 21, 2010 - 10:02 pm

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
---
 Documentation/config-vars-src.txt  |   56 -------------------------------
 Documentation/git-format-patch.txt |   65 +++++++++++++++++++++++++++++++++--
 2 files changed, 61 insertions(+), 60 deletions(-)

diff --git a/Documentation/config-vars-src.txt b/Documentation/config-vars-src.txt
index 949259c..e12ff6e 100644
--- a/Documentation/config-vars-src.txt
+++ b/Documentation/config-vars-src.txt
@@ -775,67 +775,11 @@ fetch.unpackLimit::
 	especially on slow filesystems.  If not set, the value of
 	`transfer.unpackLimit` is used instead.
 
-format.attach::
-	Enable multipart/mixed attachments as the default for
-	'format-patch'.  The value can also be a double quoted string
-	which will enable attachments as the default and set the
-	value as the boundary.  See the --attach option in
-	linkgit:git-format-patch[1].
-
-format.numbered::
-	A boolean which can enable or disable sequence numbers in patch
-	subjects.  It defaults to "auto" which enables it only if there
-	is more than one patch.  It can be enabled or disabled for all
-	messages by setting it to "true" or "false".  See --numbered
-	option in linkgit:git-format-patch[1].
-
-format.headers::
-	Additional email headers to include in a patch to be submitted
-	by mail.  See linkgit:git-format-patch[1].
-
-format.to::
-format.cc::
-	Additional recipients to include in a patch to be submitted
-	by mail.  See the --to and --cc options in
-	linkgit:git-format-patch[1].
-
-format.subjectprefix::
-	The default for format-patch is to output files with the '[PATCH]'
-	subject prefix. Use this variable to change that prefix.
-
-format.signature::
-	The default for format-patch is to output a signature containing
-	the git version number. Use this variable to change that default.
-	Set this variable to the empty string ("") to suppress
-	signature generation.
-
-format.suffix::
-	The default for format-patch is to output files with the suffix
-	`.patch`. Use ...
From: Jakub Narebski
Date: Thursday, October 21, 2010 - 11:19 pm

Could you please resend this patch using rename detection 
(git format-patch -M)?  It would make it clear what the difference
between config-vars and config-vars-src is, if any.

-- 
Jakub Narebski
Poland
--

From: Jakub Narebski
Date: Friday, October 22, 2010 - 1:18 am

I like it, even if the rest of this series would not get accepted.

-- 
Jakub Narebski
Poland
--

From: Jakub Narebski
Date: Friday, October 22, 2010 - 7:31 am

So is this script about automatically managing links from git-config
manpage to manpages of individual commands?  Does it mean that for 
something like the following in Documentation/config-vars-src.txt

  foo.bar::
  	Lorem ipsum dolor sit amet, consectetur adipisicing elit,
  	sed do eiusmod tempor incididunt ut labore et dolore magna
  	aliqua.

and if `foo.bar` is referenced in Documentation/git-foo.txt in some way
(from reading source of mentioned autogeneration script it considers
only 'foo.bar::', like in Documentation/git-send-email.txt), then it
adds

  See linkgit:git-foo[1]

at the end of description of variable in Documentation/config-vars-src.txt
(taking into account continuation blocks)?  What about config variables
mentioned in different ways, e.g. '`foo.bar`', like in git-update-index
documentation?  Does it checks that 'See linkgit:git-foo[1]' already
exists, e.g. in extended form 'See linkgit:git-foo[1].  True by default.'?
 
Or is it about automatically creating and updating blocks like this:

  sendemail.aliasesfile::
  sendemail.aliasfiletype::
  sendemail.bcc::
  sendemail.cc::
  [...]
  sendemail.thread::
  sendemail.validate::
  	See linkgit:git-send-email[1] for description.


See also comments below, where I realized what this scipt does...


While other helper scripts do not include comments describing them, it
would be nice to have here description what script does and for what it
is used.

Comments in code would also be nice.



The names <mainlist> (which is file with list of configuration variables
to modify), <ignore> (which is about ignoring some asciidoc_manpage files,
but only when recursing / following linkgit links) doesn't tell us much
by themselves.  It really needs either better names, or comment describing

It would be good to have comment what this subroutine does.  It reads
and parses file with list of config variables and their description,
and returns reference to list of variables, in the order they were ...
From: Jakub Narebski
Date: Friday, October 22, 2010 - 8:17 am

Note that patches 1/3 and 2/3 didn't made it to git mailing list
beause they were too large for vger anti-SPAM filter.  The 2/3
can use rename detection (-M) to be much smaller.

-- 
Jakub Narebski
Poland
--

From: Jeff King
Date: Friday, October 22, 2010 - 8:53 am

Thanks for working on this.

I think this approach is much saner for writers and readers of the
source files, and I think the output is much better (in particular, the
giant list having "see X" pointers instead of the actual description
blocks).

Your 2/3 doesn't seem to have made it through to the list, but I pulled
from your repository and looked at it. I have two comments on the
approach:

  1. It looks like you're more or less just parsing "::" list keys from
     all of the manpages. This seems somewhat fragile, as there could be
     other non-config lists. Can we at least restrict it to
     CONFIGURATION sections? It would be nice if we could put in some
     kind of semantic markup, but I'm not sure how painful that would be
     with asciidoc (we could always resort to comments in the source,
     but that would probably get unreadable quickly).

  2. You recursively follow includes via include::. This means that the
     make rule is not accurate. E.g., try:

       rm cmds-mainporcelain.txt config-vars.txt
       make config-vars.txt

     which should fail, as we actually depend on cmds-mainporcelain.txt.
     Doing it accurately and automatically would mean making .depend
     files for make.

     But I wonder if the recursive lookup is really required. Some of
     the includes with config files can just go away (e.g.,
     merge-config.txt is included only by config-vars-src.txt and
     git-merge.txt; it can just be merged straight into git-merge.txt
     once this system exists). Others, like pretty-formats.txt, should,
     IMHO, get their own user-visible page. Right now with your script
     you get[1]:

       format.pretty::
               The default pretty format for log/show/whatchanged command,
               See linkgit:git-log[1], linkgit:git-show[1],
               linkgit:git-whatchanged[1].

     but I would rather see[2]:

       format.pretty::
               See linkgit:gitpretty[7].

     [1]: I assume the single line of ...
From: Thomas Rast
Date: Saturday, October 23, 2010 - 6:24 pm

I figured for consistency and ease of lookup *all* configuration docs
should name the variables in the same format.  It can still be helpful
to mention them elsewhere, e.g. in the option documentations, but the
main docs should be a CONFIGURATION section formatted like this.

Or do you think that would be a bad thing?

As for false positives, we could do the CONFIGURATION but in any case
I was hoping to avoid a special markup by using an asciidoc markup to
mark false positives if they arise (there currently aren't any).
E.g., it should be possible to make a {noconfig} attribute that
expands to nothing or so.  [Then again the same trick could be used

True, you said that.  I'll hack it into this format, since it's easy
to do once the parsers are stable and can then just say something like
"herein" for all the ones actually in git-config(1).

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
--

From: Jeff King
Date: Monday, October 25, 2010 - 8:04 am

No, I think everything should be in a CONFIGURATION section. So checking
for that is probably enough. I was more concerned about false positives
creeping in. Checking just the CONFIGURATION section would probably make
that unlikely, though.

-Peff
--

From: Jakub Narebski
Date: Monday, October 25, 2010 - 5:44 am

The regexp for catching config variables is quite strict, but tests
done with my version of script modifier further to check for 
CONFIGURATION-like sections (matching /^Config/) shown that we have
problems either way.

1. Without checking for CONFIGURATION sections, we have what are 
   probably false positives from gitmodules.txt:

     submodule.<name>.path::
     submodule.<name>.url::
     submodule.<name>.update::
     submodule.<name>.ignore::

   I say *probably* because while gitmodules manpage describes those 
   variables as going into .gitmodules, if I understand it correctly
   those variables can got to .git/config as an override.

2. With checking for CONFIGURATION-like, we would miss the following
   configuration variables:

     http.getanyfile:: (for git-http-backend, in 'SERVICES')
     http.uploadpack:: (for git-http-backend, in 'SERVICES')
     http.receivepack:: (for git-http-backend, in 'SERVICES')

   These are in git-http-backend manpage, in 'SERVICES' section, which 
   probably should be named then 'CONFIGURING SERVICES'.

   BTW, CONFIGURATION-like means:

    * Configuration
    * CONFIGURATION
   
  but also

    * CONFIG FILE-ONLY OPTIONS
    * CONFIGURATION FILE
    * Configuration Mechanism
    * CONFIG VARIABLES
    * CONFIGURATION VARIABLES

We do that: see 'doc.dep' target in Documentation/Makefile.  We just
need for this target to also add dependencies for config-vars.txt
(perhaps separate mode for make-config-list.perl, or perhaps 

merge-config.txt is the only included file with config variables.
Other includes does not contain config variables.  And because 
merge-config.txt is also included in config-vars-src.txt, therefore
the whole recursive lookup is not necessary.

Note however that make-config-list.perl only creates minimal documentation,
just link(s) to appropriate mapage(s).  Include-ing merge-config.txt both
in git-merge.txt and config-vars-src.txt means that we have merg config

Actually the above ...
From: Jeff King
Date: Monday, October 25, 2010 - 8:11 am

I would argue those should probably go in a CONFIGURATION section for

Again, I think for consistency for the reader, it may make sense to
switch them all to CONFIGURATION. I'd have to look at each page and see


I disagree. I think one of the benefits of this exercise is generating a
more concise list. That being said, I don't think there's any reason we
can't have a terse list in gitconfig(7) and a much larger one in
gitconfigfull(7) or something like that (or even put it later in the
manpage of gitconfig(7), or whatever).

If you're going to do that, though, there's no point in having
merge-options separate. make-config-list should just generate both

Oh, you're right. I was browsing the output and just assumed it was

The comma at the end made it look to me like a sentence had been cut off
during parsing. But looking at config.txt, it is simply a typo.  The
comma should be a period.

-Peff
--

From: Jakub Narebski
Date: Monday, October 25, 2010 - 8:49 am

What consistency ;-)?  But I agree, if we can switch all the following 
variants to 'CONFIGURATION', it should also go in 'CONFIGURATION'
section.  If we cannot, then 'CONFIGURING SERVICES' as section name,


Though simpler would be just to not use or turn off following includes,
as it turned out that it doesn't matter to follow includes in manpages:
if we include with config variables, it is to also include it in 
config-vars-src.txt.

Well, assuming that we don't have to follow includes in config-vars-src.txt;

I can agree with eliminating those blocks that consist only of list of
variables and reference to main page, like the following[1]

  sendemail.<identity>.*::
        Identity-specific versions of the 'sendemail.*' parameters
        found below, taking precedence over those when the this
        identity is selected, through command-line or
        'sendemail.identity'.

  sendemail.aliasesfile::
  sendemail.aliasfiletype::
  sendemail.bcc::
  sendemail.cc::
  [...]
  sendemail.thread::
  sendemail.validate::
        See linkgit:git-send-email[1] for description.


and perhaps also this block

  imap::
        The configuration variables in the 'imap' section are described
        in linkgit:git-imap-send[1].

[1] Though only if we can ensure that they are added below 

You can mechanically move or copy description of config variables from
config-vars-src.txt to individual manpages.  Moving in reverse direction,
like in your autogenerated gitconfigfull(7) proposal, is not that easy:
description of config variables in individual manpages can refer to
other parts of said manpage (and barring AI you cannot reliably detect
such situation).

-- 
Jakub Narebski
Poland
--

From: Jakub Narebski
Date: Wednesday, October 27, 2010 - 3:56 am

The part of patch (commit) touching Documentation/Makefile could look
like this:

-- 8< --
diff --git a/Documentation/Makefile b/Documentation/Makefile
index e117bc4..351aa9c 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -212,9 +212,12 @@ install-html: html
 #
 # Determine "include::" file references in asciidoc files.
 #
-doc.dep : $(wildcard *.txt) build-docdep.perl
+doc.dep : $(wildcard *.txt) build-docdep.perl make-config-list.perl
 	$(QUIET_GEN)$(RM) $@+ $@ && \
 	$(PERL_PATH) ./build-docdep.perl >$@+ $(QUIET_STDERR) && \
+	$(PERL_PATH) ./make-config-list.perl --deps=config-vars.txt \
+		--mainlist=config-vars-src.txt $(MAN_TXT) \
+		>>$@+ $(QUIET_STDERR) && \
 	mv $@+ $@
 
 -include doc.dep
@@ -320,6 +323,11 @@ howto-index.txt: howto-index.sh $(wildcard howto/*.txt)
 	'$(SHELL_PATH_SQ)' ./howto-index.sh $(wildcard howto/*.txt) >$@+ && \
 	mv $@+ $@
 
+config-vars.txt: config-vars-src.txt $(MAN_TXT) make-config-list.perl
+	$(PERL_PATH) ./make-config-list.perl \
+		--mainlist=$< --ignore=$@ $(MAN_TXT) >$@+ && \
+	mv $@+ $@
+
 $(patsubst %,%.html,$(ARTICLES)) : %.html : %.txt
 	$(QUIET_ASCIIDOC)$(ASCIIDOC) $(ASCIIDOC_EXTRA) -b xhtml11 $*.txt
 
-- >8 --

Note that while in rule for config-vars.txt target we can use automatic
variables, we have to spell filenames in full in the part of doc.dep
rules that generate dependency for config-vars.txt.  Note also that we
have to keep those explicit and implicit filenames in sync across those

Decision time!!!

There are two possible approaches:

1. Don't follow includes in manpages (in manpage sources), and don't follow
   includes in config-vars-src.txt

   Advantages:
   * Less code to write and maintain
   * No need to keep filenames in rule for config-vars.txt and for doc.dep
     in sync.

   Disadvantages:
   * Theoretically fragile: it depends on two assumptions:

     A. If file included in manpage contains description of config vars,
        then this file would be included ...
From: Jens Lehmann
Date: Tuesday, November 2, 2010 - 6:42 am

AFAICS only 'path' won't show up in .git/config. 'url' and 'update'
are copied into .git/config by "git submodule init", 'ignore' might
be added by a developer to override the setting from .gitmodules.
--

Previous thread: [PATCH] make pack-objects a bit more resilient to repo corruption by Nicolas Pitre on Thursday, October 21, 2010 - 9:53 pm. (11 messages)

Next thread: Buglet in i18n? by Johannes Sixt on Friday, October 22, 2010 - 12:18 am. (27 messages)