Re: Implementing branch attributes in git config

Previous thread: [patch] munmap-before-rename, cygwin need by Yakov Lerner on Sunday, May 7, 2006 - 12:58 pm. (5 messages)

Next thread: [PATCH] Release config lock if the regex is invalid by Pavel Roskin on Sunday, May 7, 2006 - 2:36 pm. (5 messages)
From: Pavel Roskin
Date: Sunday, May 7, 2006 - 2:34 pm

Hello!

I have tried to implement branch attributes (more specifically, the
default for git-fetch) in git config.  It turns out that the format of
the config file is to rigid.

Minuses, underscores, colons and dots are not allowed in the section
headers and keys.  I can understand that dots should be allowed either
in the section names or in the key names, but other limitations seem
totally unnecessary.

I think a good implementation should allow any characters in the keys,
even "=", because the key can be quoted.  Section names may disallow
square brackets and dots.

Such limitations make it unpractical to use branch names in section or
key names.  I'd like to have it fixed.

The only remaining place is values.  This means that there should be
multiple entries for the same key.  While this is allowed, it seems
quite fragile and inconvenient.

In particular, git-repo-config leaves the config file locked in the
regex is wrong:

$ git-repo-config branch.fetch "master:origin" +
Invalid pattern: +
$ git-repo-config branch.fetch "master:origin" +
could not lock config file

To fix it, just add "close(fd); unlink(lock_file);" after "Invalid
pattern" in config.c.

I don't quite understand what pattern is needed to add an entry.  "foo"
seems to work fine, I don't know why.

That problem with multiple values is that they are quite fragile and
require special options to access them.  Since regex is used, dots in
the branch names need to be escaped.  Probably more escapes are needed.

Anyway, here's the preliminary patch that implements default fetch
branches.  Unfortunately, it doesn't even come close to the goal of
having per-branch attributes due to the config file limitations.

To add an entry, use

git-repo-config branch.fetch "localbranch:remotebranch" foo


Read the default fetch branch from the config file

From: Pavel Roskin <proski@gnu.org>


---

 git-fetch.sh |   21 +++++++++++++++++----
 1 files changed, 17 insertions(+), 4 deletions(-)

diff --git ...
From: Junio C Hamano
Date: Sunday, May 7, 2006 - 3:24 pm

I'd give Johannes the first refusal right to deal with this and
not touch repo-config.c myself for now, since I suspect I

I think the value regexp is "replace the ones that match this",
and the convention he came up with is to use "^$" to append (see
some examples in t/t1300-repo-config.sh).

In any case, Documentation/git-repo-config.txt mentions
value_regex without explaining what the semantics is.  This

I have a suspicion that using regex while is more powerful and
expressive might be a mistake and it would be easier for users

I thought that is what the "for blah" convention is for (BTW
"for" is not a keyword, just a convention).

Also the syntax is a bit confusing.  I initially was confused,
after looking at your example:

	[branch]
        	fetch = "localbranch:remotebranch"

that the colon separated value was a usual refspec, but it
isn't.  The LHS is the name of the current branch, and RHS is
the name of the remotes/ file, which is not a remote _branch_ at
all.  Perhaps something like this is semantically clearer, aside
from names -- I am bad at coming up with them:

	[branch]
		defaultremote = origin for master
                defaultremote = private for test

to mean, "we use remotes/origin when on master branch by
default".  Also we would want to use remote.origin.{url,fetch}
configuration item as well, so the "file existence" check you do

We should instead let the existence check do the right thing for
"$1" when it is used; after all the default "origin" may not
exist, and I suspect we do not probably check that (or if we do
already, the above check is totally unnecessary).

I haven't thought things through but I have a suspicion that
get_config_rem_branch might belong to git-parse-remote.sh not
git-fetch.sh.

-- >8 --
diff --git a/Documentation/git-repo-config.txt b/Documentation/git-repo-config.txt
index fd44f62..660c18f 100644
--- a/Documentation/git-repo-config.txt
+++ b/Documentation/git-repo-config.txt
@@ -23,10 +23,11 @@ You can ...
From: Johannes Schindelin
Date: Sunday, May 7, 2006 - 5:43 pm

Hi,


Thanks. Yes, you tempted me real hard :-) Unfortunately, the restructuring 
will have to wait a little, because I really should work for my day job 

I did not know about fnmatch()... It is probably better than regular 
expressions. After all, what I use it for most often is

	git-repo-config --get-all remote.*url


FWIW I like that syntax much better. But then, somebody called me weird 

I would actually prefer to go with your suggestion of using shell patterns 
instead of regular expressions. They are not needed, and most users tend 
to positively hate regular expressions.

Thoughts?

Ciao,
Dscho

-

From: Linus Torvalds
Date: Sunday, May 7, 2006 - 5:05 pm

That wouldn't help. The thing is designed to not only need no quoting 
(except for the _value_), it also is designed to have both section and key 
names ignore case. So you really aren't supposed to put things like branch 
names (which are potentially case-sensitive, depending on filesystem) in 
either.

And we depend on that (and I think it's important - users normally should 
_not_ care about capitalization)

So you'd need to literally create a different syntax if you want unquoted 

Here's a possible syntax/patch. I actually think the quoting makes more 
sense in the section name, since you'll usually want several keys under 
each branch, so it makes sense to make the _branch_ be the section.

It basically makes it a special case if you have the section name be 
marked with quotes, like

	["XyZZy"]

and in that case it turns the _real_ section name into the string 
"branch.XyZZy", where the important part is that it does this without 
changing case or limiting the character set (but it will _not_ allow a 
newline) in any way.

So you could have something like

	["Origin"]
		URL = ...
		fetch = master

and it would just turn it into

	branch.Origin.url = ...
	branch.Origin.fetch = master

etc.

No, I'm not sure this is the best possible syntax. It's just a suggestion. 
And it's certainly simple enough.

The downside is that if you start using config files like this, you 
literally can't go back to older git versions. They'll refuse to touch 
such a config file (rather than just ignoring the new entries) and will 
exit with nasty messages. That might be unacceptable.

Instead of quoting with double-quotes, it might be ok to use some 
alternate syntax line "[:$branchname:]" which looks visually reasonable, 
and has the potential advantage that ':' is already an invalid character 
in a branch name, so you don't actually even need any quoting logic at all 
at that point.

I think the

	["branch"]
		...

syntax looks reasonably readable and ...
From: Linus Torvalds
Date: Sunday, May 7, 2006 - 5:18 pm

You also need to move the "int c = get_next_char()" in the for() loop down 
to the end of the loop (removing the "int", of course).

So here's the fixed thing in case people care (and this time I actually 
looked at whether it might even work, not just compile ;)

		Linus
---
diff --git a/config.c b/config.c
index 41066e4..69d451a 100644
--- a/config.c
+++ b/config.c
@@ -134,12 +134,46 @@ static int get_value(config_fn_t fn, cha
 	return fn(name, value);
 }
 
+static int get_branch_name(char *name)
+{
+	int baselen = 7;
+	int quote = 0;
+
+	memcpy(name, "branch.", 7);
+	for (;;) {
+		int c = get_next_char();
+		if (c == EOF)
+			return -1;
+		if (c == '\n')
+			return -1;
+		if (!quote) {
+			if (c == '"')
+				break;
+			if (c == ']')
+				return -1;
+			if (c == '\\') {
+				quote = 1;
+				continue;
+			}
+		}
+		name[baselen++] = c;
+		if (baselen > MAXNAME / 2)
+			return -1;
+	}
+	if (get_next_char() != ']')
+		return -1;
+	return baselen;
+}
+
 static int get_base_var(char *name)
 {
 	int baselen = 0;
+	int c = get_next_char();
+
+	if (c == '"')
+		return get_branch_name(name);
 
 	for (;;) {
-		int c = get_next_char();
 		if (c == EOF)
 			return -1;
 		if (c == ']')
@@ -149,6 +183,7 @@ static int get_base_var(char *name)
 		if (baselen > MAXNAME / 2)
 			return -1;
 		name[baselen++] = tolower(c);
+		c = get_next_char();
 	}
 }
 
-

From: Linus Torvalds
Date: Sunday, May 7, 2006 - 5:25 pm

Side note: with this syntax, the _users_ will all just basically do

	if (!strncmp(name, "branch.", 7)) {
		branch = name + 7;
		dot = strchr(branch, '.');
		if (!dot)
			return -1;
		*dot++ = 0;
		.. we now have the branchname in "branc",
		    and the rest in "dot" ..

and if your branch names are purely alphabetical and lower-case, you can 
now write

	[branch.origin]
		remote = true
		url = git://git.kernel.org/...
		fetch = master

	[branch.master]
		pull = origin

and it will be parsed _exactly_ the same as

	["origin"]
		remote = true
		url = git://git.kernel.org/...
		fetch = master

	["master"]
		pull = origin

while the [branch.origin] syntax allows old versions of git to happily 
ignore it. So that would be a kind of cheesy work-around: the new 
double-quoted format is only _required_ for any branch-names that have 
special characters in it.

		Linus
-

From: Johannes Schindelin
Date: Sunday, May 7, 2006 - 6:05 pm

Hi,


Eek.

The ["blablabla"] syntax fails the is-it-obvious-what-this-does test. What 
*is* wrong with the " for " syntax? IIRC it was even proposed by you, and 
it happens to be backward compatible.

Ciao,
Dscho

-

From: Pavel Roskin
Date: Sunday, May 7, 2006 - 6:21 pm

Hello, Johannes!


Not trying to answer for Linus, here's my take.

The "for" syntax is one more layer in the config hierarchy.  Adding
another layer is a more intrusive solution than relaxing the syntax of
the existing elements without changing their semantic.

git-repo-config is "for" agnostic.  It doesn't parse "for" (as far as I
know).

"for" can be confusing in some contexts, or may force inversion of the
hierarchy to make the config file more readable.  How would you
implement branch descriptions?  See this:

[branchdata]
description = "netdev" for "Network device development"

and this

[branchdata]
description = "Network device development" for "netdev"

The later is closer to English.  Or should we use the first approach and
"is" instead of "for"?

Now, how can I get a description for the "netdev" branch by one
git-repo-config command, without pipes?

-- 
Regards,
Pavel Roskin

-

From: Johannes Schindelin
Date: Sunday, May 7, 2006 - 6:27 pm

Hi,



	git-repo-config --get branchdata.description ' for netdev$'

Hth,
Dscho

-

From: Pavel Roskin
Date: Sunday, May 7, 2006 - 6:55 pm

No, it doesn't remove "for netdev".  What I really don't like is that
git-repo-config treats it as "not my problem".

git-repo-config places extremely tight limitations of the names of the
sections and the keys.  But sometimes a relationship between two loosely
defined strings needs to be presented.  It's a real need.  And
git-repo-config doesn't address this need.

I believe git-repo-config should allow direct retrieval of data from any
depth, and the syntax should be explicit rather than fuzzy.  A dot is
more explicit than "for", especially if the dot appears after a name
that may not contain dots.

Another question is how we want to group the data.  Do we want to have
all descriptions together or in separate sections?  Whatever the answer,
git-repo-config should provide means to extract all data in one command,
without need for postprocessing.

So I understand arguing where to place the branch name.  But what I
don't like is the desire to offload part of the processing on the
callers.

-- 
Regards,
Pavel Roskin

-

From: Junio C Hamano
Date: Monday, May 8, 2006 - 2:00 am

Stating what you do not like about something is a good first
step to improve that something.  It should not be too hard to
extend the parser to grok:

	repo-config --get branchdata.description '\(.*\) for netdev$'

and when the value_regex has a capture return what matches
instead of the entire value.  I think that would do what you
want.

-

From: Johannes Schindelin
Date: Monday, May 8, 2006 - 5:17 am

Hi,


POSIX regexps do not want the backslashes...

Something like this?

---
[PATCH] repo-config: allow one group in value regexp

The regular expression for the value can now contain one group. In case
there is one, the output will be just that group, not the whole value.
Now you can say

        git-repo-config --get branch.defaultremote '(.*) for junio'

and for a config like this:

        [branch]
                defaultRemote = sushi for junio

the output will be "sushi".

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>

---

 repo-config.c |   30 ++++++++++++++++++++----------
 1 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/repo-config.c b/repo-config.c
index 63eda1b..9ac4c51 100644
--- a/repo-config.c
+++ b/repo-config.c
@@ -26,31 +26,41 @@ static int show_all_config(const char *k
 static int show_config(const char* key_, const char* value_)
 {
 	char value[256];
-	const char *vptr = value;
+	const char *vptr = value_;
 	int dup_error = 0;
 
 	if (value_ == NULL)
-		value_ = "";
+		vptr = value_ = "";
 
 	if (!use_key_regexp && strcmp(key_, key))
 		return 0;
 	if (use_key_regexp && regexec(key_regexp, key_, 0, NULL, 0))
 		return 0;
-	if (regexp != NULL &&
-			 (do_not_match ^
-			  regexec(regexp, value_, 0, NULL, 0)))
-		return 0;
+	if (regexp != NULL) {
+		regmatch_t matches[2];
+		int len;
+
+		if (do_not_match ^ regexec(regexp, value_, 2, matches, 0))
+			return 0;
+		len = matches[1].rm_eo - matches[1].rm_so;
+		if (!do_not_match && len > 0) {
+			if (len > 255)
+				len = 255;
+			strncpy(value, value_ + matches[1].rm_so, len);
+			value[len] = 0;
+			vptr = value;
+		}
+	}
 
 	if (show_keys)
 		printf("%s ", key_);
 	if (seen && !do_all)
 		dup_error = 1;
-	if (type == T_INT)
+	if (type == T_INT) {
 		sprintf(value, "%d", git_config_int(key_, value_));
-	else if (type == T_BOOL)
+		vptr = value;
+	} else if (type == T_BOOL)
 		vptr = git_config_bool(key_, value_) ? "true" : ...
From: Pavel Roskin
Date: Monday, May 8, 2006 - 8:15 am

OK, that would be acceptable.

I still don't like the "for" conversion because it masquerades syntax
(note that text to right of "for" must be unique) as plain text, but
it's a matter of taste, so it's hard for me to argue about it.

-- 
Regards,
Pavel Roskin

-

From: Pavel Roskin
Date: Sunday, May 7, 2006 - 5:36 pm

Hello, Linus!


You code faster that I write e-mails :-)

I like your approach, even though it breaks compatibility.  I understand
we are going to more .git/remotes to the config, so compatibility will
be broken anyways.

I'm only concerned that we would be hardcoding the word "branch".  We
could need fancy section names for other things in the future.

-- 
Regards,
Pavel Roskin

-

From: Linus Torvalds
Date: Sunday, May 7, 2006 - 5:43 pm

Fair enough. We could have used some fake section name that you can't 
generate any other way (in fact, "Branch.$branchname" would be that), but 
the upside of using "branch" is exactly that you _can_ generate it with 
the old-style syntax that is acceptable to older git versions too.

So the common case (all-lowercase, no special characters branch names) 
wouldn't need to break.

Now, backwards competibility for the .git/config file isn't likely a huge 
issue, but it does matter if you want to do things like "git bisect" to 
bisect a totally unrelated bug, and part of the bisection is actually to 
install the older git version that you're testing for the bug..

(Which is probably an insane thing to do anyway - you should be able to 
test any bugs _without_ actually having to install the git version you're 
testing. But whatever..)

		Linus
-

From: Junio C Hamano
Date: Sunday, May 7, 2006 - 6:27 pm

Sorry, I do not follow the old-style syntax part.

How about keeping the default syntax as it is (tokens are case
insensitive and alnums only, dot separates tokens into
sections), and when a token that violates that rule needs to be
spelled out, require quoting, so:

	branch.foo	BranCh.FoO	branch.FOO

are the same (section "branch.foo"), and if I have js/fmt.patch
branch, I need to spell the configuration for that branch like
so:

	branch."js/fmt.patch"	or   "branch.js/fmt.patch"        

and the URL variable for that section is

	$ git repo-config '"branch.js/fmt.patch".url'

(BTW, you could even have a variable with dots in it by quoting
the variable name, like "branch.js/fmt.patch"."fetch.option"; I
do not know if it is worth it).

My repository is full of topic branches that are named xx/yyyy.
It is very handy to be able to say "show-branch --topics master
'heads/??/*' next" which would not show my other branches like
"test", "throwaway", "rework", "temp", etc.

-

From: sean
Date: Sunday, May 7, 2006 - 6:34 pm

On Sun, 07 May 2006 18:27:32 -0700

Doesn't that mean you have to then prohibit creating mixed

How about transforming slashes into dots?  so the above would 
be:

   [branch.js.fmt.patch]

      And could be accessed by either:

    $ git repo-config branch.js.fmt.patch

Not worth it.  Branch names should be alnums and imho should be

Very nice.

Sean
-

From: Johannes Schindelin
Date: Sunday, May 7, 2006 - 6:45 pm

Hi,


Why should they be case sensitive? So you have a branch "origin" and 
another named "Origin" and get totally confused?

Ciao,
Dscho

-

From: sean
Date: Sunday, May 7, 2006 - 6:44 pm

On Mon, 8 May 2006 03:45:47 +0200 (CEST)

I don't care either way, just think we should be consistent.  Currently
we support case sensitivity in branch names and let people get 
totally confused.   But in practice people don't really get confused
by the linux filesystem which is case sensitive.

Sean
-

From: Junio C Hamano
Date: Sunday, May 7, 2006 - 7:29 pm

Not at all.  Whatever Porcelain that runs repo-config to record
the branch name needs to spell that branch name with proper

I _do_ want to keep my slashes intact and also dots; mangling
them is not very nice to me X-<.

-

From: sean
Date: Sunday, May 7, 2006 - 7:39 pm

On Sun, 07 May 2006 19:29:50 -0700

Okay.  It just seems nuts to require quoting because you happen
to use an uppercase character.  People are used to quoting 

You're right.

We should just relax the config file rules for legal characters,
in identifiers, at least [a-zA-Z0-9_/-].

Sean.
-

From: Daniel Barkalow
Date: Monday, May 8, 2006 - 4:20 pm

You could tell people always to use:

 [branch."name"]

even if the branch name is all lowercase anyway. They could even use:

 [Branch."MyMixedCaseBranch"]

Then when you refer to something case-sensitive with the possibility of 
funny characters, you put it in quotes, regardless of what it is.

For that matter, we could retain the quotes when we parse the file, and 
reject [branch.master] for lacking the quotes, so that people who are only 
exposed to branch names all in lowercase letters don't get habits that 
will fail when they have a v2.6.16.x branch.

I don't think that people are likely to use older versions of git on the 
same repository they've used newer versions on. (Clones of it, sure, but 
that doesn't matter here.) But we should, in any case, make the code 
ignore sections or lines with syntax errors, under the assumption that 
they're a later extension and possibly legal but not anything the code 
could be interested in getting from a parser that doesn't support them.

	-Daniel
*This .sig left intentionally blank*
-

From: sean
Date: Monday, May 8, 2006 - 4:30 pm

On Mon, 8 May 2006 19:20:17 -0400 (EDT)


Unfortunately that's not the only place where you have to use the 
excessive quoting; you also have to do the same when using the git
repo-config command line.   All because we're clinging to case
insensitivity and a very restrictive set of characters for 
identifiers in the config file.  And just why are we clinging?

Believe when Linus offered the code originally he said that it was
easier to start out very restrictive and loosen up later than it was
to start loose and become restrictive later.  That makes a lot of
sense, but somewhere along the way we seem to have made a virtue
out of something that is actually getting in the way of natural
syntactic usage.  There's no good reason for git to break with 
traditional and common practice in this case.

Sean
-

From: Johannes Schindelin
Date: Monday, May 8, 2006 - 4:44 pm

Hi,


It is well established common practice that ini files (and everything in 
config resembles an ini file) have case insensitive section headers as 
well as keys.

Ciao,
Dscho

-

From: sean
Date: Monday, May 8, 2006 - 5:08 pm

On Tue, 9 May 2006 01:44:31 +0200 (CEST)

Not in ini files that support section headers that represent filenames
and directories.  Exactly because those things _are_ case sensitive 
and include more characters than just alnums.  But we're not just 
talking about the config file syntax we're talking about how the 
user must ultimately refer to branches with uppercase character.  
Now everytime a person wants to add a branch attribute via repo-config
they have to remember that git thinks uppercase characters should
be quoted.  Doesn't that sound ridiculous to you?

The point is that we should make the config file and the repo-config
command easy for the _users_.   Instead we're going to make them use
some crazy extra syntax because we refuse to come to terms with the
limitations of the choices we've made so far.

One option, which I don't really like and comes with its own set of 
problems, would be to do something like:

[branch1]
    streetname = "p4/BrAnCH"
[branch2]
    streetname = "origin"

And then allow reference to it from git-repo-config by either branch#
or the value of street name.  While it would take some extra coding
but at least it lives within the current restrictions for key names.

It just seems that once you have to even consider options like this,
a big red flag should be raised about some of the assumptions we've
built into the system.

Sean
-

From: Johannes Schindelin
Date: Monday, May 8, 2006 - 5:23 pm

Hi,


Conceptually, it feels just wrong to me that a section name should contain 
a filename. A section is something which contains data regarding a certain 
concept. Such as "core", which has values concerning the core of git.

In that sense, I stay with my preferance:

[branch]
	defaultRemote = bla for bloop
	defaultRemote = connery for sean

No need to extend the simple config file. And remember, it is so simple, 
because it is so strict.

Ciao,
Dscho

-

From: Linus Torvalds
Date: Monday, May 8, 2006 - 5:37 pm

You don't actually need that.

We could easily do

	[branch]
		name = "p4/BrAnCH"
		url = git://git.kernel.org/...
		pull = master

	;
	; Repeating the "[branch]" section here isn't
	;  needed, but doesn't hurt either, and is
	; more readable
	;
	[branch]
		name = "origin"
		url = ...
		pull = ...

because the config file is always parsed linearly, and just 
trigger on "branch.name", and keep that around when parsing 
anything else.

The problem with _that_ is that "git repo-config" can't add this kind of 
setup sanely: it doesn't understand that kind of statefulness.

The above would work without any changes what-so-ever to the config format 
and readers - and old versions of git wouldn't react badly either. But the 
writer would need to be taught about state.

		Linus
-

From: sean
Date: Monday, May 8, 2006 - 5:49 pm

On Mon, 8 May 2006 17:37:29 -0700 (PDT)

That was pretty much what I was suggesting although i overlooked 

Shouldn't be too hard to deal with.  So we would allow the following
command to make the url entry above:


Well it seems like maybe the best compromise, and it avoids having to put
files and directories into section names which Johannes was objecting to.

Sean
-

From: Junio C Hamano
Date: Monday, May 8, 2006 - 5:54 pm

Wait a minute...  Statefulness is not the issue, I think.

How would you tell your updated repo-config what to update and
what to look up?

	- I want the url for branch whose name is "origin"

	- I want to fetch their "for-linus" branch when fetching
          from the branch whose name is "jgarzik" from now on.

In these query and update, you _are_ naming the branch with
name="xxx"; you just made "name" a special attribute.

Now, how would that compare with:

        [branch.jgarzik]
                url = git://git.kernel.org/...
                fetch = for-linus

or
	[branch."JGarzik"]
                url = git://git.kernel.org/...
                fetch = for-linus

I would say if we are grouping things together, if we can give
name to each group, _and_ if we are going to refer to the group
with its name, we are better off making the groups into distinct
sections _and_ make the syntax obvious that what name refers to
the section.  I think [branch.jgarzik] syntax is more obvious
than your example where "name =" line is implicitly used as a
keyword to differenciate multiple occurrences of [branch]
sections.

Having said that, perhaps you have something more elaborate in mind,
e.g.

	repo-config --get branch.url where name = 'origin'
	repo-config --get branch.name where url like 'git://%'
	repo-config branch.url 'git://git.kernel.org/...' where name = foo

;-) ;-) ;-) ???  If that is what you are after, then I agree
your syntax is more generic and suitable.  But otherwise I fail
to see its point.

On a related topic, I have always been torn about the "for"
convention.  While I think it was a cute hack, it would break
quite badly once we start doing anything complex.

	[branch]
        	url = git://git.kernel.org/... for jgarzik
                fetch = for-linus for jgarzik
                proxy = none

        	url = git://git.kernel.org/... for torvalds
                fetch = master for torvalds

-

From: Linus Torvalds
Date: Monday, May 8, 2006 - 6:05 pm

Exactly, git repo-config would have to know about this magic thing, and 
have a special argument like

	--state=branch.name

that says that "state" is to be taken from the "branch.name" variable when 
seen.

Then, in addition to the regexp, you would have a way to trigger on the 

It would be _able_ to do all the same things, but thanks to statefulness 

I agree. And I think it's actually very much the same thing. It adds 
state, but it adds it to each _value_, instead of adding it once "before" 
the values.

		Linus
-

From: Junio C Hamano
Date: Monday, May 8, 2006 - 6:18 pm

Not really, you ended up making it so, perhaps, but I do not
think it needs to be.

You dodged my comments on the SQL-like queries ;-).  I was half
(perhaps 3/4) joking, but some people actually might like to be
able to say:

	git repo-config --get-all branch.name where url like 'git://%'

to list all the repositories reachable via git-native protocol.

Oops, by the way, why does a [branch] have url as its attribute?
I think this was a bad example -- we are talking about [remote]

I am reluctant to buy that argument (I see it is an easy way
from the implementation point of view) -- it appears to me that
it would invite this easy user error.


	[branch]
        	name = linus
        	url = git://git.kernel.org/../torvalds/linux-2.6

	[branch]
        	url = git://git.kernel.org/../jgarzik/libata-dev

Yes, but that statefulness is inviting user errors, and you need
to update repo-config and config parser anyway, so I still do
not see what the advantage is.

-

From: Linus Torvalds
Date: Monday, May 8, 2006 - 6:30 pm

I think databases tend to be a huge mistake. Show me a SQL database where 
users can edit the data by hand, and it's all readable, and maybe I'll 
change my mind.

The monotone guys have almost everything in a database, and from what I 
can tell it results in (a) you can do some really funky queries and (b) 
it's confusing and slow as hell.

Now, the speed issue doesn't matter for a config file, but there the 


Yes, that would be a silent and confusing error.

		Linus
-

From: Junio C Hamano
Date: Monday, May 8, 2006 - 10:31 pm

Although I haved raised objections, I actually started to like
the idea of using multiple [branch] (or wasn't it [remote] given
the example variables we have been using?) sections.

We should first depart from the Windoze .INI mindset.  While I
do not think users expect case insensisitivity, only because the
section headers are marked with [brackets], if that syntax
somehow makes people expect such, maybe we should stop using
[bracket] as section markers.

Whatever marker we end up using, I'd suggest somewhat different
approach.

 - Treat each part that are grouped under [bracketted-string]
   marker as a bag of variable=value pairs.  Loosely speaking,
   the bracketted-string defines a schema -- what kind of
   variables are expected to be there in that seciton.  For
   example, a section for things we traditionally had in remotes
   file would contain fields (variables) such as url, fetch, and
   push (we might add proxy to this list).  And we call this
   "bag of variable=value" a section.

 - There can be multiple sections in a config file that uses the
   same schema.  The example at the beginning of this message
   is perfectly valid.  It defines two sections of type
   [branch], each of which has two variables (name and url) in
   it.

Unlike your earlier suggestion, the second [branch] is not just
for readability; it is mandatory, because we are talking about
two different [branch]es (eh, that's [remote]s, really), it
needs to be there to separate two instances.

The above would break the existing repo-config command, but
let's forget about it for now.  I think we are breaking
forward/backward compatibility in any proposals brought up so
far anyway.

We would need user interface level commands to

	add a new section
        delete a section

We would need a way to identify a secion, perhaps using a value
of arbitrary key (e.g. "where name=blah").  Creating a section
could be implicit, just like the current repo-config.

        add a variable=value to a ...
From: Linus Torvalds
Date: Monday, May 8, 2006 - 6:57 pm

Btw, I keep coming back to the same

	["jc/show-branch-dense"]
		remote = git://...

branch specifier syntax. It just seems very intuitive and is easy to 
parse. 

The only real downside ends up being the non-forwards-compatibility thing. 
But trying to be forwards-compatible for old git versions with this thing 
would seem to be a major pain for rather slim gain.

			Linus
-

From: sean
Date: Monday, May 8, 2006 - 7:47 pm

On Mon, 8 May 2006 18:57:08 -0700 (PDT)

We already need such section headers for remotes, and for branches.. 
both which may well need to be quoted as above.. how to distinguish
between them?   How to handle the next case that comes along where we
want these special header semantics?

We really need:

   [branch.Whatever]

and:

   [remote.Whatever]

As in the case of "origin" where we have a remote and a branch

What's the advantage of section quotation marks over just allowing these
characters in regular section names?  To be specific, what is wrong with:

   [jc/show-branch-dense]
       remote = git://...

If we just relax the legal characters in identifiers to include slash and dash?
It doesn't seem to be any different in amount of code needed to achieve, and it
is just as intuitive (perhaps more so) and no worse on the forward-compatibility
thing.

If we continue with case insensitive section names, it's quite possible for 
the user to mess up while hand editing or forgetting proper "git" quoting
rules on the command line and end up with silent breakage:

$ git repo-config "Branch".url = git://...  
        (updates section ["Branch"])

And then the next time forget the quotes and use:

$ git repo-config Branch.url = git://...    
         (updates section [branch])

And all of a sudden the user is updating _different_ sections with 
unpredictable results.  This is just awkward; why not just admit that 
section names should always be case sensitive if we're going to put
filenames inside them?

Sean
-

From: Linus Torvalds
Date: Monday, May 8, 2006 - 8:08 pm

This would _suck_

What if you have a branch called "core"? Not all all unlikely. 

Think about what a section like

	[core]

really means.

Plus I really want to not be case sensitive by default. Case sensitivity 
really is _not_ normal for this kind of config file syntax.

			Linus
-

From: sean
Date: Monday, May 8, 2006 - 8:07 pm

On Mon, 8 May 2006 20:08:41 -0700 (PDT)


Yeah, but the part of my message you didn't quote made it quite clear i know
about this problem, what i would really propose is:

[core]
 ...
[branch.core]
 ...
[remote.core]
 ...


But it's not just the config file, it's also how it ends up being used
on the command line..  you have to admit silent differences between
these two command lines is _not_ desirable:

    $ git repo-config "Branch".url  
    $ git repo-config Branch.url

That can't be something you want to see either.

Sean
-

From: Linus Torvalds
Date: Monday, May 8, 2006 - 9:11 pm

Ok. In that case, I would suggest a much more readable format:

	[core]
		...

	[branch "core"]
		...

	[remote "core"]
		...

and yes, enforce the <space>+<quoted name> format. We'd turn it internally 
into some random internal string format (probably replacing the space with 

Actually, the command line migth as well allow any strange thing, and 
_add_ the quotes internally. So you could do something that does

	git repo-config set Remote.Core.Pull master

and we could just let that generate

	[remote "Core"]
		pull = master

or whatever.

I care about the _file_ being human-editable, but that means that I think 
it's perfectly fine to have some smarts in the tools that help us do so, 
and let them recognize the magic "remote" and "branch" prefixes.

		Linus
-

From: Martin Waitz
Date: Tuesday, May 9, 2006 - 4:26 am

hoi :)


So why is everybody trying to munch all branch related data into
one .ini style config file?

why not simply use the mechanisms we use elsewhere and build something
like our remotes or the new HEAD file?

--=20
Martin Waitz
From: Johannes Schindelin
Date: Tuesday, May 9, 2006 - 4:34 am

Hi,


Because it is good to have one consistent tool to query/update what makes 
up the configuration. Of course, we really could go about it like M$ who 
invent a hundred an twenty three ways to do the same thing, all with their 
own set of bugs.

I admit that repo-config has not been as stable as it could have been. 
That was my fault, certainly. But with the help of the list, it has become 
more stable.

Now, if we decide upon a totally different config format, okay, that's 
what it takes. But please let's not have several different formats 
*again*.

Ciao,
Dscho

-

From: Johannes Schindelin
Date: Monday, May 8, 2006 - 4:40 pm

Hi,



I have to bisect git sometimes, just because I have some local changes. 
Older gits do barf with a fatal error when encountering a bad config.

Further, it is probably not a good idea to relax error-checking. It is too 
easy to overlook.

Ciao,
Dscho

-

From: sean
Date: Sunday, May 7, 2006 - 5:34 pm

On Sun, 7 May 2006 17:05:26 -0700 (PDT)

Having magic sections that prepend "branch." seems a bit suspect;
why not just be explicit:

  [branch.Origin]
      URL = ...
      fetch = master

Wouldn't it be reasonable for git to impose modest restrictions on
branch names; such as restricting them to [a-zA-Z0-9] characters?
Then we just have to make section names case sensitve within the
config file; keys could still be case insensitive.

Actually it would be nice if we were consistent.  If case matters to
git then the config file should be case sensitive.  If case doesn't
matter to git, then it should consider "Branch", "branch" and "BrAnCh"
as the same in all contexts (eg. git branch -b BrAnCh).  It seems
silly for us to say people are too dumb to handle case sensitivity
in a config file, but it's perfectly acceptable everywhere else.

Sean
-

From: Linus Torvalds
Date: Sunday, May 7, 2006 - 5:55 pm

Exactly because section (and key) names are normally not case sensitive.

Even the documentation actually talks about "core.fileMode" and "[imap] 
Folders". 

		Linus
-

From: Pavel Roskin
Date: Sunday, May 7, 2006 - 6:04 pm

Make it ["branch.Origin"]

No hardcoded "branch" prepending needed.  The case sensitive name is
still protected by quotes.  This extends trivially to ["user.Linus"] or
["path./src/git.c"] or whatever.

-- 
Regards,
Pavel Roskin

-

From: sean
Date: Sunday, May 7, 2006 - 6:11 pm

On Sun, 7 May 2006 17:55:25 -0700 (PDT)

Restore case sensitivity to config file parsing and the problem largely goes
away.  Or go the other route and remove case sensitivity from all the other 
bits (branches names etc..).

Sean
-

Previous thread: [patch] munmap-before-rename, cygwin need by Yakov Lerner on Sunday, May 7, 2006 - 12:58 pm. (5 messages)

Next thread: [PATCH] Release config lock if the regex is invalid by Pavel Roskin on Sunday, May 7, 2006 - 2:36 pm. (5 messages)