just add the hostname in the path when using --base-path and --user-path. this should be enough for most needs. Signed-off-by: Pierre Habouzit <madcoder@debian.org> --- Here is a proposal for daemon side virtualhosting support. This is quite simple, and just uses the hostname in the path where it looks for the git repository. Only works in conjuction of --base-path or --user-path btw. Documentation/git-daemon.txt | 12 +++++++ daemon.c | 73 +++++++++++++++++++++++++++++++++++------- 2 files changed, 73 insertions(+), 12 deletions(-) diff --git a/Documentation/git-daemon.txt b/Documentation/git-daemon.txt index 0f7d274..ab5b232 100644 --- a/Documentation/git-daemon.txt +++ b/Documentation/git-daemon.txt @@ -79,6 +79,18 @@ OPTIONS taken as a request to access `path/foo` repository in the home directory of user `alice`. +--use-vhosts:: + Use git daemon cheap virtual hosting scheme. ++ +Used with --base-path=/foo, it will search the repositories in /foo/HOSTNAME +instead of /foo - if you try to pull git://git.example.com/hello.git, it will +search into /foo/git.example.com/hello.git. ++ +When using --user-path the HOSTNAME is appended to the path as well, so for +--user-path=some/path, request to git://git.example.com/~alice/foo is taken as +a request to access some/path/git.example.com/foo in the home directory of +user `alice`. + --verbose:: Log details about the incoming connections and requested files. diff --git a/daemon.c b/daemon.c index 012936f..146482c 100644 --- a/daemon.c +++ b/daemon.c @@ -17,7 +17,7 @@ static int reuseaddr; static const char daemon_usage[] = "git-daemon [--verbose] [--syslog] [--inetd | --port=n] [--export-all]\n" -" [--timeout=n] [--init-timeout=n] [--strict-paths]\n" +" [--timeout=n] [--init-timeout=n] [--use-vhosts] [--strict-paths]\n" " [--base-path=path] [--user-path | --user-path=path]\n" " [--reuseaddr] [--detach] [--pid-file=file] ...
This part is objectionable -- older clients do not give "host=". I think the plan, when virtual hosting was proposed and we added this to the client side first, was to treat older clients as if they specified the "primary" host. So we would need some mechanism to say where the repositories of the "primary" host I am not sure if the interaction between user-path and vhost should go like this, but I do not think of a good alternative to suggest right now. Your code allows ~user/host1 and ~user/host2 to host different set of repositories, but I suspect if somebody is setting up a virtual hosting of two hosts, he might want to have two distinct set of users (iow to pretend that ~user exist only on host1 but not on host2). I dunno. -
fair enough, so I suppose there is many choices:
* replace --use-vhosts with --use-vhosts=3D${default hostname}
but I do not like it very much
* use `hostname -f` as the default hostname (not very nice either)
* use the magic _default_ (=E0 la apache) hostname ?
But just note that if a git repository is accessible on an host that is=20
not the default, and only on that one, an older client won't be able to=20
clone the repository anyway, because he will be redirected to the=20
default virtual host. Meaning, that someone that uses virtual hosts=20
will break older clients anyway.
what would be nicer would be a way to gracefully reject older clients in=20
that case with an understandable error message, so that the user knows=20
why it does not work. I don't know PROTO_GIT very well yet,so I don't=20
Another option would be not to support virtual hosts, but instead=20
superseed the --base-path and --user-path with some --base-path-fmt=20
and --user-path-fmt where the user can specify how to build the path=20
with simple sprintf-like formats. One could e.g. support:
* %% obviously ;
* %h that will be replaced with the hostname
* %u (only for --user-path-fmt)
* %p (asked path)
* ...
I think that's more clever, and allow more flexible use of the virtual=20
hosting code. It e.g. allow to use the virtual host scheme for the=20
`base-path` repos and to disallow it for the users.
=2D-*-path and --*-path-fmt are obviously mutually exclusive.
What do you think ?
=2D-=20
=B7O=B7 Pierre Habouzit
=B7=B7O madcoder@debian.org
OOO http://www.madism.org
And this is exactly what I have implemented and I kinda like it... :-) jdl -
Heh. Well, I rewrote the interpolation code. As you pointed out earlier, it was crap. I can send that in; no problem. There was some outstanding debate if this was actually the right way to go about things. Specifically, the problem of canonical hostname. A proposal was to use peer IP addresses instead, and I just haven't gotten around to messing with that yet. So that is why I hadn't sent it in yet. Sorry. But, I'll rebase it and send it in this evening. jdl -
Allow a form of virtualhosting, when %h format is used.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
---
This is intended to be a more flexible solution, that also gives virtual
hosting as a bonus. I still see no way to deal with older clients when
virtual hosting is used by the admin though, having a "default" hostname
won't solve anything at all anyway.
Documentation/git-daemon.txt | 42 ++++++++++
daemon.c | 170 +++++++++++++++++++++++++++++++++++++++---
2 files changed, 201 insertions(+), 11 deletions(-)
diff --git a/Documentation/git-daemon.txt b/Documentation/git-daemon.txt
index 0f7d274..0aa34d4 100644
--- a/Documentation/git-daemon.txt
+++ b/Documentation/git-daemon.txt
@@ -11,6 +11,7 @@ SYNOPSIS
'git-daemon' [--verbose] [--syslog] [--inetd | --port=n] [--export-all]
[--timeout=n] [--init-timeout=n] [--strict-paths]
[--base-path=path] [--user-path | --user-path=path]
+ [--base-path-fmt=pathfmt] [--user-path-fmt=pathfmt]
[--reuseaddr] [--detach] [--pid-file=file] [directory...]
DESCRIPTION
@@ -45,6 +46,10 @@ OPTIONS
'git://example.com/hello.git', `git-daemon` will interpret the path
as '/srv/git/hello.git'.
+--base-path-fmt=pathfmt::
+ Works like --base-path, but uses a format instead. See Path
+ Formats section below.
+
--export-all::
Allow pulling from all directories that look like GIT repositories
(have the 'objects' and 'refs' subdirectories), even if they
@@ -79,6 +84,11 @@ OPTIONS
taken as a request to access `path/foo` repository in
the home directory of user `alice`.
+--user-path-fmt=pathfmt::
+ Allow ~user notation to be used in requests. The path used to look
+ for the git repository is a format string, see Path Formats
+ section below.
+
--verbose::
Log details about the incoming connections and requested files.
@@ -98,6 +108,38 @@ OPTIONS
--strict-paths is specified this will also include subdirectories
of each ...I mildly disagree about the last sentence. Enabling virtual hosting does not have to mean all virtual hosts are treated equal. It is conceivable that a site hosts the primary, "collection of public repositories everybody would want to go to" set, with supplemental ones for specific audiences that are done via virtual hosting. General public who would want to access the primary one can come with older clients that way, and only the narrower audiences have to be told to upgrade. The client-side host= support was done post 1.4.0-rc1 timeframe so we have to be nicer to 1.3 based people, at least give them a way to slurp newer version with their client ;-). Haven't looked at the rest of the patch yet. Will comment later. -
hmm, yes, that's indeed fair. Well, adding a --default-hostname that is=20 used if no host=3D is passed is completely obvious and straightforward=20 and solves that issue. I also spotted a bug in the git_path_fmt function, I test 'pos' at the=20 end of the loop, but the overflow test is not done when there is no=20 format involved, wich could /theorically/ lead to buffer overflow. So=20 please disregard that issue, I will at least provide a patch that fixes=20 np, TIA =2D-=20 =B7O=B7 Pierre Habouzit =B7=B7O madcoder@debian.org OOO http://www.madism.org
Nicely done, almost. Having to have the distinction between %p and %P formats feels I prefer these to be of type "static int". Although I am not an authority of variable naming, these sound funny to me. "is_XXX()" as a function name feels natural, "is_XXX" as a variable name does not --- it is not clear what the predicate is talking about. Maybe "use_fmt_for_base_path" is easier to understand? I dunno. Or "user_path_is_fmt"? That's more logical but still somewhat feels funny. -
agreed. There is also a second patch that never made it to the list that fixes: * some indentation problems due to a bad vimrc * --default-hostname switch (to handle virtual hosts even with older clients) * possible overflow in the formatting method. I'll recompute a new patch that superseeds that one, and merge your=20 comments and my never sent patch too. =2D-=20 =B7O=B7 Pierre Habouzit =B7=B7O madcoder@debian.org OOO http://www.madism.org
I have to admit that I kinda liked JDL's simpler one first (and
it has been in production use for some time). We'll see.
About vger potentially throwing things away, I use this script
(called "taboo.perl") to check my messages before sending them
out.
Obviously the taboo-word list itself is not attached here, but
the actual script should have a copy of it after the __DATA__
marker.
-- >8 --
#!/usr/bin/perl -w
my $tmpl = ' if (%%PATTERN%%) {
print "$lineno ${_}matches %%QPATTERN%%\n";
return;
}
';
my $stmt = "";
my $in_header = 1;
while (<DATA>) {
if (/^\$global_taboo_body =/) {
$in_header = 0;
}
next if (/^\043/ || /^\$/ || /^END$/ || /^\s*$/);
chomp;
my $p = $_;
if ($in_header) {
$p = '/^[-\w_]*:/ && ' . $p;
}
my $q = quotemeta($p);
my $stmt1 = $tmpl;
$stmt1 =~ s|%%PATTERN%%|$p|g;
$stmt1 =~ s|%%QPATTERN%%|$q|g;
$stmt .= $stmt1;
}
close DATA;
$stmt = 'sub check {
my ($line, $lineno) = @_;
' . $stmt . '
}
';
eval $stmt;
while (<>) {
check($_, $.);
}
my $how_to_update_this_script = <<'EOF' ;
( sed -e '/^__DATA__$$/q' taboo.perl && \
wget -q -O - http://vger.kernel.org/majordomo-taboos.txt ) \
>taboo.perl+
if diff -u taboo.perl taboo.perl+; \
then \
rm -f taboo.perl+; \
echo >&2 No changes.; \
else \
mv taboo.perl+ taboo.perl; \
chmod +x taboo.perl; \
fi
EOF
__DATA__
-
that was not it, I was biten (again) by git-send-mail that uses strftime=20 (localized) to generate rfc822 dates, making them unparseable :| I've resent the aggregated patch, that should work right now. =2D-=20 =B7O=B7 Pierre Habouzit =B7=B7O madcoder@debian.org OOO http://www.madism.org
>>>>> "Junio" == Junio C Hamano <junkio@cox.net> writes:
Junio> About vger potentially throwing things away, I use this script
Junio> (called "taboo.perl") to check my messages before sending them
Junio> out.
Junio> Obviously the taboo-word list itself is not attached here, but
Junio> the actual script should have a copy of it after the __DATA__
Junio> marker.
With "Inline::Files" from the CPAN, you could have a switch where the
script updates itself with the new list:
use Inline::Files;
if (@ARGV == 1 and $ARGV[0] eq "-update") {
use LWP::Simple;
my $list = get "http://example.com/foo/bar.txt";
open DATA, ">$DATA" or die "cannot write myself: $!";
print DATA $list;
close DATA;
exit 0;
}
... rest of your program here ...
... read using <DATA> as before ...
__DATA__
the list will magically go here.
--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
-
