Re: [PATCH] Add ERR support to smart HTTP

Previous thread: [ANNOUNCE] gitolite now supports smart http by Sitaram Chamarty on Sunday, September 5, 2010 - 8:37 am. (2 messages)

Next thread: [PATCH v3 0/3] Add support for SMTP server options by Pascal Obry on Sunday, September 5, 2010 - 10:48 am. (10 messages)
From: Ilari Liusvaara
Date: Sunday, September 5, 2010 - 10:30 am

All "true smart transports" support ERR packets, allowing server
to send back error message explaining reasons for refusing the
request instead of just rudely closing connection without any error.

However, since smart HTTP isn't "true smart transport", but instead
dumb one from git main executable perspective, smart HTTP needs to
implement its own version of this.

Now that Gitolite supports HTTP too, it needs to be able to send
error messages for authorization failures back to client so that's
one probable user for this feature.

The error is sent as '<packetlength># ERR <message>" and must be the
first packet in response. The reason for putting the '#' there is that
old git versions will interpret that as invalid server response and
print (at least the first line of) the error together with complaint
of invalid response (mangling it a bit but it will still be understandable,
in manner similar to existing smart transport ERR messages).

Thus for example server response:

"0031# ERR W access for foo/alice/a1 DENIED to bob"

Will cause the following to be printed:

"fatal: remote error: W access for foo/alice/a1 DENIED to bob"

If the git version is old and doesn't support this feature, then the
message will be:

"fatal: invalid server response; got '# ERR W access for foo/alice/a1
DENIED to bob'"

Which is at least undertandable.

Signed-off-by: Ilari Liusvaara <ilari.liusvaara@elisanet.fi>
---
 remote-curl.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/remote-curl.c b/remote-curl.c
index 24fbb9a..46fa971 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -153,6 +153,8 @@ static struct discovery* discover_refs(const char *service)
 
 		if (packet_get_line(&buffer, &last->buf, &last->len) <= 0)
 			die("%s has invalid packet header", refs_url);
+		if (buffer.len >= 6 && !strncmp(buffer.buf, "# ERR ", 6))
+			die("remote error: %s", buffer.buf + 6);
 		if (buffer.len && buffer.buf[buffer.len - 1] == '\n')
 			strbuf_setlen(&buffer, ...
From: Jonathan Nieder
Date: Sunday, September 5, 2010 - 10:41 am

Yippee!  Thanks, Ilari.

For this specific error, why can't gitolite use an HTTP response code?
Should http-backend be using ERR is some places, too, a la [1]?

Jonathan
who would like to find time to write a test case for "git daemon" any
day now

[1] http://thread.gmane.org/gmane.comp.version-control.git/145456/focus=145573
--

From: Ilari Liusvaara
Date: Sunday, September 5, 2010 - 11:49 am

I wanted this error to be sent in manner that causes old clients to print
the error, even if sightly mangled (the ERR of "true smart transports" does
also have this property).

AFAIK, HTTP errors don't have descriptions printed.
 
-Ilari
--

From: Ævar Arnfjörð Bjarmason
Date: Sunday, September 5, 2010 - 12:27 pm

On Sun, Sep 5, 2010 at 18:49, Ilari Liusvaara

I don't know if this applies here but HTTP error codes can come with
any free-form \n-delimited string:

    HTTP/1.1 402 You Must Build Additional Pylons
--

From: Ilari Liusvaara
Date: Sunday, September 5, 2010 - 2:21 pm

Yes, they can, but remote-curl doesn't print those error explanations
(just tried).

-Ilari
--

From: Jakub Narebski
Date: Sunday, September 5, 2010 - 2:22 pm

And you can also send more detailed description in the *body* (and not
only HTTP headers) of HTTP response, though I don't know if git does
that.

-- 
Jakub Narebski
Poland
ShadeHawk on #git
--

From: Sitaram Chamarty
Date: Sunday, September 5, 2010 - 6:04 pm

I'm going to try the patch that Ilari sent when I get to work but to
answer this sub-thread about HTTP status codes and messages, none of
that gets printed by the curl code, as Ilari pointed out.  Here's a
transcript:

Notice the 403 on this one... I do send that back:

06:30:37 sitaram@sita-lt:http-test $ git clone `genurl alice foo/sitaram/try1`
Cloning into try1...
error: The requested URL returned error: 403 while accessing
http://alice:alice@127.0.0.1/git/foo/sitaram/try1/info/refs

fatal: HTTP request failed

You can see the actual message cleanly here:

06:30:46 sitaram@sita-lt:http-test $ curl
http://alice:alice@127.0.0.1/git/foo/sitaram/try1/info/refs
ERR R access for foo/sitaram/try1 DENIED to alice


And here you can see the text part of the HTTP/1.1 NNN status line:

06:31:04 sitaram@sita-lt:http-test $ curl -v
http://alice:alice@127.0.0.1/git/foo/sitaram/try1/info/refs
* About to connect() to 127.0.0.1 port 80 (#0)
*   Trying 127.0.0.1... connected
* Connected to 127.0.0.1 (127.0.0.1) port 80 (#0)
< HTTP/1.1 403 error - gitolite
< Date: Mon, 06 Sep 2010 01:02:23 GMT
< Server: Apache/2.2.16 (Fedora)
< Expires: Fri, 01 Jan 1980 00:00:00 GMT
< Pragma: no-cache
< Cache-Control: no-cache, max-age=0, must-revalidate
< Connection: close
< Transfer-Encoding: chunked
< Content-Type: text/plain; charset=UTF-8
<
ERR R access for foo/sitaram/try1 DENIED to alice
* Closing connection #0
--

From: Sitaram Chamarty
Date: Sunday, September 5, 2010 - 10:45 pm

turns out all this was moot.  It was *because* I was using something
other than "200 OK" that the user was not seeing the message.  Ilari's
patch just makes the message *look* better/cleaner, but I still have
to send it out with a "200 OK" status.

That was... a surprise :-)

Thanks all

sitaram
--

From: Ævar Arnfjörð Bjarmason
Date: Monday, September 6, 2010 - 1:45 am

You can still send it out with a "200 <anything you want here>" if you
want to give a warning/error even on 200.
--

From: Jakub Narebski
Date: Monday, September 6, 2010 - 1:49 am

From what I remember from smart HTTP discussion (during fleshing-out
the protocol/exchange details), the fact that errors from git are send
with "200 OK" HTTP status are very much conscious decision.  But I don't
remember *why* it was chosen this way.  If I remember correctly it was
something about transparent proxies and caches...  Is it documented
anywhere?  Can anyone explain it?

Nevertheless I think it would be a good idea to make *client* more
accepting, which means:
1. Printing full HTTP status, and not only HTTP return / error code;
   perhaps only if it is non-standard, and perhaps only in --verbose
   mode.
2. If message body contains ERR line, print error message even if the
   HTTP status was other than "200 OK".  To be "generous in what you
   receive" (well, kind of).
3. In verbose mode, if body of HTTP error message (not "HTTP OK")
   exists and does not contain ERR line (e.g. an error from web server),
   print it in full (perhaps indented).

I think that neither of the above would lead to leaking sensitive 
information.

What do you think?
-- 
Jakub Narebski
Poland
--

From: Joshua Juran
Date: Monday, September 6, 2010 - 2:15 am

I wasn't involved in the decision process, but I suspect it's because  
HTTP is the transport layer to the Git application.  It's the same  
logic as trying to log in to a Web application with bogus credentials  
and getting back a page (HTTP 200 OK) stating that the login failed.   
As far as HTTP is concerned, the transaction succeeded.

Josh


--

From: Shawn O. Pearce
Date: Monday, September 6, 2010 - 7:56 am

Exactly correct.

FWIW, I meant for the standard git:// ERR type error to be used
here under smart-HTTP.  I'm not sure why we need Ilari's original
patch at all.

That is, the following will trigger a correct error on the client:

  200 OK
  Content-Type: application/x-git-upload-pack-advertisement

  001e# service=git-upload-pack
  0022ERR You shall not do this

Likewise if you wanted to do this with receive-pack, replace upload
with receive above and adjust the pkt-line lengths.

The initial # service= packet is as much part of the "transport
layer" as the HTTP 200 OK response is.  Its the server saying "Yup,
I understood your request correctly.  Now here is your error."

Translation is, gitolite (or GitHub, or ...) should be sending back
two pkt-lines under smart HTTP, not one.

-- 
Shawn.
--

From: Sitaram Chamarty
Date: Monday, September 6, 2010 - 10:59 am

are those counts accurate for the specific example you show or just made up?

It seems the first line has a count in hex that includes the newline
at the end, and the second one has a count in decimal that does not

ok... what about all the other service commands?  like /info/refs?
What should I put there?

Sorry if I'm being stupid but I couldn't find this info anywhere (my C
grokking isn't as good as it used to be anyway).  I've tried all sorts
of combinations of sending out two such lines -- variations on length,
\r, \n, \r\n, neither, etc etc but I can't get the correct output.

Also, experimenting with making the update hook die similarly and
wireshark-ing the responde does not show similar pattern coming
through.

If you could point me to some place that says the precise format,
including \r\n, I'd greatly appreciate it.

Thanks,

Sitaram
--

From: Shawn O. Pearce
Date: Monday, September 6, 2010 - 11:19 am

Feh.  I can't count.  The first count is correct.  The second count
should also be 001e.  I guess that should be obvious by just looking

The only other command that matters is info/refs.

For smart clients, its what I said above.

For dumb clients, you have to use some sort of HTTP error status
that isn't 404.  Dumb clients pre-1.6.6 use a curl error message
buffer to print out an error.  But they don't check the format of
info/refs at all, and skip over garbage and/or interpret garbage
as valid input.  So we can't use a hack like "ERR blah" to even
trigger a parsing failure.

-- 
Shawn.
--

From: Sitaram Chamarty
Date: Wednesday, September 8, 2010 - 7:36 am

Summary of offline discussion with Shawn, so that others can find it if needed:

The first packet (after the HTTP headers of course) should be

XXXX# service=git-upload-pack\n

(or the same with upload replaced by receive).  These are the service
names passed in the service query parameter (/info/refs?service=...).

The XXXX is a hex length of the whole thing.  For these two specific
cases, they will be 1E and 1F.

This should be followed by "0000" (with no \n at the end).  This is a
special packet that means "this sequence of messages is done".

After this you can send any error messages, as follows:

XXXXERR your message\n

where again the XXXX is a hex count of the whole string (including 4
for the count itself, 4 for "ERR ", and a newline if you add it).

-- 
Sitaram
--

From: Sitaram Chamarty
Date: Monday, September 6, 2010 - 7:24 am

I didn't understand this bit about leaking info.  If the bits are
coming into my machine I know what they are anyway (or am able to find
out easily enough, even if git itself isn't showing them to me).
Where's the leak?

And I do see the point that Joshua made that the 200 reflects HTTP
status, not git status.  Makes sense, and answers my original
question...

regards

sitaram
--

From: Jakub Narebski
Date: Monday, September 6, 2010 - 9:31 am

I meant here that programs (including git) do not provide full details
about error condition, especially if it has to do womething with 
authentication, to avoid leaking sensitive information (like e.g. 
saying that username + password combination is invalid, instead of
telling which one is wrong, to avoid disclosing usernames).

-- 
Jakub Narebski
Poland
--

From: Jonathan Nieder
Date: Sunday, September 5, 2010 - 1:11 pm

Thanks for the explanation.  Makes sense.

 $ git clone http://example.com/nonsense.git
 fatal: http://example.com/nonsense.git/info/refs not found: did you run git update-server-info on the server?
--

Previous thread: [ANNOUNCE] gitolite now supports smart http by Sitaram Chamarty on Sunday, September 5, 2010 - 8:37 am. (2 messages)

Next thread: [PATCH v3 0/3] Add support for SMTP server options by Pascal Obry on Sunday, September 5, 2010 - 10:48 am. (10 messages)