The upload-pack protocol states that for every flush packet
(aka "0000") sent by the client, we get exactly one NAK or
ACK line back. However, if multi_ack is enabled, we may
get multiple "ACK %s continue" lines before the ACK/NAK is
sent by the upload-pack server.
Unfortunately, JGit was counting an "ACK %s continue" as a response
to a flush packet, causing the client side to stop reading from
the server too early. This left a lot of "ACK %s continue" lines
in the server output buffer, eventually causing the server to stop
and wait for the output buffer to drain. However, the client would
also get stuck after sending too many "have %s" lines, eventually
deadlocking the entire client-server pair.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../jgit/transport/BasePackFetchConnection.java | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/org.spearce.jgit/src/org/spearce/jgit/transport/BasePackFetchConnection.java b/org.spearce.jgit/src/org/spearce/jgit/transport/BasePackFetchConnection.java
index f07cc4e..eaa94bd 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/transport/BasePackFetchConnection.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/transport/BasePackFetchConnection.java
@@ -359,15 +359,15 @@ private void negotiate(final ProgressMonitor monitor) throws IOException,
continue;
}
- while (resultsPending > 0) {
+ for (;;) {
final PacketLineIn.AckNackResult anr;
anr = pckIn.readACK(ackId);
- resultsPending--;
if (anr == PacketLineIn.AckNackResult.NAK) {
// More have lines are necessary to compute the
// pack on the remote side. Keep doing that.
//
+ resultsPending--;
break;
}
--
1.6.3.48.g99c76
--
This is the lowest buffer size we actually require to keep the
client and server sides from deadlocking against each other.
Also added documentation, and renamed the symbol to better match
its real purpose; naming the lower threshold we can allow for a
buffer.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
.../jgit/transport/BasePackFetchConnection.java | 10 +++++++++-
.../org/spearce/jgit/transport/TransportLocal.java | 2 +-
2 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/org.spearce.jgit/src/org/spearce/jgit/transport/BasePackFetchConnection.java b/org.spearce.jgit/src/org/spearce/jgit/transport/BasePackFetchConnection.java
index eaa94bd..1d1b801 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/transport/BasePackFetchConnection.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/transport/BasePackFetchConnection.java
@@ -89,7 +89,15 @@
*/
private static final int MAX_HAVES = 256;
- protected static final int MAX_CLIENT_BUFFER = MAX_HAVES * 46 + 1024;
+ /**
+ * Amount of data the client sends before starting to read.
+ * <p>
+ * Any output stream given to the client must be able to buffer this many
+ * bytes before the client will stop writing and start reading from the
+ * input stream. If the output stream blocks before this many bytes are in
+ * the send queue, the system will deadlock.
+ */
+ protected static final int MIN_CLIENT_BUFFER = 2 * 32 * 46 + 4;
static final String OPTION_INCLUDE_TAG = "include-tag";
diff --git a/org.spearce.jgit/src/org/spearce/jgit/transport/TransportLocal.java b/org.spearce.jgit/src/org/spearce/jgit/transport/TransportLocal.java
index cffdba1..428f73e 100644
--- a/org.spearce.jgit/src/org/spearce/jgit/transport/TransportLocal.java
+++ b/org.spearce.jgit/src/org/spearce/jgit/transport/TransportLocal.java
@@ -175,7 +175,7 @@ InternalLocalFetchConnection() throws TransportException {
// force the buffer to be big enough, otherwise it
// will deadlock both threads.
...Is this about the fetch-pack protocol where
(1) upload-pack shows what it has; fetch-pack keeps reading until it sees
a flush; then
(2) fetch-pack shows what it wants; upload-pack keeps reading; then
(3) fetch-pack sends a bunch of have's, followed by a flush; upload-pack
keeps reading and then responds with an ACK-continue or NAK, which
fetch-pack reads; this step continues zero or more times; and then
finally
(4) fetch-pack sends a bunch of have's, followed by a flush; upload-pack
keeps reading and then responds with an ACK, fetch-pack says done.
Where do you need "enough buffer"? The conversation looks very much "it's
my turn to talk", "now it's your turn to talk and I'll wait until I hear
--
In step 3 during the first round the client can send up to 2 blocks worth of data, with 32 haves per block. This means the client writes 2952 bytes of data before it reads. C Git doesn't run into this sort of problem because a normal pipe would get 1 page (~4096 bytes) in the kernel for the FIFO buffer. In SSH transport, we still have 4096 between us and the ssh client process, plus that has its own buffering. In TCP transport, we have the kernel TX buffer on this side, and the kernel RX buffer on the remote side, plus network switch buffers in the middle. 2952 bytes nicely fits into just over 2 IP packets, and the TCP window is sufficiently large enough to allow these to be sent without blocking the writer. We need to be able to shove 2952 bytes down at the other guy before we start listening to him. The upload-pack side of the system can (at worst) send us 64 "ACK %s continue" lines. We must be able to enter into the receive mode before the upload-pack side fills their outgoing buffer. In the Sun JVMs a pure in-memory pipe only has room for 1024 bytes in the FIFO before it blocks. Though the technique I am using to boost the FIFO from 1024 to 2952 bytes isn't necessarily going to be portable to other JVMs. If both sides only have 1024 bytes of buffer available and both sides can possibly write more than that, And this should be + 8 here. F@!*! Robin, can you amend? It should be + 8 because we send to end() packets in that initial burst, each packet is 4 bytes in size. -- Shawn. --
Sorry, perhaps I am being extremely slow, but even if the client writes millions of bytes before it starts reading, I do not see how it would be a problem as long as the other side reads these millions of bytes before saying "Ok, I've heard about them and my response so far is Ack-continue (or NAK)", which the client needs to read. --
Ok, maybe my understanding of the fetch-pack/upload-pack protocol is incorrect. If multi_ack is enabled then isn't it possible for the remote to return "ACK %s continue" for the first 63 "have %s" lines the client sent? E.g. the case is the client has only one ref, and has only 1 commit the other side doesn't have, and the other side has only one ref, and has only 1 commit the client doesn't have (so the client will fetch 1 commit). In such a case, the client will blast 64 have lines before pausing to listen to the server. But the server will have 63 of those lines, and will try to send "ACK %s continue" in vain at the client, hoping it will stop enumerating along that branch. If there is insufficient buffering along one of those writers, the entire thing deadlocks. -- Shawn. --
