Well, in the commit objects case you're likely to have a bunch of them
all contiguous.
For tree and blob objects it is less likely.
And of course there is the question of deltas for which you might or
might not have the base object locally already.
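
To make the delta question concrete, here is a rough sketch of how a
client could peek at the start of a fetched pack entry and tell whether
it is a delta against a base it does not have. It assumes the usual
pack entry header layout (type in bits 4-6 of the first byte, size as a
little-endian variable-length integer) and a REF_DELTA entry that
carries the 20-byte base name right after the header. The function
names and the `have` set are made up for illustration, and
offset-based deltas are not handled.

```python
from typing import Optional, Set, Tuple

REF_DELTA = 7  # pack entry type for a delta whose base is named by SHA-1


def parse_entry_header(data: bytes) -> Tuple[int, int, int]:
    """Return (type_id, inflated_size, header_length) for a pack entry."""
    byte = data[0]
    type_id = (byte >> 4) & 0x7   # bits 4-6 hold the object type
    size = byte & 0x0F            # low 4 bits start the size
    shift = 4
    pos = 1
    while byte & 0x80:            # MSB set means another size byte follows
        byte = data[pos]
        size |= (byte & 0x7F) << shift
        shift += 7
        pos += 1
    return type_id, size, pos


def missing_delta_base(data: bytes, have: Set[str]) -> Optional[str]:
    """Return the base name if this entry is a REF_DELTA whose base is
    not in `have` (a hypothetical set of locally available names)."""
    type_id, _, pos = parse_entry_header(data)
    if type_id != REF_DELTA:
        return None
    base = data[pos:pos + 20].hex()
    return None if base in have else base
```

Any name this returns would have to be added to the list of objects to
fetch before the delta can be resolved.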
Still... I wonder if this could actually be workable. A typical daily
update on the Linux kernel repository might consist of a couple hundred
to a few thousand objects. It could still be faster to fetch parts of
a pack than the whole pack if the size difference is above a certain
threshold. It is certainly not worse than fetching loose objects.
Things would be pretty horrid if you think of fetching a commit object,
parsing it to find out what tree object to fetch, then parsing that tree
object to find out what other objects to fetch, and so on.
But if you only take the approach of fetching the pack index files,
finding out about the objects that the remote has that are not available
locally, and then fetching all those objects from within pack files
without even looking at them (except for deltas), then it should be
possible to issue a couple of requests in parallel and possibly get
decent performance. And if it turns out that more than, say, 70% of a
particular pack is to be fetched (you can determine that up front), then
you could simply fetch the whole pack instead.
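
Roughly, the planning step could look like the sketch below. It assumes
the pack index has already been parsed into a mapping from object name
to pack offset, and that an object's extent is simply the gap up to the
next offset (or up to the 20-byte trailing checksum for the last one).
`plan_fetch`, `range_header` and the 0.7 default are illustrative names
and values only, and delta bases are ignored here.

```python
from typing import Dict, List, Set, Tuple

PACK_TRAILER_LEN = 20  # a pack ends with a 20-byte SHA-1 checksum

Span = Tuple[int, int]  # (start, end) byte offsets, end exclusive


def object_spans(index: Dict[str, int], pack_size: int) -> Dict[str, Span]:
    """Derive each object's byte span from the sorted pack offsets."""
    ordered = sorted(index.items(), key=lambda item: item[1])
    spans = {}
    for i, (name, start) in enumerate(ordered):
        if i + 1 < len(ordered):
            end = ordered[i + 1][1]
        else:
            end = pack_size - PACK_TRAILER_LEN
        spans[name] = (start, end)
    return spans


def plan_fetch(index: Dict[str, int], pack_size: int, have: Set[str],
               whole_pack_ratio: float = 0.7) -> Tuple[str, List[Span]]:
    """Decide between ranged requests and fetching the whole pack."""
    spans = object_spans(index, pack_size)
    wanted = sorted(spans[name] for name in index if name not in have)
    ranges: List[Span] = []
    for start, end in wanted:
        if ranges and ranges[-1][1] == start:
            # Contiguous with the previous span: extend it.
            ranges[-1] = (ranges[-1][0], end)
        else:
            ranges.append((start, end))
    needed = sum(end - start for start, end in ranges)
    if needed > whole_pack_ratio * pack_size:
        return "whole", [(0, pack_size)]
    return "partial", ranges


def range_header(start: int, end: int) -> str:
    """HTTP Range header value; HTTP byte ranges are inclusive."""
    return f"bytes={start}-{end - 1}"
```

Each range from a "partial" plan can then be requested independently,
which is what makes issuing them in parallel straightforward.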
There is no way to sensibly keep those objects packed on the receiving
end of course, but storing them as loose objects and repacking them
afterwards should be just fine.
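
For the storing part, a minimal sketch is shown below. It assumes each
fetched object has already been inflated (and any delta resolved) to
its full content, and writes it in the loose object format: the zlib
deflated "<type> <size>\0<content>" under objects/xx/yyyy..., named by
the SHA-1 of that same data. `write_loose` and `objects_dir` are
illustrative names.

```python
import hashlib
import os
import zlib
from typing import Tuple


def loose_object(obj_type: str, content: bytes) -> Tuple[str, bytes]:
    """Return (object name, deflated bytes) for a loose object."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\0"
    store = header + content
    name = hashlib.sha1(store).hexdigest()
    return name, zlib.compress(store)


def write_loose(objects_dir: str, obj_type: str, content: bytes) -> str:
    """Write one object under objects_dir unless it already exists."""
    name, compressed = loose_object(obj_type, content)
    path = os.path.join(objects_dir, name[:2], name[2:])
    if not os.path.exists(path):
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as out:
            out.write(compressed)
    return name
```

Running a normal repack afterwards would then fold these loose objects
back into a pack.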
Of course you'll get objects from branches in the remote repository you
might not be interested in, but that's a price to pay for such a hack.
On average the overhead shouldn't be that big anyway if branches within
a repository are somewhat related.
I think this is something worth experimenting with.
Nicolas