Yes, indeed. We can also have another heuristic: if we find a delta, and
we haven't seen the object it deltas against, we can still keep it as a
delta IF WE ALSO DON'T ALREADY HAVE THE BASE OBJECT. Because then we know
that the base object has to be there later in the pack (or we have a
dangling delta, which we'll just consider an error).
So yeah, maybe my patch-series is something we can still save.
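
A rough sketch of that heuristic, in Python purely for illustration. The
names plan_entries and have_object are made up here and are not from the
patch series. It also pretends each entry's object name is known as soon as
the entry is read, which a real pack stream does not give you for free:

```python
def plan_entries(entries, have_object):
    """Decide what to do with each pack entry, in pack order.

    entries: list of (obj_id, base_id) pairs; base_id is None for a full object.
    have_object: hypothetical callable, True if our object database has the id.
    """
    seen = set()       # objects already encountered in this pack
    awaiting = set()   # bases we assume will show up later in the pack
    plan = []
    for obj_id, base_id in entries:
        if base_id is None:
            action = "full"
        elif base_id in seen:
            action = "keep-delta"      # base came earlier in the pack
        elif not have_object(base_id):
            action = "keep-delta"      # base must come later, or it's dangling
            awaiting.add(base_id)
        else:
            action = "expand"          # base exists only outside the pack
        seen.add(obj_id)
        plan.append((obj_id, action))
    dangling = awaiting - seen
    if dangling:
        raise ValueError("dangling delta base(s): %s" % sorted(dangling))
    return plan
```

The only case that forces any work is the last branch: the base is in our
object database but not in the pack, so the delta cannot stay as it is.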
However, the thing that makes me suspect that it is _not_ saveable, is
this:
- let's assume we have a nice thin pack, with objects A, B, C and D (in
that order), which is actually a good pack in itself (ie it _might_ be
thin, but it's actually self-sufficient)
- let A be a full object, and B be packed as a delta off A, C as a delta
off B, and D as a delta off C.
- Try to repack it as a streaming thing (the end result _should_
obviously be exactly the same as the input, since it turns out to be
self-sufficient)
Looks trivial, no?
The answer is: no. It's not trivial. Or rather, it _is_ trivial, but you
have to _remember_ all of the actual data for A, B, C and D all the way to
the end, because only if you have that data in memory can you actually
_recreate_ B, C and D even enough to get their SHA1's (which you need
just in order to know that the pack is complete, much less to be able to
create a non-delta version in case it hadn't been).
So we can definitely do the one-pass creation, but it requires that we
keep track of everything we've expanded so far in memory (because we won't
have the data available any other way - we don't have them as objects in
our object database, and we don't have a good new pack yet).
But if you do that, then yes, it's salvageable.
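
To show why everything has to stay in memory, here is a minimal sketch of
the A, B, C, D case in Python. The delta format is a toy stand-in (keep a
prefix of the base, append a tail), not git's real delta encoding, and
stream_pack and apply_toy_delta are hypothetical names. The object name is
computed the way git names a blob: SHA1 over a "blob <size>\0" header plus
the data.

```python
import hashlib

def object_id(data, kind=b"blob"):
    # SHA1 of "<type> <size>\0" followed by the object contents.
    header = kind + b" " + str(len(data)).encode() + b"\0"
    return hashlib.sha1(header + data).hexdigest()

def apply_toy_delta(base, delta):
    # Toy stand-in for a delta: keep the first `keep` bytes, append `tail`.
    keep, tail = delta
    return base[:keep] + tail

def stream_pack(entries):
    """entries: list of (base_id, payload) in pack order.

    base_id is None for a full object (payload is the data); otherwise
    payload is a toy delta against the object named base_id.
    """
    expanded = {}   # object id -> full data, kept until the end of the pack
    ids = []
    for base_id, payload in entries:
        if base_id is None:
            data = payload
        else:
            # Only possible because the base's data is still in memory;
            # raises KeyError if the base has not been seen yet.
            data = apply_toy_delta(expanded[base_id], payload)
        oid = object_id(data)   # the name is unknown until the data exists
        expanded[oid] = data
        ids.append(oid)
    return ids, expanded
```

Note that the name of each delta'd object only exists after expanding it,
and the next delta in the chain needs that expanded data as its base, so
nothing in `expanded` can be dropped early without knowing the chain ends.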
No, you've not missed anything. I didn't really expect anybody to want to
seriously play with it, so I didn't bother to do things properly.
Especially since I hadn't even written very good commit messages.
Anyway, I just pushed the "rewrite-pack" branch to my git repo on
kernel.org, so once it mirrors out, if you really want to try to fix up
the mess I left behind, there it is:
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/git.git rewrite-pack
Maybe it's recoverable.
Linus