To my surprise, it's not that bad. The Debian testing-security team
uses a single 1.8 MB file (400 KB compressed) to keep vulnerability
data. Most changes to that file involve just a few lines. But even
in this extreme case, git doesn't compare too badly against Subversion
if you pack regularly (but not too often). Disk usage is actually
*below* Subversion FSFS even with --depth=10 (the default,
unfortunately a bit hard to override).
I plan to do another experiment for GCC, which contains marvels such
as:
35905 126056 1379093 gcc/ChangeLog-2005
12610 61215 417584 gcc/combine.c
But the outcome will likely be quite similar to the secure-testing
case: comparable disk space usage, not a difference of one or more
orders of magnitude.
But Subversion still has a significant advantage: I can get a
working copy without downloading the full history (several gigabytes
in GCC's case). Git also has the slight drawback that you shouldn't
pack too often, otherwise packing becomes less effective. You can
always run "git-repack -a -d", but it's rather expensive. This means
that you need to keep compressed fulltexts from a few dozen revisions
around between packs, but I don't think this is a huge burden. All in
all, the compressed fulltexts/packs model is a pretty good trade-off
between disk usage, end-user usability and code complexity.
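
To put a rough number on that burden, here is a back-of-the-envelope
sketch in Python. It assumes every unpacked revision of the file is
stored as one compressed fulltext of about 400 KB (the figure quoted
above for the secure-testing file); the function name and the sample
revision counts are only illustrative.

```python
def loose_fulltext_cost_kb(revisions, compressed_kb=400):
    """Approximate disk space (in KB) used by unpacked revisions of one file.

    Assumption: each revision is kept as a separate compressed fulltext
    of roughly `compressed_kb` KB until the next repack.
    """
    return revisions * compressed_kb


if __name__ == "__main__":
    # "A few dozen revisions" is taken here as 12, 24 and 48 (illustrative).
    for revisions in (12, 24, 48):
        cost_mb = loose_fulltext_cost_kb(revisions) / 1000
        print(f"{revisions} unpacked revisions: about {cost_mb:.1f} MB")
```

Under these assumptions the unpacked revisions cost some megabytes
between repacks, which is small next to a history of several
gigabytes.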
In your mbox case, you should simply try Maildir. The tree object
(which lists all files in the Maildir folder) will still be rather
large (about 40 to 50 bytes per message stored), though.
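
To see where the per-message cost in the tree object comes from, here
is a small Python sketch. It relies on the tree entry layout (file
mode in ASCII, a space, the file name, a NUL byte, then the 20-byte
binary SHA-1). The Maildir file name is a made-up example, and the
result is the raw size before zlib compression, so it need not match
the figure above.

```python
SHA1_LEN = 20  # size of a binary SHA-1 object name in bytes


def tree_entry_size(name, mode="100644"):
    """Raw size in bytes of one tree entry: "<mode> <name>\\0<sha1>"."""
    return len(mode) + 1 + len(name.encode("utf-8")) + 1 + SHA1_LEN


if __name__ == "__main__":
    # Hypothetical Maildir message file name, for illustration only.
    sample = "1162934567.12345_1.example:2,S"
    per_message = tree_entry_size(sample)
    print(f"one entry: {per_message} bytes (uncompressed)")
    print(f"10000 messages: {10000 * per_message} bytes (uncompressed)")
```

Every commit that adds or removes a message writes a new tree object
for the folder, so this cost is paid again per change until the
objects are packed.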