Btw, final comment on this issue:
I was initially a bit worried about optimizing for just the "git log" with
pathspec or "git blame" kind of behaviour, and possibly pessimizing some
other load.
But given the way the caching works, this is likely to be faster (or at least
not slower) even for something that doesn't ever need the cache (which in
turn is likely to be because it's a smaller footprint query and only works
on one version).
Because of the way the cache works, it doesn't really do any extra work: it
basically just delays the "free()" on the buffer we allocated. So for
really small footprints it just avoids the overhead of free() (let the OS
reap the pages for it at exit), and for bigger footprints (that end up
replacing the cache entries) it will just do the same work a bit later.
Because it's a simple direct-mapped cache, the only cost is the (trivial)
hash of a few instructions, and possibly the slightly bigger D$ footprint.
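To make the "delayed free()" point concrete, here is a minimal sketch of a direct-mapped cache of that kind. It is not the actual git code: the names (cache_entry, slot_of, cache_lookup, cache_add), the 256-slot size, the hash, and the choice to key on a pack pointer plus offset are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_SLOTS 256	/* hypothetical size, not taken from git */

struct cache_entry {
	const void *pack;	/* hypothetical key: which pack */
	uint64_t offset;	/* hypothetical key: offset of the base object */
	void *data;		/* unpacked buffer, owned by the cache */
	size_t size;
};

static struct cache_entry cache[CACHE_SLOTS];

/* The "trivial hash of a few instructions": mix the key, pick one slot. */
static unsigned slot_of(const void *pack, uint64_t offset)
{
	uint64_t h = (uint64_t)(uintptr_t)pack + offset;

	h += (h >> 8) + (h >> 16);
	return (unsigned)(h % CACHE_SLOTS);
}

/* Return the cached buffer on a hit, NULL on a miss. The cache keeps ownership. */
static void *cache_lookup(const void *pack, uint64_t offset, size_t *size)
{
	struct cache_entry *e = &cache[slot_of(pack, offset)];

	if (e->data && e->pack == pack && e->offset == offset) {
		*size = e->size;
		return e->data;
	}
	return NULL;
}

/*
 * Instead of free()ing a buffer when we are done with it, hand it to the
 * cache. The only free() that happens is for whatever was sitting in the
 * same slot before, so the work is the same as before, just done later.
 */
static void cache_add(const void *pack, uint64_t offset, void *data, size_t size)
{
	struct cache_entry *e = &cache[slot_of(pack, offset)];

	free(e->data);		/* free(NULL) is a no-op for an empty slot */
	e->pack = pack;
	e->offset = offset;
	e->data = data;
	e->size = size;
}
```

In this sketch a small query never evicts anything, so it never calls free() at all and the buffers go away at exit; a big query pays one free() per replaced slot, which is what it would have paid anyway.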
I would strongly suspect that even on loads where it doesn't help by
reusing the cached objects, the delayed free'ing on its own is as likely
to help as it is to hurt.
So there really shouldn't be any downsides.
Testing on some other loads (for example, drivers/scsi/ has more activity
than drivers/usb/), the 2x performance win seems to happen for other
things too. For drivers/scsi, generating the log went down from 3.582s
(best) to 1.448s.
"git blame Makefile" went from 1.802s to 1.243s (both best-case numbers
again: a smaller win, but still a win), but there the issue seems to be
that with a file like that, we actually spend most of our time comparing
different versions.
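For reference, the ratios behind those timings can be checked with a trivial helper. The function name speedup() is made up for this illustration; the inputs are just the best-case times quoted above.

```c
#include <assert.h>

/* Speedup factor: how many times faster the new run is than the old one. */
static double speedup(double before_seconds, double after_seconds)
{
	return before_seconds / after_seconds;
}
```

With the numbers above, speedup(3.582, 1.448) comes out a bit under 2.5 for the drivers/scsi log, and speedup(1.802, 1.243) comes out at roughly 1.45 for "git blame Makefile".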
For the "git blame Makefile" case *all* of zlib combined is just 18%,
while the ostensibly trivial "cmp_suspect()" is 23% and another 11% is
from "assign_blame()" - so for top-level entries the costs would seem to
tend to be in the blame algorithm itself, rather than in the actual object
handling.
(I'm sure that could be improved too, but the take-home message from this
is that zlib wasn't really the problem, and our stupid re-generation of
the same delta base was.)
Linus