On Thu, 9 Aug 2007, Linus Torvalds wrote:Ouch. It's the diffing between HEAD and the index, and it's all from "add_index_entry()", which sorts the index array using an insertion sort. So when the index array gets large, that sort spends all its time in huge memmove() calls. The silly thing, of course, is that we don't even "need" to do that: both the index and the trees are really sorted already, so we could just interleave them. But since we read them separately, the thing just sucks. We've fixed other similar cases of this we had (diffing trees against each other) by walking the trees together, but the "index vs tree" diff (and merge) is the one remaining place where we still use the original stupid algorithm. So you'll see this performance problem for - diff tree against index ("git diff HEAD" - merge tree into index ("git read-tree -m HEAD") which both do the stupid index/tree filling. So this is all O(n**2), which is why we haven't reacted very much - it doesn't show up nearly as much with the kernel. Also, with a smaller set of files, it would tends to fit in the L2 cache of most competent CPU's. So not only is it n**2, you get the cache trashing behaviour too, and that, I think, is what really causes it to fall off the cliff edge! Gaah. This shouldn't be *that* hard to fix, but I'm not entirely sure I'll have time today. Diffing the index against the tree *should* be instantaneous. It should be no more costly than reading the tree itself (which is 0.191 seconds for me: test "git read-tree -m HEAD" vs "git read-tree HEAD") and reading the index (which is almost instantaneous - the only way I can test it is by doing something like "git update-index --refresh", and that's 0.131 seconds, but that includes all the 100,000 "lstat()" calls). So basically, we're spending several seconds just doing stupid make-believe work and moving the index array around. Ouch. Anyway, the good news is that this is by no means fundamental. It's a small and stupid detail. The only thing that makes it at all painful is that this is in some low-level crud that we haven't touched in *ages*, so I've long since swapped out all my recollection of how we do it. (We basically do: read_cache(); followed by unpack_trees(); and each of those *on*its*own* is pretty cheap, but when we unpack trees into an already populated index, the end result is ugly. Linus - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
| David Miller | Slow DOWN, please!!! |
| KAMEZAWA Hiroyuki | Re: 2.6.22-rc1-mm1 |
| Steven Rostedt | [RFC PATCH 1/3] Unified trace buffer |
| Steven Rostedt | [RFC PATCH 0/6] Convert all tasklets to workqueues |
git: | |
| Peter Klavins | Re: CRLF problems with Git on Win32 |
| J. Bruce Fields | Re: Git User's Survey 2007 unfinished summary continued |
| Linus Torvalds | Re: VCS comparison table |
| Junichi Uekawa | Re: [ANNOUNCE] GIT 1.5.4 |
| Arjan van de Ven | Re: [GIT]: Networking |
| Rémi | [PATCH 0/6] [RFC] Phonet pipes protocol (v2) |
| Jarek Poplawski | Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock(). |
| Jozsef Kadlecsik | Re: TCP connection stalls under 2.6.24.7 |
| Richard Stallman | Real men don't attack straw men |
| Rogier Krieger | Re: bcw(4) is gone |
| Leon Dippenaar | New tcp stack attack |
| Brandon Lee | DELL PERC 5iR slow performance |
| high memory | 6 hours ago | Linux kernel |
| semaphore access speed | 9 hours ago | Applications and Utilities |
| the kernel how to power off the machine | 10 hours ago | Linux kernel |
| Easter Eggs in windows XP | 13 hours ago | Windows |
| Shared swap partition | 13 hours ago | Linux general |
| Root password | 14 hours ago | Linux general |
| Where/when DNOTIFY is used? | 16 hours ago | Linux kernel |
| How to convert Linux Kernel built-in module into a loadable module | 18 hours ago | Linux kernel |
| Linux 2.6.24 and I/O schedulers | 18 hours ago | Linux kernel |
| USB Driver -- Interrupt Polling -- A Little Help Please | 1 day ago | Linux general |
