Linus Torvalds <torvalds@linux-foundation.org> writes:

Yes, I agree with that in principle. Storing computable values makes
sense only when it is expensive to recompute. We did not have
cache-tree for quite a long time until you noticed that it was rather
expensive and wasteful to recompute tree objects from unchanged parts
of the index every time. It's the same argument; when the hashing
performance starts to become noticeable, we can think about storing
and reusing it, not before.

Yes, ls-files is cheap. So is lstat(2) on Linux. It only matters when
you do it many many times.

In any case, the change does not look too bad. The best time (real) of
running git-ls-files in the kernel repository on my box is 0.010s vs
0.011s (10% improvement, heh!, which is the same as the master version)
and empty commit is both 0.082s (no change).

-- >8 --
[PATCH] lazy index hashing

This delays the hashing of index names until it becomes necessary for
the first time.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 cache.h      |    1 +
 read-cache.c |   26 +++++++++++++++++++++++---
 2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/cache.h b/cache.h
index 409738c..e4aeff0 100644
--- a/cache.h
+++ b/cache.h
@@ -191,6 +191,7 @@ struct index_state {
 	struct cache_tree *cache_tree;
 	time_t timestamp;
 	void *alloc;
+	unsigned name_hash_initialized : 1;
 	struct hash_table name_hash;
 };
 
diff --git a/read-cache.c b/read-cache.c
index 9477c0b..e45f4b3 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -34,12 +34,11 @@ static unsigned int hash_name(const char *name, int namelen)
 	return hash;
 }
 
-static void set_index_entry(struct index_state *istate, int nr, struct cache_entry *ce)
+static void hash_index_entry(struct index_state *istate, struct cache_entry *ce)
 {
 	void **pos;
 	unsigned int hash = hash_name(ce->name, ce_namelen(ce));
 
-	istate->cache[nr] = ce;
 	pos = insert_hash(hash, ce, &istate->name_hash);
 	if (pos) {
 		ce->next = *pos;
@@ -47,6 +46,24 @@ static void set_index_entry(struct index_state *istate, int nr, struct cache_ent
 	}
 }
 
+static void lazy_init_name_hash(struct index_state *istate)
+{
+	int nr;
+
+	if (istate->name_hash_initialized)
+		return;
+	for (nr = 0; nr < istate->cache_nr; nr++)
+		hash_index_entry(istate, istate->cache[nr]);
+	istate->name_hash_initialized = 1;
+}
+
+static void set_index_entry(struct index_state *istate, int nr, struct cache_entry *ce)
+{
+	istate->cache[nr] = ce;
+	if (istate->name_hash_initialized)
+		hash_index_entry(istate, ce);
+}
+
 /*
  * We don't actually *remove* it, we can just mark it invalid so that
  * we won't find it in lookups.
@@ -75,7 +92,10 @@ static void replace_index_entry(struct index_state *istate, int nr, struct cache
 int index_name_exists(struct index_state *istate, const char *name, int namelen)
 {
 	unsigned int hash = hash_name(name, namelen);
-	struct cache_entry *ce = lookup_hash(hash, &istate->name_hash);
+	struct cache_entry *ce;
+
+	lazy_init_name_hash(istate);
+	ce = lookup_hash(hash, &istate->name_hash);
 
 	while (ce) {
 		if (!(ce->ce_flags & CE_UNHASHED)) {
-- 
1.5.4.rc4.14.g6fc74
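For readers who want to try the lazy-hashing idea outside of git, here is a minimal, self-contained sketch of the same pattern. It is an illustration only, not git's code: `toy_index`, `toy_entry`, `toy_hash_name` and the other `toy_*` names are hypothetical, the hash function is an arbitrary stand-in, and the `hash_inserts` counter exists only to make the laziness observable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ENTRIES 64
#define TOY_BUCKETS 32

/* Hypothetical, simplified stand-in for git's cache_entry. */
struct toy_entry {
	const char *name;
	struct toy_entry *next;	/* next entry in the same hash bucket */
};

/* Hypothetical, simplified stand-in for git's index_state. */
struct toy_index {
	struct toy_entry entries[TOY_MAX_ENTRIES];
	int nr;
	unsigned name_hash_initialized : 1;
	struct toy_entry *buckets[TOY_BUCKETS];
	int hash_inserts;	/* instrumentation: entries hashed so far */
};

/* Arbitrary string hash for the sketch; not the function git uses. */
unsigned int toy_hash_name(const char *name)
{
	unsigned int hash = 0x123;

	while (*name)
		hash = hash * 101 + (unsigned char)*name++;
	return hash;
}

/* Link one entry into the head of its bucket chain. */
void toy_hash_entry(struct toy_index *idx, struct toy_entry *e)
{
	unsigned int b = toy_hash_name(e->name) % TOY_BUCKETS;

	e->next = idx->buckets[b];
	idx->buckets[b] = e;
	idx->hash_inserts++;
}

/* Build the name hash on first use; later calls return immediately. */
void toy_lazy_init_name_hash(struct toy_index *idx)
{
	int i;

	if (idx->name_hash_initialized)
		return;
	for (i = 0; i < idx->nr; i++)
		toy_hash_entry(idx, &idx->entries[i]);
	idx->name_hash_initialized = 1;
}

/* Append an entry; hash it right away only if the table already exists. */
void toy_add_entry(struct toy_index *idx, const char *name)
{
	struct toy_entry *e;

	assert(idx->nr < TOY_MAX_ENTRIES);
	e = &idx->entries[idx->nr++];
	e->name = name;
	e->next = NULL;
	if (idx->name_hash_initialized)
		toy_hash_entry(idx, e);
}

/* Lookup is the only operation that forces the hash to be built. */
int toy_name_exists(struct toy_index *idx, const char *name)
{
	struct toy_entry *e;

	toy_lazy_init_name_hash(idx);
	for (e = idx->buckets[toy_hash_name(name) % TOY_BUCKETS]; e; e = e->next)
		if (!strcmp(e->name, name))
			return 1;
	return 0;
}
```

With a zero-initialized `struct toy_index`, adding entries leaves `hash_inserts` at 0. The first `toy_name_exists()` call hashes everything added so far, and entries added after that are hashed as they arrive. A program that never looks up a name therefore never pays for hashing.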
