Re: People unaware of the importance of "git gc"?

To: Nicolas Pitre <nico@...>
Cc: Nix <nix@...>, Steven Grimm <koreth@...>, Linus Torvalds <torvalds@...>, Git Mailing List <git@...>
Date: Wednesday, September 5, 2007 - 4:01 pm

Nicolas Pitre <nico@cam.org> writes:


Ok, how about doing something like this?

-- >8 -- snipsnap -- >8 -- clipcrap -- >8 --
Implement git gc --auto

This implements a new option "git gc --auto".  When gc.auto is
set to a positive value, and the object database has accumulated
roughly that many loose objects, this runs a lightweight
version of "git gc".  The primary difference from the full
"git gc" is that it does not pass the "-a" option to "git
repack", which means we do not try to repack _everything_, but
only repack incrementally.  We still do "git prune-packed".  The
default threshold is arbitrarily set by yours truly to:

 - not trigger it for fully unpacked git v0.99 history;

 - do trigger it for fully unpacked git v1.0.0 history;

 - not trigger it for incremental update to git v1.0.0 starting
   from fully packed git v0.99 history.

This patch does not itself add any invocation of the "auto
repacking".  It is left to key Porcelain commands that could
produce tons of loose objects to add a call to "git gc --auto"
after they have done their work.  Obvious candidates are:

	git add
	git fetch
	git merge
	git rebase

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
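
A minimal standalone sketch of the estimate that need_to_gc() makes,
for illustration only and not part of the patch.  The helper names
per_dir_limit() and looks_like_loose_object() are made up here; the
arithmetic and the filename check are copied from the diff below.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: turn a gc.auto-style threshold for the whole
 * object database into a limit for a single one of the 256
 * subdirectories, rounding up as the patch does.
 */
static int per_dir_limit(int threshold)
{
	return (threshold + 255) / 256;
}

/*
 * Hypothetical helper: a loose object file is named by the remaining
 * 38 hex digits of its SHA-1 (the first two name the directory), so
 * accept exactly 38 lowercase hex characters and nothing else.
 */
static int looks_like_loose_object(const char *name)
{
	return strspn(name, "0123456789abcdef") == 38 && name[38] == '\0';
}
```

With the default of 6700 the per-directory limit comes out as 27, so
the check would fire once a 28th loose object is seen in the sampled
directory.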

 builtin-gc.c |   64 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 63 insertions(+), 1 deletions(-)

diff --git a/builtin-gc.c b/builtin-gc.c
index 9397482..093b3dd 100644
--- a/builtin-gc.c
+++ b/builtin-gc.c
@@ -20,6 +20,7 @@ static const char builtin_gc_usage[] = "git-gc [--prune] [--aggressive]";
 
 static int pack_refs = 1;
 static int aggressive_window = -1;
+static int gc_auto_threshold = 6700;
 
 #define MAX_ADD 10
 static const char *argv_pack_refs[] = {"pack-refs", "--all", "--prune", NULL};
@@ -28,6 +29,8 @@ static const char *argv_repack[MAX_ADD] = {"repack", "-a", "-d", "-l", NULL};
 static const char *argv_prune[] = {"prune", NULL};
 static const char *argv_rerere[] = {"rerere", "gc", NULL};
 
+static const char *argv_repack_auto[] = {"repack", "-d", "-l", NULL};
+
 static int gc_config(const char *var, const char *value)
 {
 	if (!strcmp(var, "gc.packrefs")) {
@@ -41,6 +44,10 @@ static int gc_config(const char *var, const char *value)
 		aggressive_window = git_config_int(var, value);
 		return 0;
 	}
+	if (!strcmp(var, "gc.auto")) {
+		gc_auto_threshold = git_config_int(var, value);
+		return 0;
+	}
 	return git_default_config(var, value);
 }
 
@@ -57,10 +64,49 @@ static void append_option(const char **cmd, const char *opt, int max_length)
 	cmd[i] = NULL;
 }
 
+static int need_to_gc(void)
+{
+	/*
+	 * Quickly check if a "gc" is needed, by estimating how
+	 * many loose objects there are.  Because SHA-1 is evenly
+	 * distributed, we can check only one of the 256 object
+	 * subdirectories and get a reasonable estimate.
+	 */
+	char path[PATH_MAX];
+	const char *objdir = get_object_directory();
+	DIR *dir;
+	struct dirent *ent;
+	int auto_threshold;
+	int num_loose = 0;
+	int needed = 0;
+
+	if (sizeof(path) <= snprintf(path, sizeof(path), "%s/17", objdir)) {
+		warning("insanely long object directory %.*s", 50, objdir);
+		return 0;
+	}
+	dir = opendir(path);
+	if (!dir)
+		return 0;
+
+	auto_threshold = (gc_auto_threshold + 255) / 256;
+	while ((ent = readdir(dir)) != NULL) {
+		if (strspn(ent->d_name, "0123456789abcdef") != 38 ||
+		    ent->d_name[38] != '\0')
+			continue;
+		if (++num_loose > auto_threshold) {
+			needed = 1;
+			break;
+		}
+	}
+	closedir(dir);
+	return needed;
+}
+
 int cmd_gc(int argc, const char **argv, const char *prefix)
 {
 	int i;
 	int prune = 0;
+	int auto_gc = 0;
 	char buf[80];
 
 	git_config(gc_config);
@@ -82,12 +128,28 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 			}
 			continue;
 		}
-		/* perhaps other parameters later... */
+		if (!strcmp(arg, "--auto")) {
+			if (gc_auto_threshold <= 0)
+				return 0;
+			auto_gc = 1;
+			continue;
+		}
 		break;
 	}
 	if (i != argc)
 		usage(builtin_gc_usage);
 
+	if (auto_gc) {
+		/*
+		 * Auto-gc should be as unintrusive as possible.
+		 */
+		prune = 0;
+		for (i = 0; i < ARRAY_SIZE(argv_repack_auto); i++)
+			argv_repack[i] = argv_repack_auto[i];
+		if (!need_to_gc())
+			return 0;
+	}
+
 	if (pack_refs && run_command_v_opt(argv_pack_refs, RUN_GIT_CMD))
 		return error(FAILED_RUN, argv_pack_refs[0]);
 