Changes since v2 --- patch */9 --- - Changed the order of some patches. - Some updates to the commit log messages. --- patch 2/9 --- - New patch which let us later use longest_match_lstat_cache(), now renamed to longest_path_match(), inside patch 4/9. --- patch 3/9 --- - New patch which update the symlinks.c file to be more in line with the GIT source code (s/length,string/string,length/ for function arguments). -- patch 4/9 --- - The new function schedule_dir_for_removal() is placed inside symlinks.c instead of unpack-trees.c -- patch 9/9 --- - NOTE/NB: this patch is only a debug patch, not be included in the final GIT release version. Kjetil Barvik (9): lstat_cache(): small cleanup and optimisation lstat_cache(): generalise longest_match_lstat_cache() lstat_cache(): swap func(length, string) into func(string, length) unlink_entry(): introduce schedule_dir_for_removal() create_directories(): remove some memcpy() and strchr() calls write_entry(): cleanup of some duplicated code write_entry(): use fstat() instead of lstat() when file is open show_patch_diff(): remove a call to fstat() lstat_cache(): print a warning if doing ping-pong between cache types Documentation/CodingGuidelines | 3 + builtin-add.c | 2 +- builtin-apply.c | 2 +- builtin-update-index.c | 2 +- cache.h | 10 ++- combine-diff.c | 4 +- diff-lib.c | 2 +- dir.c | 2 +- entry.c | 108 ++++++++++++------------- symlinks.c | 177 ++++++++++++++++++++++++++++++---------- unpack-trees.c | 34 ++------ 11 files changed, 208 insertions(+), 138 deletions(-) --
Simplify the if-else test in longest_match_lstat_cache() such that we
only have one simple if test. Instead of testing for 'i == cache.len'
or 'i == len', we transform this to a common test for 'i == max_len'.
And to further optimise we use 'i >= max_len' instead of 'i ==
max_len', the reason is that it is now the exact opposite of one part
inside the while-loop termination expression 'i < max_len && name[i]
== cache.path[i]', and then the compiler can probably reuse a test
instruction from it.
We also throw away the arguments to reset_lstat_cache(), such that all
the safeguard logic inside lstat_cache() is handled at one place.
Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
---
symlinks.c | 44 ++++++++++++++++++++++++--------------------
1 files changed, 24 insertions(+), 20 deletions(-)
diff --git a/symlinks.c b/symlinks.c
index f262b7c..ae57e56 100644
--- a/symlinks.c
+++ b/symlinks.c
@@ -25,27 +25,30 @@ static inline int longest_match_lstat_cache(int len, const char *name,
}
i++;
}
- /* Is the cached path string a substring of 'name'? */
- if (i == cache.len && cache.len < len && name[cache.len] == '/') {
- match_len_prev = match_len;
- match_len = cache.len;
- /* Is 'name' a substring of the cached path string? */
- } else if ((i == len && len < cache.len && cache.path[len] == '/') ||
- (i == len && len == cache.len)) {
+ /*
+ * Is the cached path string a substring of 'name', is 'name'
+ * a substring of the cached path string, or is 'name' and the
+ * cached path string the exact same string?
+ */
+ if (i >= max_len && ((len > cache.len && name[cache.len] == '/') ||
+ (len < cache.len && cache.path[len] == '/') ||
+ (len == cache.len))) {
match_len_prev = match_len;
- match_len = len;
+ match_len = i;
}
*previous_slash = match_len_prev;
return match_len;
}
-static inline void reset_lstat_cache(int track_flags, int prefix_len_stat_func)
+static inline void reset_lstat_cache(void)
{
...Swap function argument pair (length, string) into (string, length) to
conform with the commonly used order inside the GIT source code.
Also, add a note about this fact into the coding guidelines.
Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
---
Documentation/CodingGuidelines | 3 +++
builtin-add.c | 2 +-
builtin-apply.c | 2 +-
builtin-update-index.c | 2 +-
cache.h | 8 ++++----
diff-lib.c | 2 +-
dir.c | 2 +-
entry.c | 2 +-
symlinks.c | 16 ++++++++--------
unpack-trees.c | 4 ++--
10 files changed, 23 insertions(+), 20 deletions(-)
diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index 0d7fa9c..b8bf618 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -129,3 +129,6 @@ For C programs:
used in the git core command set (unless your command is clearly
separate from it, such as an importer to convert random-scm-X
repositories to git).
+
+ - When we pass <string, length> pair to functions, we should try to
+ pass them in that order.
diff --git a/builtin-add.c b/builtin-add.c
index ac98c83..a23ad96 100644
--- a/builtin-add.c
+++ b/builtin-add.c
@@ -148,7 +148,7 @@ static const char **validate_pathspec(int argc, const char **argv, const char *p
if (pathspec) {
const char **p;
for (p = pathspec; *p; p++) {
- if (has_symlink_leading_path(strlen(*p), *p)) {
+ if (has_symlink_leading_path(*p, strlen(*p))) {
int len = prefix ? strlen(prefix) : 0;
die("'%s' is beyond a symbolic link", *p + len);
}
diff --git a/builtin-apply.c b/builtin-apply.c
index 1e7f917..a1de3cb 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -2360,7 +2360,7 @@ static int check_to_create_blob(const char *new_name, int ok_if_exists)
* In such a case, path "new_name" does not exist as
...Currently inside show_patch_diff() we have an fstat() call after an
ok lstat() call. Since before the call to fstat() we have already
tested for the link case with S_ISLNK(), the fstat() can be removed.
Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
---
combine-diff.c | 4 +---
1 files changed, 1 insertions(+), 3 deletions(-)
diff --git a/combine-diff.c b/combine-diff.c
index bccc018..4300319 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -713,9 +713,7 @@ static void show_patch_diff(struct combine_diff_path *elem, int num_parent,
result_size = buf.len;
result = strbuf_detach(&buf, NULL);
elem->mode = canon_mode(st.st_mode);
- }
- else if (0 <= (fd = open(elem->path, O_RDONLY)) &&
- !fstat(fd, &st)) {
+ } else if (0 <= (fd = open(elem->path, O_RDONLY))) {
size_t len = xsize_t(st.st_size);
ssize_t done;
int is_file, i;
--
1.6.1.349.g99fa5
--
Rename the function to longst_path_match() and generalise it such that
it can also be used by other functions.
Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
---
symlinks.c | 46 ++++++++++++++++++++++++----------------------
1 files changed, 24 insertions(+), 22 deletions(-)
diff --git a/symlinks.c b/symlinks.c
index ae57e56..4596aee 100644
--- a/symlinks.c
+++ b/symlinks.c
@@ -1,38 +1,30 @@
#include "cache.h"
-static struct cache_def {
- char path[PATH_MAX + 1];
- int len;
- int flags;
- int track_flags;
- int prefix_len_stat_func;
-} cache;
-
/*
* Returns the length (on a path component basis) of the longest
- * common prefix match of 'name' and the cached path string.
+ * common prefix match of 'name_a' and 'name_b'.
*/
-static inline int longest_match_lstat_cache(int len, const char *name,
- int *previous_slash)
+static int longest_path_match(const char *name_a, int len_a,
+ const char *name_b, int len_b,
+ int *previous_slash)
{
int max_len, match_len = 0, match_len_prev = 0, i = 0;
- max_len = len < cache.len ? len : cache.len;
- while (i < max_len && name[i] == cache.path[i]) {
- if (name[i] == '/') {
+ max_len = len_a < len_b ? len_a : len_b;
+ while (i < max_len && name_a[i] == name_b[i]) {
+ if (name_a[i] == '/') {
match_len_prev = match_len;
match_len = i;
}
i++;
}
/*
- * Is the cached path string a substring of 'name', is 'name'
- * a substring of the cached path string, or is 'name' and the
- * cached path string the exact same string?
+ * Is 'name_b' a substring of 'name_a', the other way around,
+ * or is 'name_a' and 'name_b' the exact same string?
*/
- if (i >= max_len && ((len > cache.len && name[cache.len] == '/') ||
- (len < cache.len && cache.path[len] == '/') ||
- (len == cache.len))) {
+ if (i >= max_len && ((len_a > len_b && name_a[len_b] == '/') ||
+ (len_a < len_b && name_b[len_a] == '/') ||
+ (len_a == len_b))) {
...Remove the call to memcpy() and strchr() for each path component
tested, and instead add each path component as we go forward inside
the while-loop.
Impact: small optimisation
Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
---
entry.c | 23 ++++++++++++++---------
1 files changed, 14 insertions(+), 9 deletions(-)
diff --git a/entry.c b/entry.c
index bb6bdb9..cc8f0c6 100644
--- a/entry.c
+++ b/entry.c
@@ -2,15 +2,19 @@
#include "blob.h"
#include "dir.h"
-static void create_directories(const char *path, const struct checkout *state)
+static void create_directories(const char *path, int path_len,
+ const struct checkout *state)
{
- int len = strlen(path);
- char *buf = xmalloc(len + 1);
- const char *slash = path;
-
- while ((slash = strchr(slash+1, '/')) != NULL) {
- len = slash - path;
- memcpy(buf, path, len);
+ char *buf = xmalloc(path_len + 1);
+ int len = 0;
+
+ while (len < path_len) {
+ do {
+ buf[len] = path[len];
+ len++;
+ } while (len < path_len && path[len] != '/');
+ if (len >= path_len)
+ break;
buf[len] = 0;
/*
@@ -190,6 +194,7 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state, char *t
memcpy(path, state->base_dir, len);
strcpy(path + len, ce->name);
+ len += ce_namelen(ce);
if (!lstat(path, &st)) {
unsigned changed = ce_match_stat(ce, &st, CE_MATCH_IGNORE_VALID);
@@ -218,6 +223,6 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state, char *t
return error("unable to unlink old '%s' (%s)", path, strerror(errno));
} else if (state->not_new)
return 0;
- create_directories(path, state);
+ create_directories(path, len, state);
return write_entry(ce, path, state, 0);
}
--
1.6.1.349.g99fa5
--
Currently inside unlink_entry() if we get a successful removal of one
file with unlink(), we try to remove the leading directories each and
every time. So if one directory containing 200 files is moved to an
other location we get 199 failed calls to rmdir() and 1 successful
call.
To fix this and avoid some unnecessary calls to rmdir(), we schedule
each directory for removal and wait much longer before we do the real
call to rmdir().
Since the unlink_entry() function is called with alphabetically sorted
names, this new function end up being very effective to avoid
unnecessary calls to rmdir(). In some cases over 95% of all calls to
rmdir() is removed with this patch.
Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
---
cache.h | 2 +
symlinks.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
unpack-trees.c | 30 +++++----------------------
3 files changed, 68 insertions(+), 24 deletions(-)
diff --git a/cache.h b/cache.h
index 30039ac..717dd84 100644
--- a/cache.h
+++ b/cache.h
@@ -726,6 +726,8 @@ extern int has_symlink_or_noent_leading_path(const char *name, int len);
extern int has_dirs_only_path(const char *name, int len, int prefix_len);
extern void invalidate_lstat_cache(const char *name, int len);
extern void clear_lstat_cache(void);
+extern void schedule_dir_for_removal(const char *name, int len);
+extern void remove_scheduled_dirs(void);
extern struct alternate_object_database {
struct alternate_object_database *next;
diff --git a/symlinks.c b/symlinks.c
index 5167286..215d049 100644
--- a/symlinks.c
+++ b/symlinks.c
@@ -245,3 +245,63 @@ int has_dirs_only_path(const char *name, int len, int prefix_len)
FL_DIR|FL_FULLPATH, prefix_len) &
FL_DIR;
}
+
+static struct removal_def {
+ char path[PATH_MAX];
+ int len;
+} removal;
+
+static void do_remove_scheduled_dirs(int new_len)
+{
+ while (removal.len > new_len) {
+ removal.path[removal.len] = '\0';
+ if (rmdir(removal.path))
+ break;
+ do ...The switch-cases for S_IFREG and S_IFLNK was so similar that it will
be better to do some cleanup and use the common parts of it.
And the entry.c file should now be clean for 'gcc -Wextra' warnings.
Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
---
entry.c | 75 +++++++++++++++++++++++++-------------------------------------
1 files changed, 30 insertions(+), 45 deletions(-)
diff --git a/entry.c b/entry.c
index cc8f0c6..1f53588 100644
--- a/entry.c
+++ b/entry.c
@@ -78,7 +78,7 @@ static int create_file(const char *path, unsigned int mode)
return open(path, O_WRONLY | O_CREAT | O_EXCL, mode);
}
-static void *read_blob_entry(struct cache_entry *ce, const char *path, unsigned long *size)
+static void *read_blob_entry(struct cache_entry *ce, unsigned long *size)
{
enum object_type type;
void *new = read_sha1_file(ce->sha1, &type, size);
@@ -93,36 +93,51 @@ static void *read_blob_entry(struct cache_entry *ce, const char *path, unsigned
static int write_entry(struct cache_entry *ce, char *path, const struct checkout *state, int to_tempfile)
{
- int fd;
- long wrote;
-
- switch (ce->ce_mode & S_IFMT) {
- char *new;
- struct strbuf buf;
- unsigned long size;
-
+ unsigned int ce_mode_s_ifmt = ce->ce_mode & S_IFMT;
+ int fd, ret;
+ char *new;
+ struct strbuf buf = STRBUF_INIT;
+ unsigned long size;
+ size_t wrote, newsize = 0;
+
+ switch (ce_mode_s_ifmt) {
case S_IFREG:
- new = read_blob_entry(ce, path, &size);
+ case S_IFLNK:
+ new = read_blob_entry(ce, &size);
if (!new)
return error("git checkout-index: unable to read sha1 file of %s (%s)",
path, sha1_to_hex(ce->sha1));
+ if (ce_mode_s_ifmt == S_IFLNK && has_symlinks && !to_tempfile) {
+ ret = symlink(new, path);
+ free(new);
+ if (ret)
+ return error("git checkout-index: unable to create symlink %s (%s)",
+ path, strerror(errno));
+ break;
+ }
+
/*
* Convert from git internal format to working tree format
...Currently inside write_entry() we do an lstat(path, &st) call on a
file which have just been opened inside the exact same function. It
should be better to call fstat(fd, &st) on the file while it is open,
and it should be at least as fast as the lstat() method.
Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
---
entry.c | 12 +++++++++---
1 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/entry.c b/entry.c
index 1f53588..5daacc2 100644
--- a/entry.c
+++ b/entry.c
@@ -94,11 +94,12 @@ static void *read_blob_entry(struct cache_entry *ce, unsigned long *size)
static int write_entry(struct cache_entry *ce, char *path, const struct checkout *state, int to_tempfile)
{
unsigned int ce_mode_s_ifmt = ce->ce_mode & S_IFMT;
- int fd, ret;
+ int fd, ret, fstat_done = 0;
char *new;
struct strbuf buf = STRBUF_INIT;
unsigned long size;
size_t wrote, newsize = 0;
+ struct stat st;
switch (ce_mode_s_ifmt) {
case S_IFREG:
@@ -145,6 +146,11 @@ static int write_entry(struct cache_entry *ce, char *path, const struct checkout
}
wrote = write_in_full(fd, new, size);
+ /* use fstat() only when path == ce->name */
+ if (state->refresh_cache && !to_tempfile && !state->base_dir_len) {
+ fstat(fd, &st);
+ fstat_done = 1;
+ }
close(fd);
free(new);
if (wrote != size)
@@ -161,8 +167,8 @@ static int write_entry(struct cache_entry *ce, char *path, const struct checkout
}
if (state->refresh_cache) {
- struct stat st;
- lstat(ce->name, &st);
+ if (!fstat_done)
+ lstat(ce->name, &st);
fill_stat_cache_info(ce, &st);
}
return 0;
--
1.6.1.349.g99fa5
--
I've a bad gut feeling about this: It may not work as expected on Windows because there is this statement in the documentation: "The only guarantee about a file timestamp is that the file time is correctly reflected when the handle that makes the change is closed." (http://msdn.microsoft.com/en-us/library/ms724290(VS.85).aspx) We are operating on a temporary file. It could happen that the fstat() returns the time when the file was created, as opposed to when the write_in_full() was completed successfully. The fstat()ed time ends up in the index, but it can be different from what later lstat() calls report (and the file would be regarded as modified). I have the suspicion that the gain from this patch is minimal. Would you mind playing it safe and drop this patch? -- Hannes --
Hmm, write_entry() is actually called once per one path we write out, and the fstat() is added to the common case (no --tempfile, no --prefix=<dir> checkout), which would mean that if there were any performance gain from this change, it was obtained by trading correctness away. Sad. --
Yes, I had to make sure that the path string and ce->name was the exact same string, so therefore I had to add the test "&& !to_tempfile Sorry about this. Yes, I agree that we should drop this patch. And, yes, since each lstat() call cost approximately 44 microseconds compared to 12-16 for each lstat() on my Linux box, there was a little performance gain from this patch. Junio, is it OK to ask that you drop this patch if/when you update the pu branch, such that I do not have to resend the patch series almost unchanged to the mailinglist (except for one missing patch)? Ok, maybe wait one day, just in case there will be more comments. And, thanks for the review! -- kjetil --
Surely. Although I've replaced the topic with the updated series, I am not in the reintegration phase yet, so not much work is lost. I was planning to merge this to 'next' today after one last round of eyeballing. Thanks. --
This does look like a good gain. But do you have hard numbers that back the claim? If you can squeeze out a 10% improvement on Linux with this patch, we should take it, and if it turns out that it doesn't work on Windows, we could trivially throw in an #ifdef MINGW (or even #ifdef WIN32 if Cygwin is affected, too) that skips the fstat() optimization. But we really should have numbers that justify this effort. BTW, the statement from the Windows documentation was: "The only guarantee about a file timestamp is that the file time is correctly reflected when the handle that makes the change is closed." In the case of this patch, the timestamp is queried via the handle that made the change, and in this case special case the timestamp could be correct nevertheless. The guarantee doesn't cover this case, but it would be natural, and perhaps it Just Works? -- Hannes --
Hi, Of course, what we _really_ would do is to provide a flag, say, FSTAT_UNRELIABLE and test for _that_ (after defining it in the Makefile for the appropriate platforms). Ciao, Dscho --
* On Thu, 5 Feb 2009, Johannes Sixt wrote: | |> Kjetil Barvik schrieb: |> > And, yes, since each lstat() call cost approximately 44 microseconds |> > compared to 12-16 for each lstat() on my Linux box, there was a little |> ^^^^^^^ fstat()? |> > performance gain from this patch. |> |> This does look like a good gain. But do you have hard numbers that back |> the claim? __________________ Test description I have used the following 8 git binaries: $ for g in ./git_t*; do printf "$g => $($g --version)\n"; done ./git_t0 => git version 1.6.1.80.g7eb5 ./git_t1 => git version 1.6.1.85.gbda6 <= added lstat_cache() ./git_t2 => git version 1.6.1.2.306.gc0f6f ./git_t3 => git version 1.6.1.2.307.g55385 ./git_t4 => git version 1.6.1.2.308.g052a ./git_t5 => git version 1.6.1.2.310.g40dd2 <= added schedule_dir_for_removal() ./git_t6 => git version 1.6.1.2.313.g9deee ./git_t7 => git version 1.6.1.2.314.g0ad9 <= added this patch (7/9) Except for git_t7 all commits should be in origin/pu, so people should be able to do git show/diff/log on the last hex chars on the version- strings to see the differences. CFLAGS="-mtune=core2 -O2 -fomit-frame-pointer -fno-stack-protector -g0 -s" Each git_t* binary is run with args "checkout -q my-v.2.6.2[57]" for a total of 100 times (50/50 to my-v2.6.25/my-v2.6.27). Before each run the test script sleeps 20 seconds to allow the disk to finish and being idle. Time is collected by /usr/bin/time -o output-file prog. While the test script was run, the laptop with an Intel Core2 Duo 2 GHz processor, was run without X and was otherwise idle. The test script took 9 hours and 45 minutes to complete. __________________ Test numbers $ for ((t=0; $t<=7; t++)); do echo "git_t$t => $(parse_usr_bin_time_lines.pl git_t$t*)"; done git_t0 => 5.646 user 8.322 sys 25.579 real 54.6% CPU faults: 0 major 39587 minor git_t1 => 5.631 user 6.826 sys 23.941 real 52.1% CPU ...
Hi, No, I think that would be wrong. Especially after Junio's remarks that fstat() is actually required to behave as you expected it, and only You seemed to get a nice speedup on Linux. If Windows cannot follow suit, too bad. But I do not want to be punished because other people's OS is not as good as mine, so I _want_ fstat(). And those few lines will not hurt, they'll be readable enough. Ciao, Dscho --
OK, I have done some testing/profiling with oprofile(1), and one thing I found out was that my Linux kernel was built with SLUB debug, and of course it cost some system time to run the VM debug code. After I turned this off, the total system time when down from aprox 6 to 3 seconds for the 'git checkout -q my-v2.6.25/7' test. Also, from strace output each lstat() call now take around 16 microseconds, and each fstat() call around 12 microseconds, so for aprox 14000 changed calls (lstat() => fstat()) the performance gain should now only be (* 14000 (- 16 12)) = 56 ms, compared to 467 ms, which I reported before. -- kjetil 1) http://oprofile.sourceforge.net/about/ --
This is a debug patch which is only to be used while the lstat_cache()
is in the test stage, and should be removed/reverted before the final
relase.
I think it should be useful to catch these warnings, as I it could be
an indication of that the cache would not be very effective if it is
doing ping-pong by switching between different cache types too many
times.
Also, if someone is experimenting with the lstat_cache(), this patch
will maybe be useful while debugging.
If someone is able to trigger the warning, then send a mail to the GIT
mailing list, containing the first 15 lines of the warning, and a
short description of the GIT commands to trigger the warnings.
I hope someone is willing to use this patch for a while, to be able to
catch possible ping-pong's.
Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
---
symlinks.c | 23 +++++++++++++++++++++++
1 files changed, 23 insertions(+), 0 deletions(-)
diff --git a/symlinks.c b/symlinks.c
index 215d049..0ad8e39 100644
--- a/symlinks.c
+++ b/symlinks.c
@@ -51,6 +51,11 @@ static inline void reset_lstat_cache(void)
*/
}
+#define SWITCHES_BEFORE_WARNING 10
+static unsigned int cache_switches, number_of_warnings;
+static unsigned int current_cache_func, last_cache_func;
+static unsigned int total_calls;
+
#define FL_DIR (1 << 0)
#define FL_NOENT (1 << 1)
#define FL_SYMLINK (1 << 2)
@@ -77,6 +82,7 @@ static int lstat_cache(const char *name, int len,
int match_flags, ret_flags, save_flags, max_len, ret;
struct stat st;
+ total_calls++;
if (cache.track_flags != track_flags ||
cache.prefix_len_stat_func != prefix_len_stat_func) {
/*
@@ -88,6 +94,17 @@ static int lstat_cache(const char *name, int len,
cache.track_flags = track_flags;
cache.prefix_len_stat_func = prefix_len_stat_func;
match_len = last_slash = 0;
+ cache_switches++;
+ if (cache_switches > SWITCHES_BEFORE_WARNING) {
+ if (number_of_warnings < 10 || number_of_warnings % 1000 == ...