Hi, I've decided to get this series merged now instead of waiting for the ternary treap refactor. It's in excellent shape, thanks to Jonathan's constant reviews/ fixes and David's constant refactoring. Since Jonathan's last send, I've incoporated some suggestions from my own review of Jonathan's series and split it into two series as I proposed earlier; I'll send the second series (rr/contrib-svn-fe) shortly after this. Once this series is merged, I estimate that the following will come in as incremental patches when the work is finished: rr/ternary-trp-refactor rr/zero-tree-refactor rr/dumpfilev3-parser Thanks. -- Ram David Barr (5): Add memory pool library Add string-specific memory pool Add stream helper library Add infrastructure to write revisions in fast-export format Add SVN dump parser Jason Evans (1): Add treap implementation Jonathan Nieder (2): Export parse_date_basic() to convert a date string to timestamp Introduce vcs-svn lib Makefile | 12 ++- cache.h | 1 + date.c | 14 +-- vcs-svn/LICENSE | 33 +++++ vcs-svn/fast_export.c | 75 +++++++++++ vcs-svn/fast_export.h | 14 ++ vcs-svn/line_buffer.c | 93 ++++++++++++++ vcs-svn/line_buffer.h | 14 ++ vcs-svn/obj_pool.h | 80 ++++++++++++ vcs-svn/repo_tree.c | 335 +++++++++++++++++++++++++++++++++++++++++++++++++ vcs-svn/repo_tree.h | 26 ++++ vcs-svn/string_pool.c | 114 +++++++++++++++++ vcs-svn/string_pool.h | 15 +++ vcs-svn/svndump.c | 289 ++++++++++++++++++++++++++++++++++++++++++ vcs-svn/svndump.h | 8 ++ vcs-svn/trp.h | 220 ++++++++++++++++++++++++++++++++ vcs-svn/trp.txt | 102 +++++++++++++++ 17 files changed, 1436 insertions(+), 9 deletions(-) create mode 100644 vcs-svn/LICENSE create mode 100644 vcs-svn/fast_export.c create mode 100644 vcs-svn/fast_export.h create mode 100644 vcs-svn/line_buffer.c create mode 100644 vcs-svn/line_buffer.h create ...
I still haven’t checked this implementation in detail, but it seemed to work in practice and is about to change anyway. I like the documentation updates. What else changed from the previous round? --
Hi again, Nothing else. I should have posted a diff from last time. -- Ram --
From: David Barr <david.barr@cordelta.com> Add a memory pool library implemented using C macros. The obj_pool_gen() macro creates a type-specific memory pool API. The memory pool library is distinguished from the existing specialized allocators in alloc.c by using a contiguous block for all allocations. This means that on one hand, long-lived pointers have to be written as offsets, since the base address changes as the pool grows, but on the other hand, the entire pool can be easily written to the file system. This allows the memory pool to persist between runs of an application. For the svn importer, such a facility is useful because each svn revision can copy trees and files from any previous revision. The relevant information for all revisions has to persist somehow to support incremental runs, and for now it is simplest to avoid relying on the target VCS for that. obj_pool_gen(pre, obj_t, initial_capability) pre: Prefix for generated functions (example: string). obj_t: Type for treap data structure (example: char). initial_capacity: Initial size of the memory pool (example: 4096). void pre_init(void); Read values from a previous run to initialize the pool. If this function is not called, the pool begins valid but empty. uint32_t pre_alloc(uint32_t nmemb); Reserve space for a few objects in the pool and return an offset to the first one. uint32_t pre_free(uint32_t nmemb); Unreserve the last few objects reserved. uint32_t pre_offset(obj_t *pointer); obj_t *pre_pointer(uint32_t offset); Convert between pointers into the in-memory pool and offsets from the beginning (or ~0 for the NULL pointer). Pointers are not guaranteed to remain valid after a pre_alloc() operation or pre_reset() followed by pre_init(), but offsets are. void pre_commit(void); Write the pool to file. A pre_reset() followed by pre_init() (pehaps with exit() in between) will return the pool to the last committed state. void pre_reset(void); Deinitialize ...
Except as a proof of concept, this is the wrong API to have. The problem
is that the caller cannot choose the filename, so it ends up being a .bin
file in the current directory, wherever that is.
The log message leaves out a subtlety: this also increases the
‘committed’ value, and bookkeeping for that might be useful to some
callers.
If you just want something working, I’d suggest stubbing this out:
static MAYBE_UNUSED void pre##_init(void) \
{ \
} \
It even almost makes sense as API: the _init function does all
initialization tasks required, which is to say, none. (The {0, ...}
This can be simplified
static MAYBE_UNUSED void pre##_commit(void) \
{ \
pre##_pool.committed = pre##_pool.size; \
} \
In other words, maybe something like this on top? This includes the
vestigal _init() function which really should not be there (it is
confusing that some callers use it and others don’t). I did not
spend much time on it because in the end I suspect we might throw
obj_pool away anyway.
---
diff --git a/Makefile b/Makefile
index fc31ee0..386a586 100644
--- a/Makefile
+++ b/Makefile
@@ -409,6 +409,7 @@ TEST_PROGRAMS_NEED_X += test-delta
TEST_PROGRAMS_NEED_X += test-dump-cache-tree
TEST_PROGRAMS_NEED_X += test-genrandom
TEST_PROGRAMS_NEED_X += test-match-trees
+TEST_PROGRAMS_NEED_X += test-obj-pool
TEST_PROGRAMS_NEED_X += test-parse-options
TEST_PROGRAMS_NEED_X += test-path-utils
TEST_PROGRAMS_NEED_X += test-run-command
diff --git a/t/t0070-fundamental.sh b/t/t0070-fundamental.sh
index 680d7d6..262f304 100755
--- a/t/t0070-fundamental.sh
+++ b/t/t0070-fundamental.sh
@@ -12,4 +12,10 @@ test_expect_success 'character classes (isspace, isalpha etc.)' '
test-ctype
'
+test_expect_success 'allocator for svn importer' '
+ printf "%s\n" 0 0 0 0 0 2 2 2 2 2 -1 >expected &&
+ test-obj-pool >actual &&
+ test_cmp expected actual
+'
+
test_done
diff --git a/test-obj-pool.c b/test-obj-pool.c
new file mode 100644
index 0000000..1300049
--- ...Hi Jonathan,
Oh, right. I remember that you asked to turn off persistence for this
merge. We can include persistence it in a later series.
Junio: Could you squash this diff into the commit?
diff --git a/vcs-svn/obj_pool.h b/vcs-svn/obj_pool.h
index f60c872..7a256d4 100644
--- a/vcs-svn/obj_pool.h
+++ b/vcs-svn/obj_pool.h
@@ -20,17 +20,6 @@ static struct { \
} pre##_pool = { 0, 0, 0, NULL, NULL}; \
static MAYBE_UNUSED void pre##_init(void) \
{ \
- struct stat st; \
- pre##_pool.file = fopen(#pre ".bin", "a+"); \
- rewind(pre##_pool.file); \
- fstat(fileno(pre##_pool.file), &st); \
- pre##_pool.size = st.st_size / sizeof(obj_t); \
- pre##_pool.committed = pre##_pool.size; \
- pre##_pool.capacity = pre##_pool.size * 2; \
- if (pre##_pool.capacity < initial_capacity) \
- pre##_pool.capacity = initial_capacity; \
- pre##_pool.base = malloc(pre##_pool.capacity * sizeof(obj_t)); \
- fread(pre##_pool.base, sizeof(obj_t), pre##_pool.size, pre##_pool.file); \
} \
static MAYBE_UNUSED uint32_t pre##_alloc(uint32_t count) \
{ \
@@ -62,9 +51,7 @@ static MAYBE_UNUSED obj_t *pre##_pointer(uint32_t offset) \
} \
static MAYBE_UNUSED void pre##_commit(void) \
{ \
- pre##_pool.committed += fwrite(pre##_pool.base + pre##_pool.committed, \
- sizeof(obj_t), pre##_pool.size - pre##_pool.committed, \
- pre##_pool.file); \
+ pre##_pool.committed = pre##_pool.size; \
} \
static MAYBE_UNUSED void pre##_reset(void) \
{ \
--
From: David Barr <david.barr@cordelta.com>
Intern strings so they can be compared by address and stored without
wasting space.
This library uses the macros in the obj_pool.h and trp.h to create a
memory pool for strings and expose an API for handling them.
Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
Makefile | 4 +-
vcs-svn/string_pool.c | 114 +++++++++++++++++++++++++++++++++++++++++++++++++
vcs-svn/string_pool.h | 15 ++++++
3 files changed, 131 insertions(+), 2 deletions(-)
create mode 100644 vcs-svn/string_pool.c
create mode 100644 vcs-svn/string_pool.h
diff --git a/Makefile b/Makefile
index 663a366..e11e588 100644
--- a/Makefile
+++ b/Makefile
@@ -1740,7 +1740,7 @@ ifndef NO_CURL
endif
XDIFF_OBJS = xdiff/xdiffi.o xdiff/xprepare.o xdiff/xutils.o xdiff/xemit.o \
xdiff/xmerge.o xdiff/xpatience.o
-VCSSVN_OBJS =
+VCSSVN_OBJS = vcs-svn/string_pool.o
OBJECTS := $(GIT_OBJS) $(XDIFF_OBJS) $(VCSSVN_OBJS)
dep_files := $(foreach f,$(OBJECTS),$(dir $f).depend/$(notdir $f).d)
@@ -1864,7 +1864,7 @@ xdiff-interface.o $(XDIFF_OBJS): \
xdiff/xutils.h xdiff/xprepare.h xdiff/xdiffi.h xdiff/xemit.h
$(VCSSVN_OBJS): \
- vcs-svn/obj_pool.h vcs-svn/trp.h
+ vcs-svn/obj_pool.h vcs-svn/trp.h vcs-svn/string_pool.h
endif
exec_cmd.s exec_cmd.o: EXTRA_CPPFLAGS = \
diff --git a/vcs-svn/string_pool.c b/vcs-svn/string_pool.c
new file mode 100644
index 0000000..bd5a380
--- /dev/null
+++ b/vcs-svn/string_pool.c
@@ -0,0 +1,114 @@
+/*
+ * Licensed under a two-clause BSD-style license.
+ * See LICENSE for details.
+ */
+
+#include "git-compat-util.h"
+#include "trp.h"
+#include "obj_pool.h"
+#include "string_pool.h"
+
+static struct trp_root tree = { ~0 };
+
+struct node {
+ uint32_t offset;
+ struct trp_node children;
+};
+
+/* Two memory pools: one for struct node, ...From: Jonathan Nieder <jrnieder@gmail.com> Teach the build system to build a separate library for the upcoming subversion interop support. The resulting vcs-svn/lib.a does not contain any code, nor is it built during a normal build. This is just scaffolding for later changes. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> --- Makefile | 8 +++++++- 1 files changed, 7 insertions(+), 1 deletions(-) diff --git a/Makefile b/Makefile index 9aca8a1..6441dcb 100644 --- a/Makefile +++ b/Makefile @@ -468,6 +468,7 @@ export PYTHON_PATH LIB_FILE=libgit.a XDIFF_LIB=xdiff/lib.a +VCSSVN_LIB=vcs-svn/lib.a LIB_H += advice.h LIB_H += archive.h @@ -1739,7 +1740,8 @@ ifndef NO_CURL endif XDIFF_OBJS = xdiff/xdiffi.o xdiff/xprepare.o xdiff/xutils.o xdiff/xemit.o \ xdiff/xmerge.o xdiff/xpatience.o -OBJECTS := $(GIT_OBJS) $(XDIFF_OBJS) +VCSSVN_OBJS = +OBJECTS := $(GIT_OBJS) $(XDIFF_OBJS) $(VCSSVN_OBJS) dep_files := $(foreach f,$(OBJECTS),$(dir $f).depend/$(notdir $f).d) dep_dirs := $(addsuffix .depend,$(sort $(dir $(OBJECTS)))) @@ -1860,6 +1862,8 @@ http.o http-walker.o http-push.o remote-curl.o: http.h xdiff-interface.o $(XDIFF_OBJS): \ xdiff/xinclude.h xdiff/xmacros.h xdiff/xdiff.h xdiff/xtypes.h \ xdiff/xutils.h xdiff/xprepare.h xdiff/xdiffi.h xdiff/xemit.h + +$(VCSSVN_OBJS): endif exec_cmd.s exec_cmd.o: EXTRA_CPPFLAGS = \ @@ -1908,6 +1912,8 @@ $(LIB_FILE): $(LIB_OBJS) $(XDIFF_LIB): $(XDIFF_OBJS) $(QUIET_AR)$(RM) $@ && $(AR) rcs $@ $(XDIFF_OBJS) +$(VCSSVN_LIB): $(VCSSVN_OBJS) + $(QUIET_AR)$(RM) $@ && $(AR) rcs $@ $(VCSSVN_OBJS) doc: $(MAKE) -C Documentation all -- 1.7.1 --
$ make vcs-svn/lib.a V=1 rm -f vcs-svn/lib.a && ar rcs vcs-svn/lib.a ar: vcs-svn/lib.a: No such file or directory make: *** [vcs-svn/lib.a] Error 1 That is because the vcs-svn directory does not exist. So probably the LICENSE should be added with the same patch (and git should learn to track empty directories). Jonathan --
Hi Jonathan, Oops. Sorry about not checking this: it looked alright at a glance. Yes, we can add LICENSE with this patch. Junio: Could you squash in this diff? --- /dev/null +++ b/vcs-svn/LICENSE @@ -0,0 +1,26 @@ +Copyright (C) 2010 David Barr <david.barr@cordelta.com>. +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: +1. Redistributions of source code must retain the above copyright + notice(s), this list of conditions and the following disclaimer + unmodified other than the allowable addition of one or more + copyright notices. +2. Redistributions in binary form must reproduce the above copyright + notice(s), this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) ``AS IS'' AND ANY +EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) BE +LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR +BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, +WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE +OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, +EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. -- Ram --
From: David Barr <david.barr@cordelta.com> This library provides thread-unsafe fgets()- and fread()-like functions where the caller does not have to supply a buffer. It maintains a couple of static buffers and provides an API to use them. NEEDSWORK: what should buffer_copy_bytes do on error? Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- Makefile | 5 ++- vcs-svn/line_buffer.c | 93 +++++++++++++++++++++++++++++++++++++++++++++++++ vcs-svn/line_buffer.h | 14 +++++++ 3 files changed, 110 insertions(+), 2 deletions(-) create mode 100644 vcs-svn/line_buffer.c create mode 100644 vcs-svn/line_buffer.h diff --git a/Makefile b/Makefile index e11e588..8223d9b 100644 --- a/Makefile +++ b/Makefile @@ -1740,7 +1740,7 @@ ifndef NO_CURL endif XDIFF_OBJS = xdiff/xdiffi.o xdiff/xprepare.o xdiff/xutils.o xdiff/xemit.o \ xdiff/xmerge.o xdiff/xpatience.o -VCSSVN_OBJS = vcs-svn/string_pool.o +VCSSVN_OBJS = vcs-svn/string_pool.o vcs-svn/line_buffer.o OBJECTS := $(GIT_OBJS) $(XDIFF_OBJS) $(VCSSVN_OBJS) dep_files := $(foreach f,$(OBJECTS),$(dir $f).depend/$(notdir $f).d) @@ -1864,7 +1864,8 @@ xdiff-interface.o $(XDIFF_OBJS): \ xdiff/xutils.h xdiff/xprepare.h xdiff/xdiffi.h xdiff/xemit.h $(VCSSVN_OBJS): \ - vcs-svn/obj_pool.h vcs-svn/trp.h vcs-svn/string_pool.h + vcs-svn/obj_pool.h vcs-svn/trp.h vcs-svn/string_pool.h \ + vcs-svn/line_buffer.h endif exec_cmd.s exec_cmd.o: EXTRA_CPPFLAGS = \ diff --git a/vcs-svn/line_buffer.c b/vcs-svn/line_buffer.c new file mode 100644 index 0000000..0f83426 --- /dev/null +++ b/vcs-svn/line_buffer.c @@ -0,0 +1,93 @@ +/* + * Licensed under a two-clause BSD-style license. + * See LICENSE for details. + */ + +#include "git-compat-util.h" + +#include "line_buffer.h" +#include "obj_pool.h" + +#define LINE_BUFFER_LEN 10000 +#define ...
For consistency with the rest of vcs-svn, it should do nothing. :) I would love to see svn-fe diagnosing and recovering somehow from faulty input. For now it follows the easier route of just ignoring (and skipping) confusing input. Probably this should be mentioned in the man page somewhere. The next input/output operation will fail, causing svn-fe to quit early, so it would not be easy for such an error to go unnoticed. --
From: David Barr <david.barr@cordelta.com> repo_tree maintains the exporter's state and provides a facility to to call fast_export, which writes objects to stdout suitable for consumption by fast-import. The exported functions roughly correspond to Subversion FS operations. . repo_add, repo_modify, repo_copy, repo_replace, and repo_delete update the current commit, based roughly on the corresponding Subversion FS operation. . repo_commit calls out to fast_export to write the current commit to the fast-import stream in stdout. . repo_diff is used by the fast_export module to write the changes for a commit. . repo_reset erases the exporter's state, so valgrind can be happy. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- Makefile | 5 +- vcs-svn/fast_export.c | 75 +++++++++++ vcs-svn/fast_export.h | 14 ++ vcs-svn/repo_tree.c | 335 +++++++++++++++++++++++++++++++++++++++++++++++++ vcs-svn/repo_tree.h | 26 ++++ 5 files changed, 453 insertions(+), 2 deletions(-) create mode 100644 vcs-svn/fast_export.c create mode 100644 vcs-svn/fast_export.h create mode 100644 vcs-svn/repo_tree.c create mode 100644 vcs-svn/repo_tree.h diff --git a/Makefile b/Makefile index 8223d9b..7c66dcc 100644 --- a/Makefile +++ b/Makefile @@ -1740,7 +1740,8 @@ ifndef NO_CURL endif XDIFF_OBJS = xdiff/xdiffi.o xdiff/xprepare.o xdiff/xutils.o xdiff/xemit.o \ xdiff/xmerge.o xdiff/xpatience.o -VCSSVN_OBJS = vcs-svn/string_pool.o vcs-svn/line_buffer.o +VCSSVN_OBJS = vcs-svn/string_pool.o vcs-svn/line_buffer.o \ + vcs-svn/repo_tree.o vcs-svn/fast_export.o OBJECTS := $(GIT_OBJS) $(XDIFF_OBJS) $(VCSSVN_OBJS) dep_files := $(foreach f,$(OBJECTS),$(dir $f).depend/$(notdir $f).d) @@ -1865,7 +1866,7 @@ xdiff-interface.o $(XDIFF_OBJS): \ $(VCSSVN_OBJS): \ ...
David tweaked the API nicely upstream to enforce this constraint. So The usual convention within git is to rely on .c files to include Are the semicolons necessary? (A nitpick, I know). That said, this part is looking pretty good. --
From: Jonathan Nieder <jrnieder@gmail.com>
approxidate() is not appropriate for reading machine-written dates
because it guesses instead of erroring out on malformed dates.
parse_date() is less convenient since it returns its output as a
string. So export the underlying function that writes a timestamp.
While at it, change the return value to match the usual convention:
return 0 for success and -1 for failure.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
---
cache.h | 1 +
date.c | 14 ++++++--------
2 files changed, 7 insertions(+), 8 deletions(-)
diff --git a/cache.h b/cache.h
index c9fa3df..68258be 100644
--- a/cache.h
+++ b/cache.h
@@ -811,6 +811,7 @@ const char *show_date_relative(unsigned long time, int tz,
char *timebuf,
size_t timebuf_size);
int parse_date(const char *date, char *buf, int bufsize);
+int parse_date_basic(const char *date, unsigned long *timestamp, int *offset);
void datestamp(char *buf, int bufsize);
#define approxidate(s) approxidate_careful((s), NULL)
unsigned long approxidate_careful(const char *, int *);
diff --git a/date.c b/date.c
index 3c981f7..00f9eb5 100644
--- a/date.c
+++ b/date.c
@@ -586,7 +586,7 @@ static int date_string(unsigned long date, int offset, char *buf, int len)
/* Gr. strptime is crap for this; it doesn't have a way to require RFC2822
(i.e. English) day/month names, and it doesn't work correctly with %z. */
-int parse_date_toffset(const char *date, unsigned long *timestamp, int *offset)
+int parse_date_basic(const char *date, unsigned long *timestamp, int *offset)
{
struct tm tm;
int tm_gmt;
@@ -642,17 +642,16 @@ int parse_date_toffset(const char *date, unsigned long *timestamp, int *offset)
if (!tm_gmt)
*timestamp -= *offset * 60;
- return 1; /* success */
+ return 0; /* success */
}
int parse_date(const char *date, char *result, int ...Junio: I think this should be ejected from the series as an independently useful cleanup. Currently parse_date_toffset() is exported but not declared anywhere. This patch gives it a more predictable API and adds a declaration. Ram: thanks for the reminder. --
Yeah, that makes sense. What this patch does seems to be what c5043cc (Refactor parse_date for approxidate functions, 2010-06-03) should have done from the beginning. Thanks. --
From: David Barr <david.barr@cordelta.com> svndump parses data that is in SVN dumpfile format produced by `svnadmin dump` with the help of line_buffer and uses repo_tree and fast_export to emit a git fast-import stream. Based roughly on com.hydrografix.svndump 0.92 from the SvnToCCase project at <http://svn2cc.sarovar.org/>, by Stefan Hegny and others. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- Makefile | 5 +- vcs-svn/LICENSE | 4 + vcs-svn/svndump.c | 289 +++++++++++++++++++++++++++++++++++++++++++++++++++++ vcs-svn/svndump.h | 8 ++ 4 files changed, 304 insertions(+), 2 deletions(-) create mode 100644 vcs-svn/svndump.c create mode 100644 vcs-svn/svndump.h diff --git a/Makefile b/Makefile index 7c66dcc..e7b37e0 100644 --- a/Makefile +++ b/Makefile @@ -1741,7 +1741,7 @@ endif XDIFF_OBJS = xdiff/xdiffi.o xdiff/xprepare.o xdiff/xutils.o xdiff/xemit.o \ xdiff/xmerge.o xdiff/xpatience.o VCSSVN_OBJS = vcs-svn/string_pool.o vcs-svn/line_buffer.o \ - vcs-svn/repo_tree.o vcs-svn/fast_export.o + vcs-svn/repo_tree.o vcs-svn/fast_export.o vcs-svn/svndump.o OBJECTS := $(GIT_OBJS) $(XDIFF_OBJS) $(VCSSVN_OBJS) dep_files := $(foreach f,$(OBJECTS),$(dir $f).depend/$(notdir $f).d) @@ -1866,7 +1866,8 @@ xdiff-interface.o $(XDIFF_OBJS): \ $(VCSSVN_OBJS): \ vcs-svn/obj_pool.h vcs-svn/trp.h vcs-svn/string_pool.h \ - vcs-svn/line_buffer.h vcs-svn/repo_tree.h vcs-svn/fast_export.h + vcs-svn/line_buffer.h vcs-svn/repo_tree.h vcs-svn/fast_export.h \ + vcs-svn/svndump.h endif exec_cmd.s exec_cmd.o: EXTRA_CPPFLAGS = \ diff --git a/vcs-svn/LICENSE b/vcs-svn/LICENSE index a3d384c..0a5e3c4 100644 --- a/vcs-svn/LICENSE +++ b/vcs-svn/LICENSE @@ -4,6 +4,10 @@ All rights reserved. Copyright (C) 2008 Jason Evans <jasone@canonware.com>. All rights reserved. +Copyright (C) ...
Probably worth mentioning the this requires a dumpfile v2 (i.e., it A strbuf would do just as well. But using obj_pool looks like a fine Neat. This is a textbook example of where to use a perfect hash, but comparing interned strings is simpler and fast enough (and the Unknown properties are ignored. Adding stream comments to allow A simple reader does not cope well with this kind of function. It is hard to know if it exhaustively deals with all cases. Too long. I realize that writing a state machine can be hard in C; maybe it would be easiest to package up the state in a struct and have a separate function for the main loop body. The patches I didn’t comment on all look good. I don’t think anything I did comment on should prevent this reaching a wider audience. Thanks for the pleasant read, Jonathan --
Except for the obj_pool persistence bit, but you already commented on that. Happy travels, Jonathan --
Hi, While Ram travels halfway across the globe, I have rerolled at his request. You can find the result at git://repo.or.cz/git/jrn.git rr/svn-fe The main difference from the version sent already is the addition of tests and a small tweak to the treap implementation to make them pass. I will send the two patches with new tests as replies to this message. Thoughts welcome, as always. David Barr (5): Add memory pool library vcs-svn: Add string-specific memory pool Add stream helper library Add infrastructure to write revisions in fast-export format Add SVN dump parser Jason Evans (1): Add treap implementation Jonathan Nieder (3): Export parse_date_basic() to convert a date string to timestamp Introduce vcs-svn lib Add a sample user for the svndump library .gitignore | 2 + Makefile | 14 ++- cache.h | 1 + contrib/svn-fe/.gitignore | 4 + contrib/svn-fe/Makefile | 64 +++++++++ contrib/svn-fe/svn-fe.c | 15 ++ contrib/svn-fe/svn-fe.txt | 68 +++++++++ date.c | 14 +- t/t0080-vcs-svn.sh | 101 ++++++++++++++ test-obj-pool.c | 116 ++++++++++++++++ test-treap.c | 65 +++++++++ vcs-svn/LICENSE | 33 +++++ vcs-svn/fast_export.c | 75 ++++++++++ vcs-svn/fast_export.h | 14 ++ vcs-svn/line_buffer.c | 91 +++++++++++++ vcs-svn/line_buffer.h | 12 ++ vcs-svn/obj_pool.h | 61 +++++++++ vcs-svn/repo_tree.c | 331 +++++++++++++++++++++++++++++++++++++++++++++ vcs-svn/repo_tree.h | 26 ++++ vcs-svn/string_pool.c | 101 ++++++++++++++ vcs-svn/string_pool.h | 12 ++ vcs-svn/svndump.c | 289 +++++++++++++++++++++++++++++++++++++++ vcs-svn/svndump.h | 8 + vcs-svn/trp.h | 223 ++++++++++++++++++++++++++++++ vcs-svn/trp.txt | 98 +++++++++++++ 25 files changed, 1829 insertions(+), 9 deletions(-) create mode ...
From: David Barr <david.barr@cordelta.com> Add a memory pool library implemented using C macros. The obj_pool_gen() macro creates a type-specific memory pool. The memory pool library is distinguished from the existing specialized allocators in alloc.c by using a contiguous block for all allocations. This means that on one hand, long-lived pointers have to be written as offsets, since the base address changes as the pool grows, but on the other hand, the entire pool can be easily written to the file system. This could allow the memory pool to persist between runs of an application. For the svn importer, such a facility is useful because each svn revision can copy trees and files from any previous revision. The relevant information for all revisions has to persist somehow to support incremental runs. Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- Stripped out pool_init. And added tests! There is not really much an allocator can do, so it is fun to play around with. .gitignore | 1 + Makefile | 4 +- t/t0080-vcs-svn.sh | 79 +++++++++++++++++++++++++++++++++++ test-obj-pool.c | 116 ++++++++++++++++++++++++++++++++++++++++++++++++++++ vcs-svn/obj_pool.h | 61 +++++++++++++++++++++++++++ 5 files changed, 260 insertions(+), 1 deletions(-) create mode 100755 t/t0080-vcs-svn.sh create mode 100644 test-obj-pool.c create mode 100644 vcs-svn/obj_pool.h diff --git a/.gitignore b/.gitignore index 14e2b6b..1e64a6a 100644 --- a/.gitignore +++ b/.gitignore @@ -167,6 +167,7 @@ /test-genrandom /test-index-version /test-match-trees +/test-obj-pool /test-parse-options /test-path-utils /test-run-command diff --git a/Makefile b/Makefile index d6a779b..3b873cd 100644 --- a/Makefile +++ b/Makefile @@ -409,6 +409,7 @@ TEST_PROGRAMS_NEED_X += test-delta ...
From: Jason Evans <jasone@canonware.com> Provide macros to generate a type-specific treap implementation and various functions to operate on it. It uses obj_pool.h to store memory nodes in a treap. Previously committed nodes are never removed from the pool; after any *_commit operation, it is assumed (correctly, in the case of svn-fast-export) that someone else must care about them. Treaps provide a memory-efficient binary search tree structure. Insertion/deletion/search are about as about as fast in the average case as red-black trees and the chances of worst-case behavior are vanishingly small, thanks to (pseudo-)randomness. The bad worst-case behavior is a small price to pay, given that treaps are much simpler to implement. From http://www.canonware.com/download/trp/trp_hash/trp.h [db: Altered to reference nodes by offset from a common base pointer] [db: Bob Jenkins' hashing implementation dropped for Knuth's] [db: Methods unnecessary for search and insert dropped] Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- Tweaked treap_search() to always return the node after a missing node, like it is documented to. For vcs-svn this doesn’t matter but the predictable semantics should make debugging easier. The rest of the patches are almost identical to the versions Ram sent; see the aforementioned git tree if you are interested in trying them out. Testing would be quite welcome. .gitignore | 1 + Makefile | 3 +- t/t0080-vcs-svn.sh | 22 +++++ test-treap.c | 65 +++++++++++++++ vcs-svn/LICENSE | 3 + vcs-svn/trp.h | 223 ++++++++++++++++++++++++++++++++++++++++++++++++++++ vcs-svn/trp.txt | 98 +++++++++++++++++++++++ 7 files changed, 414 ...
Or rather, it does. Sorry about that.
-- 8< --
Subject: vcs-svn: treap_search should return NULL for missing items
In a misguided attempt to make the code match the documentation,
commit 4692f8e7d (Add treap implementation, 2010-07-15) changed
the semantics of treap_search to return the /next/ node when a
node is missing.
That is great in some circumstances (and the new tests even rely on
it), but the rest of vcs-svn relies on treap_search to return
NULL in that case instead. The documentation only suggested
otherwise because of a typo.
So fix it: now treap_search can do what it was always supposed
to (return NULL on failure) and Jason Evans’s treap_nsearch function
can be used to keep the test suite working.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
test-treap.c | 2 +-
vcs-svn/trp.h | 13 +++++++++++++
vcs-svn/trp.txt | 9 +++++++--
3 files changed, 21 insertions(+), 3 deletions(-)
diff --git a/test-treap.c b/test-treap.c
index eae7324..cdba511 100644
--- a/test-treap.c
+++ b/test-treap.c
@@ -52,7 +52,7 @@ int main(int argc, char *argv[])
next = node_offset(treap_next(&root, node_pointer(item)));
treap_remove(&root, node_pointer(item));
- item = node_offset(treap_search(&root, tmp));
+ item = node_offset(treap_nsearch(&root, tmp));
if (item != next && (!~item || node_pointer(item)->n != tmp->n))
die("found %"PRIuMAX" in place of %"PRIuMAX"",
diff --git a/vcs-svn/trp.h b/vcs-svn/trp.h
index 49940cf..1f5f51f 100644
--- a/vcs-svn/trp.h
+++ b/vcs-svn/trp.h
@@ -138,6 +138,19 @@ a_attr a_type MAYBE_UNUSED *a_pre##search(struct trp_root *treap, a_type *key) \
uint32_t ret = treap->trp_root; \
while (~ret && (cmp = (a_cmp)(key, trpn_pointer(a_base,ret)))) { \
if (cmp < 0) { \
+ ret = trp_left_get(a_base, a_field, ret); \
+ } else { \
+ ret = trp_right_get(a_base, a_field, ret); \
+ } \
+ } \
+ return trpn_pointer(a_base, ret); \
+} \
+a_attr a_type MAYBE_UNUSED *a_pre##nsearch(struct ...Hi! svn-fe has some serious changes on the horizon. As a preparation, let’s round up what we have now. The most controversial change is probably the new svn-fe test, which takes about 15 seconds (for the “svnadmin load”, not the svn-fe step :)). It is in the t9* series, so hopefully that will not dissuade people from running the earlier tests. The main highlight in the changes is a new Input error to stderr if a system call failed in reading in the dump file. It still returns status 0 in this and other error situations, though. Based on maint (for no good reason; that’s just where I tried it). Intended to replace rr/svn-export in pu (only if Ram likes it, of course). Thoughts welcome. David Barr (5): Add memory pool library Add string-specific memory pool Add stream helper library Infrastructure to write revisions in fast-export format SVN dump parser Jason Evans (1): Add treap implementation Jonathan Nieder (4): Export parse_date_basic() to convert a date string to timestamp Introduce vcs-svn lib Update svn-fe manual svn-fe manual: Clarify warning about deltas in dumpfiles .gitignore | 5 + Makefile | 25 +++- cache.h | 1 + contrib/svn-fe/svn-fe.c | 1 + contrib/svn-fe/svn-fe.txt | 19 ++-- date.c | 14 +- t/t0080-vcs-svn.sh | 171 +++++++++++++++++++++++ t/t9010-svn-fe.sh | 32 +++++ test-line-buffer.c | 46 +++++++ test-obj-pool.c | 116 ++++++++++++++++ test-string-pool.c | 31 +++++ test-svn-fe.c | 18 +++ test-treap.c | 65 +++++++++ vcs-svn/LICENSE | 33 +++++ vcs-svn/fast_export.c | 74 ++++++++++ vcs-svn/fast_export.h | 11 ++ vcs-svn/line_buffer.c | 102 ++++++++++++++ vcs-svn/line_buffer.h | 12 ++ vcs-svn/line_buffer.txt | 62 +++++++++ vcs-svn/obj_pool.h | 61 +++++++++ vcs-svn/repo_tree.c ...
approxidate() is not appropriate for reading machine-written dates
because it guesses instead of erroring out on malformed dates.
parse_date() is less convenient since it returns its output as a
string. So export the underlying function that writes a timestamp.
While at it, change the return value to match the usual convention:
return 0 for success and -1 for failure.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>
---
As before, I think this improves code clarity, independently of
its use for svn-fe. So I would not be unhappy if it is applied
as a separate topic.
No change from last round.
cache.h | 1 +
date.c | 14 ++++++--------
2 files changed, 7 insertions(+), 8 deletions(-)
diff --git a/cache.h b/cache.h
index c9fa3df..68258be 100644
--- a/cache.h
+++ b/cache.h
@@ -811,6 +811,7 @@ const char *show_date_relative(unsigned long time, int tz,
char *timebuf,
size_t timebuf_size);
int parse_date(const char *date, char *buf, int bufsize);
+int parse_date_basic(const char *date, unsigned long *timestamp, int *offset);
void datestamp(char *buf, int bufsize);
#define approxidate(s) approxidate_careful((s), NULL)
unsigned long approxidate_careful(const char *, int *);
diff --git a/date.c b/date.c
index 3c981f7..00f9eb5 100644
--- a/date.c
+++ b/date.c
@@ -586,7 +586,7 @@ static int date_string(unsigned long date, int offset, char *buf, int len)
/* Gr. strptime is crap for this; it doesn't have a way to require RFC2822
(i.e. English) day/month names, and it doesn't work correctly with %z. */
-int parse_date_toffset(const char *date, unsigned long *timestamp, int *offset)
+int parse_date_basic(const char *date, unsigned long *timestamp, int *offset)
{
struct tm tm;
int tm_gmt;
@@ -642,17 +642,16 @@ int parse_date_toffset(const char *date, unsigned long *timestamp, int *offset)
if (!tm_gmt)
*timestamp -= *offset * 60;
- return 1; /* success */
+ return ...Teach the build system to build a separate library for the upcoming subversion interop support. The resulting vcs-svn/lib.a does not contain any code, nor is it built during a normal build. This is just scaffolding for later changes. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- This is just for reference; no change from last round. Makefile | 8 +++++++- vcs-svn/LICENSE | 26 ++++++++++++++++++++++++++ 2 files changed, 33 insertions(+), 1 deletions(-) create mode 100644 vcs-svn/LICENSE diff --git a/Makefile b/Makefile index f33648d..71cca35 100644 --- a/Makefile +++ b/Makefile @@ -468,6 +468,7 @@ export PYTHON_PATH LIB_FILE=libgit.a XDIFF_LIB=xdiff/lib.a +VCSSVN_LIB=vcs-svn/lib.a LIB_H += advice.h LIB_H += archive.h @@ -1739,7 +1740,8 @@ ifndef NO_CURL endif XDIFF_OBJS = xdiff/xdiffi.o xdiff/xprepare.o xdiff/xutils.o xdiff/xemit.o \ xdiff/xmerge.o xdiff/xpatience.o -OBJECTS := $(GIT_OBJS) $(XDIFF_OBJS) +VCSSVN_OBJS = +OBJECTS := $(GIT_OBJS) $(XDIFF_OBJS) $(VCSSVN_OBJS) dep_files := $(foreach f,$(OBJECTS),$(dir $f).depend/$(notdir $f).d) dep_dirs := $(addsuffix .depend,$(sort $(dir $(OBJECTS)))) @@ -1860,6 +1862,8 @@ http.o http-walker.o http-push.o remote-curl.o: http.h xdiff-interface.o $(XDIFF_OBJS): \ xdiff/xinclude.h xdiff/xmacros.h xdiff/xdiff.h xdiff/xtypes.h \ xdiff/xutils.h xdiff/xprepare.h xdiff/xdiffi.h xdiff/xemit.h + +$(VCSSVN_OBJS): endif exec_cmd.s exec_cmd.o: EXTRA_CPPFLAGS = \ @@ -1908,6 +1912,8 @@ $(LIB_FILE): $(LIB_OBJS) $(XDIFF_LIB): $(XDIFF_OBJS) $(QUIET_AR)$(RM) $@ && $(AR) rcs $@ $(XDIFF_OBJS) +$(VCSSVN_LIB): $(VCSSVN_OBJS) + $(QUIET_AR)$(RM) $@ && $(AR) rcs $@ $(VCSSVN_OBJS) doc: $(MAKE) -C Documentation all diff --git a/vcs-svn/LICENSE b/vcs-svn/LICENSE new file mode 100644 index 0000000..6e52372 --- ...
From: David Barr <david.barr@cordelta.com> Add a memory pool library implemented using C macros. The obj_pool_gen() macro creates a type-specific memory pool. The memory pool library is distinguished from the existing specialized allocators in alloc.c by using a contiguous block for all allocations. This means that on one hand, long-lived pointers have to be written as offsets, since the base address changes as the pool grows, but on the other hand, the entire pool can be easily written to the file system. This could allow the memory pool to persist between runs of an application. For the svn importer, such a facility is useful because each svn revision can copy trees and files from any previous revision. The relevant information for all revisions has to persist somehow to support incremental runs. [rr: minor cleanups] [jn: added tests; removed file system backing for now] Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> --- The only change from last round is the notes at the end of the commit message. Hopefully David is less likely to be blamed for bugs I introduced this way. :) .gitignore | 1 + Makefile | 4 +- t/t0080-vcs-svn.sh | 79 +++++++++++++++++++++++++++++++++++ test-obj-pool.c | 116 ++++++++++++++++++++++++++++++++++++++++++++++++++++ vcs-svn/obj_pool.h | 61 +++++++++++++++++++++++++++ 5 files changed, 260 insertions(+), 1 deletions(-) create mode 100755 t/t0080-vcs-svn.sh create mode 100644 test-obj-pool.c create mode 100644 vcs-svn/obj_pool.h diff --git a/.gitignore b/.gitignore index 14e2b6b..1e64a6a 100644 --- a/.gitignore +++ b/.gitignore @@ -167,6 +167,7 @@ /test-genrandom /test-index-version /test-match-trees +/test-obj-pool /test-parse-options /test-path-utils /test-run-command diff --git a/Makefile b/Makefile index ...
From: Jason Evans <jasone@canonware.com> Provide macros to generate a type-specific treap implementation and various functions to operate on it. It uses obj_pool.h to store memory nodes in a treap. Previously committed nodes are never removed from the pool; after any *_commit operation, it is assumed (correctly, in the case of svn-fast-export) that someone else must care about them. Treaps provide a memory-efficient binary search tree structure. Insertion/deletion/search are about as about as fast in the average case as red-black trees and the chances of worst-case behavior are vanishingly small, thanks to (pseudo-)randomness. The bad worst-case behavior is a small price to pay, given that treaps are much simpler to implement. From http://www.canonware.com/download/trp/trp_hash/trp.h [db: Altered to reference nodes by offset from a common base pointer] [db: Bob Jenkins' hashing implementation dropped for Knuth's] [db: Methods unnecessary for search and insert dropped] [rr: Squelched compiler warnings] [db: Added support for immutable treap nodes] [jn: Reintroduced treap_nsearch(); with tests] Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- With the treap_nsearch() fixup from last time squashed in and some more history in the log message. .gitignore | 1 + Makefile | 3 +- t/t0080-vcs-svn.sh | 22 +++++ test-treap.c | 65 ++++++++++++++ vcs-svn/LICENSE | 3 + vcs-svn/trp.h | 236 ++++++++++++++++++++++++++++++++++++++++++++++++++++ vcs-svn/trp.txt | 103 +++++++++++++++++++++++ 7 files changed, 432 insertions(+), 1 deletions(-) create mode 100644 test-treap.c create mode 100644 vcs-svn/trp.h create mode 100644 vcs-svn/trp.txt diff --git a/.gitignore b/.gitignore index 1e64a6a..af47653 100644 --- a/.gitignore +++ b/.gitignore @@ -173,6 +173,7 @@ /test-run-command /test-sha1 ...
SP after "," (same for nsearch) --
Good catch. checkpatch also notices some long lines, but I think
that’s worth ignoring.
-- 8< --
Subject: treap: style fix
Missing spaces in while (0) and trpn_pointer(a, b).
Remove parentheses around return value.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
vcs-svn/trp.h | 30 +++++++++++++++---------------
1 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/vcs-svn/trp.h b/vcs-svn/trp.h
index 1f5f51f..ee35c68 100644
--- a/vcs-svn/trp.h
+++ b/vcs-svn/trp.h
@@ -37,7 +37,7 @@ struct trp_root {
*trpn_pointer(a_base, a_offset) = \
*trpn_pointer(a_base, old_offset); \
} \
- } while (0);
+ } while (0)
/* Left accessors. */
#define trp_left_get(a_base, a_field, a_node) \
@@ -46,7 +46,7 @@ struct trp_root {
do { \
trpn_modify(a_base, a_node); \
trp_left_get(a_base, a_field, a_node) = (a_left); \
- } while(0)
+ } while (0)
/* Right accessors. */
#define trp_right_get(a_base, a_field, a_node) \
@@ -55,7 +55,7 @@ struct trp_root {
do { \
trpn_modify(a_base, a_node); \
trp_right_get(a_base, a_field, a_node) = (a_right); \
- } while(0)
+ } while (0)
/*
* Fibonacci hash function.
@@ -72,7 +72,7 @@ struct trp_root {
do { \
trp_left_set(a_base, a_field, (a_node), ~0); \
trp_right_set(a_base, a_field, (a_node), ~0); \
- } while(0)
+ } while (0)
/* Internal utility macros. */
#define trpn_first(a_base, a_field, a_root, r_node) \
@@ -90,7 +90,7 @@ struct trp_root {
trp_right_set(a_base, a_field, (a_node), \
trp_left_get(a_base, a_field, (r_node))); \
trp_left_set(a_base, a_field, (r_node), (a_node)); \
- } while(0)
+ } while (0)
#define trpn_rotate_right(a_base, a_field, a_node, r_node) \
do { \
@@ -98,7 +98,7 @@ struct trp_root {
trp_left_set(a_base, a_field, (a_node), \
trp_right_get(a_base, a_field, (r_node))); \
trp_right_set(a_base, a_field, (r_node), (a_node)); \
- } while(0)
+ } while (0)
#define trp_gen(a_attr, a_pre, a_type, a_field, ...Here are a few more (but feel free to ignore them).
-- 8< --
Subject: Standardize do { ... } while (0) style
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
base85.c | 6 +++---
cache.h | 2 +-
diffcore.h | 8 ++++----
http.h | 4 ++--
4 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/base85.c b/base85.c
index e459fee..781b575 100644
--- a/base85.c
+++ b/base85.c
@@ -7,9 +7,9 @@
#define say1(a,b) fprintf(stderr, a, b)
#define say2(a,b,c) fprintf(stderr, a, b, c)
#else
-#define say(a) do {} while(0)
-#define say1(a,b) do {} while(0)
-#define say2(a,b,c) do {} while(0)
+#define say(a) do { /* nothing */ } while (0)
+#define say1(a,b) do { /* nothing */ } while (0)
+#define say2(a,b,c) do { /* nothing */ } while (0)
#endif
static const char en85[] = {
diff --git a/cache.h b/cache.h
index 68258be..37ef9d8 100644
--- a/cache.h
+++ b/cache.h
@@ -449,7 +449,7 @@ extern int init_db(const char *template_dir, unsigned int flags);
alloc = alloc_nr(alloc); \
x = xrealloc((x), alloc * sizeof(*(x))); \
} \
- } while(0)
+ } while (0)
/* Initialize and use the cache information */
extern int read_index(struct index_state *);
diff --git a/diffcore.h b/diffcore.h
index fed9b15..05ebc11 100644
--- a/diffcore.h
+++ b/diffcore.h
@@ -98,7 +98,7 @@ struct diff_queue_struct {
(q)->queue = NULL; \
(q)->nr = (q)->alloc = 0; \
(q)->run = 0; \
- } while(0);
+ } while (0)
extern struct diff_queue_struct diff_queued_diff;
extern struct diff_filepair *diff_queue(struct diff_queue_struct *,
@@ -118,9 +118,9 @@ void diff_debug_filespec(struct diff_filespec *, int, const char *);
void diff_debug_filepair(const struct diff_filepair *, int);
void diff_debug_queue(const char *, struct diff_queue_struct *);
#else
-#define diff_debug_filespec(a,b,c) do {} while(0)
-#define diff_debug_filepair(a,b) do {} while(0)
-#define diff_debug_queue(a,b) do {} while(0)
+#define diff_debug_filespec(a,b,c) do { ...From: David Barr <david.barr@cordelta.com> Intern strings so they can be compared by address and stored without wasting space. This library uses the macros in the obj_pool.h and trp.h to create a memory pool for strings and expose an API for handling them. [rr: added API docs] [jn: with some API simplifications, new documentation and tests] Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- New test. The return value from pool_tok_seq is not checked by the vcs-svn lib but trying to use it in tests revealed it was not so intuitive. pool_tok_seq() was behaving strangely when passed an array of size 0; I think there is nothing sane to do in that case --- maybe it should abort(). The API was passing around char * that cannot be modified; changed to const char *. Another set of eyes on this would be welcome. .gitignore | 1 + Makefile | 9 +++- t/t0080-vcs-svn.sh | 16 +++++++ test-string-pool.c | 31 ++++++++++++++ vcs-svn/string_pool.c | 102 +++++++++++++++++++++++++++++++++++++++++++++++ vcs-svn/string_pool.h | 11 +++++ vcs-svn/string_pool.txt | 43 ++++++++++++++++++++ 7 files changed, 210 insertions(+), 3 deletions(-) create mode 100644 test-string-pool.c create mode 100644 vcs-svn/string_pool.c create mode 100644 vcs-svn/string_pool.h create mode 100644 vcs-svn/string_pool.txt diff --git a/.gitignore b/.gitignore index af47653..9f109db 100644 --- a/.gitignore +++ b/.gitignore @@ -173,6 +173,7 @@ /test-run-command /test-sha1 /test-sigchain +/test-string-pool /test-treap /common-cmds.h *.tar.gz diff --git a/Makefile b/Makefile index e7c33ec..24103c9 100644 --- a/Makefile +++ b/Makefile @@ -415,6 +415,7 @@ TEST_PROGRAMS_NEED_X += test-path-utils TEST_PROGRAMS_NEED_X += test-run-command TEST_PROGRAMS_NEED_X += test-sha1 TEST_PROGRAMS_NEED_X += ...
Good catch. Here’s a fixup for patch 2 (“Introduce vcs-svn lib”). -- 8< -- Subject: vcs-svn: remove build artifacts on “make clean” Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- diff --git a/Makefile b/Makefile index 2418820..24c4b3d 100644 --- a/Makefile +++ b/Makefile @@ -2184,8 +2184,8 @@ distclean: clean $(RM) configure clean: - $(RM) *.o block-sha1/*.o ppc/*.o compat/*.o compat/*/*.o xdiff/*.o \ - builtin/*.o $(LIB_FILE) $(XDIFF_LIB) + $(RM) *.o block-sha1/*.o ppc/*.o compat/*.o compat/*/*.o xdiff/*.o vcs-svn/*.o \ + builtin/*.o $(LIB_FILE) $(XDIFF_LIB) $(VCSSVN_LIB) $(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) git$X $(RM) $(TEST_PROGRAMS) $(RM) -r bin-wrappers -- --
From: David Barr <david.barr@cordelta.com> This library provides thread-unsafe fgets()- and fread()-like functions where the caller does not have to supply a buffer. It maintains a couple of static buffers and provides an API to use them. [rr: allow input from files other than stdin] [jn: with tests, documentation, and error handling improvements] Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- New tests and API docs. The return value from buffer_deinit can be used to check for errors now (I found this useful when writing tests). .gitignore | 1 + Makefile | 8 +++- t/t0080-vcs-svn.sh | 54 ++++++++++++++++++++++++++ test-line-buffer.c | 46 ++++++++++++++++++++++ vcs-svn/line_buffer.c | 97 +++++++++++++++++++++++++++++++++++++++++++++++ vcs-svn/line_buffer.h | 12 ++++++ vcs-svn/line_buffer.txt | 58 ++++++++++++++++++++++++++++ 7 files changed, 274 insertions(+), 2 deletions(-) create mode 100644 test-line-buffer.c create mode 100644 vcs-svn/line_buffer.c create mode 100644 vcs-svn/line_buffer.h create mode 100644 vcs-svn/line_buffer.txt diff --git a/.gitignore b/.gitignore index 9f109db..8c0512e 100644 --- a/.gitignore +++ b/.gitignore @@ -166,6 +166,7 @@ /test-dump-cache-tree /test-genrandom /test-index-version +/test-line-buffer /test-match-trees /test-obj-pool /test-parse-options diff --git a/Makefile b/Makefile index 24103c9..a76cce5 100644 --- a/Makefile +++ b/Makefile @@ -408,6 +408,7 @@ TEST_PROGRAMS_NEED_X += test-date TEST_PROGRAMS_NEED_X += test-delta TEST_PROGRAMS_NEED_X += test-dump-cache-tree TEST_PROGRAMS_NEED_X += test-genrandom +TEST_PROGRAMS_NEED_X += test-line-buffer TEST_PROGRAMS_NEED_X += test-match-trees TEST_PROGRAMS_NEED_X += test-obj-pool TEST_PROGRAMS_NEED_X += test-parse-options @@ -1743,7 +1744,7 @@ ifndef NO_CURL endif ...
From: David Barr <david.barr@cordelta.com> repo_tree maintains the exporter's state and provides a facility to to call fast_export, which writes objects to stdout suitable for consumption by fast-import. The exported functions roughly correspond to Subversion FS operations. . repo_add, repo_modify, repo_copy, repo_replace, and repo_delete update the current commit, based roughly on the corresponding Subversion FS operation. . repo_commit calls out to fast_export to write the current commit to the fast-import stream in stdout. . repo_diff is used by the fast_export module to write the changes for a commit. . repo_reset erases the exporter's state, so valgrind can be happy. [rr: squelched compiler warnings] [jn: removed support for maintaining state on-disk, though we may want to add it back later] Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- No tests for this one, since the next few patches exercise it and this is not a general-purpose API. Relative to last round, the `pool_commit` and `commit_commit` calls have been eliminated; unlike the `dir_commit` et al calls, those were only meant for committing state to disk, and the changing high-water mark was not being used. Makefile | 5 +- vcs-svn/fast_export.c | 74 +++++++++++ vcs-svn/fast_export.h | 11 ++ vcs-svn/repo_tree.c | 328 +++++++++++++++++++++++++++++++++++++++++++++++++ vcs-svn/repo_tree.h | 26 ++++ 5 files changed, 442 insertions(+), 2 deletions(-) create mode 100644 vcs-svn/fast_export.c create mode 100644 vcs-svn/fast_export.h create mode 100644 vcs-svn/repo_tree.c create mode 100644 vcs-svn/repo_tree.h diff --git a/Makefile b/Makefile index a76cce5..b873399 100644 --- a/Makefile +++ b/Makefile @@ -1744,7 +1744,8 @@ ifndef NO_CURL endif XDIFF_OBJS = xdiff/xdiffi.o xdiff/xprepare.o xdiff/xutils.o xdiff/xemit.o \ ...
From: David Barr <david.barr@cordelta.com> svndump parses data that is in SVN dumpfile format produced by `svnadmin dump` with the help of line_buffer and uses repo_tree and fast_export to emit a git fast-import stream. Based roughly on com.hydrografix.svndump 0.92 from the SvnToCCase project at <http://svn2cc.sarovar.org/>, by Stefan Hegny and others. [rr: allow input from files other than stdin] [jn: with test, more error reporting] Signed-off-by: David Barr <david.barr@cordelta.com> Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- New test. It is slow; work by svn gurus to speed this up would be nice. The test program is very similar to svn-fe from contrib, except it exercises Ram’s change to read from a file other than stdin. .gitignore | 1 + Makefile | 8 +- contrib/svn-fe/svn-fe.c | 1 + t/t9010-svn-fe.sh | 32 +++++ test-svn-fe.c | 18 +++ vcs-svn/LICENSE | 4 + vcs-svn/svndump.c | 302 +++++++++++++++++++++++++++++++++++++++++++++++ vcs-svn/svndump.h | 9 ++ 8 files changed, 373 insertions(+), 2 deletions(-) create mode 100644 t/t9010-svn-fe.sh create mode 100644 test-svn-fe.c create mode 100644 vcs-svn/svndump.c create mode 100644 vcs-svn/svndump.h diff --git a/.gitignore b/.gitignore index 8c0512e..258723f 100644 --- a/.gitignore +++ b/.gitignore @@ -175,6 +175,7 @@ /test-sha1 /test-sigchain /test-string-pool +/test-svn-fe /test-treap /common-cmds.h *.tar.gz diff --git a/Makefile b/Makefile index b873399..6228f66 100644 --- a/Makefile +++ b/Makefile @@ -417,6 +417,7 @@ TEST_PROGRAMS_NEED_X += test-run-command TEST_PROGRAMS_NEED_X += test-sha1 TEST_PROGRAMS_NEED_X += test-sigchain TEST_PROGRAMS_NEED_X += test-string-pool +TEST_PROGRAMS_NEED_X += test-svn-fe TEST_PROGRAMS_NEED_X += test-treap TEST_PROGRAMS_NEED_X += test-index-version @@ -1745,7 +1746,7 @@ endif ...
Style: static char *log_copy(...) --
The svn-fe example does not litter the working directory with .bin files any more (hoorah!). The permissive error handling implies a known bug. We should be flagging iffy input and, even if we continue, reporting it on exit. Cc: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- contrib/svn-fe/svn-fe.txt | 14 ++++++-------- 1 files changed, 6 insertions(+), 8 deletions(-) diff --git a/contrib/svn-fe/svn-fe.txt b/contrib/svn-fe/svn-fe.txt index de30f83..fb0ee56 100644 --- a/contrib/svn-fe/svn-fe.txt +++ b/contrib/svn-fe/svn-fe.txt @@ -43,11 +43,9 @@ user <user@UUID> as committer, where 'user' is the value of the `svn:author` property and 'UUID' the repository's identifier. -To support incremental imports, 'svn-fe' will put a `git-svn-id` -line at the end of each commit log message if passed an url on the -command line. This line has the form `git-svn-id: URL@REVNO UUID`. - -Empty directories and unknown properties are silently discarded. +To support incremental imports, 'svn-fe' puts a `git-svn-id` line at +the end of each commit log message if passed an url on the command +line. This line has the form `git-svn-id: URL@REVNO UUID`. The resulting repository will generally require further processing to put each project in its own repository and to separate the history @@ -56,9 +54,9 @@ may be useful for this purpose. BUGS ---- -Litters the current working directory with .bin files for -persistence. Will be fixed when the svn-fe infrastructure is aware of -a Git working directory. +Empty directories and unknown properties are silently discarded. + +The exit status does not reflect whether an error was detected. SEE ALSO -------- -- 1.7.2.1.544.ga752d.dirty --
Those in the know would notice that dump file format version 2 means "svnadmin dump --no-deltas", but for the rest of us, an explicit reminder is useful. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- That’s the end of the series. Thanks for reading. On the horizon, as you may have guessed, are changes to use dumpfile format v3. Doing so sanely requires two-way communication with fast-import, I think (as discussed). Ram has already put together a prototype delta applier, so it seems to be mostly a matter of plumbing now. contrib/svn-fe/svn-fe.txt | 5 ++++- 1 files changed, 4 insertions(+), 1 deletions(-) diff --git a/contrib/svn-fe/svn-fe.txt b/contrib/svn-fe/svn-fe.txt index fb0ee56..35f84bd 100644 --- a/contrib/svn-fe/svn-fe.txt +++ b/contrib/svn-fe/svn-fe.txt @@ -12,7 +12,7 @@ svnadmin dump --incremental REPO | svn-fe [url] | git fast-import DESCRIPTION ----------- -Converts a Subversion dumpfile (version: 2) into input suitable for +Converts a Subversion dumpfile into input suitable for git-fast-import(1) and similar importers. REPO is a path to a Subversion repository mirrored on the local disk. Remote Subversion repositories can be mirrored on local disk using the `svnsync` @@ -25,6 +25,9 @@ Subversion's repository dump format is documented in full in Files in this format can be generated using the 'svnadmin dump' or 'svk admin dump' command. +Dumps produced with 'svnadmin dump --deltas' (dumpfile format v3) +are not supported. + OUTPUT FORMAT ------------- The fast-import format is documented by the git-fast-import(1) -- 1.7.2.1.544.ga752d.dirty --
Hi Jonathan, Thanks for re-rolling (again)! You've also added a note to the commit messages briefly explaining what each contributor has done. I'd expected some incremental patches instead of a full re-roll, but We have to fix this real soon- I'm waiting for the weekend so I get some solid chunks of hacking time. -- Ram --
Yeah, I think after this series the topic has stabilized enough Thanks for checking. --
Hi, The svndiff format has proved more difficult to parse than expected. This series documents the current state of things, and though it is not complete, it should be ready for nitpicking by the masses. Patches 1-4 modify the line_buffer API by introducing a struct line_buffer to collect state that was previously held in global variables. Callers can use multiple line_buffers to manage input from multiple files at a time. Patches 5-10 add various utility functions to the line_buffer API (wrapping strbuf_fread(), fgetc(), etc). Putting the helpers there instead of having callers work with the FILE* directly means one could easily - tweak the input stream (to insert "link: " at the beginning for symlinks?); - trace reads, for debugging; or - use read() directly in place of stdio and limit the number of bytes buffered if one wants to. Patch 11 adds a data structure and function to manage a "sliding window" without using mmap() or fseek(). See the svndiff0 spec[1] for how this would be used. Patches 12 and 13 are some basic components for reading an svndiff0 file: reading variable-length integers and the opening magic bytes. Patch 15 makes the svn-fe test usable on systems (like Ram's) without libsvn-perl installed. It also should make the test easier to read for people unfamiliar with lib-git-svn.sh. Patch 16 is the delta parser/applier. This patch does _not_ add it to contrib/svn-fe, even though that would be useful, since the command-line interface is not set in stone yet. If you want to try it out, use the test-svn-fe command: test-svn-fe -d <preimage> <delta> <delta length> The preimage or delta arg can be /dev/stdin for use in a pipeline. Both are only read sequentially; they do not need to be regular files. One of the test cases is enormous. The svn delta lib doesn't use multiple windows except when dealing with relatively big files, but probably the test case should be replaced with a smaller, artificial example. One of the ...
The data stored in byte_buffer[] is always either discarded or
written to stdout immediately. No need for it to persist between
function calls.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
vcs-svn/line_buffer.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/vcs-svn/line_buffer.c b/vcs-svn/line_buffer.c
index 1543567..f22c94f 100644
--- a/vcs-svn/line_buffer.c
+++ b/vcs-svn/line_buffer.c
@@ -14,7 +14,6 @@
obj_pool_gen(blob, char, 4096)
static char line_buffer[LINE_BUFFER_LEN];
-static char byte_buffer[COPY_BUFFER_LEN];
static FILE *infile;
int buffer_init(const char *filename)
@@ -68,6 +67,7 @@ char *buffer_read_string(uint32_t len)
void buffer_copy_bytes(uint32_t len)
{
+ char byte_buffer[COPY_BUFFER_LEN];
uint32_t in;
while (len > 0 && !feof(infile) && !ferror(infile)) {
in = len < COPY_BUFFER_LEN ? len : COPY_BUFFER_LEN;
@@ -83,6 +83,7 @@ void buffer_copy_bytes(uint32_t len)
void buffer_skip_bytes(uint32_t len)
{
+ char byte_buffer[COPY_BUFFER_LEN];
uint32_t in;
while (len > 0 && !feof(infile) && !ferror(infile)) {
in = len < COPY_BUFFER_LEN ? len : COPY_BUFFER_LEN;
--
1.7.2.3
--
Prepare for the line_buffer lib to support input from multiple files,
by collecting global state in a struct that can be easily passed around.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
vcs-svn/line_buffer.c | 45 ++++++++++++++++++++++-----------------------
vcs-svn/line_buffer.h | 11 +++++++++++
2 files changed, 33 insertions(+), 23 deletions(-)
diff --git a/vcs-svn/line_buffer.c b/vcs-svn/line_buffer.c
index 6f32f28..e7bc230 100644
--- a/vcs-svn/line_buffer.c
+++ b/vcs-svn/line_buffer.c
@@ -7,17 +7,16 @@
#include "line_buffer.h"
#include "strbuf.h"
-#define LINE_BUFFER_LEN 10000
#define COPY_BUFFER_LEN 4096
-
-static char line_buffer[LINE_BUFFER_LEN];
-static struct strbuf blob_buffer = STRBUF_INIT;
-static FILE *infile;
+static struct line_buffer buf_ = LINE_BUFFER_INIT;
+static struct line_buffer *buf;
int buffer_init(const char *filename)
{
- infile = filename ? fopen(filename, "r") : stdin;
- if (!infile)
+ buf = &buf_;
+
+ buf->infile = filename ? fopen(filename, "r") : stdin;
+ if (!buf->infile)
return -1;
return 0;
}
@@ -25,10 +24,10 @@ int buffer_init(const char *filename)
int buffer_deinit(void)
{
int err;
- if (infile == stdin)
- return ferror(infile);
- err = ferror(infile);
- err |= fclose(infile);
+ if (buf->infile == stdin)
+ return ferror(buf->infile);
+ err = ferror(buf->infile);
+ err |= fclose(buf->infile);
return err;
}
@@ -36,13 +35,13 @@ int buffer_deinit(void)
char *buffer_read_line(void)
{
char *end;
- if (!fgets(line_buffer, sizeof(line_buffer), infile))
+ if (!fgets(buf->line_buffer, sizeof(buf->line_buffer), buf->infile))
/* Error or data exhausted. */
return NULL;
- end = line_buffer + strlen(line_buffer);
+ end = buf->line_buffer + strlen(buf->line_buffer);
if (end[-1] == '\n')
end[-1] = '\0';
- else if (feof(infile))
+ else if (feof(buf->infile))
; /* No newline at end of file. That's fine. */
else
/*
@@ -51,23 +50,23 @@ char ...Collect the line_buffer state in a newly public line_buffer struct.
Callers can use multiple line_buffers to manage input from multiple
files at a time.
The Subversion-format delta applier will use this to stream a delta
and the preimage it applies to at the same time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
vcs-svn/line_buffer.txt | 5 +++--
vcs-svn/fast_export.c | 6 +++---
vcs-svn/fast_export.h | 5 ++++-
vcs-svn/line_buffer.c | 20 ++++++++------------
vcs-svn/line_buffer.h | 14 +++++++-------
vcs-svn/svndump.c | 29 ++++++++++++++++-------------
test-line-buffer.c | 17 +++++++++--------
7 files changed, 50 insertions(+), 46 deletions(-)
diff --git a/vcs-svn/line_buffer.txt b/vcs-svn/line_buffer.txt
index 8906fb1..f8eaa4d 100644
--- a/vcs-svn/line_buffer.txt
+++ b/vcs-svn/line_buffer.txt
@@ -14,14 +14,15 @@ Calling sequence
The calling program:
+ - initializes a `struct line_buffer` to LINE_BUFFER_INIT
- specifies a file to read with `buffer_init`
- processes input with `buffer_read_line`, `buffer_read_string`,
`buffer_skip_bytes`, and `buffer_copy_bytes`
- closes the file with `buffer_deinit`, perhaps to start over and
read another file.
-Before exiting, the caller can use `buffer_reset` to deallocate
-resources for the benefit of profiling tools.
+When finished, the caller can use `buffer_reset` to deallocate
+resources.
Functions
---------
diff --git a/vcs-svn/fast_export.c b/vcs-svn/fast_export.c
index 6cfa256..260cf50 100644
--- a/vcs-svn/fast_export.c
+++ b/vcs-svn/fast_export.c
@@ -63,14 +63,14 @@ void fast_export_commit(uint32_t revision, uint32_t author, char *log,
printf("progress Imported commit %"PRIu32".\n\n", revision);
}
-void fast_export_blob(uint32_t mode, uint32_t mark, uint32_t len)
+void fast_export_blob(uint32_t mode, uint32_t mark, uint32_t len, struct line_buffer *input)
{
if (mode == REPO_MODE_LNK) {
/* svn symlink blobs start with "link ...Tell the caller know how many bytes were actually skipped.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
vcs-svn/line_buffer.txt | 3 ++-
vcs-svn/line_buffer.c | 15 ++++++++-------
vcs-svn/line_buffer.h | 2 +-
3 files changed, 11 insertions(+), 9 deletions(-)
diff --git a/vcs-svn/line_buffer.txt b/vcs-svn/line_buffer.txt
index f8eaa4d..d06db24 100644
--- a/vcs-svn/line_buffer.txt
+++ b/vcs-svn/line_buffer.txt
@@ -53,7 +53,8 @@ Functions
`buffer_skip_bytes`::
Discards `len` bytes from the input stream (stopping early
- if necessary because of an error or eof).
+ if necessary because of an error or eof). Return value is
+ the number of bytes successfully read.
`buffer_reset`::
Deallocates non-static buffers.
diff --git a/vcs-svn/line_buffer.c b/vcs-svn/line_buffer.c
index 806932b..999368b 100644
--- a/vcs-svn/line_buffer.c
+++ b/vcs-svn/line_buffer.c
@@ -72,15 +72,16 @@ void buffer_copy_bytes(struct line_buffer *buf, uint32_t len)
}
}
-void buffer_skip_bytes(struct line_buffer *buf, uint32_t len)
+uint32_t buffer_skip_bytes(struct line_buffer *buf, uint32_t nbytes)
{
- char byte_buffer[COPY_BUFFER_LEN];
- uint32_t in;
- while (len > 0 && !feof(buf->infile) && !ferror(buf->infile)) {
- in = len < COPY_BUFFER_LEN ? len : COPY_BUFFER_LEN;
- in = fread(byte_buffer, 1, in, buf->infile);
- len -= in;
+ uint32_t done = 0;
+ while (done < nbytes && !feof(buf->infile) && !ferror(buf->infile)) {
+ char byte_buffer[COPY_BUFFER_LEN];
+ uint32_t len = nbytes - done;
+ uint32_t in = len < COPY_BUFFER_LEN ? len : COPY_BUFFER_LEN;
+ done += fread(byte_buffer, 1, in, buf->infile);
}
+ return done;
}
void buffer_reset(struct line_buffer *buf)
diff --git a/vcs-svn/line_buffer.h b/vcs-svn/line_buffer.h
index fb37390..2796ba7 100644
--- a/vcs-svn/line_buffer.h
+++ b/vcs-svn/line_buffer.h
@@ -17,7 +17,7 @@ int buffer_deinit(struct line_buffer *buf);
char *buffer_read_line(struct line_buffer *buf);
char ...Tweak the line_buffer API to permit seeking and cat-ing segments
longer than 4 GiB. This would be particularly useful for applying
deltas that remove a large segment from the middle of a file.
Callers would still have to be updated to take advantage of this.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
Since off_t is a signed type, on systems with 32-bit file offsets,
this might make things worse. Is that worth worrying about?
vcs-svn/line_buffer.c | 8 ++++----
vcs-svn/line_buffer.h | 4 ++--
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/vcs-svn/line_buffer.c b/vcs-svn/line_buffer.c
index 999368b..fd1d3c3 100644
--- a/vcs-svn/line_buffer.c
+++ b/vcs-svn/line_buffer.c
@@ -56,7 +56,7 @@ char *buffer_read_string(struct line_buffer *buf, uint32_t len)
return ferror(buf->infile) ? NULL : buf->blob_buffer.buf;
}
-void buffer_copy_bytes(struct line_buffer *buf, uint32_t len)
+void buffer_copy_bytes(struct line_buffer *buf, off_t len)
{
char byte_buffer[COPY_BUFFER_LEN];
uint32_t in;
@@ -72,12 +72,12 @@ void buffer_copy_bytes(struct line_buffer *buf, uint32_t len)
}
}
-uint32_t buffer_skip_bytes(struct line_buffer *buf, uint32_t nbytes)
+off_t buffer_skip_bytes(struct line_buffer *buf, off_t nbytes)
{
- uint32_t done = 0;
+ off_t done = 0;
while (done < nbytes && !feof(buf->infile) && !ferror(buf->infile)) {
char byte_buffer[COPY_BUFFER_LEN];
- uint32_t len = nbytes - done;
+ off_t len = nbytes - done;
uint32_t in = len < COPY_BUFFER_LEN ? len : COPY_BUFFER_LEN;
done += fread(byte_buffer, 1, in, buf->infile);
}
diff --git a/vcs-svn/line_buffer.h b/vcs-svn/line_buffer.h
index 2796ba7..2849faa 100644
--- a/vcs-svn/line_buffer.h
+++ b/vcs-svn/line_buffer.h
@@ -16,8 +16,8 @@ int buffer_init(struct line_buffer *buf, const char *filename);
int buffer_deinit(struct line_buffer *buf);
char *buffer_read_line(struct line_buffer *buf);
char *buffer_read_string(struct line_buffer *buf, uint32_t ...buffer_read_binary() writes to a strbuf so the caller does not need
to keep track of the number of bytes read.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
vcs-svn/line_buffer.c | 6 ++++++
vcs-svn/line_buffer.h | 1 +
2 files changed, 7 insertions(+), 0 deletions(-)
diff --git a/vcs-svn/line_buffer.c b/vcs-svn/line_buffer.c
index fd1d3c3..6dd0189 100644
--- a/vcs-svn/line_buffer.c
+++ b/vcs-svn/line_buffer.c
@@ -56,6 +56,12 @@ char *buffer_read_string(struct line_buffer *buf, uint32_t len)
return ferror(buf->infile) ? NULL : buf->blob_buffer.buf;
}
+void buffer_read_binary(struct strbuf *sb, uint32_t size,
+ struct line_buffer *buf)
+{
+ strbuf_fread(sb, size, buf->infile);
+}
+
void buffer_copy_bytes(struct line_buffer *buf, off_t len)
{
char byte_buffer[COPY_BUFFER_LEN];
diff --git a/vcs-svn/line_buffer.h b/vcs-svn/line_buffer.h
index 2849faa..873b0e4 100644
--- a/vcs-svn/line_buffer.h
+++ b/vcs-svn/line_buffer.h
@@ -16,6 +16,7 @@ int buffer_init(struct line_buffer *buf, const char *filename);
int buffer_deinit(struct line_buffer *buf);
char *buffer_read_line(struct line_buffer *buf);
char *buffer_read_string(struct line_buffer *buf, uint32_t len);
+void buffer_read_binary(struct strbuf *sb, uint32_t len, struct line_buffer *f);
void buffer_copy_bytes(struct line_buffer *buf, off_t len);
off_t buffer_skip_bytes(struct line_buffer *buf, off_t len);
void buffer_reset(struct line_buffer *buf);
--
1.7.2.3
--
The buffer_at_eof() function returns 1 if and only if all input from
the input stream has been exhausted (because of EOF or error). The
implementation calls fgetc() followed by ungetc() to force an EOF
condition when there is no more input remaining.
Like many functions in the line_buffer API, this function is not
thread-safe. It could be made to be so with a mutex if needed.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
vcs-svn/line_buffer.c | 10 ++++++++++
vcs-svn/line_buffer.h | 1 +
2 files changed, 11 insertions(+), 0 deletions(-)
diff --git a/vcs-svn/line_buffer.c b/vcs-svn/line_buffer.c
index 6dd0189..19caa21 100644
--- a/vcs-svn/line_buffer.c
+++ b/vcs-svn/line_buffer.c
@@ -27,6 +27,16 @@ int buffer_deinit(struct line_buffer *buf)
return err;
}
+int buffer_at_eof(struct line_buffer *buf)
+{
+ int ch;
+ if ((ch = fgetc(buf->infile)) == EOF)
+ return 1;
+ if (ungetc(ch, buf->infile) == EOF)
+ return error("cannot unget %c: %s\n", ch, strerror(errno));
+ return 0;
+}
+
/* Read a line without trailing newline. */
char *buffer_read_line(struct line_buffer *buf)
{
diff --git a/vcs-svn/line_buffer.h b/vcs-svn/line_buffer.h
index 873b0e4..0269aed 100644
--- a/vcs-svn/line_buffer.h
+++ b/vcs-svn/line_buffer.h
@@ -14,6 +14,7 @@ struct line_buffer {
int buffer_init(struct line_buffer *buf, const char *filename);
int buffer_deinit(struct line_buffer *buf);
+int buffer_at_eof(struct line_buffer *buf);
char *buffer_read_line(struct line_buffer *buf);
char *buffer_read_string(struct line_buffer *buf, uint32_t len);
void buffer_read_binary(struct strbuf *sb, uint32_t len, struct line_buffer *f);
--
1.7.2.3
--
Add a buffer_ferror() function to read the error flag from the input
stream, so callers can do:
some_error_prone_operation(f, ...);
if (buffer_ferror(f))
return error("input error: %s", strerror(errno));
instead of waiting until it is time to close the file.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
vcs-svn/line_buffer.c | 5 +++++
vcs-svn/line_buffer.h | 1 +
2 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/vcs-svn/line_buffer.c b/vcs-svn/line_buffer.c
index 19caa21..43da509 100644
--- a/vcs-svn/line_buffer.c
+++ b/vcs-svn/line_buffer.c
@@ -27,6 +27,11 @@ int buffer_deinit(struct line_buffer *buf)
return err;
}
+int buffer_ferror(struct line_buffer *buf)
+{
+ return ferror(buf->infile);
+}
+
int buffer_at_eof(struct line_buffer *buf)
{
int ch;
diff --git a/vcs-svn/line_buffer.h b/vcs-svn/line_buffer.h
index 0269aed..4899289 100644
--- a/vcs-svn/line_buffer.h
+++ b/vcs-svn/line_buffer.h
@@ -14,6 +14,7 @@ struct line_buffer {
int buffer_init(struct line_buffer *buf, const char *filename);
int buffer_deinit(struct line_buffer *buf);
+int buffer_ferror(struct line_buffer *buf);
int buffer_at_eof(struct line_buffer *buf);
char *buffer_read_line(struct line_buffer *buf);
char *buffer_read_string(struct line_buffer *buf, uint32_t len);
--
1.7.2.3
--
buffer_read_char() can be used in place of buffer_read_string(1)
to avoid consuming valuable static buffer space. The delta applier
will use this to read variable-length integers one byte at a time.
Underneath, it is fgetc(), wrapped so the line_buffer library can
maintain its role as gatekeeper of input.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
vcs-svn/line_buffer.c | 5 +++++
vcs-svn/line_buffer.h | 1 +
2 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/vcs-svn/line_buffer.c b/vcs-svn/line_buffer.c
index 43da509..c54031b 100644
--- a/vcs-svn/line_buffer.c
+++ b/vcs-svn/line_buffer.c
@@ -42,6 +42,11 @@ int buffer_at_eof(struct line_buffer *buf)
return 0;
}
+int buffer_read_char(struct line_buffer *buf)
+{
+ return fgetc(buf->infile);
+}
+
/* Read a line without trailing newline. */
char *buffer_read_line(struct line_buffer *buf)
{
diff --git a/vcs-svn/line_buffer.h b/vcs-svn/line_buffer.h
index 4899289..2375ee1 100644
--- a/vcs-svn/line_buffer.h
+++ b/vcs-svn/line_buffer.h
@@ -18,6 +18,7 @@ int buffer_ferror(struct line_buffer *buf);
int buffer_at_eof(struct line_buffer *buf);
char *buffer_read_line(struct line_buffer *buf);
char *buffer_read_string(struct line_buffer *buf, uint32_t len);
+int buffer_read_char(struct line_buffer *buf);
void buffer_read_binary(struct strbuf *sb, uint32_t len, struct line_buffer *f);
void buffer_copy_bytes(struct line_buffer *buf, off_t len);
off_t buffer_skip_bytes(struct line_buffer *buf, off_t len);
--
1.7.2.3
--
Subversion's delta format has the convenient property that applying
each section of the delta only requires examining (and keeping in
memory) a small portion of the preimage. At any moment, this portion
begins at a well-defined file offset and has a well-defined length,
and as the delta is applied, it moves from the beginning to the end
of the file. Add a move_window() function to keep track of such a
window into a file.
You can use it like this:
struct line_buffer preimage = LINE_BUFFER_INIT;
buffer_init(&preimage, NULL);
struct view window = {&preimage, 0, STRBUF_INIT};
move_window(&window, 3, 7); /* (1) */
move_window(&window, 5, 5); /* (2) */
move_window(&window, 12, 2); /* (3) */
strbuf_release(&window.buf);
buffer_deinit(&preimage);
In this example: (1) reads 10 bytes and discards the first 3;
(2) discards the first 2, which are not needed any more; and (3)
skips 2 bytes and reads 2 new bytes to work with.
Whenever move_window() returns, the file position indicator is at
position window->off + window->buf.len and the data from positions
window->off to the current file position are stored in window->buf.
This function does only sequential access and never seeks, so it
can be safely used on pipes and sockets.
On end-of-file, move_window() just silently reads less than the
caller requested. On other errors, it prints a message to stderr
and returns -1.
Helped-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
Makefile | 5 ++-
vcs-svn/sliding_window.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++
vcs-svn/sliding_window.h | 14 ++++++++++
vcs-svn/LICENSE | 2 +
4 files changed, 84 insertions(+), 2 deletions(-)
create mode 100644 vcs-svn/sliding_window.c
create mode 100644 vcs-svn/sliding_window.h
diff --git a/Makefile b/Makefile
index 1f1ce04..d99da33 100644
--- a/Makefile
+++ b/Makefile
@@ -1765,7 +1765,8 @@ endif
XDIFF_OBJS = xdiff/xdiffi.o ...The humble beginnings of the svn-format delta applier. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- Maybe this should be squashed with patch 16. Ideas for eliminating the code duplication? vcs-svn/svndiff.c | 59 +++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 files changed, 59 insertions(+), 0 deletions(-) create mode 100644 vcs-svn/svndiff.c diff --git a/vcs-svn/svndiff.c b/vcs-svn/svndiff.c new file mode 100644 index 0000000..4d122a5 --- /dev/null +++ b/vcs-svn/svndiff.c @@ -0,0 +1,59 @@ +/* + * Licensed under a two-clause BSD-style license. + * See LICENSE for details. + */ + +#include "git-compat-util.h" +#include "line_buffer.h" + +/* + * svndiff0 applier + * + * See http://svn.apache.org/repos/asf/subversion/trunk/notes/svndiff. + * + * int ::= highdigit* lowdigit; + * highdigit ::= # binary 1000 0000 OR-ed with 7 bit value; + * lowdigit ::= # 7 bit value; + */ + +#define VLI_CONTINUE 0x80 +#define VLI_DIGIT_MASK 0x7f +#define VLI_BITS_PER_DIGIT 7 + +static int read_int(struct line_buffer *in, uintmax_t *result, off_t *len) +{ + off_t sz = *len; + uintmax_t rv = 0; + while (sz) { + int ch = buffer_read_char(in); + if (ch == EOF) + break; + sz--; + rv <<= VLI_BITS_PER_DIGIT; + rv += (ch & VLI_DIGIT_MASK); + if (!(ch & VLI_CONTINUE)) { + *result = rv; + *len = sz; + return 0; + } + } + return error("Invalid delta: incomplete integer %"PRIuMAX, rv); +} + +static int parse_int(const char **buf, size_t *result, const char *end) +{ + const char *pos; + size_t rv = 0; + for (pos = *buf; pos != end; pos++) { + unsigned char ch = *pos; + rv <<= VLI_BITS_PER_DIGIT; + rv += (ch & VLI_DIGIT_MASK); + if (!(ch & VLI_CONTINUE)) { + *result = rv; + *buf = pos + 1; + return 0; + } + } + return error("Invalid instruction: incomplete integer %"PRIu64, + (uint64_t) rv); +} -- 1.7.2.3 --
The magic number of svn deltas is SVN followed by a null byte. An alternative format (with compressed text) uses magic number SVN\1, but that is deprecated in favor of compressing the deltas as a whole as far as I can tell. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- vcs-svn/svndiff.c | 18 ++++++++++++++++++ 1 files changed, 18 insertions(+), 0 deletions(-) diff --git a/vcs-svn/svndiff.c b/vcs-svn/svndiff.c index 4d122a5..df0b1a2 100644 --- a/vcs-svn/svndiff.c +++ b/vcs-svn/svndiff.c @@ -11,6 +11,7 @@ * * See http://svn.apache.org/repos/asf/subversion/trunk/notes/svndiff. * + * svndiff0 ::= 'SVN\0' window window*; * int ::= highdigit* lowdigit; * highdigit ::= # binary 1000 0000 OR-ed with 7 bit value; * lowdigit ::= # 7 bit value; @@ -20,6 +21,23 @@ #define VLI_DIGIT_MASK 0x7f #define VLI_BITS_PER_DIGIT 7 +static int read_magic(struct line_buffer *in, off_t *len) +{ + static const char magic[] = {'S', 'V', 'N', '\0'}; + struct strbuf sb = STRBUF_INIT; + if (*len < sizeof(magic)) + return error("Invalid delta: no file type header"); + buffer_read_binary(&sb, sizeof(magic), in); + if (sb.len != sizeof(magic)) + return error("Invalid delta: no file type header"); + if (memcmp(sb.buf, magic, sizeof(magic))) + return error("Unrecognized file type %.*s", + (int) sizeof(magic), sb.buf); + *len -= sizeof(magic); + strbuf_release(&sb); + return 0; +} + static int read_int(struct line_buffer *in, uintmax_t *result, off_t *len) { off_t sz = *len; -- 1.7.2.3 --
The idiom (a + b < a) works fine for detecting that an unsigned
integer has overflowed, but the more explicit
unsigned_add_overflows(a, b)
might be easier to read.
Define such a macro, expanding roughly to ((a) < UINT_MAX - (b)).
Because the expansion uses each argument only once outside of sizeof()
expressions, it is safe to use this macro with arguments that have
side-effects.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
git-compat-util.h | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/git-compat-util.h b/git-compat-util.h
index 2af8d3e..817f045 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -31,6 +31,9 @@
#define maximum_signed_value_of_type(a) \
(INTMAX_MAX >> (bitsizeof(intmax_t) - bitsizeof(a)))
+#define maximum_unsigned_value_of_type(a) \
+ (UINTMAX_MAX >> (bitsizeof(uintmax_t) - bitsizeof(a)))
+
/*
* Signed integer overflow is undefined in C, so here's a helper macro
* to detect if the sum of two integers will overflow.
@@ -40,6 +43,9 @@
#define signed_add_overflows(a, b) \
((b) > maximum_signed_value_of_type(a) - (a))
+#define unsigned_add_overflows(a, b) \
+ ((b) > maximum_unsigned_value_of_type(a) - (a))
+
#ifdef __GNUC__
#define TYPEOF(x) (__typeof__(x))
#else
--
1.7.2.3
--
From: Ramkumar Ramachandra <artagnon@gmail.com>
The svn-fe test script only requires git and the svn command-line
tools. Make these tests easier to read and run by not using the perl
libsvn bindings and instead duplicating only the relevant code from
lib-git-svn.sh.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
t/t9010-svn-fe.sh | 14 ++++++++++++--
1 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/t/t9010-svn-fe.sh b/t/t9010-svn-fe.sh
index a713dfc..fd851a4 100755
--- a/t/t9010-svn-fe.sh
+++ b/t/t9010-svn-fe.sh
@@ -2,9 +2,19 @@
test_description='check svn dumpfile importer'
-. ./lib-git-svn.sh
+. ./test-lib.sh
-test_dump() {
+svnconf=$PWD/svnconf
+export svnconf
+
+svn_cmd () {
+ subcommand=$1 &&
+ shift &&
+ mkdir -p "$svnconf" &&
+ svn "$subcommand" --config-dir "$svnconf" "$@"
+}
+
+test_dump () {
label=$1
dump=$2
test_expect_success "$dump" '
--
1.7.2.3
--
The buffer_read_string() function returns a temporary string of
size specified by the caller. It currently uses an obj_pool to
store the return value, but that is overkill: all we need is a
buffer that can grow between requests to accomodate larger
strings.
Use a strbuf instead.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
[Resent after messing up the message header; sorry for the noise.]
vcs-svn/line_buffer.c | 16 ++++++----------
1 files changed, 6 insertions(+), 10 deletions(-)
diff --git a/vcs-svn/line_buffer.c b/vcs-svn/line_buffer.c
index f22c94f..6f32f28 100644
--- a/vcs-svn/line_buffer.c
+++ b/vcs-svn/line_buffer.c
@@ -5,15 +5,13 @@
#include "git-compat-util.h"
#include "line_buffer.h"
-#include "obj_pool.h"
+#include "strbuf.h"
#define LINE_BUFFER_LEN 10000
#define COPY_BUFFER_LEN 4096
-/* Create memory pool for char sequence of known length */
-obj_pool_gen(blob, char, 4096)
-
static char line_buffer[LINE_BUFFER_LEN];
+static struct strbuf blob_buffer = STRBUF_INIT;
static FILE *infile;
int buffer_init(const char *filename)
@@ -58,11 +56,9 @@ char *buffer_read_line(void)
char *buffer_read_string(uint32_t len)
{
- char *s;
- blob_free(blob_pool.size);
- s = blob_pointer(blob_alloc(len + 1));
- s[fread(s, 1, len, infile)] = '\0';
- return ferror(infile) ? NULL : s;
+ strbuf_reset(&blob_buffer);
+ strbuf_fread(&blob_buffer, len, infile);
+ return ferror(infile) ? NULL : blob_buffer.buf;
}
void buffer_copy_bytes(uint32_t len)
@@ -94,5 +90,5 @@ void buffer_skip_bytes(uint32_t len)
void buffer_reset(void)
{
- blob_reset();
+ strbuf_release(&blob_buffer);
}
--
1.7.2.3
--
Implement an svndiff 0 interpreter, for use by the dumpfilev3 importer. It is slower than it needs to be (e.g., it does not use fseek() on input) for simplicity. This is based only on the spec and not Subversion's implementation of the svndiff0 spec. The tests come from various deltas encountered in importing the Apache SVN repo. The svndiff0 semantics are not completely documented, meaning that some of this work had to be done by guesswork. It is not complete. This version of the patch omits the enormous Xerces.cpp test (which is not so interesting because it passes, anyway). Helped-by: Ramkumar Ramachandra <artagnon@gmail.com> Helped-by: Thomas Rast <trast@student.ethz.ch> Helped-by: David Barr <david.barr@cordelta.com> Not-signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- [alternate ending, for those who do not like reading 3-MiB messages - sorry about that] Still not signed off because I haven't checked the copyright of the tests. I would prefer tiny deltas or deltas of some public-domain work (e.g., drafts of foundation documents of some country). This patch owes a great deal to David and Ram, probably more than it owes to me. David made lots and lots of fixes. Ram introduced the tests and helped with design. Thomas provided an early sanity check for code clarity. Thanks for reading. Makefile | 4 +- t/t9010-svn-fe.sh | 14 ++ t/t9010/newdata.diff0 | Bin 0 -> 19392 bytes t/t9010/newdata.done | 522 +++++++++++++++++++++++++++++++++++++++++++++++++ t/t9010/src.diff0 | Bin 0 -> 74 bytes t/t9010/src.done | 522 +++++++++++++++++++++++++++++++++++++++++++++++++ test-svn-fe.c | 37 +++- vcs-svn/svndiff.c | 265 +++++++++++++++++++++++++ vcs-svn/svndiff.h | 9 + 9 files changed, 1364 insertions(+), 9 deletions(-) create mode 100644 t/t9010/blank.done create mode 100644 t/t9010/newdata.diff0 create mode 100644 t/t9010/newdata.done create mode 100644 t/t9010/src.diff0 create ...
I hear that was nigh unreadable, so here's a reroll. Less cargo-cult support for broken deltas, more readability and tests. Patches apply on top of "[PATCH 15/16] t9010 (svn-fe): Eliminate dependency on svn perl bindings". As before, the end result includes a 'test-svn-fe -d' command that can apply svndiff0-format deltas, meaning less binary garbage to worry about as you puzzle over that confusing "svnrdump dump" output in debugging sessions. Questions? Improvements? Bugs? Patch 1 is a fixup to the variable-length integer parsing code, to report unexpected EOF (i.e., declared content length too long) correctly when it occurs in the middle of such an integer. Patch 2 is the svndiff0 interpreter in broad strokes: read window, read window, read window, .... The patch doesn't encode any knowledge about what actually goes _in_ a window aside from the header, so it will error out for nonempty windows. Patch 3 teaches the nacent interpreter to keep the appropriate piece of the preimage in memory. This is probably earlier in the series than it ought to be, but I wanted to try out the sliding window code. With patches 4 and 5, the interpreter learns to read the "data" and "instructions" section of a window. The effect is observable because it finds the beginning of the next window correctly. Patch 6 is an example instruction (copyfrom_data). Patches 7-8 introduce some sanity checks. Patches 9 and 10 are another instruction (copyfrom_target) and another sanity check. Patch 11 is the last instruction (copyfrom_source). That's it. You can apply deltas now! If anything seems unclear, please don't spend time puzzling it out --- just yell at me, so the code or documentation can be cleaned up. Happy reading. --
Report EOF correctly in integer parsing code.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
This patch is meant for squashing.
vcs-svn/svndiff.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/vcs-svn/svndiff.c b/vcs-svn/svndiff.c
index df0b1a2..36d2b30 100644
--- a/vcs-svn/svndiff.c
+++ b/vcs-svn/svndiff.c
@@ -45,7 +45,8 @@ static int read_int(struct line_buffer *in, uintmax_t *result, off_t *len)
while (sz) {
int ch = buffer_read_char(in);
if (ch == EOF)
- break;
+ return error("Delta ends early (%"PRIu64" bytes remaining)",
+ (uint64_t) sz);
sz--;
rv <<= VLI_BITS_PER_DIGIT;
rv += (ch & VLI_DIGIT_MASK);
--
1.7.2.3
--
A delta in the subversion delta (svndiff0) format consists of the magic bytes SVN\0 followed by a sequence of windows, each beginning with a window header consisting of five integers (with variable-length representation): source view offset source view length output length instructions length auxiliary data length Add an svndiff0_apply() function and test-svn-fe -d commandline tool to parse such a delta in the special case of not including any instructions or auxiliary data. Later patches will add features to turn this into a fully functional delta applier, for use by svn-fe in parsing the streams produced by "svnrdump dump" and "svnadmin dump --deltas". Helped-by: Ramkumar Ramachandra <artagnon@gmail.com> Helped-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- Makefile | 4 +- t/t9011-svn-da.sh | 82 +++++++++++++++++++++++++++++++++++++++++++++++++++++ test-svn-fe.c | 39 ++++++++++++++++++++---- vcs-svn/svndiff.c | 64 +++++++++++++++++++++++++++++++++++++++++ vcs-svn/svndiff.h | 9 ++++++ 5 files changed, 189 insertions(+), 9 deletions(-) create mode 100755 t/t9011-svn-da.sh create mode 100644 vcs-svn/svndiff.h diff --git a/Makefile b/Makefile index d99da33..966f5c7 100644 --- a/Makefile +++ b/Makefile @@ -1766,7 +1766,7 @@ XDIFF_OBJS = xdiff/xdiffi.o xdiff/xprepare.o xdiff/xutils.o xdiff/xemit.o \ xdiff/xmerge.o xdiff/xpatience.o VCSSVN_OBJS = vcs-svn/string_pool.o vcs-svn/line_buffer.o \ vcs-svn/repo_tree.o vcs-svn/fast_export.o vcs-svn/svndump.o \ - vcs-svn/sliding_window.o + vcs-svn/sliding_window.o vcs-svn/svndiff.o OBJECTS := $(GIT_OBJS) $(XDIFF_OBJS) $(VCSSVN_OBJS) dep_files := $(foreach f,$(OBJECTS),$(dir $f).depend/$(notdir $f).d) @@ -1893,7 +1893,7 @@ xdiff-interface.o $(XDIFF_OBJS): \ $(VCSSVN_OBJS): \ vcs-svn/obj_pool.h vcs-svn/trp.h vcs-svn/string_pool.h \ vcs-svn/line_buffer.h vcs-svn/repo_tree.h vcs-svn/fast_export.h \ - vcs-svn/svndump.h ...
The source view offset heading each svndiff0 window represents a number of bytes past the beginning of the preimage. Together with the source view length, it instructs the delta applier about what portion of the preimage instructions will refer to. Read in that data right away using the sliding window code. Maybe some day we will mmap() to prepare to read data more lazily. For compatibility with Subversion's implementation, tolerate source view offsets pointing past the end of the preimage file (a later patch will remove this flexibility). For simplicity, also permit source views that start within the preimage and end outside of it, even though Subversion does not. This does not teach the delta applier to read instructions or copy data from the source view yet. Deltas that would produce nonempty output are still rejected. Helped-by: Ramkumar Ramachandra <artagnon@gmail.com> Helped-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- It occurs to me that Sam Vilain may well have something valuable to say about this series, having implemented something similar[1]. Sam, this series adds an svndiff0 parser for git to use in parsing v3 dumps (which are way easier to produce with remote access to an svn repository than v2 dumps). The beginning of the series is at [2], though that cover letter is out of date: now, modulo any new bugs I've introduced with this reroll, it is known to successfully apply all the deltas involved in a complete dump of the ASF repo. I am interested in improvements and complaints of all kinds. [1] http://search.cpan.org/~samv/Parse-SVNDiff-0.03/lib/Parse/SVNDiff.pm [2] http://thread.gmane.org/gmane.comp.version-control.git/151086/focus=158731 t/t9011-svn-da.sh | 38 ++++++++++++++++++++++++++++++++++++++ vcs-svn/svndiff.c | 22 +++++++++++++++------- 2 files changed, 53 insertions(+), 7 deletions(-) diff --git a/t/t9011-svn-da.sh b/t/t9011-svn-da.sh index 8dccd16..b9aad70 100755 --- ...
All I did was make the module lazy - version 0.02 was by 唐鳳, any detailed knowledge I had of the binary format fell out of my head pretty quickly :-) Sam --
Ah, my bad. 唐鳳, as part of an attempt to natively support fetching from and pushing to svn repositories, some contributors to the git project are working on an svndiff0 applier. If you're interested in reliving old memories, please feel free to look it over (especially the test cases). Thoughts, simplifications, bug reports, improvements welcome. http://thread.gmane.org/gmane.comp.version-control.git/151086/focus=158913 Anyone wanting to try it can check out git://repo.or.cz/git/jrn.git svn-da and use the test-svn-fe command: make test-svn-fe ./test-svn-fe -d <preimage> <delta> <delta length> or the tests: make cd t && sh t9011-svn-da.sh -v -i The preimage or delta argument can be /dev/stdin for use in a pipeline. Thanks for your work on svk and pugs! Jonathan --
Each window of an svndiff0-format delta includes a section for new data that will be copied into the preimage (in the order it appears in the window, possibly interspersed with other data). Read this data when encountering it. It is not actually necessary to do so --- it would be just as easy to copy straight from the delta to output when interpreting the relevant instructions --- but this way, the code that interprets svndiff0 instructions can proceed more quickly because it does not require any I/O. Subversion's implementation rejects deltas that do not consume all the auxiliary data that is available. Do not check that for now, because it would make it impossible to test the function of this patch until the instructions to consume data are implemented. Do check for truncated data sections. Since Subversion's applier rejects deltas that end before the new-data section is declared to end, it should be safe for this applier to reject such deltas, too. Helped-by: Ramkumar Ramachandra <artagnon@gmail.com> Helped-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- t/t9011-svn-da.sh | 12 ++++++++++++ vcs-svn/svndiff.c | 27 ++++++++++++++++++++++++--- 2 files changed, 36 insertions(+), 3 deletions(-) diff --git a/t/t9011-svn-da.sh b/t/t9011-svn-da.sh index b9aad70..44832b0 100755 --- a/t/t9011-svn-da.sh +++ b/t/t9011-svn-da.sh @@ -117,4 +117,16 @@ test_expect_success 'preimage view: accept truncated preimage' ' test_cmp empty actual.longread ' +test_expect_success 'inline data' ' + printf "SVNQ%b%s%b%s" "QQQQ\003" "bar" "QQQQ\001" "x" | + q_to_nul >inline.clear && + test-svn-fe -d preimage inline.clear 18 >actual && + test_cmp empty actual +' + +test_expect_success 'truncated inline data' ' + printf "SVNQ%b%s" "QQQQ\003" "b" | q_to_nul >inline.trunc && + test_must_fail test-svn-fe -d preimage inline.trunc 10 +' + test_done diff --git a/vcs-svn/svndiff.c b/vcs-svn/svndiff.c index f2876b3..c60d732 ...
Buffer the instruction section upon encountering it for later
interpretation.
An alternative design would involve parsing the instructions
at this point and buffering them in some processed form. Using
the unprocessed form is simpler.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
t/t9011-svn-da.sh | 5 +++++
vcs-svn/svndiff.c | 23 ++++++++++++++++++-----
2 files changed, 23 insertions(+), 5 deletions(-)
diff --git a/t/t9011-svn-da.sh b/t/t9011-svn-da.sh
index 44832b0..1383263 100755
--- a/t/t9011-svn-da.sh
+++ b/t/t9011-svn-da.sh
@@ -129,4 +129,9 @@ test_expect_success 'truncated inline data' '
test_must_fail test-svn-fe -d preimage inline.trunc 10
'
+test_expect_success 'truncated inline data (after instruction section)' '
+ printf "SVNQ%b%b%s" "QQ\001\001\003" "\0201" "b" | q_to_nul >insn.trunc &&
+ test_must_fail test-svn-fe -d preimage insn.trunc 11
+'
+
test_done
diff --git a/vcs-svn/svndiff.c b/vcs-svn/svndiff.c
index c60d732..72fe716 100644
--- a/vcs-svn/svndiff.c
+++ b/vcs-svn/svndiff.c
@@ -24,6 +24,7 @@
#define VLI_BITS_PER_DIGIT 7
struct window {
+ struct strbuf instructions;
struct strbuf data;
};
@@ -119,7 +120,7 @@ static int read_chunk(struct line_buffer *delta, off_t *delta_len,
static int apply_one_window(struct line_buffer *delta, off_t *delta_len)
{
- struct window ctx = {STRBUF_INIT};
+ struct window ctx = {STRBUF_INIT, STRBUF_INIT};
size_t out_len;
size_t instructions_len;
size_t data_len;
@@ -131,13 +132,25 @@ static int apply_one_window(struct line_buffer *delta, off_t *delta_len)
read_length(delta, &instructions_len, delta_len) ||
read_length(delta, &data_len, delta_len))
return -1;
+ if (read_chunk(delta, delta_len, &ctx.instructions, instructions_len))
+ return error("Invalid delta: incomplete instructions section");
+ if (buffer_ferror(delta)) {
+ rv = error("Cannot read delta: %s", strerror(errno));
+ goto done;
+ }
+ if (read_chunk(delta, delta_len, &ctx.data, ...The copyfrom_data instruction copies a few bytes verbatim from the auxiliary data section of a window to the postimage. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- t/t9011-svn-da.sh | 31 +++++++++++++++++++ vcs-svn/svndiff.c | 86 +++++++++++++++++++++++++++++++++++++++++++++++++--- 2 files changed, 112 insertions(+), 5 deletions(-) diff --git a/t/t9011-svn-da.sh b/t/t9011-svn-da.sh index 1383263..9279924 100755 --- a/t/t9011-svn-da.sh +++ b/t/t9011-svn-da.sh @@ -134,4 +134,35 @@ test_expect_success 'truncated inline data (after instruction section)' ' test_must_fail test-svn-fe -d preimage insn.trunc 11 ' +test_expect_success 'copyfrom_data' ' + echo hi >expect && + printf "SVNQ%b%b%b" "QQ\003\001\003" "\0203" "hi\n" | q_to_nul >copydat && + test-svn-fe -d preimage copydat 13 >actual && + test_cmp expect actual +' + +test_expect_success 'multiple copyfrom_data' ' + echo hi >expect && + printf "SVNQ%b%b%b%b%b" "QQ\003\002\003" "\0201\0202" "hi\n" \ + "QQQ\002Q" "\0200Q" | q_to_nul >copy.multi && + len=$(wc -c <copy.multi) && + test-svn-fe -d preimage copy.multi $len >actual && + test_cmp expect actual +' + +test_expect_success 'incomplete multiple insn' ' + printf "SVNQ%b%b%b" "QQ\003\002\003" "\0203\0200" "hi\n" | + q_to_nul >copy.partial && + len=$(wc -c <copy.partial) && + test_must_fail test-svn-fe -d preimage copy.partial $len +' + +test_expect_success 'catch attempt to copy missing data' ' + printf "SVNQ%b%b%s%b%s" "QQ\002\002\001" "\0201\0201" "X" \ + "QQQQ\002" "YZ" | + q_to_nul >copy.incomplete && + len=$(wc -c <copy.incomplete) && + test_must_fail test-svn-fe -d preimage copy.incomplete $len +' + test_done diff --git a/vcs-svn/svndiff.c b/vcs-svn/svndiff.c index 72fe716..ac776e0 100644 --- a/vcs-svn/svndiff.c +++ b/vcs-svn/svndiff.c @@ -14,20 +14,40 @@ * * svndiff0 ::= 'SVN\0' window window*; * window ::= int int int int int instructions inline_data; + * instructions ::= instruction*; + * instruction ::= ...
Check that the declared output size for each window is correct, and
reserve that amount of space in the output buffer in advance.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
vcs-svn/svndiff.c | 5 +++++
1 files changed, 5 insertions(+), 0 deletions(-)
diff --git a/vcs-svn/svndiff.c b/vcs-svn/svndiff.c
index ac776e0..c03cd7e 100644
--- a/vcs-svn/svndiff.c
+++ b/vcs-svn/svndiff.c
@@ -220,10 +220,15 @@ static int apply_one_window(struct line_buffer *delta, off_t *delta_len,
rv = error("Cannot read delta: %s", strerror(errno));
goto done;
}
+ strbuf_grow(&ctx.out, out_len);
if (apply_window_in_core(&ctx) || write_strbuf(&ctx.out, out)) {
rv = -1;
goto done;
}
+ if (ctx.out.len != out_len) {
+ rv = error("Invalid delta: incorrect postimage length");
+ goto done;
+ }
done:
strbuf_release(&ctx.data);
strbuf_release(&ctx.instructions);
--
1.7.2.3
--
The main point is to constrain the format of deltas more, so corruption and other breakage can be more easily detected. Requiring deltas not to provide unconsumed data also opens the possibility of ignoring the declared amount of new data and simply streaming the data as needed to fulfill copyfrom_data requests. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- t/t9011-svn-da.sh | 12 +++--------- vcs-svn/svndiff.c | 2 ++ 2 files changed, 5 insertions(+), 9 deletions(-) diff --git a/t/t9011-svn-da.sh b/t/t9011-svn-da.sh index 9279924..c9f4768 100755 --- a/t/t9011-svn-da.sh +++ b/t/t9011-svn-da.sh @@ -117,20 +117,14 @@ test_expect_success 'preimage view: accept truncated preimage' ' test_cmp empty actual.longread ' -test_expect_success 'inline data' ' +test_expect_success 'unconsumed inline data' ' printf "SVNQ%b%s%b%s" "QQQQ\003" "bar" "QQQQ\001" "x" | q_to_nul >inline.clear && - test-svn-fe -d preimage inline.clear 18 >actual && - test_cmp empty actual + test_must_fail test-svn-fe -d preimage inline.clear 18 >actual ' test_expect_success 'truncated inline data' ' - printf "SVNQ%b%s" "QQQQ\003" "b" | q_to_nul >inline.trunc && - test_must_fail test-svn-fe -d preimage inline.trunc 10 -' - -test_expect_success 'truncated inline data (after instruction section)' ' - printf "SVNQ%b%b%s" "QQ\001\001\003" "\0201" "b" | q_to_nul >insn.trunc && + printf "SVNQ%b%b%s" "QQ\003\001\003" "\0203" "b" | q_to_nul >insn.trunc && test_must_fail test-svn-fe -d preimage insn.trunc 11 ' diff --git a/vcs-svn/svndiff.c b/vcs-svn/svndiff.c index c03cd7e..8755c83 100644 --- a/vcs-svn/svndiff.c +++ b/vcs-svn/svndiff.c @@ -188,6 +188,8 @@ static int apply_window_in_core(struct window *ctx) while (insn != ctx->instructions.buf + ctx->instructions.len) if (step(ctx, &insn, &data_pos)) return -1; + if (data_pos != ctx->data.len) + return error("Invalid delta: does not copy all new data"); return 0; } -- 1.7.2.3 --
The copyfrom_target instruction copies appends data that is already present in the current output view to the end of output. (The offset argument is relative to the beginning of output produced by the current window.) The region copied is allowed to run past the end of the existing output. To support that case, copy one character at a time rather than using memcpy() or memmove(). This allows copyfrom_target to be used once to repeat a string many times. For example: COPYFROM_DATA 2 COPYFROM_OUTPUT 10, 0 DATA "ab" would produce the output "ababababababababababab". Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- t/t9011-svn-da.sh | 42 ++++++++++++++++++++++++++++++++++++++++++ vcs-svn/svndiff.c | 30 ++++++++++++++++++++++++++++-- 2 files changed, 70 insertions(+), 2 deletions(-) diff --git a/t/t9011-svn-da.sh b/t/t9011-svn-da.sh index c9f4768..ccd31e9 100755 --- a/t/t9011-svn-da.sh +++ b/t/t9011-svn-da.sh @@ -159,4 +159,46 @@ test_expect_success 'catch attempt to copy missing data' ' test_must_fail test-svn-fe -d preimage copy.incomplete $len ' +test_expect_success 'copyfrom target to repeat data' ' + printf foofoo >expect && + printf "SVNQ%b%b%s" "QQ\006\004\003" "\0203\0100\003Q" "foo" | + q_to_nul >copytarget.repeat && + len=$(wc -c <copytarget.repeat) && + test-svn-fe -d preimage copytarget.repeat $len >actual && + test_cmp expect actual +' + +test_expect_success 'copyfrom target out of order' ' + printf foooof >expect && + printf "SVNQ%b%b%s" \ + "QQ\006\007\003" "\0203\0101\002\0101\001\0101Q" "foo" | + q_to_nul >copytarget.reverse && + len=$(wc -c <copytarget.reverse) && + test-svn-fe -d preimage copytarget.reverse $len >actual && + test_cmp expect actual +' + +test_expect_success 'catch copyfrom future' ' + printf "SVNQ%b%b%s" "QQ\004\004\003" "\0202\0101\002\0201" "XYZ" | + q_to_nul >copytarget.infuture && + len=$(wc -c <copytarget.infuture) && + test_must_fail test-svn-fe -d preimage copytarget.infuture ...
Some particularly strange deltas of unknown origin were found to request copies beyond the end of the preimage. But svn 1.6 never produces anything like that. Although Subversion accepts these perverse deltas as input, let's error out if some future version of subversion starts to actually produce them. Without this change, the diff applier would have to separately keep track of the number of bytes supposedly and actually written out. Helped-by: Ramkumar Ramachandra <artagnon@gmail.com> Helped-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- t/t9011-svn-da.sh | 11 ++++------- vcs-svn/sliding_window.c | 10 ++++++---- 2 files changed, 10 insertions(+), 11 deletions(-) diff --git a/t/t9011-svn-da.sh b/t/t9011-svn-da.sh index ccd31e9..c4bd1f3 100755 --- a/t/t9011-svn-da.sh +++ b/t/t9011-svn-da.sh @@ -105,16 +105,13 @@ test_expect_success 'preimage view: offsets compared by value' ' test_cmp empty actual ' -test_expect_success 'preimage view: accept truncated preimage' ' +test_expect_success 'preimage view: reject truncated preimage' ' printf "SVNQ%b" "\010QQQQ" | q_to_nul >clear.lateemptyread && printf "SVNQ%b" "\010\001QQQ" | q_to_nul >clear.latenonemptyread && printf "SVNQ%b" "\001\010QQQ" | q_to_nul >clear.longread && - test-svn-fe -d preimage clear.lateemptyread 9 >actual.emptyread && - test-svn-fe -d preimage clear.latenonemptyread 9 >actual.nonemptyread && - test-svn-fe -d preimage clear.longread 9 >actual.longread && - test_cmp empty actual.emptyread && - test_cmp empty actual.nonemptyread && - test_cmp empty actual.longread + test_must_fail test-svn-fe -d preimage clear.lateemptyread 9 && + test_must_fail test-svn-fe -d preimage clear.latenonemptyread 9 && + test_must_fail test-svn-fe -d preimage clear.longread 9 ' test_expect_success 'unconsumed inline data' ' diff --git a/vcs-svn/sliding_window.c b/vcs-svn/sliding_window.c index 8273970..5c08828 100644 --- ...
The copyfrom_source instruction appends data from the preimage buffer to the end of output. Its arguments are a length and an offset relative to the beginning of the source view. Helped-by: Ramkumar Ramachandra <artagnon@gmail.com> Helped-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- That's the end of the series. Thanks for reading. Hopefully this round did not introduce too many bugs but if it did, I'd be glad to hear about them. Good night, Jonathan t/t9011-svn-da.sh | 35 +++++++++++++++++++++++++++++++++++ vcs-svn/svndiff.c | 27 +++++++++++++++++++++++---- 2 files changed, 58 insertions(+), 4 deletions(-) diff --git a/t/t9011-svn-da.sh b/t/t9011-svn-da.sh index c4bd1f3..c8959e2 100755 --- a/t/t9011-svn-da.sh +++ b/t/t9011-svn-da.sh @@ -198,4 +198,39 @@ test_expect_success 'catch copy that overflows' ' test_must_fail test-svn-fe -d preimage copytarget.overflow $len ' +test_expect_success 'copyfrom source' ' + printf foo >expect && + printf "SVNQ%b%b" "Q\003\003\002Q" "\003Q" | q_to_nul >copysource.all && + test-svn-fe -d preimage copysource.all 11 >actual && + test_cmp expect actual +' + +test_expect_success 'copy backwards' ' + printf oof >expect && + printf "SVNQ%b%b" "Q\003\003\006Q" "\001\002\001\001\001Q" | + q_to_nul >copysource.rev && + test-svn-fe -d preimage copysource.rev 15 >actual && + test_cmp expect actual +' + +test_expect_success 'offsets are relative to window' ' + printf fo >expect && + printf "SVNQ%b%b%b%b" "Q\003\001\002Q" "\001Q" \ + "\002\001\001\002Q" "\001Q" | + q_to_nul >copysource.two && + test-svn-fe -d preimage copysource.two 18 >actual && + test_cmp expect actual +' + +test_expect_success 'example from notes/svndiff' ' + printf aaaaccccdddddddd >expect && + printf aaaabbbbcccc >source && + printf "SVNQ%b%b%s" "Q\014\020\007\001" \ + "\004Q\004\010\0201\0107\010" d | + q_to_nul >delta.example && + len=$(wc -c <delta.example) && + test-svn-fe -d ...
The copyfrom_source instruction appends data from the preimage buffer to the end of output. Its arguments are a length and an offset relative to the beginning of the source view. Helped-by: Ramkumar Ramachandra <artagnon@gmail.com> Helped-by: David Barr <david.barr@cordelta.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- [resending with cc: samv] That's the end of the series. Thanks for reading. Hopefully this round did not introduce too many bugs but if it did, I'd be glad to hear about them. Good night, Jonathan t/t9011-svn-da.sh | 35 +++++++++++++++++++++++++++++++++++ vcs-svn/svndiff.c | 27 +++++++++++++++++++++++---- 2 files changed, 58 insertions(+), 4 deletions(-) diff --git a/t/t9011-svn-da.sh b/t/t9011-svn-da.sh index c4bd1f3..c8959e2 100755 --- a/t/t9011-svn-da.sh +++ b/t/t9011-svn-da.sh @@ -198,4 +198,39 @@ test_expect_success 'catch copy that overflows' ' test_must_fail test-svn-fe -d preimage copytarget.overflow $len ' +test_expect_success 'copyfrom source' ' + printf foo >expect && + printf "SVNQ%b%b" "Q\003\003\002Q" "\003Q" | q_to_nul >copysource.all && + test-svn-fe -d preimage copysource.all 11 >actual && + test_cmp expect actual +' + +test_expect_success 'copy backwards' ' + printf oof >expect && + printf "SVNQ%b%b" "Q\003\003\006Q" "\001\002\001\001\001Q" | + q_to_nul >copysource.rev && + test-svn-fe -d preimage copysource.rev 15 >actual && + test_cmp expect actual +' + +test_expect_success 'offsets are relative to window' ' + printf fo >expect && + printf "SVNQ%b%b%b%b" "Q\003\001\002Q" "\001Q" \ + "\002\001\001\002Q" "\001Q" | + q_to_nul >copysource.two && + test-svn-fe -d preimage copysource.two 18 >actual && + test_cmp expect actual +' + +test_expect_success 'example from notes/svndiff' ' + printf aaaaccccdddddddd >expect && + printf aaaabbbbcccc >source && + printf "SVNQ%b%b%s" "Q\014\020\007\001" \ + "\004Q\004\010\0201\0107\010" d | + q_to_nul >delta.example && + len=$(wc -c ...
Hi Jonathan, Thanks for this! The code looks really good and the test suite is quite comprehensive :) Eager to see this go through to `master`. -- Ram --
I think the strbuf use breaks the contrib/svn-fe/svn-fe build somewhere. Probably something like the following is needed. (moving the -I flags off of the compilation command line is justa "while at it" thing, for consistency with the toplevel Makefile) Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> --- diff --git a/contrib/svn-fe/Makefile b/contrib/svn-fe/Makefile index 360d8da..9e8f174 100644 --- a/contrib/svn-fe/Makefile +++ b/contrib/svn-fe/Makefile @@ -6,7 +6,7 @@ MV = mv CFLAGS = -g -O2 -Wall LDFLAGS = -ALL_CFLAGS = $(CFLAGS) +ALL_CFLAGS = $(CFLAGS) -I../../vcs-svn -I../.. ALL_LDFLAGS = $(LDFLAGS) EXTLIBS = @@ -38,7 +38,7 @@ svn-fe$X: svn-fe.o $(VCSSVN_LIB) $(GIT_LIB) $(ALL_LDFLAGS) $(LIBS) svn-fe.o: svn-fe.c ../../vcs-svn/svndump.h - $(QUIET_CC)$(CC) -I../../vcs-svn -o $*.o -c $(ALL_CFLAGS) $< + $(QUIET_CC)$(CC) -o $*.o -c $(ALL_CFLAGS) $< svn-fe.html: svn-fe.txt $(QUIET_SUBDIR0)../../Documentation $(QUIET_SUBDIR1) \ --
| Greg KH | Og dreams of kernels |
| Jens Axboe | [PATCH 31/33] Fusion: sg chaining support |
| Arnd Bergmann | Re: finding your own dead "CONFIG_" variables |
| Mark Brown | [PATCH 2/2] Subject: natsemi: Allow users to disable workaround for DspCfg reset |
| Tony Breeds | [LGUEST] Look in object dir for .config |
git: | |
| Brian Downing | Re: Git in a Nutshell guide |
| John Benes | Re: master has some toys |
| Matthias Lederhofer | [PATCH 4/7] introduce GIT_WORK_TREE to specify the work tree |
