Hello,
Can anyone point me at code to mirror a git repository to cvs? I'd like to develop using git, and have a commit hook mirror the day-to-day changes (tags/commits) made in the git repo to a cvs repository. The idea is that the only way changes get into the cvs repo is via the git commit hook.
I've experimented with git-cvsexportcommit, and found a few bugs (it couldn't handle simple things, like adding a file in a new directory -- fixed that, along with a few other minor problems). Adding an empty file in git still gets a patch application error on the cvs side, but I can live with that for now. More seriously, making a change on a git branch mistakenly tries to apply the delta on the cvs trunk. None of this is particularly hard to fix -- or even critical, as long as you don't care about branches. I'm just hoping someone has already produced something more robust. From the looks of darcs/tailor, it doesn't handle the use of git as a source.
Why am I interested? I want to switch the development of GNU coreutils from cvs to git. I would also like to continue making the repository available via cvs, for the sake of continuity. At worst, I can always cut the CVS cord, but that's a last resort.
Jim
-
Hi, If you only want to make a cvs repository available for tracking the project, git-cvsserver is what you want. It is even faster than the original cvs... Ciao, Dscho -
That might work if I had sufficient access to the system hosting the
public CVS repository. But there are restrictions (like no ssh access).
Currently I rsync the master repo to an intermediate site, from which
it is periodically pulled by savannah. Paranoia on both sides.
If I end up leaving savannah, can someone propose a good site,
i.e., secure, yet with git and rsync access?
I haven't made the leap to git yet, but git-cvsimport (from git-1.3.2)
seems to do a very good job of converting the cvs module (89MB).
FYI, here are some stats on the resulting git repository:
Size (nothing repacked):
1051MB (du -sh, actual, on reiserfs 3)
708MB (du --apparent-size)
Size repacked, (via git-repack -a -d && git-prune-packed)
65MB (du --si -s)
20k+ patchsets (counted by cvsps)
40k+ revisions (counted by cvs ... rlog cu|grep -c '^revision')
While repacking, git said something about more than 100K objects.
There were 120K files under .git/ before repacking.
-
Hello, Jim!
I believe you have a very good reason to talk to decision makers in FSF. Savannah is very poorly maintained, and I actually took one of my projects (Orinoco driver) to SourceForge Subversion. If losing a Linux driver is next to nothing, losing GNU coreutils is a big deal for the GNU development site.
You are likely to be heard if Subversion is as easy as CVS for potential users, but it has a useful "log" command if nothing else. It also has real changesets, which
Sorry, I don't know any free git hosters, but here's what you can do:
1) Pressure Savannah to support git
2) Use arch on Savannah
3) Move to Subversion on SourceForge, GNA.org or Berlios and use git-svn
--
Regards,
Pavel Roskin
-
I've thought a couple of times about writing an exporter that would
replay things into a true CVS repo, but it's truly not worth it. We've
already got git-cvsserver that does all that -- better for me to focus
cvsexportcommit is clearly for manual usage, not for automagic usage.
It is a bit rough, (and I'd like to see your patches to it!) but it
wants to be driven by a smarter script to, for instance, know what
git-cvsserver is the word. It currently tracks the git repo itself
pretty well (perfectly, AFAICS) and it also tracks a git tree that is
actually imported daily from CVS -- doing
CVSrepo ->cvsimport -> GIT -> cvsserver -> CVS checkout
git-cvsserver works great for anon cvs access (does pserver) and
TortoiseCVS and cli cvs work great with it. Eclipse works well, but it
has been quite hard to get 'right'. Optionally, it can support users
with commit rights via ssh. It does track git 'heads' but they don't
show up as branches, they show up as different modules. So to get
a checkout of the master branch, you do:
cvs -d :pserver:anonymous@foo.com:/var/foo.git co master
hope that helps!
martin
-
Thanks, but I'd rather do primary development directly using git, rather than with CVS. -
I do not use the automated tools myself, but I sync the day-job
work in my git repository to CVS at work. I do not develop with
CVS but use it merely as a publishing medium. Other people can
still commit to CVS, though, in which case I have to slurp their
changes back into my git repository.
(0) Bootstrap. I did not use git-cvsimport myself (this repository
started before the tool was written). Instead:
. cvs checkout the tip of the CVS development history
. "git init-db", edit .gitignore to ignore CVS, and "git
add ."
. "git commit -m epoch"; the git side of development
history in this repository starts at that point for me.
. "git branch origin"; the tip of CVS repository is kept
track of with this branch. I work in "master".
I think I could have done the above with git-cvsimport,
(1) Beginning of the day. In case other people did work on
the CVS side, I do this:
. "git checkout origin", "cvs -q update". If there is no
change, go to step (3).
. add any new files with "git add", and update the "origin"
branch with "git commit -a -m 'from CVS'".
(2) Merge others' work into my git master branch.
. "git checkout master" and "git pull . origin"; conflict
resolve as needed.
(3) Do my work.
. "git checkout master" if I haven't done so.
. hack away, grow "master" branch using full power of git
including the use of topic branches etc.
(4) Publish, when "master" changes are ready.
. To avoid conflicts with other people working on CVS,
perform (1) again to make sure "origin" matches the tip
of CVS.
. "git checkout origin", "git pull . master".
. generate the consolidated log I am about to push back to
CVS with "git log --no-merges ORIG_HEAD.. | git shortlog >L".
. add any new files with "cvs add", and "cvs commit -F L"
. go back to (3) and continue.
This can be extended to ......
Thank you for describing the process you use.
However, since I don't have to allow independent cvs commits,
I hope to bend git-cvsexportcommit to my needs.
I haven't yet tried to restrict the mirroring to commits on a specific
git branch. So far, in the toy example I'm using to test things,
I have this in .git/hooks/post-commit:
#!/bin/sh
# Mirror the commit that was just made in git into the CVS working directory.
sha1_id=$(git-rev-parse --verify HEAD) || exit 1
cvsdir=/var/tmp/work-c
cd "$cvsdir" && GIT_DIR=/var/tmp/git-experiment/work-g/.git \
git-cvsexportcommit -v -c -p "$sha1_id"
I'll clean up and post the changes I've made to git-cvsexportcommit
I'll send one report separately.
-
In a very shallow audit, I spotted code where overflow was not detected.
But it's hardly critical.
Currently,
git-diff HEAD HEAD
is equivalent to this
git-diff HEAD HEAD~18446744073709551616 # aka 2^64
Exercising git-rev-parse directly, currently I get this:
$ git-rev-parse --no-flags --sq HEAD~18446744073709551616
'639ca5497279607665847f2e3a11064441a8f2a6'
It'd be better to produce a diagnostic and fail:
$ ./git-rev-parse --no-flags --sq -- HEAD~18446744073709551616 > /dev/null
fatal: ambiguous argument 'HEAD~18446744073709551616': unknown revision or filename
The code in question is in sha1_name.c (get_sha1_1):
int num = 0;
...
while (cp < name + len)
num = num * 10 + *cp++ - '0';
Looking at how to fix it, my first reflex was to replace that loop
with this one:
while (cp < name + len) {
int tmp = num * 10 + *cp++ - '0';
if (INT_MAX / 10 < num || tmp < num)
return -1;
num = tmp;
}
But INT_MAX is used nowhere else, so I wonder if git avoids using
it for some reason. At least `make check' gripes about __INT_MAX__.
Anyhow, here's the patch I used. With it, git still passes `make test'.
diff --git a/sha1_name.c b/sha1_name.c
index dc68355..c813ba0 100644
--- a/sha1_name.c
+++ b/sha1_name.c
@@ -429,8 +429,12 @@ static int get_sha1_1(const char *name,
int num = 0;
int len1 = cp - name;
cp++;
- while (cp < name + len)
- num = num * 10 + *cp++ - '0';
+ while (cp < name + len) {
+ int tmp = num * 10 + *cp++ - '0';
+ if (INT_MAX / 10 < num || tmp < num)
+ return -1;
+ num = tmp;
+ }
if (has_suffix == '^') {
if (!num && len1 == len - 1)
num = 1;
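One caveat about the loop in this patch: `num * 10 + *cp++ - '0'` is computed before the test runs, and signed overflow is undefined in C, so a compiler may assume it never happens. A minimal sketch of a variant that checks before multiplying follows. The name `parse_count` and its interface are hypothetical, made up for illustration; this is not code from git.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of git): parse the decimal digits in
 * [cp, end) into *out.  Returns 0 on success, -1 on a non-digit or on
 * a value that does not fit in an int.  An empty range yields 0, like
 * the original loop.
 */
int parse_count(const char *cp, const char *end, int *out)
{
	int num = 0;

	assert(cp != NULL && end != NULL && out != NULL);
	while (cp < end) {
		int digit;

		if (*cp < '0' || *cp > '9')
			return -1;
		digit = *cp++ - '0';
		/* Test first, so that num * 10 + digit can never overflow. */
		if (num > (INT_MAX - digit) / 10)
			return -1;
		num = num * 10 + digit;
	}
	*out = num;
	return 0;
}
```

With something like this, the 2^64 suffix is rejected instead of silently wrapping, and it needs INT_MAX just as the patch does.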
-
This is another one of those `would be nice' sort of changes.
Probably not worth much at this early stage in development, but
eventually worth changing.
There are about 20 uses of atoi, and most calls can return
a usable result in spite of an invalid input -- just because
atoi returns the same thing for "99" as "99-and-any-suffix".
It would be better not to ignore invalid inputs.
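For what it's worth, a stricter replacement can be built on strtol, which reports where parsing stopped. This is only a sketch; `strict_atoi` is a hypothetical name, not an existing git or libc function.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: like atoi, but fails (-1) instead of guessing
 * when the string has no digits, has a trailing suffix, or holds a
 * value that does not fit in an int.  Leading white space and a sign
 * are still accepted, because strtol accepts them.
 */
int strict_atoi(const char *s, int *out)
{
	char *end;
	long val;

	assert(s != NULL && out != NULL);
	errno = 0;
	val = strtol(s, &end, 10);
	if (end == s || *end != '\0')
		return -1;	/* no digits at all, or trailing junk */
	if (errno == ERANGE || val < INT_MIN || val > INT_MAX)
		return -1;	/* out of range for an int */
	*out = (int)val;
	return 0;
}
```

A caller would then write `if (strict_atoi(arg, &n) < 0)` and report the bad argument, rather than carrying on with whatever prefix happened to parse.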
-------------------
Also, integer overflow in object.c can cause trouble.
When the xrealloc byte count exceeds 2^32 (for a 32-bit int),
xrealloc will happily return a buffer of the requested (small) size,
but the following memset will scribble zeroes far beyond the end
of that new buffer.
static int nr_objs;
int obj_allocs;
...
void created_object(const unsigned char *sha1, struct object *obj)
{
...
if (obj_allocs - 1 <= nr_objs * 2) {
int i, count = obj_allocs;
obj_allocs = (obj_allocs < 32 ? 32 : 2 * obj_allocs);
objs = xrealloc(objs, obj_allocs * sizeof(struct object *));
memset(objs + count, 0, (obj_allocs - count)
* sizeof(struct object *));
But this may be only theoretical, because the problem doesn't strike
until there are over 250M objects (assuming 32-bit int and 8-byte pointers).
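A rough sketch of how the growth step could guard against this, assuming the counts are kept in size_t rather than int. The helper name `grow_ptr_table` is hypothetical, and it uses plain realloc instead of git's xrealloc so that it stands alone.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: grow a table of pointers from old_count to
 * new_count slots and zero the new tail.  Returns the new table, or
 * NULL (leaving the old table untouched) if the byte count would
 * overflow or the allocation fails.
 */
void **grow_ptr_table(void **table, size_t old_count, size_t new_count)
{
	void **grown;

	assert(new_count > 0 && old_count <= new_count);
	if (new_count > SIZE_MAX / sizeof(*table))
		return NULL;	/* new_count * sizeof(*table) would wrap */
	grown = realloc(table, new_count * sizeof(*table));
	if (!grown)
		return NULL;
	/* Only the slots added by this call are cleared. */
	memset(grown + old_count, 0, (new_count - old_count) * sizeof(*grown));
	return grown;
}
```

The point is just that the multiplication is checked before it is performed, so the realloc size and the memset length always agree.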
-
atoi has undefined behaviour for "99-and-any-suffix". You might get lucky and get back 99, but you might also get a random value or a core dump. Morten -
I've never heard of that.
POSIX says that atoi(str) is equivalent to:
(int) strtol(str, (char **)NULL, 10)
except that the handling of errors may differ.
If the value cannot be represented, the behavior is undefined.
Since strtol works fine with such a suffix, and since 99 can be
represented, I don't see why there would be any undefined behavior.
Do you know of an implementation for which `atoi ("99-and-any-suffix")'
does anything other than return 99?
-
Where do you get that from? The standard claims that it converts "the initial portion of the string pointed to" (7.20.1.2). Furthermore, atoi is equivalent to strtol with a base of 10 (with the exception of range errors). From 7.20.1.4, paragraph 2:
  The strtol [...] functions [...] decompose the input string into three parts: an initial, possibly empty, sequence of white-space characters [...], a subject sequence resembling an integer represented in some radix determined by the value of base, and a final string of one or more unrecognized characters...
If no conversion can be performed (i.e., you feed it garbage with no number), zero is returned. atoi does NOT handle range errors, however; the behavior is undefined in that case. In practice, I expect most implementations do some sort of wrapping.
-Peff
-
My copy (which is admittedly a draft because I am cheap) does not restrict undefined behaviour to _range_ errors, but simply says "Except for the behavior on error, they are equivalent to [the strtol call]" M. -
git doesn't always detect write failures. A write error
(e.g., a hardware I/O error or simply a full disk)
doesn't provoke a nonzero exit status:
$ ./git-cat-file -t HEAD > /dev/full && echo did not detect write failure
did not detect write failure
This is perhaps more important than the other things
I've reported, since it can lead to porcelain being unable
to detect a real failure in the plumbing.
Here are two more:
$ ./git-ls-tree HEAD > /dev/full && echo fail
fail
$ ./git-show > /dev/full && echo fail
fail
If you were using gnulib, I'd suggest simply adding this line
atexit (close_stdout);
near the beginning of each `main'. Then you wouldn't have to
manually track down each and every place where a write to stdout
can occur -- not to mention the maintenance burden of keeping
things correct as the code evolves.
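Without gnulib, the same idea can be approximated in a few lines. The sketch below is not gnulib's implementation; both function names are hypothetical, and the handler would be registered with `atexit (close_stdout_or_die);` near the top of `main'.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: close `stream' and return 0 only if no earlier
 * write failed and the final flush done by fclose succeeded too.
 */
int close_stream_checked(FILE *stream)
{
	int had_error;

	assert(stream != NULL);
	had_error = ferror(stream);	/* a write already failed */
	if (fclose(stream) != 0)	/* flushing buffered data failed */
		had_error = 1;
	return had_error ? -1 : 0;
}

/*
 * Hypothetical atexit handler, in the spirit of gnulib's close_stdout.
 * It uses _Exit because calling exit from an atexit handler is
 * undefined.
 */
void close_stdout_or_die(void)
{
	if (close_stream_checked(stdout) != 0) {
		fputs("fatal: write error on standard output\n", stderr);
		_Exit(EXIT_FAILURE);
	}
}
```

Since stdout is usually buffered, a failed write often shows up only at the final flush, which is why the fclose result has to be checked as well as ferror.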
-
