Re: [PATCH] git-cvsimport: strip question-mark characters in tags

Previous thread: Checksum mismatch problem by Manuel Corpas on Wednesday, April 14, 2010 - 6:31 am. (1 message)

Next thread: [PATCH] Document ls-files -t as obsolete. by Matthieu Moy on Wednesday, April 14, 2010 - 6:45 am. (6 messages)
From: Ed Santiago
Date: Wednesday, April 14, 2010 - 6:38 am

Question mark character appears to be valid in a CVS tag,
but not a git one.  Remove it.  Leave open the possibility that there may
be more such characters; and comment (FIXME) that we may want to replace
those instead of removing them.

Also: if git tag command fails, do not include $! in our
error message: it is not useful after system(), and will
only serve as a red herring.
---
 git-cvsimport.perl |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/git-cvsimport.perl b/git-cvsimport.perl
index 9e03eee..48de2b4 100755
--- a/git-cvsimport.perl
+++ b/git-cvsimport.perl
@@ -840,10 +840,12 @@ sub commit {
 		$xtag =~ s/\s+\*\*.*$//; # Remove stuff like ** INVALID ** and ** FUNKY **
 		$xtag =~ tr/_/\./ if ( $opt_u );
 		$xtag =~ s/[\/]/$opt_s/g;
-		$xtag =~ s/\[//g;
+		# The following characters are valid in CVS tags but not git.
+		# Remove them. (FIXME: optionally replace?)
+		$xtag =~ tr/\[\?//d;
 
 		system('git' , 'tag', '-f', $xtag, $cid) == 0
-			or die "Cannot create tag $xtag: $!\n";
+			or die "Cannot create tag $xtag\n";
 
 		print "Created tag '$xtag' on '$branch'\n" if $opt_v;
 	}
-- 
1.6.6.1


-- 
Ed Santiago             Software Engineer             santiago@redhat.com
--

From: Junio C Hamano
Date: Wednesday, April 14, 2010 - 7:29 am

I agree that people may want to optionally replace them to avoid mapping
two originally different tags into the same one.

Does your regexp chain check other invalid refname sequences, such as ".name",

Sign-off?
--

From: Ed Santiago
Date: Wednesday, April 14, 2010 - 8:44 am

I considered that but decided that it was beyond the scope of
what I wanted to tackle.  My intent was to suggest an incremental
fix, not a revolutionary one.  I also wished to remove a misleading
diagnostic[1] and provide a way for others to do the same as needed.

 [1] The error I saw was "Cannot create tag mumblefoo{?dist}: Illegal seek",
     where "Illegal seek" is due to an uninitialized/invalid $!

A proper solution would, I believe, be uncomfortably complex.
Some of the approaches I considered are:

  * command-line option '--tag-replace <from-char>=<to>', eg '?=-'
  * replacing a (hardcoded) list of chars with %xx hex codes
  * option a la -A, with a file listing <oldtag>=<newtag>
  * like above, but the file lists perl expressions such as s/\?/./g

I used git-cvsimport because I've used tailor[2] before and
dreaded the thought of running it again.  I love how easy
git-cvsimport was to use!  I see it as a great tool for
quick jobs, for proof-of-concept situations: it has a low
cost of entry, works well with a remote CVS repo, and was
easy to debug.  But if it were to become as configurable -- and
as complex -- as tailor, I think it might lose much of its value.


Eek, no.  Again, this was a purely incremental fix, an addition
to an existing (simple) check.  Duplicating the functionality
of git-check-ref-format would be a terrible idea.  But if there's
already a perlified library for this, or a portable hook into
git itself (I haven't looked), it might be wise to have
git-cvsimport use that to check/convert tags.  At the very

My apologies.  I'm new to this.

How would you like me to proceed?  I'm reluctant to code a
full general solution to the tag-validity problem, but am
happy to rework my changes into something better.

Thanks for your feedback,
Ed
-- 
Ed Santiago             Software Engineer             santiago@redhat.com
--

From: Junio C Hamano
Date: Wednesday, April 14, 2010 - 12:27 pm

Oh, one step at a time is perfectly fine.
--

From: Andreas Schwab
Date: Wednesday, April 14, 2010 - 1:44 pm

According to the CVS docs only letters, digits, '-' and '_' are valid
for tag names.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."
--

From: Ed Santiago
Date: Wednesday, April 14, 2010 - 2:42 pm

Poor choice of words on my part.  What I *should* have said is
something like:

   Although question marks and curly braces are not among the
   set of characters which CVS considers to be valid for a tag,
   real-world situations have been encontered in which a CVS
   comma-v file has a tag including all those characters.  This
   patch makes git-cvsimport accept and forgive that reality.

How that tag got created, I really don't know.  I can imagine
three ways it could've happened (rcs commands; broken/old version
of CVS; custom tool for mucking with comma-v files).  My goal
was to recognize that this sort of thing happens, and to make
it easier for the next person to find & fix this in the script.
With that goal in mind, removing $! and adding the comment is
the only important part of my patch.  The question mark itself
is not likely to be useful except in very rare and weird cases.

Ed
-- 
Ed Santiago             Software Engineer             santiago@redhat.com
--

From: Johan Herland
Date: Wednesday, April 14, 2010 - 6:39 pm

Indeed. I have even seen CVS tag names containing carriage returns (aka. CR, 
\r) in the wild...




-- 
Johan Herland, <johan@herland.net>
www.herland.net
--

Previous thread: Checksum mismatch problem by Manuel Corpas on Wednesday, April 14, 2010 - 6:31 am. (1 message)

Next thread: [PATCH] Document ls-files -t as obsolete. by Matthieu Moy on Wednesday, April 14, 2010 - 6:45 am. (6 messages)