We always use 'utf-8' as the encoding, since we currently
have no way of getting the information from the user.
This also refactors the quoting of recipient names, since
both processes can share the rfc2047 quoting code.
Signed-off-by: Jeff King <peff@peff.net>
---
git-send-email.perl | 18 +++++++++++++++---
t/t9001-send-email.sh | 15 +++++++++++++++
2 files changed, 30 insertions(+), 3 deletions(-)
diff --git a/git-send-email.perl b/git-send-email.perl
index 7c4f06c..075cd0b 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -501,7 +501,12 @@ if ($compose) {
open(C,">",$compose_filename)
or die "Failed to open for writing $compose_filename: $!";
print C "From $sender # This line is ignored.\n";
- printf C "Subject: %s\n\n", $initial_subject;
+ print C "Subject: ",
+ ($initial_subject =~ /[^[:ascii:]]/ ?
+ quote_rfc2047($initial_subject) :
+ $initial_subject),
+ "\n";
+ print C "\n";
printf C <<EOT;
GIT: Please enter your email below.
GIT: Lines beginning in "GIT: " will be removed.
@@ -626,6 +631,14 @@ sub unquote_rfc2047 {
return wantarray ? ($_, $encoding) : $_;
}
+sub quote_rfc2047 {
+ local $_ = shift;
+ my $encoding = shift || 'utf-8';
+ s/([^-a-zA-Z0-9!*+\/])/sprintf("=%02X", ord($1))/eg;
+ s/(.*)/=\?$encoding\?q\?$1\?=/;
+ return $_;
+}
+
# use the simplest quoting being able to handle the recipient
sub sanitize_address
{
@@ -643,8 +656,7 @@ sub sanitize_address
# rfc2047 is needed if a non-ascii char is included
if ($recipient_name =~ /[^[:ascii:]]/) {
- $recipient_name =~ s/([^-a-zA-Z0-9!*+\/])/sprintf("=%02X", ord($1))/eg;
- $recipient_name =~ s/(.*)/=\?utf-8\?q\?$1\?=/;
+ $recipient_name = quote_rfc2047($recipient_name);
}
# double quotes are needed if specials or CTLs are included
diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index e222c49..a4bcd28 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -210,4 +210,19 @@ test_expect_success '--compose ...
These patches seem to work, except that the quoting of the Subject field works only if the user types non-ASCII text at the "What subject should the initial email start with?" prompt. If she changes the subject in the editor, it won't be rfc2047-quoted.
Thank you anyway, I think we're going in the right direction. I think 'git send-email --compose' is a nice way to produce an introductory message for a patch series. If --compose doesn't support MIME encoding in a reasonable way, the user may have to write and send the intro message with a real MUA and find out the Message-Id for the correct In-Reply-To field for the actual patch series.
The e-mail agents KMail and Mutt have a setting for preferred encodings for outgoing mail. It's a list of encodings, like "us-ascii,iso-8859-1,utf-8". The first one that fits (including From, To, Cc, Subject, the body, ...?) is used, so there is some kind of detection of content after the message has been composed. If portable content encoding detection is difficult or considered unnecessary, then I think a documented configurable option is fine (UTF-8 by default).
--
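For readers who want to see what the quoting under discussion produces, here is a minimal shell sketch of the same Q encoding that quote_rfc2047 in the patch performs. It is an illustration only, not code from the patch: the shell function merely borrows the Perl sub's name, and it assumes the input is already encoded as UTF-8 (or whatever charset is passed as the second argument), since no conversion is attempted.

```shell
#!/bin/sh
# Sketch of the Q encoding done by quote_rfc2047 in the patch.
# Assumption: $1 is already in the charset named by $2 (default utf-8);
# the bytes are only escaped, never converted.
quote_rfc2047() (
    text=$1
    encoding=${2:-utf-8}
    out=
    # od emits one lowercase two-digit hex value per input byte.
    for hex in $(printf '%s' "$text" | od -An -v -tx1); do
        case $hex in
        21|2a|2b|2d|2f|3[0-9]|4[1-9a-f]|5[0-9a]|6[1-9a-f]|7[0-9a])
            # Same safe set as the Perl regex: - ! * + / 0-9 A-Z a-z
            out=$out$(printf "\\$(printf '%03o' "0x$hex")") ;;
        *)
            # Everything else, including spaces, becomes =XX.
            out=$out=$(printf '%02X' "0x$hex") ;;
        esac
    done
    printf '=?%s?q?%s?=\n' "$encoding" "$out"
)
```

For example, `quote_rfc2047 "$(printf 'caf\303\251')"` prints `=?utf-8?q?caf=C3=A9?=`. As in the patch, the caller is expected to apply this only when the text contains non-ASCII bytes.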
Ah, yes, I hadn't considered that. We should definitely do the quoting
after all of the user's input. Replace 2/2 from my series with the patch
below, which handles this case correctly (and as a bonus, the user sees
git-format-patch recently got a --cover-letter option which does the
same thing. I actually use a real MUA (mutt) instead of send-email, and
this way you can avoid the message-id cutting and pasting that is
required. It automatically does the right thing with encodings because I
Yes, the git-send-email code is a real mess for this sort of thing. I
think it started very small and specific, and has gotten hack upon hack
piled on it. It would be much nicer rewritten from scratch around one of
the many abstracted perl mail objects (though that does introduce a new
I think that is sensible. Want to try adding it on top of my patches?
Below is the revised subject-munging patch.
-- >8 --
send-email: rfc2047-quote subject lines with non-ascii characters
We always use 'utf-8' as the encoding, since we currently
have no way of getting the information from the user.
This also refactors the quoting of recipient names, since
both processes can share the rfc2047 quoting code.
Signed-off-by: Jeff King <peff@peff.net>
---
git-send-email.perl | 20 ++++++++++++++++++--
t/t9001-send-email.sh | 15 +++++++++++++++
2 files changed, 33 insertions(+), 2 deletions(-)
diff --git a/git-send-email.perl b/git-send-email.perl
index 7c4f06c..3694f81 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -536,6 +536,15 @@ EOT
if (!$in_body && /^MIME-Version:/i) {
$need_8bit_cte = 0;
}
+ if (!$in_body && /^Subject: ?(.*)/i) {
+ my $subject = $1;
+ $_ = "Subject: " .
+ ($subject =~ /[^[:ascii:]]/ ?
+ quote_rfc2047($subject) :
+ $subject) .
+ "\n";
+ }
+ }
print C2 $_;
}
close(C);
@@ -626,6 +635,14 @@ sub unquote_rfc2047 {
return wantarray ? ($_, $encoding) : $_;
}
+sub quote_rfc2047 {
+ local $_ = ...
I had missed the --cover-letter option completely. It may be useful too. I'm still trying to find the best way to send patches. If I send the intro message with a real MUA, I either need to wait for the message to show up on a mailing list or check my sent-mail folder to find the Message-Id. Once I know the Message-Id I can send the actual patch series with 'git
I'd like to, but I can only do sh/bash stuff and possibly some copy-and-paste programming with other scripting languages. You'd end up fixing my code anyway, sorry.
As you noticed, I accidentally sent you a couple of test emails because send-email CCed mails to the patches' author (I think). Now I have set "suppresscc = all" and "suppressfrom = true", which should prevent such accidents. Shouldn't these be the defaults? In my opinion it's generally best practice to always explicitly define which parties emails are sent to.
--
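The two settings quoted above would typically live in a git config file. The fragment below is a sketch of how they might look; placing them under a [sendemail] section is an assumption based on git-send-email's usual configuration naming, since the message itself does not show the section header.

```ini
# Assumed location: ~/.gitconfig or the repository's .git/config
[sendemail]
	# do not add any automatic Cc: recipients
	suppresscc = all
	# do not Cc: the address in the patch's From: line
	suppressfrom = true
```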
That is how I used to do it; now I use --cover-letter (which you
OK, I will add it to the end of my long todo. Out of curiosity, do you
actually want something besides utf-8, or is this just to make us feel
I think this is probably a good change. But it is a behavior change,
which means it is definitely out during the -rc freeze. And it may or
Argh, yes. I _thought_ I ran it successfully through the test script,
but obviously I failed to 'make' and just tested the previous version.
It works fine with the bracket removed.
For reference, the fixed-up patch is below.
-- >8 --
send-email: rfc2047-quote subject lines with non-ascii characters
We always use 'utf-8' as the encoding, since we currently
have no way of getting the information from the user.
This also refactors the quoting of recipient names, since
both processes can share the rfc2047 quoting code.
Signed-off-by: Jeff King <peff@peff.net>
---
git-send-email.perl | 19 +++++++++++++++++--
t/t9001-send-email.sh | 15 +++++++++++++++
2 files changed, 32 insertions(+), 2 deletions(-)
diff --git a/git-send-email.perl b/git-send-email.perl
index 7c4f06c..d0f9d4a 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -536,6 +536,14 @@ EOT
if (!$in_body && /^MIME-Version:/i) {
$need_8bit_cte = 0;
}
+ if (!$in_body && /^Subject: ?(.*)/i) {
+ my $subject = $1;
+ $_ = "Subject: " .
+ ($subject =~ /[^[:ascii:]]/ ?
+ quote_rfc2047($subject) :
+ $subject) .
+ "\n";
+ }
print C2 $_;
}
close(C);
@@ -626,6 +634,14 @@ sub unquote_rfc2047 {
return wantarray ? ($_, $encoding) : $_;
}
+sub quote_rfc2047 {
+ local $_ = shift;
+ my $encoding = shift || 'utf-8';
+ s/([^-a-zA-Z0-9!*+\/])/sprintf("=%02X", ord($1))/eg;
+ s/(.*)/=\?$encoding\?q\?$1\?=/;
+ return $_;
+}
+
# use the simplest quoting being able to handle the recipient
sub sanitize_address
{
@@ -643,8 +659,7 @@ sub sanitize_address
# rfc2047 is needed if a non-ascii char is ...
I'm using the current 'master' branch, so --cover-letter is there. Managed to miss it anyway. :)
Hmm, do you send the 0000-cover-letter.patch with 'git send-email'? It seems that this cover letter doesn't get MIME headers when sent that way. Sending through 'mutt -H' works fine, but then the Message-Id needs to be copy-pasted manually to send-email for the rest of the series (to have
I mostly use (and promote) UTF-8, and now that I begin to understand how send-email works I can live with the current behaviour just fine. Don't take my feedback as complaining. :) In general my interests are in human languages, and I have done quite a lot of work in different areas to make computers interact nicely with human languages. This is my interest at a general level, and I tend to report/fix problems when I notice them. From Git's point of view, at the present moment we can probably say just like you did: "make us feel feature complete."
Thanks for your work on this. Really.
--
My English is somewhat broken. I meant to thank you for your work.
--
Maybe it is the late hour, but I am a native English speaker, and it parsed just fine to me.
-Peff
--
No, I have format-patch do the threading. So something like:
git format-patch --cover-letter --thread --stdout upstream >mbox
mutt -f mbox
and then in mutt I bind a key to <resend-message>. For each message, I
do the 'resend', set the recipient headers, look it over one last time,
and then send. The most annoying part is entering the recipients;
usually it isn't too bad because I have short aliases for Junio and the
list, but I had to, e.g., cut and paste your address twice for the other
series.
Probably munging the 'to:' and 'cc:' before running mutt would make the
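The idea of munging the recipient headers before running mutt could be sketched roughly as follows. This is only an illustration, not something from the thread: add_recipients is a hypothetical helper, the addresses are placeholders, and it assumes every message in the mbox starts with the "From <hash> ..." separator line that format-patch writes and that no body line looks like that separator.

```shell
#!/bin/sh
# Hypothetical helper: append To:/Cc: headers to every message in an
# mbox read from stdin, writing the result to stdout.
add_recipients() (
    to=$1
    cc=$2
    awk -v to="$to" -v cc="$cc" '
        # An mbox separator line starts the header block of a message.
        /^From [0-9a-f]+ / { in_header = 1 }
        # The first blank line ends the headers; add recipients before it.
        in_header && /^$/ {
            print "To: " to
            if (cc != "") print "Cc: " cc
            in_header = 0
        }
        { print }
    '
)
```

It might be used as `git format-patch --cover-letter --thread --stdout upstream | add_recipients list@example.com maintainer@example.com >mbox` before opening the mbox in mutt.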
OK, I am inclined to leave the patches as-is, then, and wait for
somebody to complain about their pet encoding. My reasoning is that:
- in most cases throughout git, we assume things are happening in
utf-8, so I don't think it will come as a great surprise
- I think doing it right might be more complex than just send-email; I
am thinking there might need to be a "stuff the user inputs is in
No problem at all. Thank you for helping make git better with bug
reports!
-Peff
--
Since it looks like you are using mutt also, I will warn you that there is a problem with this workflow: when mutt does the resend, it generates a new message-id. Thus the patches are all connected in a thread because they all in-reply-to the cover letter, but the cover letter is not connected, since it has a new message-id. I'm not sure if there is a way to fix this short of patching mutt. :(
-Peff
--
instead of opening the mbox using -f, and then recall the messages to send. That *might* prevent mutt from rewriting the message-id, but I haven't tested it at all.
--
Todd
