Another round of cleanups as noticed by Junio. Only the the first two patches were touched. -coding style cleanups -better boundary checking Cheers, Don -
They handle cases where there is no attached patch. Signed-off-by: Don Zickus <dzickus@redhat.com> --- t/t5100-mailinfo.sh | 2 +- t/t5100/info0007 | 5 +++++ t/t5100/info0008 | 5 +++++ t/t5100/msg0007 | 2 ++ t/t5100/msg0008 | 4 ++++ t/t5100/sample.mbox | 18 ++++++++++++++++++ 6 files changed, 35 insertions(+), 1 deletions(-) create mode 100644 t/t5100/info0007 create mode 100644 t/t5100/info0008 create mode 100644 t/t5100/msg0007 create mode 100644 t/t5100/msg0008 create mode 100644 t/t5100/patch0007 create mode 100644 t/t5100/patch0008 diff --git a/t/t5100-mailinfo.sh b/t/t5100-mailinfo.sh index 4d2b781..ca96918 100755 --- a/t/t5100-mailinfo.sh +++ b/t/t5100-mailinfo.sh @@ -11,7 +11,7 @@ test_expect_success 'split sample box' \ 'git-mailsplit -o. ../t5100/sample.mbox >last && last=`cat last` && echo total is $last && - test `cat last` = 6' + test `cat last` = 8' for mail in `echo 00*` do diff --git a/t/t5100/info0007 b/t/t5100/info0007 new file mode 100644 index 0000000..49bb0fe --- /dev/null +++ b/t/t5100/info0007 @@ -0,0 +1,5 @@ +Author: A U Thor +Email: a.u.thor@example.com +Subject: another patch +Date: Fri, 9 Jun 2006 00:44:16 -0700 + diff --git a/t/t5100/info0008 b/t/t5100/info0008 new file mode 100644 index 0000000..e8a2951 --- /dev/null +++ b/t/t5100/info0008 @@ -0,0 +1,5 @@ +Author: Junio C Hamano +Email: junio@kernel.org +Subject: another patch +Date: Fri, 9 Jun 2006 00:44:16 -0700 + diff --git a/t/t5100/msg0007 b/t/t5100/msg0007 new file mode 100644 index 0000000..71b23c0 --- /dev/null +++ b/t/t5100/msg0007 @@ -0,0 +1,2 @@ +Here is an empty patch from A U Thor. + diff --git a/t/t5100/msg0008 b/t/t5100/msg0008 new file mode 100644 index 0000000..a80ecb9 --- /dev/null +++ b/t/t5100/msg0008 @@ -0,0 +1,4 @@ +>Here is an empty patch from A U Thor. + +Hey you forgot the patch! + diff --git a/t/t5100/patch0007 b/t/t5100/patch0007 new file mode 100644 index 0000000..e69de29 diff ...
I have come across many emails that use long strings of '-'s as separators
for ideas. This patch below limits the separator to only 3 '-', with the
intent that long string of '-'s will stay in the commit msg and not in the
patch file.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
---
I purposedly separated this patch out because I wasn't sure if anyone would
have objections to it. I tested it on numerous emails with and with patches
and didn't see any issues.
---
builtin-mailinfo.c | 37 ++++++++++++++++++++++++++++++++++---
1 files changed, 34 insertions(+), 3 deletions(-)
diff --git a/builtin-mailinfo.c b/builtin-mailinfo.c
index dd0f563..a8d5b60 100644
--- a/builtin-mailinfo.c
+++ b/builtin-mailinfo.c
@@ -652,6 +652,39 @@ again:
return (fgets(line, sizeof(line), fin) != NULL);
}
+static inline int patchbreak(const char *line)
+{
+ /* Beginning of a "diff -" header? */
+ if (!memcmp("diff -", line, 6))
+ return 1;
+
+ /* CVS "Index: " line? */
+ if (!memcmp("Index: ", line, 7))
+ return 1;
+
+ /*
+ * "--- <filename>" starts patches without headers
+ * "---<sp>*" is a manual separator
+ */
+ if (!memcmp("---", line, 3)) {
+ line += 3;
+ /* space followed by a filename? */
+ if (line[0] == ' ' && !isspace(line[1]))
+ return 1;
+ /* Just whitespace? */
+ for (;;) {
+ unsigned char c = *line++;
+ if (c == '\n')
+ return 1;
+ if (!isspace(c))
+ break;
+ }
+ return 0;
+ }
+ return 0;
+}
+
+
static int handle_commit_msg(char *line)
{
static int still_looking=1;
@@ -673,9 +706,7 @@ static int handle_commit_msg(char *line)
return 0;
}
- if (!memcmp("diff -", line, 6) ||
- !memcmp("---", line, 3) ||
- !memcmp("Index: ", line, 7)) {
+ if (patchbreak(line)) {
fclose(cmitmsg);
cmitmsg = NULL;
return 1;
--
1.5.0.2.211.g2ca9-dirty
-
I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). Signed-off-by: Don Zickus <dzickus@redhat.com> --- builtin-mailinfo.c | 520 +++++++++++++++++++++++++++------------------------ git-am.sh | 4 + git-applymbox.sh | 4 + git-quiltimport.sh | 4 + 4 files changed, 287 insertions(+), 245 deletions(-) diff --git a/builtin-mailinfo.c b/builtin-mailinfo.c index 766a37e..dacdf77 100644 --- a/builtin-mailinfo.c +++ b/builtin-mailinfo.c @@ -11,19 +11,22 @@ static FILE *cmitmsg, *patchfile, *fin, *fout; static int keep_subject; static const char *metainfo_charset; static char line[1000]; -static char date[1000]; static char name[1000]; static char email[1000]; -static char subject[1000]; ...
I am working on a project that required parsing through regular mboxes that didn't necessarily have patches embedded in them. I started by creating my own modified copy of git-am and working from there. Very quickly, I noticed git-mailinfo wasn't able to handle a big chunk of my email. After hacking up numerous solutions and running into more limitations, I decided it was just easier to rewrite a big chunk of it. The following patch has a bunch of fixes and features that I needed in order for me do what I wanted. Note: I'm didn't follow any email rfc papers but I don't think any of the changes I did required much knowledge (besides the boundary stuff). List of major changes/fixes: - can't create empty patch files fix - empty patch files don't fail, this failure will come inside git-am - multipart boundaries are now handled - only output inbody headers if a patch exists otherwise assume those headers are part of the reply and instead output the original headers - decode and filter base64 patches correctly - various other accidental fixes I believe I didn't break any existing functionality or compatibility (other than what I describe above, which is really only the empty patch file). I tested this through various mailing list archives and everything seemed to parse correctly (a couple thousand emails). Signed-off-by: Don Zickus <dzickus@redhat.com> --- Accidentally sent out the wrong patch yesterday.. --- builtin-mailinfo.c | 520 +++++++++++++++++++++++++++------------------------ git-am.sh | 4 + git-applymbox.sh | 4 + git-quiltimport.sh | 4 + 4 files changed, 287 insertions(+), 245 deletions(-) diff --git a/builtin-mailinfo.c b/builtin-mailinfo.c index 766a37e..a5eea82 100644 --- a/builtin-mailinfo.c +++ b/builtin-mailinfo.c @@ -11,19 +11,22 @@ static FILE *cmitmsg, *patchfile, *fin, *fout; static int keep_subject; static const char *metainfo_charset; static char line[1000]; -static char date[1000]; static char ...
This is useful when scripts need more than just the basic email headers to
parse. By specifying the "-x=" option, one can search and output any header
field they want.
Signed-off-by: Don Zickus <dzickus@redhat.com>
---
builtin-mailinfo.c | 10 ++++++++--
1 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/builtin-mailinfo.c b/builtin-mailinfo.c
index dacdf77..dd0f563 100644
--- a/builtin-mailinfo.c
+++ b/builtin-mailinfo.c
@@ -856,11 +856,12 @@ int mailinfo(FILE *in, FILE *out, int ks, const char *encoding,
}
static const char mailinfo_usage[] =
- "git-mailinfo [-k] [-u | --encoding=<encoding>] msg patch <mail >info";
+ "git-mailinfo [-k] [-u | --encoding=<encoding>] [-x=<field>] msg patch <mail >info";
int cmd_mailinfo(int argc, const char **argv, const char *prefix)
{
const char *def_charset;
+ int top;
/* NEEDSWORK: might want to do the optional .git/ directory
* discovery
@@ -870,6 +871,8 @@ int cmd_mailinfo(int argc, const char **argv, const char *prefix)
def_charset = (git_commit_encoding ? git_commit_encoding : "utf-8");
metainfo_charset = def_charset;
+ for (top=0; header[top]; top++){ ; }
+
while (1 < argc && argv[1][0] == '-') {
if (!strcmp(argv[1], "-k"))
keep_subject = 1;
@@ -879,7 +882,10 @@ int cmd_mailinfo(int argc, const char **argv, const char *prefix)
metainfo_charset = NULL;
else if (!prefixcmp(argv[1], "--encoding="))
metainfo_charset = argv[1] + 11;
- else
+ else if (!prefixcmp(argv[1], "-x=")) {
+ header[top] = xmalloc(256*sizeof(char));
+ strncpy(header[top++], argv[1]+3, 256);
+ } else
usage(mailinfo_usage);
argc--; argv++;
}
--
1.5.0.2.211.g2ca9-dirty
-
This is useful when scripts need more than just the basic email headers to
parse. By specifying the "-x=" option, one can search and output any header
field they want.
Signed-off-by: Don Zickus <dzickus@redhat.com>
---
Accidentally sent out the wrong patch yesterday.
---
builtin-mailinfo.c | 22 ++++++++++++++++------
1 files changed, 16 insertions(+), 6 deletions(-)
diff --git a/builtin-mailinfo.c b/builtin-mailinfo.c
index a5eea82..8ac6ef4 100644
--- a/builtin-mailinfo.c
+++ b/builtin-mailinfo.c
@@ -302,7 +302,7 @@ static int check_header(char *line, char **hdr_data)
int i;
/* search for the interesting parts */
- for (i = 0; header[i]; i++) {
+ for (i = 0; header[i] && i < MAX_HDR_PARSED; i++) {
int len = strlen(header[i]);
if (!hdr_data[i] &&
!strncasecmp(line, header[i], len) &&
@@ -338,8 +338,8 @@ static int check_header(char *line, char **hdr_data)
if (!memcmp(">From", line, 5) && isspace(line[5]))
return 1;
if (!memcmp("[PATCH]", line, 7) && isspace(line[7])) {
- for (i = 0; header[i]; i++) {
- if (!memcmp("Subject: ", header[i], 9)) {
+ for (i = 0; header[i] && i < MAX_HDR_PARSED; i++) {
+ if (!memcmp("Subject", header[i], 7)) {
if (! handle_header(line, hdr_data[i], 0)) {
return 1;
}
@@ -796,7 +796,7 @@ static void handle_info(void)
char *hdr;
int i;
- for (i = 0; header[i]; i++) {
+ for (i = 0; header[i] && i < MAX_HDR_PARSED; i++) {
/* only print inbody headers if we output a patch file */
if (patch_lines && s_hdr_data[i])
@@ -856,11 +856,12 @@ int mailinfo(FILE *in, FILE *out, int ks, const char *encoding,
}
static const char mailinfo_usage[] =
- "git-mailinfo [-k] [-u | --encoding=<encoding>] msg patch <mail >info";
+ "git-mailinfo [-k] [-u | --encoding=<encoding>] [-x=<field>] msg patch <mail >info";
int cmd_mailinfo(int argc, const char **argv, const char *prefix)
{
const char *def_charset;
+ int top;
/* NEEDSWORK: might want to do the optional .git/ ...This issue popped up when testing my changes. I believe the patch is the intended output that git-mailinfo should provide. Signed-off-by: Don Zickus <dzickus@redhat.com> --- t/t5100/patch0005 | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/t/t5100/patch0005 b/t/t5100/patch0005 index 7d24b24..e7d6f66 100644 --- a/t/t5100/patch0005 +++ b/t/t5100/patch0005 @@ -61,7 +61,7 @@ diff --git a/git-cvsimport-script b/git-cvsimport-script push(@old,$fn); -- -David K
