[PATCH 3/3] CHECKFILES: new small shell script to check multiple source files

Previous thread: [PATCH 2/2] Version 5 (2.6.23-rc8-mm2) Smack: Simplified Mandatory Access Control Kernel by Casey Schaufler on Friday, October 5, 2007 - 9:58 am. (1 message)

Next thread: 2.6.23-rc9-git4: pata_pcmcia, disabling IRQ #9 by Manuel Lauss on Friday, October 5, 2007 - 11:08 am. (18 messages)
From: Erez Zadok
Date: Friday, October 5, 2007 - 9:56 am

This series of patches adds a -t option to checkpatch, so it can print terse
messages one per line, in a format compatible with g/cc
(filename:linenumber:message).  This format can be parsed by various tools
and editors that can help show the errors in one window and the offending
file+line in another window.

This patch series also introduces a new small shell script wrapper, called
scripts/checkfiles, that checks the compliance of source files (not in diff
format); the script can operate on any mix of files and directories
(recursively checking).  For example, to check the entire Linux kernel, run:

$ ./scripts/checkfiles .

So, I ran the above script and it found nearly 1.5 million reported
warnings/errors, with drivers being the largest abuser, not surprisingly.
Here's the sorted breakdown per top-level subsystem:

     32 usr
    162 init
    266 block
    293 Documentation
    471 ipc
    819 mm
   1115 security
   1915 scripts
   1948 kernel
   4569 lib
   4793 crypto
  13851 net
  80025 sound
  86484 include
 115789 arch
 116623 fs
1035158 drivers
-------+-------
1464313 TOTAL

Any volunteers? :-)


Erez Zadok (3):
      CHECKPATCH: update usage string for checkpatch.pl
      CHECKPATCH: add terse output option to checkpatch.pl
      CHECKFILES: new small shell script to check multiple source files

 checkfiles    |   34 ++++++++++++++++++++++++++++++++++
 checkpatch.pl |   21 +++++++++++++++++----
 2 files changed, 51 insertions(+), 4 deletions(-)

---
Erez Zadok
ezk@cs.sunysb.edu
-

From: Erez Zadok
Date: Friday, October 5, 2007 - 9:56 am

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
---
 scripts/checkpatch.pl |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index dae7d30..ecbb030 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -33,6 +33,7 @@ if ($#ARGV < 0) {
 	print "version: $V\n";
 	print "options: -q           => quiet\n";
 	print "         --no-tree    => run without a kernel tree\n";
+	print "         --no-signoff => don't check for signed-off-by\n";
 	exit(1);
 }
 
-- 
1.5.2.2

-

From: Erez Zadok
Date: Friday, October 5, 2007 - 9:56 am

Such terse output complies with g/cc and looks like

	file_name:line_number:error_message

This output can be easily parsed within text editors (e.g., emacs/vim) that
can produce a split text screen showing in one screen the error message, and
in another screen the corresponding source file, with the cursor placed on
the offending line.

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
---
 scripts/checkpatch.pl |   20 ++++++++++++++++----
 1 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index ecbb030..f8bd630 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -18,12 +18,14 @@ my $tree = 1;
 my $chk_signoff = 1;
 my $chk_patch = 1;
 my $tst_type = 0;
+my $terse = 0;		# terse ouput: report using g/cc error style
 GetOptions(
 	'q|quiet'	=> \$quiet,
 	'tree!'		=> \$tree,
 	'signoff!'	=> \$chk_signoff,
 	'patch!'	=> \$chk_patch,
 	'test-type!'	=> \$tst_type,
+	't|terse'	=> \$terse,
 ) or exit;
 
 my $exit = 0;
@@ -34,6 +36,7 @@ if ($#ARGV < 0) {
 	print "options: -q           => quiet\n";
 	print "         --no-tree    => run without a kernel tree\n";
 	print "         --no-signoff => don't check for signed-off-by\n";
+	print "         -t           => report errors using terse cc-style output (implies -q)\n";
 	exit(1);
 }
 
@@ -267,7 +270,16 @@ sub cat_vet {
 
 my @report = ();
 sub report {
-	push(@report, $_[0]);
+	if (!$terse) {
+		push(@report, $_[0]);
+		return;
+	}
+	# if terse output, extract filename, linenumber, and short message;
+	# format them as a new one-line message and push onto report
+	my($msg, $location, @rest) = split(/\n/, $_[0]);
+	@rest = split(/: /, $location);
+	my($newreport) = sprintf("%s%s\n", $rest[2], $msg);
+	push(@report, $newreport);
 }
 sub report_dump {
 	@report;
@@ -1097,17 +1109,17 @@ sub process {
 	if ($chk_patch && !$is_patch) {
 		ERROR("Does not appear to be a unified-diff format patch\n");
 	}
-	if ($is_patch && $chk_signoff && ...
From: Erez Zadok
Date: Friday, October 5, 2007 - 9:56 am

Examples:
	./scripts/checkfiles fs/foo/bar.c
	./scripts/checkfiles fs/foo/*.c
	./scripts/checkfiles fs/foo	# check all sources under fs/foo
	./scripts/checkfiles .		# check entire kernel

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
---
 scripts/checkfiles |   34 ++++++++++++++++++++++++++++++++++
 1 files changed, 34 insertions(+), 0 deletions(-)
 create mode 100644 scripts/checkfiles

diff --git a/scripts/checkfiles b/scripts/checkfiles
new file mode 100644
index 0000000..bd5d3f0
--- /dev/null
+++ b/scripts/checkfiles
@@ -0,0 +1,34 @@
+#!/bin/sh
+# (c) 2007, Erez Zadok <ezk@cs.sunysb.edu> (initial version)
+# Licensed under the terms of the GNU GPL License version 2
+#
+# Check source files for compliance with coding standards, using terse
+# output in the style that g/cc produces.  This output can be easily parsed
+# within text editors (e.g., emacs/vim) which can produce a split text
+# screen showing in one screen the error message, and in another screen the
+# corresponding source file, with the cursor placed on the offending line.
+# See for example the documentation for Emacs's "next-error" command, often
+# bound to M-x ` (ESC x back-tick).
+
+# Usage: checkfiles file [files...]
+#        if "file" is a directory, will check all *.[hc] files recursively
+
+# check usage
+usage() {
+	echo "Usage: checkfiles file [files...]"
+	echo "(if \"file\" is a directory, check recursively for all C sources/headers)"
+	exit 1
+}
+if test -z "$1" ; then
+	usage
+fi
+if ! test -f scripts/checkpatch.pl ; then
+	echo "checkfiles: must run from top level source tree"
+	exit 1
+fi
+
+# check coding-style compliance of each source file found, using terse output
+find "$@" -type f -name '*.[hc]' | \
+while read f ; do
+	diff -u /dev/null $f | perl scripts/checkpatch.pl -t -
+done
-- 
1.5.2.2

-

From: Ingo Molnar
Date: Saturday, October 6, 2007 - 4:13 am

have you tried that with the latest version too:

  http://www.kernel.org/pub/linux/kernel/people/apw/checkpatch/checkpatch.pl-next

it outputs far fewer false positives.

	Ingo
-

From: Erez Zadok
Date: Sunday, October 7, 2007 - 12:05 pm

Andy, Ingo,

I tried the new checkpatch.pl on 2.6.23-rc9:

$ ./scripts/checkpatch.pl -q --file --emacs fs/namei.c

and got many perl warnings such as:

Use of uninitialized value in concatenation (.) or string at ./scripts/checkpatch.pl line 455.

followed by the usual verbose error message instead of one-per-line as I
#2823: FILE: namei.c:2820:
+EXPORT_SYMBOL(vfs_mkdir);

BTW, calling the option --emacs is a bit too restrictive.  Emacs didn't
invent the format of "filename:linenumeber:message".  C compilers had it
before.  Even "grep -n *" had it before.  That's why I think calling it a
"terse output" option may be more accurate.

The following small patch to checkpath.pl-next seems to fix the perl
warnings, but it still outputs the long error messages along with the
shorter one-liners.

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index bdc493e..bbc4825 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -85,7 +85,7 @@ for my $filename (@ARGV) {
 		push(@rawlines, $_);
 	}
 	close(FILE);
-	if (!process($ARGV, @rawlines)) {
+	if (!process($filename, @rawlines)) {
 		$exit = 1;
 	}
 	@rawlines = ();
@@ -452,7 +452,7 @@ sub process {
 
 		my $rawline = $line;
 
-		$prefix = "$ARGV:$linenr: " if ($emacs);
+		$prefix = "$filename:$linenr: " if ($emacs);
 
 #extract the filename as it passes
 		if ($line=~/^\+\+\+\s+(\S+)/) {

Cheers,
Erez.
-

From: Andy Whitcroft
Date: Thursday, October 11, 2007 - 6:47 am

Yes, this support seems to be wholy broken, as a non emacs user I had
failed to test it correctly as I added the --file option.  Bad Andy.

In testing it I note we are emitting wholy the wrong line number, and

As I understand things this is called emacs mode because the emacs
buffer mode expects the filename:line:message format with the long error
to follow.  It seems pretty emacs specific.  So for now I'll leave it
named that.  If people convince me its --compiler-format or something we
can add that as an alias later.

I have hopefully sorted the main problems with it and will push out an
update for testing.

-apw
-

From: Erez Zadok
Date: Saturday, October 13, 2007 - 11:55 am

OK, I just checked the latest version of checkpatch on 2.6.23.1.  Here are
the stats broken down by subsystem and category of message (error, warning,
or subjective check):

      SUBSYSTEM   Error    Warn   Check   Total
        drivers  866113  168093    7712 1041918
             fs  102133   14016    1236  117385
           arch   78531   32898    5400  116829
        include   41237   42974    1838   86049
          sound   59175   20319    1184   80678
            net    6898    6614     659   14171
         crypto    4333     467      32    4832
            lib    4193     383      61    4637
         kernel     950     895     162    2007
        scripts    1205     742      26    1973
       security     501     591      39    1131
             mm     473     281      79     833
            ipc     344     119      30     493
  Documentation     191      95       6     292
          block     100     146      28     274
           init      64      69      28     161
            usr      14      19       0      33
          TOTAL 1166455  288721   18520 1473696

Erez.
-

Previous thread: [PATCH 2/2] Version 5 (2.6.23-rc8-mm2) Smack: Simplified Mandatory Access Control Kernel by Casey Schaufler on Friday, October 5, 2007 - 9:58 am. (1 message)

Next thread: 2.6.23-rc9-git4: pata_pcmcia, disabling IRQ #9 by Manuel Lauss on Friday, October 5, 2007 - 11:08 am. (18 messages)