This version brings a number of new checks, and a number of bug
fixes. Of note:
- better categorisation and space checks for dual use unary/binary
operators
- warn on deprecated use of {SPIN,RW}_LOCK_UNLOCKED
- check if/for/while with trailing ';' for hanging statements
- detect DOS line endings
- detect redundant casts for kalloc()
Andy Whitcroft (18):
Version: 0.10
asmlinkage is also a storage type
pull out inline specifiers
allow only some operators before a unary operator
parenthesised values may span line ends
add additional attribute matching
handle sparse annotations within pointer type space checks
support alternative function definition syntax for typedefs
check if/for/while with trailing ';' for hanging statements
fix output format for case checks
deprecate SPIN_LOCK_UNLOCKED and RW_LOCK_UNLOCKED
allow complex macros with bracketing braces
detect and report DOS line endings
fastcall is a valid function attribute
bracket spacing is ok for 'for'
categorise operators into unary/binary/definitions
add heuristic to pick up on unannotated types
remove spurious warnings from cat_vet
Dave Jones (1):
Make checkpatch warn about pointless casting of kalloc returns.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
---
scripts/checkpatch.pl | 254 +++++++++++++++++++++++++++++++++++++-----------
1 files changed, 196 insertions(+), 58 deletions(-)
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index dae7d30..59ad83c 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -9,7 +9,7 @@ use strict;
my $P = $0;
$P =~ s@.*/@@g;
-my $V = '0.09';
+my $V = '0.10';
use Getopt::Long qw(:config no_auto_abbrev);
@@ -212,6 +212,11 @@ sub ctx_block_level {
return ctx_block_get($linenr, $remain, 0, '{', '}', 0);
}
+sub ctx_statement_level {
+ my ($linenr, $remain, $off) = @_;
+
+ return ...your checkpatch patch itself produces 22 warnings ...
i ran it over kernel/sched.c and there are many bogus warnings that i
reported to you earlier:
WARNING: multiple assignments should be avoided
#2319:
+ max_load = this_load = total_load = total_pwr = 0;
and new bogus ones:
ERROR: need consistent spacing around '*' (ctx:WxV)
#5287:
+ mode_t mode, proc_handler *proc_handler)
ERROR: need consistent spacing around '*' (ctx:WxV)
#5328:
+static ctl_table *sd_alloc_ctl_cpu_table(int cpu)
ERROR: need space before that '*' (ctx:VxV)
#209:
+# define INIT_TASK_GRP_LOAD 2*NICE_0_LOAD
why did you ignore my feedback? Ever since v8 the quality of
checkpatch.pl has been getting worse and worse as there are way too many
false positives. I'm still stuck on v8 for my own use, v9 and v10 is
unusable.
Ingo
--------------->
WARNING: line over 80 characters
#174: FILE: scripts/checkpatch.pl:572:
+ $line =~ /^.\s*(?:$Storage\s+)?($Ident)\b\s*\**\s*$Ident\s*(?:;|=)/)) {
WARNING: line over 80 characters
#186: FILE: scripts/checkpatch.pl:584:
+ (?:\s*\*+\s*const|\s*\*+|(?:\s*\[\s*\])+)?
WARNING: line over 80 characters
#200: FILE: scripts/checkpatch.pl:612:
+ ERROR("switch and case should be at the same indent\n$hereline$err");
WARNING: line over 80 characters
#208: FILE: scripts/checkpatch.pl:619:
+ my ($level, @ctx) = ctx_statement_level($linenr, $realcnt, 0);
WARNING: line over 80 characters
#213: FILE: scripts/checkpatch.pl:624:
+ # Skip over any removed lines in the context following statement.
WARNING: line over 80 characters
#221: FILE: scripts/checkpatch.pl:635:
+ if ($level == 0 && $ctx =~ /\)\s*\;\s*$/ && defined $lines[$ctx_ln - 1]) {
WARNING: line over 80 characters
#222: FILE: scripts/checkpatch.pl:636:
+ my ($nlength, $nindent) = line_stats($lines[$ctx_ln - 1]);
WARNING: line over 80 characters
#224: FILE: scripts/checkpatch.pl:638:
+ WARN("Trailing semicolon ...That warning is non-bogus, although this is one of the bogosities which I personally don't bother fixing or making a fuss about. But I do think it detracts from the readability of the code, and from patches which later alter that code. A bit. -
I tend to agree, watching the automatic replies from checkpatch, the majority of these are "justifiable" in their context. I think I'll lower this one to a CHECK in the next release. -apw -
what matters is that only items should be displayed that i _can_ fix. With v8 i was able to make kernel/sched*.c almost noise-free, but with v9 and v10 that's not possible anymore. And the moment the default output of the tool cannot be made 'empty', we've lost the biggest battle. Seeing the same bogus (or borderline) warnings again and again destroys the biggest dynamic that could get people to use this tool more often: the gratification of having a 'perfectly clean' file/patch. And this is not about any particular false positive. I dont mind an "advanced mode" non-default opt-in option for the script, if someone is interested in borderline or hard to judge warnings too, but these default false positives are _lethal_ for a tool like this. (and i made this point before.) This is a _fundamental_ thing, and i'm still not sure whether you accept and understand that point. This is very basic and very important, and this isnt the first (or second) time i raised this. Ingo -
You are striving for a level of perfection that is simply not achieveable. v8 is silent about sched.c because it is not checking very much of it, the logical extension of your position is to run 0.1 as that didn't check anything. v9 and 10 carry checks for most of the binary operators which were not there before. Due as I have mentioned before to the complexity of handling the unary/binary scism that C has for those operators and our different spacing requirement for each mode. Whilst I can see that it is gratifying that a patch or file is violations free where there are stylistic aberations in it they should be reported. Where those are questionble showing those as "CHECK" is not unreasonable. I will add a "--no-check" for you so that those are suppressed based on the assumption you know what you are doing. I think it is clear that we differ on what should and should not be output by default. Clever people are able to opt out of the warnings, of things they think they dissagree with. It is the people with little experience who need the most guidance and those people who the tool should target by default. You cannot expect someone with no experience to know they need to add '--i-need-more-help' whereas _you_ I can expect to say '--leave-me-alone' or indeed to make the call that the output is plain wrong and _you_ know you should ignore it. Fundamentally I am not trying to help the people who are careful but those who do not know better. As for the false positives, those I am always interested in and always striving to remove, as they annoy me as much as the next man. -apw -
I dont think Ingo is looking for perfection. Its about a different optimization goals. Let me put it this way: checkpatch in advanced mode: - I want to be able to see as many possible problems (this is the optimization goal) - I accept that I get false positives - not useful for git and mail traffic checkpatch in safe mode: - I never want a false positive (different optimization goal!) - I accept that I will miss several real bugs because several tricky tests are disabled - useful for git and mail traffic Christian -
Maybe checkpatch.pl needs an option '-W' to turn on/off those vexed "noise". (It seems that 'q|quiet' doesn't do as much as what it hints.) ;-) -- "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." -
Make checkpatch.pl a C language parser, then it can handle all the whitespace violations without false positives. :-) -
Well, yes, that's the root problem. checkpatch is a regex tool, not a C parser. And like Andy commented in another mail in this thread, (esp. since it's not a C parser), it's just a guide. --- ~Randy -
you _still_ fail to understand (let alone address), the fundamental
the most usable checkpatch.pl version was v6/v7. It went downhill after
that. Look at the script size versus the number of complaints about
kernel/sched.c:
size # warnings
----------------------------------------
25383 checkpatch.pl.v6 5
26038 checkpatch.pl.v7 6
29603 checkpatch.pl.v8 65
31160 checkpatch.pl.v9 24
34950 checkpatch.pl.v10 28
i.e. size did not increase all that significantly, still the number of
warnings has increased significantly - with little benefit.
and here's the breakdown for v10: out of 28 warnings, only 6 are
legitimate! So the signal to noise ratio has worsened significantly
since v6. One warning per 300 lines of sched.c is _not manageable_. It
translates to _over 25 thousand false positives_ for the whole kernel.
every false warning has additive costs: i will have to ignore it from
that point on, forever - and i frequently have to re-check the same
thing again, and again - just to discover that the warning is still
bogus.
the mails from me in your mbox during the past few months should be
proof that i'm fully constructive regarding this matter and that i'm not
striving for 100% level of perfection. I'm simply stating the obvious:
your tool is less and less useful with every new version and more
troubling than that is your apparent inability to realize what this
you still dont get the point of this. This isnt about writing a tool
that 'can' find something to complain about. This whole kernel source
code quality task is about:
_making kernel source code quality better_
not more, not less. If your tool outputs way too many false positives
and way too many unimportant borderline cases, people will ignore the
tool, and consequently YOU MAKE THE KERNEL'S SOURCE CODE WORSE than it
could be. How hard is that to understand?? Having zero (or near zero)
output from a checker ...That is unfair. Every time we discuss it I state that I disagree that hiding mostly useful tests is a good thing. I would love the tests to be 100% accurate, but if I removed all the tests that can false positive I would literally have none. There is a balance to be struck and we That is just stupid, I am no fundamentalist. I personally care not what tests there are or indeed if they are enabled by default. I personally feel you are more capable of turning off things you think are wrong, and as such the default should be to offer all of the tests. You disagree, Yes, but dispite that we don't just turn warnings off. Even though they produce false positives, as having them has some benefit. My contention is that checkpatch is following that same approach, risking some false positives to try and catch problems. Anyhow. I have already added a --check/--no-check option which controls the more subjective tests which will be in the next release; though its likely the option name will be something more useful by then. The only question is whether this should default to on. You are voting off. I personally think on. Andrew? Randy? Joel? -apw -
Hi Andy, Are you disagreeing with the numbers Ingo posted? 25,000 false positives for the kernel is beyond silly... Existing conventions should matter a lot and the default configuration for a static code checker should really be 100%. So why not hide the potentially useful warnings under -Wtoo-strict or similar command line option? -
I have not run across the whole kernel to find out, his estimation is likely high as his sample (mm/sched.c) includes a particular construct (multiple assignment) which is reported and overly common in that piece of code. If I take mm/signal.c (also big) I get 1/1000 files, and those two are easily fixed. I should note it shows some 62 actual real violations in that file. I do receive automated checks of every patch posted to lkml and I work to remove the false positives from them. The false positive ratio is very low in those reports and it those which drive my development effort. checkpatch is a work in progress and likely will be for many years to come. I have propose we 'gate' those subjective tests, and have asked for input on that thread on the default for those tests. -apw -
The main audience of this is new contributors, who should have more verbose output, including nitpicky things like multiple assignments per line. The default should target them. More advanced users can certainly use a flag that says "give me only the real errors". It might be a good idea to have three levels. --really-errors --really-picky --really-experimental Only with better names. --really-picky would be default, but would only include tests that have a very very high ratio of hits to false positives, but would still call out things like multiple assignments per line. --really-errors we could call the Igno level. --really-experimental would call out all issues, even on checks that generate a fair number of false positives. -Joel -
I think that if someone wants --verbose mode (--noisy, --beginner), they should have to ask for it. --- ~Randy -
off, I'd say. That way people are more likely to use it. Or, more accurately, will have less excuses to not use it. -
Ok, then I think thats 2 for on and 3 for off. So off it is. I was tending towards --subjective for the tests which are err more subjective. --strict is good too. Perhaps I'll put both of those in as aliases. I will also review the tests which are warnings and checks (subjective) and see if any are now miss-categorised. -apw -
here is an update wrt. the latest checkpatch.pl-next version
35948 checkpatch.pl.v11pre 11
so things are heading in the right direction :)
of those 11 warnings, 6 are correct warnings (4 will be solved via
KERN_CONT, 1 will be solved via a proper include file, and 1 is an
overlength line), 4 are borderline warnings (easily fixed) and only one
is a false positive! So v11-to-be gets the "best checkpatch.pl ever"
badge from me :)
The false positive is:
ERROR: need consistent spacing around '*' (ctx:WxV)
#5322:
+static ctl_table *sd_alloc_ctl_cpu_table(int cpu)
^
i think checkpatch.pl mistook this function definition as an arithmetic
expression?
But, there's a cleanliness bug underlying this false positive:
'ctl_table' is a typedef, and it would be cleaner to use 'struct
ctl_table' thoughout the kernel. When running checkpatch.pl over
include/linux/sysctl.h, it warns about the typedef:
WARNING: do not add new typedefs
#944:
+typedef struct ctl_table ctl_table;
(but mistaking that function for an arithmetic expression is still a bug
i think.)
nice work Andy!
Ingo
-
The original purpose was to catch the most tivial coding style errors. But then people started to be far too innovative and we now ended up with a checkpatch that try to check for every lillte glory detail. It would be much preferable to have a checkpatch - and that may be an incarnation of the current one or a new one. What it should do would be to catch the trivialities that we see happens in many patch submissions like: a) wrong patch format (-p1, unified diff) b) Whitespace versus tabs c) PascalCasing in new functions d) excessive line length e) C99 comments '//' (not that I see why they are banned) and maybe a few more trivial chacks. This would benefit from a very low false-positive rate and it could then be used as a patch-bot. As it is now where it tries to check for everything and a bit more it generates too much noise so a patch-bot usage is not possible. And users get annoyed by the excessive output. If we then have the same script that in a -pedantic mode generate more noise - fine. But leave the default simple. Sam -
well, the two variants is: max_load = this_load = total_load = total_pwr = 0; max_load = 0; this_load = 0; total_load = 0; total_pwr = 0; and the first one is more readable and more compact. (as long as the conceptual 'type' of the variables is the same - which it is in this case.) anyway, this is something where reasonable people might disagree, and a tool should not force it one way or another. And this is the second time i raised this very example and Andy ignored my feedback and failed to notice the structural problem behind it (that a tool should only warn by default if it is _sure_ that there is a problem - otherwise the tool cannot be used for effective [i.e. automated] quality control), so i'm raising this point again, in a slightly more irritated tone ;-) Ingo -
I think if you read your incoming email you will see nothing of the sort. I have discussed this with you and in public. The multiple assignment check you dissagree with and we have softened it in direct response to that dislike. However, the main proponent of this existing wanted that check. Therefore it has stayed. The other false positives you report are real. Some are fixed in my development version, others are not. They come from the fact that I was asked for better checks on '*' and the like in its binary mode. To get that I had to actually start telling unary and binary uses of the same operator appart. That is hard in the face of typedef'd types. I am working to make it better. However, the key here is that it will never be 100%, not without becoming a try C parser. The output is a _guide_ if you don't like its output ignore the reports you dislike. I for one send out patches with style violations where I deem that the code is better that way. -apw -
