Re: [rfc] the kernel workflow & trivial "global -> static" patches (was: Re: [2.6 patch] make sched_feat_{names,open} static)

Previous thread: [2.6 patch] unexport uts_sem by Adrian Bunk on Monday, May 5, 2008 - 11:29 am. (12 messages)

Next thread: [2.6 patch] the scheduled i2c driver removal by Adrian Bunk on Monday, May 5, 2008 - 11:29 am. (2 messages)
From: Adrian Bunk
Date: Monday, May 5, 2008 - 11:29 am

This patch makes the following needlessly global code in kernel/sched.c  
static:
- sched_feat_names[]
- sched_feat_open()

Signed-off-by: Adrian Bunk <bunk@kernel.org>

---

This patch has been sent on:
- 23 Apr 2008

 kernel/sched.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

2a1e54c660bb9c574133bda95468b1ea28239e95 diff --git a/kernel/sched.c b/kernel/sched.c
index 57ba7ea..19df986 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -757,14 +757,14 @@ const_debug unsigned int sysctl_sched_features =
 #define SCHED_FEAT(name, enabled)	\
 	#name ,
 
-__read_mostly char *sched_feat_names[] = {
+static __read_mostly char *sched_feat_names[] = {
 #include "sched_features.h"
 	NULL
 };
 
 #undef SCHED_FEAT
 
-int sched_feat_open(struct inode *inode, struct file *filp)
+static int sched_feat_open(struct inode *inode, struct file *filp)
 {
 	filp->private_data = inode->i_private;
 	return 0;

--


Adrian, you've been doing these "make needlessly global code static" 
patches for years meanwhile. I'm asking whether you have made any 
thoughts about going to the next level and automating (or at least 
half-automating) the generation of these trivial patches, so that they 
become less disruptive to our patch flow?

My opinion is that the technical benefit of these patches is positive to 
Linux: they make the kernel image a tiny bit smaller - for example this 
sched.c patch of yours i'm replying to makes the 64-bit kernel 8 bytes 
smaller. So i always applied your patches, and will do so in the future 
too.

But the per patch benefit is arguably extremely low: for example this 
particular patch saves 0.00016% from my particular vmlinux's size. Even 
on a Linux-Tiny kernel config, it's just 0.0008% of the vmlinux's size. 
We'd need more than 600 such patches (!) to make just 0.1% of a code 
size difference to my vmlinux, and even 0.1% is still very, very small. 
[ and many of these patches do not even trigger any savings on tiny 
  configs. ]

The point that many kernel developers make is that if we compare this 
small benefit to the disruption to the workflow this many small patches 
can cause, they could in fact become a net negative contribution (!). So 
it is extremely important to reduce the disruption, as much as we can.

And unlike other kernel developers, my opinion is not that we should 
eliminate this disruption by not doing these trivial patches _at all_. 
My opinion is that we should make it easier for maintainers to _avoid 
introducing_ these problems.

I.e. we need to fight the root of this problem (the steady introduction 
of needlessly global symbols), not its symptoms (the needlessly global 
symbols themselves).

Let me raise some thoughts about what we could do to achieve these 
goals.

Firstly, the methodology: you are using "make namespacecheck" on an 
allyesconfig kernel to find these 'needlessly global' sites, correct? Do 
you perhaps also ...

in some automated ways...

We have Documentation/SubmitChecklist that we hope that patch
submitters will use, but we have little evidence that they (we) do
use it, and we have some evidence that they (we) do NOT use it.
(but this isn't automated)

I see being related to DaveM's "Slow down, please" thread.
We have developers hurriedly writing new code but not paying enough
attention to code that they have already written.
(not directed at anyone in particular)

E.g., you do maybe 200 randconfig builds per day.  I only do

Good discussion material.


I have a similar problem with trying to keep kernel documentation
(kernel-doc & docbook) in sync with source code changes.
I try to review all patches that contain Documentation/ changes
(OK, I'm a few emails behind on that).  Anyway, it seems to be
a never-ending battle.

-- 
~Randy
--


On Mon, 5 May 2008 13:40:53 -0700 (PDT)


can we do a "make patchcheck" kernel build target that would
* run checkpatch on teh patch
* build the kernel without the patch (in various .configs, probably
allyesconfig / allmodconfig is enough, but we can figure this out
later)
* apply the patch
* build the kernel in the same configs
* build a kernel for install that has the 'standard debug options' on
  (lockdep, slabpoison etc)
then we can
* compare if new gcc warnings got introduced
* compare if major stack usage got introduced
* compare if namespace_check and some of the others introduce new issues
* compare if new sparse warnings got introduced
and maybe even run a bloat-o-meter to show code growth/shrinkage
[insert other useful checks here]

if all of that is just one command away, I bet quite a few people would
use it
(and the more useful it gets the more people will use it)

yes you can do the same thing by hand.
but yes it's many steps that are cumbersome if not automated... so few
people will do it.

If it's all in one step... 


--


keep listing the checks you want to have done. I'll bet that if you can 
automate it enough you will have people setting up farms to do the 
compiles (or, ideally, package it with a seti-at-home type of thing and 
many people will donate spare cycles to grinding away at it)

how much value is lost if this is run on larger chunks? during the merge 
window the number of patches will exceed anyone's capability to check each 
one individually (not to mention it being a little late), but I think 
there would still be value in checking larger groups of patches.

David Lang
--


On Mon, 5 May 2008 16:19:13 -0700 (PDT)

only some; if the output is a filename/linenumber (and some message) we
can always do "git blame" on when it changed last.
Heck we could then automate to run at that point again to see if that
patch really caused it (similar to bisect but a lot more targeted).

This of course breaks if you do it once a year... but if you do it
"small enough" it shouldn't be that bad... even during a merge window
it'll be fine.
--


I'm not sure we could do it for every single patch (because of the
time it would take), but how about automating it so that for every
single tree which is getting pushed to linux-next, we have a build
tree which automatically builds "origin..next" (i.e., the set of
commits that are being proposed for pushing into mainline), comparing
whether the current set of changes being proposed for pushing to
mainline, for each tree.   

Ideally the maintainer would do this himself before nominating the set
of patches to linux-next, but if not, if someone were interested in
doing this work automatically, and then sending the results to the
developer and cc'ed to LKML as a service, it would probably serve a
useful "gentle nudge" towards doing the right thing.  :-)

       	       	      	      	    	      - Ted
--


On Mon, 5 May 2008 19:45:15 -0400

I don't think build power is an actual problem for things like this,
since it tends to distribute really well.
--


On Mon, 5 May 2008 14:51:32 -0700

since people seem to be at least somewhat interested; I've started a
shell script that will do (for now, part of) the above.

It's very very premature, for one it is missing any and all error
checking, but it's a start. It's getting late here.. .maybe someone
wants to improve it before I wake up ? ;)

From: Arjan van de Ven <arjan@linux.intel.com>
Date: Mon, 5 May 2008 19:14:40 -0700
Subject: [PATCH] first stab at a "build with/without patch and report new non-statics and new unused exports" script

---
 dopatchtest.sh           |   63 ++++++++++++++++++++++++++++++++++++++++++++++
 scripts/export_report.pl |   20 ++++++++++++--
 scripts/namespace.pl     |   26 +++++++++++++------
 3 files changed, 98 insertions(+), 11 deletions(-)
 create mode 100644 dopatchtest.sh

diff --git a/dopatchtest.sh b/dopatchtest.sh
new file mode 100644
index 0000000..9362493
--- /dev/null
+++ b/dopatchtest.sh
@@ -0,0 +1,63 @@
+#!/bin/sh
+
+
+echo "cleaning trees"
+make mrproper
+
+rm -rf before after
+mkdir before before/allyesconfig  before/allmodconfig before/allnoconfig &> /dev/null
+mkdir after after/allyesconfig after/allmodconfig  after/allnoconfig &> /dev/null
+
+function build {
+	echo "   building $2"
+	make O=$1/$2 $2 &> $1/$2/config.log
+	make O=$1/$2 -j16 -s &> $1/$2/build.log
+	export srctree=$PWD
+	pushd $1/$2 > /dev/null
+	perl ../../scripts/namespace.pl --compact &> namespace.log
+	perl ../../scripts/export_report.pl -u &> export_report.log
+	popd > /dev/null
+	make O=$1/$2 checkstack &> $1/$2/checkstack.log
+}
+
+function check_static {
+	verbose=0
+	diff before/$1/namespace.log after/$1/namespace.log | grep -q "^>" && verbose=1
+	if [ $verbose -eq 1 ]; then
+		echo "New global symbols without users:"
+		diff before/$1/namespace.log after/$1/namespace.log | grep "^>" | cut -c2-
+	fi
+}
+
+function check_exports {
+	verbose=0
+	diff before/$1/export_report.log after/$1/export_report.log | grep -q "^>" && verbose=1
+	if [ ...

On Mon, 5 May 2008 22:19:06 +0200

We don't actually care about what comes out of `make namespacecheck'.  We
care about the _difference_ in its output when a patch is applied.

So a script which reports on what changes a particular patch has upon
namespacecheck output might be the way to go.  If it is fast enough then it
can be run on a per-patch basis alongside checkpatch.

--


"make namespacecheck" works on the binary objects.

- touching header files can result in a complete rebuild
- if a patch alters which objects get built you should start with
  a clean object dir

The question is therefore basically whether a complete rebuild of an
all*config kernel is fast enough for you...

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

--


On Tue, 6 May 2008 00:07:12 +0300

That would be quite a bother.

I do think that we should aim to get these things fixed _before_ the
offending patches get into mainline.  It's dopey to append a sprinkle of
fixups against any particular patch after it has hit mainline when we have
the tools to fix those things up beforehand.

And it'd help to educate submitters to check their own stuff.  So when
these post-facto fixups are prepared then it is good to rub people's
noses^W^W^Wgently remind submitters about the problems in their work.
Probably you are already doing this.



Actually, we could perhaps do a lot of this at the checkpatch level?  If
checkpatch sees a global symbol being added and the same patch does not add
references to that symbol from a different file then whine.  Obviously this
will generate false positives but that's OK.
--


Not sure what is possible at the checkpatch level.

Adding -Wmissing-prototypes to the CFLAGS (which was my original 
motivation for doing these patches) would help much, but it's still a 
long road until I can propose it without being lynched for the warnings 
it still generates...

Or teach people to use sparse?

After all, this thread erupted on a patch against kernel/sched.c
that fixes something sparse also warns about.

I just tried running sparse against this file - looks scary...

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

--


On Mon, 5 May 2008 14:26:04 -0700

or.. doesn't add it to a header file. That might be even more generic;
(and enforces a "all global functions need a prototype in a header
somewhere)
--


That does sound possible.  I am sure it will false positive quite a lot,
but its probabally worth a stab.

-apw
--


I could have announced this patch as "fixes two sparse warnings"
(which is true and seems to even make Linus bypass maintainers for 
applying patches for similar patches I've seen from other people).

Fact is that I only very rarely use sparse, do the things from different 
perspectives, and only the result often happens to overlap with what 
sparse does.

If the labeling is what makes you say "benefit is arguably extremely 
low" I can put more emphasize on which of my patches also fix sparse 



- "make namespacecheck"
- "make export_report"
- the -Wmissing-prototypes gcc flag

The latter is what I actually started with many years ago (that might 
even predate sparse), and we are slowly getting nearer to being able to 
enable it in the kernel (also thanks to sparse now emitting warnings 
for the same things).


Not exactly allyesconfig since my compile tests predate the 
introduction of the *config targets in 2.5, but my standard

I grep the sources and check in git what happened and what to do
(this sometimes finds some real bugs).

I edit the file(s).

Then I do:
  cg-commit < /dev/null; git-show --pretty=oneline > /tmp/patch-static-sched

Write the email with mutt+nano.

Store the email in my postponed-msgs folder.

Give a batch of patches some compile testing and check the output 
(e.g. removing some code might make gcc complain about now unused 

You can implement whatever you want to implement.


cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

--

From: Andi Kleen
Date: Monday, May 5, 2008 - 5:21 pm

I don't think the code changes actually with current gcc for integer
code if you change something from global to static (unless it causes
gcc to inline the function, but then it might be well larger if you're
unlucky)

The only file size change you'll see will be from a smaller symbol
table in the vmlinux ELF file, but that is not even loaded at run time
or included into the bzImage (and the kallsyms table has statics too)

So if we follow that "smaller symbol table" is better model
should we make a rule that all globals be 8 characters only? Or perhaps 6?  @)
I'm sure that could be easily enforced by some tool... 

I could see some advantage from static in future compiler versions 
though from better optimization, but it's quite remote.

I agree with your point that it's not effective to spend human time
to generate such patches.

-Andi
--

From: Adrian Bunk
Date: Monday, May 5, 2008 - 11:18 pm

It's a common case that a function has only one caller. It should always 

I'm not attaching size change information to these patches since 
whatever change one sees anyway also depends on other factors like
the exact kernel configuration, so it's non-trivial to get numbers
that could be taken seriously.

There are many small aspects, e.g. both gcc with -Wmissing-prototypes 
and sparse give warnings, and the problem might either be needlessly 
global code or the fact that a function prototype is either not in a 
header or the header not #include'd by the file. Although I've only
2 or 3 times catched such bugs in the kernel that is a nasty to debug 

The best case I've actually seen in practice was a variable I made 
static, and with CONFIG_DEBUG_FOOBAR=n gcc was now able to prove that 
the value never changed resulting in the variable plus quite a chunk

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

--

From: Andi Kleen
Date: Tuesday, May 6, 2008 - 4:13 am

At least for the scheduler patch that started this thread this was not
the case. One was a global variable and the other was a callback from

Yes it's good to catch those. However I suspect there are better tools
for that that do it less work intensive. Traditionally in the Unix
world "lint" has been used to track this kind of bugs (dating back from
before prototypes were added to C). Now running any lint over the kernel
source would result in a incredibly number of warnings I'm sure, but
perhaps one can be configured to only output warnings related to
inconsistent prototypes over files. There are a couple of free lints

Sounds like the variable should just have been removed then in the source?

-Andi
--

From: Adrian Bunk
Date: Tuesday, May 6, 2008 - 4:25 am

No, an #ifdef CONFIG_DEBUG_FOOBAR guarded some sysctl that allows users 
to change the value.


cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

--

Previous thread: [2.6 patch] unexport uts_sem by Adrian Bunk on Monday, May 5, 2008 - 11:29 am. (12 messages)

Next thread: [2.6 patch] the scheduled i2c driver removal by Adrian Bunk on Monday, May 5, 2008 - 11:29 am. (2 messages)