Re: Are Section mismatches out of control?

Previous thread: [PATCH] make loglevel related commandline to early_param by Yinghai Lu on Friday, February 1, 2008 - 6:40 am. (6 messages)

Next thread: Re: how to tell i386 from x86-64 kernel by Rik Bobbaers on Friday, February 1, 2008 - 6:47 am. (2 messages)
To: LKML <linux-kernel@...>, linux arch <linux-arch@...>, James Bottomley <James.Bottomley@...>
Cc: Andrew Morton <akpm@...>
Date: Friday, February 1, 2008 - 6:47 am

James said in a related posting that the Section mismatch
warnings were getting out of control.

So I decided to take a closer look at current
status. Latest mainline with Adrian + mine fixes applied.

Target was x86 - an allyesconfig build.
I looked at the reported Section mismatch warnings per
directory so see where it was looking bad.

The list is here:

Directory Warnings
================= ========
block/ 0
fs/ 0
init/ 0
lib/ 0
net/ 2
sound/ 3
crypto/ 0
ipc/ 0
kernel/ 0
mm/ 0
usr/ 0
security/ 0
drivers/net/ 11
drivers/ata/ 2
drivers/base/ 0
drivers/block/ 0
drivers/cdrom/ 0
drivers/crypto/ 0
drivers/hid/ 0
drivers/input/ 0
drivers/lguest/ 0
drivers/leds/ 0
drivers/media/ 0
drivers/pci/ 0
drivers/pcmcia/ 9
drivers/power/ 0
drivers/ps3/ 0
drivers/rtc/ 1
drivers/scsi/ 3
drivers/isdn/ 34
drivers/serial/ 10
drivers/spi/ 0
drivers/usb/ 0
drivers/video/ 14
drivers/telephony/ 0
drivers/watchdog/ 0
drivers/w1/ 0
drivers/dca 0
drivers/edac/ 0
drivers/acpi/ 2
drivers/char/ 3
drivers/cpufreq 3
drivers/hwmon/ 1
drivers/infiniband/ 0
drivers/md 0
drivers/message/ 0
drivers/misc/ 0
drivers/mmc/ 0
drivers/mtd/ 0
drivers/parport/ 0
drivers/pnp/ 0

As expected the majority is in drivers/
And as is obvious from the above the warnings are
concentrated on a few places:
drivers/isdn/ has the top score.
Then we have video/, net/ serial/ and pcmia.

With thos four directories clean we are down with
78 less warnings.

The total figure for this build is 106 warnings.
In my book things are not out of control.
So stop complaining and lets see some fixes.

I will look at drivers/isdn as next step.

Sam
--

To: Sam Ravnborg <sam@...>
Cc: LKML <linux-kernel@...>, linux arch <linux-arch@...>, Andrew Morton <akpm@...>
Date: Friday, February 1, 2008 - 10:53 am

What I actually said was that the churn in the source base caused by
these sectional mismatches was getting out of hand.

What I questioned was the value of inspecting all the drivers and trying
to fix them all (and the effort involved) vs my proposed solution of

But on your own estimate, that's around 50 patches to fix all of this,
isn't it? Which all have to be inspected, tested and integrated. Is
that really a worthwhile exercise for saving 5k of memory (also from
your own estimate). Particularly as the people who care about extreme
memory configurations aren't going "goody sections saved us 5k", they're
going "&^*&^ me the kernel increased in size by 150k on my system from
2.6.15 to 2.6.24".

James

--

To: Sam Ravnborg <sam@...>
Cc: LKML <linux-kernel@...>, linux arch <linux-arch@...>, James Bottomley <James.Bottomley@...>, Andrew Morton <akpm@...>
Date: Friday, February 1, 2008 - 7:10 am

My question is: where are crashes? If the sections were
really in such bad shape and since we poison (and sometimes
even unmap) init after boot we should in theory see a lot
of oops reports from this if there were really accesses to
them after boot.

The interesting question is how many true bugs are in these warnings.

-Andi
--

To: Andi Kleen <ak@...>
Cc: Sam Ravnborg <sam@...>, LKML <linux-kernel@...>, linux arch <linux-arch@...>, James Bottomley <James.Bottomley@...>, Andrew Morton <akpm@...>
Date: Friday, February 1, 2008 - 5:51 pm

Perhaps still in RAM? Explicitly zero the area after unmapping,
maybe something happens.
--

To: Jan Engelhardt <jengelh@...>
Cc: Andi Kleen <ak@...>, Sam Ravnborg <sam@...>, LKML <linux-kernel@...>, linux arch <linux-arch@...>, James Bottomley <James.Bottomley@...>, Andrew Morton <akpm@...>
Date: Saturday, February 2, 2008 - 12:12 am

This is already doing (but not with zero). With DEBUG_RODATA it is
even unmapped.

-Andi

--

To: Sam Ravnborg <sam@...>
Cc: LKML <linux-kernel@...>, linux arch <linux-arch@...>, James Bottomley <James.Bottomley@...>
Date: Friday, February 1, 2008 - 7:03 am

Question is: why do people keep adding new ones when they are so easy to
detect and fix?

Asnwer: because neither they nor their patch integrators are doing adequate

Thanks.
--

To: Andrew Morton <akpm@...>
Cc: Sam Ravnborg <sam@...>, LKML <linux-kernel@...>, linux arch <linux-arch@...>, James Bottomley <James.Bottomley@...>
Date: Friday, February 1, 2008 - 12:43 pm

Another way to look at it... All of a sudden, different from 2.6.24,
kernel 2.6.25-git build spews so many warnings that I need to disable
section mismatch checking completely, because there is so much noise
that __normal build messages scroll off the screen__.

Jeff

--

To: Jeff Garzik <jeff@...>
Cc: Andrew Morton <akpm@...>, LKML <linux-kernel@...>, linux arch <linux-arch@...>, James Bottomley <James.Bottomley@...>
Date: Friday, February 1, 2008 - 4:17 pm

One can ignore or one can fix...
I decided to spend some of my friday on fixing section mismatch
warnings as I've got a bit irritated over people spending time
complaining but failing to provide patches.

My work approach was to seelct a directory and fix all warnings
there before moving on. So I did not as such pick the easy ones
or anything like that.

Despite this I managed with a one day effort to bring down
the number of warnings in my build from 106 to 50 warnings.
And yesterday the figure was 136.

So when complaining about the number of warnings I can only say
that we would all benefit if they are fixed.
Just ignoring them will not sort it out.

I did not pull your latest patches so my numbers are outdated
for net/drivers/ which alone count for 11 out of 50 warnings
in my tree.

I am also eagerly awaiting Andrew to push as he has a few
patches pending all over but the section mismatch warnings
are the least of his worries these days.

I hope to get down to less than 20 warnings in my allyesconfig
build before the merge window closes.
And with just a bit of assistance from others this
is doable.

Sam - who expected more people to actually fix this stuff :-(
--

To: Sam Ravnborg <sam@...>
Cc: Andrew Morton <akpm@...>, LKML <linux-kernel@...>, linux arch <linux-arch@...>, James Bottomley <James.Bottomley@...>, Linus Torvalds <torvalds@...>
Date: Friday, February 1, 2008 - 4:24 pm

Well, with due respect, it's a bit presumptuous to add a bunch of
warnings to the kernel build (due to more strict checking), and then get
annoyed when people aren't jumping up and fixing this stuff immediately.

There were no build complaints in 2.6.24 for my stuff (libata and
drivers/net) during my test builds, nor were there any for my 2.6.25-git
merge window pushes, nor were there any complaints when I last checked
Andrew's -mm tree.

So from our perspective, you dumped a lot of work in our laps from out
of the blue, getting irritated at us along the way.

Maybe we can resolve this in a more kinder, gentler, coordinated
fashion? :)

What could be done to prevent this sort of situation in the future?
Maybe add these checks to -mm, and then not push your strict checking
upstream until the build noise is reduced?

Jeff

--

To: Jeff Garzik <jeff@...>
Cc: Andrew Morton <akpm@...>, LKML <linux-kernel@...>, linux arch <linux-arch@...>, James Bottomley <James.Bottomley@...>, Linus Torvalds <torvalds@...>
Date: Friday, February 1, 2008 - 6:38 pm

I did some testing and the new code does not emit warnings which were
not emitted before. But previously you had to use less typical
It is the misinformation being spread that irritates me.
Thousand of hours, no real bugs found etc.

Anyway - that is all forgot tomorrow when we get the warning level down

The good thing about getting it upstream is the additional attention.
If we do not get it down to acceptable levels I have no problems
turning off the section mismatch in minline but keep it enabled
in -mm.

Sam
--

To: Andrew Morton <akpm@...>
Cc: Sam Ravnborg <sam@...>, LKML <linux-kernel@...>, linux arch <linux-arch@...>
Date: Friday, February 1, 2008 - 11:00 am

That's because certain configurations, where this shows up, like !
CONFIG_HOTPLUG or !CONFIG_HOTPLUG_CPU simply aren't anything other than
extremely weird configs for most modern HW and distributions.

I'm not saying we can't fix the problem with tools or that we can't
invest huge amounts of time patching and fixing this. I am saying we
should be questioning the value of the investment in doing so.

I also claim that getting rid of the __dev.* sectional attributes (for
which I've already submitted a small patch) fixes 90% of the problems
(since they're mainly caused by driver writers not understanding what
all of these things mean). The remaining ones are clearer and
additionally, show up on every build (whether HOTPLUG or not).

Before we start to pillory driver writers for not caring about configs
they rightly don't give a toss about, could we at least get someone
(anyone!) in the community that cares about extreme low memory
configurations to give us reasons why there's value to all this effort
(rather than spending it on say reducing cache footprints or eliminating
space in structures)? All the people I've asked so far (mainly in arm,
admittedly) have said they have bigger things to worry about.

James

--

To: Andrew Morton <akpm@...>
Cc: Sam Ravnborg <sam@...>, LKML <linux-kernel@...>, linux arch <linux-arch@...>, James Bottomley <James.Bottomley@...>
Date: Friday, February 1, 2008 - 10:48 am

Hi,

How about adding an Untested-by tag for psychological reasons?

Hannes
--

To: Andrew Morton <akpm@...>
Cc: Sam Ravnborg <sam@...>, LKML <linux-kernel@...>, linux arch <linux-arch@...>, James Bottomley <James.Bottomley@...>
Date: Friday, February 1, 2008 - 7:21 am

Because if there already exists more than a handful peoples' eyes glaze
over and ignore "just one more warning"

Unless they break the build, or if there currently are 0 and they make
it non-zero, people seem not to care....sad. Probably the same for
sparse/checkpatch, "there's plenty already, I can't be bothered to look"

Subject to someone _making_ it an issue, can't see it changing.

Harvey

--

To: Harvey Harrison <harvey.harrison@...>
Cc: Andrew Morton <akpm@...>, Sam Ravnborg <sam@...>, LKML <linux-kernel@...>, linux arch <linux-arch@...>, James Bottomley <James.Bottomley@...>
Date: Friday, February 1, 2008 - 5:47 pm

checkpatch does not parse C, it uses heuristical regexes.

That makes it very different from sparse or the section mismatch
finder which do not output false positives.
--

To: Jan Engelhardt <jengelh@...>
Cc: Harvey Harrison <harvey.harrison@...>, Andrew Morton <akpm@...>, LKML <linux-kernel@...>, linux arch <linux-arch@...>, James Bottomley <James.Bottomley@...>
Date: Friday, February 1, 2008 - 6:40 pm

Unfortunately I most correct you. Section mismatch checks seldoms finds
what I would call 'real' bugs that causes oops - but it happen.
It is mostly fasle positives that needs workaround, but also a great
deal of missing annotation resulting in additional memory saved.
And then occasionally a bad reference in some error handling that
seldom trigger but when it does it would oops.

Sam
--

To: Sam Ravnborg <sam@...>
Cc: Harvey Harrison <harvey.harrison@...>, Andrew Morton <akpm@...>, LKML <linux-kernel@...>, linux arch <linux-arch@...>, James Bottomley <James.Bottomley@...>
Date: Friday, February 1, 2008 - 8:01 pm

What I meant with false positives:=
modpost warning about something that is not true.

I have not yet seen such happening where code is obviously correct
by the eyeball, but modpost gets it wrong.
--

To: Jan Engelhardt <jengelh@...>
Cc: Harvey Harrison <harvey.harrison@...>, Andrew Morton <akpm@...>, Sam Ravnborg <sam@...>, LKML <linux-kernel@...>, linux arch <linux-arch@...>
Date: Friday, February 1, 2008 - 6:10 pm

Even by the exalted standards of LKML which sometimes seems to make a
virtue of misinformation, four wrong statements in twenty seven words is
pretty impressive ... I salute you!

James

--

To: Harvey Harrison <harvey.harrison@...>
Cc: Andrew Morton <akpm@...>, Sam Ravnborg <sam@...>, LKML <linux-kernel@...>, linux arch <linux-arch@...>, James Bottomley <James.Bottomley@...>
Date: Friday, February 1, 2008 - 1:02 pm

> Subject to someone _making_ it an issue, can't see it changing.

Actually I think that Sam's recent improvements to the section
mismatch detection should make it easy to get at least all of the
driver issues fixed -- the problem in the past was that many warnings
would only show with certain configs or certain compilers (inlining
functions only called once would hide many warnings).

Now I just need to enable CONFIG_DEBUG_SECTION_MISMATCH and I'm pretty
certain to see everything (and indeed this found a few warnings in
drivers/infiniband hidden by inlining, which I've queued up a fix for).

So the __cpuinit stuff may be trickier and half false positives etc,
but I'm confident drivers/ will be cleaned up quickly now thanks to Sam.
--

To: Harvey Harrison <harvey.harrison@...>
Cc: Andrew Morton <akpm@...>, Sam Ravnborg <sam@...>, LKML <linux-kernel@...>, linux arch <linux-arch@...>, James Bottomley <James.Bottomley@...>
Date: Friday, February 1, 2008 - 9:30 am

And before, the actual warnings depended a lot on the kernel
configuration, so making the build break was less of an option.
If Sam's improved section mismatch detection turns out to work fine, we
can fix the issues and start to enable breaking of the build in case of
warnings.

BTW, on m68k I get ca. 160 of them. Most seem to originate in
drivers/isdn/. Doesn't look unsurmountable compared to the number of
other compile warnings fixed during the last few years ;-)

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
--

To: Geert Uytterhoeven <geert@...>
Cc: Harvey Harrison <harvey.harrison@...>, Andrew Morton <akpm@...>, LKML <linux-kernel@...>, linux arch <linux-arch@...>, James Bottomley <James.Bottomley@...>
Date: Friday, February 1, 2008 - 9:40 am

Can you try the patchset I will post in a minute.
Would be nice to know if you see additional warnings as I got
isdn clean here with x86 - 64 bit - allyesconfig.

Sam
--

To: Sam Ravnborg <sam@...>
Cc: Harvey Harrison <harvey.harrison@...>, Andrew Morton <akpm@...>, LKML <linux-kernel@...>, linux arch <linux-arch@...>, James Bottomley <James.Bottomley@...>
Date: Friday, February 1, 2008 - 5:22 pm

Sure. Where can I find that patchset? I didn't see it on lkml so far,
except for the [0/5] email.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
--

To: Geert Uytterhoeven <geert@...>
Cc: Harvey Harrison <harvey.harrison@...>, Andrew Morton <akpm@...>, LKML <linux-kernel@...>, linux arch <linux-arch@...>, James Bottomley <James.Bottomley@...>
Date: Friday, February 1, 2008 - 6:32 pm

I posted a cumulative patch-set which you can find here:
http://lkml.org/lkml/2008/2/1/242

Sam
--

To: Sam Ravnborg <sam@...>
Cc: Harvey Harrison <harvey.harrison@...>, Andrew Morton <akpm@...>, LKML <linux-kernel@...>, linux arch <linux-arch@...>, James Bottomley <James.Bottomley@...>
Date: Saturday, February 2, 2008 - 12:42 pm

Today's linux-2.6.git + this set killed ca. 100 warnings. Thx!

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
--

To: Geert Uytterhoeven <geert@...>
Cc: Harvey Harrison <harvey.harrison@...>, Andrew Morton <akpm@...>, LKML <linux-kernel@...>, linux arch <linux-arch@...>, James Bottomley <James.Bottomley@...>
Date: Saturday, February 2, 2008 - 1:40 pm

Thanks for testing Geert.

Sam
--

Previous thread: [PATCH] make loglevel related commandline to early_param by Yinghai Lu on Friday, February 1, 2008 - 6:40 am. (6 messages)

Next thread: Re: how to tell i386 from x86-64 kernel by Rik Bobbaers on Friday, February 1, 2008 - 6:47 am. (2 messages)