Cc: Linus Torvalds <torvalds@...>, Andrew Morton <akpm@...>, Rafael J. Wysocki <rjw@...>, <davem@...>, <linux-kernel@...>, <jirislaby@...>, Steven Rostedt <rostedt@...>
On Thu, 1 May 2008 03:31:25 +0300
Adrian Bunk <bunk@kernel.org> wrote:
I would argue instead that we don't know which bugs to fix first.
We're never going to fix all bugs, and to be honest, that's ok.
As long as we fix the important bugs, we're doing really well.
And at least for the kerneloops.org reported issues, we're doing quite ok.
For me, 'important' is a combination of the effect of the bug and the number of people
it'll hit. By that measure, a compiler warning on parisc is less important than easy-to-trigger
filesystem corruption in ext3; more people will hit the latter and its effect is more grave.
For oopses and WARN_ON()s we're getting the hang of this now with kerneloops.org,
at least for the oopses that aren't really hard fatal. One thing I learned at least is that
lkml is a poor representation of what people actually hit; it's a very very selective
audience.
oopses/warnons are only a subset of the bugs of course... but still.
So there's a few things we (and you / janitors) can do over time to get better data on what issues
people hit:
1) Get automated collection of issues more widespread. The wider our net, the better we know which
issues get hit a lot, and, plainly, the more data we have on when things start, when they stop, etc.
Especially if you get a lot of testers in your project, I'd like them to install the client for easy reporting
of issues.
2) We should add more WARN_ON()s on "known bad" conditions. If it WARN_ON()'s, we can learn about it via
the automated collection. And we can then do the statistics to figure out which ones happen a lot.
3) We need to get persistent-across-reboot oops saving going; there's some avenues for this
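
To make point 2 a bit more concrete, here is a rough userspace sketch of the pattern: warn
on a "known bad" condition, bail out, and count hits per call site so the statistics can be
done afterwards. This is not kernel code; SKETCH_WARN_ON(), sketch_warn(), most_hit_site()
and put_ref() are all made-up names for illustration. The only thing borrowed from the real
WARN_ON() is that it evaluates to the condition, so it can sit inside an if ().

```c
#include <stdio.h>
#include <string.h>

#define MAX_SITES 32

/* One entry per distinct file:line that has warned at least once. */
struct warn_site {
	const char *file;
	int line;
	unsigned long hits;
};

static struct warn_site sites[MAX_SITES];
static int nr_sites;

/* Hypothetical helper: log the warning, bump the per-site counter,
 * and hand the condition back to the caller. */
static int sketch_warn(int cond, const char *file, int line)
{
	int i;

	if (!cond)
		return 0;

	fprintf(stderr, "WARNING: at %s:%d\n", file, line);

	for (i = 0; i < nr_sites; i++) {
		if (sites[i].line == line && strcmp(sites[i].file, file) == 0) {
			sites[i].hits++;
			return 1;
		}
	}
	/* New site; silently drop the count if the table is full. */
	if (nr_sites < MAX_SITES) {
		sites[nr_sites].file = file;
		sites[nr_sites].line = line;
		sites[nr_sites].hits = 1;
		nr_sites++;
	}
	return 1;
}

/* Userspace stand-in for WARN_ON(); assumption, not the kernel macro. */
#define SKETCH_WARN_ON(cond) sketch_warn(!!(cond), __FILE__, __LINE__)

/* Return the call site with the most hits, or NULL if nothing warned. */
static const struct warn_site *most_hit_site(void)
{
	const struct warn_site *best = NULL;
	int i;

	for (i = 0; i < nr_sites; i++)
		if (!best || sites[i].hits > best->hits)
			best = &sites[i];
	return best;
}

/* Hypothetical example: dropping a reference that is already zero
 * is a "known bad" condition, so warn and refuse to go negative. */
static void put_ref(int *refcount)
{
	if (SKETCH_WARN_ON(*refcount <= 0))
		return;
	(*refcount)--;
}
```

Calling put_ref() on a count that is already zero leaves the count alone and adds one hit
to that call site; most_hit_site() then tells you which warning fires the most, which is
roughly the ranking the automated collection would do across many machines.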
--