Top 10 kernel oopses for the week ending January 5th, 2008

Previous thread: 2.6.24-rc6: ricoh_mmc woes by Harald Dunkel on Saturday, January 5, 2008 - 4:28 pm. (2 messages)

Next thread: [2.6.24 patch] fix netx-eth.c compilation by Adrian Bunk on Saturday, January 5, 2008 - 5:46 pm. (2 messages)
To: Linux Kernel Mailing List <linux-kernel@...>
Cc: Linus Torvalds <torvalds@...>, Andrew Morton <akpm@...>, NetDev <netdev@...>
Date: Saturday, January 5, 2008 - 5:06 pm

The http://www.kerneloops.org website collects kernel oops and
warning reports from various mailing lists and bugzillas as well as
with a client users can install to auto-submit oopses.
Below is a top 10 list of the oopses collected in the last 7 days.
(Reports prior to 2.6.23 have been omitted in collecting the top 10)

This week, a total of 49 oopses and warnings have been reported,
compared to 53 reports in the previous week.

Rank 1: __ieee80211_rx
Warning at net/mac80211/rx.c:1672
Reported 6 times (11 total reports)
Same issue that was ranked 2nd last week
Johannes has diagnosed this as a driver bug in the iwlwifi drivers
More info: http://www.kerneloops.org/search.php?search=__ieee80211_rx

Rank 2: elv_next_request
kernel page fault
Reported 6 times (7 total reports)
Seems to be related to fast modprobe/rmmod cycles
More info: http://www.kerneloops.org/search.php?search=elv_next_request

Rank 3: d_splice_alias
NULL pointer deref
Reported 3 times
Happens in the isofs code
Only seen in 2.6.24-rc5-mm1
More info: http://www.kerneloops.org/search.php?search=d_splice_alias

Rank 4: remove_proc_entry
Was also ranked 4th last week
Only in tainted oopses
Reported 3 times (12 total reports)
More info: http://www.kerneloops.org/search.php?search=remove_proc_entry

Rank 5: __d_path
In the sys_getcwd system call
Only reported for 2.6.23.x, by one user
Reported 2 times
More info: http://www.kerneloops.org/search.php?search=__d_path

Rank 6: device_release
Was ranked 8th last week
Same reports as last week, but now entered into bugzilla.kernel.org
Reported 2 times (8 total reports)
More info: http://www.kerneloops.org/search.php?search=device_release

Rank 7: pgd_alloc
Has only been seen on machines tainted with the nvidia module
Reported 2 times
More info: http://www.kerneloops.org/search.php?search=pgd_alloc

Rank 8: evdev_disconnect
kernel page fault
Reported 2 times (10 total reports)
Previously seen in older kernels including 2.6...

To: Arjan van de Ven <arjan@...>
Cc: Linux Kernel Mailing List <linux-kernel@...>, Linus Torvalds <torvalds@...>, Andrew Morton <akpm@...>, NetDev <netdev@...>
Date: Wednesday, January 9, 2008 - 10:12 am

Note that because we don't get the module list for WARN_ON, we don't
actually know whether all of these instances are from the iwlwifi
drivers. A few other drivers suffer from the same problem. In one of
these cases, iwlwifi was contained in the stack trace, but in the common
case that isn't happening because packet processing is delayed to a
tasklet.

johannes

To: Johannes Berg <johannes@...>
Cc: Linux Kernel Mailing List <linux-kernel@...>, Linus Torvalds <torvalds@...>, Andrew Morton <akpm@...>, NetDev <netdev@...>
Date: Wednesday, January 9, 2008 - 11:28 am

and fwiw a patch to get this added to WARN_ON was posted by my last week to fix this;
once this goes into 2.6.25-rc this annoyance/hinderance in debugging will be fixed.
--

To: Arjan van de Ven <arjan@...>
Cc: Linux Kernel Mailing List <linux-kernel@...>, Linus Torvalds <torvalds@...>, Andrew Morton <akpm@...>, NetDev <netdev@...>
Date: Saturday, January 5, 2008 - 11:30 pm

Likely a broken module_exit() function that corrupts the list. To track
these down it might be useful to keep a list of recently unloaded modules
and dump these too in the oops module list with a special flag.

-Andi
--

To: Andi Kleen <andi@...>
Cc: Linux Kernel Mailing List <linux-kernel@...>, Linus Torvalds <torvalds@...>, Andrew Morton <akpm@...>, NetDev <netdev@...>
Date: Saturday, January 5, 2008 - 11:31 pm

I suspect that simply printing the currently unloading module will catch 90%+ already;
I'll look into adding this, it's a very good idea.
--

To: Arjan van de Ven <arjan@...>
Cc: Andi Kleen <andi@...>, Linux Kernel Mailing List <linux-kernel@...>, Linus Torvalds <torvalds@...>, Andrew Morton <akpm@...>, NetDev <netdev@...>
Date: Saturday, January 5, 2008 - 11:50 pm

There are a few cases where the corruption is only visible next time
someone accesses the list. So a small ring buffer might make sense.

-Andi

--

To: Arjan van de Ven <arjan@...>
Cc: Linux Kernel Mailing List <linux-kernel@...>, Linus Torvalds <torvalds@...>, Andrew Morton <akpm@...>, NetDev <netdev@...>
Date: Saturday, January 5, 2008 - 5:39 pm

FWIW, people moaning about the lack of entry-level kernel work would
do well by decoding those to the level of "this place in this function,
called from <here>, with so-and-so variable being <this>" and posting
the results. As skills go, it's far more useful than "how to trim
the trailing whitespace" and the rest of checkpatch.pl-inspired crap
that got so popular lately...
--

To: Al Viro <viro@...>
Cc: Arjan van de Ven <arjan@...>, Linux Kernel Mailing List <linux-kernel@...>, Linus Torvalds <torvalds@...>, Andrew Morton <akpm@...>, NetDev <netdev@...>
Date: Monday, January 7, 2008 - 1:44 pm

Is there any good basic documentation on this to point people at?

--b.
--

To: J. Bruce Fields <bfields@...>
Cc: Al Viro <viro@...>, Arjan van de Ven <arjan@...>, Linux Kernel Mailing List <linux-kernel@...>, Linus Torvalds <torvalds@...>, Andrew Morton <akpm@...>, NetDev <netdev@...>
Date: Monday, January 7, 2008 - 9:19 pm

I would second this question. I see people "decode" oops on lkml often enough, but I've never been entirely sure how its done. Is it somewhere in Documentation?

--
Kevin Winchester
--

To: Kevin Winchester <kjwinchester@...>
Cc: J. Bruce Fields <bfields@...>, Al Viro <viro@...>, Arjan van de Ven <arjan@...>, Linux Kernel Mailing List <linux-kernel@...>, Andrew Morton <akpm@...>, NetDev <netdev@...>
Date: Monday, January 7, 2008 - 11:26 pm

It's actually not necessarily at all that trivial, unless you have a deep
understanding of the code generated for the architecture in question (and
even then, some oopses take more time to figure out than others, thanks
to inlining and tailcalls etc).

If the oops happened with a kernel you generated yourself, it's usually
rather easy. Especially if you said "y" to the "generate debugging info"
question at configuration time. Because, in that case, you really just do
a simple

gdb vmlinux

and then you can do (for example) something like setting a breakpoint at
the EIP that was reported for the oops, and it will tell you what line it
came from.

However, if you don't have the exact binary - which is the common case for
random oopses reported on lkml - you will generally have to disassemble
the hex sequence given in the oops (the "Code:" line), and try to match it
up against the source code to try to figure out what is going on.

Even just the disassembly is not entirely trivial, since the oops will
give you the eip that it happened at, but you often want to also
disassemble *backwards* in order to get more of a context (the "Code:"
line will mark the particular EIP that starts the oopsing instruction by
enclosing it in <xx>, but with non-constant instruction lengths, you need
to use a bit of trial-and-error to figure it out.

I usually just compile a small program like

const char array[]="\xnn\xnn\xnn...";

int main(int argc, char **argv)
{
printf("%p\n", array);
*(int *)0=0;
}

and run it under gdb, and then when it gets the SIGSEGV (due to the
obvious NULL pointer dereference), I can just ask gdb to disassemble
around the array that contains the code[] stuff. Try a few offsets, to see
when the disassembly makes sense (and gives the reported EIP as the
beginning of one of the disassembled instructions).

(You can do it other and smarter ways too, I'm not claiming that's a
particularly good way to do it, and the old "ksymoops"...

To: Linus Torvalds <torvalds@...>
Cc: Kevin Winchester <kjwinchester@...>, J. Bruce Fields <bfields@...>, Al Viro <viro@...>, Arjan van de Ven <arjan@...>, Linux Kernel Mailing List <linux-kernel@...>, Andrew Morton <akpm@...>, NetDev <netdev@...>
Date: Tuesday, January 8, 2008 - 1:08 pm

IMHO better is

make the/file/xyz.lst

which gives you a listing with binary data in there which can be
grepped for.

But you should install a very recent binutils because older objdump -S
couldn't deal with unit-at-a-time compilers.

-Andi
--

To: Linus Torvalds <torvalds@...>
Cc: Kevin Winchester <kjwinchester@...>, J. Bruce Fields <bfields@...>, Al Viro <viro@...>, Arjan van de Ven <arjan@...>, Linux Kernel Mailing List <linux-kernel@...>, Andrew Morton <akpm@...>, NetDev <netdev@...>
Date: Tuesday, January 8, 2008 - 12:14 pm

[Empty message]
To: Randy Dunlap <randy.dunlap@...>
Cc: Linus Torvalds <torvalds@...>, Kevin Winchester <kjwinchester@...>, J. Bruce Fields <bfields@...>, Al Viro <viro@...>, Linux Kernel Mailing List <linux-kernel@...>, Andrew Morton <akpm@...>, NetDev <netdev@...>
Date: Tuesday, January 8, 2008 - 1:42 pm

I've made life easier for those using the www.kerneloops.org website;
at least for x86 oopses the website now does this for you and shows
the decoded Code: line in the raw oops data:

http://www.kerneloops.org/raw.php?rawid=2716
--

To: Arjan van de Ven <arjan@...>
Cc: Randy Dunlap <randy.dunlap@...>, Kevin Winchester <kjwinchester@...>, J. Bruce Fields <bfields@...>, Al Viro <viro@...>, Linux Kernel Mailing List <linux-kernel@...>, Andrew Morton <akpm@...>, NetDev <netdev@...>
Date: Tuesday, January 8, 2008 - 2:08 pm

Cool.

One thing I wonder about - could you separate out the bug-ons and warnings
from the oopses? They really are different issues, and an oops with
register information etc is very different from a BUG() with line numbers,
which in turn is very different from a WARN_ON().

Right now, it says

Oopses reported for kernel 2.6.24-rc7

7 oopses reported

hfsplus_releasepage 3
enqueue_task 1
lock_acquire 1
__hfs_brec_find 1
__ieee80211_rx 1

and in fact three of those five entries are really WARN_ON's. It would be
nicer if it would look more along the lines of

Backtraces reported for kernel 2.6.24-rc7

4 oopses reported

hfsplus_releasepage 3
__hfs_brec_find 1

3 warnings repored

enqueue_task 1
lock_acquire 1
__ieee80211_rx 1

because those things really don't have the same kind of impact at all, and
tend to be very different to debug (a "BUG_ON()" is perhaps somewhat
closer to an oops, but a WARN_ON() is definitely in a class of its own).

On that "Code:" side, it seems there is still some problem with oops
parsing. See for example:

http://www.kerneloops.org/raw.php?rawid=1521&msgid=http://mid.gmane.org/...

and notice how the Code: never made it into the raw message (and thus
there is also no instruction disassembly).

Linus
--

To: Linus Torvalds <torvalds@...>
Cc: Randy Dunlap <randy.dunlap@...>, Kevin Winchester <kjwinchester@...>, J. Bruce Fields <bfields@...>, Al Viro <viro@...>, Linux Kernel Mailing List <linux-kernel@...>, Andrew Morton <akpm@...>, NetDev <netdev@...>
Date: Tuesday, January 8, 2008 - 2:16 pm

the database has the information so it's just a matter of slightly different php code ;)
Before I do that... do you want the BUG's separate, part of the warnings or part of the oopses?

ok I'll fix this; I can fix this for all new entries at least, fixing retroactive is going to be
near impossible I suspect.
--

To: Arjan van de Ven <arjan@...>
Cc: Randy Dunlap <randy.dunlap@...>, Kevin Winchester <kjwinchester@...>, J. Bruce Fields <bfields@...>, Al Viro <viro@...>, Linux Kernel Mailing List <linux-kernel@...>, Andrew Morton <akpm@...>, NetDev <netdev@...>
Date: Tuesday, January 8, 2008 - 2:27 pm

I'd like them all separate, they tend to be very different and contain
different information.

Put the warnings last, as the least important. Oopses at the top, since

Oh well..

Linus
--

To: Linus Torvalds <torvalds@...>
Cc: Randy Dunlap <randy.dunlap@...>, Kevin Winchester <kjwinchester@...>, J. Bruce Fields <bfields@...>, Al Viro <viro@...>, Linux Kernel Mailing List <linux-kernel@...>, Andrew Morton <akpm@...>, NetDev <netdev@...>
Date: Tuesday, January 8, 2008 - 3:05 pm

ok done; I had to fizzle a bit because some things aren't *exactly* a BUG() statement
but I track them anyway (things like the "sleeping in invalid context" check), so I
had to somewhat arbitrarily assign categories for those. I might fine tune these over time
some; if you or someone else sees problems with categorization please let me know

--

To: Arjan van de Ven <arjan@...>
Cc: Randy Dunlap <randy.dunlap@...>, Kevin Winchester <kjwinchester@...>, J. Bruce Fields <bfields@...>, Al Viro <viro@...>, Linux Kernel Mailing List <linux-kernel@...>, Andrew Morton <akpm@...>, NetDev <netdev@...>, Rafael J. Wysocki <rjw@...>
Date: Tuesday, January 8, 2008 - 3:31 pm

Looking good. I wonder if we could also have some way to cross-ref these
things with the regression list (notably try to get pointers to them in
the regression list).

Right now, the regression list keeps track of the things that are closed,
but kerneloops.org obviously doesn't see that, so it's not obvious which
oopses are "uninteresting" since they've been fixed.

Linus
--

To: Linus Torvalds <torvalds@...>
Cc: Arjan van de Ven <arjan@...>, Randy Dunlap <randy.dunlap@...>, Kevin Winchester <kjwinchester@...>, J. Bruce Fields <bfields@...>, Al Viro <viro@...>, Linux Kernel Mailing List <linux-kernel@...>, Andrew Morton <akpm@...>, NetDev <netdev@...>
Date: Tuesday, January 8, 2008 - 6:56 pm

[Empty message]
To: Linus Torvalds <torvalds@...>
Cc: Kevin Winchester <kjwinchester@...>, J. Bruce Fields <bfields@...>, Arjan van de Ven <arjan@...>, Linux Kernel Mailing List <linux-kernel@...>, Andrew Morton <akpm@...>, NetDev <netdev@...>, <gregkh@...>, <neilb@...>
Date: Tuesday, January 8, 2008 - 1:59 am

Heh. I prefer
char main[] = {.....};
for the same thing, with gdb a.out and no running at all.

FWIW, I'm going to go through Arjan's collection and post blow-by-blow
analysis of some of those suckers. Tonight, probably...

Let's take e.g. http://www.kerneloops.org/raw.php?rawid=2618

RIP: 0010:[<ffffffff803b49a1>] [<ffffffff803b49a1>] kref_put+0x31/0x80
RSP: 0000:ffff81007ffe5df0 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 0000000034333545 RCX: ffffffff80606270
RDX: 0000000000000040 RSI: ffffffff803b38b0 RDI: 0000000034333545
RBP: ffff81007ffe5e00 R08: 0000000000000001 R09: 0000000000000000
R10: ffffffff8094c430 R11: 0000000000000000 R12: ffffffff803b38b0
R13: ffff81011ede44d8 R14: ffffffff804d7d50 R15: ffff81011ff210f0
FS: 0000000002024870(0000) GS:ffff81011ff0dd00(0000)
...
Call Trace:
[<ffffffff803b37e9>] kobject_put+0x19/0x20
[<ffffffff803b389b>] kobject_del+0x2b/0x40
[<ffffffff804d7d50>] delayed_delete+0x0/0xb0
[<ffffffff804d7db9>] delayed_delete+0x69/0xb0
[<ffffffff80249775>] run_workqueue+0x175/0x210
[<ffffffff8024a411>] worker_thread+0x71/0xb0
[<ffffffff8024d9e0>] autoremove_wake_function+0x0/0x40
[<ffffffff8024a3a0>] worker_thread+0x0/0xb0
[<ffffffff8024d5fd>] kthread+0x4d/0x80
[<ffffffff8020c4b8>] child_rip+0xa/0x12
[<ffffffff8020bbcf>] restore_args+0x0/0x30
[<ffffffff8024d5b0>] kthread+0x0/0x80
[<ffffffff8020c4ae>] child_rip+0x0/0x12

Code: f0 ff 0b 0f 94 c0 31 d2 84 c0 74 0b 48 89 df 41 ff d4 ba 01

What do we have here? It barfs in kref_put() called from kobject_put().
It's -rc6-mm1 and I don't have -mm at hand. Let's see if we can make
any sense out of it from the mainline - it might be a good idea for the
first pass, unless there are some clear indications to the contrary.

kref_put() is fairly low-level (and deep in call chain, here). And it's
pretty small:
int kref_put(struct kref *kref, void (*release)(struct kref *kref))
{
WARN_ON(re...

To: Arjan van de Ven <arjan@...>
Cc: Al Viro <viro@...>, Linus Torvalds <torvalds@...>, Kevin Winchester <kjwinchester@...>, J. Bruce Fields <bfields@...>, Linux Kernel Mailing List <linux-kernel@...>, Andrew Morton <akpm@...>, NetDev <netdev@...>, <gregkh@...>, <neilb@...>
Date: Tuesday, January 8, 2008 - 3:33 am

...

IMHO, it would be really wasteful if Arian havn't published these
and maybe a few more such advices and links on this site, not
necessarily waiting for html-ized or howto-ized versions!

Thanks,
Jarek P.
--

To: Arjan van de Ven <arjan@...>
Cc: Linux Kernel Mailing List <linux-kernel@...>, Linus Torvalds <torvalds@...>, Andrew Morton <akpm@...>, NetDev <netdev@...>
Date: Saturday, January 5, 2008 - 5:26 pm

Practically certain to be fixed by 6addb1d6de1968b84852f54561cc9a999909b5a9;
it's list_for_each_entry() walking from the entry that just got list_del()
and kernels without that sucker have no protection of the list in question

pissfs locking problems; AFAICS, its sysfs_get_dentry() walking into
already unlinked sysfs directory node on the way to target; no exclusion
against that is there and it's not trivial to add.
--

Previous thread: 2.6.24-rc6: ricoh_mmc woes by Harald Dunkel on Saturday, January 5, 2008 - 4:28 pm. (2 messages)

Next thread: [2.6.24 patch] fix netx-eth.c compilation by Adrian Bunk on Saturday, January 5, 2008 - 5:46 pm. (2 messages)