# find /sys -name "*" | xargs tail [ 3866.840809] BUG: unable to handle kernel paging request at ffff8800be8b0019 [ 3866.840819] IP: [<ffffffff80262f07>] kallsyms_lookup+0x20/0x120 [ 3866.840828] PGD 202063 PUD 117f067 PMD 1374067 PTE 80000000be8b0160 [ 3866.840835] Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 3866.840842] last sysfs file: /sys/kernel/slab/kmalloc-64/alloc_calls [ 3866.840846] CPU 1 [ 3866.840849] Modules linked in: hci_uart serport ipv6 af_packet binfmt_misc sco bridge stp llc rfcomm bnep l2cap bluetooth acpi_cpufreq cpufreq_performance cpufreq_powersave iptable_filter ip_tables sbp2 parport_pc lp parport snd_hda_intel pcmcia snd_pcm_oss snd_mixer_oss snd_pcm snd_page_alloc snd_hwdep snd_seq_dummy snd_seq_oss snd_seq_midi tifm_7xx1 iwl3945 evdev snd_rawmidi tifm_core psmouse rfkill rng_core yenta_socket rsrc_nonstatic iTCO_wdt iTCO_vendor_support mac80211 snd_seq_midi_event snd_seq snd_timer snd_seq_device snd battery ac shpchp button soundcore loop sd_mod sr_mod cdrom sg ohci1394 ieee1394 ehci_hcd uhci_hcd sky2 usbcore thermal processor fan unix [ 3866.840943] Pid: 7717, comm: tail Not tainted 2.6.28-rc6-git1 #6 [ 3866.840946] RIP: 0010:[<ffffffff80262f07>] [<ffffffff80262f07>] kallsyms_lookup+0x20/0x120 [ 3866.840953] RSP: 0018:ffff880073d63da8 EFLAGS: 00010292 [ 3866.840956] RAX: 0000000000000f9a RBX: ffffffff80695c45 RCX: ffff880073d63e00 [ 3866.840959] RDX: ffff880073d63df8 RSI: ffff880073d63df0 RDI: ffffffff80695c45 [ 3866.840962] RBP: ffff880073d63dd8 R08: ffff8800be8aff9a R09: 0000000000000000 [ 3866.840965] R10: ffff88013e8aff91 R11: 000000000000000a R12: ffff8800be8aff9a [ 3866.840968] R13: ffff880073d63df8 R14: ffff880073d63df0 R15: ffff880073d63e00 [ 3866.840972] FS: 00007f707c7996e0(0000) GS:ffff8800bf002898(0000) knlGS:0000000000000000 [ 3866.840975] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 3866.840978] CR2: ffff8800be8b0019 CR3: 0000000090844000 CR4: 00000000000006e0 [ 3866.840982] DR0: 0000000000000000 DR1: 0000000000000000 DR2: ...
Did that happen after a resume from suspend to RAM, by chance? Rafael --
Hi,
(I'm jumping in as Andrew forwarded the bug to us thinking it's SLUB related.)
Could be. I looked at the oops and I'm pretty sure SLUB is not at
fault here. Decoding the oopsing code:
[ 3866.841062] Code: df e8 39 d2 ff ff 5b 41 5c c9 c3 55 48 89 e5 41 57 49 89
cf 41 56 49 89 f6 41 55 49 89 d5 41 54 4d 89 c4 53 48 89 fb 48 83 ec
08 <41> c6
40 7f 00 41 c6 00 00 48 81 ff 00 90 20 80 72 09 48 81 ff
results in:
0000000000000000 <.text>:
0: 41 c6 40 7f 00 movb $0x0,0x7f(%r8) <<<<----
5: 41 c6 00 00 movb $0x0,(%r8)
9: 48 81 ff 00 90 20 80 cmp $0xffffffff80209000,%rdi
10: 72 09 jb 0x1b
12: 48 rex.W
13: 81 .byte 0x81
14: ff .byte 0xff
which looks like this:
000000000000023e <kallsyms_lookup>:
*/
const char *kallsyms_lookup(unsigned long addr,
unsigned long *symbolsize,
unsigned long *offset,
char **modname, char *namebuf)
{
23e: 41 56 push %r14
240: 49 89 f6 mov %rsi,%r14
243: 41 55 push %r13
245: 49 89 d5 mov %rdx,%r13
248: 41 54 push %r12
24a: 49 89 cc mov %rcx,%r12
24d: 55 push %rbp
24e: 48 89 fd mov %rdi,%rbp
251: 53 push %rbx
namebuf[KSYM_NAME_LEN - 1] = 0;
252: 41 c6 40 7f 00 movb $0x0,0x7f(%r8) <<<<----
*/
That is, we're oopsing because someone is passing a bogus 'namebuf' to
kallsyms_lookup(). This is further confirmed by looking at the value of R8:
[ 3866.840962] RBP: ffff880073d63dd8 R08: ffff8800be8aff9a R09:
0000000000000000
and adding 0x7f to it:
0xffff8800be8aff9a + 0x7f = 0xffff8800be8b0019
which equals to the ...Looks good, but I think you're looking at the wrong version of
sprint_symbol(). Try:
commit 966c8c12dc9e77f931e2281ba25d2f0244b06949
Author: Hugh Dickins <hugh@veritas.com>
Date: Wed Nov 19 15:36:36 2008 -0800
sprint_symbol(): use less stack
...
@@ -304,17 +304,24 @@ int sprint_symbol(char *buffer, unsigned long address)
char *modname;
const char *name;
unsigned long offset, size;
- char namebuf[KSYM_NAME_LEN];
+ int len;
- name = kallsyms_lookup(address, &size, &offset, &modname, namebuf);
+ name = kallsyms_lookup(address, &size, &offset, &modname, buffer);
...so it might just be the caller's fault (depending on whether this
patch was correct or not).
Vegard
--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036
--
I think you're looking at a 2.6.28-rc5 sprint_symbol() there: the world has moved on since those days. I changed it to use the supplied "buffer" instead of local "namebuf" in 2.6.28-rc6, so we have to wonder if my patch is to blame - though I don't see it. Sorry, I'm eating lunch then about to go out for a couple of hours: can't look into it now, but maybe this info will help you to make better sense of what's going on. Hugh --
Hi Hugh,
(I'm adding Vegard to the cc as he pointed out the same thing.)
Oh, right. I think I see where this is going. The buffer is coming
from sysfs and is PAGE_SIZE long. In SLUB, we do check for overflows
but list_locations() allows us to get as close as 100 bytes bytes from
the edge of the page. Unfortunately, kallsyms_lookup() does:
const char *kallsyms_lookup(unsigned long addr,
unsigned long *symbolsize,
unsigned long *offset,
char **modname, char *namebuf)
{
namebuf[KSYM_NAME_LEN - 1] = 0;
namebuf[0] = 0;
where KSYM_NAME_LEN is 128 which will tip us over the page edge
potentially triggering an oops. The proper fix is probably to pass a
length of namebuf to kallsyms_lookup() but it's probably best to
revert the patch this late in the release cycle?
Pekka
--
Thanks for working that out. Several possibilities, I don't know which is the right one for now, hope Andrew can judge. I don't see passing the length of namebuf to kallsyms_lookup() as called for: it could have been designed that way from the start, but it wasn't, and I don't see this bug as any reason to change that. Reverting my patch for now: that's certainly a reasonable possibility, but leaves us with several other such bugs. Suggested patch below, but the ftrace part of it worries me a little, since it's within a structure and maybe it's a bad idea to enlarge that at this point; I've also not _really_ done the arithmetic needed for the slub one. An alternative quick just-for-now fix might be to remove that namebuf[KSYM_NAME_LEN - 1] = 0; from kallsyms_lookup(): as I understand it (please check), that could only make sense in cases where the symbol is KSYM_NAME_LEN long or longer - in which case, all of the places fixed in the patch below would be causing corruption already, even without my patch. I think. Maybe that "= 0" even serves no purpose at all? Hugh [PATCH] KSYM_SYMBOL_LEN fixes Miles Lane tailing /sys files hit a BUG which Pekka Enberg has tracked to my 966c8c12dc9e77f931e2281ba25d2f0244b06949 sprint_symbol(): use less stack exposing a bug in slub's list_locations() - kallsyms_lookup() writes a 0 to namebuf[KSYM_NAME_LEN-1], but that was beyond the end of page provided. The 100 slop which list_locations() allows at end of page looks roughly enough for all the other stuff it might print after the symbol before it checks again: break out KSYM_SYMBOL_LEN earlier than before. Latencytop and ftrace and are using KSYM_NAME_LEN buffers where they need KSYM_SYMBOL_LEN buffers, and vmallocinfo a 2*KSYM_NAME_LEN buffer where it wants a KSYM_SYMBOL_LEN buffer: fix those before anyone copies them. Signed-off-by: Hugh Dickins <hugh@veritas.com> --- fs/proc/base.c | 2 +- include/linux/ftrace.h | 2 +- kernel/latencytop.c | 2 ...
Hi Hugh, For SLUB, I think you can just replace the 100 with KSYM_NAME_LEN and it will work fine. The 100 constant is already supposed to be big enough to fit the symbol and the rest. I like this approach the best but I worry it's a can of worms especially this late in the cycle. As a stop-gap measure, I'm fine with your proposed patch either as-is or with the suggestions I had for the SLUB case. Pekka --
That would fix the bug that my sprint_symbol patch has exposed (or introduced, according to your perspective!); but that 100 can only be big enough to fit symbol and all the rest if you assume that maximum symbol length is ... I'm not sure exactly ... somewhere around 20 bytes? If we assume that, then why does KSYM_NAME_LEN allow 128? I'll agree that 128 seems rather on the high side (I'd abhor symbols of anywhere near that length in any code I came near), but that's what's allowed, I'd be _much_ happier with my proposed patch if we get an Ack from Steven on the ftrace.h part of it. Hugh --
Your fix looks fine. The maximum symbol length needs to be supported by list_locations() althought a line with a symbol like that would look a bit weird. --
Yup, Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> --
Acked-by: Steven Rostedt <srostedt@redhat.com> Even on Turkey day ;-) -- Steve --
[ Added Frederic ] Acked-by: Steven Rostedt <srostedt@redhat.com> The boot tracer was written by Frederic. -- Steve --
No problem for me. The ring buffer seems large enough to contain this entry size for each initcall :-) Acked-by: Frederic Weisbecker <fweisbec@gmail.com> --
This makes rather a mess when applied ahead of the (massive) ftrace
changes in linux-next.
afacit these two func[] arrays:
--- /dev/null
+++ b/include/trace/boot.h
@@ -0,0 +1,56 @@
+#ifndef _LINUX_TRACE_BOOT_H
+#define _LINUX_TRACE_BOOT_H
+
+/*
+ * Structure which defines the trace of an initcall
+ * while it is called.
+ * You don't have to fill the func field since it is
+ * only used internally by the tracer.
+ */
+struct boot_trace_call {
+ pid_t caller;
+ char func[KSYM_NAME_LEN];
+};
+
+/*
+ * Structure which defines the trace of an initcall
+ * while it returns.
+ */
+struct boot_trace_ret {
+ char func[KSYM_NAME_LEN];
+ int result;
+ unsigned long long duration; /* nsecs */
+};
will need to be changed to KSYM_SYMBOL_LEN. I think. Please fix it up
carefully when this all lands on your doorstep.
--
I looked into this and unfortunately that won't help us. For example, module_address_lookup() will copy KSYM_NAME_LEN - 1 bytes to namebuf anyway so we'll overflow the buffer. It would probably be a good idea to add a comment to kallsyms_lookup() stating that namebuf needs to have at least KSYM_NAME_LEN bytes available. Pekka --
On Thu, 27 Nov 2008 17:34:27 +0000 (GMT)
The ftrace.h change causes this:
In file included from arch/x86/kernel/machine_kexec_64.c:14:
include/linux/ftrace.h:234: error: 'MODULE_NAME_LEN' undeclared here (not in a function)
The fault really lies with include/linux/kallsyms.h, I think:
#define KSYM_SYMBOL_LEN (sizeof("%s+%#lx/%#lx [%s]") + (KSYM_NAME_LEN - 1) + \
2*(BITS_PER_LONG*3/10) + (MODULE_NAME_LEN - 1) + 1)
but it doesn't include module.h, so it requires that users of this
header perform the include.
But I'm a bit reluctant to include module.h from kallsyms.h because
kallsyms.h is a simple low-level thing:
#include <linux/errno.h>
#include <linux/kernel.h>
#include <linux/stddef.h>
and it wouldn't surprise me if module.h was including kallsyms.h by
some means.
So for now I'll try including module.h from ftrace.c. It would be nice
to fix the kallsyms.h dependency..
--
You ended up choosing ftrace.h, which seems more to the point. Thank you for doing the work I should never have given you. Hugh --
Hmm... and I recently define MODULE_NAME_LEN in terms of MAX_PARAM_PREFIX_LEN for similar reasons. Perhaps we should move MODULE_NAME_LEN to kernel.h and define everything in terms of that? It seems to be a common need. Rusty. --
I don't recall doing a suspend/resume. I have tried to reproduce this
BUG, but haven't managed it thus far. I will keep trying. I can't spend
too much time on it, as I have a work project to complete.
Thanks,
Miles
--
