Published on KernelTrap (http://kerneltrap.org)

Compiler Optimization Bugs and World Domination

By Jeremy
Created Oct 5 2007 - 05:09

A bug report [1] filed by Ingo Molnar about a procfs crash in the recently released 2.6.23-rc9 kernel was quickly traced by Linus Torvalds to a compiler bug. The crash was ultimately determined to come from incorrect code generated by an optimization in an older version of GCC. Ingo was skeptical at first: "it's 4.0.2. Not the latest & greatest but I've been using it for 2 years and this would be the first time it miscompiles a 32-bit kernel out of tens of thousands of successful kernel bootups." Linus replied, "I am 100% sure. I can look at the disassembly, and point to the fact that your Oops happens on code that is simply totally bogus." He went on to offer an interesting review of the crash [2], explaining line by line what should have been generated versus what was actually generated, which caused the crash. In the end, Ingo switched to a distribution-compiled GCC 4.1.2 and confirmed that the crash went away: "so you are completely right, it's a compiler bug in 4.0.2."

During the thread, Linus suggested that the optimization made by the compiler wasn't "legal", to which Alan Cox retorted, "pedant: valid. Almost all optimizations are legal, nobody has yet written laws about compilers. Sorry but I'm forever fixing misuse of the word 'illegal' in printks, docs and the like and it gets annoying after a bit." Linus playfully responded, "heh. When I'm ruler of the universe, it *will* be illegal. I'm just getting a bit ahead of myself." When asked how long until he expected to be ruler, Linus added, "I'm working on it, I'm working on it. I'm just as frustrated as you are. It turns out to be a non-trivial problem."

From: Ingo Molnar <mingo@...>
Subject: [bug] crash when reading /proc/mounts (was: Re: Linux 2.6.23-rc9 and a heads-up for the 2.6.24 series..)
 [2]Date: Oct 3, 4:46 am 2007

hm, i just triggered the procfs crash below with -rc9 on a testbox. 
Config attached. It's easy to reproduce it via 'service sshd restart'. 
The crash site is:

 (gdb) list *0xc017599d
 0xc017599d is in seq_path (fs/seq_file.c:354).
 349             if (m->count < m->size) {
 350                     char *s = m->buf + m->count;
 351                     char *p = d_path(dentry, mnt, s, m->size - m->count);
 352                     if (!IS_ERR(p)) {
 353                             while (s <= p) {
 354                                     char c = *p++;
 355                                     if (!c) {
 356                                             p = m->buf + m->count;
 357                                             m->count = s - m->buf;
 358                                             return s - p;
 (gdb)

any ideas? Fortunately i was able to do an strace of the incident:

 3247  munmap(0xb7f3e000, 4096)          = 0
 3247  open("/proc/mounts", O_RDONLY|O_LARGEFILE) = 3
 3247  fstat64(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
 3247  mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f3e000
 3247  read(3,  <unfinished ...>
 3247  +++ killed by SIGSEGV +++

and doing "cat /proc/mounts" triggers the crash reliably.

	Ingo

---------------->
BUG: unable to handle kernel paging request at virtual address f2a40000
 printing eip:
c017599d
*pdpt = 0000000000001001
*pde = 0000000000aee067
*pte = 0000000032a40000
Oops: 0000 [#1]
PREEMPT DEBUG_PAGEALLOC
Modules linked in:
CPU:    0
EIP:    0060:[<c017599d>]    Not tainted VLI
EFLAGS: 00010297   (2.6.23-rc9 #89)
EIP is at seq_path+0x60/0xca
eax: f2a3fffe   ebx: c290c8d4   ecx: f6e341f0   edx: f2a3fffe
esi: f2a3f007   edi: c29097f0   ebp: ec5ddf1c   esp: ec5ddf04
ds: 007b   es: 007b   fs: 0000  gs: 0033  ss: 0068
Process sshd (pid: 2743, ti=ec5dc000 task=f6e341f0 task.ti=ec5dc000)
Stack: 00000ff9 c2bf6b40 f2a3fffe c29097c0 c2bf6b40 c29097f0 ec5ddf34 c0173c41 
       c05ffe64 00000400 c2bf6b40 c29097f0 ec5ddf74 c0175d2b 00000400 b7fa2000 
       f5277600 c2bf6b60 00000000 c0109e99 ec5ddf80 00000246 c01555e6 00000000 
Call Trace:
 [<c0106f80>] show_trace_log_lvl+0x19/0x2e
 [<c0107030>] show_stack_log_lvl+0x9b/0xa3
 [<c0107428>] show_registers+0x1c4/0x2e3
 [<c010772d>] die+0x115/0x1e0
 [<c0115e3b>] do_page_fault+0x808/0x8e1
 [<c0508faa>] error_code+0x6a/0x70
 [<c0173c41>] show_vfsmnt+0x44/0x11e
 [<c0175d2b>] seq_read+0xeb/0x25f
 [<c0160e63>] vfs_read+0x87/0xe5
 [<c0161613>] sys_read+0x3d/0x61
 [<c010606e>] sysenter_past_esp+0x6b/0xb5
 =======================
Code: 89 45 f0 76 77 eb 7a 8b 55 ec 8b 4d ec 89 f7 8b 02 89 c2 03 51 0c 29 c7 89 f0 89 79 0c 29 d0 eb 6c 89 f8 88 06 46 eb 54 8b 55 f0 <8b> 3a 42 89 55 f0 89 f9 84 c9 74 d0 8b 45 08 0f be d9 89 da e8 
EIP: [<c017599d>] seq_path+0x60/0xca SS:ESP 0068:ec5ddf04
BUG: unable to handle kernel paging request at virtual address f2a40000
 printing eip:
c017599d
*pdpt = 0000000000001001
*pde = 0000000000aee067
*pte = 0000000032a40000
Oops: 0000 [#2]
PREEMPT DEBUG_PAGEALLOC
Modules linked in:
CPU:    0
EIP:    0060:[<c017599d>]    Tainted: G      D VLI
EFLAGS: 00010297   (2.6.23-rc9 #89)
EIP is at seq_path+0x60/0xca
eax: f2a3fffe   ebx: c290c8d4   ecx: c02be275   edx: f2a3fffe
esi: f2a3f007   edi: c29097f0   ebp: ef2b7f1c   esp: ef2b7f04
ds: 007b   es: 007b   fs: 0000  gs: 0033  ss: 0068
Process sshd (pid: 2744, ti=ef2b6000 task=f6e5cce0 task.ti=ef2b6000)
Stack: 00000ff9 c2bf6b40 f2a3fffe c29097c0 c2bf6b40 c29097f0 ef2b7f34 c0173c41 
       c05ffe64 00000400 c2bf6b40 c29097f0 ef2b7f74 c0175d2b 00000400 b7f09000 
       f7375240 c2bf6b60 00000000 00000073 ef2b7f80 00000246 c01555e6 00000000 
Call Trace:
 [<c0106f80>] show_trace_log_lvl+0x19/0x2e
 [<c0107030>] show_stack_log_lvl+0x9b/0xa3
 [<c0107428>] show_registers+0x1c4/0x2e3
 [<c010772d>] die+0x115/0x1e0
 [<c0115e3b>] do_page_fault+0x808/0x8e1
 [<c0508faa>] error_code+0x6a/0x70
 [<c0173c41>] show_vfsmnt+0x44/0x11e
 [<c0175d2b>] seq_read+0xeb/0x25f
 [<c0160e63>] vfs_read+0x87/0xe5
 [<c0161613>] sys_read+0x3d/0x61
 [<c010606e>] sysenter_past_esp+0x6b/0xb5
 =======================
Code: 89 45 f0 76 77 eb 7a 8b 55 ec 8b 4d ec 89 f7 8b 02 89 c2 03 51 0c 29 c7 89 f0 89 79 0c 29 d0 eb 6c 89 f8 88 06 46 eb 54 8b 55 f0 <8b> 3a 42 89 55 f0 89 f9 84 c9 74 d0 8b 45 08 0f be d9 89 da e8 
EIP: [<c017599d>] seq_path+0x60/0xca SS:ESP 0068:ef2b7f04
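A note on reading the oops above: on the "Code:" line, the byte wrapped in angle brackets is the one at the faulting EIP, so the instruction that crashed starts there. The following is a minimal sketch, not a kernel or gdb facility, that pulls those bytes out of a "Code:" line. The helper name faulting_bytes is hypothetical and only serves the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy up to `max` opcode bytes, starting at the byte marked <..> on an
 * oops "Code:" line, into `out`. Returns the number of bytes copied, or
 * 0 if the line has no <..> marker.
 * Hypothetical helper for illustration only.
 */
static size_t faulting_bytes(const char *code_line, unsigned char *out,
                             size_t max)
{
    assert(code_line != NULL && out != NULL);

    const char *p = strchr(code_line, '<');
    size_t n = 0;

    if (p == NULL)
        return 0;

    while (*p != '\0' && n < max) {
        /* Skip the marker characters and the spaces between bytes. */
        if (*p == '<' || *p == '>' || *p == ' ') {
            p++;
            continue;
        }

        char *end;
        unsigned long byte = strtoul(p, &end, 16);

        if (end == p)           /* not a hex digit: stop parsing */
            break;
        out[n++] = (unsigned char)byte;
        p = end;
    }
    return n;
}
```

Fed the "Code:" line from the oops above, the first two bytes returned are 8b 3a, which are the bytes discussed in the replies that follow.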

From: Linus Torvalds <torvalds@...>
Subject: Re: [bug] crash when reading /proc/mounts (was: Re: Linux 2.6.23-rc9 and a heads-up for the 2.6.24 series..)
 [2]Date: Oct 3, 11:11 am 2007

On Wed, 3 Oct 2007, Ingo Molnar wrote:
> 
> hm, i just triggered the procfs crash below with -rc9 on a testbox. 

You have a terminally buggy piece of shit compiler.

Lookie here:

 - the bug happens on this:

	char c = *p++;

 - which has been compiled into

	8b 3a	mov    (%edx),%edi

   which is a *word* access.

 - the pointer is at the end of a page (very much on purpose):

	edx: f2a3fffe

 - and as a result you get an exception on the *next* page:

	BUG: unable to handle kernel paging request at virtual address f2a40000

and btw, there is no question what-so-ever about whether your compiler 
might be doing a legal optimization - the compiler really is wrong, and is 
total shit. You need to make a gcc bug-report.

Because this is not a question of "the standard is ambiguous", this is a 
question of "the compiler turned good code into code that could SIGSEGV in 
user space too, if 'malloc()' happened to return a pointer at the end of 
an allocation".

	Linus
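The arithmetic behind Linus's list can be checked in a few lines. This is a sketch under stated assumptions: 4 KiB pages (as on 32-bit x86), 32-bit mode with no operand-size prefix, and only the two MOV load opcodes relevant here. The function and macro names are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u    /* assumption: 4 KiB pages */

/*
 * Number of bytes loaded by x86 opcodes 0x8a and 0x8b in 32-bit mode
 * (no operand-size prefix). Returns 0 for any other opcode.
 */
static unsigned mov_load_width(uint8_t opcode)
{
    switch (opcode) {
    case 0x8a: return 1;        /* byte load */
    case 0x8b: return 4;        /* 32-bit load */
    default:   return 0;
    }
}

/* Address of the last byte touched by a `width`-byte access at `addr`. */
static uint32_t last_byte(uint32_t addr, unsigned width)
{
    assert(width > 0);
    return addr + (width - 1);
}

/* Non-zero if the access spills out of the page that contains `addr`. */
static int crosses_page(uint32_t addr, unsigned width)
{
    return (addr / DEMO_PAGE_SIZE) !=
           (last_byte(addr, width) / DEMO_PAGE_SIZE);
}

/* First address of the page after the one that contains `addr`. */
static uint32_t next_page(uint32_t addr)
{
    return (addr / DEMO_PAGE_SIZE + 1u) * DEMO_PAGE_SIZE;
}
```

With edx = 0xf2a3fffe, a one-byte load stays inside its page, while the four-byte load generated for opcode 0x8b reaches 0xf2a40001 and so touches the page starting at 0xf2a40000, the address reported in the oops.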
From: Ingo Molnar <mingo@...>
Subject: Re: [bug] crash when reading /proc/mounts (was: Re: Linux 2.6.23-rc9 and a heads-up for the 2.6.24 series..)
 [2]Date: Oct 3, 11:40 am 2007

* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Wed, 3 Oct 2007, Ingo Molnar wrote:
> > 
> > hm, i just triggered the procfs crash below with -rc9 on a testbox. 
> 
> You have a terminally buggy piece of shit compiler.

hm, it's 4.0.2. Not the latest & greatest but i've been using it for 2 
years and this would be the first time it miscompiles a 32-bit kernel 
out of tens of thousands of successful kernel bootups.

>  - and as a result you get an exception on the *next* page:
> 
> 	BUG: unable to handle kernel paging request at virtual address f2a40000

Hm, are you sure? This is a CONFIG_DEBUG_PAGEALLOC=y kernel, so even a 
slight overrun of a non-NIL terminated string (as suspected by Al) could 
run into a non-mapped kernel page. (which would indicate not a compiler 
bug but use-after free)

i just found another config under which i get similar crashes, config 
attached. One common theme is CONFIG_DEBUG_FS and DEBUG_PAGEALLOC - and 
CONFIG_MAC80211_DEBUGFS is not enabled in this one so it's off the hook 
i think. (the crashes are attached below)

(my serial log on this box goes back about 6 months, and that alone 
shows more than 3500 successful kernel bootups on that particular 
testsystem, each kernel built by this compiler - and there's another 
testsystem that i use even more frequently. Despite that, a compiler bug 
is still possible of course.)

Ingo ---------------> kobject_uevent_env fill_kobj_path: path = '/class/vc/vcsa8' kobject vcsa8: cleaning up BUG: unable to handle kernel paging request at virtual address f6207000 printing eip: c016ecf1 *pdpt = 0000000000003001 *pde = 0000000000ac1067 *pte = 0000000036207000 Oops: 0000 [#1] SMP DEBUG_PAGEALLOC Modules linked in: CPU: 1 EIP: 0060:[<c016ecf1>] Not tainted VLI EFLAGS: 00010297 (2.6.23-rc9 #20) EIP is at seq_path+0x60/0xca eax: f6206ffe ebx: c2de0f50 ecx: 0000002b edx: f6206ffe esi: f6206007 edi: c2dddfb0 ebp: f6503f18 esp: f6503f00 ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 Process awk (pid: 1160, ti=f6503000 task=f73a8390 task.ti=f6503000) Stack: 00000ff9 f6e5cf70 f6206ffe c2dddf80 f6e5cf70 c2dddfb0 f6503f30 c016ce40 c05d71b5 f6730f38 f6e5cf70 c2dddfb0 f6503f70 c016f05d 00000400 08098f18 f6730f38 f6e5cf90 00000000 0806bc2e 00000003 08094320 f6503fb0 00000000 Call Trace: [<c0103c8d>] show_trace_log_lvl+0x19/0x2e [<c0103d3f>] show_stack_log_lvl+0x9d/0xa5 [<c0104083>] show_registers+0x1af/0x281 [<c0104338>] die+0x11a/0x1e8 [<c01138d1>] do_page_fault+0x632/0x715 [<c04e7372>] error_code+0x72/0x80 [<c016ce40>] show_vfsmnt+0x43/0x120 [<c016f05d>] seq_read+0xf1/0x269 [<c0159783>] vfs_read+0x90/0x10e [<c0159f9e>] sys_read+0x3f/0x63 [<c0102876>] sysenter_past_esp+0x5f/0x89 ======================= Code: f0 ff ff 76 77 eb 7a 8b 55 ec 8b 02 89 c2 8b 4d ec 03 51 0c 89 f7 29 c7 89 79 0c 89 f0 29 d0 eb 6c 89 f8 88 06 46 eb 54 8b 55 f0 <8b> 3a 42 89 55 f0 89 f9 84 c9 74 d0 0f be d9 89 da 8b 45 08 e8 EIP: [<c016ecf1>] seq_path+0x60/0xca SS:ESP 0068:f6503f00 BUG: unable to handle kernel paging request at virtual address f63d1000 printing eip: c016ecf1 *pdpt = 0000000000003001 *pde = 0000000000ac1067 *pte = 00000000363d1000 Oops: 0000 [#2] SMP DEBUG_PAGEALLOC Modules linked in: CPU: 0 EIP: 0060:[<c016ecf1>] Tainted: G D VLI EFLAGS: 00010297 (2.6.23-rc9 #20) EIP is at seq_path+0x60/0xca eax: f63d0ffe ebx: c2de0f50 ecx: 0000002c edx: f63d0ffe esi: f63d0007 edi: 
c2dddfb0 ebp: f6367f18 esp: f6367f00 ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 Process MAKEDEV (pid: 1170, ti=f6367000 task=f776e390 task.ti=f6367000) Stack: 00000ff9 f650ef70 f63d0ffe c2dddf80 f650ef70 c2dddfb0 f6367f30 c016ce40 c05d71b5 f6473f38 f650ef70 c2dddfb0 f6367f70 c016f05d 00000400 b7f13000 f6473f38 f650ef90 00000000 f65eb580 000b7f13 00000073 00000000 00000000 Call Trace: [<c0103c8d>] show_trace_log_lvl+0x19/0x2e [<c0103d3f>] show_stack_log_lvl+0x9d/0xa5 [<c0104083>] show_registers+0x1af/0x281 [<c0104338>] die+0x11a/0x1e8 [<c01138d1>] do_page_fault+0x632/0x715 [<c04e7372>] error_code+0x72/0x80 [<c016ce40>] show_vfsmnt+0x43/0x120 [<c016f05d>] seq_read+0xf1/0x269 [<c0159783>] vfs_read+0x90/0x10e [<c0159f9e>] sys_read+0x3f/0x63 [<c0102876>] sysenter_past_esp+0x5f/0x89 ======================= Code: f0 ff ff 76 77 eb 7a 8b 55 ec 8b 02 89 c2 8b 4d ec 03 51 0c 89 f7 29 c7 89 79 0c 89 f0 29 d0 eb 6c 89 f8 88 06 46 eb 54 8b 55 f0 <8b> 3a 42 89 55 f0 89 f9 84 c9 74 d0 0f be d9 89 da 8b 45 08 e8 EIP: [<c016ecf1>] seq_path+0x60/0xca SS:ESP 0068:f6367f00 BUG: unable to handle kernel paging request at virtual address f63d1000 printing eip: c016ecf1 *pdpt = 0000000000003001 *pde = 0000000000ac1067 *pte = 00000000363d1000 Oops: 0000 [#3] SMP DEBUG_PAGEALLOC Modules linked in: CPU: 0 EIP: 0060:[<c016ecf1>] Tainted: G D VLI EFLAGS: 00010297 (2.6.23-rc9 #20) EIP is at seq_path+0x60/0xca eax: f63d0ffe ebx: c2de0f50 ecx: 0000002c edx: f63d0ffe esi: f63d0007 edi: c2dddfb0 ebp: f6358f18 esp: f6358f00 ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 Process MAKEDEV (pid: 1174, ti=f6358000 task=f73ae390 task.ti=f6358000) Stack: 00000ff9 f650ef70 f63d0ffe c2dddf80 f650ef70 c2dddfb0 f6358f30 c016ce40 c05d71b5 f6274f38 f650ef70 c2dddfb0 f6358f70 c016f05d 00000400 b7f02000 f6274f38 f650ef90 00000000 f65eb7b0 000b7f02 00000073 00000000 00000000 Call Trace: [<c0103c8d>] show_trace_log_lvl+0x19/0x2e [<c0103d3f>] show_stack_log_lvl+0x9d/0xa5 [<c0104083>] show_registers+0x1af/0x281 
[<c0104338>] die+0x11a/0x1e8 [<c01138d1>] do_page_fault+0x632/0x715 [<c04e7372>] error_code+0x72/0x80 [<c016ce40>] show_vfsmnt+0x43/0x120 [<c016f05d>] seq_read+0xf1/0x269 [<c0159783>] vfs_read+0x90/0x10e [<c0159f9e>] sys_read+0x3f/0x63 [<c0102876>] sysenter_past_esp+0x5f/0x89 ======================= Code: f0 ff ff 76 77 eb 7a 8b 55 ec 8b 02 89 c2 8b 4d ec 03 51 0c 89 f7 29 c7 89 79 0c 89 f0 29 d0 eb 6c 89 f8 88 06 46 eb 54 8b 55 f0 <8b> 3a 42 89 55 f0 89 f9 84 c9 74 d0 0f be d9 89 da 8b 45 08 e8 EIP: [<c016ecf1>] seq_path+0x60/0xca SS:ESP 0068:f6358f00 BUG: unable to handle kernel paging request at virtual address f63d1000 printing eip: c016ecf1 *pdpt = 0000000000003001 *pde = 0000000000ac1067 *pte = 00000000363d1000 Oops: 0000 [#4] SMP DEBUG_PAGEALLOC Modules linked in: CPU: 0 EIP: 0060:[<c016ecf1>] Tainted: G D VLI EFLAGS: 00010297 (2.6.23-rc9 #20) EIP is at seq_path+0x60/0xca eax: f63d0ffe ebx: c2de0f50 ecx: 0000002a edx: f63d0ffe esi: f63d0007 edi: c2dddfb0 ebp: f6080f18 esp: f6080f00 ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 Process udevd (pid: 1176, ti=f6080000 task=f7346390 task.ti=f6080000) Stack: 00000ff9 f650ef70 f63d0ffe c2dddf80 f650ef70 c2dddfb0 f6080f30 c016ce40 c05d71b5 f67b3f38 f650ef70 c2dddfb0 f6080f70 c016f05d 00000400 b7f89000 f67b3f38 f650ef90 00000000 f6bb5120 000b7f89 00000073 00000000 00000000 Call Trace: [<c0103c8d>] show_trace_log_lvl+0x19/0x2e [<c0103d3f>] show_stack_log_lvl+0x9d/0xa5 [<c0104083>] show_registers+0x1af/0x281 [<c0104338>] die+0x11a/0x1e8 [<c01138d1>] do_page_fault+0x632/0x715 [<c04e7372>] error_code+0x72/0x80 [<c016ce40>] show_vfsmnt+0x43/0x120 [<c016f05d>] seq_read+0xf1/0x269 [<c0159783>] vfs_read+0x90/0x10e [<c0159f9e>] sys_read+0x3f/0x63 [<c0102876>] sysenter_past_esp+0x5f/0x89 ======================= Code: f0 ff ff 76 77 eb 7a 8b 55 ec 8b 02 89 c2 8b 4d ec 03 51 0c 89 f7 29 c7 89 79 0c 89 f0 29 d0 eb 6c 89 f8 88 06 46 eb 54 8b 55 f0 <8b> 3a 42 89 55 f0 89 f9 84 c9 74 d0 0f be d9 89 da 8b 45 08 e8 EIP: [<c016ecf1>] 
seq_path+0x60/0xca SS:ESP 0068:f6080f00 BUG: unable to handle kernel paging request at virtual address f63d1000 printing eip: c016ecf1 *pdpt = 0000000000003001 *pde = 0000000000ac1067 *pte = 00000000363d1000 Oops: 0000 [#5] SMP DEBUG_PAGEALLOC Modules linked in: CPU: 0 EIP: 0060:[<c016ecf1>] Tainted: G D VLI EFLAGS: 00010297 (2.6.23-rc9 #20) EIP is at seq_path+0x60/0xca eax: f63d0ffe ebx: c2de0f50 ecx: 0000002b edx: f63d0ffe esi: f63d0007 edi: c2dddfb0 ebp: f597ef18 esp: f597ef00 ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 Process udevsend (pid: 1264, ti=f597e000 task=f778e390 task.ti=f597e000) Stack: 00000ff9 f650ef70 f63d0ffe c2dddf80 f650ef70 c2dddfb0 f597ef30 c016ce40 c05d71b5 f67c5f38 f650ef70 c2dddfb0 f597ef70 c016f05d 00000400 b7f12000 f67c5f38 f650ef90 00000000 f6073510 000b7f12 00000073 00000000 00000000 Call Trace: [<c0103c8d>] show_trace_log_lvl+0x19/0x2e [<c0103d3f>] show_stack_log_lvl+0x9d/0xa5 [<c0104083>] show_registers+0x1af/0x281 [<c0104338>] die+0x11a/0x1e8 [<c01138d1>] do_page_fault+0x632/0x715 [<c04e7372>] error_code+0x72/0x80 [<c016ce40>] show_vfsmnt+0x43/0x120 [<c016f05d>] seq_read+0xf1/0x269 [<c0159783>] vfs_read+0x90/0x10e [<c0159f9e>] sys_read+0x3f/0x63 [<c0102876>] sysenter_past_esp+0x5f/0x89 ======================= Code: f0 ff ff 76 77 eb 7a 8b 55 ec 8b 02 89 c2 8b 4d ec 03 51 0c 89 f7 29 c7 89 79 0c 89 f0 29 d0 eb 6c 89 f8 88 06 46 eb 54 8b 55 f0 <8b> 3a 42 89 55 f0 89 f9 84 c9 74 d0 0f be d9 89 da 8b 45 08 e8 EIP: [<c016ecf1>] seq_path+0x60/0xca SS:ESP 0068:f597ef00 BUG: unable to handle kernel paging request at virtual address f6207000 printing eip: c016ecf1 *pdpt = 0000000000003001 *pde = 0000000000ac1067 *pte = 0000000036207000 Oops: 0000 [#6] SMP DEBUG_PAGEALLOC Modules linked in: CPU: 1 EIP: 0060:[<c016ecf1>] Tainted: G D VLI EFLAGS: 00010297 (2.6.23-rc9 #20) EIP is at seq_path+0x60/0xca eax: f6206ffe ebx: c2de0f50 ecx: 0000002b edx: f6206ffe esi: f6206007 edi: c2dddfb0 ebp: f5929f18 esp: f5929f00 ds: 007b es: 007b fs: 00d8 gs: 
0033 ss: 0068 Process udevsend (pid: 1265, ti=f5929000 task=f77af390 task.ti=f5929000) Stack: 00000ff9 f6e5cf70 f6206ffe c2dddf80 f6e5cf70 c2dddfb0 f5929f30 c016ce40 c05d71b5 f6352f38 f6e5cf70 c2dddfb0 f5929f70 c016f05d 00000400 b7f31000 f6352f38 f6e5cf90 00000000 f66c9740 000b7f31 00000073 00000000 00000000 Call Trace: [<c0103c8d>] show_trace_log_lvl+0x19/0x2e [<c0103d3f>] show_stack_log_lvl+0x9d/0xa5 [<c0104083>] show_registers+0x1af/0x281 [<c0104338>] die+0x11a/0x1e8 [<c01138d1>] do_page_fault+0x632/0x715 [<c04e7372>] error_code+0x72/0x80 [<c016ce40>] show_vfsmnt+0x43/0x120 [<c016f05d>] seq_read+0xf1/0x269 [<c0159783>] vfs_read+0x90/0x10e [<c0159f9e>] sys_read+0x3f/0x63 [<c0102876>] sysenter_past_esp+0x5f/0x89 ======================= Code: f0 ff ff 76 77 eb 7a 8b 55 ec 8b 02 89 c2 8b 4d ec 03 51 0c 89 f7 29 c7 89 79 0c 89 f0 29 d0 eb 6c 89 f8 88 06 46 eb 54 8b 55 f0 <8b> 3a 42 89 55 f0 89 f9 84 c9 74 d0 0f be d9 89 da 8b 45 08 e8 EIP: [<c016ecf1>] seq_path+0x60/0xca SS:ESP 0068:f5929f00 BUG: unable to handle kernel paging request at virtual address f63d1000 printing eip: c016ecf1 *pdpt = 0000000000003001 *pde = 0000000000ac1067 *pte = 00000000363d1000 Oops: 0000 [#7] SMP DEBUG_PAGEALLOC Modules linked in: CPU: 0 EIP: 0060:[<c016ecf1>] Tainted: G D VLI EFLAGS: 00010297 (2.6.23-rc9 #20) EIP is at seq_path+0x60/0xca eax: f63d0ffe ebx: c2de0f50 ecx: 0000002b edx: f63d0ffe esi: f63d0007 edi: c2dddfb0 ebp: f6a98f18 esp: f6a98f00 ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 Process udevsend (pid: 1266, ti=f6a98000 task=f7397390 task.ti=f6a98000) Stack: 00000ff9 f650ef70 f63d0ffe c2dddf80 f650ef70 c2dddfb0 f6a98f30 c016ce40 c05d71b5 f7359f38 f650ef70 c2dddfb0 f6a98f70 c016f05d 00000400 b7f8a000 f7359f38 f650ef90 00000000 f6376dd0 000b7f8a 00000073 00000000 00000000 Call Trace: [<c0103c8d>] show_trace_log_lvl+0x19/0x2e [<c0103d3f>] show_stack_log_lvl+0x9d/0xa5 [<c0104083>] show_registers+0x1af/0x281 [<c0104338>] die+0x11a/0x1e8 [<c01138d1>] do_page_fault+0x632/0x715 
[<c04e7372>] error_code+0x72/0x80 [<c016ce40>] show_vfsmnt+0x43/0x120 [<c016f05d>] seq_read+0xf1/0x269 [<c0159783>] vfs_read+0x90/0x10e [<c0159f9e>] sys_read+0x3f/0x63 [<c0102876>] sysenter_past_esp+0x5f/0x89 ======================= Code: f0 ff ff 76 77 eb 7a 8b 55 ec 8b 02 89 c2 8b 4d ec 03 51 0c 89 f7 29 c7 89 79 0c 89 f0 29 d0 eb 6c 89 f8 88 06 46 eb 54 8b 55 f0 <8b> 3a 42 89 55 f0 89 f9 84 c9 74 d0 0f be d9 89 da 8b 45 08 e8 EIP: [<c016ecf1>] seq_path+0x60/0xca SS:ESP 0068:f6a98f00 BUG: unable to handle kernel paging request at virtual address f6207000 printing eip: c016ecf1 *pdpt = 0000000000003001 *pde = 0000000000ac1067 *pte = 0000000036207000 Oops: 0000 [#8] SMP DEBUG_PAGEALLOC Modules linked in: CPU: 1 EIP: 0060:[<c016ecf1>] Tainted: G D VLI EFLAGS: 00010297 (2.6.23-rc9 #20) EIP is at seq_path+0x60/0xca eax: f6206ffe ebx: c2de0f50 ecx: 0000002b edx: f6206ffe esi: f6206007 edi: c2dddfb0 ebp: f60c0f18 esp: f60c0f00 ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 Process udevsend (pid: 1267, ti=f60c0000 task=f6372390 task.ti=f60c0000) Stack: 00000ff9 f6e5cf70 f6206ffe c2dddf80 f6e5cf70 c2dddfb0 f60c0f30 c016ce40 c05d71b5 f6145f38 f6e5cf70 c2dddfb0 f60c0f70 c016f05d 00000400 b7fd2000 f6145f38 f6e5cf90 00000000 f6aa4890 000b7fd2 00000073 00000000 00000000 Call Trace: [<c0103c8d>] show_trace_log_lvl+0x19/0x2e [<c0103d3f>] show_stack_log_lvl+0x9d/0xa5 [<c0104083>] show_registers+0x1af/0x281 [<c0104338>] die+0x11a/0x1e8 [<c01138d1>] do_page_fault+0x632/0x715 [<c04e7372>] error_code+0x72/0x80 [<c016ce40>] show_vfsmnt+0x43/0x120 [<c016f05d>] seq_read+0xf1/0x269 [<c0159783>] vfs_read+0x90/0x10e [<c0159f9e>] sys_read+0x3f/0x63 [<c0102876>] sysenter_past_esp+0x5f/0x89 ======================= Code: f0 ff ff 76 77 eb 7a 8b 55 ec 8b 02 89 c2 8b 4d ec 03 51 0c 89 f7 29 c7 89 79 0c 89 f0 29 d0 eb 6c 89 f8 88 06 46 eb 54 8b 55 f0 <8b> 3a 42 89 55 f0 89 f9 84 c9 74 d0 0f be d9 89 da 8b 45 08 e8 EIP: [<c016ecf1>] seq_path+0x60/0xca SS:ESP 0068:f60c0f00 BUG: unable to handle 
kernel paging request at virtual address f63d1000 printing eip: c016ecf1 *pdpt = 0000000000003001 *pde = 0000000000ac1067 *pte = 00000000363d1000 Oops: 0000 [#9] SMP DEBUG_PAGEALLOC Modules linked in: CPU: 0 EIP: 0060:[<c016ecf1>] Tainted: G D VLI EFLAGS: 00010297 (2.6.23-rc9 #20) EIP is at seq_path+0x60/0xca eax: f63d0ffe ebx: c2de0f50 ecx: 0000002a edx: f63d0ffe esi: f63d0007 edi: c2dddfb0 ebp: f64c1f18 esp: f64c1f00 ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 Process udevstart (pid: 1269, ti=f64c1000 task=f6742390 task.ti=f64c1000) Stack: 00000ff9 f650ef70 f63d0ffe c2dddf80 f650ef70 c2dddfb0 f64c1f30 c016ce40 c05d71b5 f6b22f38 f650ef70 c2dddfb0 f64c1f70 c016f05d 00000400 b7f47000 f6b22f38 f650ef90 00000000 f667ecf0 000b7f47 00000073 00000000 00000000 Call Trace: [<c0103c8d>] show_trace_log_lvl+0x19/0x2e [<c0103d3f>] show_stack_log_lvl+0x9d/0xa5 [<c0104083>] show_registers+0x1af/0x281 [<c0104338>] die+0x11a/0x1e8 [<c01138d1>] do_page_fault+0x632/0x715 [<c04e7372>] error_code+0x72/0x80 [<c016ce40>] show_vfsmnt+0x43/0x120 [<c016f05d>] seq_read+0xf1/0x269 [<c0159783>] vfs_read+0x90/0x10e [<c0159f9e>] sys_read+0x3f/0x63 [<c0102876>] sysenter_past_esp+0x5f/0x89 ======================= Code: f0 ff ff 76 77 eb 7a 8b 55 ec 8b 02 89 c2 8b 4d ec 03 51 0c 89 f7 29 c7 89 79 0c 89 f0 29 d0 eb 6c 89 f8 88 06 46 eb 54 8b 55 f0 <8b> 3a 42 89 55 f0 89 f9 84 c9 74 d0 0f be d9 89 da 8b 45 08 e8 EIP: [<c016ecf1>] seq_path+0x60/0xca SS:ESP 0068:f64c1f00 fill_kobj_path: path = '/class/input/input0' BUG: unable to handle kernel paging request at virtual address f63d1000 printing eip: c016ecf1 *pdpt = 0000000000003001 *pde = 0000000000ac1067 *pte = 00000000363d1000 Oops: 0000 [#10] SMP DEBUG_PAGEALLOC Modules linked in: CPU: 0 EIP: 0060:[<c016ecf1>] Tainted: G D VLI EFLAGS: 00010297 (2.6.23-rc9 #20) EIP is at seq_path+0x60/0xca eax: f63d0ffe ebx: c2de0f50 ecx: 00000057 edx: f63d0ffe esi: f63d0007 edi: c2dddfb0 ebp: f588cf18 esp: f588cf00 ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 
Process ls (pid: 1289, ti=f588c000 task=f7346390 task.ti=f588c000) Stack: 00000ff9 f59bef70 f63d0ffe c2dddf80 f59bef70 c2dddfb0 f588cf30 c016ce40 c05d71b5 f6f26f38 f59bef70 c2dddfb0 f588cf70 c016f05d 00000400 b7f48000 f6f26f38 f59bef90 00000000 f6376580 000b7f48 00000073 00000000 00000000 Call Trace: [<c0103c8d>] show_trace_log_lvl+0x19/0x2e [<c0103d3f>] show_stack_log_lvl+0x9d/0xa5 [<c0104083>] show_registers+0x1af/0x281 [<c0104338>] die+0x11a/0x1e8 [<c01138d1>] do_page_fault+0x632/0x715 [<c04e7372>] error_code+0x72/0x80 [<c016ce40>] show_vfsmnt+0x43/0x120 [<c016f05d>] seq_read+0xf1/0x269 [<c0159783>] vfs_read+0x90/0x10e [<c0159f9e>] sys_read+0x3f/0x63 [<c0102876>] sysenter_past_esp+0x5f/0x89 ======================= Code: f0 ff ff 76 77 eb 7a 8b 55 ec 8b 02 89 c2 8b 4d ec 03 51 0c 89 f7 29 c7 89 79 0c 89 f0 29 d0 eb 6c 89 f8 88 06 46 eb 54 8b 55 f0 <8b> 3a 42 89 55 f0 89 f9 84 c9 74 d0 0f be d9 89 da 8b 45 08 e8 EIP: [<c016ecf1>] seq_path+0x60/0xca SS:ESP 0068:f588cf00 BUG: unable to handle kernel paging request at virtual address f63d1000 printing eip: c016ecf1 *pdpt = 0000000000003001 *pde = 0000000000ac1067 *pte = 00000000363d1000 Oops: 0000 [#11] SMP DEBUG_PAGEALLOC Modules linked in: CPU: 0 EIP: 0060:[<c016ecf1>] Tainted: G D VLI EFLAGS: 00010297 (2.6.23-rc9 #20) EIP is at seq_path+0x60/0xca eax: f63d0ffe ebx: c2de0f50 ecx: 00000023 edx: f63d0ffe esi: f63d0007 edi: c2dddfb0 ebp: f55e6f18 esp: f55e6f00 ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 Process lvm.static (pid: 1329, ti=f55e6000 task=f7769390 task.ti=f55e6000) Stack: 00000ff9 f59bef70 f63d0ffe c2dddf80 f59bef70 c2dddfb0 f55e6f30 c016ce40 c05d71b5 f6592f38 f59bef70 c2dddfb0 f55e6f70 c016f05d 00000400 b7f77000 f6592f38 f59bef90 00000000 f66c9900 000b7f77 00000073 00000000 00000000 Call Trace: [<c0103c8d>] show_trace_log_lvl+0x19/0x2e [<c0103d3f>] show_stack_log_lvl+0x9d/0xa5 [<c0104083>] show_registers+0x1af/0x281 [<c0104338>] die+0x11a/0x1e8 [<c01138d1>] do_page_fault+0x632/0x715 [<c04e7372>] 
error_code+0x72/0x80 [<c016ce40>] show_vfsmnt+0x43/0x120 [<c016f05d>] seq_read+0xf1/0x269 [<c0159783>] vfs_read+0x90/0x10e [<c0159f9e>] sys_read+0x3f/0x63 [<c01028e2>] syscall_call+0x7/0xb ======================= Code: f0 ff ff 76 77 eb 7a 8b 55 ec 8b 02 89 c2 8b 4d ec 03 51 0c 89 f7 29 c7 89 79 0c 89 f0 29 d0 eb 6c 89 f8 88 06 46 eb 54 8b 55 f0 <8b> 3a 42 89 55 f0 89 f9 84 c9 74 d0 0f be d9 89 da 8b 45 08 e8 EIP: [<c016ecf1>] seq_path+0x60/0xca SS:ESP 0068:f55e6f00 BUG: unable to handle kernel paging request at virtual address f63d1000 printing eip: c016ecf1 *pdpt = 0000000000003001 *pde = 0000000000ac1067 *pte = 00000000363d1000 Oops: 0000 [#12] SMP DEBUG_PAGEALLOC Modules linked in: CPU: 0 EIP: 0060:[<c016ecf1>] Tainted: G D VLI EFLAGS: 00010297 (2.6.23-rc9 #20) EIP is at seq_path+0x60/0xca eax: f63d0ffe ebx: c2de0f50 ecx: 0000002b edx: f63d0ffe esi: f63d0007 edi: c2dddfb0 ebp: f56e0f18 esp: f56e0f00 ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 Process fsck.ext3 (pid: 1331, ti=f56e0000 task=f7769390 task.ti=f56e0000) Stack: 00000ff9 f59bef70 f63d0ffe c2dddf80 f59bef70 c2dddfb0 f56e0f30 c016ce40 c05d71b5 f67f3f38 f59bef70 c2dddfb0 f56e0f70 c016f05d 00000400 b7dba000 f67f3f38 f59bef90 00000000 f60ce510 000b7dba 00000073 00000000 00000000 Call Trace: [<c0103c8d>] show_trace_log_lvl+0x19/0x2e [<c0103d3f>] show_stack_log_lvl+0x9d/0xa5 [<c0104083>] show_registers+0x1af/0x281 [<c0104338>] die+0x11a/0x1e8 [<c01138d1>] do_page_fault+0x632/0x715 [<c04e7372>] error_code+0x72/0x80 [<c016ce40>] show_vfsmnt+0x43/0x120 [<c016f05d>] seq_read+0xf1/0x269 [<c0159783>] vfs_read+0x90/0x10e [<c0159f9e>] sys_read+0x3f/0x63 [<c01028e2>] syscall_call+0x7/0xb ======================= Code: f0 ff ff 76 77 eb 7a 8b 55 ec 8b 02 89 c2 8b 4d ec 03 51 0c 89 f7 29 c7 89 79 0c 89 f0 29 d0 eb 6c 89 f8 88 06 46 eb 54 8b 55 f0 <8b> 3a 42 89 55 f0 89 f9 84 c9 74 d0 0f be d9 89 da 8b 45 08 e8 EIP: [<c016ecf1>] seq_path+0x60/0xca SS:ESP 0068:f56e0f00 BUG: unable to handle kernel paging request at 
virtual address f63d1000 printing eip: c016ecf1 *pdpt = 0000000000003001 *pde = 0000000000ac1067 *pte = 00000000363d1000 Oops: 0000 [#13] SMP DEBUG_PAGEALLOC Modules linked in: CPU: 0 EIP: 0060:[<c016ecf1>] Tainted: G D VLI EFLAGS: 00010297 (2.6.23-rc9 #20) EIP is at seq_path+0x60/0xca eax: f63d0ffe ebx: c2de0f50 ecx: 00000026 edx: f63d0ffe esi: f63d0007 edi: c2dddfb0 ebp: f5703f18 esp: f5703f00 ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068 Process sulogin (pid: 1335, ti=f5703000 task=f7397390 task.ti=f5703000) Stack: 00000ff9 f59bef70 f63d0ffe c2dddf80 f59bef70 c2dddfb0 f5703f30 c016ce40 c05d71b5 f6058f38 f59bef70 c2dddfb0 f5703f70 c016f05d 00000400 b7fcf000 f6058f38 f59bef90 00000000 f60ce660 000b7fcf 00000073 00000000 00000000 Call Trace: [<c0103c8d>] show_trace_log_lvl+0x19/0x2e [<c0103d3f>] show_stack_log_lvl+0x9d/0xa5 [<c0104083>] show_registers+0x1af/0x281 [<c0104338>] die+0x11a/0x1e8 [<c01138d1>] do_page_fault+0x632/0x715 [<c04e7372>] error_code+0x72/0x80 [<c016ce40>] show_vfsmnt+0x43/0x120 [<c016f05d>] seq_read+0xf1/0x269 [<c0159783>] vfs_read+0x90/0x10e [<c0159f9e>] sys_read+0x3f/0x63 [<c0102876>] sysenter_past_esp+0x5f/0x89 ======================= Code: f0 ff ff 76 77 eb 7a 8b 55 ec 8b 02 89 c2 8b 4d ec 03 51 0c 89 f7 29 c7 89 79 0c 89 f0 29 d0 eb 6c 89 f8 88 06 46 eb 54 8b 55 f0 <8b> 3a 42 89 55 f0 89 f9 84 c9 74 d0 0f be d9 89 da 8b 45 08 e8 EIP: [<c016ecf1>] seq_path+0x60/0xca SS:ESP 0068:f5703f00
From: Linus Torvalds <torvalds@...>
Subject: Re: [bug] crash when reading /proc/mounts (was: Re: Linux 2.6.23-rc9 and a heads-up for the 2.6.24 series..)
 [2]Date: Oct 3, 12:07 pm 2007

On Wed, 3 Oct 2007, Ingo Molnar wrote:
>
> >  - and as a result you get an exception on the *next* page:
> >
> > 	BUG: unable to handle kernel paging request at virtual address f2a40000
>
> Hm, are you sure? This is a CONFIG_DEBUG_PAGEALLOC=y kernel, so even a
> slight overrun of a non-NIL terminated string (as suspected by Al) could
> run into a non-mapped kernel page. (which would indicate not a compiler
> bug but use-after free)

I am 100% sure. I can look at the disassembly, and point to the fact that
your Oops happens on code that is simply totally bogus.

That string is NUL-terminated, which is why the access is to f2a3fffe in
the first place: we explicitly asked d_path() to create us a string at the
end of the page (it creates them backwards), so the path string has a NUL
at the end at address f2a3ffff, which is exactly what we'd expect.

Your compiler really does seem to be total crap. Do a "make fs/seq_file.s"
(and make sure you *disable* CONFIG_DEBUG_INFO first, otherwise the result
will be unreadable crud), and look at seq_path(). It's going to be more
readable than the disassembly that I got through gdb, but I bet it's going
to show it even more clearly.

> i just found another config under which i get similar crashes, config
> attached. One common theme is CONFIG_DEBUG_FS and DEBUG_PAGEALLOC - and
> CONFIG_MAC80211_DEBUGFS is not enabled in this one so it's off the hook
> i think. (the crashes are attached below)

.. of *course* DEBUG_PAGEALLOC is going to be implied in the problem. If
you don't have DEBUG_PAGEALLOC, you'll never see this, because you'll have
all pages mapped, and the only page that it could happen to is the very
last page in memory, and you'll never hit that one in practice.

> (my serial log on this box goes back about 6 months, and that alone
> shows more than 3500 successful kernel bootups on that particular
> testsystem, each kernel built by this compiler - and there's another
> testsystem that i use even more frequently. Despite that, a compiler bug
> is still possible of course.)

It's not about "possible". It's a fact. Send me your "seq_file.s" output
for that function to be sure - it *could* be memory corruption that
changes a "movb" into a "movl", and maybe the compiler did a byte move to
start with, but quite frankly, that is such a remote possibility that I
don't consider it realistic.

> BUG: unable to handle kernel paging request at virtual address f6207000
>  printing eip:
> c016ecf1
> *pdpt = 0000000000003001
> *pde = 0000000000ac1067
> *pte = 0000000036207000
> Oops: 0000 [#1]
> SMP DEBUG_PAGEALLOC
> Modules linked in:
> CPU:    1
> EIP:    0060:[<c016ecf1>]    Not tainted VLI
> EFLAGS: 00010297   (2.6.23-rc9 #20)
> EIP is at seq_path+0x60/0xca
> eax: f6206ffe   ebx: c2de0f50   ecx: 0000002b   edx: f6206ffe
> esi: f6206007   edi: c2dddfb0   ebp: f6503f18   esp: f6503f00
> ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068
> Process awk (pid: 1160, ti=f6503000 task=f73a8390 task.ti=f6503000)
> Stack: 00000ff9 f6e5cf70 f6206ffe c2dddf80 f6e5cf70 c2dddfb0 f6503f30 c016ce40
>        c05d71b5 f6730f38 f6e5cf70 c2dddfb0 f6503f70 c016f05d 00000400 08098f18
>        f6730f38 f6e5cf90 00000000 0806bc2e 00000003 08094320 f6503fb0 00000000
> Call Trace:
>  [<c0103c8d>] show_trace_log_lvl+0x19/0x2e
>  [<c0103d3f>] show_stack_log_lvl+0x9d/0xa5
>  [<c0104083>] show_registers+0x1af/0x281
>  [<c0104338>] die+0x11a/0x1e8
>  [<c01138d1>] do_page_fault+0x632/0x715
>  [<c04e7372>] error_code+0x72/0x80
>  [<c016ce40>] show_vfsmnt+0x43/0x120
>  [<c016f05d>] seq_read+0xf1/0x269
>  [<c0159783>] vfs_read+0x90/0x10e
>  [<c0159f9e>] sys_read+0x3f/0x63
>  [<c0102876>] sysenter_past_esp+0x5f/0x89
>  =======================
> Code: f0 ff ff 76 77 eb 7a 8b 55 ec 8b 02 89 c2 8b 4d ec 03 51 0c 89 f7 29 c7 89 79 0c 89 f0 29 d0 eb 6c 89 f8 88 06 46 eb 54 8b 55 f0 <8b> 3a 42 89 55 f0 89 f9 84 c9 74 d0 0f be d9 89 da 8b 45 08 e8

This looks like *exactly* the same thing, except you're in "show_vfsmnt()"
this time.

Again: the oopsing instruction (8b 3a) is "movl". And again, the address
is f6206ffe, and it oopses because the (incorrect) 32-bit access will
touch the next page, so you get a paging request fault on f6207000 - which
is some *totally* different allocation, and one that isn't mapped because
it doesn't exist, so DEBUG_PAGEALLOC has removed it.

.. and again: exact same thing.

> EIP: [<c016ecf1>] seq_path+0x60/0xca SS:ESP 0068:f6503f00
> BUG: unable to handle kernel paging request at virtual address f63d1000
> eax: f63d0ffe   ebx: c2de0f50   ecx: 0000002c   edx: f63d0ffe
> Code: .. <8b> 3a ..

.. and again:

> EIP: [<c016ecf1>] seq_path+0x60/0xca SS:ESP 0068:f6367f00
> BUG: unable to handle kernel paging request at virtual address f63d1000
> eax: f63d0ffe   ebx: c2de0f50   ecx: 0000002c   edx: f63d0ffe
> Code: .. <8b> 3a ..

And I can even tell you exactly what path it is:

 - it's going to be the first path that shows up in the path list, since
   the seq_file interface will re-use that page, so if you hit it, you'll
   hit it on the first entry (unless seq_file has *lots* of data and needs
   more than a single-page allocation)

 - it must be a single-byte path, because otherwise you'd have oopsed one
   byte earlier (you'd have oopsed already on access .....ffd, which would
   *also* overflow to the next page)

 - ergo, it's "/".

but that doesn't really even matter. Disassembling the code stream from
your oops shows clearly that it's a 32-bit access. No ifs, buts or maybes
about it.

If you don't trust the gdb disassembly (I didn't, entirely, so I looked it
up) byte 0x8b is "mov Gv,Ev" in the Intel opcode map. An 8-bit move would
have been 0x8a.

			Linus
-
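
The arithmetic behind this diagnosis can be sketched in a few lines of C. This is an illustration only, not code from the thread or from the kernel: the helper names (mov_load_width, next_page, crosses_page) are hypothetical, the page size is assumed to be the 4 KiB used on 32-bit x86, and only the two opcodes mentioned above are recognised (0x8b is treated as a 32-bit load, which assumes 32-bit mode with no operand-size prefix).

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u   /* assumption: 4 KiB pages, as on 32-bit x86 */

/* Width in bytes of the load encoded by a one-byte MOV opcode.
 * Only the two opcodes discussed in the thread are handled:
 *   0x8a = MOV Gb,Eb (8-bit load)
 *   0x8b = MOV Gv,Ev (32-bit load in 32-bit mode, no prefix)
 * Returns 0 for any other opcode. */
static unsigned mov_load_width(uint8_t opcode)
{
    switch (opcode) {
    case 0x8a: return 1;
    case 0x8b: return 4;
    default:   return 0;
    }
}

/* Address of the first byte of the page following the one holding addr.
 * Computed in 64 bits so the top page does not wrap around. */
static uint64_t next_page(uint32_t addr)
{
    return (uint64_t)(addr & ~(PAGE_SIZE - 1)) + PAGE_SIZE;
}

/* Nonzero if a load of `width` bytes starting at `addr` touches
 * at least one byte of the next page. */
static int crosses_page(uint32_t addr, unsigned width)
{
    assert(width > 0);
    return (uint64_t)addr + width - 1 >= next_page(addr);
}
```

With the register values from the oops, crosses_page(0xf6206ffe, mov_load_width(0x8b)) is nonzero and next_page(0xf6206ffe) is 0xf6207000, the reported fault address, whereas the byte load encoded by 0x8a stays inside the page. The same check at .....ffd shows why a longer path would have faulted one byte earlier.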

From: Linus Torvalds <torvalds@...>
Subject: Re: [bug] crash when reading /proc/mounts (was: Re: Linux 2.6.23-rc9 and a heads-up for the 2.6.24 series..)
 [2]Date: Oct 3, 11:47 am 2007

On Wed, 3 Oct 2007, Linus Torvalds wrote:
> 
>  - the bug happens on this:
> 
> 	char c = *p++;
> 
>  - which has been compiled into
> 
> 	8b 3a		mov    (%edx),%edi

Btw, this definitely doesn't happen for me, either on x86-64 or plain x86. 
The x86 thing I tested was Fedora 8 testing (ie not even some stable 
setup), so I wonder what experimental compiler you have.

Your compiler generates

	movl    -16(%ebp),%edx
	movl    (%edx),%edi		/* this is _totally_ bogus! */
	incl    %edx
	movl    %edx,-16(%ebp)
	movl    %edi,%ecx
	testb   %cl,%cl
	je      ...

while I get (gcc version 4.1.2 20070925 (Red Hat 4.1.2-28)):

        movl    -16(%ebp), %eax # p,
        movzbl  (%eax), %edi    #, c	/* not bogus! */
        movl    %edi, %edx      # c,
        testb   %dl, %dl        #
        je      .L64    #,
        incl    %eax    #
        movsbl  %dl,%ebx        #, D.12414
        movl    %eax, -16(%ebp) #, p

where the difference (apart from doing the increment differently and 
different register allocation) is that I have a "movzbl" (correct), while 
you have a "movl" (pure and utter crap).

I *suspect* that the compiler bug is along the lines of:
 (a) start off with movzbl
 (b) notice that the higher bits don't matter, because nobody subsequently 
     uses them
 (c) turn the thing into just a byte move. 
 (d) make the totally incorrect optimization of using a full 32-bit move 
     in order to avoid a partial register access stall

and the thing is, that final optimization can actually speed things up 
(although it can also slow things down for any access that crosses a cache 
sector boundary - 8/16 bytes), but it's seriously bogus, exactly because 
it can cause an invalid access to the three next bytes that may not even 
exist.

			Linus
-
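
The situation Linus describes can be approximated in user space. The sketch below is an illustration under stated assumptions, not kernel code: it uses the POSIX calls mmap() and mprotect() to put an inaccessible page directly after a readable one, standing in for the page that DEBUG_PAGEALLOC leaves unmapped, and it places a string so that its NUL is the last byte of the readable page, as the thread says d_path() does. The helper names (map_with_guard, place_at_page_end, count_bytes) are hypothetical.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map two pages and make the second one inaccessible, so that any
 * access past the end of the first page faults. Returns the start of
 * the first page, or NULL on failure. */
static char *map_with_guard(size_t *page_size_out)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0)
        return NULL;

    char *base = mmap(NULL, 2 * (size_t)ps, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    if (mprotect(base + ps, (size_t)ps, PROT_NONE) != 0) {
        munmap(base, 2 * (size_t)ps);
        return NULL;
    }
    *page_size_out = (size_t)ps;
    return base;
}

/* Copy `path` so that its terminating NUL is the last byte of the page.
 * Returns a pointer to the first character of the copy. */
static char *place_at_page_end(char *page, size_t page_size, const char *path)
{
    size_t len = strlen(path) + 1;   /* include the NUL */
    assert(len <= page_size);
    char *start = page + page_size - len;
    memcpy(start, path, len);
    return start;
}

/* Byte-wise scan: the C-level meaning of `char c = *p++` in a loop.
 * It reads one byte at a time and stops at the NUL. */
static size_t count_bytes(const char *p)
{
    size_t n = 0;
    char c;
    while ((c = *p++) != '\0')
        n++;
    return n;
}
```

For the path "/", place_at_page_end() returns the address two bytes before the page boundary, and count_bytes() reads exactly those two bytes, which is all the source code asks for. A 4-byte load from that same address would also cover two bytes of the guard page, which is the access the miscompiled kernel made.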

From: Ingo Molnar <mingo@...>
Subject: Re: [bug] crash when reading /proc/mounts (was: Re: Linux 2.6.23-rc9 and a heads-up for the 2.6.24 series..)
 [2]Date: Oct 3, 11:49 am 2007

* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> Your compiler generates
>
> 	movl    -16(%ebp),%edx
> 	movl    (%edx),%edi		/* this is _totally_ bogus! */
> 	incl    %edx
> 	movl    %edx,-16(%ebp)
> 	movl    %edi,%ecx
> 	testb   %cl,%cl
> 	je      ...

ah, ok.

> while I get (gcc version 4.1.2 20070925 (Red Hat 4.1.2-28)):
>
>         movl    -16(%ebp), %eax # p,
>         movzbl  (%eax), %edi    #, c	/* not bogus! */
>         movl    %edi, %edx      # c,
>         testb   %dl, %dl        #
>         je      .L64    #,
>         incl    %eax    #
>         movsbl  %dl,%ebx        #, D.12414
>         movl    %eax, -16(%ebp) #, p
>
> where the difference (apart from doing the increment differently and
> different register allocation) is that I have a "movzbl" (correct),
> while you have a "movl" (pure and utter crap).

i'll try with another compiler in a minute.

	Ingo
-

From: Ingo Molnar <mingo@...>
Subject: Re: [bug] crash when reading /proc/mounts (was: Re: Linux 2.6.23-rc9 and a heads-up for the 2.6.24 series..)
 [2]Date: Oct 3, 12:07 pm 2007

* Ingo Molnar <mingo@elte.hu> wrote:

>
> * Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> > Your compiler generates
> >
> > 	movl    -16(%ebp),%edx
> > 	movl    (%edx),%edi		/* this is _totally_ bogus! */
> > 	incl    %edx
> > 	movl    %edx,-16(%ebp)
> > 	movl    %edi,%ecx
> > 	testb   %cl,%cl
> > 	je      ...
>
> ah, ok.
>
> > while I get (gcc version 4.1.2 20070925 (Red Hat 4.1.2-28)):
> >
> >         movl    -16(%ebp), %eax # p,
> >         movzbl  (%eax), %edi    #, c	/* not bogus! */
> >         movl    %edi, %edx      # c,
> >         testb   %dl, %dl        #
> >         je      .L64    #,
> >         incl    %eax    #
> >         movsbl  %dl,%ebx        #, D.12414
> >         movl    %eax, -16(%ebp) #, p
> >
> > where the difference (apart from doing the increment differently and
> > different register allocation) is that I have a "movzbl" (correct),
> > while you have a "movl" (pure and utter crap).
>
> i'll try with another compiler in a minute.

i just tried:

   gcc version 4.1.2 20070626 (Red Hat 4.1.2-13)

and indeed the crash is gone. So you are completely right, it's a
compiler bug in 4.0.2 (it's vanilla gcc 4.0.2 built by me, not a distro
compiler).

It should not affect normal kernels too much - this bug needs
CONFIG_DEBUG_PAGEALLOC. (or it needs a _really_ unlucky allocation being
at the far upper end of RAM - but those are usually taken up by
boot-time allocations anyway).

i also just re-tried the other config as well - and crash is gone there
too. (not surprisingly)

	Ingo
-

From: Alan Cox <alan@...>
Subject: Re: [bug] crash when reading /proc/mounts (was: Re: Linux 2.6.23-rc9 and a heads-up for the 2.6.24 series..)
 [2]Date: Oct 3, 11:51 am 2007

> and btw, there is no question what-so-ever about whether your compiler 
> might be doing a legal optimization - the compiler really is wrong, and is 

Pedant: valid. Almost all optimizations are legal, nobody has yet written
laws about compilers. Sorry but I'm forever fixing misuse of the word
"illegal" in printks, docs and the like and it gets annoying after a bit.

> total shit. You need to make a gcc bug-report. Because this is not a 
> question of "the standard is ambiguous", 

Agreed - the standard is not ambiguous here. (For reference the standard
says that a valid pointer must point at an object _OR_ one past the end
of the object (in the latter case it is not dereferenceable)). So it's a
compiler bug.

Alan


-
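
Alan's point about one-past-the-end pointers can be illustrated with a short C sketch. It is not taken from the thread, and the function name sum_bytes is hypothetical. The pointer `end` below is valid to form and to compare against, but the loop never dereferences it, so the program never asks for a byte outside the buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Sum the bytes in buf[0..len). `end` points one past the last element:
 * it may be computed and compared, but it is never dereferenced. */
static unsigned sum_bytes(const unsigned char *buf, size_t len)
{
    const unsigned char *p = buf;
    const unsigned char *end = buf + len;
    unsigned total = 0;

    assert(buf != NULL);
    while (p != end)
        total += *p++;   /* reads exactly one byte inside the buffer */
    return total;
}
```

Because the source only ever reads bytes inside the object, generated code that reads beyond it (as the 32-bit load in this thread did) cannot rely on that memory existing.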

From: Linus Torvalds <torvalds@...>
Subject: Re: [bug] crash when reading /proc/mounts (was: Re: Linux 2.6.23-rc9 and a heads-up for the 2.6.24 series..)
 [2]Date: Oct 3, 12:09 pm 2007

On Wed, 3 Oct 2007, Alan Cox wrote:
>
> > and btw, there is no question what-so-ever about whether your compiler
> > might be doing a legal optimization - the compiler really is wrong, and is
>
> Pedant: valid. Almost all optimizations are legal, nobody has yet written
> laws about compilers. Sorry but I'm forever fixing misuse of the word
> "illegal" in printks, docs and the like and it gets annoying after a bit.

Heh.

When I'm ruler of the universe, it *will* be illegal. I'm just getting a
bit ahead of myself.

			Linus
-

From: Jan Engelhardt <jengelh@...>
Subject: Re: [bug] crash when reading /proc/mounts (was: Re: Linux 2.6.23-rc9 and a heads-up for the 2.6.24 series..)
 [2]Date: Oct 3, 12:25 pm 2007

On Oct 3 2007 09:09, Linus Torvalds wrote:
>On Wed, 3 Oct 2007, Alan Cox wrote:
>>
>> > and btw, there is no question what-so-ever about whether your compiler
>> > might be doing a legal optimization - the compiler really is wrong, and is
>>
>> Pedant: valid. Almost all optimizations are legal, nobody has yet written
>> laws about compilers. Sorry but I'm forever fixing misuse of the word
>> "illegal" in printks, docs and the like and it gets annoying after a bit.
>
>Heh.
>
>When I'm ruler of the universe, it *will* be illegal. I'm just getting a
>bit ahead of myself.

Any time frame when that will happen?
-

From: Linus Torvalds <torvalds@...>
Subject: Re: [bug] crash when reading /proc/mounts (was: Re: Linux 2.6.23-rc9 and a heads-up for the 2.6.24 series..)
 [2]Date: Oct 3, 1:07 pm 2007

On Wed, 3 Oct 2007, Jan Engelhardt wrote:
>
> >When I'm ruler of the universe, it *will* be illegal. I'm just getting a
> >bit ahead of myself.
>
> Any time frame when that will happen?

I'm working on it, I'm working on it. I'm just as frustrated as you are.
It turns out to be a non-trivial problem.

			Linus
-


Source URL:
http://kerneltrap.org/Linux/Compiler_Optimization_Bugs_and_World_Domination