Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected

From: Linus Torvalds
Date: Tuesday, August 26, 2008 - 12:40 pm

On Tue, 26 Aug 2008, Mike Travis wrote:

Well, even just scripts/checkstack.pl is quite relevant.

The fact is, anything with a stack footprint of more than a hundred bytes 
is suspect. We _do_ have a lot of cases of several hundred bytes, and some 
of them are even very intentional.

For an example of _intentional_ and valid large stacks, look at 
do_sys_poll and do_select. They both have a big stack footprint in a 
normal kernel, and that's on purpose - it's not pretty, but they are very 
common and performance-sensitive functions, and using a big stack allows 
some basic allocations to be much cheaper by default.
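
To illustrate the trade-off (this is a user-space sketch, not the kernel's do_sys_poll or do_select code), the pattern is a fixed scratch buffer on the stack for the common small case, with a heap allocation only when the request does not fit. The names sum_with_scratch and ON_STACK_BYTES, and the 256-byte size, are assumptions made up for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical scratch size; the kernel uses its own constants. */
#define ON_STACK_BYTES 256

/*
 * Copy n ints into scratch space and sum them.
 * Small inputs use the on-stack buffer (no allocator call at all);
 * larger inputs fall back to malloc(). *used_heap reports which path
 * was taken. Returns -1 if the heap allocation fails.
 */
long sum_with_scratch(const int *in, size_t n, int *used_heap)
{
    int stack_buf[ON_STACK_BYTES / sizeof(int)];
    int *buf = stack_buf;
    long total = 0;
    size_t i;

    assert(in != NULL || n == 0);

    *used_heap = 0;
    if (n > sizeof(stack_buf) / sizeof(stack_buf[0])) {
        buf = malloc(n * sizeof(*buf));
        if (!buf)
            return -1;
        *used_heap = 1;
    }

    if (n > 0)
        memcpy(buf, in, n * sizeof(*buf));
    for (i = 0; i < n; i++)
        total += buf[i];

    if (buf != stack_buf)
        free(buf);
    return total;
}
```

The cost is that every call pays ON_STACK_BYTES of stack whether it needs it or not, which is why such a function shows up in a stack-footprint listing even though the choice is deliberate.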

Same goes for early_printk(), although I don't think the reasons are 
really very strong in that case.

Sadly, while those functions are _fairly_ high up, they aren't at the top, 
and we do have a lot of other functions that have huge stack footprints 
for totally bogus reasons. But the intentional ones are at least in the 
top ten.

But the kernel that Alan had problems with was different. The 
_intentional_ ones were way down in the noise.  do_sys_poll wasn't in the 
top ten, it was barely even in the top 50! (It was in fact #49, to be 
exact).

So look at the top ten in my kernel:

     1  ide_generic_init [vmlinux]:               1384
     2  idefloppy_ioctl [vmlinux]:                1208
     3  e1000_check_options [vmlinux]:            1152
     4  do_sys_poll [vmlinux]:                    904
     5  ide_floppy_get_capacity [vmlinux]:        872
     6  do_select [vmlinux]:                      744
     7  early_printk [vmlinux]:                   720
     8  do_task_stat [vmlinux]:                   680
     9  mmc_ioctl [vmlinux]:                      648
    10  elf_kcore_store_hdr [vmlinux]:            576

.. and in Alan's kernel:

     1  smp_call_function_mask [vmlinux]:         2736
     2  __build_sched_domains [vmlinux]:          2232
     3  setup_IO_APIC_irq [vmlinux]:              1616
     4  arch_setup_ht_irq [vmlinux]:              1600
     5  arch_setup_msi_irq [vmlinux]:             1600
     6  __assign_irq_vector [vmlinux]:            1592
     7  move_task_off_dead_cpu [vmlinux]:         1592
     8  tick_handle_oneshot_broadcast [vmlinux]:  1544
     9  store_scaling_governor [vmlinux]:         1376
    10  cpuset_write_resmask [vmlinux]:           1360

That's a big difference. The #1 entry in my kernel would just _barely_ make 
the top 10 in Alan's kernel (he doesn't have it at all, because he didn't 
compile the drivers I did into the kernel).

And the top three in my kernel are just because of crap code. That 
"e1000_check_options" thing is there just because it creates multiple 
"struct e1000_option" structures. I wrote an ugly but totally trivial 
patch to get it down to ~600 bytes, and it would be less if I had bothered 
to waste any more time on it.
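
As a rough sketch of why several descriptor structs in one function add up, and of one way to avoid it: the code below keeps the descriptors in static const storage so they take no stack at all. This is not the patch mentioned above and not the driver's code. struct option_desc, check_option, check_all and all the limits are hypothetical stand-ins, and the approach assumes the descriptors are compile-time constants.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in; NOT the driver's real struct e1000_option. */
struct option_desc {
    const char *name;
    int min, max, def;
    char pad[96];               /* mimics a bulky descriptor */
};

/*
 * Declared as locals in separate block scopes, each descriptor may get
 * its own stack slot. As static const objects they live in read-only
 * data instead and cost the function no stack.
 */
static const struct option_desc tx_opt = {
    .name = "tx", .min = 80, .max = 256, .def = 256,
};
static const struct option_desc rx_opt = {
    .name = "rx", .min = 80, .max = 256, .def = 256,
};

/* Return value if it is within [min, max], else the default. */
int check_option(int value, const struct option_desc *opt)
{
    assert(opt != NULL);
    if (value < opt->min || value > opt->max)
        return opt->def;
    return value;
}

/* Validate both settings in place; only pointers are passed around. */
void check_all(int *tx, int *rx)
{
    *tx = check_option(*tx, &tx_opt);
    *rx = check_option(*rx, &rx_opt);
}
```

Whether a given compiler would have shared stack slots between the original locals depends on the compiler and its flags, so the actual saving has to be measured rather than assumed.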

The others are similar issues of "people just didn't think".

But look at the top ones in Alan's kernel. Not only are they _much_ bigger 
than the top ones in a sane kernel, they are _all_ due to cpumask_t, I 
think.
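
To see why cpumask_t hurts, recall that it is a struct wrapping a bitmap with one bit per possible CPU, so its size scales with NR_CPUS. The sketch below is user-space C, not kernel code. NR_CPUS = 4096 is an assumption chosen for illustration (the configuration of Alan's kernel is not given here), and mask_set_cpu and mask_weight are hypothetical helpers.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define NR_CPUS 4096            /* assumed value, for illustration only */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Same shape as the kernel type: a bitmap of NR_CPUS bits in a struct. */
typedef struct {
    unsigned long bits[BITS_TO_LONGS(NR_CPUS)];
} cpumask_t;

/* Set one CPU's bit. Taking a pointer avoids copying the whole mask. */
void mask_set_cpu(cpumask_t *m, unsigned int cpu)
{
    assert(m != NULL && cpu < NR_CPUS);
    m->bits[cpu / BITS_PER_LONG] |= 1UL << (cpu % BITS_PER_LONG);
}

/* Count the set bits, clearing the lowest set bit on each iteration. */
int mask_weight(const cpumask_t *m)
{
    int count = 0;
    size_t i;

    for (i = 0; i < BITS_TO_LONGS(NR_CPUS); i++) {
        unsigned long w = m->bits[i];

        while (w) {
            w &= w - 1;
            count++;
        }
    }
    return count;
}
```

With this NR_CPUS, sizeof(cpumask_t) is 512 bytes, so every cpumask_t local, by-value argument or temporary costs half a kilobyte of stack, and a function holding a few of them lands in the kilobyte range.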

			Linus