Add caller information so that /proc/vmallocinfo shows where the allocation request for a slice of vmalloc memory originated. Result in output like this: 0xffffc20000000000-0xffffc20000801000 8392704 alloc_large_system_hash+0x127/0x246 pages=2048 vmalloc vpages 0xffffc20000801000-0xffffc20000806000 20480 alloc_large_system_hash+0x127/0x246 pages=4 vmalloc 0xffffc20000806000-0xffffc20000c07000 4198400 alloc_large_system_hash+0x127/0x246 pages=1024 vmalloc vpages 0xffffc20000c07000-0xffffc20000c0a000 12288 alloc_large_system_hash+0x127/0x246 pages=2 vmalloc 0xffffc20000c0a000-0xffffc20000c0c000 8192 acpi_os_map_memory+0x13/0x1c phys=cff68000 ioremap 0xffffc20000c0c000-0xffffc20000c0f000 12288 acpi_os_map_memory+0x13/0x1c phys=cff64000 ioremap 0xffffc20000c10000-0xffffc20000c15000 20480 acpi_os_map_memory+0x13/0x1c phys=cff65000 ioremap 0xffffc20000c16000-0xffffc20000c18000 8192 acpi_os_map_memory+0x13/0x1c phys=cff69000 ioremap 0xffffc20000c18000-0xffffc20000c1a000 8192 acpi_os_map_memory+0x13/0x1c phys=fed1f000 ioremap 0xffffc20000c1a000-0xffffc20000c1c000 8192 acpi_os_map_memory+0x13/0x1c phys=cff68000 ioremap 0xffffc20000c1c000-0xffffc20000c1e000 8192 acpi_os_map_memory+0x13/0x1c phys=cff68000 ioremap 0xffffc20000c1e000-0xffffc20000c20000 8192 acpi_os_map_memory+0x13/0x1c phys=cff68000 ioremap 0xffffc20000c20000-0xffffc20000c22000 8192 acpi_os_map_memory+0x13/0x1c phys=cff68000 ioremap 0xffffc20000c22000-0xffffc20000c24000 8192 acpi_os_map_memory+0x13/0x1c phys=cff68000 ioremap 0xffffc20000c24000-0xffffc20000c26000 8192 acpi_os_map_memory+0x13/0x1c phys=e0081000 ioremap 0xffffc20000c26000-0xffffc20000c28000 8192 acpi_os_map_memory+0x13/0x1c phys=e0080000 ioremap 0xffffc20000c28000-0xffffc20000c2d000 20480 alloc_large_system_hash+0x127/0x246 pages=4 vmalloc 0xffffc20000c2d000-0xffffc20000c31000 16384 tcp_init+0xd5/0x31c pages=3 vmalloc 0xffffc20000c31000-0xffffc20000c34000 12288 alloc_large_system_hash+0x127/0x246 pages=2 ...
please use one simple save_stack_trace() instead of polluting a dozen Ingo --
save_stack_trace() depends on CONFIG_STACKTRACE which is only available when debugging is compiled it. I was more thinking about this as a generally available feature. --
then make STACKTRACE available generally via the patch below. Ingo -------------------------------------------> Subject: debugging: always enable stacktrace From: Ingo Molnar <mingo@elte.hu> Date: Fri Mar 21 11:48:32 CET 2008 Signed-off-by: Ingo Molnar <mingo@elte.hu> --- lib/Kconfig.debug | 1 - 1 file changed, 1 deletion(-) Index: linux-x86.q/lib/Kconfig.debug =================================================================== --- linux-x86.q.orig/lib/Kconfig.debug +++ linux-x86.q/lib/Kconfig.debug @@ -387,7 +387,6 @@ config DEBUG_LOCKING_API_SELFTESTS config STACKTRACE bool - depends on DEBUG_KERNEL depends on STACKTRACE_SUPPORT config DEBUG_KOBJECT --
How do I figure out which nesting level to display if we'd do this? --
the best i found for lockdep was to include a fair number of them, and to skip the top 3. struct vm_area that vmalloc uses isnt space-critical, so 4-8 entries with a 3 skip would be quite ok. (but can be more than that as well) Ingo --
STACKTRACE depends on STACKTRACE_SUPPORT which is not available on all arches? alpha blackfin ia64 etc are missing it? I thought there were also issues on x86 with optimizations leading to weird stacktraces? --
at most there can be extra stack trace entries. This is for debugging, so if someone wants exact stacktraces, FRAME_POINTERS will certainly improve them. Ingo --
as long as the new code in question is properly ifdef-ed, making it rely on STACKTRACE sounds fine. i'll open an item in our Blackfin tracker to add support for it. -mike --
i _specifically_ objected to the uglification that this patch brings
with itself to the modified arch/x86 files (see the diff excerpt below),
in:
http://lkml.org/lkml/2008/3/19/450
i pointed out how it should be done _much cleaner_ (and much smaller -
only a single patch needed) via stack-trace, without changing a dozen
architectures, and even gave a patch to make it all easier for you:
http://lkml.org/lkml/2008/3/19/568
http://lkml.org/lkml/2008/3/21/88
in fact, a stacktrace printout is much more informative as well to
users, than a punny __builtin_return_address(0)!
but you did not reply to my objections in substance, hence i considered
the issue closed - but you apparently went ahead without addressing my
concerns (which are rather obvious to anyone doing debug code) and now
this ugly code is upstream.
If lockdep can get stacktrace samples from all around the kernel without
adding "caller" info parameters to widely used APIs, then the MM is
evidently able to do it too. _Saving_ a stacktrace is relatively fast
[printing it to the console is what is slow], and vmalloc() is an utter
slowpath anyway [and 1 million file descriptors does not count as a
fastpath].
If performance is of any concern then make it dependent on
CONFIG_DEBUG_VM or whatever debug switch in the MM - that will be
_faster_ in the default case than the current
pass-parameter-deep-down-the-arch crap you've pushed here. I dont
remember the last time i genuinely needed the allocation site of a
vmalloc().
I any case, do _NOT_ pollute any architectures with stack debugging
hacks (and that holds for future similar patches too), that's why we
wrote stacktrace. This needs to be reverted or fixed properly.
--
Sorry lost track of this issue. Adding stracktrace support is not a trivial thing and will change the basic handling of vmallocinfo. Not sure if stacktrace support can be enabled without a penalty on various platforms. Doesnt this require stackframes to be formatted in a certain way? --
On Tue, 29 Apr 2008 10:08:29 -0700 (PDT) it doesn't. --
Hmmm... Why do we have CONFIG_FRAMEPOINTER then? The current implementation of vmalloc_caller() follows what we have done with kmalloc_track_caller. Its low overhead and always on. It would be great if we could have stacktrace support both for kmalloc and vmalloc in the same way also with low overhead but I think following a backtrace requires much more than simply storing the caller address. A mechanism like that would require an explicit kernel CONFIG option. A year or so ago we had patches to implement stacktraces in the slab allocators but they were not merged due to various arch specific issues with backtraces. We could dump the offending x86_64 pieces. Some detail of what /proc/vmallocinfo would be lost then. --
On Tue, 29 Apr 2008 11:49:54 -0700 (PDT) stacktraces aren't entirely free, the cost is O(nr of modules) unfortunately ;( --
Well so we display out of whack backtraces? There are also issues on platforms that do not have a stack in the classic sense (rotating register file on IA64 and Sparc64 f.e.). Determining a backtrace can be very The current implementation /proc/vmallocinfo avoids these issues and with just one caller address it can print one line per vmalloc request. --
I think that's the key question here whether we need to enable this on production systems? If yes, why? If it's just a debugging aid, then I see Ingo's point of save_stack_trace(); otherwise the low-overhead __builtin_return_address() makes more sense. And btw, why is this new file not in /sys/kernel....? --
Actually, this is vmalloc() so why do we even care? If there are callers in the tree that use vmalloc() for performance sensitive stuff, they ought to be converted to kmalloc() anyway, no? --
they have to solve that for kernel oopses and for lockdep somehow anyway. Other users of stacktrace are: fault injection, kmemcheck, latencytop, ftrace. All new debugging and instrumentation code uses it, and for a good reason. Ingo --
