Hi,
I skipped the public announcements for versions 5 and 6, but here is 7 :)
General description: kmemcheck is a patch to the Linux kernel that detects use of uninitialized memory. It does this by trapping every read and write to memory that was allocated dynamically (e.g. using kmalloc()). If a memory address is read that has not previously been written to, a message is printed to the kernel log.
Changes since v4 (rough list):
- SLUB parts were broken out into their own file to avoid cluttering the main SLUB code.
- Quite a lot of cleanups, including removing #ifdefs from arch code.
- Some preparation in anticipation of an x86_64 port.
- Make reporting safer by using a periodic timer to inspect the error queue.
- Fix hang due to page flags changing too early on free().
- Fix hang due to kprobes incompatibility.
- Allow CONFIG_SMP, but limit number of CPUs to 1 at run-time.
- Add kmemcheck=0|1 boot option.
- Add /proc/sys/kernel/kmemcheck for run-time enabling/disabling.
These patches apply to Linus's v2.6.25-rc8. The latest patchset can also be found here: http://folk.uio.no/vegardno/linux/kmemcheck/
(I will try to submit this for inclusion in 2.6.26, and testing and feedback are of course very welcome!)
I would like to thank the following people, who provided patches or helped in various ways:
Ingo Molnar
Paul McKenney
Pekka Enberg
Pekka Paalanen
Peter Zijlstra
Randy Dunlap
Kind regards,
Vegard Nossum
--
(reply to an e-mail of one month ago)
Hello Vegard,
It's a bit late, but I finally found out about your announcement of kmemcheck version 7. Are you familiar with the patch that adds support to Valgrind for User Mode Linux? I'm not sure what the best approach is: letting the kernel do its own checking, as kmemcheck does, or extending Valgrind so that it supports UML. Anyway, the techniques applied in Valgrind may be useful for kmemcheck too, such as the algorithms Valgrind uses to compress the memory state information.
See also: http://www.mail-archive.com/user-mode-linux-devel@lists.sourceforge.net/msg05602.html
Bart.
--
It's better to do it with the native kernel so you can "valgrind" all the interesting driver code. --
That's right. This is the paper I was referring to that details how to minimize the memory consumption when tracking state information: http://www.valgrind.org/docs/shadow-memory2007.pdf Bart. --
Hi!
On Sat, May 10, 2008 at 1:04 PM, Bart Van Assche wrote:
Yes, I learned of it not so long ago, around January or so. I wanted to stop kmemcheck development back then, but Ingo and Pekka convinced me that it could still be useful :-)
(The link is http://bitwagon.com/valgrind+uml/index.html)
I guess the main disadvantages of using kmemcheck over valgrind-memcheck are:
- kmemcheck can only warn eagerly, whereas memcheck will wait until the uninitialized bits are actually used. This means that kmemcheck will report many false positives. (We have some workarounds but this is obviously not perfect.)
- kmemcheck can only warn for dynamic memory, whereas memcheck, I believe, will also work for local variables, static variables, etc.
It would be interesting to compare the output of kmemcheck vs. the
Thanks. I have actually seen the paper before, but not read all of it. From a quick glance, it seems that the optimizations described there apply to the tracking of individual bits within a byte, but since we are tracking at byte granularity (as opposed to bit granularity), it also seems irrelevant to kmemcheck. (I am not saying that it isn't interesting, however.)
Currently, we are using a full byte for each shadowed byte. Since we actually only use two bits out of eight, we could save three fourths compared to what we use today. However, memory usage doesn't seem to be much of a problem. I actually think it might be worth saving the CPU cycles that are needed for the lookups/bit operations (memory is cheap, cycles aren't).
How is the speed of Valgrind+UML, does anybody know? Isn't there a problem that Valgrind will have to emulate all the userspace programs as well? That, I believe, would make the Valgrinded system painfully slow to work with. I have no benchmarks or profiler results to refer to, but kmemcheck at least boots to full userspace+X and is still quite usable.
Vegard
--
"The animistic metaphor of the bug that maliciously sneaked in while the programmer was...
The speed of Valgrind+UML is the same as the speed of valgrind on any application. On a 2GHz box it took about 2.5 minutes to reach "login:" from a cold boot of UML (includes udev, etc.) So if normal boot takes 15 seconds, then that's a factor of 10 slowdown: slow for interactivity, yet bearable for checking. The memory-intensive portions (linear search, pointer chasing, etc.) can be slower still, but loops that concentrate on register arithmetic or conditional branching go faster. There is almost no system wait time: normal device delays (disk, network) get totally overlapped by CPU usage for grinding :-) I'd like to have both kmemcheck and valgrind+UML, and use them differently. Run kmemcheck all the time on a box or two as "background trolling" for infrequent cases. Use valgrind+UML for interactivity and programmable flexibility when hunting specific bugs, or when hardware cannot be dedicated. -- John Reiser, jreiser@BitWagon.com --
No, I think valgrind+uml deliberately lets usermode code run directly on
the cpu, not under valgrind. Having the option to run everything under
Valgrind would be interesting, since it would allow you to trace
uninitialized values crossing the user-kernel boundary (both ways)
indicating either usermode or kernel bugs (also user to user via the
kernel, such as via a pipe).
I've thought about, but not actually implemented, running valgrind as a
Xen guest, and then running a sub-guest under it, allowing you to run an
entire virtual machine under Valgrind. I think people have done vaguely
similar stuff with qemu.
J
--
It can be done either way. Grinding userspace code as well is more uniform, as there's no need to say "this clone should not be followed, as it will become a UML process". On the other hand, not grinding processes means you don't need to figure out how to get the valgrind engine into your processes.
Jeff
-- Work email - jdike at linux dot intel dot com --
One easy way to force valgrind into a process is for load_elf_binary() in fs/binfmt_elf.c to force a PT_INTERP which loads memcheck via true user-mode calls, then chains to the original PT_INTERP. -- John Reiser, jreiser@BitWagon.com --
Keep in mind that a reduction in memory usage may reduce the number of cache misses, and that the improved caching behavior may outweigh the extra CPU cycles needed for the bit operations. Bart. --
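To make the trade-off concrete, here is a minimal sketch of the two shadow layouts under discussion: one full shadow byte per tracked byte, and four 2-bit states packed into each shadow byte. The state names, their numeric encoding, the region size and all function names are assumptions made for illustration; this is not kmemcheck's or Valgrind's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state encoding; two bits are enough for four states. */
enum shadow_state {
    SHADOW_UNALLOCATED   = 0,
    SHADOW_UNINITIALIZED = 1,
    SHADOW_INITIALIZED   = 2,
    SHADOW_FREED         = 3,
};

#define REGION_BYTES 4096u

/* Layout A: one full shadow byte per tracked byte (plain array lookup). */
static uint8_t shadow_bytes[REGION_BYTES];

static void byte_set(size_t i, enum shadow_state s)
{
    assert(i < REGION_BYTES);
    shadow_bytes[i] = (uint8_t)s;
}

static enum shadow_state byte_get(size_t i)
{
    assert(i < REGION_BYTES);
    return (enum shadow_state)shadow_bytes[i];
}

/* Layout B: four 2-bit states per shadow byte (a quarter of the memory). */
static uint8_t shadow_packed[REGION_BYTES / 4];

static void packed_set(size_t i, enum shadow_state s)
{
    assert(i < REGION_BYTES);
    unsigned shift = (unsigned)(i % 4) * 2;
    uint8_t *slot = &shadow_packed[i / 4];

    /* Clear the old 2-bit field, then merge in the new state. */
    *slot = (uint8_t)((*slot & ~(3u << shift)) | ((unsigned)s << shift));
}

static enum shadow_state packed_get(size_t i)
{
    assert(i < REGION_BYTES);
    unsigned shift = (unsigned)(i % 4) * 2;

    return (enum shadow_state)((shadow_packed[i / 4] >> shift) & 3u);
}
```

Layout B pays a divide, a shift and a mask on every lookup, and a read-modify-write on every update, in exchange for a shadow that is four times smaller. Which one wins in practice depends on cache behavior and would have to be measured.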
I don't think that's true. valgrind can only detect uninitialized local variables in one special case (first use of the stack region). But as soon as you reuse stack, which is pretty common, it won't be able to detect the next uninitialized use in a stack frame. Luckily the compilers do a reasonable job at detecting them at build time. And static/global variables are never uninitialized in C. -Andi --
It tracks changes to the stack pointer, and any memory below it is
considered uninitialized. But, yes, if you mean that if you use the
variable (or slot) once in a function, then again later, it will still
be considered initialized. But that's no different from any other memory.
J
--
But it does not invalidate anything below the stack pointer as soon
What I meant is e.g. f1(); f2(); both f1 and f2 use the same stack memory, but f2 uses it uninitialized, then I think valgrind would still think it is initialized in f2 from the execution of f1. It would only detect such things in f1 (assuming there were no other users of the stack before that).
In theory it could throw away all stack-related uninitializedness on each SP change, but that would likely be prohibitively expensive, and it might also be hard to know the exact boundaries of the stack.
BTW, running a test program here, it doesn't seem to detect any uninitialized stack frames with 3.2.3. The test program is http://halobates.de/t10.c (should be compiled without optimization).
-Andi
--
Yeah, as soon as the stack pointer changes, everything below it is invalidated (except if the stack-pointer change was actually determined
No, it won't. If the stack pointer goes up then down between f1 and f2, then f2 will get fresh values.
The big thing Valgrind hasn't traditionally helped with is overruns of
No, it's not all that expensive compared to the overall cost of valgrind and the amount of diagnostic power it provides.
Determining stack boundaries has always been a bit fraught. Typically a stack switch has been determined heuristically by looking for a "large" change in stack pointer, but there's a callback to specifically mark a range of memory as a stack, so that movements into and out of a stack can be determined as a switch (added specifically to deal with small densely packed stacks
Hm, I'd expect it to. Oh, your test program doesn't use the value. Valgrind doesn't complain about uninitialized values unless they actually affect execution (i.e., a conditional depends on one, you use it as an address for a dereference, or pass it to a syscall). The attached version emits errors as I'd expect:
$ valgrind t10
==30474== Memcheck, a memory error detector.
==30474== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==30474== Using LibVEX rev 1804, a library for dynamic binary translation.
==30474== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==30474== Using valgrind-3.3.0, a dynamic binary instrumentation framework.
==30474== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==30474== For more details, rerun with: -v
==30474==
f1 set y to 1
==30474== Conditional jump or move depends on uninitialised value(s)
==30474== at 0x8048420: test (t10.c:22)
==30474== by 0x8048451: main (t10.c:29)
==30474==
==30474== Use of uninitialised value of size 4
==30474== at 0xB5C5B6: _itoa_word (in /lib/libc-2.8.so)
==30474== by 0xB5FF90: vfprintf (in /lib/libc-2.8.so)
==30474== by 0xB6769F: printf (in /l...
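The difference between warning eagerly (on any load of an undefined value) and warning lazily (only when the value affects execution) can be modelled in a few lines. This is a toy model and not how memcheck or kmemcheck are implemented: the struct, the counters and every function name below are made up for illustration, and the "undefined" slot is given a concrete value of 0 so that the program itself has well-defined behavior.

```c
#include <assert.h>
#include <stdbool.h>

/* A value paired with a "defined" flag, standing in for shadow state. */
struct tracked {
    int  value;
    bool defined;
};

static int eager_reports;  /* reports raised at load time */
static int lazy_reports;   /* reports raised only at a deciding use */

/* Eager policy: any load of an undefined value is reported at once. */
static struct tracked load_eager(const struct tracked *src)
{
    if (!src->defined)
        eager_reports++;
    return *src;
}

/* Lazy policy: loads and copies just carry the flag along. */
static struct tracked load_lazy(const struct tracked *src)
{
    return *src;
}

/* Lazy policy: report only when the value decides control flow. */
static bool branch_lazy(struct tracked t)
{
    if (!t.defined)
        lazy_reports++;
    return t.value != 0;
}

/* Load a never-written slot but do nothing with the result. */
static void copy_only(void)
{
    struct tracked slot = { 0, false };
    struct tracked a = load_eager(&slot);  /* reported, arguably a false positive */
    struct tracked b = load_lazy(&slot);   /* silent: the value is never used */

    (void)a;
    (void)b;
}

/* Load a never-written slot and then branch on it. */
static bool copy_and_branch(void)
{
    struct tracked slot = { 0, false };

    (void)load_eager(&slot);               /* reported at the load */
    return branch_lazy(load_lazy(&slot));  /* reported at the branch */
}
```

In this model, copy_only() produces one eager report and no lazy report, which mirrors a test program that reads a value without using it. copy_and_branch() is reported under both policies.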
The valgrind+uml patches added a callback, "I am switching stacks >NOW<." If possible then it is better to tell an interpreter what is happening, rather than requiring that the interpreter [try to] figure it out. -- John Reiser, jreiser@BitWagon.com --
Hm, I never particularly liked that approach because unless you do the
whole thing in assembly it was never certain that there wasn't a
basic-block break between them (ie, atomic with respect to valgrind).
For the kernel that may be possible, but I was thinking of the general
Matter of taste really, but I tend to disagree. If you say something
like "addresses A-B, C-D, E-F are stacks", then the stack pointer
changing from the range A-B to C-D is a pretty clear indication of stack
switch, regardless of the mechanism you use to do it. Of course, an
explicit hint prevents an accidental push/pop of 32k onto an 8K stack
from being considered a stack switch, but unless you actually know where
the stacks are, you can't warn about it or prevent it from
validating/invalidating a pile of innocent memory.
J
--
It might in theory, but at least it doesn't for my test program.
-Andi
--
If you'd read a tiny bit further down my mail, you'd have seen my
explanation of why your test program isn't testing what you think it is,
and a variant which does.
J
--
As long as the compiler is not told to optimize the compiled code, Valgrind's memcheck tool is able to detect uninitialized local variables. Among other things, Valgrind tracks all updates of the stack pointer. If the stack pointer is increased, the memory range between the old and the new stack pointer is marked as undefined. This works as long as gcc doesn't optimize away individual stack pointer updates.
(I'm one of the Valgrind developers.)
Bart.
--
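The f1(); f2(); case discussed in this thread can be sketched as a toy model of stack-pointer tracking. The assumption here, loudly: every stack-pointer move marks the bytes between the old and the new stack pointer as undefined, regardless of direction. That is a simplification for illustration, not a description of memcheck's internals, and all names and sizes are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define STACK_BYTES 256u

static bool   defined[STACK_BYTES];  /* shadow: true once a byte is written */
static size_t sp = STACK_BYTES;      /* stack grows towards index 0 */

/* Assumed rule: any SP move invalidates the range between old and new SP. */
static void set_sp(size_t new_sp)
{
    assert(new_sp <= STACK_BYTES);
    size_t lo = new_sp < sp ? new_sp : sp;
    size_t hi = new_sp < sp ? sp : new_sp;

    if (hi > lo)
        memset(&defined[lo], 0, (hi - lo) * sizeof(defined[0]));
    sp = new_sp;
}

/* Record a write to the byte at sp + off. */
static void store(size_t off)
{
    assert(sp + off < STACK_BYTES);
    defined[sp + off] = true;
}

/* Ask the shadow whether the byte at sp + off has been written. */
static bool is_defined(size_t off)
{
    assert(sp + off < STACK_BYTES);
    return defined[sp + off];
}

/* f1 allocates a 16-byte frame and writes its first slot. */
static void f1(void)
{
    set_sp(sp - 16);
    store(0);
    set_sp(sp + 16);
}

/* f2 reuses the same frame but only reads the slot. */
static bool f2(void)
{
    set_sp(sp - 16);
    bool slot_defined = is_defined(0);
    set_sp(sp + 16);
    return slot_defined;
}
```

Calling f1() and then f2() makes f2() see its slot as undefined, because the frame was invalidated when f1 returned and again when f2 was entered. Without the set_sp() bookkeeping, the slot would still look initialized from f1's write.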
From d3844118edba5548cce8d27a78bb15b8d6aded66 Mon Sep 17 00:00:00 2001
From: Vegard Nossum <vegard.nossum@gmail.com>
Date: Fri, 4 Apr 2008 00:54:48 +0200
Subject: [PATCH] slub: add hooks for kmemcheck
With kmemcheck enabled, SLUB needs to do this:
1. Request twice as much memory as would normally be needed. The bottom half
of the memory is what the user actually sees and uses; the upper half
contains the so-called shadow memory, which stores the status of each byte
in the bottom half, e.g. initialized or uninitialized.
2. Tell kmemcheck which parts of memory should be marked uninitialized.
There are actually a few more states, such as "not yet allocated" and
"recently freed".
If a slab cache is set up using the SLAB_NOTRACK flag, it will never return
memory that can take page faults because of kmemcheck.
If a slab cache is NOT set up using the SLAB_NOTRACK flag, callers can still
request memory with the __GFP_NOTRACK flag. This does not prevent the page
faults from occurring, however, but marks the object in question as being
initialized so that no warnings will ever be produced for this object.
Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
include/linux/gfp.h | 3 +-
include/linux/slab.h | 7 +++
include/linux/slub_def.h | 17 ++++++++
kernel/fork.c | 15 ++++---
mm/Makefile | 3 +
mm/slub.c | 36 ++++++++++++-----
mm/slub_kmemcheck.c | 99 ++++++++++++++++++++++++++++++++++++++++++++++
7 files changed, 161 insertions(+), 19 deletions(-)
create mode 100644 mm/slub_kmemcheck.c
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 164be9d..0faeedc 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -50,8 +50,9 @@ struct vm_area_struct;
#define __GFP_THISNODE ((__force gfp_t)0x40000u)/* No fallback, no policies */
#defin...
From caf2b1b7198bd4e0d690555be389a3444523162d Mon Sep 17 00:00:00 2001
From: Vegard Nossum <vegard.nossum@gmail.com>
Date: Fri, 4 Apr 2008 00:53:23 +0200
Subject: [PATCH] x86: add hooks for kmemcheck
The hooks that we modify are:
- Page fault handler (to handle kmemcheck faults)
- Debug exception handler (to hide pages after single-stepping the instruction that caused the page fault)
Also redefine memset() to use the optimized version if kmemcheck is enabled.
Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
arch/x86/kernel/cpu/common.c | 7 +++++++
arch/x86/kernel/entry_32.S | 8 ++++----
arch/x86/kernel/traps_32.c | 16 +++++++++++++++-
arch/x86/mm/fault.c | 25 +++++++++++++++++++++----
include/asm-x86/string_32.h | 8 ++++++++
5 files changed, 55 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index a38aafa..040c650 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -634,6 +634,13 @@ void __init early_cpu_init(void)
nexgen_init_cpu();
umc_init_cpu();
early_cpu_detect();
+
+#ifdef CONFIG_KMEMCHECK
+ /*
+ * We need 4K granular PTEs for kmemcheck:
+ */
+ setup_clear_cpu_cap(X86_FEATURE_PSE);
+#endif
}
/* Make sure %fs is initialized properly in idle threads */
diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index 4b87c32..54f477c 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -289,7 +289,7 @@ ENTRY(ia32_sysenter_target)
CFI_DEF_CFA esp, 0
CFI_REGISTER esp, ebp
movl TSS_sysenter_sp0(%esp),%esp
-sysenter_past_esp:
+ENTRY(sysenter_past_esp)
/*
* No need to follow this irqs on/off section: the syscall
* disabled irqs and here we enable it straight after entry:
@@ -767,7 +767,7 @@ label: \
CFI_ADJUST_CFA_OFFSET 4; \
CFI_REL...
From bb63a1de75a67ecd88c962c2616f0ab4217f27fe Mon Sep 17 00:00:00 2001
From: Vegard Nossum <vegard.nossum@gmail.com>
Date: Fri, 4 Apr 2008 00:51:41 +0200
Subject: [PATCH] kmemcheck: add the kmemcheck core
General description: kmemcheck is a patch to the linux kernel that detects use of uninitialized memory. It does this by trapping every read and write to memory that was allocated dynamically (e.g. using kmalloc()). If a memory address is read that has not previously been written to, a message is printed to the kernel log.
Signed-off-by: Vegard Nossum <vegardno@ifi.uio.no>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
Documentation/kmemcheck.txt | 93 +++++
arch/x86/Kconfig.debug | 47 +++
arch/x86/kernel/Makefile | 2 +
arch/x86/kernel/kmemcheck.c | 893 ++++++++++++++++++++++++++++++++++++++++++
include/asm-x86/kmemcheck.h | 30 ++
include/asm-x86/pgtable.h | 4 +-
include/asm-x86/pgtable_32.h | 6 +
include/linux/kmemcheck.h | 27 ++
include/linux/page-flags.h | 6 +
init/main.c | 2 +
kernel/sysctl.c | 12 +
11 files changed, 1120 insertions(+), 2 deletions(-)
create mode 100644 Documentation/kmemcheck.txt
create mode 100644 arch/x86/kernel/kmemcheck.c
create mode 100644 include/asm-x86/kmemcheck.h
create mode 100644 include/linux/kmemcheck.h
diff --git a/Documentation/kmemcheck.txt b/Documentation/kmemcheck.txt
new file mode 100644
index 0000000..9d359d2
--- /dev/null
+++ b/Documentation/kmemcheck.txt
@@ -0,0 +1,93 @@
+Technical description
+=====================
+
+kmemcheck works by marking memory pages non-present. This means that whenever
+somebody attempts to access the page, a page fault is generated. The page
+fault handler notices that the page was in fact only hidden, and so it calls
+on the kmemcheck code to make further investigations.
+
+When the investigations are completed, kme...
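As a rough user-space illustration of the scheme described in this series (request twice the memory, keep the caller's data in the bottom half and one status byte per data byte in the upper half, and warn when a byte is read before it has been written), here is a minimal sketch. It uses malloc() and explicit accessor functions in place of the page-fault trapping the patches rely on, and the struct, the two-state encoding and all function names are assumptions for illustration only, not code from the patches.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical two-state encoding; the patches mention further states. */
enum { ST_UNINITIALIZED = 0, ST_INITIALIZED = 1 };

struct tracked_obj {
    uint8_t *data;      /* bottom half: what the caller sees and uses */
    uint8_t *shadow;    /* upper half: one status byte per data byte */
    size_t   size;
    unsigned warnings;  /* stands in for messages in the kernel log */
};

/* Request twice the memory and mark the whole object uninitialized. */
static int tracked_alloc(struct tracked_obj *o, size_t size)
{
    uint8_t *mem = malloc(2 * size);

    if (!mem)
        return -1;
    o->data = mem;
    o->shadow = mem + size;
    o->size = size;
    o->warnings = 0;
    memset(o->shadow, ST_UNINITIALIZED, size);
    return 0;
}

/* A write stores the byte and marks it initialized in the shadow. */
static void tracked_write(struct tracked_obj *o, size_t i, uint8_t v)
{
    assert(i < o->size);
    o->data[i] = v;
    o->shadow[i] = ST_INITIALIZED;
}

/* A read of a never-written byte counts a warning instead of being trusted. */
static uint8_t tracked_read(struct tracked_obj *o, size_t i)
{
    assert(i < o->size);
    if (o->shadow[i] != ST_INITIALIZED) {
        o->warnings++;
        return 0;  /* this sketch avoids reading indeterminate memory */
    }
    return o->data[i];
}

static void tracked_free(struct tracked_obj *o)
{
    free(o->data);  /* data points at the start of the allocation */
    o->data = NULL;
    o->shadow = NULL;
    o->size = 0;
}
```

In the real patches the accesses are ordinary loads and stores that fault because the page is hidden, so no caller has to go through accessors like these. The sketch only shows the bookkeeping that happens once an access has been caught.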
