Hi,

I enabled CONFIG_SECURITY on current git and get tons of "Redzone overwritten" errors during early boot, even with CONFIG_SECURITY_CAPABILITIES and CONFIG_SECURITY_NETWORK disabled. After a while it ends with a kernel panic saying: "not syncing: Out of memory and no killable process...". The root partition is ext3.

At the moment I don't have a camera at hand, so I'll try to write down everything that looks interesting; please tell me if I missed something.

The first 24 bytes of the overwritten section contain zeros. Then we have a constant 0x18 and three changing values. The next three bytes contain exactly the same values: first the 0x18, then the two changing ones. The only value I have found so far that matches the 0x18 and might be related to CONFIG_SECURITY is CAP_SYS_RESOURCE, defined in /include/linux/capability.h.

BUG hugetlbfs_inode_cache: Redzone overwritten
INFO: 0xccd8e250-0xccd8e253. First byte 0x0 instead of 0xbb
Info: Slab 0xc119d1c0 objects=12 used=0 fs=0xccd8e000 flags=0x400020c3
Info: Object 0xccd8e00 offset=0 fp=0xccd8e280
Object 0xccd8e00: 00 00 00 ...
Object 0xccd8e10: 00 00 00 00 00 00 00 00 00 18 e0 d8 cc 18 e0 d8 cc
Object 0xccd8e20 00 00 00 ...
...
Pid: 1, comm:swapper Not tainted 2.6.26-rc3-00436-gb373303 #42
print_trailer
check_bytes_and_report
check_object
__slab_alloc
kmem_cache_alloc
? hugetlbfs_alloc_inode
? hugetlbfs_alloc_inode
hugetlbfs_alloc_inode
alloc_inode
new_inode
hugetlbfs_get_inode
hugetlbfs_fill_super
? sget
? set_anon_super
get_sb_nodev
hugetlbfs_get_sb
? hugetlbfs_fill_super
vfs_kern_mount
kern_mount_data
init_hugetlbfs_fs
? init_once
? kernel_init
kernel_init

Config follows:

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.26-rc3
# Mon May 26 15:10:47 2008
#
# CONFIG_64BIT is not set
CONFIG_X86_32=y
# CONFIG_X86_64 is not set
CONFIG_X86=y
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/i386_defconfig"
# CONFIG_GENERIC_LOCKBREAK is not ...
Hi,
I tested a kmemcheck kernel in an attempt to debug
this further. It seems CONFIG_SECURITY is unrelated to
this, but slub debugging only catches the
overwrite if I enable CONFIG_SECURITY.
With slub_debug=FZPU I get the warning at
init_object+0x63:
(gdb) l *(init_object+0x63)
0xc0187243 is in init_object (mm/slub.c:544).
539 {
540 u8 *p = object;
541
542 if (s->flags & __OBJECT_POISON) {
543 memset(p, POISON_FREE, s->objsize - 1);
544 p[s->objsize - 1] = POISON_END;
545 }
546
547 if (s->flags & SLAB_RED_ZONE)
548 memset(p + s->objsize,
If I set slub_debug=- I get the kmemcheck warning at
__slab_alloc+0x238:
(gdb) l *(__slab_alloc+0x238)
0xc0187bc8 is in __slab_alloc (mm/slub.c:303).
298 return *(void **)(object + s->offset);
299 }
300
301 static inline void set_freepointer(struct kmem_cache *s, void *object, void *fp)
302 {
303 *(void **)(object + s->offset) = fp;
304 }
305
306 /* Loop over all objects in a slab */
307 #define for_each_object(__p, __s, __addr, __objects) \
I used the kmemcheck git tree from
git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-kmemcheck-4.git
In case you need some of the other kmemcheck output please
let me know.
Greetings, Eric
--
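For readers following the second gdb listing: the two helpers around mm/slub.c:303 keep the "next free object" pointer inside the free object itself, at byte offset s->offset. Below is a minimal user-space sketch of that idea, not the kernel code. The names toy_cache, toy_get_freepointer, toy_set_freepointer, toy_build_freelist and toy_count_free are hypothetical and only mirror the shape of the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, much simplified stand-in for struct kmem_cache. */
struct toy_cache {
    size_t size;   /* bytes per object slot */
    size_t offset; /* where the free pointer lives inside a free object */
};

/* Read the embedded "next free" pointer out of a free object. */
static void *toy_get_freepointer(const struct toy_cache *s, void *object)
{
    void *fp;
    memcpy(&fp, (char *)object + s->offset, sizeof(fp));
    return fp;
}

/* Store the "next free" pointer inside a free object. */
static void toy_set_freepointer(const struct toy_cache *s, void *object, void *fp)
{
    memcpy((char *)object + s->offset, &fp, sizeof(fp));
}

/* Chain every slot of a buffer into a freelist and return its head. */
static void *toy_build_freelist(const struct toy_cache *s, void *slab, size_t objects)
{
    char *base = slab;

    /* The pointer must fit inside one slot. */
    assert(s->offset + sizeof(void *) <= s->size);

    for (size_t i = 0; i < objects; i++) {
        void *next = (i + 1 < objects) ? base + (i + 1) * s->size : NULL;
        toy_set_freepointer(s, base + i * s->size, next);
    }
    return objects ? base : NULL;
}

/* Count free objects by walking the embedded pointers. */
static size_t toy_count_free(const struct toy_cache *s, void *head)
{
    size_t n = 0;

    for (void *p = head; p != NULL; p = toy_get_freepointer(s, p))
        n++;
    return n;
}
```

In this sketch a write through toy_set_freepointer can only go wrong if the object address or the offset is bad, which is the same reasoning Vegard applies to the real helper in his reply.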
Hello!

Oy, whow! :-) I actually tried to reproduce your problem yesterday to see if

This is sort of expected. kmemcheck is not directly incompatible with slub debugging, but it may produce some false positives (that we haven't worked out yet). So I recommend that you turn slub debugging

Hm, yes. It would be nice to see the actual kmemcheck error message as well in order to determine the cause of this. I don't really see how that write (= fp) can cause an error, so it has to be the s->offset dereference that is doing it. That seems extremely unlikely and would indicate a bug in SLUB itself... Out of curiosity, will your crash go away entirely if you compile the

It would be nice to see the whole dmesg if you can get it. You should also make sure you have either CONFIG_KMEMCHECK_ENABLED_BY_DEFAULT=y set in your config or that you are booting with the kmemcheck=1 command-line option; otherwise, you'll only get the first warning before kmemcheck auto-disables itself. Forcing it to stay on will potentially give us more useful output.

There is actually a newer kmemcheck tree which supports kmemcheck+SLAB, but the version you are running should be usable for debugging your problem, so I'm not going to ask you to try that.

Thanks for trying it out, it would feel good if kmemcheck would finally be useful for something :-)

Good luck.

Vegard
--
"The animistic metaphor of the bug that maliciously sneaked in while the programmer was not looking is intellectually dishonest as it disguises that the error is the programmer's own creation." -- E. W. Dijkstra, EWD1036
--
Hi,

ah, wouldn't a config option or warning message make sense while

Ok, here we go, I tried to write it down as well as possible:

BUG: unable to handle kernel paging request at cb801000
IP: [(c0187bc8)] __slab_alloc+0x238/0x5e0
*pde = 01019067 *pte = 0x801962
Thread overran stack, or stack corrupted
Oops: 0002 [#1] PREEMPT
Modules linked in:
Pid: 0, comm: swapper Not tainted (2.6.25-x86-latest,git....)
EIP: 0060:[(c0187bc8)] EFLAGS: 00010286 CPU:0
EIP is at __slab_alloc+0x238/0x5e0
EAX: c08a3d40 EBX: cf801000 ECX: 00000000 EDX: c125b024
ESI: cf801000 EDI: cf801000 EBP: c08a9f54 ESP: c08a9f24
DS: 007b ES: 007b FS: 0000 CS: 0000 SS: 0068
Process swapper (pid: 0, ti=c08a9000 task=c08583c0, task.ti=c08a9000)
Stack: c125b024 c0441b07 00000008 00000000 ffffffff 000000d0 c08a3d40 00000010
       c125b024 00000000 c08a3d40 00000286 c08a9f78 c0189187 c0443d85 c08a3ddc
       c04433d85 000000d0 00000000 c08a9fb4 0000000a c08a9f98 c0443d85 c08a9fb8
Call Trace:
? vsnprintf+0x2d7
? __kmalloc+0xf7
? kvasprintf+0x35
? kvasprintf+0x35
? kvasprintf+0x35
? kasprintf+0x17
? kmem_cache_init+0xcd
? start_kernel+0x1c9
? unknown_bootoption+0x0
? i386_start_kernel+0x8
=====
Code: 0a 8b 59 04 0f af c3 8d 04 07 39 f8 76 36 89 fb eb 04 89 f3 89 ce 8b 55 f0 89 d9 8b 45 e8 e8 d0 ea ff ff 8b 45 e8 8b 48 0c 01 cb(89) 33 8b 5d f0 8b 50 04 0f b7 43 0a 8d 0c 32 0f af c2 8d 04 07
EIP: [(c0187bc8)] __slab_alloc+0x238/0x5e0 SS:ESP 0068:c08a9f24
--- end trace ---
Kernel panic - not syncing: attempted to kill the idle task!

At the moment I don't have a camera at hand and netconsole doesnt

Yeah, it's a nice project. Is there a reason why it isn't in mainline yet?

Thanks for your help,
Eric
--
Hi Eric,
Unfortunately kmemcheck does not catch writes to red-zone so it won't
help you debug the original issue.
Pekka
--
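As background for the "Redzone overwritten" report: with red zoning enabled, the allocator pads each object with a known byte pattern and later checks that the pattern is still intact. The sketch below shows only that principle in user space. The value 0xbb is taken from the report ("First byte 0x0 instead of 0xbb"). The names toy_init_redzone and toy_check_redzone and the red-zone width are assumptions made for illustration and are not taken from mm/slub.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_RED_INACTIVE  0xbb            /* pattern the report expected */
#define TOY_REDZONE_BYTES sizeof(void *)  /* assumed width of the red zone */

/* Fill the bytes directly behind the object payload with the pattern. */
static void toy_init_redzone(unsigned char *object, size_t objsize)
{
    assert(object != NULL);
    memset(object + objsize, TOY_RED_INACTIVE, TOY_REDZONE_BYTES);
}

/*
 * Return the offset (from the start of the object) of the first damaged
 * red-zone byte, or -1 if the red zone is intact.
 */
static long toy_check_redzone(const unsigned char *object, size_t objsize)
{
    assert(object != NULL);
    for (size_t i = 0; i < TOY_REDZONE_BYTES; i++) {
        if (object[objsize + i] != TOY_RED_INACTIVE)
            return (long)(objsize + i);
    }
    return -1;
}
```

A writer that believes the object is larger than the allocator does will store past objsize and trip a check like this one the next time the object is inspected.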
(added some cc's) So what kernel version is this and what's the last known version that worked? As it's an early boot crash, maybe you can try to do a git bisect --
This is 2.6.26-rc4. I didn't test any earlier versions so far (ok, I did test some pre-rc4 git versions, I think 4 days ago, but they also showed the problem). This is the first time that I enabled CONFIG_SECURITY on my testbox. I am currently trying to reproduce this with SLAB, as Vegard suggested for the kmemcheck report. After this I'll retest on a fresh tree to make sure it isn't something buggy on my part, and try some older kernels. Greetings, Eric --
Ok, with CONFIG_SECURITY, SLAB and CONFIG_DEBUG_SLAB enabled on the -rc3 based kmemcheck tree I don't get any error. I am currently building a fresh -rc4 (with SLUB) to make sure this is for real, then I'll try that again with SLAB and then start testing older kernels. Greetings, Eric --
Then it's likely that we corrupt hugetlbfs_inode_cachep because of SLUB merging and the real problem is somewhere else. Can you also try passing "slub_nomerge" as a kernel parameter with SLUB? --
Enabling Redzone disables merging. So it's unrelated to merging. --
ok, do_init_calls time frame with that config...CONFIG_SECURITY isn't
really doing any allocations, nor much in the way of memory writes.
It would get called into via the:
hugetlbfs_get_inode
new_inode
alloc_inode
I couldn't recreate with that config.
--
I did a fresh git-clone and tried again without being able to reproduce this. I diffed all .h and .c files and, except for the autogenerated ones, they are exactly the same... Is it possible that this was caused by a file that didn't get rebuilt correctly? I can still reproduce it with the old checkout. Sorry if this causes unnecessary noise :( Greetings, Eric --
I had wondered that, since data structures will grow w/ CONFIG_SECURITY set (like inode, for example). I haven't encountered a Kbuild dependency bug in quite a while though. thanks, -chris --
Yeah, this thing is miscompiled (thanks for the vmlinux).
$ cd /tmp/CONFIG_SECURITY-mem-corruption
$ gdb -q vmlinux
(gdb) p sizeof(struct inode)
$1 = 596
(gdb) p sizeof(struct hugetlbfs_inode_info)
$2 = 592
struct hugetlbfs_inode_info {
struct shared_policy policy;
struct inode vfs_inode;
};
The hugetlbfs_inode_info structure isn't updated with the 4 extra bytes
added from CONFIG_SECURITY to struct inode.
If you're interested in more gory details, you can look at:
$ eu-readelf -winfo vmlinux > readelf-info.out
thanks,
-chris
--
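The mismatch above can be illustrated with a small user-space sketch. It assumes two object files built against different definitions of the same structure: one compiled before the option added a pointer, one after. All toy_* names are hypothetical miniatures, not the kernel structures. The second helper does the address arithmetic for the original report, assuming the object starts at 0xccd8e000 (the report prints "0xccd8e00", which looks like a transcription slip, so treat that base as an assumption).

```c
#include <assert.h>
#include <stddef.h>

/* Miniature "struct inode" as seen by a stale object file. */
struct toy_inode_stale {
    unsigned long i_ino;
    void *i_private;
};

/* The same structure after the config option adds one pointer. */
struct toy_inode_fresh {
    unsigned long i_ino;
    void *i_private;
    void *i_security; /* only present with the option enabled */
};

/* Containers that embed the inode, like hugetlbfs_inode_info does. */
struct toy_info_stale {
    unsigned long policy;
    struct toy_inode_stale vfs_inode;
};

struct toy_info_fresh {
    unsigned long policy;
    struct toy_inode_fresh vfs_inode;
};

/*
 * Bytes that freshly built code touches beyond an allocation whose size
 * was computed by the stale code.
 */
static size_t toy_overrun_bytes(void)
{
    return sizeof(struct toy_info_fresh) - sizeof(struct toy_info_stale);
}

/* Offset of a reported address relative to the start of its object. */
static unsigned long toy_offset_in_object(unsigned long addr, unsigned long object)
{
    assert(addr >= object);
    return addr - object;
}
```

Under the stated assumption, toy_offset_in_object(0xccd8e250UL, 0xccd8e000UL) is 592, which equals the stale sizeof(struct hugetlbfs_inode_info) shown by gdb, and the reported range 0xccd8e250-0xccd8e253 is 4 bytes long, matching the 4 bytes that CONFIG_SECURITY adds to struct inode on this 32-bit build.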
Hi Eric, Unfortunately this transcript is not useful for serious debugging. The object addresses in the printout do not line up ("0xccd8e250" vs "0xccd8e00"), and you didn't write down the contents of the "Redzone" range that holds the corrupting data. So a serial console output or a picture of the oops would be much appreciated. --
