On Sun, 20 Apr 2008, Paul E. McKenney wrote:

Ok, I applied it, with hopefully an understandable commit message.

That said, now we just need to figure out what actually caused the bug in question.

Rafael: if it's a too-early free of the dentry (which could be because somebody didn't do a proper rcu read-lock, or maybe the rcu grace period logic itself got broken?), then enabling SLUB/SLAB debugging should catch it much more quickly (and hopefully we'd see the signature of a use-after-free - the poisoning byte pattern rather than the -1).

The other alternative is simply memory corruption. I.e. the -1 may well be somebody *else* overwriting the ->next pointer because they did a use-after-free, and maybe the dentry_cache is shared with some other allocation of the same size (SLUB does that, no?)

Rafael: your last oops does seem to imply that there is some strange memory corruption going on, because in that case the invalid pointer is different: instead of being all-ones, it is "fff0810023444c98", which is not a possible pointer. It very much looks like a single nybble got cleared (because ffff810023444c98 _would_ be a valid pointer, notice the "fff0" vs "ffff" prefix).

So I do suspect it's *some* kind of use-after-free thing. But nothing in fs/ has changed, so it's not a dentry bug, I think. Which is why my "preferred" suspect is that "somebody else also does allocations of the same size as the dentry code, and shares the same SLUB alloc space, and does something bad".

So Rafael - are you using SLUB, and if you are, can you enable SLUB_DEBUG, and then use the "slub_debug" kernel command line to enable it?

Linus
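The "single nybble got cleared" observation above can be checked mechanically. The following is only a minimal user-space sketch, not kernel code: the helper names is_canonical48() and single_nibble_diff() are made up for illustration, and the validity check assumes x86-64 with 48-bit virtual addresses, where bits 63..47 of a pointer must all be equal.

```c
#include <stdint.h>

/*
 * Hypothetical helper: nonzero if addr is a canonical x86-64 address,
 * assuming 48-bit virtual addressing (bits 63..47 all equal).
 */
static int is_canonical48(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* the upper 17 bits */

	return top == 0 || top == 0x1ffff;
}

/*
 * Hypothetical helper: if a and b differ in exactly one 4-bit nibble,
 * return that nibble's index (0 = least significant). Return -1 if
 * they are equal or differ in more than one nibble.
 */
static int single_nibble_diff(uint64_t a, uint64_t b)
{
	uint64_t diff = a ^ b;
	int found = -1;
	int i;

	for (i = 0; i < 16; i++) {
		if ((diff >> (4 * i)) & 0xf) {
			if (found != -1)
				return -1;	/* second differing nibble */
			found = i;
		}
	}
	return found;
}
```

With the two values from the oops, is_canonical48(0xfff0810023444c98ULL) is 0 and is_canonical48(0xffff810023444c98ULL) is 1, and single_nibble_diff() on the pair returns 12, i.e. the only difference is the nibble covering bits 48..51, which went from 0xf to 0x0.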
