Re: [PATCH] oprofile: fix crash when accessing freed task structs
From: Benjamin Herrenschmidt
Subject: Re: [PATCH] oprofile: fix crash when accessing freed task structs
Date: Sunday, August 15, 2010 - 3:22 pm
On Fri, 2010-08-13 at 17:39 +0200, Robert Richter wrote:
> On 02.08.10 21:39:33, Benjamin Herrenschmidt wrote:
>
> > I can't tell that much about the workload, I don't have access to it
> > either, let's say that from my point of view it's a "customer" binary
> > blob.
> >
> > I can re-trigger it though.
>
> Benjamin,
>
> can you try the patch below?

Thanks. I'll see if the folks who have a repro-case can give it a spin
for me.

Cheers,
Ben.

> Thanks,
>
> -Robert
>
> >From 4435322debc38097e9e863e14597ab3f78814d14 Mon Sep 17 00:00:00 2001
> From: Robert Richter <robert.richter@amd.com>
> Date: Fri, 13 Aug 2010 16:29:04 +0200
> Subject: [PATCH] oprofile: fix crash when accessing freed task structs
>
> This patch fixes a crash during shutdown reported below. The crash is
> caused be accessing already freed task structs. The fix changes the
> order for registering and unregistering notifier callbacks.
>
> All notifiers must be initialized before buffers start working. To
> stop buffer synchronization we cancel all workqueues, unregister the
> notifier callback and then flush all buffers. After all of this we
> finally can free all tasks listed.
>
> This should avoid accessing freed tasks.
>
> On 22.07.10 01:14:40, Benjamin Herrenschmidt wrote:
>
> > So the initial observation is a spinlock bad magic followed by a crash
> > in the spinlock debug code:
> >
> > [ 1541.586531] BUG: spinlock bad magic on CPU#5, events/5/136
> > [ 1541.597564] Unable to handle kernel paging request for data at address 0x6b6b6b6b6b6b6d03
> >
> > Backtrace looks like:
> >
> >       spin_bug+0x74/0xd4
> >       ._raw_spin_lock+0x48/0x184
> >       ._spin_lock+0x10/0x24
> >       .get_task_mm+0x28/0x8c
> >       .sync_buffer+0x1b4/0x598
> >       .wq_sync_buffer+0xa0/0xdc
> >       .worker_thread+0x1d8/0x2a8
> >       .kthread+0xa8/0xb4
> >       .kernel_thread+0x54/0x70
> >
> > So we are accessing a freed task struct in the work queue when
> > processing the samples.
>
> Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Signed-off-by: Robert Richter <robert.richter@amd.com>
> ---
>  drivers/oprofile/buffer_sync.c |   27 ++++++++++++++-------------
>  drivers/oprofile/cpu_buffer.c  |    2 --
>  2 files changed, 14 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/oprofile/buffer_sync.c b/drivers/oprofile/buffer_sync.c
> index a9352b2..b7e755f 100644
> --- a/drivers/oprofile/buffer_sync.c
> +++ b/drivers/oprofile/buffer_sync.c
> @@ -141,16 +141,6 @@ static struct notifier_block module_load_nb = {
>  	.notifier_call = module_load_notify,
>  };
>
> -
> -static void end_sync(void)
> -{
> -	end_cpu_work();
> -	/* make sure we don't leak task structs */
> -	process_task_mortuary();
> -	process_task_mortuary();
> -}
> -
> -
>  int sync_start(void)
>  {
>  	int err;
> @@ -158,7 +148,7 @@ int sync_start(void)
>  	if (!zalloc_cpumask_var(&marked_cpus, GFP_KERNEL))
>  		return -ENOMEM;
>
> -	start_cpu_work();
> +	mutex_lock(&buffer_mutex);
>
>  	err = task_handoff_register(&task_free_nb);
>  	if (err)
> @@ -173,7 +163,10 @@ int sync_start(void)
>  	if (err)
>  		goto out4;
>
> +	start_cpu_work();
> +
>  out:
> +	mutex_unlock(&buffer_mutex);
>  	return err;
>  out4:
>  	profile_event_unregister(PROFILE_MUNMAP, &munmap_nb);
> @@ -182,7 +175,6 @@ out3:
>  out2:
>  	task_handoff_unregister(&task_free_nb);
>  out1:
> -	end_sync();
>  	free_cpumask_var(marked_cpus);
>  	goto out;
>  }
> @@ -190,11 +182,20 @@ out1:
>
>  void sync_stop(void)
>  {
> +	/* flush buffers */
> +	mutex_lock(&buffer_mutex);
> +	end_cpu_work();
>  	unregister_module_notifier(&module_load_nb);
>  	profile_event_unregister(PROFILE_MUNMAP, &munmap_nb);
>  	profile_event_unregister(PROFILE_TASK_EXIT, &task_exit_nb);
>  	task_handoff_unregister(&task_free_nb);
> -	end_sync();
> +	mutex_unlock(&buffer_mutex);
> +	flush_scheduled_work();
> +
> +	/* make sure we don't leak task structs */
> +	process_task_mortuary();
> +	process_task_mortuary();
> +
>  	free_cpumask_var(marked_cpus);
>  }
>
> diff --git a/drivers/oprofile/cpu_buffer.c b/drivers/oprofile/cpu_buffer.c
> index 219f79e..f179ac2 100644
> --- a/drivers/oprofile/cpu_buffer.c
> +++ b/drivers/oprofile/cpu_buffer.c
> @@ -120,8 +120,6 @@ void end_cpu_work(void)
>
>  		cancel_delayed_work(&b->work);
>  	}
> -
> -	flush_scheduled_work();
>  }
>
>  /*
> --
> 1.7.1.1
>
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Messages in current thread:
Possible Oprofile crash/race when stopping, Benjamin Herrenschmidt (Wed Jul 21, 10:14 pm)
Re: Possible Oprofile crash/race when stopping, Robert Richter (Wed Jul 28, 5:21 am)
Re: Possible Oprofile crash/race when stopping, Benjamin Herrenschmidt (Mon Aug 2, 6:39 pm)
[PATCH] oprofile: fix crash when accessing freed task structs, Robert Richter (Fri Aug 13, 8:39 am)
Re: [PATCH] oprofile: fix crash when accessing freed task ..., Benjamin Herrenschmidt (Sun Aug 15, 3:22 pm)
Re: [PATCH] oprofile: fix crash when accessing freed task ..., Robert Richter (Tue Aug 31, 3:28 am)