Re: [PATCH] perf_events: fix bug in AMD per-cpu initialization

Previous thread: by kees schoenmakers on Wednesday, March 17, 2010 - 2:19 am. (1 message)

Next thread: Can not get 'function_graph' for by Ryan Wang on Wednesday, March 17, 2010 - 2:53 am. (1 message)
From: Stephane Eranian
Date: Wednesday, March 17, 2010 - 1:40 am

On AMD processors, we need to allocate a data structure per Northbridge
	to handle certain events.

	On CPU initialization, we need to query the Northbridge id and check
	whether the structure is already allocated or not. We use the
	amd_get_nb_id() function to request the Northbridge identification.

	The recent cleanup of the CPU online/offline initialization introduced
	a bug. AMD cpu initialization is invoked on CPU_UP_PREPARE callback.
	This is before the CPU Northbridge id is calculated. Therefore no
	processor had a Northbridge structure allocated except the boot
	processor. That was causing bogus NB event scheduling.

	This patch uses the CPU_ONLINE callback to initialize the AMD
	Northbridge structure. This way amd_get_nb_id() returns valid
	information.

	The x86_cpu_up() callback was added. Could not call it cpu_online
	because of existing macro.

	Signed-off-by: Stephane Eranian <eranian@google.com>
	
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 978d297..016c25a 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -213,6 +213,7 @@ struct x86_pmu {
 	void		(*quirks)(void);
 
 	void		(*cpu_prepare)(int cpu);
+	void		(*cpu_up)(int cpu);
 	void		(*cpu_starting)(int cpu);
 	void		(*cpu_dying)(int cpu);
 	void		(*cpu_dead)(int cpu);
@@ -1342,6 +1343,11 @@ x86_pmu_notifier(struct notifier_block *self, unsigned long action, void *hcpu)
 			x86_pmu.cpu_starting(cpu);
 		break;
 
+	case CPU_ONLINE:
+		if (x86_pmu.cpu_up)
+			x86_pmu.cpu_up(cpu);
+		break;
+
 	case CPU_DYING:
 		if (x86_pmu.cpu_dying)
 			x86_pmu.cpu_dying(cpu);
diff --git a/arch/x86/kernel/cpu/perf_event_amd.c b/arch/x86/kernel/cpu/perf_event_amd.c
index 358a8e3..584df49 100644
--- a/arch/x86/kernel/cpu/perf_event_amd.c
+++ b/arch/x86/kernel/cpu/perf_event_amd.c
@@ -346,6 +346,9 @@ static void amd_pmu_cpu_offline(int cpu)
 
 	cpuhw = &per_cpu(cpu_hw_events, cpu);
 
+	if (!cpuhw->amd_nb)
+		return;
+
 ...
From: Peter Zijlstra
Date: Wednesday, March 17, 2010 - 4:47 pm

No, ONLINE is not exposed for a good reason, its always wrong.

Use prepare to allocate data, and starting to initialize stuff on the
cpu proper once its up. Online is after the cpu is up and running and we
are already scheduling stuff on it so its too late to initialize things.

--

From: Stephane Eranian
Date: Wednesday, March 17, 2010 - 5:33 pm

I can't because amd_get_nb_id() returns garbage for the CPU
when you call from CPU_STARTING. So looks like initialization
of cpu_llc_id needs to be moved way earlier. I don't want to have
to duplicate that code.
--

From: Peter Zijlstra
Date: Tuesday, March 23, 2010 - 7:11 am

Crap, you're right, either notify_cpu_starting() is done too early or
smp_store_cpu_info() is done too late.

Since smp_store_cpu_info() relies on the result of calibrate_delay() we
can't easily change that order, but since there really isn't any other
CPU_STARTING user in tree (I appear to have created the first?!) we can
easily move that notifier thing later.

(What's up with that IRQ-enable over calibrate_delay(), can't we simply
enable the NMI watchdog later?)

So I guess something like the below should work:

---
arch/x86/kernel/smpboot.c |    4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 06d98ae..6808b93 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -242,8 +242,6 @@ static void __cpuinit smp_callin(void)
end_local_APIC_setup();
map_cpu_to_logical_apicid();

- notify_cpu_starting(cpuid);
-
/*
* Need to setup vector mappings before we enable interrupts.
*/
@@ -264,6 +262,8 @@ static void __cpuinit smp_callin(void)
*/
smp_store_cpu_info(cpuid);

+ notify_cpu_starting(cpuid);
+
/*
* Allow the master to continue.
*/


Which then allows us to write something like:

---

 arch/x86/kernel/cpu/perf_event.c     |    8 ++-
 arch/x86/kernel/cpu/perf_event_amd.c |   69 ++++++++++++++++++---------------
 2 files changed, 43 insertions(+), 34 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index f571f51..4db6eaf 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -209,7 +209,7 @@ struct x86_pmu {
 	struct event_constraint *event_constraints;
 	void		(*quirks)(void);
 
-	void		(*cpu_prepare)(int cpu);
+	int		(*cpu_prepare)(int cpu);
 	void		(*cpu_starting)(int cpu);
 	void		(*cpu_dying)(int cpu);
 	void		(*cpu_dead)(int cpu);
@@ -1330,11 +1330,12 @@ static int __cpuinit
 x86_pmu_notifier(struct notifier_block *self, unsigned long action, void *hcpu)
 {
 ...
From: Stephane Eranian
Date: Tuesday, March 23, 2010 - 7:55 am

What's the point of CPU_ONLINE vs. CPU_STARTING if you're saying the
former is never right? Why not move CPU_ONLINE to the right place and drop



-- 
Stephane Eranian  | EMEA Software Engineering
Google France | 38 avenue de l'Opéra | 75002 Paris
Tel : +33 (0) 1 42 68 53 00
This email may be confidential or privileged. If you received this
communication by mistake, please
don't forward it to anyone else, please erase all copies and
attachments, and please let me know that
it went to the wrong person. Thanks
--

From: Peter Zijlstra
Date: Tuesday, March 23, 2010 - 8:07 am

Its right for a lot of things, just not for perf, we need to be ready
and done by the time the cpu starts scheduling.

--

From: Stephane Eranian
Date: Tuesday, March 23, 2010 - 8:12 am

You mean they need to wait until after the cpu starts scheduling?
As opposed to being called just before it starts scheduling.
--

From: Peter Zijlstra
Date: Tuesday, March 23, 2010 - 8:18 am

As in it doesn't really matter for them, and the CPU_ONLINE call is
convenient in that it allows the callback to schedule too.
--

From: Stephane Eranian
Date: Tuesday, March 23, 2010 - 3:41 pm

Fine, I will try your proposed patch tomorrow. A priori, it looks fine.
--

From: tip-bot for Peter Zijlstra
Date: Friday, April 2, 2010 - 12:06 pm

Commit-ID:  85257024096a96fc5c00ce59d685f62bbed3ad95
Gitweb:     http://git.kernel.org/tip/85257024096a96fc5c00ce59d685f62bbed3ad95
Author:     Peter Zijlstra <a.p.zijlstra@chello.nl>
AuthorDate: Tue, 23 Mar 2010 19:30:52 +0100
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Fri, 2 Apr 2010 19:30:01 +0200

x86: Move notify_cpu_starting() callback to a later stage

Because we need to have cpu identification things done by the time we run
CPU_STARTING notifiers.

( This init ordering will be relied on by the next fix. )

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1269353485.5109.48.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 arch/x86/kernel/smpboot.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 06d98ae..6808b93 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -242,8 +242,6 @@ static void __cpuinit smp_callin(void)
 	end_local_APIC_setup();
 	map_cpu_to_logical_apicid();
 
-	notify_cpu_starting(cpuid);
-
 	/*
 	 * Need to setup vector mappings before we enable interrupts.
 	 */
@@ -264,6 +262,8 @@ static void __cpuinit smp_callin(void)
 	 */
 	smp_store_cpu_info(cpuid);
 
+	notify_cpu_starting(cpuid);
+
 	/*
 	 * Allow the master to continue.
 	 */
--

From: tip-bot for Peter Zijlstra
Date: Friday, April 2, 2010 - 12:07 pm

Commit-ID:  b38b24ead33417146e051453d04bf60b8d2d7e25
Gitweb:     http://git.kernel.org/tip/b38b24ead33417146e051453d04bf60b8d2d7e25
Author:     Peter Zijlstra <a.p.zijlstra@chello.nl>
AuthorDate: Tue, 23 Mar 2010 19:31:15 +0100
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Fri, 2 Apr 2010 19:30:02 +0200

perf, x86: Fix AMD hotplug & constraint initialization

Commit 3f6da39 ("perf: Rework and fix the arch CPU-hotplug hooks") moved
the amd northbridge allocation from CPUS_ONLINE to CPUS_PREPARE_UP
however amd_nb_id() doesn't work yet on prepare so it would simply bail
basically reverting to a state where we do not properly track node wide
constraints - causing weird perf results.

Fix up the AMD NorthBridge initialization code by allocating from
CPU_UP_PREPARE and installing it from CPU_STARTING once we have the
proper nb_id. It also properly deals with the allocation failing.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
[ robustify using amd_has_nb() ]
Signed-off-by: Stephane Eranian <eranian@google.com>
LKML-Reference: <1269353485.5109.48.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 arch/x86/kernel/cpu/perf_event.c     |    8 ++-
 arch/x86/kernel/cpu/perf_event_amd.c |   80 ++++++++++++++++++++--------------
 2 files changed, 52 insertions(+), 36 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 5fb490c..bd28cf9 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -158,7 +158,7 @@ struct x86_pmu {
 						 struct perf_event *event);
 	struct event_constraint *event_constraints;
 
-	void		(*cpu_prepare)(int cpu);
+	int		(*cpu_prepare)(int cpu);
 	void		(*cpu_starting)(int cpu);
 	void		(*cpu_dying)(int cpu);
 	void		(*cpu_dead)(int cpu);
@@ -1333,11 +1333,12 @@ static int __cpuinit
 x86_pmu_notifier(struct notifier_block *self, unsigned long action, void *hcpu)
 {
 	unsigned int cpu = (long)hcpu;
+	int ret = NOTIFY_OK;
 
 	switch (action & ...
Previous thread: by kees schoenmakers on Wednesday, March 17, 2010 - 2:19 am. (1 message)

Next thread: Can not get 'function_graph' for by Ryan Wang on Wednesday, March 17, 2010 - 2:53 am. (1 message)