These have been discussed intensively on the trace/perf lists recently.
Outcome: cleanup power events in a (several kernel rounds) compatible way.
Ingo: As discussed, it would be great to see these in some tree/branch
soon as I'd like to base further cpu_idle cleanups/fixes on top.
Also Jean is waiting for these with further work/patches.
Thanks,
Thomas
--
Recent changes: - Enable EVENT_POWER_TRACING_DEPRECATED by default New power trace events: power:cpu_idle power:cpu_frequency power:machine_suspend C-state/idle accounting events: power:power_start power:power_end are replaced with: power:cpu_idle and power:power_frequency is replaced with: power:cpu_frequency power:machine_suspend is newly introduced. Jean Pihet has a patch integrated into the generic layer (kernel/power/suspend.c) which will make use of it. the type= field got removed from both, it was never used and the type is differed by the event type itself. perf timechart userspace tool gets adjusted in a separate patch. Signed-off-by: Thomas Renninger <trenn@suse.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Jean Pihet <jean.pihet@newoldbits.com> CC: Arjan van de Ven <arjan@linux.intel.com> CC: Ingo Molnar <mingo@elte.hu> CC: rjw@sisk.pl CC: linux-kernel@vger.kernel.org --- arch/x86/kernel/process.c | 7 +++- arch/x86/kernel/process_32.c | 2 +- arch/x86/kernel/process_64.c | 2 + drivers/cpufreq/cpufreq.c | 1 + drivers/cpuidle/cpuidle.c | 1 + drivers/idle/intel_idle.c | 1 + include/trace/events/power.h | 87 +++++++++++++++++++++++++++++++++++++++++- kernel/trace/Kconfig | 15 +++++++ kernel/trace/power-traces.c | 3 + 9 files changed, 116 insertions(+), 3 deletions(-) diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 57d1868..155d975 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -374,6 +374,7 @@ void default_idle(void) { if (hlt_use_halt()) { trace_power_start(POWER_CSTATE, 1, smp_processor_id()); + trace_cpu_idle(1, smp_processor_id()); current_thread_info()->status &= ~TS_POLLING; /* * TS_POLLING-cleared state must be visible before we @@ -444,6 +445,7 @@ EXPORT_SYMBOL_GPL(cpu_idle_wait); void mwait_idle_with_hints(unsigned long ax, unsigned long cx) { ...
Thomas, Thanks for the patches re-spin! Here are my comments inlined. Using %lu for the state field causes PWR_EVENT_EXIT to appear as The clock and power_domain events have been recently introduced and so must be part of the new API. Can this #endif be moved right after the A string is needed here. Without it it is impossible to have the option unset. ... Thanks, Jean --
This is intended, what exactly is the problem?
Oops, I pulled again meanwhile and the patches still patched without fuzz,
but probably with some offset.
Not sure whether this also came from merge issues, but yes, several
#ifdef conditions need to get corrected.
Ok, thanks.
I am currently rebuilding on several archs/flavors and hope to be able
to re-send this one today or on Tue.
Thanks,
Thomas
--
Thanks for pointing this out. Because pre-processor conditionals only have
been moved around it looks like my test build after pulling still succeeded,
while the #ifdefs/#endifs were rather messed up.
I adjusted these parts and successfully test-built on quite a lot .config
Done.
@Ingo: If this does not go into x86/tip, but perf or whatever tree, it would
be great if you can ping me as soon as this stuff is in.
I want to cleanup the "double cpu_idle events" issues on top and make this
more architecture independent (throw cpu_idle events from cpuidle framework
instead of throwing very x86 specific mwait states, etc.).
Thanks,
Thomas
--
Mind sending the latest version which has been adjusted/fixed and all acks added? Thanks, Ingo --
On Thursday 18 November 2010 09:01:32 Ingo Molnar wrote: Done. This time with lkml excluded as there were only cleanups/fixes due to a messed merge. Thanks, Thomas --
Please do not exclude lkml from such iterations of patches in the future - every modification to patches is relevant - often pure resends get resent to lkml as well. Thanks, Ingo --
Ok for me! Acked-by: Jean Pihet <j-pihet@ti.com> Note the ti.com email address to be used for Sign-offs and Acks. Thanks, --
I am also getting build failures: drivers/cpufreq/cpufreq.c:357: error: 'POWER_PSTATE' undeclared (first use in this function) drivers/cpufreq/cpufreq.c:357: error: (Each undeclared identifier is reported only once drivers/cpufreq/cpufreq.c:357: error: for each function it appears in.) arch/x86/kernel/process.c:375: error: 'POWER_CSTATE' undeclared (first use in this function) arch/x86/kernel/process.c:375: error: (Each undeclared identifier is reported only once arch/x86/kernel/process.c:375: error: for each function it appears in.) arch/x86/kernel/process.c:446: error: 'POWER_CSTATE' undeclared (first use in this function) arch/x86/kernel/process.c:463: error: 'POWER_CSTATE' undeclared (first use in this function) arch/x86/kernel/process.c:485: error: 'POWER_CSTATE' undeclared (first use in this function) include/trace/events/power.h:142: error: redefinition of 'trace_power_start' Config attached. Note: please reuse the two commits from below for further work, i did some small cleanups to the commit text and to the patches. Thanks, Ingo ----------------> From 87a2cfbda3f53c3bf00c424ce18d97b03b0c3aa0 Mon Sep 17 00:00:00 2001 From: Thomas Renninger <trenn@suse.de> Date: Thu, 18 Nov 2010 10:25:13 +0100 Subject: [PATCH] perf: Clean up power events Add these new power trace events: power:cpu_idle power:cpu_frequency power:machine_suspend The old C-state/idle accounting events: power:power_start power:power_end Have now a replacement (but we are still keeping the old tracepoints for compatibility): power:cpu_idle and power:power_frequency is replaced with: power:cpu_frequency power:machine_suspend is newly introduced. Jean Pihet has a patch integrated into the generic layer (kernel/power/suspend.c) which will make use of it. the type= field got removed from both, it was never used and the type is differed by the event type itself. perf timechart userspace tool gets adjusted in a separate patch. Signed-off-by: Thomas ...
The problem is because power.h gets included mutliple times, and so
the POWER_ enum and the empty deprecated functions need to be
protected from that.
Here is a patch below that fixes it, compile tested with and without
CONFIG_EVENT_POWER_TRACING_DEPRECATED set.
Ingo, Thomas, please let me know if you want me tp refresh the patches
with that fix.
diff --git a/include/trace/events/power.h b/include/trace/events/power.h
index 00d9819..89db5a1 100644
--- a/include/trace/events/power.h
+++ b/include/trace/events/power.h
@@ -136,12 +136,24 @@ enum {
POWER_PSTATE = 2,
};
#endif /* _PWR_EVENT_AVOID_DOUBLE_DEFINING_DEPRECATED */
-#else
+
+#else /* CONFIG_EVENT_POWER_TRACING_DEPRECATED */
+
+#ifndef _PWR_EVENT_AVOID_DOUBLE_DEFINING_DEPRECATED
+#define _PWR_EVENT_AVOID_DOUBLE_DEFINING_DEPRECATED
+enum {
+ POWER_NONE = 0,
+ POWER_CSTATE = 1,
+ POWER_PSTATE = 2,
+};
+
/* These dummy declaration have to be ripped out when the deprecated
events get removed */
static inline void trace_power_start(u64 type, u64 state, u64 cpuid) {};
static inline void trace_power_end(u64 cpuid) {};
static inline void trace_power_frequency(u64 type, u64 state, u64 cpuid) {};
+#endif /* _PWR_EVENT_AVOID_DOUBLE_DEFINING_DEPRECATED */
+
#endif /* CONFIG_EVENT_POWER_TRACING_DEPRECATED */
/*
Thanks,
--
Yep, that should be the correct fix.
While I tested both options before, after the pre-processor mess ups, it
looks like I did test a lot of different archs/flavors through our
build service, but all .configs were set by default to
CONFIG_EVENT_POWER_TRACING_DEPRECATED=y
I'll add it to the end, based on the one Ingo sent.
Ingo: As you started fiddling with it, is that enough or do you prefer
another whole patch series resend?
This is an independent fix, you can just push it.
Thanks Jean/Ingo, hope that was the last remaining issue...
Thomas
--------
perf: Clean up power events
Add these new power trace events:
power:cpu_idle
power:cpu_frequency
power:machine_suspend
The old C-state/idle accounting events:
power:power_start
power:power_end
Have now a replacement (but we are still keeping the old
tracepoints for compatibility):
power:cpu_idle
and
power:power_frequency
is replaced with:
power:cpu_frequency
power:machine_suspend is newly introduced.
Jean Pihet has a patch integrated into the generic layer
(kernel/power/suspend.c) which will make use of it.
the type= field got removed from both, it was never
used and the type is differed by the event type itself.
perf timechart userspace tool gets adjusted in a separate patch.
Signed-off-by: Thomas Renninger <trenn@suse.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Jean Pihet <jean.pihet@newoldbits.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: rjw@sisk.pl
LKML-Reference: <1290072314-31155-2-git-send-email-trenn@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 57d1868..155d975 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -374,6 +374,7 @@ void default_idle(void)
{
if (hlt_use_halt()) {
trace_power_start(POWER_CSTATE, 1, smp_processor_id());
+ trace_cpu_idle(1, smp_processor_id());
current_thread_info()->status &= ...PERF(kernel): Cleanup power events Recent changes: - Fix pre-processor conditionals which got messed up silently by a recent merge/pull - Add a comment to EVENT_POWER_TRACING_DEPRECATED .config option New power trace events: power:cpu_idle power:cpu_frequency power:machine_suspend C-state/idle accounting events: power:power_start power:power_end are replaced with: power:cpu_idle and power:power_frequency is replaced with: power:cpu_frequency power:machine_suspend is newly introduced. Jean Pihet has a patch integrated into the generic layer (kernel/power/suspend.c) which will make use of it. the type= field got removed from both, it was never used and the type is differed by the event type itself. perf timechart userspace tool gets adjusted in a separate patch. Signed-off-by: Thomas Renninger <trenn@suse.de> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Jean Pihet <jean.pihet@newoldbits.com> CC: Arjan van de Ven <arjan@linux.intel.com> CC: Ingo Molnar <mingo@elte.hu> CC: rjw@sisk.pl CC: linux-kernel@vger.kernel.org diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 57d1868..155d975 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -374,6 +374,7 @@ void default_idle(void) { if (hlt_use_halt()) { trace_power_start(POWER_CSTATE, 1, smp_processor_id()); + trace_cpu_idle(1, smp_processor_id()); current_thread_info()->status &= ~TS_POLLING; /* * TS_POLLING-cleared state must be visible before we @@ -444,6 +445,7 @@ EXPORT_SYMBOL_GPL(cpu_idle_wait); void mwait_idle_with_hints(unsigned long ax, unsigned long cx) { trace_power_start(POWER_CSTATE, (ax>>4)+1, smp_processor_id()); + trace_cpu_idle((ax>>4)+1, smp_processor_id()); if (!need_resched()) { if (cpu_has(&current_cpu_data, X86_FEATURE_CLFLUSH_MONITOR)) clflush((void *)&current_thread_info()->flags); @@ -460,6 +462,7 @@ static void mwait_idle(void) { if (!need_resched()) { ...
Acked-by: Jean Pihet <j-pihet@ti.com> --
power_frequency moved to drivers/cpufreq/cpufreq.c which has to be compiled in, no need to export it. intel_idle can a be module though... Signed-off-by: Thomas Renninger <trenn@suse.de> CC: Jean Pihet <jean.pihet@newoldbits.com> CC: Arjan van de Ven <arjan@linux.intel.com> CC: Ingo Molnar <mingo@elte.hu> CC: linux-kernel@vger.kernel.org --- drivers/idle/intel_idle.c | 2 -- kernel/trace/power-traces.c | 2 +- 2 files changed, 1 insertions(+), 3 deletions(-) diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c index 41665d2..3c95325 100644 --- a/drivers/idle/intel_idle.c +++ b/drivers/idle/intel_idle.c @@ -220,9 +220,7 @@ static int intel_idle(struct cpuidle_device *dev, struct cpuidle_state *state) kt_before = ktime_get_real(); stop_critical_timings(); -#ifndef MODULE trace_power_start(POWER_CSTATE, (eax >> 4) + 1, cpu); -#endif if (!need_resched()) { __monitor((void *)&current_thread_info()->flags, 0, 0); diff --git a/kernel/trace/power-traces.c b/kernel/trace/power-traces.c index a22582a..0e0497d 100644 --- a/kernel/trace/power-traces.c +++ b/kernel/trace/power-traces.c @@ -13,5 +13,5 @@ #define CREATE_TRACE_POINTS #include <trace/events/power.h> -EXPORT_TRACEPOINT_SYMBOL_GPL(power_frequency); +EXPORT_TRACEPOINT_SYMBOL_GPL(power_start); -- 1.6.3 --
Acked-by: Jean Pihet <j-pihet@ti.com> --
