From: Jean Pihet <j-pihet@ti.com>
Provides:
. calls to machine_suspend trace point,
. OMAP support,
. API Documentation
Applies on top of Thomas's 8 latest power trace API patches, cf. http://marc.info/?l=linux-kernel&m=129130827309354&w=2
Jean Pihet (3):
perf: add calls to suspend trace point
perf: add OMAP support for the new power events
tools, perf: Documentation for the power events API
Documentation/trace/events-power.txt | 90 ++++++++++++++++++++++++++++++++++
arch/arm/mach-omap2/pm34xx.c | 7 +++
arch/arm/mach-omap2/powerdomain.c | 3 +
arch/arm/plat-omap/clock.c | 13 ++++-
kernel/power/suspend.c | 3 +
5 files changed, 113 insertions(+), 3 deletions(-)
create mode 100644 Documentation/trace/events-power.txt
--
1.7.2.3
--
From: Jean Pihet <j-pihet@ti.com>
Uses the machine_suspend trace point, called from the
generic kernel suspend_enter function.
Signed-off-by: Jean Pihet <j-pihet@ti.com>
CC: Thomas Renninger <trenn@suse.de>
---
kernel/power/suspend.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)
diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index ecf7705..0650596 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -22,6 +22,7 @@
#include <linux/mm.h>
#include <linux/slab.h>
#include <linux/suspend.h>
+#include <trace/events/power.h>
#include "power.h"
@@ -164,7 +165,9 @@ static int suspend_enter(suspend_state_t state)
error = sysdev_suspend(PMSG_SUSPEND);
if (!error) {
if (!suspend_test(TEST_CORE) && pm_check_wakeup_events()) {
+ trace_machine_suspend(state);
error = suspend_ops->enter(state);
+ trace_machine_suspend(PWR_EVENT_EXIT);
events_check_enabled = false;
}
sysdev_resume();
--
1.7.2.3
--
Please use scripts/get_maintainer.pl to construct a proper Cc: list and to gather the necessary Acked-by:
scripts/get_maintainer.pl -f kernel/power/suspend.c
Thanks,
Ingo
--
From: Jean Pihet <j-pihet@ti.com>
The patch adds the new power management trace points for the OMAP architecture.
The trace points are for:
- default idle handler. Since the cpuidle framework is instrumented in the generic way there is no need to add trace points in the OMAP specific cpuidle handler;
- cpufreq (DVFS),
- clocks changes (enable, disable, set_rate),
- change of power domains next power states.
Signed-off-by: Jean Pihet <j-pihet@ti.com>
---
arch/arm/mach-omap2/pm34xx.c | 7 +++++++
arch/arm/mach-omap2/powerdomain.c | 3 +++
arch/arm/plat-omap/clock.c | 13 ++++++++++---
3 files changed, 20 insertions(+), 3 deletions(-)
diff --git a/arch/arm/mach-omap2/pm34xx.c b/arch/arm/mach-omap2/pm34xx.c
index 0ec8a04..0ee0b0e 100644
--- a/arch/arm/mach-omap2/pm34xx.c
+++ b/arch/arm/mach-omap2/pm34xx.c
@@ -29,6 +29,7 @@
#include <linux/delay.h>
#include <linux/slab.h>
#include <linux/console.h>
+#include <trace/events/power.h>
#include <plat/sram.h>
#include <plat/clockdomain.h>
@@ -506,8 +507,14 @@ static void omap3_pm_idle(void)
if (omap_irq_pending() || need_resched())
goto out;
+ trace_power_start(POWER_CSTATE, 1, smp_processor_id());
+ trace_cpu_idle(1, smp_processor_id());
+
omap_sram_idle();
+ trace_power_end(smp_processor_id());
+ trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
+
out:
local_fiq_enable();
local_irq_enable();
diff --git a/arch/arm/mach-omap2/powerdomain.c b/arch/arm/mach-omap2/powerdomain.c
index 6527ec3..73cbe9a 100644
--- a/arch/arm/mach-omap2/powerdomain.c
+++ b/arch/arm/mach-omap2/powerdomain.c
@@ -23,6 +23,7 @@
#include <linux/errno.h>
#include <linux/err.h>
#include <linux/io.h>
+#include <trace/events/power.h>
#include <asm/atomic.h>
@@ -440,6 +441,8 @@ int pwrdm_set_next_pwrst(struct powerdomain *pwrdm, u8 pwrst)
pr_debug("powerdomain: setting next powerstate for %s to %0x\n", pwrdm->name, pwrst);
+ trace_power_domain_target(pwrdm->name, pwrst, ...
I suspect the gents and mailing lists listed by:
scripts/get_maintainer.pl -f arch/arm/plat-omap/clock.c
scripts/get_maintainer.pl -f arch/arm/mach-omap2/pm34xx.c
would want to be Cc:-ed as well. That will also get the right Acked-by's.
(if you want these commits to go upstream via the perf tree)
Thanks,
Ingo
--
Yes, the idea is to get those upstream via the tip tree, since it now
Thanks,
Jean
--
jean.pihet@newoldbits.com had written, on 01/04/2011 04:17 AM, the following:
Dumb question: it just tells me which C state was attempted, not whether it actually succeeded in hitting it, right? Doesn't this give us false data?
(from an offline discussion on a related topic): Would it also be nice to hook on mach-omap2/clock.c points as well, to hook on indirect changes?
[..]
--
Regards,
Nishanth Menon
--
From: Jean Pihet <j-pihet@ti.com>
Provides documentation for the following:
- the new power trace API,
- the old (legacy) power trace API,
- the DEPRECATED Kconfig option usage.
Signed-off-by: Jean Pihet <j-pihet@ti.com>
---
Documentation/trace/events-power.txt | 90 ++++++++++++++++++++++++++++++++++
1 files changed, 90 insertions(+), 0 deletions(-)
create mode 100644 Documentation/trace/events-power.txt
diff --git a/Documentation/trace/events-power.txt b/Documentation/trace/events-power.txt
new file mode 100644
index 0000000..8a50653
--- /dev/null
+++ b/Documentation/trace/events-power.txt
@@ -0,0 +1,90 @@
+
+ Subsystem Trace Points: power
+
+The power tracing system captures events related to power transitions
+within the kernel. Broadly speaking there are three major subheadings:
+
+ o Power state switch which reports events related to suspend (S-states),
+ cpuidle (C-states) and cpufreq (P-states)
+ o System clock related changes
+ o Power domains related changes and transitions
+
+This document describes what each of the tracepoints is and why they
+might be useful.
+
+Cf. include/trace/events/power.h for the events definitions.
+
+1. Power state switch events
+============================
+
+1.1 New trace API
+-----------------
+
+A 'cpu' event class gathers the CPU-related events: cpuidle and
+cpufreq.
+
+cpu_idle "state=%lu cpu_id=%lu"
+cpu_frequency "state=%lu cpu_id=%lu"
+
+A suspend event is used to indicate the system going in and out of the
+suspend mode:
+
+machine_suspend "state=%lu"
+
+
+Note: the value of '-1' or '4294967295' for state means an exit from the current state,
+i.e. trace_cpu_idle(4, smp_processor_id()) means that the system
+enters the idle state 4, while trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id())
+means that the system exits the previous idle state.
+
+The event which has 'state=4294967295' in the trace is very important to the user
+space tools which are using it to detect the end of the ...
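The enter/exit convention documented in the patch above can be illustrated from the consumer's side. What follows is a minimal user-space sketch, not part of the patch series: it assumes the cpu_idle events have already been parsed into a hypothetical struct idle_event, and it treats a state of (uint32_t)-1 as the exit marker, matching the '4294967295' value quoted above. The names idle_event, EXIT_STATE and idle_residency_us are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Exit marker; assumed to mirror PWR_EVENT_EXIT (-1) stored in a 32-bit field. */
#define EXIT_STATE ((uint32_t)-1)

/* Hypothetical, already-parsed cpu_idle event. */
struct idle_event {
    uint64_t ts_us;   /* timestamp in microseconds */
    uint32_t state;   /* idle state entered, or EXIT_STATE */
    uint32_t cpu_id;  /* CPU that emitted the event */
};

/*
 * Sum the time spent in idle state `state` on CPU `cpu` by pairing each
 * enter event with the next exit event seen on the same CPU.
 * An enter event that is never followed by an exit is ignored.
 */
uint64_t idle_residency_us(const struct idle_event *ev, size_t n,
                           uint32_t cpu, uint32_t state)
{
    uint64_t total = 0;
    uint64_t enter_ts = 0;
    int in_state = 0;

    assert(ev != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (ev[i].cpu_id != cpu)
            continue;
        if (ev[i].state == EXIT_STATE) {
            if (in_state)
                total += ev[i].ts_us - enter_ts;
            in_state = 0;
        } else if (ev[i].state == state) {
            enter_ts = ev[i].ts_us;
            in_state = 1;
        } else {
            in_state = 0;  /* this CPU entered a different idle state */
        }
    }
    return total;
}
```

A tool built this way depends on the exit event being emitted for every enter event, which is presumably why the documentation stresses the 'state=4294967295' record.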
Ok... why this place? I mean, perhaps suspend time should include device suspend?
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--
Hi,
This trace has been placed here because it traces the machine low
That makes sense. We have a few options here:
1) keep the traces as proposed, to trace the low-level machine code only,
2) move the traces to the entry and exit of suspend_enter so that they include the prepare and late_prepare (+ the associated wake-up) callbacks as well,
3) move the traces to suspend_devices_and_enter so that they include 2) and the handling of the console and the devices,
4) move the traces to enter_state so that they include 3), the call to sys_sync and the user space freeze.
Note that the SNAPSHOT_S2RAM ioctl code also calls suspend_devices_and_enter, so if only 4) is used no trace will be generated in that case.
I am in favor of 3) or 4).
Thanks,
Jean
--
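The four placement options above nest inside one another, which is easy to lose in prose. The following is a small user-space model of that nesting, written only for illustration: the enum, its step names and option_covers() are hypothetical and do not exist in the kernel. Each step stands for the work that the message attributes to one level of the suspend path.

```c
#include <assert.h>

/* Steps of the suspend path, outermost first, as described in the message above. */
enum suspend_step {
    STEP_SYNC_AND_FREEZE,     /* enter_state: sys_sync and user space freeze */
    STEP_CONSOLE_AND_DEVICES, /* suspend_devices_and_enter: console and devices */
    STEP_PREPARE_CALLBACKS,   /* suspend_enter: prepare and late prepare callbacks */
    STEP_MACHINE_ENTER,       /* suspend_ops->enter: low-level machine code */
    STEP_COUNT
};

/*
 * Return non-zero if a trace pair placed according to `option` (1 to 4)
 * brackets the given step. Option 1 covers only the innermost step; each
 * higher option adds the next outer step.
 */
int option_covers(int option, enum suspend_step step)
{
    assert(option >= 1 && option <= 4);
    return (int)step >= (int)STEP_COUNT - option;
}
```

For example, option_covers(3, STEP_CONSOLE_AND_DEVICES) is true while option_covers(3, STEP_SYNC_AND_FREEZE) is false, which matches the observation that only option 4) includes the sys_sync call and the user space freeze.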
Why don't we keep the tracepoints as proposed _and_ add two additional tracepoints around device suspend-resume?
Rafael
--
Hello Jean,
A question about these. Are these only meant to track calls to these
functions from outside the clock code? Or meant to track actual hardware
clock changes? If the latter, then it might make sense to put these
trace points into the functions that actually change the hardware
registers, e.g., omap2_dflt_clk_{enable,disable}(), etc., since a
clk_enable() on a leaf clock may result in many internal system clocks
being enabled up the clock tree.
- Paul
--
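The difference between the two candidate locations for the clock trace points can be sketched with a mock clock tree. This is a simplified user-space model under stated assumptions, not the OMAP clock code: struct mock_clk, mock_clk_enable() and the two counters are hypothetical, and the counters merely stand in for trace points placed at the API level and at the hardware level respectively.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified clock node; not the OMAP struct clk. */
struct mock_clk {
    const char *name;
    struct mock_clk *parent;  /* NULL for a root clock */
    int usecount;             /* number of active users */
    int hw_enable_events;     /* stands in for a trace point at the hardware level */
};

/* Stands in for a trace point placed in the public enable wrapper. */
int api_enable_events;

/* Walk up the tree; the hardware is touched only on a 0 -> 1 transition. */
static void mock_enable_tree(struct mock_clk *clk)
{
    if (clk->usecount++ > 0)
        return;                       /* already running, nothing to program */
    if (clk->parent)
        mock_enable_tree(clk->parent);
    clk->hw_enable_events++;          /* a register write would happen here */
}

/* Public entry point, analogous to a driver enabling a leaf clock. */
void mock_clk_enable(struct mock_clk *clk)
{
    assert(clk != NULL);
    api_enable_events++;
    mock_enable_tree(clk);
}
```

In this model, enabling a leaf that sits two levels below the root records a single API-level event but three hardware-level events, and a second enable of the same leaf records another API-level event with no hardware change at all. The two placements therefore answer different questions: who asked for a clock, versus which clocks actually changed state.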
