Re: ftraced and suspend to ram

Previous thread: [RFC, PATCH] state machine based rcu by Manfred Spraul on Thursday, August 21, 2008 - 8:27 am. (10 messages)

Next thread: Re: [PATCH 3/13 v2] viafb: accel.c, accel.h by Andrew Morton on Thursday, August 21, 2008 - 9:36 am. (1 message)
From: Steven Rostedt
Date: Thursday, August 21, 2008 - 8:49 am

In latest 2.6.27(git) enabling dynamic ftrace makes resume from a suspend 
to ram reboot instead of resuming. Queued for 2.6.28 is a new method of 
recording mcount callers at compile time that does not have this issue.

But the new method is still too "green" to be pulled into 27, so the old 
ftraced (daemon method) needs to be fixed for 27.

The way dynamic ftrace works with the daemon method is this. On boot up 
the mcount function simply returns. When ftrace is initialized, it calls 
kstop_machine to modify the mcount function to call another function 
called "ftrace_record_ip". This new function will record in a preallocated 
hash (allocated by the ftrace initializer) all the callers of mcount. A 
check is made to see if the caller has already been put into the hash, and 
if so, it is not recorded again.

Later on a kernel thread ftraced is created. This kernel thread wakes up 
once a second and checks to see if any new functions were added to the 
hash. If so, it then calls kstop_machine and modifies those callers to 
mcount into nops.

Again, this daemon method makes resume from suspend to ram reboot instead 
of resuming. Now, I'm asking the s2r gurus, what did I miss? Do I need to 
add a "NO_FREEZE" flag or something to the "ftraced" kernel thread?

Just asking for some advice.

Thanks,

-- Steve

--

From: Rafael J. Wysocki
Date: Thursday, August 21, 2008 - 11:15 am

If I'm not mistaken, it'll probably suffice to make it freezable, so that it
doesn't run while the system is suspending and resuming.  Would that be
acceptable?

Please tell me where exactly the ftraced source code is located.

Thanks,
Rafael
--

From: Steven Rostedt
Date: Thursday, August 21, 2008 - 11:26 am

It's in Linus's latest git tree.

The code in question is the ftraced() function in kernel/trace/ftrace.c

Thanks,

-- Steve

--

From: Rafael J. Wysocki
Date: Thursday, August 21, 2008 - 11:37 am

Thanks, I'll have a look in a while.

Rafael
--

From: Rafael J. Wysocki
Date: Thursday, August 21, 2008 - 12:59 pm

Can you try the appended patch, please?

Thanks,
Rafael

---
 kernel/trace/ftrace.c |    5 +++++
 1 file changed, 5 insertions(+)

Index: linux-2.6/kernel/trace/ftrace.c
===================================================================
--- linux-2.6.orig/kernel/trace/ftrace.c
+++ linux-2.6/kernel/trace/ftrace.c
@@ -796,8 +796,13 @@ static int ftraced(void *ignore)
 {
 	unsigned long usecs;
 
+	set_freezable();
+
 	while (!kthread_should_stop()) {
 
+		if (try_to_freeze())
+			continue;
+
 		set_current_state(TASK_INTERRUPTIBLE);
 
 		/* check once a second */
--

From: Ingo Molnar
Date: Thursday, August 21, 2008 - 9:46 pm

makes sense - i've applied it to tip/tracing/urgent, see the tidied up 
commit below.

It should be no big issue not being able to trace across suspend+resume 
- and that restriction will go away with Steve's build-time based mcount 
patching mechanism in v2.6.28.

	Ingo

------------->
From 0e556695ddc8eebf6f6dd86bb0c4911b2b90c12a Mon Sep 17 00:00:00 2001
From: Rafael J. Wysocki <rjw@sisk.pl>
Date: Thu, 21 Aug 2008 21:59:36 +0200
Subject: [PATCH] ftrace: fix ftraced and suspend to ram


It will suffice to make it freezable, so that it doesn't run while the
system is suspending and resuming.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 kernel/trace/ftrace.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 639e16c..49f4c3f 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -819,8 +819,13 @@ static int ftraced(void *ignore)
 {
 	unsigned long usecs;
 
+	set_freezable();
+
 	while (!kthread_should_stop()) {
 
+		if (try_to_freeze())
+			continue;
+
 		set_current_state(TASK_INTERRUPTIBLE);
 
 		/* check once a second */
--

From: Pavel Machek
Date: Friday, August 22, 2008 - 12:23 am

Patch looks okay to me, but I'm not sure if another issue is not
hiding under it. Did anyone actually test ftrace + suspend after
applying this?
								Pavel


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Marcin Slusarz
Date: Friday, August 22, 2008 - 3:35 am

I just tested this patch - it didn't help ;(

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 49f4c3f..02e41b2 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -27,6 +27,7 @@
 #include <linux/ctype.h>
 #include <linux/hash.h>
 #include <linux/list.h>
+#include <linux/freezer.h>
 
 #include <asm/ftrace.h>
 
--

From: Rafael J. Wysocki
Date: Friday, August 22, 2008 - 9:39 am

This is needed to fix compilation, sorry for the omission.

Still, did you test ftrace + suspend with the original patch and your fix
applied and if you did, then what happend?

Rafael
--

From: Marcin Slusarz
Date: Friday, August 22, 2008 - 1:54 pm

Suspend works fine. When I hit power button I see some informations from
video card (nvidia), then screen goes black and monitor says "no signal"
and then I see the same informations from video card, then BIOS and GRUB.
When everything works fine, X comes back after first infos from video card.

Marcin
--

From: Rafael J. Wysocki
Date: Friday, August 22, 2008 - 2:17 pm

I suspect that something under arch/x86/kernel/acpi/realmode is broken
somehow.

Thanks,
Rafael
--

From: Pavel Machek
Date: Friday, August 22, 2008 - 3:46 am

Does ftrace hook itself onto _all_ the functions? Or all .c functions?

I guess low-level suspend code needs to be exempt from
tracing. Certainly all the assembly functions.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Steven Rostedt
Date: Friday, August 22, 2008 - 1:33 pm

It hooks into all .c functions that are not annotated with "notrace"
or the files have not been marked in the Makefile like:


I'm looking into that now too. Are the functions in arch/x86/power/cpu*.c 
the suspend to ram code?

Thanks,

-- Steve

--

From: Rafael J. Wysocki
Date: Friday, August 22, 2008 - 1:52 pm

They contain code executed during suspend to RAM, but such code is also:
- in all files under arch/x86/kernel/acpi/
- in main.c, console.c under kernel/power
- in all files under drivers/acpi/sleep
- in drivers/acpi/hardware/hwsleep.c

Generally, ACPI is heavily involved and I'm not the right person to ask which
of the ACPI functions should get the 'notrace' thing.  Also, I'm not sure about
the device drivers' ->suspend() and ->resume() callbacks, especially for
sysdevs and ->suspend_late(), ->resume_early() for platform devices and PCI.

Well, how exactly suspend to RAM is broken by ftrace?

Rafael
--

From: Steven Rostedt
Date: Friday, August 22, 2008 - 1:55 pm

I know that the smp_processor_id may be defined in the %fs register, but 
if ftrace is called before the %fs is set up, it may crash because it 
uses smp_processor_id.

-- Steve

--

From: Rafael J. Wysocki
Date: Friday, August 22, 2008 - 2:11 pm

The wake-up code is the most interesting for you, then.  It's located in
arch/x86/kernel/acpi/ , but on x86-64 it also uses trampoline code in
trampoline_64.S (this is assembly, though).

On x86-64, wakeup_long64 in wakeup_64.S is where we get from the real-mode
wake-up code, but under arch/x86/kernel/acpi/realmode there are some C files
containing functions executed in real mode.  Everything in there and everything
referred to from there should be 'notrace' IMO.

Thanks,
Rafael

--

From: Steven Rostedt
Date: Wednesday, August 27, 2008 - 6:14 am

I've been painstakingly debugging the issue with suspend to ram and 
ftraced. The 2.6.28 code does not have this issue, but since the mcount 
recording is not going to be in 27, this must be solved for the ftrace 
daemon version.

The resume from suspend to ram would reboot because it was triple 
faulting. Debugging further, I found that calling the mcount function 
itself was not an issue, but it would fault when it incremented 
preempt_count. preempt_count is on the tasks info structure that is on the 
low memory address of the task's stack.  For some reason, it could not 
write to it. Resuming out of suspend to ram does quite a lot of funny 
tricks to get to work, so it is not surprising at all that simply doing a 
preempt_disable() would cause a fault.

Thanks to Rafael for suggesting to add a "while (1);" to find the place in 
resuming that is causing the fault. I would place the loop somewhere in 
the code, compile and reboot and see if it would either reboot (hit the 
fault) or simply hang (hit the loop).  Doing this over and over again, I 
narrowed it down that it was happening in enable_nonboot_cpus.

At this point, I found that it is easier to simply disable tracing around 
the suspend code, instead of searching for the particular function that 
can not handle doing a preempt_disable.

This patch disables the tracer as it suspends and reenables it on resume.

I tested this patch on my Laptop, and it can resume fine with the patch.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
 kernel/power/main.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Index: linux-compile.git/kernel/power/main.c
===================================================================
--- linux-compile.git.orig/kernel/power/main.c	2008-08-27 08:53:11.000000000 -0400
+++ linux-compile.git/kernel/power/main.c	2008-08-27 08:53:58.000000000 -0400
@@ -21,6 +21,7 @@
 #include <linux/freezer.h>
 #include <linux/vmstat.h>
 #include <linux/syscalls.h>
+#include ...
From: Rafael J. Wysocki
Date: Wednesday, August 27, 2008 - 6:26 am

Well, if that's enable_nonboot_cpus() that's faulting, I guess a similar change
in the hibernation code is needed.  I'll try to prepare a patch for that.

Thanks,
Rafael
--

From: Rafael J. Wysocki
Date: Thursday, August 28, 2008 - 5:39 am

From: Rafael J. Wysocki <rjw@sisk.pl>

ftrace: disable tracing for hibernation
    
In accordance with commit f42ac38c59e0a03d6da0c24a63fb211393f484b0
("ftrace: disable tracing for suspend to ram"), disable tracing
around the suspend code in hibernation code paths.
    
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/disk.c |   13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

Index: linux-2.6/kernel/power/disk.c
===================================================================
--- linux-2.6.orig/kernel/power/disk.c
+++ linux-2.6/kernel/power/disk.c
@@ -21,6 +21,7 @@
 #include <linux/console.h>
 #include <linux/cpu.h>
 #include <linux/freezer.h>
+#include <linux/ftrace.h>
 
 #include "power.h"
 
@@ -255,7 +256,7 @@ static int create_image(int platform_mod
 
 int hibernation_snapshot(int platform_mode)
 {
-	int error;
+	int error, ftrace_save;
 
 	/* Free memory before shutting down devices. */
 	error = swsusp_shrink_memory();
@@ -267,6 +268,7 @@ int hibernation_snapshot(int platform_mo
 		goto Close;
 
 	suspend_console();
+	ftrace_save = __ftrace_enabled_save();
 	error = device_suspend(PMSG_FREEZE);
 	if (error)
 		goto Recover_platform;
@@ -296,6 +298,7 @@ int hibernation_snapshot(int platform_mo
  Resume_devices:
 	device_resume(in_suspend ?
 		(error ? PMSG_RECOVER : PMSG_THAW) : PMSG_RESTORE);
+	__ftrace_enabled_restore(ftrace_save);
 	resume_console();
  Close:
 	platform_end(platform_mode);
@@ -366,10 +369,11 @@ static int resume_target_kernel(void)
 
 int hibernation_restore(int platform_mode)
 {
-	int error;
+	int error, ftrace_save;
 
 	pm_prepare_console();
 	suspend_console();
+	ftrace_save = __ftrace_enabled_save();
 	error = device_suspend(PMSG_QUIESCE);
 	if (error)
 		goto Finish;
@@ -384,6 +388,7 @@ int hibernation_restore(int platform_mod
 	platform_restore_cleanup(platform_mode);
 	device_resume(PMSG_RECOVER);
  Finish:
+	__ftrace_enabled_restore(ftrace_save);
 	resume_console();
 ...
From: Ingo Molnar
Date: Thursday, August 28, 2008 - 5:42 am

applied it to tip/tracing/urgent - thanks Rafael!

	Ingo
--

From: Steven Rostedt
Date: Thursday, August 28, 2008 - 5:44 am

Looks good to me, Thanks Rafael!

Acked-by: Steven Rostedt <srostedt@redhat.com>

-- Steve

--

From: Pavel Machek
Date: Friday, August 29, 2008 - 4:53 pm

From: Marcin Slusarz
Date: Wednesday, August 27, 2008 - 2:27 pm

You can add my:
Tested-by: Marcin Slusarz <marcin.slusarz@gmail.com>

Thanks!

Marcin
--

From: Pavel Machek
Date: Thursday, August 28, 2008 - 12:28 am

So this is going to be reverted after 2.6.27? Okay, then.


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Steven Rostedt
Date: Friday, August 29, 2008 - 6:43 am

Maybe not. The difference is that in 27 it would cause s2ram to fail every
time. In 28 (I have not proved this yet), it may fail s2ram if the tracer 
is on.

--

From: Rafael J. Wysocki
Date: Friday, August 22, 2008 - 3:22 am

Previous thread: [RFC, PATCH] state machine based rcu by Manfred Spraul on Thursday, August 21, 2008 - 8:27 am. (10 messages)

Next thread: Re: [PATCH 3/13 v2] viafb: accel.c, accel.h by Andrew Morton on Thursday, August 21, 2008 - 9:36 am. (1 message)