Re: 2.6.27-rc6: nohz + s2ram = need to press keys to get progress

From: Pavel Machek
Date: Friday, September 12, 2008 - 1:31 am

Hi!

The old "you have to press keys to get machine to progress" seems to
be back :-(. Thinkpad x60.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Rafael J. Wysocki
Date: Friday, September 12, 2008 - 12:17 pm

I guess the last bunch of HPET/clockevents patches caused that to happen.

We seem to be doing a "one step forward, one step back" thing here ...

Thanks,
Rafael
--

From: Thomas Gleixner
Date: Friday, September 12, 2008 - 12:15 pm

Guessing is hardly a good method to get down to the root cause of

Well, the fixes for HPET/clockevents fix real bugs and there are no
evident side effects vs. s2ram inside. Quite the contrary.

Thanks,

	tglx
--

From: Rafael J. Wysocki
Date: Friday, September 12, 2008 - 1:45 pm

So that must be something different.

Thanks,
Rafael
--

From: Thomas Gleixner
Date: Friday, September 12, 2008 - 2:05 pm

Well, there is no way to exclude those patches for sure, but Pavel
should be able to identify the one which causes problems.

Thanks,

	tglx
 
--

From: Thomas Gleixner
Date: Friday, September 12, 2008 - 12:13 pm

/me cries

Did this happen between rc5 and rc6 ?

Thanks,

	tglx
--

From: Thomas Gleixner
Date: Saturday, September 13, 2008 - 7:56 pm

Pavel,


Is there a chance that we get some more information than that ?

Thanks,

	tglx
--

From: Pavel Machek
Date: Sunday, September 14, 2008 - 3:09 am

Yep.

It does not happen after _every_ s2ram, but when it happens the system
limps around in half-dead state with non-blinking cursor etc. Next
s2ram will not fix it.

nohz=off helps.

Will try 2.6.27-rc6 w/o any custom patches next (not that I have
anything interesting in that area), and then probably 2.6.26.

(I had nohz turned off before 2.6.27-rc6...)

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Pavel Machek
Date: Sunday, September 14, 2008 - 3:14 am

It _does_ happen with mainline 2.6.27-rc6.
									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Pavel Machek
Date: Sunday, September 14, 2008 - 10:35 am

s2ram has a problem in mainline 2.6.26, too. (Different problem: no
amount of pressing shift helps there.) nohz=off cures it, too.

How to proceed?

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Thomas Gleixner
Date: Sunday, September 14, 2008 - 10:51 am

Ok, so it's not a regression.

Is the problem in the suspend path or in the resume ?

Thanks,

	tglx

--

From: Pavel Machek
Date: Sunday, September 14, 2008 - 11:12 am

During resume, on both 2.6.26 and 2.6.27-rc6. (27-rc6 actually resumes
if I keep hitting shift, and I get "sleepy" system after that --
cursor does not blink, but machine can be used -- as long as I keep
hitting keys).

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Thomas Gleixner
Date: Sunday, September 14, 2008 - 12:15 pm

Hmm. Can you please provide the output of /proc/timer_list when the
system is in that "sleepy" state.

Thanks,

	tglx
--

From: Pavel Machek
Date: Monday, September 15, 2008 - 2:19 am

-rc5 seems to work ok... is there some patch I should try to revert
first?

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Thomas Gleixner
Date: Monday, September 15, 2008 - 7:08 am

The relevant 9 changes are from:

7c1e76897492d92b6a1c2d6892494d39ded9680c

to

72d43d9bc9210d24d09202eaf219eac09e17b339

Thanks,

	tglx
--

From: Pavel Machek
Date: Monday, September 15, 2008 - 12:26 pm

I did 

cg-seek 7c1e76897492d92b6a1c2d6892494d39ded9680c


cg-seek here, and problem is back. Good. I can get the
/proc/timer_list ;-)... attached.
									Pavel


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Pavel Machek
Date: Monday, September 15, 2008 - 12:59 pm

(Actually, timer_list was from more recent kernel, sorry about that, I
was confused. But I verified, and 72d... is still broken).

Trying commit 7cfb0435330364f90f274a26ecdc5f47f738498c now... bad,
too.

Trying 1fb9b7d29d8e85ba3196eaa7ab871bf76fc98d36... bad, too.

									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Thomas Gleixner
Date: Tuesday, September 16, 2008 - 7:16 am

Is the timer_list output from that "sleepy" state ? If not, please
provide one.

Thanks,
	tglx
--

From: Thomas Gleixner
Date: Tuesday, September 16, 2008 - 10:02 am

Does the patch below fix it ?

Thanks,

	tglx
----
diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index 1876b52..eb8736f 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -69,6 +69,9 @@ void clockevents_set_mode(struct clock_event_device *dev,
 		dev->set_mode(mode, dev);
 		dev->mode = mode;
 	}
+
+	if (mode == CLOCK_EVT_MODE_SHUTDOWN)
+		dev->next_event.tv64 = KTIME_MAX;
 }
 
 /**
diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 2f5a382..6b0b230 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -434,6 +434,19 @@ again:
 }
 
 /*
+ * We shutdown the device which will stop anyway, but we keep the
+ * next_event untouched as it carries the information when we
+ * broadcast.
+ */
+static void tick_broadcast_shutdown_device(struct clock_event_device *dev)
+{
+	if (dev->mode != CLOCK_EVT_MODE_SHUTDOWN) {
+		dev->set_mode(CLOCK_EVT_MODE_SHUTDOWN, dev);
+		dev->mode = CLOCK_EVT_MODE_SHUTDOWN;
+	}
+}
+
+/*
  * Powerstate information: The system enters/leaves a state, where
  * affected devices might stop
  */
@@ -464,7 +477,7 @@ void tick_broadcast_oneshot_control(unsigned long reason)
 	if (reason == CLOCK_EVT_NOTIFY_BROADCAST_ENTER) {
 		if (!cpu_isset(cpu, tick_broadcast_oneshot_mask)) {
 			cpu_set(cpu, tick_broadcast_oneshot_mask);
-			clockevents_set_mode(dev, CLOCK_EVT_MODE_SHUTDOWN);
+			tick_broadcast_shutdown_device(dev);
 			if (dev->next_event.tv64 < bc->next_event.tv64)
 				tick_broadcast_set_event(dev->next_event, 1);
 		}
--
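
Editorial note: the patch above separates two shutdown paths. The generic
clockevents_set_mode() now resets next_event to KTIME_MAX on shutdown, while
the broadcast path gets its own helper that leaves next_event alone, because
that expiry is what gets handed to the broadcast device. The user-space
sketch below models only that distinction. Everything prefixed with toy_ is
a hypothetical stand-in written for illustration, not kernel code, and the
thread does not confirm whether the patch cured the reported problem.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-ins for the kernel definitions (assumptions, not real ones). */
#define KTIME_MAX INT64_MAX

enum toy_mode { TOY_MODE_ONESHOT, TOY_MODE_SHUTDOWN };

struct toy_clock_event_device {
	enum toy_mode mode;
	int64_t next_event;	/* expiry of the next programmed event, in ns */
};

/* Generic path after the patch: a shut-down device has no pending event. */
static void toy_set_mode(struct toy_clock_event_device *dev, enum toy_mode mode)
{
	if (dev->mode != mode)
		dev->mode = mode;

	if (mode == TOY_MODE_SHUTDOWN)
		dev->next_event = KTIME_MAX;
}

/* Broadcast path: stop the device but keep next_event untouched. */
static void toy_broadcast_shutdown(struct toy_clock_event_device *dev)
{
	if (dev->mode != TOY_MODE_SHUTDOWN)
		dev->mode = TOY_MODE_SHUTDOWN;
}

/*
 * Models the BROADCAST_ENTER branch: shut the per-CPU device down, then
 * program the broadcast device if the per-CPU expiry is earlier.
 * keep_next_event selects the helper that preserves the expiry.
 * Returns the expiry the broadcast device ends up with.
 */
static int64_t toy_broadcast_enter(struct toy_clock_event_device *dev,
				   struct toy_clock_event_device *bc,
				   int keep_next_event)
{
	assert(dev && bc);

	if (keep_next_event)
		toy_broadcast_shutdown(dev);
	else
		toy_set_mode(dev, TOY_MODE_SHUTDOWN);

	if (dev->next_event < bc->next_event)
		bc->next_event = dev->next_event;

	return bc->next_event;
}
```

Calling toy_broadcast_enter() with keep_next_event set carries the per-CPU
expiry over to the broadcast device. With it cleared, the generic helper has
already wiped the expiry, so the comparison never fires and the broadcast
device stays unprogrammed. That is why the patch cannot simply reuse
clockevents_set_mode() in the broadcast path.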
