Re: broken suspend (sched related) [Was: 2.6.24-rc4-mm1]

To: Ingo Molnar <mingo@...>
Cc: Jiri Slaby <jirislaby@...>, Andrew Morton <akpm@...>, <linux-kernel@...>, Rafael J. Wysocki <rjw@...>, Arjan van de Ven <arjan@...>, Thomas Gleixner <tglx@...>, Linux-pm mailing list <linux-pm@...>, Dipankar Sarma <dipankar@...>
Date: Monday, December 10, 2007 - 6:15 am

On Mon, Dec 10, 2007 at 10:10:52AM +0100, Ingo Molnar wrote:

In this particular case, we are checking whether any task on a particular
cpu has gone unscheduled for a really long time. If we do this check
on a cpu which has gone offline, then
a) If the tasks have not yet been migrated to another cpu, we will
still perform the check and yell if something has been holding any task
for a sufficiently long time.
b) If the tasks have already been migrated off, then we have nothing to check.
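
The two cases can be pictured with a toy userspace model. This is only a
sketch of the reasoning above, not of the code in kernel/softlockup.c:
struct task_model and count_hung_tasks() are hypothetical names, and the
per-cpu filtering is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task record for illustration; not the kernel's task_struct. */
struct task_model {
	int cpu;			/* cpu the task currently belongs to */
	unsigned long last_switch;	/* time (seconds) it was last scheduled */
};

/* Count tasks on `cpu` that have not been scheduled for more than `timeout`. */
static size_t count_hung_tasks(const struct task_model *tasks, size_t n,
			       int cpu, unsigned long now,
			       unsigned long timeout)
{
	size_t hung = 0;

	assert(tasks != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (tasks[i].cpu != cpu)
			continue;	/* case (b): migrated off, nothing to check */
		if (now - tasks[i].last_switch > timeout)
			hung++;		/* case (a): still here and stuck, so complain */
	}
	return hung;
}
```

In this model, once every stuck task has been moved to another cpu, the
count for the offline cpu drops to zero, which is case (b).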

However, if we still want to keep that particular cpu from going offline
during the check, could we use the following patch?

commit a49736844717e00f7a37c96368cea8fec7eb31cf
Author: Gautham R Shenoy <ego@in.ibm.com>
Date:   Mon Dec 10 15:43:32 2007 +0530

CPU-Hotplug: Add try_get_online_cpus()

Add the fastpath code, try_get_online_cpus(), which returns 1 if it
managed to take a reference to the cpu_online_map, i.e. if no thread is
currently performing a cpu-hotplug operation. Otherwise it returns 0.

Use the primitive in the softlockup code.

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jiri Slaby <jirislaby@gmail.com>

diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index e0132cb..d236e21 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -107,6 +107,7 @@ static inline void cpuhotplug_mutex_unlock(struct mutex *cpu_hp_mutex)
 }
 
 extern void get_online_cpus(void);
+extern int  try_get_online_cpus(void);
 extern void put_online_cpus(void);
 #define hotcpu_notifier(fn, pri) {				\
 	static struct notifier_block fn##_nb =			\
@@ -127,6 +128,9 @@ static inline void cpuhotplug_mutex_unlock(struct mutex *cpu_hp_mutex)
 
 #define get_online_cpus()	do { } while (0)
 #define put_online_cpus()	do { } while (0)
+static inline int try_get_online_cpus(void) 
+{ return 1;}
+
 #define hotcpu_notifier(fn, pri)	do { (void)(fn); } while (0)
 /* These aren't inline functions due to a GCC bug. */
 #define register_hotcpu_notifier(nb)	({ (void)(nb); 0; })
diff --git a/kernel/cpu.c b/kernel/cpu.c
index e0d3a4f..38537c9 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -48,11 +48,35 @@ void __init cpu_hotplug_init(void)
 
 #ifdef CONFIG_HOTPLUG_CPU
 
+/*
+ * try_get_online_cpus(): Tries to hold a reference 
+ * to the cpu_online_map if no one is trying to perform 
+ * a cpu-hotplug operation. This is the fastpath code for
+ * get_online_cpus.
+ *
+ * Returns 1 if there is no cpu-hotplug operation
+ * currently in progress.
+ */
+int try_get_online_cpus(void)
+{
+	if(!cpu_hotplug.active_writer) {
+		cpu_hotplug.refcount++;
+		return 1;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(try_get_online_cpus);
+
 void get_online_cpus(void)
 {
 	might_sleep();
 	if (cpu_hotplug.active_writer == current)
 		return;
+	if (try_get_online_cpus())
+		return;
+
+	/* The writer exists, hence the slowpath */
 	mutex_lock(&cpu_hotplug.lock);
 	cpu_hotplug.refcount++;
 	mutex_unlock(&cpu_hotplug.lock);
@@ -120,6 +144,11 @@ static void cpu_hotplug_begin(void)
 	mutex_lock(&cpu_hotplug.lock);
 
 	cpu_hotplug.active_writer = current;
+	synchronize_sched();
+	/* New users of get_online_cpus() will see a non-NULL value
+	 * for cpu_hotplug.active_writer here and will take the slowpath
+	 */
+
 	add_wait_queue_exclusive(&cpu_hotplug.writer_queue, &wait);
 	while (cpu_hotplug.refcount) {
 		set_current_state(TASK_UNINTERRUPTIBLE);
diff --git a/kernel/softlockup.c b/kernel/softlockup.c
index 576eb9c..cb0616d 100644
--- a/kernel/softlockup.c
+++ b/kernel/softlockup.c
@@ -150,8 +150,8 @@ static void check_hung_task(struct task_struct *t, unsigned long now)
 	sysctl_hung_task_warnings--;
 
 	/*
-	 * Ok, the task did not get scheduled for more than 2 minutes,
-	 * complain:
+	 * Ok, the task did not get scheduled for more than
+	 * sysctl_hung_task_timeout_secs, complain:
 	 */
 	printk(KERN_ERR "INFO: task %s:%d blocked for more than "
 			"%ld seconds.\n", t->comm, t->pid,
@@ -216,16 +216,21 @@ static int watchdog(void *__bind_cpu)
 		touch_softlockup_watchdog();
 		msleep_interruptible(10000);
 
+		/* 
+		 * If a cpu-hotplug operation is in progress
+		 * we can come back later
+		 */
+		if (!try_get_online_cpus())
+			continue; 
 		/*
 		 * Only do the hung-tasks check on one CPU:
 		 */
 		check_cpu = any_online_cpu(cpu_online_map);
 
-		if (this_cpu != check_cpu)
-			continue;
-
-		if (sysctl_hung_task_timeout_secs)
+		if ((this_cpu == check_cpu) && sysctl_hung_task_timeout_secs)
 			check_hung_uninterruptible_tasks(this_cpu);
+
+		put_online_cpus();
 	}
 
 	return 0;
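
For readers who want to see the intended calling pattern outside the
kernel, here is a minimal single-threaded userspace sketch. It is not the
kernel implementation: struct hotplug_model, try_get_model(), put_model()
and watchdog_iteration() are hypothetical names, and the model ignores
the ordering that the patch relies on synchronize_sched() to provide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the cpu_hotplug state; not kernel code. */
struct hotplug_model {
	const void *active_writer;	/* non-NULL while a hotplug op runs */
	int refcount;			/* readers holding cpu_online_map */
};

/* Fastpath: take a reference only if no writer is active. */
static int try_get_model(struct hotplug_model *hp)
{
	if (!hp->active_writer) {
		hp->refcount++;
		return 1;
	}
	return 0;
}

/* Drop a reference taken by try_get_model(). */
static void put_model(struct hotplug_model *hp)
{
	assert(hp->refcount > 0);
	hp->refcount--;
}

/*
 * One watchdog iteration. Returns true if the reference was taken (the
 * check then runs only on check_cpu), false if the iteration was skipped
 * because a hotplug operation was in progress.
 */
static bool watchdog_iteration(struct hotplug_model *hp, int this_cpu,
			       int check_cpu, int *checks_run)
{
	if (!try_get_model(hp))
		return false;	/* come back on the next wakeup */

	if (this_cpu == check_cpu)
		(*checks_run)++;	/* stands in for the hung-task check */

	put_model(hp);
	return true;
}
```

The point of the pattern is that every successful try_get is paired with
a put on the same iteration, including when this cpu is not the one that
does the check.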

-- 
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain,
because Freedom is priceless!"