Re: current linux-2.6.git: cpusets completely broken

Previous thread: [PATCH 3/3] x86_64: adjust exception frame on paranoid exceptions by Jeremy Fitzhardinge on Saturday, July 12, 2008 - 2:22 am. (1 message)

Next thread: [mmotm] smp.c: fix build error by KOSAKI Motohiro on Saturday, July 12, 2008 - 4:11 am. (2 messages)
From: Dmitry Adamushko
Date: Saturday, July 12, 2008 - 3:45 am

I had in mind something like this:

[ yes, probably the patch makes things somewhat uglier. I tried to bring a minimal amount of changes so far, just to emulate the 'old' behavior of update_sched_domains().
I guess, common_cpu_mem_hotplug_unplug() needs to be split up into cpu- and mem-hotplug parts to make it cleaner ]

(not tested yet)

---

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 9fceb97..965d9eb 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1882,7 +1882,7 @@ static void scan_for_empty_cpusets(const struct cpuset *root)
  * in order to minimize text size.
  */
 
-static void common_cpu_mem_hotplug_unplug(void)
+static void common_cpu_mem_hotplug_unplug(int rebuild_sd)
 {
 	cgroup_lock();
 
@@ -1894,7 +1894,8 @@ static void common_cpu_mem_hotplug_unplug(void)
 	 * Scheduler destroys domains on hotplug events.
 	 * Rebuild them based on the current settings.
 	 */
-	rebuild_sched_domains();
+	if (rebuild_sd)
+		rebuild_sched_domains();
 
 	cgroup_unlock();
 }
@@ -1912,11 +1913,22 @@ static void common_cpu_mem_hotplug_unplug(void)
 static int cpuset_handle_cpuhp(struct notifier_block *unused_nb,
 				unsigned long phase, void *unused_cpu)
 {
-	if (phase == CPU_DYING || phase == CPU_DYING_FROZEN)
+	swicth (phase) {
+	case CPU_UP_CANCELED:
+	case CPU_UP_CANCELED_FROZEN:
+	case CPU_DOWN_FAILED:
+	case CPU_DOWN_FAILED_FROZEN:
+	case CPU_ONLINE:
+	case CPU_ONLINE_FROZEN:
+	case CPU_DEAD:
+	case CPU_DEAD_FROZEN:
+		common_cpu_mem_hotplug_unplug(1);
+		break;
+	default:
 		return NOTIFY_DONE;
+	}
 
-	common_cpu_mem_hotplug_unplug();
-	return 0;
+	return NOTIFY_OK;
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG
@@ -1929,7 +1941,7 @@ static int cpuset_handle_cpuhp(struct notifier_block *unused_nb,
 
 void cpuset_track_online_nodes(void)
 {
-	common_cpu_mem_hotplug_unplug();
+	common_cpu_mem_hotplug_unplug(0);
 }
 #endif
 


--

From: Dmitry Adamushko
Date: Saturday, July 12, 2008 - 4:14 am

argh, this one compiles (will test shortly).


Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>


diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 9fceb97..798b3ab 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1882,7 +1882,7 @@ static void scan_for_empty_cpusets(const struct
cpuset *root)
  * in order to minimize text size.
  */
 
-static void common_cpu_mem_hotplug_unplug(void)
+static void common_cpu_mem_hotplug_unplug(int rebuild_sd)
 {
 	cgroup_lock();
 
@@ -1894,7 +1894,8 @@ static void common_cpu_mem_hotplug_unplug(void)
 	 * Scheduler destroys domains on hotplug events.
 	 * Rebuild them based on the current settings.
 	 */
-	rebuild_sched_domains();
+	if (rebuild_sd)
+		rebuild_sched_domains();
 
 	cgroup_unlock();
 }
@@ -1912,11 +1913,22 @@ static void common_cpu_mem_hotplug_unplug(void)
 static int cpuset_handle_cpuhp(struct notifier_block *unused_nb,
 				unsigned long phase, void *unused_cpu)
 {
-	if (phase == CPU_DYING || phase == CPU_DYING_FROZEN)
+	switch (phase) {
+	case CPU_UP_CANCELED:
+	case CPU_UP_CANCELED_FROZEN:
+	case CPU_DOWN_FAILED:
+	case CPU_DOWN_FAILED_FROZEN:
+	case CPU_ONLINE:
+	case CPU_ONLINE_FROZEN:
+	case CPU_DEAD:
+	case CPU_DEAD_FROZEN:
+		common_cpu_mem_hotplug_unplug(1);
+		break;
+	default:
 		return NOTIFY_DONE;
+	}
 
-	common_cpu_mem_hotplug_unplug();
-	return 0;
+	return NOTIFY_OK;
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG
@@ -1929,7 +1941,7 @@ static int cpuset_handle_cpuhp(struct
notifier_block *unused_nb,
 
 void cpuset_track_online_nodes(void)
 {
-	common_cpu_mem_hotplug_unplug();
+	common_cpu_mem_hotplug_unplug(0);
 }
 #endif
 



--

From: Dmitry Adamushko
Date: Saturday, July 12, 2008 - 5:10 pm

Linus,


(just that we have it all together in one place, ready for testing and
further consideration).

below is the patch and explanation.

Basically the fix below just emulates the 'old' behavior
of update_sched_domains(). We call rebuild_sched_domains() for the same hotplug-events
as it was called (and is still called for !CPUSETS case) in update_sched_domains().
The aim is to keep sched-domain consistent wrt cpu-down/up.

This should be a minimal change. Effectively, the change is against
f18f982abf183e91f435990d337164c7a43d1e6d. So the logic of this patch should be easily visible comparing it to
what the aforementioned commit does.

Ingo, could also please comment on this issue? TIA.


Subject: fix cpuset_handle_cpuhp()

The following commit

---
commit f18f982abf183e91f435990d337164c7a43d1e6d
Author: Max Krasnyansky <maxk@qualcomm.com>
Date:   Thu May 29 11:17:01 2008 -0700

sched: CPU hotplug events must not destroy scheduler domains created by
the cpusets
---

[ Note, with this commit arch_update_cpu_topology is not called any more for CPUSETS. But it's just a nop.
The whole scheme should be probably reworked later. ]


introduced a hotplug-related problem as described below:

[ Basically the fix below just emulates the 'old' behavior of update_sched_domains().
We call rebuild_sched_domains() for the same hotplug-events as it was called (and is still called
for !CPUSETS case) in update_sched_domains(). ]


Upon CPU_DOWN_PREPARE, update_sched_domains() -> detach_destroy_domains(&cpu_online_map)
does the following:

/*
 * Force a reinitialization of the sched domains hierarchy. The domains
 * and groups cannot be updated in place without racing with the
balancing
 * code, so we temporarily attach all running cpus to the NULL domain
 * which will prevent rebalancing while the sched domains are
recalculated.
 */

The sched-domains should be rebuilt when a CPU_DOWN ops. has been
completed, effectively either upon CPU_DEAD{_FROZEN} (upon success) ...
From: Vegard Nossum
Date: Sunday, July 13, 2008 - 1:50 am

On Sun, Jul 13, 2008 at 2:10 AM, Dmitry Adamushko

Tested-by: Vegard Nossum <vegard.nossum@gmail.com>

Works :-)


Vegard

-- 
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
	-- E. W. Dijkstra, EWD1036
--

From: Ingo Molnar
Date: Sunday, July 13, 2008 - 2:41 am

thanks! I've tidied up the changelog and queued it up into 
tip/sched/urgent. I'd prefer this more conservative patch so late in the 
cycle, but i'll also queue up the more intrusive real fix from Linus and 
Dmitry in sched/devel.

Linus, if you've not applied it already, you can pull Dmitry's fix from:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git sched-fixes-for-linus

shortlog, diffstat and diff below.

Thanks,

	Ingo

------------------>
Dmitry Adamushko (1):
      cpusets, hotplug, scheduler: fix scheduler domain breakage


 kernel/cpuset.c |   24 ++++++++++++++++++------
 1 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 9fceb97..798b3ab 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1882,7 +1882,7 @@ static void scan_for_empty_cpusets(const struct cpuset *root)
  * in order to minimize text size.
  */
 
-static void common_cpu_mem_hotplug_unplug(void)
+static void common_cpu_mem_hotplug_unplug(int rebuild_sd)
 {
 	cgroup_lock();
 
@@ -1894,7 +1894,8 @@ static void common_cpu_mem_hotplug_unplug(void)
 	 * Scheduler destroys domains on hotplug events.
 	 * Rebuild them based on the current settings.
 	 */
-	rebuild_sched_domains();
+	if (rebuild_sd)
+		rebuild_sched_domains();
 
 	cgroup_unlock();
 }
@@ -1912,11 +1913,22 @@ static void common_cpu_mem_hotplug_unplug(void)
 static int cpuset_handle_cpuhp(struct notifier_block *unused_nb,
 				unsigned long phase, void *unused_cpu)
 {
-	if (phase == CPU_DYING || phase == CPU_DYING_FROZEN)
+	switch (phase) {
+	case CPU_UP_CANCELED:
+	case CPU_UP_CANCELED_FROZEN:
+	case CPU_DOWN_FAILED:
+	case CPU_DOWN_FAILED_FROZEN:
+	case CPU_ONLINE:
+	case CPU_ONLINE_FROZEN:
+	case CPU_DEAD:
+	case CPU_DEAD_FROZEN:
+		common_cpu_mem_hotplug_unplug(1);
+		break;
+	default:
 		return NOTIFY_DONE;
+	}
 
-	common_cpu_mem_hotplug_unplug();
-	return 0;
+	return NOTIFY_OK;
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG
@@ -1929,7 ...
Previous thread: [PATCH 3/3] x86_64: adjust exception frame on paranoid exceptions by Jeremy Fitzhardinge on Saturday, July 12, 2008 - 2:22 am. (1 message)

Next thread: [mmotm] smp.c: fix build error by KOSAKI Motohiro on Saturday, July 12, 2008 - 4:11 am. (2 messages)