Re: current linux-2.6.git: cpusets completely broken

From: Linus Torvalds
Date: Sunday, July 13, 2008 - 10:10 am

On Sun, 13 Jul 2008, Dmitry Adamushko wrote:
 

And let me explain why you are totally off base.


This has *NOTHING* to do with optimizing any hotplug machinery.


This has *NOTHING* to do even with cpusets and scheduler domains!

Until you can understand that, all your arguments are total and utter 
CRAP.

So Dmitry - please follow along, and think this through.

This is a *fundamental* scheduler issue. It has nothing what-so-ever to do 
with optimization, and it has nothing to do with cpusets. It's about the 
fact that we migrate threads from one CPU to another - and we do that 
whether cpusets are even enabled or not!

And anything that uses "cpu_active_map" to decide if the migration target 
is alive is simply _buggy_.

See? Not "un-optimized". Not "cpusets". Just pure scheduling and hotplug 
issues with taking a CPU down.

As long as you continue to only look at wake_idle() and scheduler domains, 
you are missing all the *other* cases of migration. Like the one we do at 
execve() time, or in balance_task.

The thing is, we should fix the top level code to never even _consider_ an 
invalid CPU as a target, and that in turn should mean that all the other 
code should be able to just totally ignore CPU hotplug events.
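[Editorial illustration, not part of the original message.] The idea of filtering invalid targets once, at the top level, can be sketched in a small userspace model. This is only a sketch under stated assumptions: the names cpumask_model_t, mask_test() and pick_migration_target() are hypothetical, the masks are plain 64-bit integers, and nothing here is the kernel's actual code or the actual patch under discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model, not kernel code: one bit per CPU. */
#define MODEL_NR_CPUS 64
typedef uint64_t cpumask_model_t;

/* Return 1 if bit `cpu` is set in `mask`, else 0. */
static int mask_test(cpumask_model_t mask, int cpu)
{
    return (int)((mask >> cpu) & 1u);
}

/*
 * Top-level target selection for the model.
 *
 * allowed:   CPUs the task may run on.
 * active:    CPUs that may receive migrated tasks; a CPU that is being
 *            taken down is cleared here before anything else happens.
 * preferred: the CPU some balancing heuristic suggested, or -1 for none.
 *
 * The heuristic's suggestion is honoured only if it is both allowed and
 * active. Otherwise the lowest-numbered valid CPU is used. Returns -1 if
 * no valid target exists.
 */
static int pick_migration_target(cpumask_model_t allowed,
                                 cpumask_model_t active,
                                 int preferred)
{
    cpumask_model_t valid = allowed & active;

    assert(preferred < MODEL_NR_CPUS);

    if (preferred >= 0 && mask_test(valid, preferred))
        return preferred;

    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
        if (mask_test(valid, cpu))
            return cpu;
    }
    return -1;
}
```

In this model, a caller with CPUs 0-3 allowed (0xF) and CPU 1 going down (active mask 0xD) that asks for CPU 1 is redirected to CPU 0. Whatever produced the stale suggestion does not need to know about the hotplug event, because the check lives in the one place every migration goes through.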

In other words, it very fundamentally SHOULD NOT MATTER that somebody 
happened to call "try_to_wake_up()" during the cpu unplug sequence. We 
should fix the fundamental scheduler routines to simply make it impossible 
for that to ever balance something back to a CPU that is going down.

And we shouldn't _care_ about what crazy things the cpusets code does.

See?

THAT is the reason for my patch. I think the cpusets callbacks are totally 
insane, but I don't care. What I care about is that the scheduler got 
confused just because those insane callbacks happened to make timing be 
just subtle enough that (and I quote):

  "try_to_wake_up() is called for one of these tasks from another CPU ->
   the load-balancer (wake_idle()) picks up a "dead" CPU and places the 
   task on it. Then e.g. BUG_ON(rq->nr_running) detects this a bit later 
   -> oops."

IOW, we should never have had code that was that fragile in the first 
place! It's totally INSANE to depend on complex and fragile code, when 
we'd be much better off with simple code that always says: "I will not 
migrate a task to a CPU that is going down".

Depending on complex (and conditional) scheduler domains data structures 
is a *bug*. It's fragile, and it's a horrible design mistake.

			Linus
