Re: Ten percent test

From: Con Kolivas
Date: Wednesday, March 28, 2007 - 9:37 am

test.kernel.org found some idle time regressions in the latest update to the
staircase deadline scheduler and Andy Whitcroft helped me track down the 
offending problem which was present in all previous RSDL schedulers but
previously wouldn't be manifest without changes in nice. So here is a bugfix
for the set_load_weight being incorrectly set and a few other minor 
improvements. Thanks Andy!

I'm cautiously optimistic that we're at the thin edge of the bugfix wedge now.

---
set_load_weight() should be performed after p->quota is set. This fixes a
large SMP performance regression.

Make sure rr_interval is never set to less than one jiffy.

Some sanity checking in update_cpu_clock will prevent bogus sched_clock
values.

SCHED_BATCH tasks should not set the rq->best_static_prio field.

Correct sysctl rr_interval description to describe the value in milliseconds.

Style fixes.

Signed-off-by: Con Kolivas <kernel@kolivas.org>

---
 Documentation/sysctl/kernel.txt |    8 ++--
 kernel/sched.c                  |   73 +++++++++++++++++++++++++++++-----------
 2 files changed, 58 insertions(+), 23 deletions(-)

Index: linux-2.6.21-rc5-mm2/kernel/sched.c
===================================================================
--- linux-2.6.21-rc5-mm2.orig/kernel/sched.c	2007-03-28 09:01:03.000000000 +1000
+++ linux-2.6.21-rc5-mm2/kernel/sched.c	2007-03-29 00:02:33.000000000 +1000
@@ -88,10 +88,13 @@ unsigned long long __attribute__((weak))
 #define MAX_USER_PRIO		(USER_PRIO(MAX_PRIO))
 #define SCHED_PRIO(p)		((p)+MAX_RT_PRIO)
 
-/* Some helpers for converting to/from nanosecond timing */
+/* Some helpers for converting to/from various scales.*/
 #define NS_TO_JIFFIES(TIME)	((TIME) / (1000000000 / HZ))
-#define NS_TO_MS(TIME)		((TIME) / 1000000)
+#define JIFFIES_TO_NS(TIME)	((TIME) * (1000000000 / HZ))
 #define MS_TO_NS(TIME)		((TIME) * 1000000)
+/* Can return 0 */
+#define MS_TO_JIFFIES(TIME)	((TIME) * HZ / 1000)
+#define JIFFIES_TO_MS(TIME)	((TIME) * 1000 / HZ)
 
 #define ...
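
The "Can return 0" comment and the rr_interval fix in the changelog go together: with integer arithmetic, a millisecond value shorter than one tick truncates to zero jiffies. A minimal sketch of that effect, assuming HZ is 100 purely for illustration; rr_interval_jiffies() is a hypothetical helper and not part of the patch:

```c
#include <assert.h>

/* Assumption: HZ = 100 (10 ms per jiffy), chosen only for illustration. */
#define HZ 100

/* Same conversion macros as in the hunk above. */
#define MS_TO_JIFFIES(TIME)	((TIME) * HZ / 1000)
#define JIFFIES_TO_MS(TIME)	((TIME) * 1000 / HZ)

/*
 * Hypothetical helper, not from the patch: convert an rr_interval given
 * in milliseconds to jiffies, never returning less than one jiffy.
 */
int rr_interval_jiffies(int ms)
{
	int jiffies;

	assert(ms >= 0);
	jiffies = MS_TO_JIFFIES(ms);
	return jiffies < 1 ? 1 : jiffies;
}
```

With these assumed values, an 8 ms interval converts to 0 jiffies through the bare macro and to 1 jiffy through the clamped helper.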
From: Ingo Molnar
Date: Wednesday, March 28, 2007 - 11:48 am

hm, how about the questions Mike raised (there were a couple of cases of 
friction between 'the design as documented and announced' and 'the code 
as implemented')? As far as i saw they were still largely unanswered - 
but let me know if they are all answered and addressed:

 http://marc.info/?l=linux-kernel&m=117465220309006&w=2
 http://marc.info/?l=linux-kernel&m=117489673929124&w=2
 http://marc.info/?l=linux-kernel&m=117489831930240&w=2

and the numbers he posted:

 http://marc.info/?l=linux-kernel&m=117448900626028&w=2

his test conclusion was that under CPU load, RSDL (SD) generally does 
not hold up to mainline's interactivity.

	Ingo
-

From: Con Kolivas
Date: Wednesday, March 28, 2007 - 4:44 pm

I spent less time emailing and more time coding. I have been working on 





There have been improvements since the earlier iterations but it's still a 
fairness based design. Mike's "sticking point" test case should be improved 
as well.

My call based on my own testing and feedback from users is: 

Under niced loads it is 99% in favour of SD.

Under light loads it is 95% in favour of SD.

Under heavy loads it becomes proportionately in favour of mainline. The 
crossover is somewhere around a load of 4.

If the reluctance to renice X goes away I'd say it was 99% across the board 

-- 
-ck
-

From: Mike Galbraith
Date: Wednesday, March 28, 2007 - 10:50 pm

That one's not fine.

+static void recalc_task_prio(struct task_struct *p, struct rq *rq)
+{
+	struct prio_array *array = rq->active;
+	int queue_prio;
+
+	update_if_moved(p, rq);
+	if (p->rotation == rq->prio_rotation) {
+		if (p->array == array) {
+			if (p->time_slice > 0)
+				return;
+			p->time_slice = p->quota;
+		} else if (p->array == rq->expired) {

You implemented nanosecond accounting, but here you give a task which
has either missed the tick often enough, or accumulated enough cross cpu
clock drift to have an I.O.U. in its wallet a shiny new $8 bill.
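
The objection can be made concrete. With nanosecond accounting a task can overrun and end up with a negative time_slice; assigning p->quota outright forgives that overrun. Below is a sketch of the two refill policies, using hypothetical helper names and plain integers rather than the real task_struct fields. The second variant only illustrates what carrying the debt forward would mean; it is not code from either tree:

```c
#include <assert.h>

/* Refill as in the quoted hunk: any overrun (negative slice) is forgotten. */
long refill_forgiving(long time_slice, long quota)
{
	(void)time_slice;
	return quota;
}

/* Illustrative alternative: the overrun is deducted from the new slice. */
long refill_carrying_debt(long time_slice, long quota)
{
	assert(time_slice <= 0);	/* only called once the slice is used up */
	return quota + time_slice;
}
```

With example values of an 8 ms quota and a 2 ms overrun (in nanoseconds), the first policy hands back the full 8000000 and the second hands back 6000000.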

WRT clock drift/timewarps, your latest code concedes that these do occur,
but where these timewarps can be anywhere between minuscule with Intel
same package processors, up to a tick elsewhere, it charges a tick.
 
-       /* cpu scheduler quota accounting is performed here */
+       if (tick) {
+               /*
+                * Called from scheduler_tick() there should be less than two
+                * jiffies worth, and not negative/overflow.
+                */
+               if (time_diff > JIFFIES_TO_NS(2) || time_diff < min_diff)
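
Read together with the remark above that the code "charges a tick", the check amounts to a clamp on the measured interval. A sketch follows, assuming HZ = 1000 and a min_diff passed in by the caller (both are illustrative assumptions, since the hunk does not show them), with a hypothetical function name:

```c
#include <assert.h>

#define HZ 1000	/* assumption for illustration */
#define JIFFIES_TO_NS(TIME)	((TIME) * (1000000000 / HZ))

/*
 * Hypothetical helper: return the time to charge for a scheduler tick.
 * Values above two jiffies or below min_diff (including negative ones
 * from clock warps) are treated as bogus and replaced by one jiffy.
 */
long long sane_tick_diff(long long time_diff, long long min_diff)
{
	assert(min_diff >= 0);
	if (time_diff > JIFFIES_TO_NS(2) || time_diff < min_diff)
		return JIFFIES_TO_NS(1);
	return time_diff;
}
```

Under these assumptions a plausible half-jiffy reading passes through unchanged, while a negative or three-jiffy reading is charged as exactly one jiffy, whatever the real warp was.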

Hm.  How, where?

I'm getting inconsistent results with current, but sleeping tasks still
don't _appear_ to be able to compete with hogs on an equal footing, and
I don't see how they really can.

What happens if a sleeper sleeps after using say half of its slice, and
the hog it's sharing the CPU with then sleeps briefly after using most
of its slice.  That's the end of the rotation.  They are put back on an

The behavior is different, and is less ragged, but I wouldn't say it's
really been improved.  The below was added as a workaround.

+ * This contains a bitmap for each dynamic priority level with empty slots
+ * for the valid priorities each different nice level can have. It allows
+ * us to stagger the slots where differing priorities run in a way that
+ * keeps latency differences between different nice levels at a minimum.
+ * ie, where 0 means a slot ...
From: Mike Galbraith
Date: Wednesday, March 28, 2007 - 11:29 pm

Suggestion: try the testcase that Satoru Takeuchi posted.  The numbers I
got with latest SD were no better than the numbers I got with the patch
I posted to try to solve it.  Seems to me the numbers with SD should
have been much better, but they in fact were not.

Running that thing, mainline's GUI was not usable, even with my patch,
but neither was it usable with SD.  What's the difference between
horrible with mainline and merely terrible with SD?  In both, the GUI
ends up doing round-robin with a slew of hogs.  In mainline, this
happens because the history logic can and does get it wrong sometimes,
which this exploit deliberately triggers.  With SD, it's by design.

	-Mike

-

From: Mike Galbraith
Date: Wednesday, March 28, 2007 - 11:54 pm

Oh my, I'm on a roll here... somebody stop me ;-)

Some emphasis:


The much maligned history mechanism in mainline didn't start its life
as an interactivity estimator, that's a name it acquired later.  What it
was first put there for was to ensure fairness for sleeping tasks.

I found it most ironic that the numbers I posted showed that mechanism
working perfectly, with an exploit that was designed specifically to
expose its weakness, despite the deliberate tweaks that have gone in
tweaking it very heavily in the unfair direction, and this went
uncommented.  If I had run more of them, it would have shown that
weakness very well.  We all know that weakness exists.

What the numbers clearly showed was that sleeping tasks did not get the
fairness RSDL advertised with the particular test I ran, yet it went
uncommented/uncontested.  Anyone could have tested with the trivial
proggy of their choice... but nobody did.

The history mechanism is not only about interactivity, and never was. 

	-Mike

I'm gonna go piddle around with code now, much more fun than yacking :)

-

From: Mike Galbraith
Date: Thursday, March 29, 2007 - 1:18 am

Rereading to make sure I wasn't unclear anywhere...


Egad.  Here I'm pondering the numbers and light load as I'm typing, and
my fingers (seemingly independent when mind wanders off) typed < 95% as
in not fully committed, instead of "light".

	-Mike

-

From: Con Kolivas
Date: Monday, April 2, 2007 - 7:35 pm

95% of cases where load is less than 4; not 95% load.

-- 
-ck
-

From: michael chang
Date: Thursday, March 29, 2007 - 5:55 am

While I don't know the _exact_ figure for this, my hunch is that a
good ballpark figure is anything that is not a heavy load (less than
4, perhaps even lower, maybe <0.75 or <2?) and that is not a "niced"
load.

-- 
-- Michael Chang
~Just the crazy copy cat~
-

From: Con Kolivas
Date: Monday, April 2, 2007 - 7:37 pm

Try two instances of chew.c at _differing_ nice levels on one cpu on mainline, 

-- 
-ck
From: Mike Galbraith
Date: Monday, April 2, 2007 - 10:31 pm

How about something more challenging instead :)

The numbers below are from my scheduler tree with massive_intr running
at nice 0, and chew at nice 5.  Below these numbers are 100 lines from
the exact center of chew's output.

(interactivity remains intact with this rather heavy load)

root@Homer: ./massive_intr 30 180
005671  00001506
005657  00001506
005651  00001491
005647  00001466
005661  00001484
005660  00001475
005645  00001514
005668  00001384
005673  00001516
005656  00001449
005664  00001512
005659  00001507
005667  00001513
005663  00001521
005670  00001440
005649  00001522
005652  00001487
005648  00001405
005665  00001472
005669  00001418
005662  00001489
005674  00001523
005650  00001480
005655  00001476
005672  00001530
005653  00001463
005654  00001427
005646  00001499
005658  00001510
005666  00001476

100 sequential lines from the middle of chew's logged output.

pid 5642, prio   5, out for    2 ms, ran for    1 ms, load  34%
pid 5642, prio   5, out for 1268 ms, ran for   63 ms, load   4%
pid 5642, prio   5, out for   52 ms, ran for    0 ms, load   0%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  14%
pid 5642, prio   5, out for    9 ms, ran for    1 ms, load  12%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  17%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  15%
pid 5642, prio   5, out for    9 ms, ran for    1 ms, load  17%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  15%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  12%
pid 5642, prio   5, out for    7 ms, ran for    1 ms, load  18%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  11%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  18%
pid 5642, prio   5, out for    4 ms, ran for    1 ms, load  22%
pid 5642, prio   5, out for 1395 ms, ran for   50 ms, load   3%
pid 5642, prio   5, out for   26 ms, ran for    0 ms, load   3%
pid 5642, prio   5, out for    8 ms, ran for    1 ms, load  ...
From: Mike Galbraith
Date: Monday, April 2, 2007 - 11:00 pm

Here are the numbers for 2.6.21-rc5 with only the earlier mentioned
patch.  Chew's log is only 20% as long as that from my other tree, and
interactivity suffers badly while running this exploit, but as you can
see, chew isn't dying of boredom.

	-Mike

root@Homer: ./massive_intr 30 180
006701  00001509
006693  00001571
006707  00001072
006690  00001582
006691  00001547
006692  00001336
006695  00001759
006710  00001766
006699  00001531
006688  00001405
006709  00001907
006703  00001572
006705  00001501
006697  00001617
006686  00001344
006713  00001922
006714  00001885
006704  00001491
006694  00001482
006689  00001395
006711  00001176
006715  00001471
006708  00001527
006687  00001200
006706  00001451
006698  00001246
006702  00001495
006696  00001421
006712  00001414
006700  00001047


pid 6683, prio   5, out for   46 ms, ran for    0 ms, load   0%
pid 6683, prio   5, out for    7 ms, ran for    1 ms, load  17%
pid 6683, prio   5, out for    8 ms, ran for    1 ms, load  16%
pid 6683, prio   5, out for    6 ms, ran for    1 ms, load  18%
pid 6683, prio   5, out for 3527 ms, ran for   69 ms, load   1%
pid 6683, prio   5, out for   52 ms, ran for    1 ms, load   2%
pid 6683, prio   5, out for   15 ms, ran for    1 ms, load   6%
pid 6683, prio   5, out for    7 ms, ran for    1 ms, load  15%
pid 6683, prio   5, out for    7 ms, ran for    1 ms, load  13%
pid 6683, prio   5, out for    7 ms, ran for    1 ms, load  18%
pid 6683, prio   5, out for    8 ms, ran for    1 ms, load  18%
pid 6683, prio   5, out for    8 ms, ran for    1 ms, load  18%
pid 6683, prio   5, out for    8 ms, ran for    1 ms, load  17%
pid 6683, prio   5, out for    7 ms, ran for    1 ms, load  17%
pid 6683, prio   5, out for 3925 ms, ran for   56 ms, load   1%
pid 6683, prio   5, out for   30 ms, ran for    1 ms, load   3%
pid 6683, prio   5, out for   24 ms, ran for    1 ms, load   6%
pid 6683, prio   5, out for    7 ms, ran for    1 ms, load  18%
pid 6683, prio   5, out for    7 ...
From: Mike Galbraith
Date: Tuesday, April 3, 2007 - 3:57 am

Taking a little break from tinkering, I built/ran rsd-0.38 as well.
While chew usually says "out for N < 500ms", I see spikes like those
below the massive_intr numbers.

root@Homer: ./massive_intr 30 180 (nice 0)
006596  00001346
006613  00001475
006605  00001463
006606  00001423
006598  00001279
006609  00001458
006600  00001378
006591  00001491
006610  00001413
006588  00001361
006602  00001401
006601  00001412
006607  00001373
006604  00001449
006599  00001398
006608  00001269
006611  00001464
006593  00001349
006614  00001335
006612  00001512
006615  00001422
006589  00001363
006617  00001362
006597  00001435
006592  00001354
006595  00001425
006616  00001348
006603  00001308
006594  00001360
006590  00001397

(spikes from run above)
pid 6585, prio   0, out for  178 ms, ran for   12 ms, load   6%
pid 6585, prio   0, out for  175 ms, ran for   13 ms, load   7%
pid 6585, prio   0, out for 1901 ms, ran for   12 ms, load   0%
pid 6585, prio   0, out for   61 ms, ran for   12 ms, load  17%
...
pid 6585, prio   0, out for  148 ms, ran for   11 ms, load   7%
pid 6585, prio   0, out for  229 ms, ran for   13 ms, load   5%
pid 6585, prio   0, out for  182 ms, ran for   11 ms, load   6%
pid 6585, prio   0, out for 1306 ms, ran for   11 ms, load   0%
pid 6585, prio   0, out for   72 ms, ran for   12 ms, load  15%
pid 6585, prio   0, out for  252 ms, ran for   11 ms, load   4%
....
(spikes from massive_intr at nice 0 and chew at nice -20)
pid 6547, prio -20, out for  132 ms, ran for  119 ms, load  47%
pid 6547, prio -20, out for   52 ms, ran for  119 ms, load  69%
pid 6547, prio -20, out for    4 ms, ran for   96 ms, load  95%
pid 6547, prio -20, out for 1251 ms, ran for   24 ms, load   1%
pid 6547, prio -20, out for   78 ms, ran for 1561 ms, load  95%
pid 6547, prio -20, out for   89 ms, ran for  120 ms, load  57%
pid 6547, prio -20, out for   69 ms, ran for  119 ms, load  63%
pid 6547, prio -20, out for 4125 ms, ran for  119 ms, load   2%
pid 6547, prio ...
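
For reading these logs: the load column appears consistent with ran / (out + ran), truncated to a whole percent. chew.c itself is not shown in this thread, so the following is an assumption inferred from the figures above, with a hypothetical function name. Rows with 1 ms run times will not match exactly, presumably because chew measures at finer than millisecond resolution:

```c
#include <assert.h>

/* Share of wall time spent running, as a truncated percentage. */
int chew_load_percent(long out_ms, long ran_ms)
{
	assert(out_ms >= 0 && ran_ms >= 0);
	if (out_ms + ran_ms == 0)
		return 0;
	return (int)(ran_ms * 100 / (out_ms + ran_ms));
}
```

Applied to the nice -20 lines above, 132 ms out and 119 ms run gives 47, and 4125 ms out and 119 ms run gives 2, matching the logged values.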
From: Ingo Molnar
Date: Monday, April 2, 2007 - 11:01 pm

looks interesting - could you send the patch?

	Ingo
-

From: Mike Galbraith
Date: Monday, April 2, 2007 - 11:11 pm

Sorry, that tree is not _even_ ready for viewing yet.
(and it's got an occasional oops bug i have to kill)

	-Mike

-

From: Mike Galbraith
Date: Thursday, April 5, 2007 - 4:02 am

Ok, this is looking/feeling pretty good in testing.  Comments on
fugliness etc much appreciated.

Below the numbers is a snapshot of my experimental tree.  It's a mixture
of my old throttling/anti-starvation tree and the task promotion patch,
with the addition of a scheduling class for interactive tasks to dish
out some of that targeted unfairness I mentioned.  SCHED_INTERACTIVE is
also targeted at the scenario where X or one of its clients uses enough
CPU to end up in the expired array.

(note:  Xorg was not set SCHED_INTERACTIVE during the test runs below)

	-Mike

top - 12:31:34 up 16 min, 13 users,  load average: 7.37, 8.74, 6.58

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  P COMMAND
 6542 root      15   0  1568  108   24 S   43  0.0   0:58.98 1 fiftypercent
 6540 root      17   0  1568  440  356 R   30  0.0   1:00.04 0 fiftypercent
 6544 root      18   0  1568  108   24 R   28  0.0   0:58.36 0 fiftypercent
 6541 root      20   0  1568  108   24 R   26  0.0   0:57.70 1 fiftypercent
 6536 root      25   0  1436  356  296 R   24  0.0   0:45.76 1 chew
 6538 root      25   0  1436  356  296 R   20  0.0   0:49.73 0 chew
 6543 root      19   0  1568  108   24 R   19  0.0   0:58.04 1 fiftypercent
 6409 root      15   0  154m  63m  27m R    2  6.3   0:13.09 0 amarokapp
 6410 root      15   0  154m  63m  27m S    2  6.3   0:14.36 0 amarokapp
 6376 root      15   0  2380 1092  764 R    2  0.1   0:15.63 0 top
 5591 root      18   0  4736 1036  736 S    1  0.1   0:00.14 1 smpppd
 5678 root      15   0  167m  24m 4848 S    1  2.4   0:19.37 0 Xorg
 6202 root      15   0 32364  18m  12m S    1  1.8   0:04.25 1 konsole

50 lines from center of chew nailed to cpu0's log

pid 6538, prio   0, out for   27 ms, ran for    1 ms, load   6%
pid 6538, prio   0, out for   26 ms, ran for    4 ms, load  14%
pid 6538, prio   0, out for   27 ms, ran for    7 ms, load  20%
pid 6538, prio   0, out for   13 ms, ran for    5 ms, load  27%
pid 6538, prio   0, out for    8 ms, ran for ...
From: Ingo Molnar
Date: Thursday, April 5, 2007 - 4:09 am

find a whitespace fix below.

	Ingo

Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -1034,7 +1034,7 @@ static int recalc_task_prio(struct task_
 	/*
 	 * Migration timestamp adjustment may induce negative time.
 	 * Ignore unquantifiable values as well as SCHED_BATCH tasks.
-	 */ 
+	 */
 	if (now < p->timestamp || batch_task(p))
 		sleep_time = 0;
 
-

From: Mike Galbraith
Date: Thursday, April 5, 2007 - 4:12 am

Thanks.

(dang, i need to find that fifty "make it red" thingie for vi again)

	-Mike

-

From: Ingo Molnar
Date: Thursday, April 5, 2007 - 4:15 am

or just start using quilt, which warns about this :)

	Ingo
-

From: Johannes Stezenbach
Date: Thursday, April 5, 2007 - 6:18 am

put "let c_space_errors=1" in .vimrc

HTH,
Johannes
-

From: Mike Galbraith
Date: Thursday, April 5, 2007 - 8:28 am

Thanks.

I received this link via private mail, and think it's worth posting.
Who knows, it may save Maintainers an antacid tablet or two.

http://www.pixelbeat.org/settings/.vimrc

	-Mike

(may eventually get tired of the colors, but for now they're cooler than
the plain black and white i'm used to, _and_ has "make it glow" feature)

-

From: Ingo Molnar
Date: Thursday, April 5, 2007 - 4:54 am

here's some test results, comparing SD-latest to Mike's-latest:

re-testing the weak points of the vanilla scheduler + Mike's:

 - thud.c:    this workload has almost unnoticeable effect
 - fiftyp.c:  noticeable, but a lot better than previously!

re-testing the weak points of SD:

 - hackbench: still unusable under such type of high load - no improvement.
 - make -j:   still less interactive than Mike's - no improvement.

	Ingo
-

From: Mike Galbraith
Date: Thursday, April 5, 2007 - 5:10 am

Hmm.  Here fiftyp.c is utterly harmless.  If you have a second, can you
send me a top snapshot?  If you're running many of them, it can take a
bit for the throttle to catch them all.

	-Mike

-

From: Ingo Molnar
Date: Thursday, April 5, 2007 - 5:12 am

ah, indeed - i ran 10 of them and letting them run for a bit smoothes 
things out.

	Ingo
-

From: Mike Galbraith
Date: Thursday, April 5, 2007 - 5:24 am

Ok, I didn't try 10 of them.  It can still get a bit ragged here, so I
may have to latch the throttle for a bit to make sure they have to
maintain improved behavior to get unleashed.  5 of them get instantly
nailed, and stay nailed.

	-Mike

-

From: Con Kolivas
Date: Thursday, April 5, 2007 - 9:08 am

Throttling to try to get to SD fairness? The mainline state machine becomes 
more complex than ever and fluctuates from interactive to fair by an as-yet 

Nice -10 on mainline ruins the latency of nice 0 tasks unlike SD. New 
scheduling class just for X? Sounds like a very complicated 



Depends on how big your job number vs cpu is. The better the throttling gets 
with mainline the better SD gets in this comparison. At equal fairness 
mainline does not have the low latency interactivity SD has.

Nice -10 X with SD is a far better solution than an ever increasing complexity 
state machine and a userspace-changing scheduling policy just for X. Half 

-- 
-ck
-

From: Mike Galbraith
Date: Thursday, April 5, 2007 - 1:29 pm

I believe I've already met and surpassed SD fairness.  Bold statement,
but I believe it's true.  I'm more worried about becoming _too_ fair.

Show me your numbers.  I showed you mine with both SD and my patches.

WRT magic and state machine complexity:  If you read the patch, there is
nothing "magical" about it.  It doesn't do anything but monitor CPU
usage and move a marker.  It does nothing the least bit complicated, and
what it does, it does in the slow path.  The only thing it does in the
fast path is to move the marker, and perhaps tag a targeted task.  State
machine?  There is nothing there that resembles a state machine to me,

This patch makes massive nice -10 vs nice 0 latency history I believe.
Testing welcome.  WRT "nice -10 obfuscated", that's a load of high grade
horse-hockey.  There were very good reasons posted here as to why that is
a very bad idea, perhaps you haven't read them.  (you can find them if
you choose)

Your criticism of SCHED_INTERACTIVE leaves me dumbfounded, since you were,
and still are, specifically telling me that I should tell the scheduler
that X is special.  I did precisely that, and am also trying to tell it
that its clients are special too, _without_ having to start each and



SD does not retain interactivity under any appreciable load for one, and
secondly, I'm getting interactivity that SD cannot even get close to
without renicing, and without any patches - in mainline right now.

(Speaking of low latency, how long can tasks forking off sleepers who
overlap their wake times prevent an array switch with SD?  Forever?)

I posted numbers that demonstrate the improvement in fairness while
maintaining interactivity, and I'm not finished.  I've solved the
multiple fiftyp.c thing Ingo noticed, and in fact, I had 10 copies
running that I had forgotten to terminate while I was working, and I
didn't even notice until I finished, and saw my top window.  Patch to
follow as soon as I test some more (that's what takes much time, not
creating the ...
From: Ingo Molnar
Date: Thursday, April 5, 2007 - 12:05 pm

i think you are missing the point. We _do not know in advance_ whether X 
should be prioritized or not. It's the behavior of X that determines it. 
When X is reniced to -10 it fixes a few corner cases, but it breaks many 
other cases. We found that out time and time again.


this is relative to how mainline+Mike's handles it. Users wont really 

i often run make jobs with -j200 or larger, and SD gets worse than even 
mainline much sooner than that.

	Ingo
-

From: Con Kolivas
Date: Thursday, April 5, 2007 - 6:03 pm

fiftyp.c seems to have been stumbled across by accident as having an effect 
when Xenofon was trying to recreate Mike's 50% x 3 test case. I suggest a ten 
percent version like the following would be more useful as a test for the 
harmful effect discovered in fiftyp.c. (/me throws in obligatory code style 
change).

Starts 15 processes that sleep ten times longer than they run. Change forks to 
15 times the number of cpus you have and it should work on any size hardware.
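
The test program itself is not reproduced in this copy of the message. As a rough sketch of the workload described, and not the original code, the following forks a number of children that each burn CPU briefly and then sleep ten times as long. All names are hypothetical, and the cycle count is bounded here (an assumption) so that the sketch terminates on its own:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Monotonic time in milliseconds. */
static long long now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec * 1000LL + ts.tv_nsec / 1000000;
}

/* Burn CPU for run_ms, then sleep ten times as long. */
static void burn_then_sleep(long run_ms)
{
	struct timespec nap = {
		.tv_sec = run_ms * 10 / 1000,
		.tv_nsec = (run_ms * 10 % 1000) * 1000000L,
	};
	long long end = now_ms() + run_ms;

	while (now_ms() < end)
		;	/* busy loop */
	nanosleep(&nap, NULL);
}

/* 15 processes per CPU, as suggested in the text. */
int forks_for_cpus(int ncpus)
{
	return 15 * ncpus;
}

/*
 * Fork nforks children that each repeat the burn/sleep cycle `cycles`
 * times, then wait for them. Returns the number of children reaped,
 * or -1 if fork() fails.
 */
int run_sleepers(int nforks, long run_ms, int cycles)
{
	int i, c, reaped = 0;

	assert(nforks > 0 && run_ms > 0 && cycles > 0);
	for (i = 0; i < nforks; i++) {
		pid_t pid = fork();

		if (pid < 0)
			return -1;
		if (pid == 0) {
			for (c = 0; c < cycles; c++)
				burn_then_sleep(run_ms);
			_exit(0);
		}
	}
	while (wait(NULL) > 0)
		reaped++;
	return reaped;
}
```

On a single-CPU machine, a call such as run_sleepers(forks_for_cpus(1), 10, 1000) would start 15 such tasks; the run time and cycle count in that call are arbitrary example values.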

-- 
-ck
From: Mike Galbraith
Date: Friday, April 6, 2007 - 2:07 am

I was more focused on the general case, but all I should have to do to
de-claw all of these sleep exploits is account rr time (only a couple of
lines, done and building now).  It's only a couple of lines.

	-Mike

-

From: Con Kolivas
Date: Friday, April 6, 2007 - 2:28 am

The more you try to "de-claw" these sleep exploits the less effective you make 
your precious interactive estimator. Feel free to keep adding endless tweaks 
to undo the other tweaks in order to try and achieve what SD has by design. 
You'll end up with an increasingly complex state machine design of 
interactivity tweaks and interactivity throttlers all fighting each other to 
the point where the interactivity estimator doesn't do anything. What's the 
point in that? Eventually you'll have an estimator throttled to the point it 
does nothing and you end up with something far less interactive than SD which 
is as interactive as fairness allows, unlike mainline.

-- 
-ck
-

From: Mike Galbraith
Date: Friday, April 6, 2007 - 3:48 am

I haven't seen SD achieve what its design docs claim yet, so yup, I'm
going to keep right on trying to fix the corner cases in what we have
that _does_ give me the interactivity I want.

	-Mike

-

From: Ingo Molnar
Date: Friday, April 6, 2007 - 3:03 am

firstly, testing on various workloads Mike's tweaks work pretty well, 
while SD still doesnt handle the high-load case all that well. Note that 
it was you who raised this whole issue to begin with: everything was 
pretty quiet in scheduling interactivity land. (There was one person who 
reported wide-scale interactivity regressions against mainline but he 
didnt answer my followup posts to trace/debug the scenario.)

SD has a built-in "interactivity estimator" as well, but hardcoded into 
its design. SD has its own set of ugly-looking tweaks as well - for 
example the prio_matrix. So it all comes down on 'what interactivity 
heuristics is enough', and which one is more tweakable. So far i've yet 
to see SD address the hackbench and make -j interactivity 
problems/regression for example, while Mike has been busy addressing the 

It comes down to defining interactivity by scheduling behavior, and 
making that definition flexible. SD's definition of interactivity is 
rigid (but it's still behavior-based, so not fundamentally different 
from an explicit 'interactivity estimator'), and currently it does not 
work well under high load. But ... i'm still entertaining the notion 
that it might be good enough, but you've got to demonstrate the design's 
flexibility.

furthermore, your description does not match my experience when using 
Mike's tweaks and comparing it to SD on the same hardware. According to 
your claim i should have seen regressions popping up in various, 
already-fixed corners, but it didnt happen in practice. But ... i'm 
awaiting further SD and Mike tweaks, the race certainly looks 
interesting ;)

	Ingo
-

From: Mike Galbraith
Date: Friday, April 6, 2007 - 3:40 am

<g> I think I lapped him, but since we're running in opposite
directions, it's hard to tell.

	-Mike

-

From: Con Kolivas
Date: Friday, April 6, 2007 - 11:50 pm

I'm terribly sorry but you have completely missed my intentions then. I was 
_not_ trying to improve mainline's interactivity at all. My desire was to fix 
the unfairness that mainline has, across the board without compromising 
fairness. You said yourself that an approach that fixed a lot and had a small 
number of regressions would be worth it. In a surprisingly ironic turnaround 
two bizarre things happened. People found SD fixed a lot of their 
interactivity corner cases which were showstoppers. That didn't surprise me 
because any unfair design will by its nature get it wrong sometimes. The even 
_more_ surprising thing is that you're now using interactivity as the 
argument against SD. I did not set out to create better interactivity, I set 
out to create widespread fairness without too much compromise to 
interactivity. As I said from the _very first email_, there would be cases of 

That was one user. As I mentioned in an earlier thread, the problem with email 
threads on drawn out issues on lkml is that all that people remember is the 
last one creating noise, and that has only been the noise from Mike for 2 
weeks now. Has everyone forgotten the many many users who reported the 
advantages first up which generated the interest in the first place? Why have 
they stopped reporting? Well the answer is obvious; all the signs suggest 
that SD is slated for mainline. It is on the path, Linus has suggested it and 
now akpm is asking if it's ready for 2.6.22. So they figure there is no point 
testing and replying any further. SD is ready for prime time, finalised and 
does everything I intended it to. This is where I have to reveal to them the 
horrible truth. This is no guarantee it will go in. In fact, this one point 
that you (Ingo) go on and on about is not only a quibble, but you will call 
it an absolute showstopper. As maintainer of the cpu scheduler, in its 
current form you will flatly refuse it goes to mainline citing the 5% of 
cases where interactivity has regressed. ...
From: Mike Galbraith
Date: Saturday, April 7, 2007 - 9:32 am

This doesn't even deserve a reply, so I'll just say "get well soon".

	-Mike

-

From: Gene Heskett
Date: Saturday, April 7, 2007 - 9:12 am

Con was scratching an itch, one we desktop users all have in a place we 
can't quite reach to scratch because we aren't quite the coding gods we 
should be.  Con at least has the coding knowledge to walk in and start 
shoveling, which is more than I can say of the efforts to derail the SD 

Sorry, this user got quiet to watch the cat fight.  Obviously I should 

Who gives a s*** about hackbench or a make -j 200?!  Those are NOT, and 
NEVER WILL BE, REAL WORLD LOADS for the vast majority of us.  For us SD 

To be expected, there are after all, only so many cpu cycles to go around.  
Here I sit, running 2.6.21-rc6 ATM, and since there is not an SD patch 
that applies cleanly to rc6, I am back to typing half or more of a 
sentence blind while I answer a posting such as this because of x 
starvation while kmail is sorting incoming stuff.

All this while gkrellm, sitting on the right edge of my screen, is showing 
a 0 to 2% cpu usage in its graphic display!  FWIW, it also isn't suffering 
the same display update problems, nor is the system clock down on the 
kickstart bar.  If that isn't prima facie evidence of an unfair scheduler, 
I don't know what is. With the SD patch applied to a working kernel, I've 
pretty well got my machine back and I'm in command again, just as if I 
was running nitros9 on my trs-80 Color Computer while it was compiling a 
program in the background, or back when I was doing all this on an amiga.

Both of these had, by their simplistic designs, schedulers that were fair, 
with (nitr)os9 having the ability to schedule the order that IRQ's were 
serviced with a priority setting on a per IRQ basis.  If Amigados ever 
had the ability to fiddle with the scheduler other than niceing the 
process, it wasn't important enough for me to see if I could tweak it 
because generally it simply worked.

Con's earlier patches worked very well for this desktop user, but as Mike 
kept bitching about "production", (who the hell runs a 'make -j 200' or 
50 while(1)'s in the ...
From: Ingo Molnar
Date: Saturday, April 7, 2007 - 11:08 am

it would be really nice to analyze this. Does the latest -rt patch boot 
on your box so that we could trace this regression? (I can send you a 
standalone tracing patch if it doesnt.) IIRC you reported that one of 
the early patches from Mike made your system behave good (but still not 
as good as SD) - it would be nice to try a later patch too.

basically, the current unfairness in the scheduler should be solved, one 

not many - and i dont think Mike tested any of these - Mike tested 
pretty low make -j values (Mike, can you confirm?).

(I personally routinely run 'make -j 200' build jobs on my box [because
 it's the central server of a build cluster and high parallelism is
 needed to overcome network latencies], but i'm pretty special in that
 regard and i didnt use that workload as a test against any of these
 schedulers.)

	Ingo
-

From: Mike Galbraith
Date: Saturday, April 7, 2007 - 12:14 pm

Yes.  I don't test anything more than make -j5 when looking at
interactivity, and make -j nr_cpus+1 is my must have yardstick.

	-Mike

-

From: Gene Heskett
Date: Saturday, April 7, 2007 - 1:31 pm

Somebody made that remark, maybe not you, and maybe they were being funny, 
but I didn't, at the time, see any smileys.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Please remain calm, it's no use both of us being hysterical at the same 
time.
-

From: William Lee Irwin III
Date: Monday, April 9, 2007 - 10:51 am

I strongly suggest assembling a battery of cleanly and properly written,
configurable testcases, and scripting a series of regression tests as
opposed to just randomly running kernel compiles and relying on Braille.
For instance, a program that spawns a set of tasks with some spectrum
of interactive vs. noninteractive behaviors and maybe priorities too
according to command-line flags and then measures and reports the
distribution of CPU bandwidth between them, with some notion of success
or failure and performance within the realm of success reported would
be something to include in such a battery of testcases. Different sorts
of cooperating processes attempting to defeat whatever sorts of
guarantees the scheduler is intended to provide would also be good
testcases, particularly if they're arranged so as to automatically
report success or failure in their attempts to defeat the scheduler
(which even irman2.c, while quite good otherwise, fails to do).
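
A minimal sketch of the kind of scriptable testcase described above, written in Python for brevity and not taken from any existing test suite. It forks one CPU-bound child per requested nice level, pins them all to a single CPU so that they actually compete, and compares the measured CPU shares against expected ones with a pass/fail verdict. The function names, the tolerance parameter, and the idea of passing the expected shares in by hand are assumptions made for illustration. The sketch is Linux-only because it relies on os.sched_setaffinity.

```python
import os
import time


def burn_cpu(seconds):
    """Spin until `seconds` of wall-clock time have passed."""
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        pass


def run_cpu_hogs(nice_levels, seconds):
    """Fork one CPU-bound child per nice level; return CPU seconds each got."""
    # Pin every child to one allowed CPU so the children really compete.
    cpu = min(os.sched_getaffinity(0))
    pids = []
    for nice in nice_levels:
        pid = os.fork()
        if pid == 0:
            status = 1
            try:
                os.sched_setaffinity(0, {cpu})
                os.nice(nice)  # negative increments need privileges
                burn_cpu(seconds)
                status = 0
            finally:
                os._exit(status)  # never fall back into the parent's code
        pids.append(pid)
    usage = []
    for pid in pids:
        _, _, rusage = os.wait4(pid, 0)
        usage.append(rusage.ru_utime + rusage.ru_stime)
    return usage


def shares(cpu_seconds):
    """Convert per-task CPU seconds into fractions of the total."""
    total = sum(cpu_seconds)
    if total <= 0:
        return [0.0] * len(cpu_seconds)
    return [t / total for t in cpu_seconds]


def check_shares(measured, expected, tolerance):
    """Return (ok, worst): ok is True if every share is within tolerance."""
    worst = max(abs(m - e) for m, e in zip(measured, expected))
    return worst <= tolerance, worst
```

For two tasks at the same nice level one would expect something like check_shares(shares(run_cpu_hogs([0, 0], 10)), [0.5, 0.5], 0.05) to pass. What the expected shares should be for unequal nice levels is left as an input on purpose, since that is the agreed-upon standard the message says is missing.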

IMHO the failure of these threads to converge to some clear conclusion
is in part due to the lack of an agreed-upon set of standards for what
the scheduler should achieve and overreliance on subjective criteria.
how any of this is to be demonstrated, and what the status of various
pathological cases are, these threads are a nightmare of subjective
squishiness and a tug-of-war between testcases only ever considered one
at a time needing Lindent to read that furthermore have all their
parameters hardcoded. Scripting edits and recompiles is awkward. Just
finding the testcases is also awkward; Con has a collection of a few,
but they've got the aforementioned flaws and others also go around
that can only be dredged up from mailing list archive searches, plus
there's nothing like LTP where they can be run in a script with
pass/fail reports and/or performance metrics for each. One patch goes
through for one testcase and regressions against the others are open
questions.

Scheduling does have a strong subjective component, but this is ...
From: Ingo Molnar
Date: Monday, April 9, 2007 - 11:03 am

there's interbench, written by Con (with the purpose of improving 
RSDL/SD), which does exactly that, but vanilla and SD perform quite the 
same in those tests.

it's quite hard to test interactivity, because it's both subjective and 
because even for objective workloads, things depend so much on exact 
circumstances. So the best way is to wait for actual complaints, and/or 
actual testcases that trigger badness, and victims^H^H^H^H^H testers.

(also note that often it needs _that precise_ workload to trigger some 
badness. For example make -j depends on the kind of X shell terminal 
that is used - gterm behaves differently from xterm, etc.)

	Ingo
-

From: William Lee Irwin III
Date: Monday, April 9, 2007 - 11:44 am

Interactivity will probably have to stay squishy. The DoS affairs like
fiftyp.c, tenp.c, etc. are more of what I had in mind. There are also
a number of instances where CPU bandwidth distributions are gauged by
top(1) with noninteractive tests where the scriptable testcase affair
should be coming into play.

There are other, relatively obvious testcases for basic functionality
missing, too. For instance, where is the testcase to prove that nice
levels have the intended effect upon CPU bandwidth distribution between
sets of CPU-bound tasks? Or one that gauges the CPU bandwidth
distribution between a task that sleeps some (command-line configurable)
percentage of the time and some (command-line configurable) number of
competing CPU-bound tasks? Or one that gauges the CPU bandwidth
distribution between sets of cooperating processes competing with
ordinary CPU-bound processes? Can it be proven that any of this is
staying constant across interactivity or other changes? Is any of it
being changed as an unintended side-effect? Are the CPU bandwidth
distributions among such sets of competing tasks even consciously decided?

There should be readily-available answers to these questions, but they
are not so.
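
One of the missing testcases listed above, a task that sleeps a configurable percentage of the time, could look roughly like the following Python sketch. It alternates spinning and sleeping within a fixed period and reports how much CPU the task asked for against how much it actually received, which only diverge noticeably when other tasks compete for the same CPU. The names split_period and run_sleeper and the period-based duty cycle are assumptions for illustration, not something taken from an existing testcase.

```python
import time


def split_period(period, sleep_fraction):
    """Return (busy, idle) seconds for one period of a part-time task."""
    if not 0.0 <= sleep_fraction <= 1.0:
        raise ValueError("sleep_fraction must be within [0, 1]")
    idle = period * sleep_fraction
    return period - idle, idle


def run_sleeper(sleep_fraction, period, seconds):
    """Alternate spinning and sleeping; return (cpu_wanted, cpu_received)."""
    busy, idle = split_period(period, sleep_fraction)
    wanted = 0.0
    cpu_start = time.process_time()
    end = time.monotonic() + seconds
    while time.monotonic() < end:
        # The spin uses a wall-clock deadline, so time lost to preemption
        # shows up as CPU time that was wanted but not received.
        spin_until = time.monotonic() + busy
        while time.monotonic() < spin_until:
            pass
        wanted += busy
        time.sleep(idle)
    return wanted, time.process_time() - cpu_start
```

Run alone, the two returned values should be close to each other. Run next to a configurable number of CPU-bound competitors, the ratio of received to wanted CPU is the figure one would track across scheduler changes.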


-- wli
-

From: Gene Heskett
Date: Saturday, April 7, 2007 - 11:23 am

Yes it would be, Ingo, but so far none of the recent -rt patches has 
booted on this machine; the last one I tried a few days ago failed to 
find /dev/root, whatever the heck that is.

FWIW, I gave up on the rt stuffs 6 months or more ago when the regressions 
I was reporting weren't ever acknowledged.  I don't enjoy sitting through 
all these e2fsck's during the reboot just to have things I normally run in 
the background die, like tvtime, sitting there with some news channel 
muttering along in the background.  I was even ignored when I suggested 
it might be a dma problem, which I still think it could be.

Nevertheless, the patch you sent is building as I type, intermittently 

And I'd wager a cool one that you don't gain more than a second or so in 
compile time between a make -j8 and a make -j200 unless your network is a 
pair of tomato juice cans & some string.  Again, to me, the network thing 
is not something that's present in an everyday user's environment.  My 

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
If you would keep a secret from an enemy, tell it not to a friend.
-

From: Ingo Molnar
Date: Saturday, April 7, 2007 - 11:52 am

did you have a chance to try the yum kernel by any chance? The -testing 
one you can try on Fedora with little hassle, by doing this as root:

cat > /etc/yum.repos.d/rt-testing.repo
[rt-testing]
name=Ingo's Real-Time (-rt) test-kernel for FC6
baseurl=http://people.redhat.com/mingo/realtime-preempt/yum-testing/yum/
enabled=1
gpgcheck=0
<Ctrl-D>


i did spend quite some time to debug your tv-tuner problem back then, 
and for that purpose alone i bought a tv tuner card to test this myself. 
(but it worked on my testbox)

	Ingo
-

From: Gene Heskett
Date: Saturday, April 7, 2007 - 1:30 pm

No, I couldn't seem to get that to show up in a yumex display, and I'm 
You didn't tell me this.

That said, I am booted to the patch you sent me now, and this also is a 
very obvious improvement, one I could easily live with on a long term 
basis.  I haven't tried a kernel build in the background yet, but I have 
sat here and played patience for about an hour, looking for the little 
stutters, but never saw them.  So I could just as easily recommend this 
one for desktop use, it seems to be working.  tvtime hasn't had any audio 
or video glitches that I've noted when I was on that screen to check on 
an interesting story, like the 102 year old lady who finally got her hole 
in one, on a very short hole, but after 90 years of golfing, she was 
beginning to wonder if she would ever get one.  Not sure who bought at 
the 19th hole, HNN didn't cover that traditional part.

So this patch also works.  And if it gets into mainline, at least Con's 
efforts at prodding the fixes needed will not have been in vain.

My question then, is why did it take a very public cat-fight to get this 
looked at and the code adjusted?  It's been what, nearly 2 years since 
Linus himself made a comment that this thing needed fixing.  The fixes 
then done were of very little actual effectiveness and the situation 
has gradually deteriorated since.

It's on the desktop that linux will win or lose the public's market share.  
After all, there are only so many 'servers' on the planet, a market in 
which linux has pretty well demo'ed its superiority, if not in terms of 
speed, at least in security.

To qualify that, I currently have 2 of yahoo's machines in 
my .procmailrc's /dev/null list as they are a source of a large number of 
little 1 to 3 line spams.  I assume they are IIS machines, but the email 
headers aren't that explicit to my relatively untrained eyeballs.

And I'd like to see korea put on a permanent rbl black hole.  I'm less 
than amused at watching the log coming out of my router as first ...
From: Ingo Molnar
Date: Sunday, April 8, 2007 - 3:41 am

thanks for testing it! (for the record, Gene tested sched-mike-4.patch, 

this is pretty hard to get right, and the most objective way to change 
it is to do it testcase-driven. FYI, interactivity tweaking has been 
gradual, the last bigger round of interactivity changes was done a year 
ago:

 commit 5ce74abe788a26698876e66b9c9ce7e7acc25413
 Author: Mike Galbraith <efault@gmx.de>
 Date:   Mon Apr 10 22:52:44 2006 -0700

     [PATCH] sched: fix interactive task starvation

(and a few smaller tweaks since then too.)

and that change from Mike responded to a testcase. Mike's latest changes 
(the ones you just tested) were mostly driven by actual testcases too, 
which measured long-term timeslice distribution fairness.

It's really hard to judge interactivity subjectively, so we rely on 
things like interbench (written by Con) - in which testsuite the 
upstream scheduler didnt fare all that badly, plus other testcases 
(thud.c, game_sim.c, now massive_inter.c, fiftyp.c and chew.c) and all 
the usual test-workloads. This is admittedly a slow process, but it 
seems to be working too and it also ensures that we dont regress in the 
future. (because testcases stick around and do get re-tested)

your system seems to also be a bit special because you 1) drive it to 
the absolute max on the desktop but you do not overload it in obvious 
ways (i.e. your workloads are pretty fairly structured) 2) it's a bit 
under-powered (single-CPU 800 MHz CPU, right?) but not _too_ 
underpowered - so i think you /just/ managed to hit 'the worst' of the 
current interactivity estimator: with important tasks both being just 
above and just below 50%. Believe me, on all ~10 systems i use 
regularly, Linux interactivity of the vanilla scheduler is stellar. (And 
that includes a really old 500 MHz one too with FC6 on it.)

	Ingo
-

From: Gene Heskett
Date: Sunday, April 8, 2007 - 4:33 am

Actually, it's an XP2800 Athlon, 333 fsb, gig of memory.  And I was all 
enthusiastic about this until amanda's nightly run started, at which 
point I started losing control for quite long periods, 30+ seconds at a 
time.  Up till then I thought we had it made.  In this regard, Con's 
patches were enough better to notice right away; lags were 1-2 seconds 
max.

That seems to be the killer loading here; building a kernel (make -j3) 
doesn't seem to lag it all that bad.  One session of gzip --best makes it 
fall plumb over though, which was a disappointment.

But, I could live with this.

Now if I could figure out a way to nail dm_mod down to a fixed LANANA 
approved address, I just got bit again, because enabling pktcdvd caused a 
MAJOR switch, only from 253 to 252 but tar thinks the whole 45GB is all 
new again.  So since it, dm_mod, no longer carries the experimental 
label, let's put that patch back in and be done with this particular 
hassle once and for all.  If I had known that using LVM2 was going to be 
such a pain in the ass just with this item alone, I wouldn't have touched 
it with a 50 foot fiberglass pole.  Or does this SOB affect normal 
partition mountings too?  I don't know, and the suggested fixes from 
David Dillow I put in /etc/modprobe.conf are ignored for dm_mod, and when 
extended to pktcdvd, cause pktcdvd to fail totally.

Mmm??, can I pass an 'option dm_mod major=238' as a kernel argument & make 

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Real Programmers don't write in PL/I.  PL/I is for programmers who can't
decide whether to write in COBOL or FORTRAN.
-

From: Mike Galbraith
Date: Sunday, April 8, 2007 - 4:40 am

Can you make a testcase that doesn't require amanda?

	-Mike

-

From: Mike Galbraith
Date: Sunday, April 8, 2007 - 5:02 am

Or at least send me a couple of 5 or 10 second top snapshots (which also
show CPU usage of sleeping tasks) while the system is misbehaving?

	-Mike

-

From: Gene Heskett
Date: Sunday, April 8, 2007 - 10:57 am

With what monitor utility?

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
"Microsoft technology" -- isn't that an oxymoron? 

   -- Gareth Barnard
-

From: Mike Galbraith
Date: Sunday, April 8, 2007 - 9:19 pm

Top.

	-Mike

-

From: Gene Heskett
Date: Sunday, April 8, 2007 - 10:23 pm

This may not be so informative, it's almost behaving ATM.

29252 amanda    22   0  1856  572  220 R 76.4  0.1   1:07.24 gzip
29235 amanda    15   0  2992 1224  888 S  5.6  0.1   0:02.80 chunker
29500 root      18   0  2996 1164  788 S  4.0  0.1   0:02.40 tar
10459 amanda    15   0  3340 1052  832 S  3.0  0.1   0:49.04 amandad
10536 amanda    15   0  3276 1308 1004 S  2.3  0.1   0:40.92 dumper
29496 amanda    18   0  2808  472  280 S  2.0  0.0   0:01.73 sendbackup
 4057 gkrellmd  15   0 11568 1172  896 S  1.3  0.1   7:45.82 gkrellmd
29498 amanda    18   0  2396  780  656 S  1.0  0.1   0:00.60 tar
19183 root      15   0     0    0    0 S  0.7  0.0   0:01.92 pdflush

I also note with some disdain that I'm half a megabyte into swap, but I've 
had FF-2.0.0.3 busy for the last hour while amanda was trying to find a 
few cycles at the same time.  Looking at a bunch of pdf's of circuit 
boards to see if I wanna build them for my milling machine.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Fatal Error: Found MS-Windows System -> Repartitioning Disk for Linux...
-

From: Mike Galbraith
Date: Sunday, April 8, 2007 - 11:09 pm

Yeah, this is showing the scheduler behaving properly.

	-Mike

-

From: Gene Heskett
Date: Sunday, April 8, 2007 - 10:56 am

Sure.  Try 'tar czf nameofarchive.tar.gz /path/to-dir-to-be-backed-up'

Or, from the runtar log from this morning, and this is all one line:

runtar.20070408022016.debug:running: /bin/tar: 'gtar' '--create' '--file' '-' '--directory' '/usr/dlds-rpms' '--one-file-system' '--listed-incremental' '/usr/local/var/amanda/gnutar-lists/coyote_usr_dlds-rpms_1.new' '--sparse' '--ignore-failed-read' '--totals' '--exclude-from' '/tmp/amanda/sendbackup._usr_dlds-rpms.20070408022016.exclude' '.'

and amanda will, if requested, pipe that output through a |gzip --best, and 
it's this process that brings the machine to the table begging for scraps 
like a puppy.  Tar by itself can be felt but isn't bad.

Even without the --best switch in effect, I'm sure you'll see the machine 
slow considerably.

Please don't try to call amanda an unusual load as amanda itself is 
nothing but an intelligent manager, constructing the command lines passed 
to tar or dump, and gzip, which do the real work.  Amdump, the manager my 
scripts wrap around, and my scripts themselves, will not use more 
than .01% of the cpu when averaged over the whole backup session.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
We are Microsoft.  What you are experiencing is not a problem; it is an 
undocumented feature.
-

From: Mike Galbraith
Date: Sunday, April 8, 2007 - 9:17 pm

So tar -cvf - / | gzip --best | tar -tvzf - should reproduce the
problem?

	-Mike

-

From: Gene Heskett
Date: Sunday, April 8, 2007 - 10:16 pm

That looks as if it should demo it pretty well if I understand correctly 
everything you're doing there.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
In /users3 did Kubla Kahn
A stately pleasure dome decree,
Where /bin, the sacred river ran
Through Test Suites measureless to Man
Down to a sunless C.
-

From: Mike Galbraith
Date: Sunday, April 8, 2007 - 11:06 pm

Well, I let it process my ~250GB of data with my current tree, and it
looked utterly harmless (and since I'm running SMP, was of course).
I'll try building UP to make sure, and check mainline as well.

	-Mike

-

From: Mike Galbraith
Date: Monday, April 9, 2007 - 1:24 am

Ok, I can't reproduce any bad interactivity here with that workload
either with SMP or UP kernel.  That said however, gzip does attain
interactive status, which it really should not - that gives it an unfair
advantage over its peers.

With my throttled tree, it gets pushed back down to where it belongs.
I'm going to try to tighten the tolerance on behavior to evict the
riffraff who don't really belong in the elite interactive club sooner,
and guarantee that even fast/light tasks can't dominate the CPU without
paying heavily.

(to close the many fast/light tasks wakeup scenario that the "untested"
patch someone mentioned did, but was shown to be too painful to bear).

	-Mike

-

From: Ingo Molnar
Date: Sunday, April 8, 2007 - 3:58 am

and note that a year ago Mike did a larger patch too, not unlike his 
current patch - but we hoped that his smaller change would be sufficient 
- and nobody came along and said "i tested Mike's and the difference is 
significant on my system". Which seems to suggest that the number of 
problem-systems and worried users/developers isnt particularly large.

	Ingo
-

From: Gene Heskett
Date: Sunday, April 8, 2007 - 10:04 am

May I suggest that while it may have been noticeable, it was 
not 'significant', so we didn't sing praises and bow to Mecca at the 
time.  I just thought that this was the way it was, till Con's patch proved 
otherwise for this 'desktop' user.  We were then, and still are, looking 
for the magic that lets it all load up and slow down in a linear feeling 
fashion.  Only those IRQs that are fleeting and need servicing NOW should 
be exceptions to that rule.  AFAIAC, gzip can take its turn in the queue, 
getting no more time in proportion than any other process that wakes up 
in its slice and finds it has something to do.  If it has nothing to do it 
should yield the floor immediately, and in any event be put back at the 
far end of the queue when its timeslice is over.  gzip in particular seems 
very reluctant to give up the cpu at what should be the end of its 
timeslice.  As it is, the IRQs are being serviced, so no keystrokes are 
being lost, or very few, unlike the situation 2 years ago when whole 
sentences typed blind were on the missing list when X finally did get a 
chance to play catchup.

As a desktop user, I fail to understand any good reason why a keystroke 
typed can't be echoed to the screen within 200 milliseconds regardless of 
how many gzip --best's amdump may be running in the background.
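
A 200 millisecond budget like this can be checked with a simple wakeup-latency probe run alongside the background load. The Python sketch below is only a proxy: it measures how late a sleeping task is woken, not the full keystroke-to-screen path through X, and the function names and the default budget parameter are assumptions for illustration.

```python
import time


def wakeup_latencies(interval, samples):
    """Sleep `interval` seconds `samples` times; return each overshoot."""
    overshoots = []
    for _ in range(samples):
        start = time.monotonic()
        time.sleep(interval)
        elapsed = time.monotonic() - start
        # Overshoot is how much later than requested we got the CPU back.
        overshoots.append(max(0.0, elapsed - interval))
    return overshoots


def within_budget(overshoots, budget=0.2):
    """Return (ok, worst): ok is True if no wakeup was later than `budget`."""
    worst = max(overshoots)
    return worst <= budget, worst
```

Sampling with something like wakeup_latencies(0.05, 600) while the compression jobs run, and then passing the result to within_budget, would turn "I lose control for seconds at a time" into a number that can be compared between kernels.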

I have a coco3, running nitros9 at a cpu clock rate of 1.79 MHz with a 
1/10th second context switch, in the basement that CAN do that while 
assembling an executable with a separate process printing the listing of 
that assembly as it progresses.


Again, may I suggest that this sort of behavior on the desktop is a 



-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The meek will inherit the earth -- if that's OK with you.
-

From: Mike Galbraith
Date: Sunday, April 8, 2007 - 9:03 pm

Actually, there was practically nil interest in testing.  We made a
couple of minor adjustments to the interactivity logic, and all went
quiet, so I didn't think it was enough of a problem to require more
intrusive countermeasures.

	-Mike

-

From: Gene Heskett
Date: Sunday, April 8, 2007 - 9:08 pm

Does one of these messages have a url so I can test the latest of your 
patches for -rc6?  Or was the one Ingo sent the most recent?

Putting that url in your sig would be nice, and might result in its 
getting a lot more exercise which should = more feedback.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Got a complaint about the Internal Revenue Service?  
Call the convenient toll-free "IRS Taxpayer Complaint Hot Line Number":

	1-800-AUDITME
-

From: Mike Galbraith
Date: Sunday, April 8, 2007 - 10:59 pm

No, my tree has a bugfix and some other adjustments that try to move the

When I get it cleaned up and better tested, I'll post again.  If you
want, I'll CC you... willing victims are a highly valued commodity :)

	-Mike

-

From: Gene Heskett
Date: Monday, April 9, 2007 - 6:01 am

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
The box said "Requires Windows 95 or better."  I can't understand    
why it won't work on my Linux computer.
-

From: Rene Herman
Date: Sunday, April 8, 2007 - 11:51 am

Ah yes, that one. Here's the next one in that series:

commit f1adad78dd2fc8edaa513e0bde92b4c64340245c
Author: Linus Torvalds <torvalds@g5.osdl.org>
Date:   Sun May 21 18:54:09 2006 -0700

     Revert "[PATCH] sched: fix interactive task starvation"

It personally had me wonder if _anyone_ was testing this stuff...

Rene.

-

From: Mike Galbraith
Date: Sunday, April 8, 2007 - 9:23 pm

Well of course not.  Making random untested changes, and reverting them
later is half the fun of kernel development.

	-Mike

-

From: Rene Herman
Date: Monday, April 9, 2007 - 5:14 am

The point of course is that the very change Molnar cited as an example 
of responsible, testcase-driven development was in fact hugely broken 
and sat in the tree that way for 4 rc's.

To me, the example rather serves as confirmation of what Kolivas has 
been saying; endlessly tweaking the tweaks isn't going anywhere. The 
minute you tweak A, tweak B over there in corner C-Sharp falls flat on 
its face.

Computers are horribly stupid and tend to fail in most situations their 
smart human programmers didn't specifically tell them about. If, as in 
the case of a scheduler, the real-world demands on a piece of software 
are so diverse that you cannot tell them about all possible situations 
specifically, the only workable solution is to make them _predictable_ 
so that when hitting one of those special situations, the smart human 
using the computer at least gets to know how to intervene if he feels 
inclined to do so.

This turned into an interactivity thing, and while interactivity is in 
fact better for a large majority of testers, that isn't what Kolivas' 
scheduler is about. It's about predictability and leaving the dead-end 
road of these endless tweaks, which then break previous tweaks, rinse, 
repeat.

It's unfortunate that Kolivas is having health problems currently, but I 
certainly do hope that his scheduler finds its way into _a_ -rc1. He 
said it was done...

Rene.

-

From: Mike Galbraith
Date: Monday, April 9, 2007 - 10:10 am

To me, it's more than an interactivity thing.  It is also about reacting

Well, there I disagree with him quite strongly, but it's not my decision
what gets integrated into any tree but my own ;-)

	-Mike

-

From: Andreas Mohr
Date: Monday, April 9, 2007 - 6:27 am

Hi,


The whole recent discussion/flamefest/... here makes me think that we're
still heading towards actually introducing plugsched (most preferably
by making mainline scheduler the builtin default and optionally building
a plugsched kernel which then allows selection).
There are fundamental behavioural differences between the various
CPU scheduler types developed; while some people want a very interactive
system with in most(!) cases good latency and exploit-less operation,
several others want a scheduler which provides very predictable latency,
low overhead and additionally as much interactivity as this strict
model can provide for. And then there are people who have very specific
SMP requirements which both characteristic scheduler types may have trouble
satisfying properly.

And I really don't see much difference whatsoever to the I/O scheduler
area: some people want predictable latency, while others want maximum
throughput or fastest operation for seek-less flash devices (noop).
Hardware varies similarly greatly as well:
Some people have huge disk arrays or NAS, others have a single flash disk.
Some people have a decaying UP machine, others have huge SMP farms.

IMHO both areas are too varied, thus runtime or compile-time selection
is justified for both areas, not simply for I/O schedulers only.
I don't think anybody would want to introduce new very similar scheduler types
just for the fun of it; development would center around improving the at
most 3 or 4 different scheduler implementations (as is the case with I/O
schedulers, BTW: there hasn't been an explosion of different variants
either!).

I think the whole discussion went on the wrong track when people somehow
had the notion of making RSDL (and its later variants) the main scheduler
for desktop machines, not just server operation. And this target of course
(and rightfully so) prompted people to ask for interactivity similar
to what the current scheduler achieves which RSDL cannot fully provide
within its ...
From: Rene Herman
Date: Monday, April 9, 2007 - 12:54 pm

I do agree, and yes, I/O scheduling seems to not have suffered from the 
choice although I must say I'm not sure how much use each I/O scheduler 
individually sees.

If one CPU scheduler can be good enough then it would be better to just 
have that one, but well, yes, maybe it can't. I certainly believe any 
one scheduler can't avoid breaking down under some condition. Demand is 
just too varied.

I find it interesting that you see SD as a server scheduler and I guess 
deterministic behaviour does point in that direction somewhat. I would 
be enabling it on the desktop though, which probably is _some_ argument 
for having multiple schedulers.

Rene.

-

From: Ingo Molnar
Date: Monday, April 9, 2007 - 7:15 am

but ... SD clearly regresses in some areas, so by that logic SD isnt 
going anywhere either?

note that i still like the basic idea about SD, that it is an experiment 
to see whether, if the only conceptual focus is on "scheduling fairness", 
we'll get a better scheduler. But for that to work out two things have to 
be done i think:

 - the code actually has to match that stated goal. Right now it
   diverges from it (it is not a "fair" scheduler), and it's not clear
   why.

note that SD at the moment produces ~10% more code in sched.o, and the 
reason is that SD is more complex than the vanilla scheduler. People 
tend to get the impression that SD is simpler, partly because it is a 
net linecount win in sched.c, but many of the removed lines are 
comments.

this "provide fairness" goal is quite important, because if SD's code is 
not only about providing fairness, what is the rest of the logic doing? 
Are they "tweaks", to achieve interactivity? If yes, why are they not 
marked as such? I.e. will we go down the _same_ road again, but this 
time with a much less clearly defined rule for what a "tweak" is?

note that under the interactivity estimator it is not that hard to 
achieve forced "fairness".

So _if_ we accept that scheduling must include a fair dose of heuristics 
(which i tend to think it has to), we are perhaps better off with an 
interactivity design that _accepts_ this fundamental fact and separates 
heuristics from core scheduling. Right now i dont see the SD proponents 
even _accepting_ that even the current SD code does include heuristics.

the other one is:

 - the code has to demonstrate that it can flexibly react to various 
   complaints of regressions.

(I identified a few problem workloads that we tend to care about and i 
havent seen much progress with them - but i really reserve judgement 
about that, given Con's medical condition.)

	Ingo
-

From: Rene Herman
Date: Monday, April 9, 2007 - 10:05 am

No. The logic isn't that (performance and other) characteristics must 
always be exactly the same between two schedulers, the logic is that 
having one of them turn into a contrived heap of heuristics where every 
progression on one front turns into a regression on another means that 
one is on a dead-end road.

Now of course, while not needing to behave the same in all conceivable 
situations, any alternative like SD needs to behave _well_ and for me, 

I read most of the discussion centering around that specific point as 
well, and frankly, I mostly came away from it thinking "so what?". It 
seems this is largely an issue of you and Kolivas disagreeing on what 
needs to be called design and what needs to be called implementation, 
but more importantly I feel a solution is to just shy away from the 
inherently subjective word "fair". If you feel that some of the things 
SD does need to be called "unfair" as much as mainline, so be it, but do 
you think that SD is less _predictably_ fair or unfair than mainline?

This is what I consider to be very important; if my retarded kid brother 
sometimes walks left and sometimes right when I tell him to walk forward, 
I can't go stand to the right and say "nono, forward I said". If on the 
right there's a highway, you can imagine what that means... All software 

One answer to that is that it's much less important what a tweak is as 
long as it's the same always. If I then don't like the definition I'll 
just define it the other way around privately and be done with it. I do 
believe that SDs objective is not fairness as such, it's predictability. 
Being "fair" was postulated as a condition for being so, but let's not 
put too much focus on that one point; it's a matter of definitions (and 

I agree that the demands on a (one) general purpose scheduler are so 
diverse that it's impossible to have one that doesn't break down under 
some set of conditions. The mainline scheduler does so, and SD does so. 
What SD does is take some of the ...
From: Ingo Molnar
Date: Monday, April 9, 2007 - 10:48 am

it's important due to what Mike mentioned in the previous mail too: SD 
seems to be quite rigid in certain aspects. So if we end up with that 
fundamental rigidity we might as well be _very_ sure that it makes 
sense. Because otherwise there might be no other way out but to "revert 
the whole thing again". Today we always have the "tweak the 
interactivity estimator" route, because that code is not rigid at the 

that's not what i found when testing Mike's latest patches - they 
visibly improved those testcases, part of which were written to 
"exploit" heuristics, without regressing others. Several people reported 
improvements with those patches.

Why was that possible without spending years on writing a new scheduler? 
Because the interactivity estimator is fundamentally _tweakable_. What 
you flag with sometimes derogative sentences as a weakness of the 
interactivity estimator is also its strength: tweakability is 
flexibility. And no, despite what you claim to be a "patchwork" it makes 
quite some sense: reward certain scheduling behavior and punish other 
types of behavior. That's what SD does too in the end. Sure, if your 
"reward" fights against the "punishment", they cancel out each other, or 
if the metrics used are just arbitrary and make no independent sense 
it's bad, but that's just plain bad engineering.

Why didnt much happen in the past year or so? Frankly, due to lack of 
demand for change - because most people were just happy about it, or 
just not upset enough. And i know the types of complaints first-hand, 
the -rt tree is a _direct answer_ to desktop-space complaints of Linux 
and it includes a fair bit of scheduler changes too. Now that we have 
actual new testcases and people with complaints and their willingness to 

i didn't say that, in fact my first comment about RSDL on lkml was 
the exact opposite, but you SD advocates are _still_ bickering about 
(and not accepting) fundamental things like Mike's make -j5 workload and 
flagging it as ...
From: Gene Heskett
Date: Monday, April 9, 2007 - 12:56 pm

Mike's -j5 workload is, AFAIAC, a very realistic workload for building a 
kernel.  My own script, I just discovered, was using -j8, and that was 
noticeable, but by no means a killing hit on my poor old Xp2800 Athlon.  
I pulled it back to 4 for this morning's build and the hit, while less, is 
still noticeable.  Killer hit?  No way.  Using Mike's v4 patch I think it 



-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
When you jump for joy, beware that no-one moves the ground from beneath
your feet.
		-- Stanislaw Lem, "Unkempt Thoughts"
-

From: Rene Herman
Date: Monday, April 9, 2007 - 12:09 pm

I suppose I'm lumped in with the "SD advocates" now but you will note 
that I haven't been bickering about make -j5 loads. You cut away the 
entire meat of my reply which was all that predictability harping.

What I did say about make -j5 loads is that I do not think that they, 
under all circumstances, on all machines and at all cost, need to 
perform the same as currently if other situations improve. Do I want 
heuristics? Sure, I'm just saying the kernel is fundamentally incapable 
of getting it right all of the time and as such it should provide me 
with as many opportunities as possible at stepping in. That is, let me 
understand what it is and is going to be doing and then listen to me.

I agree not a lot of progress is to be made if people keep ignoring each 
other like that but also while SD's author is offline. Let's just shelve 
it until he's back. Not bury though...

Rene.

-

From: Ingo Molnar
Date: Monday, April 9, 2007 - 6:53 am

yes - in hindsight i regret having asked Mike for a "simpler" patch, 
which turned out to be rushed and plain broke your setup: my bad. And i 
completely forgot about that episode, Mike did a stream of changes in 

yes, i certainly tried it and it broke nothing, and it was in fact acked 


so reverting it was justified. Basically, the approach was that the 
vanilla scheduler is working reasonably well, and that any improvement 
to it must not cause regression in areas where it already works well. 
(it obviously must have been working on your audio setup to a certain 
degree if reverting Mike's patch made the underruns go away)

In any case, it would be very nice if you could try Mike's latest patch, 
how does it work on your setup? (i've attached it)

	Ingo
From: Rene Herman
Date: Monday, April 9, 2007 - 8:37 am

Can do. Note that "my setup" in that case consisted of browsing around 
eBay in firefox with ogg123 playing audio directly to ALSA in an xterm 
as the only other thing running. That is, just about as basic a Linux 
desktop as imaginable.

Testing Mike's latest will have to wait a bit though; I'm currently 
testing the latest incarnation of SD (against 2.6.20.6). For people 
who've lost track of what and where, it's available as:

http://ck.kolivas.org/patches/staircase-deadline/2.6.20.5-sd-0.39.patch

and versus 2.6.21-rc5 as:

http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc5-sd-0.39.patch

For the moment it is giving me a snappy feeling desktop on this Duron 
1300, with ogg123 playing in an xterm without audio underruns, with a 
make -j2 kernel compile running (not niced) and me browsing around in 
firefox.

Mike's latest would probably also support this load without much problem. 
Given that I feel the basic idea of SD is better than mainline though, 
I'll be concentrating on using SD for a bit for now.

Rene.

-

From: Ed Tomlinson
Date: Sunday, April 8, 2007 - 6:08 am

Hi,

I am one of those who have been happily testing Con's patches.  

They work better than mainline here.

There seems to be a disconnect on what Con is trying to achieve with SD.
They do not improve interactivity per se.  Instead they make the scheduler 
predictable by removing the alchemy used by the interactivity estimator.   
Mike's patches may be better alchemy but they continue down the same 
path - from prior experience, we can say with fairly good confidence that
there will be new corner cases that trigger problems.

With SD, if you ask too much of the machine it slows down.  You can fix this,
if required, by renicing some tasks - or by reducing the load on the box.

If one really needs some sort of interactivity booster (I do not with SD), why
not move it into user space?  With SD it would be simple enough to export
some info on estimated latency.  With this user space could make a good
attempt to keep latency within bounds for a set of tasks just by renicing.... 

Thanks
Ed Tomlinson

PS.  Get well soon Con.

-

From: Mike Galbraith
Date: Sunday, April 8, 2007 - 10:38 pm

(I tried a UP kernel yesterday, and even a single kernel build would

I don't think you can have very much effect on latency using nice with
SD once the CPU is fully utilized.  See below.

/*
 * This contains a bitmap for each dynamic priority level with empty slots
 * for the valid priorities each different nice level can have. It allows
 * us to stagger the slots where differing priorities run in a way that
 * keeps latency differences between different nice levels at a minimum.
 * ie, where 0 means a slot for that priority, priority running from left to
 * right:
 * nice -20 0000000000000000000000000000000000000000
 * nice -10 1001000100100010001001000100010010001000
 * nice   0 0101010101010101010101010101010101010101
 * nice   5 1101011010110101101011010110101101011011
 * nice  10 0110111011011101110110111011101101110111
 * nice  15 0111110111111011111101111101111110111111
 * nice  19 1111111111111111111011111111111111111111
 */

Nice allocates bandwidth, but as long as the CPU is busy, tasks always
proceed downward in priority until they hit the expired array.  That's
the design.  If X gets busy and expires, and a nice 20 CPU hog wakes up
after its previous rotation has ended, but before the current rotation
is ended (ie there is 1 task running at wakeup time), X will take a
guaranteed minimum 160ms latency hit (quite noticeable) independent of
nice level.  The only way to avoid it is to use a realtime class.
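To make the "nice allocates bandwidth" point concrete, the rows of the bitmap quoted above can be counted directly. The following is only an illustrative user-space sketch, not code from the SD patch: the names nice_row, prio_rows and usable_slots are made up here, and the only thing taken from the source is the quoted comment, including its statement that a 0 marks a slot the nice level may use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row of the quoted bitmap; '0' marks a slot this nice level may use. */
struct nice_row {
	int nice;
	const char *slots;
};

/* Rows copied verbatim from the comment quoted in the mail above. */
const struct nice_row prio_rows[] = {
	{ -20, "0000000000000000000000000000000000000000" },
	{ -10, "1001000100100010001001000100010010001000" },
	{   0, "0101010101010101010101010101010101010101" },
	{   5, "1101011010110101101011010110101101011011" },
	{  10, "0110111011011101110110111011101101110111" },
	{  15, "0111110111111011111101111101111110111111" },
	{  19, "1111111111111111111011111111111111111111" },
};

#define NR_ROWS (sizeof(prio_rows) / sizeof(prio_rows[0]))

/* Count the '0' characters, i.e. the slots usable in one rotation. */
size_t usable_slots(const char *slots)
{
	size_t n = 0;

	assert(slots != NULL);
	for (; *slots; slots++)
		if (*slots == '0')
			n++;
	return n;
}
```

Calling usable_slots() on each row shows the share shrinking as nice rises: nice -20 may use every slot, nice 0 every second one, and nice 19 a single slot per rotation. The count says nothing about when in the rotation a task runs, which is the latency question raised here.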

A nice -20 task has maximum bandwidth allocated, but that also makes it
a bigger target for preemption from tasks at all nice levels as it
proceeds downward toward expiration.  AFAIKT, low latency scheduling
just isn't possible once the CPU becomes 100% utilized, but it is
bounded to runqueue length.  In mainline OTOH, a nice -20 task will
always preempt a nice 0 task, giving it instant gratification, and
latency of lower priority tasks is bounded by the EXPIRED_STARVING(rq)
safety net.
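For readers unfamiliar with the safety net mentioned above, here is a simplified user-space model of the kind of check it performs. It is written from memory of the 2.6-era mainline scheduler and is an assumption-laden sketch, not the kernel's code: struct rq_model, its field names and the starvation_limit parameter are stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for the runqueue fields involved. */
struct rq_model {
	unsigned long expired_timestamp; /* tick when the first task expired, 0 if none */
	unsigned long nr_running;        /* runnable tasks on this queue */
	int curr_static_prio;            /* static priority of the running task */
	int best_expired_prio;           /* best static priority waiting in expired */
};

/*
 * Return true when the expired array should be considered starved:
 * either expired tasks have waited longer than a limit that scales with
 * the number of runnable tasks, or a better-priority task is waiting
 * there (a lower number means a better priority).
 */
bool expired_starving(const struct rq_model *rq, unsigned long now,
		      unsigned long starvation_limit)
{
	assert(rq != NULL);
	if (starvation_limit && rq->expired_timestamp &&
	    now - rq->expired_timestamp >=
		    starvation_limit * rq->nr_running + 1)
		return true;
	return rq->curr_static_prio > rq->best_expired_prio;
}
```

In this model the wait bound grows with nr_running, so the latency of lower priority tasks is bounded but load dependent.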

	-Mike

-

From: Mike Galbraith
Date: Monday, April 9, 2007 - 7:39 pm

There's another aspect of this that may require some thought - kernel
threads.  As load increases, so does rotation length.  Would you really
want CPU hogs routinely preempting house-keepers under load?

	-Mike

-

From: Ed Tomlinson
Date: Tuesday, April 10, 2007 - 4:23 am

SD has a schedule batch nice level.  This is good for tasks that want lots
of cpu when they can get it.  If you overload your cpu I expect the box
to slow down - including kernel threads.  If really required they can be
started with a higher priority...

Ed
-

From: Mike Galbraith
Date: Tuesday, April 10, 2007 - 5:04 am

Sure.  Anything that is latency sensitive, and those kernel threads that
are necessary for system function can be made RT to bypass the designed
in latency.  It's just another thing that should be considered before
integration.  Now if burst loads (only one of which is the desktop)
would just cease to exist...

	-Mike

-

From: Ed Tomlinson
Date: Monday, April 9, 2007 - 4:26 am

Interesting.  I run UP amd64, 1000HZ, 1.25G, preempt off (on causes kernel 
stalls with no messages - but that is another story).  I do not notice a single 
make.   When several are running the desktop slows down a bit.  I do not have 
X niced.  Wonder why we see such different results? 

I am not saying that SD is perfect - I fully expect that more bugs will turn up
in its code (some will affect mainline too).  I do however like the idea of a 
scheduler that does not need alchemy to achieve good results.  Nor do I
necessarily expect it to be 100% transparent.  If one changes something
as basic as the scheduler some tweaking should be expected.  IMO this

Mike, I made no mention of low latency.  I did mention predictable latency.  If
you are 100% utilized, and have a nice -20 task cpu hog, I would expect it to run 
and that it _should_ affect other tasks - that's why it runs with -20...

This is why I suggest that user space may be a better place to boost interactive
tasks.  A daemon that posted a message telling me that the nice -20 cpu hog
is causing 300ms delays for X would, IMHO, be a good thing.  That same daemon
could then propose a fix telling me the expected latencies and let me decide if 
I want to change priorities.  It could also be set to automatically adjust nice levels...
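A rough sketch of the policy such a daemon might apply is given below. Everything about it is hypothetical: SD does not export a latency estimate in the patches discussed here, so measured_ms stands for whatever figure such an interface would provide, and the names propose_nice and apply_nice and the one-step-per-multiple rule are invented for illustration. Only setpriority(2) is a real interface.

```c
#include <assert.h>
#include <sys/resource.h>
#include <sys/types.h>

#define NICE_MIN (-20)
#define NICE_MAX 19

/*
 * Propose a nice value for a latency-sensitive task.
 * measured_ms: latency reported by the (hypothetical) scheduler export.
 * bound_ms:    latency the user is willing to tolerate.
 * Policy (an assumption): one nice step per whole multiple of bound_ms.
 */
int propose_nice(int cur_nice, unsigned int measured_ms, unsigned int bound_ms)
{
	int next;

	assert(bound_ms > 0);
	if (measured_ms <= bound_ms)
		return cur_nice;        /* within bounds, leave it alone */
	next = cur_nice - (int)(measured_ms / bound_ms);
	if (next < NICE_MIN)
		next = NICE_MIN;
	if (next > NICE_MAX)
		next = NICE_MAX;
	return next;
}

/* Apply the proposal; lowering a nice value normally needs privilege. */
int apply_nice(pid_t pid, int nice_val)
{
	return setpriority(PRIO_PROCESS, (id_t)pid, nice_val);
}
```

With the 300ms example from the mail and an assumed 100ms bound, propose_nice(0, 300, 100) suggests nice -3 for X. Whether to call apply_nice() automatically or only report the proposal is the choice left to the user here.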

Thanks
Ed
-

From: Mike Galbraith
Date: Monday, April 9, 2007 - 9:50 am

Probably because with your processor, in general cc1 can get the job
done faster, as can X.  The big latency hit happens when you hit the end
of the rotation.  You simply don't hit it as often as I do.  Anyone with
an old PIII box should hit the wall very quickly indeed.  I haven't had


You did say that Con's patch works better than mainline, and you seemed
very much to be talking about the desktop.  X very definitely is a
latency sensitive application, and often a CPU hog to boot.  The point I
illustrated above is a salient point.

If you don't want to hear about anything other than this idea about
I did above, that we are absolutely _going_ to take a 160ms + remaining
task ticks latency hit.

Nice -20 was used only to show clearly what SD trades away, and it's not
only the desktop it's trading for mundane latency, it's trading any
possibility of low latency, and dismissing burst loads as if they don't
even exist.  The current scheduler is dynamic.  SD is utterly rigid.

Apply what I wrote to X at the recommended nice -10.  It makes no
difference what bandwidth you allocate if the latency sensitive
application _will_ take a very major latency hit if it uses it.  X does

Re-read what I wrote.  You simply can't get there from here, by design.

If I'm wrong, someone please show me where.

	-Mike

-

From: Martin Steigerwald
Date: Sunday, April 22, 2007 - 3:48 am

Hi!

I am running 2.6.20.7 + sd-0.44 on an IBM ThinkPad T23 that I use as my 
Amarok machine[1]. It  has a Pentium 3 with 1.13 GHz using ondemand 
frequency scaling and XFS as filesystem.

So far music playback has been perfect even when I had it building kernel 
packages while wildly clicking around starting apps and then moving the 
Amarok window like mad while solid window moving is enabled. Amarok / 
xine continued to play the music totally unimpressed by that.

So for me, from the point of view of a user who wants good music playback *no 
matter what*, this is already perfect. Also the desktop feels quite 
snappy to me. It was only slow on anything I/O bound but that's 
understandable IMHO when make-kpkg tar -bzips the kernel source while 20 
KDE applications are starting and Amarok plays music.

Should I try any specific tests? This also goes out to anybody else, 
especially to you, Con.  So if you want me to run some benchmarks, please 
tell me. I am not experienced in benchmarking, but if you tell me what to 
do, I can try it out. I prefer benchmarks that do not disrupt music 
playback, but can run more aggressive benchmarks over night. I think it 
might be good to use a benchmark that isn't I/O bound to really test the 
scheduler... but as said I am no expert on that and real life loads 
usually are I/O bound as well.

Have to keep a careful eye on the hard disk though...

Apr 22 11:51:06 deepdance smartd[3116]: Device: /dev/sda, SMART Prefailure 
Attribute: 3 Spin_Up_Time changed from 154 to 150

(well threshold is at 033, so still plenty to go, hope it will take some 
time till the next change)

[1] http://martin-steigerwald.de/amarok-machine/ ;)

Regards,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7
-

From: Con Kolivas
Date: Sunday, April 22, 2007 - 4:15 am

Thanks for the report. In your case, you've done the testing I require; that 
for your workloads everything works as you'd desire it without obvious 
problems. Just keeping an eye on newer versions if you have the time and 
inclination and making sure that everything stays as you expect it would be 
the most helpful thing you can do.

Thanks!

-- 
-ck
-

From: Prakash Punnoor
Date: Wednesday, March 28, 2007 - 10:34 am

Hi, I am using 2.6.21-rc5 with rsdl 0.37 and think I still see a regression 
with my Athlon X2. Namely using this ac3 encoder 
(http://aften.sourceforge.net/), which I parallelized in a simple way, with 
my test sample I remember having encoding times of ~5.4sec with vanilla and 
~5.8 sec with rsdl - once the whole test wave is in cache. Otherwise you can 
easily I/O limit the encoder. ;-) You need to get sources from svn though. 
The current 0.06 release doesn't have threads support.
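Taking the approximate timings quoted above at face value, the slowdown is (5.8 - 5.4) / 5.4, or roughly 7.4%. A small helper for that arithmetic, with a function name chosen here purely for illustration, could look like this:

```c
#include <assert.h>

/* Relative slowdown of candidate_s versus baseline_s, in percent. */
double regression_percent(double baseline_s, double candidate_s)
{
	assert(baseline_s > 0.0);
	return (candidate_s - baseline_s) / baseline_s * 100.0;
}
```

The input figures are remembered approximations from a single test sample, so the result is best read as an order of magnitude rather than a measurement.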

Cheers,

-- 
(°=                 =°)
//\ Prakash Punnoor /\\
V_/                 \_V
From: Prakash Punnoor
Date: Saturday, March 31, 2007 - 11:40 pm

BTW, I confirmed this regression. With vanilla 2.6.21-rc5 I get back my 5.4 
secs with the test sample and two threads. Furthermore, for me vanilla 
actually feels nicer on my dual core, even with load - just subjectively; 
that's why I ditched rsdl...

Cheers,
-- 
(°=                 =°)
//\ Prakash Punnoor /\\
V_/                 \_V
From: Con Kolivas
Date: Wednesday, March 28, 2007 - 11:36 pm

My neck condition got a lot worse today. I'm forced offline for a week and 
will be uncontactable.

-- 
-ck
-

From: Andrew Morton
Date: Monday, April 23, 2007 - 1:58 am

OK, this is bizarre.  I'm getting this:

[   52.754522] RTNL: assertion failed at net/ipv4/devinet.c (1055)
[   52.758258]  [<c02cb6f7>] inetdev_event+0x46/0x2d8
[   52.762041]  [<c01049c9>] show_trace_log_lvl+0x28/0x2c
[   52.765887]  [<c0105482>] show_trace+0xf/0x13
[   52.769627]  [<c01054d7>] dump_stack+0x14/0x18
[   52.773320]  [<c029b22e>] rtnl_unlock+0xd/0x2f
[   52.776999]  [<c029f410>] fib_rules_event+0x3a/0xeb
[   52.780678]  [<c01236aa>] notifier_call_chain+0x2c/0x55
[   52.784339]  [<c012371a>] raw_notifier_call_chain+0x17/0x1b
[   52.787975]  [<c0295984>] dev_open+0x63/0x6b
[   52.791587]  [<c02944fd>] dev_change_flags+0x50/0x104
[   52.795201]  [<c02cbcf4>] devinet_ioctl+0x259/0x57b
[   52.798798]  [<c02955b2>] dev_ifsioc+0x113/0x3a0
[   52.802408]  [<c028b127>] sock_ioctl+0x1a1/0x1c4
[   52.805966]  [<c028af86>] sock_ioctl+0x0/0x1c4
[   52.809475]  [<c0165969>] do_ioctl+0x19/0x4d
[   52.812977]  [<c0165b99>] vfs_ioctl+0x1fc/0x216
[   52.816478]  [<c0165bff>] sys_ioctl+0x4c/0x65
[   52.819944]  [<c0103b68>] syscall_call+0x7/0xb
[   52.823395]  =======================
[   52.826923] RTNL: assertion failed at net/ipv4/igmp.c (1358)
[   52.830485]  [<c02cf545>] ip_mc_up+0x35/0x59
[   52.834034]  [<c029b22e>] rtnl_unlock+0xd/0x2f
[   52.837569]  [<c02cb7ed>] inetdev_event+0x13c/0x2d8
[   52.841123]  [<c01049c9>] show_trace_log_lvl+0x28/0x2c
[   52.844682]  [<c0105482>] show_trace+0xf/0x13
[   52.848227]  [<c01054d7>] dump_stack+0x14/0x18
[   52.851752]  [<c029b22e>] rtnl_unlock+0xd/0x2f
[   52.855242]  [<c029f410>] fib_rules_event+0x3a/0xeb
[   52.858734]  [<c01236aa>] notifier_call_chain+0x2c/0x55
[   52.862241]  [<c012371a>] raw_notifier_call_chain+0x17/0x1b
[   52.865759]  [<c0295984>] dev_open+0x63/0x6b
[   52.869191]  [<c02944fd>] dev_change_flags+0x50/0x104
[   52.872571]  [<c02cbcf4>] devinet_ioctl+0x259/0x57b
[   52.875998]  [<c02955b2>] dev_ifsioc+0x113/0x3a0
[   52.879399]  [<c028b127>] sock_ioctl+0x1a1/0x1c4
[   52.882741]  [<c028af86>] ...