On Tue, 2010-04-06 at 18:42 +0530, Suresh Jayaraman wrote:
Why bother using -stable for reporting bugs?
Right, except it's not a severe imbalance as the subject suggests. For
some reason it seems to end up in a semi-stable state that is actually
quite balanced.
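# 8 busy loops at the default nice level 0
for ((i=0; i<8; i++)); do while :; do :; done & done
# 3 busy loops, each reniced to -15 right after it is started
for ((i=0; i<3; i++)); do while :; do :; done & renice -n -15 -p $!; done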
gets me:
Cpu0 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4 : 99.0%us, 1.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 :100.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 16440840k total, 1073672k used, 15367168k free, 105844k buffers
Swap: 16777212k total, 0k used, 16777212k free, 296504k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4370 root 5 -15 105m 804 304 R 100.1 0.0 0:45.02 bash
4374 root 5 -15 105m 804 304 R 100.1 0.0 0:44.95 bash
4372 root 5 -15 105m 804 304 R 99.1 0.0 0:45.00 bash
4364 root 20 0 105m 804 304 R 51.0 0.0 0:33.06 bash
4362 root 20 0 105m 800 300 R 50.0 0.0 0:33.17 bash
4365 root 20 0 105m 804 304 R 50.0 0.0 0:33.75 bash
4368 root 20 0 105m 804 304 R 50.0 0.0 0:33.32 bash
4369 root 20 0 105m 804 304 R 50.0 0.0 0:33.38 bash
4363 root 20 0 105m 804 304 R 49.1 0.0 0:33.65 bash
4366 root 20 0 105m 804 304 R 49.1 0.0 0:33.29 bash
4367 root 20 0 105m 804 304 R 49.1 0.0 0:33.54 bash
So we have the 3 -15 loops on a cpu each, the 8 nice-0 loops running two
to a cpu on 4 more cpus (hence ~50% each), and 1 cpu idle. That is
actually quite balanced; 'better' would be if those nice-0 loops would
rotate over the 5 available cpus, but that would also thrash more caches
I guess.
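
A rough sketch of the arithmetic behind that, in case it helps. The two load weights are an assumption on my part (the values I recall from the kernel's nice-to-weight table: 1024 for nice 0, 29154 for nice -15), and the variable names are made up for illustration; the task and cpu counts are the ones from the test above.

```shell
#!/bin/bash
# Assumption: CFS load weights as in the kernel's nice-to-weight table
# (nice 0 -> 1024, nice -15 -> 29154). Check these against your tree.
w_nice0=1024
w_nice_m15=29154

ncpus=8
n_heavy=3    # loops reniced to -15
n_light=8    # loops left at nice 0

# Total weight, and the weight an even split would put on each cpu.
total=$(( n_heavy * w_nice_m15 + n_light * w_nice0 ))
avg=$(( total / ncpus ))

# A single -15 loop already outweighs the per-cpu average, so each one
# ends up with a cpu to itself; the nice-0 loops share what is left.
cpus_left=$(( ncpus - n_heavy ))

# Shares in tenths of a percent, to stay within integer arithmetic.
ideal=$(( cpus_left * 1000 / n_light ))   # nice-0 loops rotating over all remaining cpus
observed=$(( 1000 / 2 ))                  # two nice-0 loops sharing one cpu

echo "total weight $total, per-cpu average $avg"
echo "ideal nice-0 share:    $(( ideal / 10 )).$(( ideal % 10 ))%"
echo "observed nice-0 share: $(( observed / 10 )).$(( observed % 10 ))%"
```

So under those assumed weights the gap is between roughly 62.5% and 50% per nice-0 loop, which is the cost of the one idle cpu.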
I'm not quite sure what makes the load-balancer end up in this situation,
but I suspect the various imbalance_pct settings might have something to
do with it.
It doesn't always end up in this state either. If you only start 2 -15
loops it's a roll of the dice what happens: sometimes it ends up with
the 6 cpus cycling the 2 extra tasks around, sometimes it's 1 cpu idle
with 1 task being cycled.
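
One way to watch which of these outcomes you got is to sample the cpu each loop last ran on. This is only a sketch: `placement` is a hypothetical helper name, and it assumes a procps-style `ps` that supports the `ni` and `psr` output fields.

```shell
#!/bin/bash
# Print "PID NI PSR" (pid, nice level, cpu last run on) for each pid given.
# Assumes procps ps; the trailing '=' suppresses the column headers.
placement() {
    local pid
    for pid in "$@"; do
        ps -o pid=,ni=,psr= -p "$pid"
    done
}
```

Collecting `$!` into an array when starting the loops and calling `placement "${pids[@]}"` once a second should show whether tasks stay put or get cycled between cpus.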
Unexpected, maybe; severe imbalance, no. It would be nice to get a
little more stable behaviour though.