nice and hyperthreading on atom

From: Phil Endecott
Date: Saturday, September 6, 2008 - 8:43 am

Dear Experts,

I have an ASUS Eee with an Atom processor, which has hyperthreading 
enabled.  If I have two processes, one nice and the other normal, they 
each get 50% of the CPU time.  Of course this is what you'd expect if 
the scheduler didn't understand that the two virtual processors are not 
really independent.  I'd like to fix it.

Google finds patches posted by Con Kolivas a looong time ago to address 
this.  Can anyone tell me what has happened in the meantime?  Maybe 
this feature is now in the kernel, but there's something I have to do 
to enable it (e.g. choose the right scheduler).  Or maybe it never made 
it in, for some reason.

Thanks for any suggestions.

Phil.



--

From: Peter Zijlstra
Date: Saturday, September 6, 2008 - 9:31 am

Assuming the hardware makes each 'virtual' cpu get a similar share of
the hardware resources, there is nothing the operating system can do.

The OS just sees two cpus, and it's impossible to schedule two tasks of
different weight on two cpus so that execution time is fair and work is
conserved - this is called an infeasible weight distribution.
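
A minimal sketch of the feasibility condition described above, not taken from the thread. It assumes the usual proportional-share model: each task's ideal share is its weight divided by the total weight, multiplied by the number of cpus, and a single task can never use more than one cpu at a time. The function names and the 9:1 weights are illustrative only.

```python
def fair_shares(weights, ncpus):
    """Ideal share of each task, in units of one cpu, under proportional sharing."""
    total = sum(weights)
    return [ncpus * w / total for w in weights]


def is_feasible(weights, ncpus):
    """A weight distribution is feasible if no task is owed more than one cpu.

    A task runs on at most one cpu at a time, so an ideal share above 1.0
    cannot be delivered without leaving a cpu idle (not work-conserving).
    """
    return all(share <= 1.0 for share in fair_shares(weights, ncpus))


if __name__ == "__main__":
    # Hypothetical weights: a normal task (9) and a niced task (1).
    for ncpus in (1, 2):
        shares = fair_shares([9, 1], ncpus)
        print(ncpus, "cpu(s):", shares, "feasible" if is_feasible([9, 1], ncpus) else "infeasible")
```

With one cpu the 9:1 split can be delivered; with two cpus the heavier task would be owed 1.8 cpus, so the only work-conserving schedule gives each task a full (virtual) cpu.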

So unless the Atom has some interface to influence how the resources are
distributed between the execution contexts (for which Linux currently
lacks any and all support) there is nothing we can do.


--

From: Arjan van de Ven
Date: Saturday, September 6, 2008 - 9:42 am

On Sat, 06 Sep 2008 16:43:31 +0100

but you cannot influence the cpu's scheduling of the instructions.

As an OS one COULD decide to just not schedule the nice task at all,
but then, especially on atom where HT has a high efficiency, your cpu
is mostly idle ...

-- 
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
--

From: Phil Endecott
Date: Saturday, September 6, 2008 - 11:24 am

Here's how I imagine it: say I have one regular task and one "nice -9" 
task.  On a conventional uniprocessor system they would get about 90% 
and 10% of the CPU respectively.  On the hyperthreading system they 
currently get equal shares; except that the CPU is more efficient with 
two threads running, so you could perhaps say that they get 60% each or 
something like that.  But 60% is still less than 90%, and I don't want 
my foreground interactive task being slowed down that much by this 
niced task.  So I envisage the system spending 20% of its time running 
both tasks and the remaining 80% of the time running just the 
higher-priority task.  That way, I get half of 20% = 10% spent on the 
nice task and half of 20% plus 80% = 90% spent on the foreground task.  
(Or maybe something like 12% + 92%, allowing for the hyperthreading efficiency.)
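
The arithmetic in the paragraph above can be written out as a small sketch. The model is an assumption made for illustration: for a fraction `both` of the time the two tasks share the core and each progresses at rate `per_thread` (0.5 with no hyperthreading benefit, 0.6 for the "60% each" guess), and for the rest of the time the foreground task runs alone at full speed. The function and parameter names are hypothetical.

```python
def effective_shares(both, per_thread):
    """Return (foreground, background) progress as a fraction of one full cpu.

    both       -- fraction of time both tasks are scheduled together
    per_thread -- progress rate of each task while they share the core
    """
    background = both * per_thread
    foreground = both * per_thread + (1.0 - both) * 1.0
    return foreground, background


if __name__ == "__main__":
    for per_thread in (0.5, 0.6):
        fg, bg = effective_shares(both=0.2, per_thread=per_thread)
        print(f"per-thread rate {per_thread:.1f}: foreground {fg:.0%}, background {bg:.0%}")
```

Under these assumptions, co-scheduling 20% of the time reproduces the 90%/10% split at a rate of 0.5 and the 92%/12% split at a rate of 0.6.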

Here's a link to Con Kolivas' post where he described something like 
this back in 2004:

   http://thread.gmane.org/gmane.linux.kernel/178090/focus=178882


Phil.



--

From: Ulrich Drepper
Date: Saturday, September 6, 2008 - 11:30 am

Even on Atom, leaving one thread idle is the right thing to do in some
situations.  If you have processes which, when HT is used, experience
high pressure on the common cache(s) then you should not schedule them
together.  We can theoretically find out whether this is the case
using the PMCs.  With perfmon2 hopefully on the horizon soon it might
actually be possible to automatically make these measurements.

There is another aspect I talked to Peter about already.  We really
want concurrent thread scheduling in some cases.  For the
implementation of helper threads you don't want two threads to be
scheduled independently, you want them to be scheduled on a HT pair.
Currently this isn't possible except by pinning them to fixed threads.
We really want to have a new way to express this type of scheduling
(Peter, what did you call it?)
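
For reference, the pinning workaround mentioned above might look roughly like the sketch below on a current Linux system. It is not from the thread. It assumes the kernel exposes `thread_siblings_list` under sysfs and uses Python's `os.sched_setaffinity` (Linux only); the helper names are hypothetical.

```python
import os


def parse_cpu_list(text):
    """Parse a kernel cpu list such as "0-1" or "0,2" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def ht_siblings(cpu):
    """Return the logical cpus sharing a core with `cpu` (assumed sysfs path)."""
    path = f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list"
    with open(path) as f:
        return parse_cpu_list(f.read())


def pin_to_ht_pair(cpu, pid=0):
    """Restrict `pid` (0 means the caller) to the HT siblings of `cpu`."""
    os.sched_setaffinity(pid, ht_siblings(cpu))
```

This only fixes the threads to one particular sibling pair; it does not give the scheduler the freedom to pick any free pair, which is the missing feature being discussed.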
--

From: Peter Zijlstra
Date: Sunday, September 7, 2008 - 2:34 am

Really helps if you make sure I'm on the CC ;-)

I think I called it something like affinity grouping - but I'm still a
bit scared of the bin-packing issues involved - those will really mess
up the already complex balancing rules.

--

From: Phil Endecott
Date: Sunday, September 7, 2008 - 6:58 am

I thought I'd try to quantify the effect with real processes.  My 
"foreground" task is a compilation and my "background" task is a tight 
loop at nice -9.  No doubt you would get different results with 
different tasks (amount of I/O, cache hit rate, different nice level etc.).

With no background task running, the foreground task takes 86s whether 
or not HT is enabled.  With the background task running, the foreground 
task takes 97s with HT off and 104s with HT on.  104s is better than I 
was expecting; in fact it's close enough to 97s that the problem can be 
overlooked in this case.

I made a number of other measurements, of which the most significant is 
that the run time with no background task comes down to 63s with -j2 
when HT is on.  So for this compilation, hyperthreading makes the CPU 
perform like 1.36 uniprocessors (in some sense).  I'll have to try to 
remember how to make -j2 the default...
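
The ratios behind these figures can be checked with a few lines. The timings are copied from the paragraphs above; the variable names are only for illustration, and the ratios are plain wall-clock quotients, nothing more refined.

```python
# Wall-clock times in seconds, as reported in this message.
FG_ALONE = 86.0         # compilation alone, HT on or off
FG_ALONE_J2_HT = 63.0   # compilation alone with -j2, HT on
FG_WITH_BG_HT_OFF = 97.0
FG_WITH_BG_HT_ON = 104.0

# How many "uniprocessors" the HT cpu behaves like for this compilation.
ht_speedup = FG_ALONE / FG_ALONE_J2_HT

# Slowdown of the foreground task caused by the niced background loop.
slowdown_ht_off = FG_WITH_BG_HT_OFF / FG_ALONE
slowdown_ht_on = FG_WITH_BG_HT_ON / FG_ALONE

print(f"HT speedup with -j2:       {ht_speedup:.3f}")
print(f"slowdown with bg, HT off:  {slowdown_ht_off:.3f}")
print(f"slowdown with bg, HT on:   {slowdown_ht_on:.3f}")
```

The first ratio is the "1.36 uniprocessors" figure; the other two show that the background loop costs the compilation roughly 13% with HT off and 21% with HT on.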

Anyway, can I take it that the previous patches to improve this 
behaviour have never been merged?


Regards,  Phil.




--

From: Bill Davidsen
Date: Sunday, September 7, 2008 - 6:09 pm

Phil, I got about the same improvement when CFS was being evaluated from 
patches, so I think you can trust your result; HT really does help in 
the 1.30..1.35 range depending on the application. It also seems to help 
when processes or threads are running data through a pipe, and my check 
Just to provide a confirmation of the magnitude of the benefit, no real 
new information, although you might have a real piped operation to 
track, noting the real time, CPU time, and ctx rate.

I believe that there were reports on this list of unithreaded processes 
running faster with HT on, and some of lower core temp with HT on. The 
lower core temp was at my limit of measurement, so I can only say "I 
think so"; 1-3 C is too small to really trust as a power-saving test.

-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot
--
