Re: [PATCH] x86: set_cyc2ns_scale() remove tsc_now and ns_now

Previous thread: [PATCH] Add additional examples in Documentation/spinlocks.txt by Mark Fasheh on Thursday, April 10, 2008 - 1:55 pm. (1 message)

Next thread: [PATCH] x86_64: change _end to end_before_pgt by Yinghai Lu on Thursday, April 10, 2008 - 3:06 pm. (2 messages)
From: Karsten Wiese
Date: Thursday, April 10, 2008 - 2:31 pm

Both are unused and the functions rdtscll() and __cycles_2_ns() don't have
side-effects.

Signed-off-by: Karsten Wiese <fzu@wemgehoertderstaat.de>
---
 arch/x86/kernel/tsc_32.c |    4 ----
 arch/x86/kernel/tsc_64.c |    4 ----
 2 files changed, 0 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/tsc_32.c b/arch/x86/kernel/tsc_32.c
index 82602d7..0406b80 100644
--- a/arch/x86/kernel/tsc_32.c
+++ b/arch/x86/kernel/tsc_32.c
@@ -84,7 +84,6 @@ DEFINE_PER_CPU(unsigned long, cyc2ns);
 
 static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
 {
-	unsigned long long tsc_now, ns_now;
 	unsigned long flags, *scale;
 
 	local_irq_save(flags);
@@ -92,9 +91,6 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
 
 	scale = &per_cpu(cyc2ns, cpu);
 
-	rdtscll(tsc_now);
-	ns_now = __cycles_2_ns(tsc_now);
-
 	if (cpu_khz)
 		*scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
 
diff --git a/arch/x86/kernel/tsc_64.c b/arch/x86/kernel/tsc_64.c
index f08e1ea..5246036 100644
--- a/arch/x86/kernel/tsc_64.c
+++ b/arch/x86/kernel/tsc_64.c
@@ -44,7 +44,6 @@ DEFINE_PER_CPU(unsigned long, cyc2ns);
 
 static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
 {
-	unsigned long long tsc_now, ns_now;
 	unsigned long flags, *scale;
 
 	local_irq_save(flags);
@@ -52,9 +51,6 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
 
 	scale = &per_cpu(cyc2ns, cpu);
 
-	rdtscll(tsc_now);
-	ns_now = __cycles_2_ns(tsc_now);
-
 	if (cpu_khz)
 		*scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
 
-- 
1.5.3.3

--

From: Andi Kleen
Date: Friday, April 11, 2008 - 12:42 am

I don't think you should remove those. At some point the offset
needs to be reset when the frequency scale changes for obvious reasons, 
and that needs to be fixed, not the code removed. Right now you'll get
scheduling inconsistencies during frequency changes on 
!constant_tsc CPUs.

(actually it is still the wrong time -- really needs a grace 
period during which the TSC is not used
ftp://firstfloor.org/pub/ak/quilt/patches/sched-clock implemented
some of these ideas against an older kernel)

-Andi

--

From: Ingo Molnar
Date: Friday, April 11, 2008 - 12:55 am

recent CPUs have constant-freq TSCs so it's mostly a legacy issue, but 
the affected installed base is still significant to warrant a fix.

we dont really have to worry about complications like grace periods - 
higher layers in the scheduler protect against temporary sched_clock() 
outliers. So i think this can all be done much simpler. Just get rid of 
the global cpu_khz notion, sched_clock() should simply follow the ->freq 
value - and that's it.

	Ingo
--

From: Andi Kleen
Date: Friday, April 11, 2008 - 1:06 am

Actually there millions of non constant freq TSC CPUs shipped each

But you still get scheduling hickups even with the sanity check. If the 
scheduler depends on a smooth time that is not good and my (admittedly much less 
than yours) understanding of CFS is that it relies on that. Especially ondemand 

At some point you have to generate an offset to something and that
offset must be different for different frequencies, otherwise
you get large systematic errors

(<imagine complicated mathematical proof why this is so, but it should
be fairly obvious>)

-Andi

--

From: Guillaume Chazarain
Date: Friday, April 11, 2008 - 1:25 am

This offset is already there, represented by rq->clock - sched_clock()

-- 
Guillaume
--

From: Andi Kleen
Date: Friday, April 11, 2008 - 1:40 am

Same issue. Think about it.  Perhaps I should have written
the complicated proof :)

You really have to compute the offset before the scaling, otherwise it 
is useless.

TSC is a counter that adds up time units. Now when the frequency
changes the time units change, but counter doesn't reset.

Now the full absolue value of the counter is useless for exact time 
because there is no way to reconstruct what the lengths of the different 
time units meshed together in the single counter value were
and how long it ticked at the different frequencies.

So after a frequency change the only way to get anything 
approaching a sane time is to take a snapshot of the counter
already ticking at the new frequency (but before it is scaled!) 
and then only work with current TSC - snapshot and only scale
after that.

Then there is also the issue that you don't know when exactly
the counter changes and any measurements during that time
might be dodgy (but system is not usually fully stopped during
it). 

The rewritten sched_clock handled this by having a unstable period 
between cpufreq starting to change, cpufreq reporting end of change 
and falling back to another clock during this instable period.

Also there are of course other users, like printk who
don't have rq->clock.

-Andi
--

Previous thread: [PATCH] Add additional examples in Documentation/spinlocks.txt by Mark Fasheh on Thursday, April 10, 2008 - 1:55 pm. (1 message)

Next thread: [PATCH] x86_64: change _end to end_before_pgt by Yinghai Lu on Thursday, April 10, 2008 - 3:06 pm. (2 messages)