On Thu, 15 Nov 2007 04:00:47 GMT
this worries me.. this appears to effectively disable preemption during
udelay() and mdelay() loops... which are very obvious latency inducers.Now you can argue that if you're preemptible you should have used
msleep() and co, and I'll totally buy that.Maybe we should just check if we're still on the same cpu or something,
or have a cheap way to pin a process to a cpu.... but both are longer
term solutions.--
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings,
visit http://www.lesswatts.org
-
You can use preemption notifiers to get a callback when you are
preempted. Not sure what you'd to with that callback, though.--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.-
but that should not be needed in this case. Why doesnt the TSC using
delay loop simply poll the CPU it is on and fix up the TSC?Ingo
-
something like the patch below.
Ingo
--------------->
Subject: x86: make delay_tsc() preemptible again
From: Ingo Molnar <mingo@elte.hu>make delay_tsc() preemptible again.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
arch/x86/lib/delay_32.c | 28 +++++++++++++++++++++++-----
arch/x86/lib/delay_64.c | 30 ++++++++++++++++++++++++------
2 files changed, 47 insertions(+), 11 deletions(-)Index: linux/arch/x86/lib/delay_32.c
===================================================================
--- linux.orig/arch/x86/lib/delay_32.c
+++ linux/arch/x86/lib/delay_32.c
@@ -38,17 +38,35 @@ static void delay_loop(unsigned long loo
:"0" (loops));
}-/* TSC based delay: */
+/*
+ * TSC based delay:
+ *
+ * We are careful about preemption as TSC's are per-CPU.
+ */
static void delay_tsc(unsigned long loops)
{
- unsigned long bclock, now;
+ unsigned long prev, now;
+ long left = loops;
+ int prev_cpu, cpu;- preempt_disable(); /* TSC's are per-cpu */
- rdtscl(bclock);
+ preempt_disable();
+ rdtscl(prev);
do {
+ prev_cpu = smp_processor_id();
rep_nop();
+ preempt_enable();
+
+ preempt_disable();
+ cpu = smp_processor_id();
rdtscl(now);
- } while ((now-bclock) < loops);
+ /*
+ * If we preempted we skip this small amount of time:
+ */
+ if (prev_cpu != cpu)
+ prev = now;
+ left -= now - prev;
+ prev = now;
+ } while (left > 0);
preempt_enable();
}Index: linux/arch/x86/lib/delay_64.c
===================================================================
--- linux.orig/arch/x86/lib/delay_64.c
+++ linux/arch/x86/lib/delay_64.c
@@ -26,17 +26,35 @@ int read_current_timer(unsigned long *ti
return 0;
}+/*
+ * TSC based delay:
+ *
+ * We are careful about preemption as TSC's are per-CPU.
+ */
void __delay(unsigned long loops)
{
- unsigned bclock, now;
+ unsigned long prev, now;
+ long left = loops;
+ int prev_cpu, cpu;- preempt_disable(); /* TSC's are pre-cpu */
- rdtscl(bcl...
Otherwise, looks like a very nice patch :-)
-
yeah, fixed.
updated patch below.
Ingo
---------------->
Subject: x86: make delay_tsc() preemptible again
From: Ingo Molnar <mingo@elte.hu>make delay_tsc() preemptible again.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
arch/x86/lib/delay_32.c | 27 ++++++++++++++++++++++-----
arch/x86/lib/delay_64.c | 29 +++++++++++++++++++++++------
2 files changed, 45 insertions(+), 11 deletions(-)Index: linux/arch/x86/lib/delay_32.c
===================================================================
--- linux.orig/arch/x86/lib/delay_32.c
+++ linux/arch/x86/lib/delay_32.c
@@ -38,17 +38,34 @@ static void delay_loop(unsigned long loo
:"0" (loops));
}-/* TSC based delay: */
+/*
+ * TSC based delay:
+ *
+ * We are careful about preemption as TSC's are per-CPU.
+ */
static void delay_tsc(unsigned long loops)
{
- unsigned long bclock, now;
+ unsigned long prev, now;
+ long left = loops;
+ int prev_cpu, cpu;- preempt_disable(); /* TSC's are per-cpu */
- rdtscl(bclock);
+ preempt_disable();
+ rdtscl(prev);
do {
+ prev_cpu = smp_processor_id();
+ preempt_enable();
rep_nop();
+ preempt_disable();
+ cpu = smp_processor_id();
rdtscl(now);
- } while ((now-bclock) < loops);
+ /*
+ * If we migrated we skip this small amount of time:
+ */
+ if (prev_cpu != cpu)
+ prev = now;
+ left -= now - prev;
+ prev = now;
+ } while (left > 0);
preempt_enable();
}Index: linux/arch/x86/lib/delay_64.c
===================================================================
--- linux.orig/arch/x86/lib/delay_64.c
+++ linux/arch/x86/lib/delay_64.c
@@ -26,17 +26,34 @@ int read_current_timer(unsigned long *ti
return 0;
}+/*
+ * TSC based delay:
+ *
+ * We are careful about preemption as TSC's are per-CPU.
+ */
void __delay(unsigned long loops)
{
- unsigned bclock, now;
+ unsigned long prev, now;
+ long left = loops;
+ int prev_cpu, cpu;...
Yes, we can do better.
But this bug can cause very rare failures in probably a large number of
device drivers on a minorty of machines. Ugly. So I felt it best to
plug it fast while people think about more sophisticated fixes.-
how about using usleep() transparently if high-res timers are active and
we have !preempt_count()? That would be a sufficient solution and would
avoid all the calibration and per-cpu-ness problems.Ingo
-
It sounds like it would work OK. What is the setup cost for a usleep? I'd
have thought that code which does something likewhile (i++ < 1000) {
foo();
udelay(1);
}would take qiute a bit longer with such a change?
-
full roundtrip cost ought to be below 10 usecs, depending on the system.
There's no problem doing a non-preemptible udelay up to 10 usecs and we
could use usleep above that.Ingo
-
Ow. So the above timeout would take 10x longer. That probably won't break
anything, but quite a few drivers do udelay(1) for post-IO settling timesYup, with a few smarts in there we could work out which is the best to use,
and also compensate for the setup costs.It doesn't sound very 2.6.24ish though.
As a quicky things perhaps we could only do the preempt_disable()/preempt_enable()
if the TSCs are unsynced? Do we reliably know that? I guess not..
-
| Greg Kroah-Hartman | [PATCH 011/196] sysfs: Fix a copy-n-paste typo in comment |
| Al Boldi | Re: [ck] Re: [ANNOUNCE] RSDL completely fair starvation free interactive cpu sched... |
| Andrew Morton | 2.6.25-rc8-mm2 |
| Ingo Molnar | Re: [patch] sched_clock(): cleanups |
| David Miller | [GIT]: Networking |
| John P Poet | Realtek 8111C transmit timed out |
| Gerrit Renker | [PATCH 15/37] dccp: Set per-connection CCIDs via socket options |
| Kenny Chang | Multicast packet loss |
git: | |
