Re: RFC: outb 0x80 in inb_p, outb_p harmful on some modern AMD64 with MCP51 laptops

To: <linux-kernel@...>
Cc: Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Thursday, December 6, 2007 - 6:38 pm

After much, much testing (months, off and on, pursuing hypotheses), I've
discovered that the use of "outb al,0x80" instructions to "delay" after
inb and outb instructions causes solid freezes on my HP dv9000z laptop,
when ACPI is enabled.

It takes a fair number of out's to 0x80, but the hard freeze is reliably
reproducible by writing a driver that solely does a loop of 50 outb's to
0x80 and calling it in a loop 1000 times from user space. !!!

The serious impact is that the /dev/rtc and /dev/nvram devices are very
unreliable - thus "hwclock" freezes very reliably while looping waiting
for a new second value and calling "cat /dev/nvram" in a loop freezes
the machine if done a few times in a row.

This is reproducible, but requires a fair number of outb's to the 0x80
diagnostic port, and seems to require ACPI to be on.

io_64.h is the source of these particular instructions, via the
CMOS_READ and CMOS_WRITE macros, which are defined in mc146818_64.h. (I
wonder if the same problem occurs in 32-bit mode).

I'm happy to complete and test a patch, but I'm curious what the right
approach ought to be. I have to say I have no clue as to what ACPI is
doing on this chipset (nvidia MCP51) that would make port 80 do this.
A raw random guess is that something is logging POST codes, but if so,
not clear what is problematic in ACPI mode.

Any help/suggestions?

Changing the delay instruction sequence from the outb to short jumps
might be the safe thing. But Linus, et al. may have experience with
that on other architectures like older Pentiums etc.
--

To: David P. Reed <dpreed@...>
Cc: <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Friday, December 7, 2007 - 6:44 am

Use a variable for the port and do an early quirk to change
the port to something safe on your chipset?
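
A minimal sketch of that idea, not taken from any posted patch: the delay port lives in a variable and an early quirk overrides it on affected hardware. The names io_delay_port, io_delay, quirk_set_delay_port and fake_outb are hypothetical, and fake_outb only records writes so that the sketch runs in user space; in the kernel it would be the real outb().

```c
#include <stdint.h>

/* Hypothetical: port written to after each _p access; 0x80 is the traditional default. */
static uint16_t io_delay_port = 0x80;

/* User-space stand-in for outb(): records the port instead of touching hardware. */
static uint16_t last_port_written;
static unsigned int write_count;

static void fake_outb(uint8_t value, uint16_t port)
{
    (void)value;
    last_port_written = port;
    write_count++;
}

/* Delay by writing to whichever port is currently configured. */
static void io_delay(void)
{
    fake_outb(0, io_delay_port);
}

/* Hypothetical early quirk: called once an affected chipset or BIOS is detected. */
static void quirk_set_delay_port(uint16_t safe_port)
{
    io_delay_port = safe_port;
}
```

Any outb_p() issued before the quirk runs would still hit the default port, which is the caveat raised in the next paragraph.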

Ok there might be code using outb_p() before the early quirks,
but should be possible to find using instrumentation.

Also the port assignment might not be chipset specific, but BIOS
specific, then you would need to match the DMI identifier. The
disadvantage of that is that there are usually other BIOS

I don't think that makes sense to do on anything modern. The trouble
is that the jumps will effectively execute near "infinitely fast" on any
modern CPU compared to the bus. But the delay really needs to be something
that is about IO port speed. Ok in theory you could try to measure
an outb using RDTSC and then use udelay, but first you would need
a safe port for that already and then RDTSC is not necessarily constant.

-Andi
--

To: Andi Kleen <andi@...>
Cc: David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Friday, December 7, 2007 - 12:04 pm

You don't need to. Port 0x80 historically is about 8uS so just udelay(8)
and make sure the initial default delay is conservative enough before the
CPU speed is computed.
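
A rough sketch of what "conservative enough before the CPU speed is computed" could mean for a busy-loop delay. Everything here is an assumption for illustration: the names loops_per_usec, DEFAULT_LOOPS_PER_USEC, delay_loops, busy_udelay and set_calibration are hypothetical, and the default value is arbitrary. The only point is that the uncalibrated value errs on the long side.

```c
#include <stdint.h>

/* Hypothetical pre-calibration default: deliberately high, so an
 * uncalibrated delay is too long rather than too short. */
#define DEFAULT_LOOPS_PER_USEC 4096u

/* Replaced by the measured value once the CPU speed is known. */
static uint32_t loops_per_usec = DEFAULT_LOOPS_PER_USEC;

/* Number of busy-loop iterations for the requested delay. */
static uint64_t delay_loops(uint32_t usecs)
{
    return (uint64_t)usecs * loops_per_usec;
}

/* Spin for roughly usecs microseconds (volatile keeps the loop from being removed). */
static void busy_udelay(uint32_t usecs)
{
    volatile uint64_t n = delay_loops(usecs);

    while (n--)
        ;
}

/* Called by the (not shown) calibration code once it has a measurement. */
static void set_calibration(uint32_t measured_loops_per_usec)
{
    loops_per_usec = measured_loops_per_usec;
}
```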

0x80 should be fine for anything PC compatible anyway; it's specifically
reserved as a debug port and supported for *exactly* that purpose by
many chipsets.

The afflicted laptop should really be taken up with the vendor. If it's
got port 0x80 wrong, god knows what else it might have problems with.

Alan
--

To: Alan Cox <alan@...>
Cc: Andi Kleen <andi@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Sunday, December 9, 2007 - 8:54 am

Actually, I've seen a few PCI cards with LEDs on port 0x80, and I
wonder: is our outb_p really correct?

I mean, we expect an 8 usec delay -- historical ISA timing -- but when a
_PCI_ card with LEDs is inserted, it is likely to be faster than old
ISA, right?
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

To: Pavel Machek <pavel@...>
Cc: Alan Cox <alan@...>, Andi Kleen <andi@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Sunday, December 9, 2007 - 12:59 pm

Yes, I guess switching to udelay at least on newer systems would
be a good idea. I'm not quite sure about systems without TSC though.

-Andi
--

To: Andi Kleen <andi@...>
Cc: Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Sunday, December 9, 2007 - 5:25 pm

Something like this? (Warning: it probably will not even compile on
x86-64; I do not have a 64-bit compiler near me.)

(I believe VGA cards do not need slow outputs, plus udelay is not
available in the decompressor?)

Signed-off-by: Pavel Machek <pavel@suse.cz> [but it needs fixing x86-64]
Pavel

diff --git a/arch/x86/boot/compressed/misc_32.c b/arch/x86/boot/compressed/misc_32.c
index b74d60d..288e162 100644
--- a/arch/x86/boot/compressed/misc_32.c
+++ b/arch/x86/boot/compressed/misc_32.c
@@ -276,10 +276,10 @@ static void putstr(const char *s)
RM_SCREEN_INFO.orig_y = y;

pos = (x + cols * y) * 2; /* Update cursor position */
- outb_p(14, vidport);
- outb_p(0xff & (pos >> 9), vidport+1);
- outb_p(15, vidport);
- outb_p(0xff & (pos >> 1), vidport+1);
+ outb(14, vidport);
+ outb(0xff & (pos >> 9), vidport+1);
+ outb(15, vidport);
+ outb(0xff & (pos >> 1), vidport+1);
}

static void* memset(void* s, int c, unsigned n)
diff --git a/include/asm-x86/io_32.h b/include/asm-x86/io_32.h
index fe881cd..944dc5f 100644
--- a/include/asm-x86/io_32.h
+++ b/include/asm-x86/io_32.h
@@ -3,6 +3,7 @@ #define _ASM_IO_H

#include <linux/string.h>
#include <linux/compiler.h>
+#include <linux/delay.h>

/*
* This file contains the definitions for the x86 IO instructions
@@ -17,17 +18,6 @@ #include <linux/compiler.h>
* mistake somewhere.
*/

-/*
- * Thanks to James van Artsdalen for a better timing-fix than
- * the two short jumps: using outb's to a nonexistent port seems
- * to guarantee better timings even on fast machines.
- *
- * On the other hand, I'd like to be sure of a non-existent port:
- * I feel a bit unsafe about using 0x80 (should be safe, though)
- *
- * Linus
- */
-
/*
* Bit simplified and optimized by Jan Hubicka
* Support of BIGMEM added by Gerhard Wichert, Siemens AG, July 1999.
@@ -252,7 +242,7 @@ #endif /* __KERNEL__ */

static inline void native_io_delay(vo...

To: Pavel Machek <pavel@...>
Cc: Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Monday, December 10, 2007 - 12:17 am

Alan, did you double-check that 8 us? I tried to but I seem to not have
trustworthy documentation.

Rene.
--

To: Rene Herman <rene.herman@...>
Cc: Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Monday, December 10, 2007 - 7:30 am

I remember 16-bit CPU-driven ISA was able to do 2-3 MB/s transfers,
that means at least 1 Maccesses/second = up to 1 microsecond/access.

Perhaps I/O port accesses were slower than memory? But 8-12 times?
Perhaps port 0x80 was using (slower) 8-bit timings?

Bus-mastering ISA cards were able to do ca. 5 MB/s with 8 MHz (10 MHz?)
clocking, some old machines didn't like it.

Googling suggests that a slave access on 8-bit ISA bus was taking
6 cycles by default (including 4 wait states), 16-bit - 3 cycles
(with 1 WS). Respectively 0.75 us and 0.375 us, and 0.25 us for
16-bit 0WS memory access (with standard 8 MHz clock).

These values could be changed with BIOS setup, and devices could
use 0WS or I/O CHRDY signals if they didn't like the defaults
(did 0WS mean 1 WS for 8-bit devices?).
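
The figures above follow directly from cycles divided by bus clock. A small sketch of that arithmetic, assuming the standard 8 MHz clock mentioned in the text; the function name isa_access_ns is made up for illustration, and the 2-cycle case is inferred from the quoted 0.25 us figure rather than stated in the text.

```c
#include <stdint.h>

/* Duration of one bus access in nanoseconds.
 * cycles / (kHz * 1000) seconds == cycles * 1e6 / kHz nanoseconds. */
static uint32_t isa_access_ns(uint32_t bus_cycles, uint32_t bus_clock_khz)
{
    return (uint32_t)(((uint64_t)bus_cycles * 1000000u) / bus_clock_khz);
}
```

With an 8000 kHz clock, 6 cycles give 750 ns, 3 cycles give 375 ns and 2 cycles give 250 ns, all well under the 8 us being questioned.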
--
Krzysztof Halasa
--

To: Krzysztof Halasa <khc@...>
Cc: Rene Herman <rene.herman@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Monday, December 10, 2007 - 9:10 pm

Where did the 8us delay come from? The documentation and source are
careful not to say how long the delay is. Would changing it to, say,
1us be technically wrong? Is code that requires 8us correct?
--

To: David Newall <david@...>
Cc: Krzysztof Halasa <khc@...>, Rene Herman <rene.herman@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>
Date: Monday, December 10, 2007 - 9:25 pm

I think a single ISA bus transaction is 1 µs, so two of them back to
back should be 2 µs, not 8 µs...

-hpa
--

To: H. Peter Anvin <hpa@...>
Cc: David Newall <david@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>
Date: Tuesday, December 11, 2007 - 2:54 am

Sigh. And now where do these _two_ transactions come from? (and yes, see
Alan's followups, a transaction on a spec bus is 1 us).

Rene.
--

To: Rene Herman <rene.herman@...>
Cc: David Newall <david@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>
Date: Tuesday, December 11, 2007 - 1:01 pm

Stale memory, sorry.

-hpa
--

To: H. Peter Anvin <hpa@...>
Cc: Krzysztof Halasa <khc@...>, Rene Herman <rene.herman@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>
Date: Monday, December 10, 2007 - 9:42 pm

Exactly. You think it's 2us, but the documentation doesn't say. The _p
functions are generic inasmuch as they provide an unspecified delay.
Drivers which work across platforms, and which use _p, therefore have
different delays on different platforms. Should the length of the delay
be unimportant? I wouldn't have thought so. If it is important, does
that mean that such drivers are buggy on some platforms?

I really *hate* the idea that access to non-present hardware is used to
generate a delay. That sucks so badly. It's worthy of a school-aged
hacker, not of a world-leading operating system. It's so not
best-practice that it's worst-practice.

--

To: David Newall <david@...>
Cc: H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Rene Herman <rene.herman@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>
Date: Tuesday, December 11, 2007 - 9:14 am

Actually it's very good practice.

The LPC bus behaviour is absolutely and precisely defined. The timing of
the inb is defined in bus clocks which is perfect as the devices needing
delay are running at a fraction of busclock usually busclock/2.

Older processors did not have a high precision timer so you couldn't
calibrate loop based delays for 1uS.

Port 0x80 is used all over the place for this, not just in Linux but in a
large number of DOS programs and other PC OS's. It's even got specific
hardware support in many of the chipsets so that you can make the latched
last 0x80 write appear on the parallel port for debugging.

Alan
--

To: Alan Cox <alan@...>
Cc: David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Rene Herman <rene.herman@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>
Date: Tuesday, December 11, 2007 - 9:32 am

For newer CPUs udelay() would probably be fine though. We seem
to have several documented examples now where the bus aborts
trigger hardware bugs, and it is always better to avoid such situations.

I still think the best strategy would be to switch based on TSC
availability. Perhaps move out*_p out of line to avoid code bloat.

-Andi

--

To: Andi Kleen <andi@...>
Cc: Alan Cox <alan@...>, David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Rene Herman <rene.herman@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>
Date: Tuesday, December 11, 2007 - 9:47 am

Why is TSC significant? udelay() based on bogomips seems to be good
enough...?
Pavel
--
--

To: Pavel Machek <pavel@...>
Cc: Andi Kleen <andi@...>, Alan Cox <alan@...>, David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Rene Herman <rene.herman@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>
Date: Tuesday, December 11, 2007 - 9:50 am

Maybe. I'm not sure how accurate it really is on
non-TSC systems. On the other hand, it is unclear that port 80 I/O
always takes the same time, so it's probably OK for the delay to vary a bit.
So most likely going to udelay() unconditionally is fine.

-Andi
--

To: Andi Kleen <andi@...>
Cc: Pavel Machek <pavel@...>, Alan Cox <alan@...>, David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Rene Herman <rene.herman@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>
Date: Friday, December 14, 2007 - 9:33 am

Yep, agreed, and I have queued up the patch below. I've killed the
misc_*.c outb_p() uses because they happen before there's a udelay()
available - but that should be perfectly fine anyway: I don't remember
any video hardware that needed pauses for cursor updates; I think those
_p()'s just came in accidentally. (There's hardware that needed _p() for
other aspects of video such as mode switching - but cursor updates ...)

Ingo

------------------>
Subject: x86: fix in/out_p delays
From: Ingo Molnar <mingo@elte.hu>

Debugged by David P. Reed <dpreed@reed.com>.

Do not use port 0x80, it can cause crashes, see:

http://bugzilla.kernel.org/show_bug.cgi?id=6307
http://bugzilla.kernel.org/show_bug.cgi?id=9511

instead of just removing _p postfixes en masse, lets just first
remove the 0x80 port usage, then remove any unnecessary _p io ops
gradually. It's more debuggable this way.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
arch/x86/boot/compressed/misc_32.c | 8 ++++----
arch/x86/boot/compressed/misc_64.c | 8 ++++----
arch/x86/kernel/quirks.c | 10 ++++++++++
include/asm-x86/io_32.h | 5 +----
include/asm-x86/io_64.h | 5 +----
5 files changed, 20 insertions(+), 16 deletions(-)

Index: linux-x86.q/arch/x86/boot/compressed/misc_32.c
===================================================================
--- linux-x86.q.orig/arch/x86/boot/compressed/misc_32.c
+++ linux-x86.q/arch/x86/boot/compressed/misc_32.c
@@ -276,10 +276,10 @@ static void putstr(const char *s)
RM_SCREEN_INFO.orig_y = y;

pos = (x + cols * y) * 2; /* Update cursor position */
- outb_p(14, vidport);
- outb_p(0xff & (pos >> 9), vidport+1);
- outb_p(15, vidport);
- outb_p(0xff & (pos >> 1), vidport+1);
+ outb(14, vidport);
+ outb(0xff & (pos >> 9), vidport+1);
+ outb(15, vidport);
+ outb(0xff & (pos >> 1), vidport+1);
}

static void* memset(void* s, int c, unsigned n)
Index: linux...

To: David Newall <david@...>
Cc: H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Rene Herman <rene.herman@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 3:40 am

Hi,

Well, if the delay is so much unspecified, what about _reading_ port 0x80 ?
Will the delay be shorter ? And if so, what about reading port 0x80 and
writing the value back ?
    in  al, 0x80
    out 0x80, al

I've been wondering since the beginning of this thread if the problem is not
just the value we put to port 0x80, not writing to the port...

Just my 0.02 Eur...

Paul

--

To: Paul Rolland <rol@...>
Cc: David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 5:50 am

The delay is completely and fully specified in terms of the ISA/LPC clock
which certainly for anything modern means a fixed, unchanging value
(something very close to 1 us) and even on older PCs that allow some
tweaking just means a delay synced to the actual bus clock which is what the
_p variants should normally want to accomplish.

Yes, as far as I'm aware, an inb() means the same delay but clobbers

See? Moreover, this also only makes sense if there's in fact something
responding to reads at 0x80 and with port 0x80 being a well-known legacy PC
port, a POST monitor would be just about that and writing to _that_ would
seem unlikely to have any ill effects other than turning your POST board LED
display into a christmas tree. The problem more likely is some piece of
hardware getting upset at LPC bus aborts and your suggestion wouldn't fix that.

In earlier incarnations of this thread it's been reported that various
implementations of the legacy PC timer, DMA controller and PIC needed the
delay but just replacing the outb with a udelay(1) would seem very likely to
have the desired effect also for those.

The only problem with _that_ is that you need a calibrated timing loop first
which means not-very-early boot (ie, not while you try to program the timer
to calibrate the loop for example). Pavel Machek already posted a patch,
although with an overly pessimistic delay value.

The problem here is with an x86-64 machine that very likely does not need
any delay at all in fact. One thing to do would be to make _any_ delay
dependent on 32-bit but given that 64-bit machines can run 32-bit kernels
this doesn't fix things fully, although it probably does in practice.

Keying off DMI for any delay could be possible. But if the simple udelay(1)
just works, all the better.
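
For illustration only, keying the delay off DMI might look roughly like the sketch below. The enum, the table and pick_io_delay are all hypothetical names, and the product string is a guess based on the machine named in this thread, not a verified DMI value.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical delay strategies discussed in this thread. */
enum io_delay_type {
    IO_DELAY_PORT_0X80,   /* traditional write to port 0x80 */
    IO_DELAY_UDELAY,      /* calibrated udelay() instead of a port write */
    IO_DELAY_NONE         /* no delay at all */
};

struct io_delay_quirk {
    const char *product_substr;   /* matched against the DMI product name */
    enum io_delay_type type;
};

/* Illustrative entry; the string is an assumption, not a checked DMI identifier. */
static const struct io_delay_quirk io_delay_quirks[] = {
    { "Pavilion dv9000z", IO_DELAY_UDELAY },
};

/* Return the delay strategy for a machine, defaulting to the legacy port write. */
static enum io_delay_type pick_io_delay(const char *dmi_product_name)
{
    size_t i;

    for (i = 0; i < sizeof(io_delay_quirks) / sizeof(io_delay_quirks[0]); i++) {
        if (strstr(dmi_product_name, io_delay_quirks[i].product_substr))
            return io_delay_quirks[i].type;
    }
    return IO_DELAY_PORT_0X80;
}
```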

Rene.

--

To: Rene Herman <rene.herman@...>
Cc: Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 8:08 am

That would be the delay on the i386 (sic) architecture. In general,
though, the delay is:

"Some devices require that accesses to their ports are slowed down.
This functionality is provided by appending a _p to the end of the
function."
-- Documentation/DocBook/deviceiobook.tmpl

(I've not seen any other formal definition.)

Most architectures (Alpha, Arm, Arm2, Blackfin, FRV, h8300, IA64,
PA-RISC, PowerPC, Sparc, Sparc64, V850 and Xtensa) do not pause. M68k
does not pause except in one configuration, where it's the same as i386.
On m32r it's a push and a pop. On SuperH it's similar to i386, only
using 16-bit input. X86-64 is the same as i386!

Thinking that _p gives a pause is perhaps too PC-centric. Why, if a
delay is needed, wouldn't you use a real delay; one that says how long
it should be?
--

To: David Newall <david@...>
Cc: Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 9:16 am

This particular discussion isn't about anything in general but solely about
the delay an outb_p gives you on x86 since what is under discussion is not

Because any possible outb_p delay should be synced to the bus-clock, not to
any wall-clock. Drivers that want to sync to wall-clock need to use an outb,
delay pair as you'd expect.

In the real world, driver authors aren't perfect and will have used outb_p
as a wall-clock delay which they have gotten away with since it's a nicely
specified delay in terms of the ISA/LPC clock and the ISA/LPC clock being
fairly (old) to very (new) constant.

The delay it gives is very close to 1 us on a spec ISA/LPC bus (*) and as
such, even though it may not be the right thing to do from a theoretical
standpoint, generally a udelay(1) is going to be a fine replacement from a
practical one -- as soon as we _can_ use udelay(), as I also wrote.

Rene.

(*) some local testing shows it to be almost exactly that for both out and
in on my own PC -- a little over. If anyone cares, see attached little test
program. The "little over" I don't worry about. 0 us delay is also fine for
me and if any code was _that_ fragile it would have broken long ago.

To: Rene Herman <rene.herman@...>
Cc: Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 9:50 am

That could be true if outb_p were used only in architecture dependent
code, but it's not. It's used in drivers that are supposed to run on
all sorts of platforms. Why does a megaraid controller need delays on
i386 but not on Sparc, PowerPC, Alpha and others? Is it buggy on most

It's most commonly a zero delay. Only in the minority of architectures
is it otherwise. If a delay is needed, then put one in, but don't put
in a paper promise that's more likely to be ignored than observed.

Plenty of doubt has been expressed as to whether _p is widely used
without need. Not surprising since it has such a vague specific
meaning. One could say, Linux on i386 is liberally sprinkled with
needless delays. I suppose it has the advantage that Microsoft will be
hard pressed to catch up when finally we remove them. :-)

I really prefer accurate code, but I'm also pragmatic and realise that
it's far too much work to fix this any time soon. But if it were to be
fixed, then perhaps _p would take an additional parameter, measured in
cycles of delay.
--

To: David Newall <david@...>
Cc: Rene Herman <rene.herman@...>, Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 11:41 am

Most, probably most-all, of the delays to port operations
on modern ix86 machines are not needed at all. Certainly
machines that use bridges to expand port I/O to the ISA
bus do not need any such delays. There are exactly two (and
only two) problems with removing the delays.

(1) Older machines which have an actual ISA bus with its
attendant capacitance that needs to be charged long enough
for the data to become valid --before being overwritten
by new data.

(2) I/O operations that have two ports, one an index
port and the other a data port, like the CMOS RTC. Once
you set the index port, it takes about 300 ns for it to
propagate to the hardware, so there needs to be some
delay between the back-to-back CPU operations which can
occur much faster than that.
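
Case (2) can be sketched as follows. This is a user-space model, not kernel code: fake_outb, fake_inb and io_pause are hypothetical stand-ins, and the "hardware" is just an array. Ports 0x70 and 0x71 are the conventional PC CMOS/RTC index and data ports.

```c
#include <stdint.h>

#define CMOS_INDEX_PORT 0x70
#define CMOS_DATA_PORT  0x71

/* User-space model of the RTC: 128 bytes of CMOS plus a latched index. */
static uint8_t fake_cmos[128];
static uint8_t latched_index;
static unsigned int delays_taken;

static void fake_outb(uint8_t value, uint16_t port)
{
    if (port == CMOS_INDEX_PORT)
        latched_index = value & 0x7f;
    else if (port == CMOS_DATA_PORT)
        fake_cmos[latched_index] = value;
}

static uint8_t fake_inb(uint16_t port)
{
    return port == CMOS_DATA_PORT ? fake_cmos[latched_index] : 0xff;
}

/* Placeholder for the pause; real code would wait for the index write to settle. */
static void io_pause(void)
{
    delays_taken++;
}

/* Select the register, pause, then read the data port. */
static uint8_t cmos_read(uint8_t reg)
{
    fake_outb(reg, CMOS_INDEX_PORT);
    io_pause();
    return fake_inb(CMOS_DATA_PORT);
}

/* Select the register, pause, then write the data port. */
static void cmos_write(uint8_t value, uint8_t reg)
{
    fake_outb(reg, CMOS_INDEX_PORT);
    io_pause();
    fake_outb(value, CMOS_DATA_PORT);
}
```

The pause sits between the index write and the data access, which is the only place the roughly 300 ns settling time matters.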

On this machine, I have changed all the _p macros so
they don't do anything. Since it is a modern machine
with N/S bridges, which provide their own delays,
everything works. Such would not be the case if I
was using a machine that had an actual ISA (or PC-104)
bus. Those are not terminated busses, but open-ended
capacitors made up of connectors and PC traces. It
takes about 300 ns to charge one of those (so 1us is
a good delay).

BTW, there are no "transactions" on the ISA or EISA
bus. It works by using a sequence of operations
with minimum setup and hold times. It's very primitive.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.22.1 on an i686 machine (5588.29 BogoMips).
My book : http://www.AbominableFirebug.com/

To: linux-os (Dick Johnson) <linux-os@...>
Cc: David Newall <david@...>, Rene Herman <rene.herman@...>, Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 12:30 pm

We know this. The problem is that there is no good known way to
figure out which machines need it. Also it is typically
slow hardware anyways -- the most time critical is probably
the 8259, but nobody who cares about performance still uses
it except as a fail safe fallback and for those it is better

It has been observed to be required talking to some older

and PIT etc.

Anyways it looks like the discussion here is going in
a loop. I had hoped David would post his test results with
another port so that we know for sure that the bus aborts
(and not port 80) are the problem on his box. But it looks like
he doesn't want to do this. Still removing the bus aborts
is probably the correct way to go forward.

Only needs a patch now. If nobody beats me to it I'll
add one later to my tree.

-Andi

--

To: Andi Kleen <andi@...>
Cc: linux-os (Dick Johnson) <linux-os@...>, David Newall <david@...>, Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 12:50 pm

Pavel Machek already posted one. His udelay(8) wants to be less -- 1 or "to
be safe" perhaps 2.

http://lkml.org/lkml/2007/12/9/131

Rene.
--

To: Rene Herman <rene.herman@...>
Cc: Andi Kleen <andi@...>, linux-os (Dick Johnson) <linux-os@...>, David Newall <david@...>, Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 3:16 pm

2 at least; that's how long outb(0x80) takes on one of my
machines. Actually, ISA can go down to 4MHz, so maybe we should be
using 4 usec.... but I guess I'm paranoid here.
Pavel
--
--

To: Pavel Machek <pavel@...>
Cc: Andi Kleen <andi@...>, linux-os (Dick Johnson) <linux-os@...>, David Newall <david@...>, Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 4:00 pm

4 isn't sensible. There have been machines capable both of running Linux and
their ISA bus at less than 8 MHz (if only for example by picking a 5 divisor
on a system that was capable of hosting a 40 MHz 386/486 but using a slower
CPU) but not by much. And machines doing that and running Linux, even more
so "today": 0.

My posted test program (although there seems to be something wrong with it
since it's influenced by compiler optimisation) is showing more than 1 but
note that on the vast majority of machines, 0 would in fact do. 1 will on
all, 2 will as well.

Rene.

--

To: Rene Herman <rene.herman@...>
Cc: Pavel Machek <pavel@...>, Andi Kleen <andi@...>, linux-os (Dick Johnson) <linux-os@...>, David Newall <david@...>, Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Alan Cox <alan@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Wednesday, December 12, 2007 - 4:07 pm

Sadly, I've been busy with other crises in my day job for the last few
days. I did modify Rene's test program and ran it on my "problem"
machine, with the results below.

The interesting part of this is that port 80 seems to respond to "in"
instructions faster than the presumably "unused" ports 0xEC and 0xEF
(those were mentioned by someone as alternatives to port 80).

That, and the fact that the port 80 test reliably freezes the machine
solid the second time it is run, and the "hwclock" utility reliably
hangs the machine if the port 80's are used in the CMOS_READ/CMOS_WRITE
loop, seems to strongly indicate that this chipset or motherboard
actually uses port 80, rather than there being a bus problem.

Someone might have an in to nVidia to clarify this, since I don't. In
any case, the udelay(2) approach seems to be a safe fix for this machine.

Hope input from an "outsider" is helpful in going forward. I put a lot
of time and effort into tracking down this problem on this particular
machine model, largely because I like the machine.

Running the (slightly modified to test ports 80, ec, ef instead of just
port 80) test when the 2 GHz max speed CPU is running at 800 MHz, here's
what I get for port 80 and port ec and port ef.

port 80: cycles: out 1430, in 792
port ef: cycles: out 1431, in 1378
port ec: cycles: out 1432, in 1372
----------------------------

System info: HP Pavilion dv9000z laptop (AMD64x2)

PCI bus controller is nVidia MCP51.
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 72
model name : AMD Turion(tm) 64 X2 Mobile Technology TL-60
stepping : 2
cpu MHz : 800.000
cache size : 512 KB
--

To: David P. Reed <dpreed@...>
Cc: Pavel Machek <pavel@...>, Andi Kleen <andi@...>, linux-os (Dick Johnson) <linux-os@...>, David Newall <david@...>, Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Alan Cox <alan@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Wednesday, December 12, 2007 - 4:26 pm

Don't know if someone else mentioned those but I only said 0xed. That's the
value Phoenix BIOSes use (yes, and which H. Peter Anvin reported as being
generally problematic as well).

It's in fact not all that unexpected, it seems, that port 0x80 responds to "in"
given that it's used by the DMA controller. It's a write that falls on deaf
ears. The read is going to be faster if it doesn't time out on an unused port.

Although it's not faster for everyone, such as for me indicating that for us
port 0x80 is really-really unused, it is for many. See results here:

Yes, so it seems. In this case we could in fact also "fix" your situation by
just going to 0xed depending on for example DMI. Alan Cox just posted a few

At 800 MHz, that's 1.79 / 0.99 microseconds. The precision of the "in" is
somewhat interesting. Did someone at nVidia think it's an "in" from 0x80

Rene.
--

To: Alan Cox <alan@...>
Cc: David P. Reed <dpreed@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, linux-os (Dick Johnson) <linux-os@...>, David Newall <david@...>, Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Wednesday, December 12, 2007 - 4:58 pm

By the way, _does_ anyone have a contact at nVidia who could clarify? Alan
maybe? I'm quite curious what they did...

Summary:

Unless after booting with "acpi=off", outputs to port 0x80 (the legacy way
to delay I/O) reliably, but not immediately, hang MCP51 machines. Outputs to
port 0xed do not, indicating it's not a generic bus abort problem.
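
A minimal sketch of what choosing the delay port per machine could look like,
given that 0x80 hangs this model and 0xed does not. The function name, the
quirk table and the matching on a product-name substring are all assumptions
made for illustration; none of this is an existing kernel interface. The
product string is the one from the system info posted earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IO_DELAY_PORT_DEFAULT 0x80  /* legacy delay port */
#define IO_DELAY_PORT_ALT     0xed  /* alternative discussed in this thread */

/* Hypothetical quirk list: product names of machines that hang on 0x80. */
static const char *const port80_quirks[] = {
    "HP Pavilion dv9000z",
};

/* Return the port to use for the I/O delay write on this machine. */
uint16_t pick_io_delay_port(const char *product_name)
{
    size_t i;

    assert(product_name != NULL);
    for (i = 0; i < sizeof(port80_quirks) / sizeof(port80_quirks[0]); i++) {
        if (strstr(product_name, port80_quirks[i]) != NULL)
            return IO_DELAY_PORT_ALT;
    }
    return IO_DELAY_PORT_DEFAULT;
}
```

Every other machine keeps the existing behaviour; only listed models switch
to 0xed.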

Rene.
--

To: Rene Herman <rene.herman@...>
Cc: Alan Cox <alan@...>, David P. Reed <dpreed@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, linux-os (Dick Johnson) <linux-os@...>, David Newall <david@...>, Paul Rolland <rol@...>, Krzysztof Halasa <khc@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Wednesday, December 12, 2007 - 5:05 pm

Sorry, the first sentence didn't parse unambiguously for me. Do you
mean "acpi=off" works, or that "acpi=off" allows *subsequent* boots to work?

I have some people at nVidia I can probably ping.

-hpa
--

To: H. Peter Anvin <hpa@...>
Cc: Rene Herman <rene.herman@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, linux-os (Dick Johnson) <linux-os@...>, David Newall <david@...>, Paul Rolland <rol@...>, Krzysztof Halasa <khc@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Friday, December 14, 2007 - 6:05 pm

Have them search on Google for:

--

To: H. Peter Anvin <hpa@...>
Cc: Chuck Ebbert <cebbert@...>, Rene Herman <rene.herman@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, linux-os (Dick Johnson) <linux-os@...>, David Newall <david@...>, Paul Rolland <rol@...>, Krzysztof Halasa <khc@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Saturday, December 15, 2007 - 3:22 am

Sorry, didn't see this again due to aforementioned horseshit ISP. "acpi=off"
works it seems. Report from David Reed here:

Rene.

--

To: Rene Herman <rene.herman@...>
Cc: David P. Reed <dpreed@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, linux-os (Dick Johnson) <linux-os@...>, David Newall <david@...>, Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Wednesday, December 12, 2007 - 5:01 pm

On Wed, 12 Dec 2007 21:58:25 +0100

I don't. Nvidia are not the most open bunch of people on the planet. This
doesn't appear to be a chipset bug anyway but a firmware one (other
systems with the same chipset work just fine).

The laptop maker might therefore be a better starting point.
--

To: Alan Cox <alan@...>
Cc: Rene Herman <rene.herman@...>, David P. Reed <dpreed@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, linux-os (Dick Johnson) <linux-os@...>, David Newall <david@...>, Paul Rolland <rol@...>, Krzysztof Halasa <khc@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Wednesday, December 12, 2007 - 5:12 pm

One wonders if it does some SMM trick to capture port 0x80 writes and
attempt to haul them off for debugging; it almost sounds like some kind
of debugging code got let out into the field.

-hpa
--

To: H. Peter Anvin <hpa@...>, Alan Cox <alan@...>
Cc: Rene Herman <rene.herman@...>, David P. Reed <dpreed@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, linux-os (Dick Johnson) <linux-os@...>, David Newall <david@...>, Paul Rolland <rol@...>, Krzysztof Halasa <khc@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Saturday, December 15, 2007 - 6:34 pm

[Empty message]
--

To: Allen Martin <AMartin@...>
Cc: Alan Cox <alan@...>, Rene Herman <rene.herman@...>, David P. Reed <dpreed@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, linux-os (Dick Johnson) <linux-os@...>, David Newall <david@...>, Paul Rolland <rol@...>, Krzysztof Halasa <khc@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Saturday, December 15, 2007 - 6:46 pm

Presumably you have programmable decoders to trigger SMI? If not, then
they're probably doing the equivalent in a SuperIO chip or similar.

-hpa
--

To: H. Peter Anvin <hpa@...>
Cc: Rene Herman <rene.herman@...>, David P. Reed <dpreed@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, linux-os (Dick Johnson) <linux-os@...>, David Newall <david@...>, Paul Rolland <rol@...>, Krzysztof Halasa <khc@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Wednesday, December 12, 2007 - 5:29 pm

Not implausible. We've got a bug I've been dealing with where a vendor
left debug stuff enabled via the parallel port and which clearly
"escaped" from the test environment to the BIOS proper.
--

To: Rene Herman <rene.herman@...>
Cc: Pavel Machek <pavel@...>, Andi Kleen <andi@...>, linux-os (Dick Johnson) <linux-os@...>, David Newall <david@...>, Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Alan Cox <alan@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Wednesday, December 12, 2007 - 4:37 pm

Port 0xED, just FYI:

cycles: out 1430, in 1370
cycles: out 1429, in 1370

(800 MHz)

--

To: Pavel Machek <pavel@...>
Cc: Andi Kleen <andi@...>, linux-os (Dick Johnson) <linux-os@...>, David Newall <david@...>, Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 4:00 pm

4 isn't sensible. There have been machines capable both of running Linux and
their ISA bus at less than 8 MHz (if only for example by picking a 5 divisor
on a system that was capable of hosting a 40 MHz 386/486 but using a slower
CPU) but not by much. And machines doing that and running Linux, even more
so "today": 0.

My posted test program (although there seems to be something wrong with it
since it's influenced by compiler optimisation) is showing more than 1 but
note that on the vast majority of machines, 0 would in fact do. 1 will on
all, 2 will as well.

Rene.

--

To: Rene Herman <rene.herman@...>
Cc: Andi Kleen <andi@...>, linux-os (Dick Johnson) <linux-os@...>, David Newall <david@...>, Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Alan Cox <alan@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 1:00 pm

Which port do you want me to test? Also, I can run the timing test on
my machine if you share the source code so I can build it.

--

To: David P. Reed <dpreed@...>
Cc: Andi Kleen <andi@...>, linux-os (Dick Johnson) <linux-os@...>, David Newall <david@...>, Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Alan Cox <alan@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 1:04 pm

Oh, thought your previous reply was already responding to this. The "other
diagnostic port", 0xed. The point is not so much that it's going to be a

Thanks, would be interesting. This one:

Rene.
--

To: Rene Herman <rene.herman@...>
Cc: David P. Reed <dpreed@...>, Andi Kleen <andi@...>, linux-os (Dick Johnson) <linux-os@...>, David Newall <david@...>, Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Alan Cox <alan@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 3:18 pm

Try replacing port 0x80 in include/asm-x86/io_*.h with 0xed... and see
if it makes your machine stable.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

To: David P. Reed <dpreed@...>
Cc: Andi Kleen <andi@...>, linux-os (Dick Johnson) <linux-os@...>, David Newall <david@...>, Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Alan Cox <alan@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 1:27 pm

Okay, this needs to be junked. I don't get it, but I get different results
from an -O2 and an -O0 compile on this one.

Anyone?

Rene.

--

To: David Newall <david@...>
Cc: Rene Herman <rene.herman@...>, Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 10:25 am

Each platform provides its own versions of the various _p functions which
work as required for that platform.

As to megaraid, I don't have the docs so I couldn't specifically tell you

Most of those platforms have hardware that was designed not to need those
delays and they know that their CMOS clock etc are not clocked at half

"vague specific" ? sorry don't follow you.

It's an ISA bus delay on systems that need it (or an LPC bus delay on

measured in what, against what, for which bus.

inb_p/outb_p are really only meaningful for ISA/LPC bus devices. In those
cases it is precisely defined. Its use for PCI devices is a bit suspect
and as a general rule probably wrong.

Alan
--

To: Alan Cox <alan@...>
Cc: Rene Herman <rene.herman@...>, Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Wednesday, December 12, 2007 - 6:18 pm

The _p variants are a universal fixture, defined as ending with a pause,
but without specifying the duration. (The duration is architecture
specific, mostly zero.) It really isn't a form that should be used in

Yes, it's now clear that all of this is so. Regrettably, it's used in
dozens of drivers, most having nothing to do with an ISA/LPC bus.

If it really is specific to the ISA architecture, then it should only be
used in architecture specific code.

I think the solution is to remove it. Replace all _p calls with the
non-_p variant, and add an explicit udelay. Udelay can initially be set
conservatively until it's been properly calibrated, allowing it to be
used during early boot. The good news is that it's only used in a few
dozen drivers, so that actually might be doable. And then, who knows,
maybe Microsoft might have to scratch their corporate heads, trying to
find out how to compete with a suddenly much faster Linux! :-p
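
As a rough illustration of that conversion, here is a sketch that builds in
user space. The mock_outb() and mock_udelay() stand-ins and the
cmos_select_register() helper are hypothetical names; in the kernel the
calls would be outb() and udelay(). The 2 us figure is the one discussed in
this thread, not a measured requirement of any particular device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space stand-ins so the sketch builds outside the kernel.
 * They only record what was requested; no port I/O happens. */
struct io_event {
    int is_delay;        /* 0 = port write, 1 = delay */
    uint16_t port;
    uint8_t value;
    unsigned int usecs;
};

#define IO_LOG_MAX 16
struct io_event io_log[IO_LOG_MAX];
size_t io_log_len;

void mock_outb(uint8_t value, uint16_t port)
{
    assert(io_log_len < IO_LOG_MAX);
    io_log[io_log_len++] = (struct io_event){ 0, port, value, 0 };
}

void mock_udelay(unsigned int usecs)
{
    assert(io_log_len < IO_LOG_MAX);
    io_log[io_log_len++] = (struct io_event){ 1, 0, 0, usecs };
}

/* Before: outb_p(reg, 0x70), a write plus an implicit "out 0x80" pause.
 * After: a plain write followed by an explicit, visible delay. */
void cmos_select_register(uint8_t reg)
{
    mock_outb(reg, 0x70);  /* 0x70 is the RTC/CMOS index port */
    mock_udelay(2);        /* explicit pause replacing the port 0x80 write */
}
```

The point of the explicit form is that the delay is stated in the driver
where it can be reviewed and tuned per device, instead of being hidden in
the accessor.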
--

To: David Newall <david@...>
Cc: Rene Herman <rene.herman@...>, Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Wednesday, December 12, 2007 - 7:00 pm

[Empty message]
--

To: Alan Cox <alan@...>
Cc: David Newall <david@...>, Rene Herman <rene.herman@...>, Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Thursday, December 13, 2007 - 9:13 am

Perhaps what was meant is that ISA-tuned timings make little sense on
devices that are part of the chipset or on the PCI or PCI-X buses?

On the other hand, since we don't know in many cases whether the "_p"
was supposed to mean "the time it takes to execute an "out al,80h" on
whatever bus structure happens to be on whatever machine, the problem is
unsolvable.

Ranting about whether ISA/LPC is on what machines seems to be of little
value in contributing to a constructive solution.

It seems to me that in the long term, driver writers would do well to
think more clearly about the timings their devices require, when that is
possible. They are probably implementation dependent - depending on
the clock speed of the particular clock that is driving the particular
i/o device.

Then there's the social problem of a community development project -
which is to get people to tune their code but preserve its ability to
run on older and variant machines.

--

To: David P. Reed <dpreed@...>
Cc: David Newall <david@...>, Rene Herman <rene.herman@...>, Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Thursday, December 13, 2007 - 9:21 am

On Thu, 13 Dec 2007 08:13:29 -0500

No.

ISA as LPC bus is alive and well inside and outside chipsets. Welcome to
planet earth and the reality of 'it's cheaper to reuse cells than design a
new one'. For the chipset logic like DMA controllers the _p is absolutely
correct.

Alan
--

To: Alan Cox <alan@...>
Cc: David Newall <david@...>, Rene Herman <rene.herman@...>, Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Thursday, December 13, 2007 - 9:50 pm

Simulating 1 microsecond delays (assuming LPC meets that goal for 0x80)
is "absolutely correct" for devices provided on PCI-X running on 3 GHz
or greater machines?

Well, you are entitled to your opinion. Seems likely that reading the
timing specs of such a chipset might be correct, and delaying for a time
proportional to CPU speed, rather than assuming running 3000 3GHz clock
cycles is needed on a very fast emulation of an old device that probably
runs at the fastest bus speed provided in the chipset.

Every device has different timing constraints. In the real world that I
live in.

--

To: David P. Reed <dpreed@...>
Cc: David Newall <david@...>, Rene Herman <rene.herman@...>, Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Friday, December 14, 2007 - 11:16 am

On Thu, 13 Dec 2007 20:50:33 -0500

Yes - the LPC bus clock doesn't change for the CPU clock.
--

To: David Newall <david@...>
Cc: Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 10:14 am

It not only could be, it _is_ true. Not using an output to port 0x80 is what

The latter probably and I don't bleedin' well care. In a discussion about
removing the out to 0x80 the only thing that is relevant is what it should

No damnit, you misunderstand. I'm saying that an outb_p _should_ be defined
in terms of the bus clock since if you want a wall-clock delay you should be
using just that.

The _hardware_ is synced to the bus clock and therefore, having a delay
available that is synced to the bus clock as well makes some sense. And
again again again again notwithstanding that, a udelay will still be an
okay replacement in practice.

Rene.
--

To: Rene Herman <rene.herman@...>
Cc: David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 9:32 am

Hello,

On Tue, 11 Dec 2007 14:16:01 +0100

Some results :

Core 2Duo 1.73GHz :
[root@tux tmp]# ./in
out = 2366
in = 2496
[root@tux tmp]# ./in
out = 3094
in = 2379

Plain old PIII 600 MHz:
[root@www-dev /tmp]# ./in
out = 314
in = 543
[root@www-dev /tmp]# ./in
out = 319
in = 538
[root@www-dev /tmp]# ./in
out = 319
in = 550
[root@www-dev /tmp]# ./in
out = 329
in = 531

Opteron 150 2.4GHz :
-bash-3.1# ./in
out = 4801
in = 4863
-bash-3.1# ./in
out = 5041
in = 4909
-bash-3.1# ./in
out = 4829
in = 4886

Paul

--

To: Paul Rolland <rol@...>
Cc: David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 10:15 am

Okay, these vary too wildly for you and might I suppose be a serialising
artifact or some such. Give me a bit and I'll try to improve it...

Rene
--

To: Paul Rolland <rol@...>
Cc: David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 11:28 am

This might be a bit more constant, I suppose. This serialises with cpuid.
Don't see a difference locally, but perhaps you do.
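
For readers without the attachment, a sketch of what serialising with cpuid
before reading the TSC can look like. This is not the posted test program;
read_ticks(), time_calls() and placeholder_op() are hypothetical names, and
GCC/Clang inline assembly is assumed. The real test timed an out to, or in
from, port 0x80, which needs I/O privileges, so a harmless placeholder
operation stands in for it here. On non-x86 builds the sketch falls back to
nanoseconds from clock_gettime().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
/* Execute CPUID, then read the TSC. CPUID is a serialising instruction,
 * so earlier instructions have completed before RDTSC samples the counter. */
uint64_t read_ticks(void)
{
    uint32_t eax = 0, ebx, ecx = 0, edx, lo, hi;

    __asm__ __volatile__("cpuid"
                         : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)
                         :
                         : "memory");
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    (void)ebx;
    (void)edx;
    return ((uint64_t)hi << 32) | lo;
}
#else
/* Fallback for non-x86 builds: monotonic nanoseconds instead of cycles. */
uint64_t read_ticks(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}
#endif

/* Placeholder for the operation under test (an assumption, see above). */
volatile unsigned long placeholder_counter;
void placeholder_op(void)
{
    placeholder_counter++;
}

/* Total ticks spent in `reps` calls of op(); divide by reps for an average. */
uint64_t time_calls(void (*op)(void), unsigned int reps)
{
    uint64_t start, end;
    unsigned int i;

    assert(op != NULL && reps > 0);
    start = read_ticks();
    for (i = 0; i < reps; i++)
        op();
    end = read_ticks();
    return end - start;
}
```

Averaging over many repetitions smooths out the occasional fluke caused by
the bus being busy.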

On a Duron 1300 with an actual ISA bus, "out" is between 1300 and 1600 for
me and "in" between 1200 and 1500 with a few flukes above that which will I
suppose be caused by the bus (ISA _or_ PCI) being momentarily busy or some
such...

Rene.
--

To: Rene Herman <rene.herman@...>
Cc: Paul Rolland <rol@...>, David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 12:32 pm

Here are my results on a PIII Xeon, 550 MHz, 440GX chipset, and an ISA
slot, which until recently was actually used with an 8 port serial
card:

jfsnew:~/src> sudo ./port80
out: 729
in : 348
jfsnew:~/src> sudo ./port80
out: 729
in : 354
jfsnew:~/src> sudo ./port80
out: 729
in : 350
jfsnew:~/src> sudo ./port80
out: 728
in : 346
jfsnew:~/src> sudo ./port80
out: 730
in : 340
--

To: John Stoffel <john@...>
Cc: Paul Rolland <rol@...>, David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 12:40 pm

Thank you. That's a little odd. The "in" time should be close to the "out"
time really.

Well, err, <shrug> I guess.

For now no one's contemplating replacing the out with an in anyways :-)

Rene.

--

To: Rene Herman <rene.herman@...>
Cc: David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 11:37 am

Hello,

On Tue, 11 Dec 2007 16:28:56 +0100

Well, yes, at least on the PIII and the Opteron... Core2 is still changing

The results :

Core 2Duo 1.73 GHz :
[root@tux tmp]# ./in2
out: 2777
in : 2519
[root@tux tmp]# ./in2
out: 2440
in : 2391
[root@tux tmp]# ./in2
out: 2460
in : 2388

Pentium III :
[root@www-dev /tmp]# ./in2
out: 746
in : 747
[root@www-dev /tmp]# ./in2
out: 746
in : 747
[root@www-dev /tmp]# ./in2
out: 746
in : 745

AMD Opteron 150 :
-bash-3.1# ./in2
out: 4846
in : 4845
-bash-3.1# ./in2
out: 4846
in : 4846
-bash-3.1# ./in2
out: 4846
in : 4845

Paul
--

To: Paul Rolland <rol@...>
Cc: David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 11:53 am

On 11-12-07 16:37, Paul Rolland wrote:

Great, thanks for the quick replies.

That last one below especially is quite a bit more than 1. As said before,
most hardware isn't in fact going to need anything but I suppose udelay(2)

Okayish I guess, especially when subsequent runs stay near those values.

4846 / 2400 = 2.02 us
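
The conversion is just the cycle count divided by the clock in MHz (cycles
per microsecond). A minimal sketch, where cycles_to_usec() is a hypothetical
helper name and the inputs are figures quoted in this thread:

```c
#include <assert.h>

/* Convert a cycle count to microseconds.
 * cpu_mhz is the clock the CPU ran at during the measurement,
 * which is the same as cycles per microsecond. */
double cycles_to_usec(unsigned long cycles, double cpu_mhz)
{
    assert(cpu_mhz > 0.0);
    return (double)cycles / cpu_mhz;
}
```

For example, cycles_to_usec(4846, 2400.0) is about 2.02 for the Opteron 150,
and cycles_to_usec(1430, 800.0) is about 1.79 for the port 0x80 write on the
dv9000z running at 800 MHz.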

Rene.

--

To: Rene Herman <rene.herman@...>
Cc: Paul Rolland <rol@...>, David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 12:58 pm

As the person who started this thread, I'm still puzzled by two things:

1) is there anyone out there who wrote one of these drivers (most listed
below, from a list of those I needed to patch to eliminate refs to _p
calls) or arch specific code (also listed below), who might know why the
_p macros are actually needed (for any platform)?

Note that many of the devices are not on the ISA/LPC bus now, even if
they were, and the vga has never needed a bus-level pause since the
original IBM PC existed. (it did need a sync with retrace, but that's
another story).

2) Why are Opterons and so forth so slow on out's to 0x80 as the
measurements show? That seems to me like there is a hidden bus timeout
going on. I'm still trying to figure out what is happening on my
machine which hangs when not in legacy mode (i.e. in ACPI mode) after a
lot of out's to 0x80. Perhaps the bus timeout handling is the issue.

I do remind all that 0x80 is a BIOS-specific standard, and is per BIOS -
other ports have been used in the history of the IBM PC family by some
vendors, and some machines have used it for DMA port mapping!! And
Windows XP does NOT use it at all. Therefore it may not be supported by
vendors, and may in fact be used for other purposes, since it can if the
BIOS doesn't use it.

I have a simple patch that fixes my primary concern - just change the
CMOS_READ and CMOS_WRITE, 64-bit versions of I/O and bootcode vga
accesses (first group below) to use the straight inb and outb code.

I may submit it so that the many others who share my pain will be made
happy - at least on modern _64 x86 machines those instructions don't
need the _p feature. The rest of the drivers and code are just lurking
disasters, which I hope can be resolved somehow by the community
figuring out what the timing delays were put there for...
-------------
arch/x86/boot/compressed/misc_64.c
arch/x86/kernel/i8259_64.c
arch/x86/pci/irq.c
include/asm/floppy.h
include/asm/io_64.h
include/asm/mc146818...
--

To: David P. Reed <dpreed@...>
Cc: Rene Herman <rene.herman@...>, Paul Rolland <rol@...>, David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 1:32 pm

Because the controllers were historically slower than the CPU and thus
clocked at half bus speed. Various chipsets simply shrank the logic

Sync with retrace is MDA memory updates.

The vga driver is somewhat misnamed. In console mode we handle everything

Older Windows does. Don't know about XP although DOS apps in XP will but

All .. none of them ?

That one is a mistake I believe, I'll dig out the docs.

--

To: Alan Cox <alan@...>
Cc: Rene Herman <rene.herman@...>, Paul Rolland <rol@...>, David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 3:19 pm

No they don't. I really, really, really know this for a fact. I wrote
ASM drivers for every early video adapter and ran them all through Lotus
QA and Software Arts QA. Personally. The only delay needed is caused
by not having dual-ported video frame buffers on the original CGA in
high res character mode. This caused "snow" when a memory write was done
concurrently with the read being done by the scanline character
generator. And that delay was done by waiting for a bit in the I/O port
space to change. There was NO reason to do waits between I/O
instructions. Produce a spec sheet, or even better a machine. I may
have an original PC-XT still lying around in the attic, don't know if I
can fire it up, but it had such graphics cards. I also have several

Not true. Again, I can produce machines that don't use 0x80. Perhaps
that is because I am many years older than you are, and have been
writing code for PC compatibles since 1981. (not a typo - this was

Show me one line of Windows code written by Microsoft that uses port
80. I don't know what app hackers might have done - there was no

I obviously have not. Clearly the guys who want this port 80 hack so
desperately have not either. That's why we are in this pickle. (well,
we only to the extent that I am accepted as having useful input. I'm

There is a long standing set of reports of "hwclock" not working on HP
dv.000v laptops, where the . stands for 2, 4, 6, and 9. These are all
nvidia MCP51 chipset AMD64's.

And if you choose to be such an insulting ****, I may just stop trying
to be helpful. I presume that others in the community find my comments
--

To: David P. Reed <dpreed@...>
Cc: Alan Cox <alan@...>, Rene Herman <rene.herman@...>, Paul Rolland <rol@...>, David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 4:27 pm

Hmmm,
I didn't know you worked in Boca Raton for Don Estridge on
the IBM 5150. I must have missed you --somehow.

[Snipped...]

You do remember that the X86 can do back-to-back port
instructions faster than the ISA bus capacity can be
charged, don't you? You do remember the admonishment
about:

mov dx, port    ; One of two adjacent ports
mov al, 0ffh    ; All bits set
out dx, al      ; Output to port, bits start charging bus
inc al          ; AL becomes 0
inc dx          ; Next port
out dx, al      ; Write 0 there, data bits discharged

When the port at 'port' gets its data, it will likely
be 0, not 0xff, because the intervening instructions
can execute faster than a heavily-loaded ISA bus.

So, with a true ISA/EISA bus, not an emulated one off
a bridge, you have to worry about this. In the IBM/PC
BIOS listing, supplied with every early real PC, it
was called "bus settle time." Remember? If not, you
never wrote code for that platform.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.22.1 on an i686 machine (5588.29 BogoMips).
My book : http://www.AbominableFirebug.com/
_

****************************************************************
The information transmitted in this message is confidential and may be privileged. Any review, retransmission, dissemination, or other use of this information by persons or entities other than the intended recipient is prohibited. If you are not the intended recipient, please notify Analogic Corporation immediately - by replying to this message or by sending an email to DeliveryErrors@analogic.com - and destroy all copies of this information, including any attachments, without reading or disclosing them.

Thank you.
--

To: linux-os (Dick Johnson) <linux-os@...>
Cc: Alan Cox <alan@...>, Rene Herman <rene.herman@...>, Paul Rolland <rol@...>, David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 7:56 pm

1) I found in a book, the Undocumented PC, that I have lying around that
the "pause" recommended for some old adapter chips on the ISA bus was 1
usec. The book carefully points out on various models of PCs how many
short jumps are required to implement 1 usec, and suggests that for
faster machines, 1 usec loops be calibrated. That seems like a good
heuristic.

2) Also, Dick, you got me interested in doing more historical research
into electrical specs and circuit diagrams (which did come with the IBM
5150). The bus in the original IBM PC had no problem with "bus capacity
being charged" as you put it. Perhaps you don't remember that the I/O
bus had the same electrical characteristics as the memory bus. Thus
there is no issue with the bus being "charged". The issue of delays
between i/o instructions was entirely a problem of whether the adapter
card could clock data into its buffers and handle the clocked in data in
time for another bus cycle. This had nothing to do with "charging" -
buses in those days happily handled edges that were much faster than 1 usec.
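
The calibration heuristic from point 1 can be sketched as follows. This is
only an illustration of the idea of timing a busy loop and scaling it to
1 usec, not the book's code and not how the kernel's udelay() is
implemented; all function names are hypothetical and POSIX clock_gettime()
is assumed.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* volatile so the compiler cannot optimise the busy loop away */
volatile unsigned long spin_sink;

void spin(unsigned long loops)
{
    while (loops--)
        spin_sink++;
}

uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* Double the loop count until one run takes at least 1 ms, then scale
 * the result down to loops per microsecond. */
unsigned long calibrate_loops_per_usec(void)
{
    unsigned long loops = 1024;
    unsigned long per_usec;
    uint64_t start, elapsed_ns;

    for (;;) {
        start = now_ns();
        spin(loops);
        elapsed_ns = now_ns() - start;
        if (elapsed_ns >= 1000000u)
            break;
        loops *= 2;
    }
    per_usec = (unsigned long)((uint64_t)loops * 1000u / elapsed_ns);
    return per_usec ? per_usec : 1;
}

/* Busy-wait for roughly `usecs` microseconds using the calibrated rate. */
void delay_usec(unsigned long usecs, unsigned long loops_per_usec)
{
    assert(loops_per_usec > 0);
    spin(usecs * loops_per_usec);
}
```

Calibrating once and reusing the result keeps the pause tied to wall-clock
time rather than to how long a write to some port happens to take.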

We at Software Arts did what we did based on direct measurements and
testing. We found that the early BIOS listings were usually fine, but
in fact were misleading. After all, the guys who built the machine and
wrote the BIOS were in a hurry. There were errata.

--

To: David P. Reed <dpreed@...>
Cc: Alan Cox <alan@...>, Rene Herman <rene.herman@...>, Paul Rolland <rol@...>, David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Wednesday, December 12, 2007 - 9:11 am

Wrong: the bus is not a clocked bus. Read a book. There is
a "clock" trace provided, but it has nothing to do with the
bus or its timing. The bus is not impedance-controlled, nor
is it clocked. It relies upon certain established states.
Look in the back of the IBM/PC book or read about the bus
in http://www.techfest.com/hardware/nis/isa.htm and observe

Yep. We are all wrong. You come out of nowhere and claim to
be right. Goodbye.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.22.1 on an i686 machine (5588.30 BogoMips).
My book : http://www.AbominableFirebug.com/
_

--

To: linux-os (Dick Johnson) <linux-os@...>
Cc: Alan Cox <alan@...>, Rene Herman <rene.herman@...>, Paul Rolland <rol@...>, David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Wednesday, December 12, 2007 - 3:42 pm

Who has attitude problems here? I have indeed learned a lot about assholes.

--

To: David P. Reed <dpreed@...>
Cc: Alan Cox <alan@...>, Rene Herman <rene.herman@...>, Paul Rolland <rol@...>, David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Wednesday, December 12, 2007 - 4:31 pm

Hmmm, I gave you every opportunity to back off your pretense
of writing code for the IBM/PC before it existed. You insisted
upon continuing your diatribe. You are the one with the attitude
problem. FYI, the last COMPLETE BIOS I wrote is on-line at:
http://www.AbominableFirebug.com/Sources/pbios.zip
It was written seven years ago. The first one I helped write
was written 36 years ago. I do know something about this stuff.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.22.1 on an i686 machine (5588.30 BogoMips).
My book : http://www.AbominableFirebug.com/
_

--

To: David P. Reed <dpreed@...>
Cc: Alan Cox <alan@...>, Rene Herman <rene.herman@...>, Paul Rolland <rol@...>, David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Linux kernel <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Friday, December 14, 2007 - 12:01 pm

I hastily responded to this with some invective of my own.
I wish to publicly apologize because I was just a bit hasty.
Dr. Reed is a principal contributor to networking as we
know it now, in particular, UDP. He is also known for
"Reed's Law," which points out a problem with the
scalability of large networks.

Of course, none of this means that he is right or wrong
about port 0x80 on these crappy Wintel machines.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.22.1 on an i686 machine (5588.30 BogoMips).
My book : http://www.AbominableFirebug.com/
_

--

To: <unlisted-recipients@...>, <@...>
Cc: Rene Herman <rene.herman@...>, Paul Rolland <rol@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Wednesday, December 12, 2007 - 12:12 pm

There is another reason we can't just do a dumb changeover - two actually

#1: Some drivers are using inb_p/outb_p in PCI cases which are going to
cause PCI posting changes. Most are probably just wrong in the first
place but they need hand checking

#2: We've got SMP cases that only 'work' because the odds of splitting
the outb and the following port 0x80 cycle, which locks the bus, are tiny.

That is we've got

CPU1                    CPU2
main path               irq path
outb
                        outb

inb 0x80
                        inb 0x80

races in one or two spots.
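
The interleaving described above can be made concrete with a toy trace checker. This is a sketch, not kernel code: the enum, the two traces and count_back_to_back() are hypothetical names, and the "delay" event stands for the port 0x80 cycle that normally follows each device access.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event types for a recorded bus trace. */
enum bus_op {
    OP_DEVICE_WRITE,  /* outb to the slow device */
    OP_DELAY          /* the port 0x80 cycle (or any other delay) */
};

/*
 * Count device writes that directly follow another device write,
 * i.e. back-to-back cycles with no delay between them.
 */
static size_t count_back_to_back(const enum bus_op *trace, size_t n)
{
    size_t violations = 0;

    for (size_t i = 1; i < n; i++) {
        if (trace[i] == OP_DEVICE_WRITE && trace[i - 1] == OP_DEVICE_WRITE)
            violations++;
    }
    return violations;
}

/* Racy interleaving: CPU2's outb lands before CPU1's delay cycle. */
static const enum bus_op racy_trace[] = {
    OP_DEVICE_WRITE, OP_DEVICE_WRITE, OP_DELAY, OP_DELAY
};

/* Same work with each write and its delay kept together (e.g. under a lock). */
static const enum bus_op serialized_trace[] = {
    OP_DEVICE_WRITE, OP_DELAY, OP_DEVICE_WRITE, OP_DELAY
};
```

The racy trace contains one back-to-back pair and the serialized trace contains none, which is the property a driver needs its own locking to guarantee once the bus-locking side effect of the port 0x80 cycle is gone.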
--

To: Alan Cox <alan@...>
Cc: <unlisted-recipients@...>
Date: Friday, December 14, 2007 - 10:33 am

hm, any intelligent way to force PCI posting? I guess not.

here's a list of candidate drivers (match the out*_p() pattern and do
pci)

./char/epca.c
./char/sonypi.c
./scsi/megaraid.c
./ide/pci/serverworks.c
./ide/pci/cmd640.c
./input/mouse/pc110pad.c
./i2c/busses/i2c-amd756.c
./i2c/busses/i2c-ali15x3.c
./i2c/busses/i2c-ali1563.c
./i2c/busses/i2c-ali1535.c
./i2c/busses/i2c-viapro.c
./i2c/busses/i2c-nforce2.c
./i2c/busses/i2c-i801.c
./i2c/busses/i2c-piix4.c
./hwmon/vt8231.c
./hwmon/via686a.c
./hwmon/sis5595.c
./telephony/ixj.c
./net/irda/donauboe.c
./watchdog/pcwd_pci.c

which seems to suggest we are better off not doing the port 0x80 trick
at all.

Ingo
--

To: linux-os (Dick Johnson) <linux-os@...>
Cc: David P. Reed <dpreed@...>, Alan Cox <alan@...>, Paul Rolland <rol@...>, David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 4:34 pm

Frankly, if there ever was a good reason for _not_ merging i386 and x86-64
it would've been having an escape from these kinds of discussions...

Rene.
--

To: linux-os (Dick Johnson) <linux-os@...>
Cc: Rene Herman <rene.herman@...>, Alan Cox <alan@...>, Paul Rolland <rol@...>, David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 5:03 pm

Dick - I didn't work for Don in Boca. I did know him, having met with
him several times when he was still alive. I worked from 1979-1985 as a
consultant and eventually VP R&D, at Software Arts in Cambridge, MA, and
there was a machine we developed the first IBM Visicalc for, in a locked
room which required NDA sign-in, with a list of authorized employees and
consultants. The machine was a plywood board. It was not a 5150, yet.

Note that I did not say I worked in Boca. Funny thing to twist my
comments into that assertion.

Note: I am not trying to say that I know everything about the history of
PC-compatibles, nor am I trying to prove some kind of macho thing. But
I do happen to have a lot of practical experience in this space.

In particular, I am trying to contribute to Linux to make it better.
Largely because I think you guys are doing a great thing, and as a user
of it, I think it's a good thing to give back.

--

To: David P. Reed <dpreed@...>
Cc: Rene Herman <rene.herman@...>, Paul Rolland <rol@...>, David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 4:16 pm

>> And if you choose to be such an insulting ****

Fine. I won't bother submitting patches to fix this because I don't seem
to care any more. The only person who is suffering seems to have an
attitude problem and the only people I work for with attitude problems
are customers of my employer.

Alan
--

To: David P. Reed <dpreed@...>
Cc: Alan Cox <alan@...>, Rene Herman <rene.herman@...>, Paul Rolland <rol@...>, David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 3:36 pm

PC-XT does not count ... it needs to be 386 capable to

No one _wants_ this port 0x80 hack. We already have it, had it for 10+
years, and now we are trying to get rid of it -- _safely_.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

To: David P. Reed <dpreed@...>
Cc: Rene Herman <rene.herman@...>, Paul Rolland <rol@...>, David Newall <david@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 1:05 pm

Correction: ALL machines use it for DMA port mapping. The port is
assigned to the legacy DMA controller, but performs no operation.
That's what makes it safe to write (NOT read!)

-hpa
--

To: David P. Reed <dpreed@...>
Cc: Paul Rolland <rol@...>, David Newall <david@...>, H. Peter Anvin <hpa@...>, Krzysztof Halasa <khc@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, <rol@...>
Date: Tuesday, December 11, 2007 - 1:01 pm

There are lots of things concerning the PC that are documented nowhere and are
still true. Did you test 0xed?

Rene.
--

To: David Newall <david@...>
Cc: Krzysztof Halasa <khc@...>, Rene Herman <rene.herman@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>
Date: Monday, December 10, 2007 - 9:51 pm

That the _p delay is different across platforms is actually to be
expected, since it pretty much amounts to a platform delay. And yes, if
it is used as a specific walltime delay that has nothing to do with the
bus architecture of the system then I would classify that as a driver bug.

-hpa
--

To: David Newall <david@...>
Cc: Krzysztof Halasa <khc@...>, Rene Herman <rene.herman@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>
Date: Monday, December 10, 2007 - 9:46 pm

What it specifically does is it generates a delay which is proportional

Perhaps you do, but it's the de facto standard on the platform. Every
BIOS uses the same technique, because it works.

*Now*, the real question is how many drivers actually need these delays.
My guess is most don't at all.

-hpa

--

To: Krzysztof Halasa <khc@...>
Cc: Rene Herman <rene.herman@...>, Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Monday, December 10, 2007 - 2:02 pm

Hard disk is limited to about 2 MB/s when connected through ISA controller,

Yes, ISA clock can be changed on many machines. Most cards run fine even at 11
(33/3) MHz with zero wait states. In fact, I haven't seen any card that did

--
Ondrej Zary
--

To: Krzysztof Halasa <khc@...>
Cc: Pavel Machek <pavel@...>, Andi Kleen <andi@...>, Alan Cox <alan@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Monday, December 10, 2007 - 8:08 am

Yes, the thing is that I'm fairly convinced that an out to an unused port
takes sort of exactly 1 us (8 cycles at 8 MHz). That's what I've always
known it to be at least and have taken as the point. The same's also said
here for example:

http://tldp.org/HOWTO/IO-Port-Programming-4.html

as well as in the bit of MINIX source in an earlier version of this thread:

http://lkml.org/lkml/2002/3/14/194

Oh, in fact, a few older posts by Alan himself:

http://lkml.org/lkml/2002/3/15/2
http://lkml.org/lkml/2003/9/22/263

I'm not a hardware person at the level of actual electrical signals but if
an earlier instance of Alan is disagreeing with the current one as well, I
guess it's halfway safe to join that one...

Rene.

--

To: Pavel Machek <pavel@...>
Cc: Andi Kleen <andi@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Sunday, December 9, 2007 - 6:29 pm

On Sun, 9 Dec 2007 22:25:13 +0100

You need to stick in a bug trap to verify that the udelay is not called
before the cpu timer has been set up.
--

To: Alan Cox <alan@...>
Cc: Andi Kleen <andi@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Sunday, December 9, 2007 - 7:22 pm

Really?

udelay() seems to use
... cpu_data(raw_smp_processor_id()).loops_per_jiffy ..

..so it seems that bug trap is already there... because
raw_smp_processor_id() will probably just oops...

We could solve this by pre-initializing loops_per_jiffy to some huge
number, but I do not see convenient place where to do that.
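
A minimal sketch of that pre-initialization idea, in plain C rather than kernel code. The constant BOOT_LOOPS_PER_USEC, the variable early_loops_per_usec and both functions are hypothetical, and the worst-case value is an arbitrary assumption chosen only to be larger than any realistic calibrated value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical worst case: assume an implausibly fast CPU until calibrated. */
#define BOOT_LOOPS_PER_USEC 20000ull

/* Starts conservative, so early delays are too long rather than too short. */
static uint64_t early_loops_per_usec = BOOT_LOOPS_PER_USEC;
static int delay_calibrated;

/* Called once the real timing is known; replaces the boot-time guess. */
static void set_calibrated_delay(uint64_t measured_loops_per_usec)
{
    assert(measured_loops_per_usec > 0);
    early_loops_per_usec = measured_loops_per_usec;
    delay_calibrated = 1;
}

/* Number of busy-loop iterations to spin for the requested delay. */
static uint64_t loops_for_delay(uint64_t usec)
{
    return usec * early_loops_per_usec;
}
```

Before calibration every delay is over-long, which costs boot time but never violates a device's minimum timing; after set_calibrated_delay() the same call sites get the measured value.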
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

To: Pavel Machek <pavel@...>
Cc: Andi Kleen <andi@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Monday, December 10, 2007 - 8:02 am

And I double checked my docs - they say 8 cycles - 1uS

Incidentally some of the drivers seem buggy for SMP. The bus locking
nature of the inb_p probably hid this but they don't all seem to have
sufficient locking to ensure that we don't get back to back cycles
without delays

Alan
--

To: Pavel Machek <pavel@...>
Cc: Alan Cox <alan@...>, Andi Kleen <andi@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Sunday, December 9, 2007 - 11:54 am

Yes, there are PCI POST debug cards too. And also MiniPCI ones for notebooks.
Port 0x80 could be used for more useful things than delay with these cards
(and mainboards with integrated LED display) - like (kernel) debugging or as
a temperature display (when exported to userspace).

--
Ondrej Zary
--

To: Pavel Machek <pavel@...>
Cc: Alan Cox <alan@...>, Andi Kleen <andi@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Sunday, December 9, 2007 - 9:41 am

In the opposite direction I'm sure I've heard of things that port 80
information down i2c - could this be slower?

Dave
--
-----Open up your eyes, open up your mind, open up your code -------
/ Dr. David Alan Gilbert | Running GNU/Linux on Alpha,68K| Happy \
\ gro.gilbert @ treblig.org | MIPS,x86,ARM,SPARC,PPC & HPPA | In Hex /
\ _________________________|_____ http://www.treblig.org |_______/
--

To: Alan Cox <alan@...>
Cc: Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Saturday, December 8, 2007 - 3:25 pm

Disagree. The definitions of PC compatible are quite problematic. I
have the advantage over some of you young guys, in that I actually wrote
code on one of the first 5 breadboard IBM PCs on the planet at Software
Arts, Inc. and I was directly involved in hardware spec projects with
the original IBM and Compaq engineers. No one actually defined the port
numbered 80h as a "standard" for anything. You won't find it documented
in any early manual for an IBM machine.

The ISA bus supported unterminated transactions safely. That allowed
some clever folks to design BIOS diagnostic tools that optionally
plugged into the bus.

In any case, my machine does not have an ISA bus. Why should it? It's
a laptop!

Now the interesting thing is that I have been scanning the source code
of Linux, and I find gazillions of inb_p outb_p and so forth
instructions where they have NO value. It's as if some hacker who half
understood hardware threw in the _p "just to be safe". Well, it's
neither safe, nor is it economical of code or data. It hangs up the bus
on an MP machine, for example, even when it works, to do the delay by
"outb al,80h"

Worse, the actual requirements of the gazillions of inb_p instructions
for delays are not documented in the code! This is interesting, because
the number of devices likely to need a delay after providing data on an
"in" instruction is very likely to be near zero. After all, the device
has already serviced the bus and delivered data! Why put many
microseconds into the bus, locking out other ISA transactions (and PCI
maybe too) with an out to port 80?

Some of the code in linux is really nice, really clean, really
well-thought out. Some is ... well, I'm not trolling for a fight.

--

To: David P. Reed <dpreed@...>
Cc: Alan Cox <alan@...>, Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Tuesday, December 11, 2007 - 11:14 am

Really? Are you sure? How does the CPU talk to the BIOS? How about
the parallel port if you have one? (I will assume you have no serial
ports since almost no laptop does anymore). Just because you don't see
such a bus doesn't mean you don't have one. Even PCMCIA uses the ISA
bus, although many new laptops are starting to have expresscard slots
instead which eliminates that problem. LPC (which is ISA in a
different form factor) is still around on most if not all x86 systems.

--
Len Sorensen
--

To: David P. Reed <dpreed@...>
Cc: Alan Cox <alan@...>, Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Sunday, December 9, 2007 - 9:22 am

There are mini-pci based cards with port 0x80 displays for notebooks.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

To: David P. Reed <dpreed@...>
Cc: Alan Cox <alan@...>, Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Sunday, December 9, 2007 - 1:04 am

A bus is not something with expansion slots. Your machine has an ISA bus, or
LPC rather, if only to hang its BIOS off. That earlier report about BIOS
chips shitting themselves due to aborts on LPC together with ACPI involving
the BIOS sounds a bit suspicious (and in that case using 0xed shouldn't help
any).

Rene.
--

To: David P. Reed <dpreed@...>
Cc: Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Saturday, December 8, 2007 - 4:26 pm

Yes it does. The branding spec said "No ISA bus" so it was renamed "LPC"

Historically processors didn't have a high precision time source so it

Like all things, it doesn't always age well 8)
--

To: Alan Cox <alan@...>
Cc: David P. Reed <dpreed@...>, Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>
Date: Tuesday, December 11, 2007 - 1:58 am

Well that, plus it was serialized and uses PCI electricals and timing,
hence the LPC (Low Pin Count) moniker. Its performance is pretty much
exactly ISA, though, and unlike PCI it provides full support for all
legacy ISA features like slave DMA.

-hpa
--

To: David P. Reed <dpreed@...>
Cc: Alan Cox <alan@...>, Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Saturday, December 8, 2007 - 3:50 pm

It has been a de facto standard for quite some time. Many motherboards
even come with built-in port 80 displays, and port 80 cards are a standard
diagnostic tool. Pretty much all of the standard BIOSes
write diagnostic messages to port 80. While in theory a vendor
could change those BIOSes, they are pretty unlikely to do that.

Have you tried yet, as someone asked earlier, whether using another
free port that also leads to aborts causes the hang too?
If yes you would know for sure it is nothing on port 80,
but something not liking aborts (similar to the problem Eric B.
found earlier)

Anyways using udelay is likely the way to go for modern

It is hard to know afterwards. In the past we definitely had systems
who needed such delays. But some of it might be what you said.

-Andi
--

To: Andi Kleen <andi@...>
Cc: Alan Cox <alan@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Saturday, December 8, 2007 - 4:47 pm

I am going to do a test on another "unused" port.

However, I realized as I was thinking about this. 0x80 is the
"diagnostic device" port. It is not an "unused" port.

Normally, Linux would support a device like the diagnostic device by
providing a character device, called /dev/post-diagnosis (for the
power-on test diagnostic). That device would reserve port 80 for its
use, and the driver could be loaded if there was such a device.

Now one possibility is that my laptop contains a diagnostic code device
that stores all the out's to port 80 (documented only to the designers,
and kept "secret"). That device may need "clearing" periodically,
which is perhaps done by the SMM, which is turned off when I go into
ACPI-on state. Or maybe it is designed to be cleared only when the
system boots at the beginning of the BIOS. What happens when (as
happens in hwclock's polling of the RTC) thousands of in/out*_p calls
are made very fast? Well, perhaps it is not cleared quickly enough, and
hangs the bus.

The point here is that Linux is NOT using a defined-to-be "unused"
port. It IS using the "diagnostic" port, and talking to a diagnostic
device that *is* used, and may be present.

Just doesn't seem clean to me.

So I'd suggest 2 actions:

1) figure out a better implementation of _p that is "safe" and doesn't
use questionable heuristics. udelay seems reasonable because it doesn't
drive contention on the busses on SMP machines, but perhaps someone has
a better idea.

2) Start a background task with the maintainers of drivers to clean up
their code regarding these short delays for slow devices (note that it's
never because the *bus* is slow, but because the *device* is slow.)
Perhaps this could be helped by "deprecating" the _p calls and
suggesting an alternative that requires the coder to be precise about
what the delay is for, and how long it is supposed to be, perhaps on a
per-machine basis.
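
One possible shape for such an alternative, sketched in user-space C. Everything here is hypothetical: struct io_port_dev, dev_outb() and the fake_* stand-ins are invented for illustration and are not an existing kernel API. The point is only that the delay becomes a documented per-device property instead of an implicit side effect of a _p call.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-device description: the delay requirement is explicit. */
struct io_port_dev {
    uint16_t port;
    unsigned int settle_usec;   /* documented device requirement, 0 = none */
};

/* Stand-ins so the sketch runs in user space; a driver would do real I/O. */
static unsigned int fake_writes;
static unsigned int fake_total_delay_usec;

static void fake_port_write(uint16_t port, uint8_t val)
{
    (void)port;
    (void)val;
    fake_writes++;
}

static void fake_wait_usec(unsigned int usec)
{
    fake_total_delay_usec += usec;
}

/* Write to the device, then wait only as long as this device needs. */
static void dev_outb(const struct io_port_dev *dev, uint8_t val)
{
    fake_port_write(dev->port, val);
    if (dev->settle_usec)
        fake_wait_usec(dev->settle_usec);
}
```

A device declared with settle_usec = 0 pays no delay at all, and one that does need a pause states how long and why in a single place that a maintainer can audit.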

--

To: David P. Reed <dpreed@...>
Cc: Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Saturday, December 8, 2007 - 5:04 pm

Send patches. For a lot of the devices we know what the requirements are
as it's locked to ISA cycle times.

Alan
--

To: Alan Cox <alan@...>
Cc: Andi Kleen <andi@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Friday, December 7, 2007 - 12:31 pm

How would you make it conservative enough to handle, let's say, a 6 GHz CPU
that can execute multiple jumps per cycle?

-Andi
--

To: Andi Kleen <andi@...>
Cc: Andi Kleen <andi@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Friday, December 7, 2007 - 1:19 pm

On Fri, 7 Dec 2007 17:31:16 +0100

Pick a sane worst case and go with it at boot. We don't have to be
accurate before we tune udelay - over long in uSecs isn't going to hurt,
and most post boot _p's can be replaced by udelay(8) now

Alan
--

To: Alan Cox <alan@...>
Cc: Andi Kleen <andi@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Friday, December 7, 2007 - 2:45 pm

Isn't 8 generally a bit overly long? I believe the norm is 1?

Rene.
--

To: Rene Herman <rene.herman@...>
Cc: Andi Kleen <andi@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Friday, December 7, 2007 - 2:42 pm

On Fri, 07 Dec 2007 19:45:25 +0100

8uS is an ISA bus transaction.
--

To: Alan Cox <alan@...>
Cc: Andi Kleen <andi@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Friday, December 7, 2007 - 3:25 pm

You very likely know better but just in case you're confused -- I thought it
was 8 cycles...

Rene.

--

To: Rene Herman <rene.herman@...>
Cc: Andi Kleen <andi@...>, David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Friday, December 7, 2007 - 5:45 pm

I'll double check. It's a long time since I stuck a scope on an ISA bus
--

To: Andi Kleen <andi@...>
Cc: <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Friday, December 7, 2007 - 10:50 am

This all presumes that you need any delay at all. From back in the
early days (when I was writing DOS and BIOS code on 80286 class
machines) the /only/ reason this was a problem was using really slow
acting, non-buffered chips compared to the processor clock (8259?). If
you think about it, if there is a sequence such as outb->device,
inb<-device, the only reason for a delay would be that the device failed
to process the out command, /and/ the device had no "done" flag. The
other "slow" problem would be an out->device, out->device at a rate
higher than the device could handle because it had a one-level buffer
that ignored input that came too fast after the previous, but didn't
stall the bus to protect the device. Modern machines just are not
designed that way - a few of the early PC compatibles were.

My machine in question, for example, needs no waiting within CMOS_READs
at all. And I doubt any other chip/device needs waiting that isn't
already provided by the bus. The i/o to port 80 is very, very odd in
this context. Actually, modern machines have potentially more serious
problems with i/o ops to non-existent addresses, which may cause real
bus weirdness.

So that's why I suggested the short-jump answer - it fixes the problem
on the ancient machines, but doesn't do anything on the modern ones,
where there should be no problem.

One patch that makes immediate sense is to use the "virtualization"
hooks for the CMOS_READ/WRITE ops that is there in the 32-bit code to
allow substitution of a workable sequence for the RTC, which is where I
experience the problem on my machine. This doesn't fix any lurking
issues with the _p APIs, since they are not virtualized. I'd suggest
the safest possible route that would fix my machine would be either an
early_quirk, a boot parameter, or both that would then control the
virtualization hook logic.

That patch would fix my machine's current issues, and would not harm any
machines that need the 0x80 del...

To: David P. Reed <dpreed@...>
Cc: Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Wednesday, December 5, 2007 - 7:10 am

I dislike outb_p clobbering port 0x80, but you are wrong here. BIOSes
already do outs to port 0x80 for debugging reason, so these accesses
are unlikely to do something bad.

Can we just do udelay(1) instead of port 80 access?

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

To: Pavel Machek <pavel@...>
Cc: David P. Reed <dpreed@...>, Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Friday, December 7, 2007 - 8:21 pm

They only do that briefly during boot though. But Linux
can do it much more often. If it's a race
or similar it might just not trigger with the BIOS.

-Andi
--

To: David P. Reed <dpreed@...>
Cc: Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Friday, December 7, 2007 - 10:54 am

I don't know about CMOS, but there were definitely some not too ancient
systems (let's say not more than 10 years) who required IO delays in the
floppy driver and the 8253/8259. But on those the jumps are already
far too fast.

-Andi
--

To: Andi Kleen <andi@...>
Cc: David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>
Date: Tuesday, December 11, 2007 - 1:53 am

Yes, early Linux used jumps. I believe it broke a bunch of machines
when the P5 came out, as the jumps were too fast. (I have to admit to
being a bit fuzzy on this... my memory says it was the 486 and not the
P5, but that clearly can't be the case since my first Linux box was a
486/33.)

-hpa
--

To: Andi Kleen <andi@...>
Cc: David P. Reed <dpreed@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Friday, December 7, 2007 - 11:43 am

Also see Alan's replies in the thread I posted a link to:

http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-09/5700.html

Also 8254 (PIT) at least it seems.

Rene.
--

To: David P. Reed <dpreed@...>
Cc: Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Friday, December 7, 2007 - 12:28 pm

By the way, David, it would be interesting if you could test 0xed. If your
problem is some piece of hardware getting upset at LPC bus aborts it's not
going to matter and we'd know an outb delay is just not an option on your
system at least. You said you could quickly reproduce the problem with port
0x80?

Rene.
--

To: Rene Herman <rene.herman@...>
Cc: David P. Reed <dpreed@...>, Andi Kleen <andi@...>, <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>
Date: Monday, December 10, 2007 - 8:31 pm

I tried 0xED for a few versions (1.31-1.37) of SYSLINUX. It broke on a
lot of hardware (Phoenix BIOS uses 0xED by default, but BIOSes don't
have to work on arbitrary hardware.)

-hpa
--

To: David P. Reed <dpreed@...>
Cc: <linux-kernel@...>, Thomas Gleixner <tglx@...>, Ingo Molnar <mingo@...>, H. Peter Anvin <hpa@...>
Date: Thursday, December 6, 2007 - 8:15 pm

Post boot we can use udelay() for this. Earlier I guess we could use
udelay and make sure it starts "safe" before we know the timing.

Alan
--
