Re: Could the k8temp driver be interfering with ACPI?

From: Chuck Ebbert
Date: Friday, February 16, 2007 - 10:31 am

Recently my notebook has started shutting down with
these messages in the logs:

	ACPI: Critical trip point
	Critical temperature reached (128 C), shutting down.

But it didn't seem hot at all to me, so I wrote a script to
cat /proc/acpi/thermal_zone/THRM/temperature once a second
and eventually caught this (but no shutdown):

temperature:             47 C
temperature:             47 C
temperature:             128 C
temperature:             48 C
temperature:             47 C
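
A minimal sketch, in C, of the check such a polling script makes possible, assuming input lines in the `temperature: NN C` format shown above. The names `parse_temp_line` and `is_spike` and the jump threshold are illustrative assumptions, not part of any kernel or ACPI interface.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse a line such as "temperature:             47 C".
 * Returns 0 and stores the value on success, -1 on a malformed line. */
int parse_temp_line(const char *line, int *celsius)
{
	assert(line && celsius);
	return sscanf(line, "temperature: %d C", celsius) == 1 ? 0 : -1;
}

/* Flag a reading that differs from the previous one by more than
 * max_jump degrees; a jump of that size within one second is
 * implausible for a real temperature and hints at a bad read. */
int is_spike(int prev, int cur, int max_jump)
{
	return abs(cur - prev) > max_jump;
}
```

With the readings above, the step from 47 to 128 is flagged while the step from 47 to 48 is not.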

Google found several people reporting problems like mine
after installing lm-sensors, and when I looked at the list
of loaded modules I found k8temp and hwmon there. Then I
realized my problems had started after installing a 2.6.19
kernel that had the new k8temp driver.

So, could ACPI and the k8temp driver be at odds?
-

From: Len Brown
Date: Friday, February 16, 2007 - 10:57 am

Yes.
-Len
-

From: Chuck Ebbert
Date: Friday, February 16, 2007 - 11:14 am

Hmm, now it's showing 130 degrees once every five seconds
and 54 degrees for the other four.

And while I was typing this on another machine I heard a
click from the notebook -- it shut down again.

-

From: Andi Kleen
Date: Friday, February 16, 2007 - 12:59 pm

Yes, there is no locking between ACPI and Linux drivers for register access.
E.g. if there is an indexed register that both try to access (and temperature
sensors tend to use these things) they can race and get corrupted data.
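
The race on an indexed register can be replayed deterministically in user space. This is only a sketch under stated assumptions: the `struct chip` model, the register numbers and the function names are made up for illustration and do not describe the K8 sensor or any real chip.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical chip with one index port and one data port. */
struct chip {
	uint8_t index;      /* last value written to the index port */
	uint8_t regs[256];  /* register file behind the data port */
};

void write_index(struct chip *c, uint8_t reg) { c->index = reg; }
uint8_t read_data(const struct chip *c) { return c->regs[c->index]; }

/* Replays the bad interleaving: one party selects a register, then the
 * other party selects a different one before the first reads the data
 * port. Returns what the first party ends up reading. */
uint8_t racy_read(struct chip *c, uint8_t first_reg, uint8_t second_reg)
{
	assert(c);
	write_index(c, first_reg);    /* first party: select its register */
	write_index(c, second_reg);   /* second party: selects another one */
	(void)read_data(c);           /* second party: reads its own value */
	return read_data(c);          /* first party: gets the wrong register */
}
```

If register 0x10 holds 47 and register 0x20 holds 128, `racy_read(&c, 0x10, 0x20)` returns 128 even though the caller asked for register 0x10.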

-Andi
-

From: Chuck Ebbert
Date: Friday, February 16, 2007 - 12:31 pm

I blacklisted the k8temp driver and everything looks OK now.

-

From: Jean Delvare
Date: Sunday, February 18, 2007 - 10:32 am

Hi Chuck,


You could blacklist the ACPI "thermal" module instead. If you're
interested in monitoring your CPU temperature, k8temp is IMHO more
convenient to use than ACPI, as it interfaces properly with libsensors
and all hardware monitoring user interfaces.

-- 
Jean Delvare
-

From: Andi Kleen
Date: Sunday, February 18, 2007 - 4:22 pm

That would be somewhat dangerous because ACPI thermal does more than
just displaying the temperature. Especially laptops often need it.

-Andi
-

From: Rudolf Marek
Date: Saturday, February 17, 2007 - 3:49 am

Hello Chuck,

I'm the author of K8temp. Please can you share with us your DSDT table?

Yes, because ACPI AML code has no synchronization with Linux drivers. A second
reason is that ACPI AML code assigns resource regions to itself, but with the
busy flag cleared, so other drivers can bind to them and possibly interfere with
ACPI.

This is a very long-standing problem. I have already proposed some possible
solutions
(http://lists.lm-sensors.org/pipermail/lm-sensors/2007-February/018788.html).

There are some ideas, but none is implemented yet. As you already wrote, the
best solution is to stop using the k8temp driver.

I will check the DSDT table to confirm this fact.

Thanks,
Rudolf
-

From: Chuck Ebbert
Date: Saturday, February 17, 2007 - 11:14 am

The system is Compaq Presario V2300 series notebook.  I won't be able

Well I had an idea after looking at k8temp -- why not make it default to
doing only reads from the sensor?  You'd only get information from whatever
core/sensor combination that ACPI had last used, but it would be safe.


-

From: Jean Delvare
Date: Sunday, February 18, 2007 - 10:38 am

ACPI is broken here, not k8temp, so let's fix ACPI instead. ACPI
doesn't conflict with only k8temp, but with virtually all hardware
monitoring drivers, all I2C/SMBus drivers, and probably other types of
drivers too. We just can't restrict or blacklist all these drivers
because ACPI misbehaves.

-- 
Jean Delvare
-

From: Matthew Garrett
Date: Tuesday, February 20, 2007 - 8:18 am

No, the simple fact of the matter is that if you're running on an ACPI 
platform you need to change some of your assumptions. ACPI owns the 
hardware. The OS doesn't. To an extent this has always been true on 
laptops and servers /anyway/ - the BIOS is free to have a wide variety 
of SMM insanity that invalidates basic assumptions like "If I hold this 
lock, nothing can interrupt me between this write and this read". That's 
simply not true.

So this isn't about fixing ACPI. It's about trying to find a mechanism 
that allows ACPI and raw hardware drivers to coexist, which is made 
somewhat harder by it not being a situation that the platform designers 
have considered in the slightest. The suggested low-level driver for 
io-port arbitration would certainly be a step forward in making this 
work better.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
-

From: Luca Tettamanti
Date: Tuesday, February 20, 2007 - 8:33 am

Motherboard vendors usually provide tools for $(TheOtherOS) that can
read from all thermal  / fan / voltage / whatever sensors, so I guess
it's possible to make the ACPI driver and the "raw" one play nice with
each other[1].

Luca
[1] Unless their solution is "poke at the hardware and hope that ACPI
doesn't blow up", that is.
-

From: Jean Delvare
Date: Wednesday, February 21, 2007 - 7:59 am

Hi Luca,


Without the sources it's hard to tell. And all these applications are
vendor-specific, so if they indeed have ways to avoid conflicting
accesses between ACPI and the rest of the system, these ways are likely
to be vendor-specific as well, and not documented.

Either way, this means we need the support from hardware vendors to
solve this concurrent access problem, and unfortunately I doubt this
happens anytime soon :(

-- 
Jean Delvare
-

From: Jean Delvare
Date: Wednesday, February 21, 2007 - 8:07 am

Hi Matthew,


The Linux device driver model assumes that it owns the hardware. If
this is not true, then should we prevent any non-ACPI driver from

Yeah, this is correct, and just as unfortunate. It's amazingly sad that
hardware vendors as a whole are still repeating the same design


I sure hope we can find a solution, but as you said yourself, nothing
is going to prevent SMM and similar oddities from messing up the drivers'
assumptions.

-- 
Jean Delvare
-

From: Pavel Machek
Date: Wednesday, February 28, 2007 - 2:38 pm

Oops, sorry about that but no, that will not work.

There's piece of paper, called ACPI specification, and we are
following it.

Bug is not in our implementation.

Bug is in the ACPI specs... it does not explicitly allow you to go
out and bitbang i2c, and you do it, and you get problems.

Now, you may try to change specs to be hwmon-friendly... good luck.

But currently hw manufacturers follow ACPI specs, so we have to follow
it, too; bad luck for hwmon. BIOS hiding smbus from you is good hint
you are doing something wrong...?
							Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-

From: Jean Delvare
Date: Thursday, March 1, 2007 - 7:26 am

Hi Pavel,


I never suggested otherwise. But the Linux 2.6 device driver model is
based, in part, on the fact that each driver must request the
resources it needs before actually using them. The acpi subsystem fails
to do that, so it is, in that sense, misbehaving. This is the root cause

It really doesn't have anything to do with bit-banged I2C. It's about
SMBus master chips (most often embedded in south bridges), hardware
monitoring chips (be they SMBus-based or not), or basically any driver
which uses a resource ACPI is using as well (virtually any driver
accessing a Super-IO/LPC device is affected, for example.)

Secondly, your logic is somewhat broken. Just because one specification
doesn't explicitly allow me to do something, I must stop doing it?
There must be two dozen different specifications modern PC hardware
supposedly conforms to. If we stop doing everything which isn't
explicitly allowed by all of them, I fear there won't be much left ;)
A better question would be IMHO: Does the ACPI specification
explicitly _prohibit_ direct access to some categories of
hardware?

Lastly, I wholeheartedly agree that the core problem appears to be in
the ACPI design rather than in the Linux implementation. As a matter of
fact, other operating systems are facing the same problem [1]. The

I would like them to be driver-model-friendly, that's even a broader

I would love things to be that easy, but unfortunately they are not.

Firstly, the first records of hidden SMBus, in September 2000, predate
ACPI. All the early boards where the SMBus was hidden did not have ACPI
code poking at it at all, so this is definitely not the reason why it
was removed. The Asus P4 series is a good example of that. Unhiding the
SMBus on these boards actually let the user take benefit of the
hardware they had paid for.

Secondly, only a few south bridges are capable of hiding the SMBus. And
an even smaller number of systems have their SMBus actually hidden by
the BIOS. A much larger number ...
From: Dave Jones
Date: Thursday, March 1, 2007 - 10:48 am

On Thu, Mar 01, 2007 at 03:26:55PM +0100, Jean Delvare wrote:

 > Firstly, the first records of hidden SMBus, in September 2000, predate
 > ACPI.

The earliest ACPI spec I have handy is 1.0b, which came out on Feb 2, 1999,
so this isn't true. The all knowing (and always accurate :) wikipedia
claims it was first released in 1996, though I believe that all the pre 1.0b
machines were using acpi implementations before the standard was finalised.

I certainly remember seeing ACPI capable machines circa 1997/1998.

		Dave

-- 
http://www.codemonkey.org.uk
-

From: Jean Delvare
Date: Friday, March 2, 2007 - 4:27 am

Yeah, and these early ACPI implementations were so good that we have an
option to blacklist them.
arch/i386/defconfig:CONFIG_ACPI_BLACKLIST_YEAR=2001

My point (which you didn't quote) was that there is no correlation
between the SMBus being hidden and ACPI accessing the hardware
monitoring chip, contrary to what Pavel was suggesting.

-- 
Jean Delvare
-

From: Pavel Machek
Date: Friday, March 2, 2007 - 4:31 am

It may not be correlated with ACPI, but BIOS authors clearly want to
keep you away from their SMBus controllers....
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-

From: Jean Delvare
Date: Friday, March 2, 2007 - 6:37 am

drivers/pci/quirks.c is full of things we do against the BIOS authors
intent. You don't plan to remove them all, do you?

(And as a side note, this is really the board's owner SMBus controller.
The hardware doesn't belong to the BIOS author.)

-- 
Jean Delvare
-

From: Henrique de Moraes Holschuh
Date: Friday, March 2, 2007 - 6:47 am

Especially when they won't do their job right, and won't fix it later
either...

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh
-

From: Pavel Machek
Date: Friday, March 2, 2007 - 6:57 am

Notice how quirks.c is careful to name machines where given quirk is
used.

If you do whitelist "it is okay to do sensors accesses on this board",
that is okay with me. But having quirk "on all future Intel chipsets,

True.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-

From: Jean Delvare
Date: Friday, March 2, 2007 - 11:44 pm

This is the view of someone obviously not interested in sensors. Go
tell that to all the lm-sensors users and they'll reply: "Disabling
sensors blindly would be stupid." Please realize that we're not writing
hardware monitoring drivers just for the fun of conflicting with ACPI
on some machines. We're writing them because users are asking for these
features.

-- 
Jean Delvare
-

From: Pavel Machek
Date: Friday, March 2, 2007 - 4:40 am

Ok. You are right that ACPI is an ugly piece of mess. But I'm pretty
sure that 90%+ of ACPI notebook implementations *will* want to talk to
their monitoring chips... for temperature readings.

So even if we fixed ACPI to reserve the ports, you'd be still unhappy;

...against wishes of the manufacturers. Which sometimes know what they

Well, I'm afraid you should assume all recent notebooks touch sensors


Yes, I know ACPI sucks at hardware monitoring. Unfortunately, we can't

Fix the interface? ;-). Actually that move may be underway as we are

Whitelist seems like a way to go :(.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-

From: Matthew Garrett
Date: Friday, March 2, 2007 - 4:47 am

The DSDT code clearly can't touch the hardware itself - hardware access 
is carried out by the kernel. If we can identify cases where ACPI reads 
and writes would touch resources claimed by other drivers, that would be 
a good starting point for working out what's going on.

Of course, this ignores the case where the DSDT just traps into SMM 
code. That one is clearly unsolvable.
-- 
Matthew Garrett | mjg59@srcf.ucam.org
-

From: Pavel Machek
Date: Friday, March 2, 2007 - 6:58 am

We can't solve SMM stuff. (Whitelist needed :-).

Actually for the acpi stuff... we might wrap the ACPI interpreter with a
semaphore that needs to be taken before starting any AML code. Then
just make sure sensors take same semaphore?
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-

From: Jean Delvare
Date: Friday, March 2, 2007 - 2:00 pm

Hi Pavel,


I like the idea, it should work as long as we are guaranteed that all
the hwmon device accesses happen in the AML code? I'm not familiar with
ACPI, so you tell me.

In practice it's rather the SMBus drivers which will need to take the
lock, as this is what the AML code will access (for SMBus-based
hardware monitoring drivers.) For non-SMBus based hardware monitoring
drivers, indeed the driver itself should take it. We will have to pay
attention to deadlocks on systems with multiple hardware monitoring
chips and/or SMBus channels.

Can you please provide a patch implementing your proposal in acpi? Then
I could implement the i2c/hwmon part on some selected drivers and start
testing real world scenarii.

Thanks,
-- 
Jean Delvare
-

From: Henrique de Moraes Holschuh
Date: Friday, March 2, 2007 - 2:22 pm

Some fubar machines do it in SMM mode, and the AML code just reads memory
regions that were updated by the SMBIOS.  Or, if one is lucky, the AML
triggers the SMI (ThinkPads do this, for example -- but I am not sure that's
the only way a ThinkPad will cause a SMI), so it could be made to run
protected by the lock.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh
-

From: Pavel Machek
Date: Sunday, April 1, 2007 - 8:39 am

I'm sorry, but I do not have time for a patch.... and I'm not really
acpi expert, anyway. Ask Len?
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-

From: Jean Delvare
Date: Monday, April 2, 2007 - 8:48 am

Hi Pavel, Len,


Meanwhile Alexey Starikovskiy pointed me to the
acpi_ex_enter/exit_interpreter functions, which take an ACPI mutex
(ACPI_MTX_INTERPRETER). As the mutex already exists, it sounds more
efficient to just reuse it rather than introducing a new one. I made an
experiment with the f71805f driver on a machine where ACPI is accessing
the F71805F chip, and it appears to work fine; patch at the end of this
post if someone wants to take a look and/or comment.

Looking at the comment before acpi_ex_exit_interpreter raises two

1* This suggests that the mutex could be released by the AML
interpreter in the middle of an SMBus transaction. If so, and if it
happens in practice, this means that we unfortunately cannot use this
mutex to reliably protect the SMBus drivers from concurrent accesses.
This even suggests that it's simply not possible to have a mutex we
take at the beginning when entering the AML interpreter and we release
when leaving the AML interpreter, as it looks like ACPI itself allows
interlaced execution of AML functions. Len, is this true?

What is the purpose of the ACPI_MTX_INTERPRETER mutex in the first
place, given that it seems it will be released on numerous occasions?
Is it to prevent concurrent AML execution while still allowing
interlaced execution?

2* What are "user-installed opregion handlers"? Are they something that
could help solve the ACPI vs. other drivers problem?

Thanks.

---
 drivers/acpi/utilities/utmutex.c |    2 ++
 drivers/hwmon/f71805f.c          |   32 +++++++++++++++++++++++++++++++-
 2 files changed, 33 insertions(+), 1 deletion(-)

--- linux-2.6.21-rc5.orig/drivers/acpi/utilities/utmutex.c	2007-02-21 08:34:20.000000000 +0100
+++ linux-2.6.21-rc5/drivers/acpi/utilities/utmutex.c	2007-04-02 15:48:41.000000000 +0200
@@ -265,6 +265,7 @@ acpi_status acpi_ut_acquire_mutex(acpi_m
 
 	return (status);
 }
+EXPORT_SYMBOL_GPL(acpi_ut_acquire_mutex);
 
 ...
From: Dave Jones
Date: Monday, April 2, 2007 - 12:22 pm

On Mon, Apr 02, 2007 at 05:48:59PM +0200, Jean Delvare wrote:
 > +	u8  val;
 > +#ifdef CONFIG_ACPI
 > +	acpi_ut_acquire_mutex(ACPI_MTX_INTERPRETER);
 > +#endif
 >  	outb(reg, data->addr + ADDR_REG_OFFSET);
 > -	return inb(data->addr + DATA_REG_OFFSET);
 > +	val = inb(data->addr + DATA_REG_OFFSET);
 > +#ifdef CONFIG_ACPI
 > +	acpi_ut_release_mutex(ACPI_MTX_INTERPRETER);
 > +#endif
 > +	return val;
 > ... deletia, more of the same.

it'd probably end up a lot cleaner to #define them to empty macros
in the !ACPI case in acpi/acpi.h and just #include it unconditionally.
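
A minimal sketch of the stub-macro pattern suggested here, written as self-contained user-space C. All names are assumptions made for illustration: `HAVE_ACPI_LOCK` plays the role of CONFIG_ACPI, and `hw_lock()`/`hw_unlock()` stand in for whatever wrappers a shared header might export; none of them exist in acpi/acpi.h.

```c
#include <assert.h>

/* With the feature enabled, the macros call the real lock functions.
 * Without it, they expand to empty statements, so callers compile
 * unchanged and need no #ifdef of their own. */
#ifdef HAVE_ACPI_LOCK
extern void acpi_lock_take(void);   /* hypothetical */
extern void acpi_lock_drop(void);   /* hypothetical */
#define hw_lock()   acpi_lock_take()
#define hw_unlock() acpi_lock_drop()
#else
#define hw_lock()   do { } while (0)
#define hw_unlock() do { } while (0)
#endif

unsigned char fake_regs[256];       /* stands in for the chip */

/* The register access path carries no conditional compilation. */
unsigned char chip_read(unsigned char reg)
{
	unsigned char val;

	hw_lock();
	val = fake_regs[reg];
	hw_unlock();
	return val;
}
```

The `do { } while (0)` form keeps the stub usable anywhere a statement is expected, including an unbraced `if` body.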

	Dave

-- 
http://www.codemonkey.org.uk
-

From: Jean Delvare
Date: Monday, April 2, 2007 - 10:49 pm

Hi Dave,


Sure, the implementation details can be refined later. I'm only trying
to see what can be done for now.

-- 
Jean Delvare
-

From: Moore, Robert
Date: Monday, April 2, 2007 - 1:55 pm

The ACPI specification allows concurrent execution of control methods
although methods cannot be preempted. The ACPICA interpreter mutex is
Control Method is not preemptive, but it can block. When a control
method does block, the operating software can initiate or continue the
execution of a different control method. A control method can only
assume that access to global objects is exclusive for any period the
control method does not block.

Therefore, the interpreter lock is acquired and a control method is
allowed to execute to completion unless it blocks on one of the events
described below. If the method blocks, the interpreter is unlocked and
other control methods may execute.

I'm not sure what you mean by "in the middle of an SMBus transaction", I
don't know how long such a transaction is valid. I might guess that a
single transaction can only span a single operation region access, but
I'm not sure of this.

A user-installed operation region handler is an operation region handler
that is installed by a device driver. This feature would probably only
be used for custom (OEM-defined) operation region address spaces. (I
have not seen one yet.) For the standard address spaces (memory, I/O,
etc.), usually only the default handlers are used.

-

From: Jean Delvare
Date: Tuesday, April 3, 2007 - 12:21 am

Hi Bob,



Basically an SMBus transaction looks like this:
1* Prepare the transaction.
2* Start the transaction.
3* Wait for the transaction to complete, typically a few ms.
4* Read the result of the transaction.
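
The four steps above can be sketched as follows. This is a user-space model under stated assumptions: `struct smbus_host`, its fields and the lock helpers are hypothetical and do not match any real SMBus controller; the point is only that one lock must span all four steps, including the wait.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical SMBus host controller model. */
struct smbus_host {
	uint8_t addr, cmd, data;   /* transaction setup and result registers */
	int busy;                  /* nonzero while a transfer is running */
	int locked;                /* stands in for the shared mutex */
	uint8_t device_reg[256];   /* registers of the device on the bus */
};

static void host_lock(struct smbus_host *h)   { assert(!h->locked); h->locked = 1; }
static void host_unlock(struct smbus_host *h) { assert(h->locked);  h->locked = 0; }

/* Read one byte; the lock is held across all four steps. */
uint8_t smbus_read_byte(struct smbus_host *h, uint8_t addr, uint8_t cmd)
{
	uint8_t result;

	host_lock(h);
	h->addr = addr;                          /* 1: prepare the transaction */
	h->cmd = cmd;
	h->busy = 1;                             /* 2: start the transaction */
	while (h->busy) {                        /* 3: poll until it completes */
		h->data = h->device_reg[h->cmd]; /* simulated hardware */
		h->busy = 0;
	}
	result = h->data;                        /* 4: read the result */
	host_unlock(h);
	return result;
}
```

If the lock were dropped during step 3, another user could rewrite `addr` or `cmd` before step 4, which is exactly the failure discussed in this message.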

Steps 1 and 2 require writing to the SMBus I/O region. Step 4 requires
reading from it, and so does step 3 if the wait loop is poll-based. The
transaction is only safe if we have an exclusive access to the I/O
region during all the 4 steps. My fear is that step 3 could be
implemented by ACPI using either a Sleep() or Acquire() or Wait()
opcode. If it is, we're doomed. OTOH, if it does, it is probably not
even safe for itself, unless there's an additional,
implementation-specific mutex to protect SMBus transactions. I yet have
to get my hands on the DSDT of ACPI implementations which actually

Could regular Linux device drivers install such handlers for a specific
I/O region? I'm asking because Rudolf Marek's proposal [1] to solve the
concurrent access problem involved extending struct resource with
callbacks to driver-specific routines to handle external access to an
I/O region. This sounds somewhat similar to these "user-installed
operation region handler" defined by ACPI, doesn't it? If ACPI already
has an infrastructure to handle this problem, we probably want to use
it rather than implementing our own.

[1] http://marc.info/?l=linux-kernel&m=117302946017204&w=2

-- 
Jean Delvare
-

From: Moore, Robert
Date: Wednesday, April 4, 2007 - 2:35 pm

> Could regular Linux device drivers install such handlers for a

No. ACPICA only supports operation region handlers on a

As far as the AML interpreter is concerned, access to the SMBus is via
an operation region. So, each access to such a region would encompass a
single SMBus transaction. Also, the interpreter remains locked during

I think the spec is referring to any global namespace object. This
includes operation regions, so the answer is yes, as long as access to
the region does not block and cause the interpreter to be released. As
far as ACPICA, none of the default handlers for operation regions will
block.

-

From: Jean Delvare
Date: Friday, March 2, 2007 - 7:10 am

Hi Matthew,


I'm not familiar with ACPI at all so I didn't know, but what you write
here brings some hope. Would it be possible to parse all the DSDT code
at boot time and deduce all the ports which ACPI would need to request

Yeah, SMM is an even more complex problem :(

Do we know in advance when we are going to SMM mode and back? If we do,
I'd be happy with a mutex every interested driver could use to protect
relevant parts of its code. SMBus master drivers for example could
request that mutex during SMBus transactions. Of course we don't know
if SMM will actually touch the SMBus, but better safe than sorry I
guess. And SMM calls aren't happening so frequently, are they?

Thanks,
-- 
Jean Delvare
-

From: Matthew Garrett
Date: Friday, March 2, 2007 - 7:18 am

In theory I /think/ so, but it would probably end up being an 
overestimate of the coverage actually needed. Trapping at runtime is 

My understanding is that pretty much arbitrary hardware access can cause 
SMM transitions without OS notification, though this is getting outside 
the areas I know about.
-- 
Matthew Garrett | mjg59@srcf.ucam.org
-

From: Jean Delvare
Date: Friday, March 2, 2007 - 2:04 pm

It might be more elegant but it won't work. We don't want to prevent
ACPI from accessing these I/O ports. If we need to choose only one
"driver" accessing the I/O port, it must be acpi, at least for now,
despite the fact that acpi provides very weak hardware monitoring
capabilities compared to the specific drivers.

Why would we end up with an overestimation if we check the I/O ports at
boot time? Do you have concrete cases in mind?

Thanks,
-- 
Jean Delvare
-

From: Matthew Garrett
Date: Friday, March 2, 2007 - 2:12 pm

Assuming arbitration of access, what's the problem with having two 
drivers accessing the same hardware? Do these chips generally have any 

ACPI will often describe large operation regions, but won't necessarily 
touch all of them. Effectively, every codepath would have to be walked 
through at boot time and checked for io access.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
-

From: Jean Delvare
Date: Saturday, March 3, 2007 - 2:53 am

Hi Matthew,


The "assuming arbitration of access" is the key part of your
sentence ;) The problem is that currently no arbitration is done. If it
was done, then state would probably not be a problem. Most hardware
monitoring drivers don't assume any state is preserved between
accesses, and those which do can easily be changed not to. The ACPI
side is another story though, I guess we can't assume anything about
the AML code's assumption on states, as it differs from one machine to
the next. But we can try to be cooperative and restore the sensible
registers (e.g. bank selector) in the same state we found them.
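
A minimal sketch of the cooperative save-and-restore idea, assuming a hypothetical banked chip. `BANK_REG`, `struct banked_chip` and the helper names are invented for illustration; the sketch also assumes the access is already arbitrated, since restoring state does not by itself prevent a race.

```c
#include <assert.h>
#include <stdint.h>

#define BANK_REG 0x4e   /* hypothetical bank-select register number */

/* Hypothetical chip whose registers exist once per bank. */
struct banked_chip {
	uint8_t bank;
	uint8_t regs[4][256];
};

static uint8_t chip_get(struct banked_chip *c, uint8_t reg)
{
	return reg == BANK_REG ? c->bank : c->regs[c->bank][reg];
}

static void chip_set(struct banked_chip *c, uint8_t reg, uint8_t val)
{
	if (reg == BANK_REG)
		c->bank = val & 3;
	else
		c->regs[c->bank][reg] = val;
}

/* Read a register from a given bank, leaving the selector as found. */
uint8_t read_banked(struct banked_chip *c, uint8_t bank, uint8_t reg)
{
	uint8_t saved, val;

	assert(bank < 4);
	saved = chip_get(c, BANK_REG);   /* remember the bank in use */
	chip_set(c, BANK_REG, bank);
	val = chip_get(c, reg);
	chip_set(c, BANK_REG, saved);    /* restore it for the next user */
	return val;
}
```

After the call, the other user of the chip still sees the bank it selected earlier.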

Anyway, just because we can't get things right on 100% of the machines
is no reason not to try anything at all. The current situation is bad,

Is there anything preventing us from doing such a walk and pre-allocate
all the I/O ranges? I am not familiar with the ACPI code at all, would
you possibly propose a patch doing that?

If we can't do that, the overestimation approach might still work. I
wonder if it would cause problems in practice. If it does, we're back
to Pavel's AML lock.

-- 
Jean Delvare
-

From: David Hubbard
Date: Saturday, March 3, 2007 - 8:47 am

Here's a random idea -- what do you think of it?


For I/O and memory that ACPI accesses and has not reserved, the AML
interpreter could allocate at run-time.

I'm not sure how to implement exactly. For example, it would be bad to
have a /proc/ioports that had a lot of single ports allocated, for
example:
1000-107f : 0000:00:1f.0
 1000-1000 : ACPI PM1a_EVT_BLK
 1001-1001 : ACPI PM1a_EVT_BLK
 1002-1002 : ACPI PM1a_EVT_BLK
 1003-1003 : ACPI PM1a_EVT_BLK

Thus the AML interpreter would need to have some reasonable
intelligence when allocating regions. Conflict resolution would also
be more difficult, e.g. if a hwmon driver was loaded first and then
acpi as a module, ACPI could not allocate the region. Maybe run-time
allocating won't work.
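
One way to picture the "reasonable intelligence" mentioned above is to grow an existing range when a newly touched port is adjacent to it, rather than recording each port separately. This is a sketch only: `struct port_range` and `note_port` are hypothetical names, and the code does not merge two ranges that later become adjacent to each other.

```c
#include <assert.h>
#include <stdint.h>

struct port_range { uint16_t start, end; };   /* inclusive bounds */

/* Record one accessed port in an unsorted table of n ranges, growing
 * a range when the port lies inside or directly next to it.
 * Returns the new number of ranges, or -1 if the table is full. */
int note_port(struct port_range *tab, int n, int cap, uint16_t port)
{
	assert(tab && n >= 0 && n <= cap);
	for (int i = 0; i < n; i++) {
		if (port >= tab[i].start && port <= tab[i].end)
			return n;                /* already covered */
		if (port == tab[i].end + 1) {
			tab[i].end = port;       /* extend upward */
			return n;
		}
		if (port == tab[i].start - 1) {
			tab[i].start = port;     /* extend downward */
			return n;
		}
	}
	if (n == cap)
		return -1;
	tab[n].start = tab[n].end = port;        /* start a new range */
	return n + 1;
}
```

Feeding ports 0x1000 through 0x1003 in order yields a single 1000-1003 entry instead of the four single-port entries listed above.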

And then, how would ACPI release a region after it has used it? The
easiest method would be to never release anything used even once.

Thoughts?

David
-

From: Matthew Garrett
Date: Saturday, March 3, 2007 - 8:50 am

Not ideal. ACPI's already fiddling with ranges that have been reserved 
by other drivers.

-- 
Matthew Garrett | mjg59@srcf.ucam.org
-

From: Rudolf Marek
Date: Saturday, March 3, 2007 - 10:08 am

Hello all,

I was already thinking about some very general port forwarder,
(http://lists.lm-sensors.org/pipermail/lm-sensors/2007-February/018788.html)

But I have not been able to work out how to do that painlessly. I don't even
know whether to use classes or some new "bus" in the driver model :/ Suggestions
on this would be very welcome.

Let's start with a very naive 1:1 solution to find out whether we can handle
"transactions" (typically a write to one port followed by a read from another)
and preserve the "banks". As was already written, some virtualization in the
driver would be required, but it can be done because we know how the hardware is
supposed to work.

The simplest thing to try seems to be adding an optional callback structure to
the IORESOURCE structure. Please don't blame me for this, it is just for testing
purposes; we must think of a better place, but as I said I don't know where or
how :/

Here is some concept:

When request_region fails, ACPI could call request_region_acpi,
which would return a pointer to a structure with defined callbacks
like do->outb and do->inb. ACPI will first check that the pointer is not NULL ;)

This callback would end up in, let's say, another Linux driver.

For the typical hwmon chip, things work in a similar way to CMOS access: the
register address is written to port 0x295, and port 0x296 is then read or written.

Therefore:
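/* Emulated port write: called instead of raw port access when ACPI writes */
void outb(u16 port, u8 val) {
	if (port == 0x295) {
		/* address port: remember which register ACPI selected */
		reg_pointer = val;
	} else if (port == 0x296) {
		/* data port: ACPI wants to write to the chip */
		if (reg_pointer == BANKREG)
			reg_bank = val;	/* bank switch is only recorded */
		else
			do_chip_write(reg_pointer, reg_bank, val);
	}
}

/* Emulated port read: called instead of raw port access when ACPI reads */
u8 inb(u16 port) {
	if (port == 0x296)
		return do_read_chip(reg_pointer, reg_bank);
	return 0xff;	/* other ports are not handled in this sketch */
}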

The code above shows how I/O access could be synchronized with ACPI, to
make everyone happy again. The driver could snoop on what ACPI tries to change,
so it could, for example, force a reload of the cached values...

The second device which might be a challenge is the SMBus controller. ...
From: Rudolf Marek
Date: Sunday, March 4, 2007 - 10:29 am

Hello again,

I produced some code implementing what I proposed in the mail above. The patch
is not for review; it is just a PoC to better show what I'm trying to do. It is
a test case for my motherboard, which has a W83627EHF chip, where the ACPI
thermal method and the w83627ehf driver compete for the device.

When no driver is loaded, ACPI does its own raw access to the device. However,
when the w83627ehf driver is loaded, ACPI access to the ports (0x295/0x296) is
forwarded to the w83627ehf driver.

How does it work? I added one pointer to struct resource, which points to a
structure with callback functions. I know this is not an ideal
solution, but for a test it works fine. Every time ACPI wants to do I/O, it asks
via ____request_resource(conflict, &res) whether some driver has claimed the
resource. If so, it walks the resource structure to the last entry; if the
callback structure exists, it is called (and the device driver context is passed
too); if not, raw hardware access is performed.

The routines in the EHF driver partly emulate the chip: the address of the
register that the next ACPI access will read or write is saved in
data->reg_pointer.

The actual access to the chip is done only when the data port of the chip is
accessed, and the driver's generic I/O function is called. Bank register writes
are faked into data->reg_bank. All chip accesses done in these
routines respect the "virtual" bank register previously set by ACPI.
(The current emulation is not 100% perfect, but this could easily be fixed.)
If something was written to the chip, the driver's register cache is invalidated.

This approach does not need any external locking because the actual device
access is done by one function of the driver (which already locks).
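
The forwarding decision described above can be sketched in user-space C. Everything here is an assumption made for illustration: `struct io_forward_ops`, `struct io_region`, `acpi_port_read`/`acpi_port_write` and the toy driver are hypothetical, and the simulated `raw_port` array stands in for real port I/O. It is not the code from the PoC patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback table a driver attaches to its I/O region. */
struct io_forward_ops {
	uint8_t (*read_io)(uint16_t port, void *ctx);
	void (*write_io)(uint16_t port, uint8_t val, void *ctx);
};

/* Hypothetical stand-in for a claimed resource. */
struct io_region {
	uint16_t start, end;               /* inclusive port range */
	const struct io_forward_ops *ops;  /* NULL: nobody forwards */
	void *ctx;                         /* driver context for callbacks */
};

static uint8_t raw_port[0x10000];          /* simulated raw hardware ports */

static int forwarded(const struct io_region *r, uint16_t port)
{
	return r && r->ops && port >= r->start && port <= r->end;
}

/* What the ACPI side would call instead of touching the port directly. */
uint8_t acpi_port_read(const struct io_region *r, uint16_t port)
{
	if (forwarded(r, port))
		return r->ops->read_io(port, r->ctx);  /* driver handles it */
	return raw_port[port];                         /* raw access */
}

void acpi_port_write(const struct io_region *r, uint16_t port, uint8_t val)
{
	if (forwarded(r, port))
		r->ops->write_io(port, val, r->ctx);
	else
		raw_port[port] = val;
}

/* Toy driver: counts forwarded reads and remembers the last write. */
struct toy_driver { int reads; uint8_t last_written; };

static uint8_t toy_read(uint16_t port, void *ctx)
{
	struct toy_driver *d = ctx;

	(void)port;
	d->reads++;
	return 0x2a;
}

static void toy_write(uint16_t port, uint8_t val, void *ctx)
{
	struct toy_driver *d = ctx;

	(void)port;
	d->last_written = val;
}

const struct io_forward_ops toy_ops = { toy_read, toy_write };
```

Because every forwarded access lands in the driver's own functions, the driver's existing lock serializes it with the driver's regular accesses.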

Since, from ACPI's point of view, the device behaves the same way as real HW,
the big plus is that the driver actually KNOWS what ACPI is trying to do to the
chip. Of course, if the HW access is done in SMM, some other countermeasures
must be invented, like GBL lock or the "take ownership" ...
From: Jean Delvare
Date: Monday, March 5, 2007 - 2:16 pm

Hi Rudolf,


I like your implementation, it's nice. Thanks for coming up with this
idea. The only change I would propose is to make the last parameter of
the read_io and write_io callback functions a device * instead of a
void *. This would let the compiler check the types and avoid improper
casts.



This port forwarding approach is interesting, maybe even beyond the
ACPI case we are trying to solve here. However, it has some drawbacks:

1* It requires that we modify each driver individually. It's quite a
shame that we have to update drivers all around the kernel tree with
specific code to workaround a problem which is originally located in a
single place (the ACPI subsystem.) That being said this isn't a blocker
point, if this is the only way, we'll do it. But that's a rather great
amount of work.

2* It seems to incur a significant performance penalty.
____request_resource isn't cheap as far as I know. And in the
absence of conflict, you are even allocating and releasing the resource
on _each_ I/O operation, which certainly isn't cheap either. Again, it
is not a blocker point, after all this is a workaround for an improper
behavior of the acpi subsystem, this performance penalty is the price to
pay. But there is also a performance penalty for legitimate I/O access,
which worries me more.

3* What about concurrent accesses from ACPI itself? Unless we have a
guarantee that only one kernel thread can run AML code at a given
moment, I can imagine a conflict happening, as your code protects us
well against ACPI and driver concurrent accesses, but not against two
concurrent ACPI accesses. But I'm not familiar with ACPI - maybe we do
have this guarantee? OTOH, if this ever happens, well, it would happen
even without your code.

So for now I tend to think that the idea of a global AML lock proposed
by Pavel Machek and Bodo Eggert would be more efficient. And it
wouldn't need any driver-specific code, so it would be much simpler
to implement. The drawback being that we ...
From: David Hubbard
Date: Monday, March 5, 2007 - 2:35 pm

I thought Rudolf's patch allocated the resource in the driver
(w83627ehf) and ACPI contacted the driver when it could not allocate
the resource. Since ACPI never *really* wants to allocate the
resource, is there a fast-path check it could do? This would help

All of ACPI uses the same "virtualized" access mechanism, so even if
two threads are poking at the bank select register in a race condition
(that would be awful code, I think) they would see the same virtual

I like the virtualized driver method (if it wasn't obvious!) but the
global AML lock works also. It will be interesting to see profiling of
both solutions.

David
-

From: Jean Delvare
Date: Tuesday, March 6, 2007 - 8:10 am

Hi David,


Alas, I don't think it can work safely. If ACPI doesn't actually
allocate the resource, there is a risk that another driver does so
between the conflict check and the I/O access. I believe this is the

In fact Rudolf's solution is nice for LPC chips, however I don't think
it can work with SMBus chips. It might intercept the accesses to the
SMBus master and be able to emulate it, even though this will be more
complex than the W83627EHF case, and I'm not sure about the PCI config
space. But ultimately we need to emulate all the chips behind the
SMBus, too. The drivers for these chips won't know if they are accessed
for real or through the emulation layer, so there is no way they'll
remember states, while they might have to (e.g. the W83781D has a bank
register too.)

The AML lock approach, OTOH, should work fine in all cases as long as
the context doesn't need to be remembered across AML "sections".
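
Why a global AML lock helps with index/data register pairs can be shown
with a toy, single-threaded model. The chip model and the names
aml_trylock(), aml_unlock() and driver_read_with_intruder() are invented
for illustration and assume nothing about the real ACPI interpreter; a
real implementation would block on a mutex or semaphore rather than skip
the access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simulated chip with an index/data register pair (hypothetical). */
struct chip {
	uint8_t index;		/* last value written to the index port */
	uint8_t regs[4];	/* registers reachable through the data port */
};

static void select_reg(struct chip *c, uint8_t idx)
{
	assert(idx < 4);
	c->index = idx;
}

static uint8_t read_data(const struct chip *c)
{
	return c->regs[c->index];
}

/* Toy global lock shared by the "driver" and the "AML" side. */
static bool aml_lock_held;

static bool aml_trylock(void)
{
	if (aml_lock_held)
		return false;
	aml_lock_held = true;
	return true;
}

static void aml_unlock(void)
{
	aml_lock_held = false;
}

/*
 * The driver selects register `want`, then an "AML" access to register
 * `intruder` arrives before the driver reads the data port.
 */
static uint8_t driver_read_with_intruder(struct chip *c, uint8_t want,
					 uint8_t intruder, bool use_lock)
{
	uint8_t val;

	if (use_lock && !aml_trylock())
		return 0xff;	/* lock unexpectedly busy */
	select_reg(c, want);

	if (!use_lock) {
		select_reg(c, intruder);	/* unsynchronized: clobbers index */
	} else if (aml_trylock()) {
		select_reg(c, intruder);
		aml_unlock();
	}
	/* else: lock busy, AML has to wait until the driver is done */

	val = read_data(c);
	if (use_lock)
		aml_unlock();
	return val;
}
```

Without the lock the driver reads whatever register the intruder selected;
with it, the intruding access is held off until the driver's two-step
sequence is complete.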

-- 
Jean Delvare
-

From: Jean Delvare
Date: Sunday, March 4, 2007 - 3:54 am

Hi David,


I've been thinking about it too, but I could convince myself quickly
that it would never work, so I didn't bother posting about it ;)

As you found out yourself, it isn't trivial to allocate the ports
dynamically. Either you'll end up with lots of 1-address ranges, or
you'll allocate more than is actually needed, or you'll need a
buddy-allocator like mechanism to merge contiguous ranges. This sounds
quite complex to get right. And very costly, too. I don't know the
exact algorithm to find out whether an I/O range is already allocated
or not, and by who, but having to do it for each port access from AML,
forever, sounds like a huge overhead.

On top of that, there is a risk that another driver already requested
the I/O range. And we do not want to prevent ACPI from accessing it!
OK, it would be cleaner than the current situation in a way, but it
could also have bad consequences as was underlined elsewhere in this
thread.

What we want is to grant access to the resources to at least ACPI (and
ACPI only if we can't do better), or if possible to both ACPI and
individual drivers but with some mechanism avoiding concurrent access
(be it a mutex or a port forwarder.)

-- 
Jean Delvare
-

From: Pavel Machek
Date: Monday, March 5, 2007 - 3:25 pm

ACPI AML is probably Turing-complete: I'm afraid you are trying to
solve the halting problem (-> impossible).

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-

From: Jean Delvare
Date: Tuesday, March 6, 2007 - 8:26 am

Hi Pavel,


Can you please translate this into something mere humans like myself
have a chance to understand?

Thanks,
-- 
Jean Delvare
-

From: Pavel Machek
Date: Tuesday, March 6, 2007 - 2:20 pm

ACPI AML is Turing-complete -- that means it is as powerful as any
programming language. It can do arbitrary computation. That means it
is theoretically impossible to analyze its accesses using any program.

Now... it may be possible to introduce _some_ ACPI BIOSes, but doing it
would certainly be very complex -- we are talking "put gcc into
kernel" here.

So no, it is not possible to preallocate the ranges.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-

From: Moore, Robert
Date: Tuesday, March 6, 2007 - 2:25 pm

In other words, as per my earlier message:

Port addresses can be dynamically generated by the AML code and thus,
there is no way that the ACPI subsystem can statically predict any
addresses that will be accessed by the AML.

-

From: richardvoigt@gmail.com
Date: Sunday, March 18, 2007 - 12:36 pm

I recall that one of the stated drawbacks of a lock was that no ACPI
code could execute while the hwmon driver was doing its fairly lengthy
conversation with the hardware.

It seems that using transactional concepts here would solve that
problem.  For example, the hwmon driver issues a start transaction
request.  The first write request to any given location (bank register
for example) causes the previous memory value to be saved.  Then,
instead of locking AML out, AML is allowed to execute, but any access
to the memory/port ranges reserved by the driver (when the transaction
is set up) cause the hwmon transaction to be rolled back so the AML
sees the expected state.  Then AML proceeds as usual.  When hwmon
tries additional operations, they fail with some "transaction
interrupted" error message, indicating to the hwmon driver to start
over.
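
The transaction idea described above can be sketched in userspace C as
follows. This is only a model of the proposal: struct txn_chip, the
TXN_INTERRUPTED code and all function names are hypothetical, and a single
bank register stands in for the memory/port ranges reserved by the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TXN_INTERRUPTED (-1)	/* hypothetical error code */

struct txn_chip {
	uint8_t bank;		/* simulated bank-select register */
	uint8_t saved_bank;	/* value to restore on rollback */
	bool saved;		/* saved_bank is valid */
	bool in_txn;		/* a driver transaction is open */
};

/* Driver side: open a transaction (no nesting in this sketch). */
static void txn_begin(struct txn_chip *c)
{
	assert(!c->in_txn);
	c->in_txn = true;
	c->saved = false;
}

/* Driver write: fails once the transaction has been rolled back. */
static int txn_write_bank(struct txn_chip *c, uint8_t bank)
{
	if (!c->in_txn)
		return TXN_INTERRUPTED;
	if (!c->saved) {	/* first write: remember the old value */
		c->saved_bank = c->bank;
		c->saved = true;
	}
	c->bank = bank;
	return 0;
}

/* AML access: roll the driver back first so AML sees the old state. */
static uint8_t aml_read_bank(struct txn_chip *c)
{
	if (c->in_txn) {
		if (c->saved)
			c->bank = c->saved_bank;
		c->in_txn = false;
	}
	return c->bank;
}

/* Driver side: close the transaction, or learn that it was interrupted. */
static int txn_commit(struct txn_chip *c)
{
	if (!c->in_txn)
		return TXN_INTERRUPTED;
	c->in_txn = false;
	return 0;
}
```

A driver using this model would call txn_begin(), perform its writes, and
start over whenever a write or txn_commit() returns TXN_INTERRUPTED.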

The only issue with this that I can see, is that if AML isn't
executed atomically wrt hwmon, then knowing when it is safe for hwmon
to retry is going to be difficult.

This probably requires changes to every hwmon driver, but they can be
updated piecemeal, starting with the ones most commonly found in
notebooks, where ACPI is most important.
-

From: Jean Delvare
Date: Monday, March 19, 2007 - 12:08 am

Hi Richard,


No. We're not rolling back anything, it's totally unrealistic. These
are device drivers we're talking about here, not a database. The I/O
accesses done by the hardware monitoring drivers are not that long, so
AML gets to wait for them to be finished, and that's it. There is no
valid reason to give the priority to AML over regular device drivers.

Most notebooks don't expose their hardware monitoring chip at all.
Those which do mostly use SMBus devices, where I/O forwarding is
going to be difficult, as it needs to be done at the SMBus controller
level, not the hardware monitoring device level. I want to get my hands
on such a laptop first though, as I need to see what exactly ACPI is
doing before I can think of a solution.

-- 
Jean Delvare
-

From: Moore, Robert
Date: Friday, March 2, 2007 - 3:07 pm

Port (and memory) addresses can be dynamically generated by the AML code
and thus, there is no way that the ACPI subsystem can statically predict
-

From: Pavel Machek
Date: Friday, March 9, 2007 - 12:18 am

Can you take this as a wishlist item?

It would be nice if next version of acpi specs supported table

'AML / SMM BIOS will access these ports'

...so we can get it correct with acpi4 or something..?

							Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-

From: Jean Delvare
Date: Friday, March 9, 2007 - 3:24 am

I can only second Pavel's wish here. This would be highly convenient
for OS developers to at least know which resources are accessed by AML
and SMM. Without this information, we can never be sure that OS-level
code won't conflict with ACPI or SMM.

-- 
Jean Delvare
-

From: Alexey Starikovskiy
Date: Friday, March 9, 2007 - 3:39 am

BIOS vendors are not required to support the latest and greatest ACPI spec.
So even if some future spec version includes this port description, the
majority of hardware will still not export it...

Regards,
    Alex.
-

From: Pavel Machek
Date: Friday, March 9, 2007 - 4:21 am

That's okay... vendors are not required to support _ACPI_, but they
mostly do. Can we get the "ports used by BIOS" table to the spec?
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-

From: Jean Delvare
Date: Friday, March 9, 2007 - 10:23 am

Hi Alexey,


Your reasoning is amazing. So we should refrain from proposing any
improvement which we aren't certain 100% of the systems will support
tomorrow? Then let's all stay away from our keyboards forever, as the
evolution of computer technology is based on exactly that -
improvements which not all systems implement.

It's Friday evening, let's have some more fun. With a similar
logic, ten years ago we'd have come up with the following conclusions:

The majority of computers have a single CPU, there is no point in
adding SMP support to Linux.

Let's not add a new instruction set in our next CPU family. The
majority of systems will not implement it so it will be useless anyway.

There's no point in supporting PnP in Linux, there are a majority of
legacy ISA cards out there which do not support it anyway!

See my point? Just because not every hardware out there supports a
given standard doesn't make that standard necessarily useless.

Just make the next version of ACPI better than the previous one (not
necessarily a challenge) and everyone will embrace it.

-- 
Jean Delvare
-

From: Alexey Starikovskiy
Date: Friday, March 9, 2007 - 10:35 am

You got me wrong, I'm not against the proposal, so save your breath.
I'm just saying that you will grow old waiting for BIOS vendors to export
this info, even if it's in the spec.

Regards,
Alex.
-

From: Moore, Robert
Date: Friday, March 9, 2007 - 2:03 pm

No ACPI discussion can be complete without mentioning Microsoft and
Microsoft compatibility -- Windows does not fully support ACPI 2.0 to
this day, even though it was released in the year 2000, and ACPI 3.0 has
-

From: Moore, Robert
Date: Friday, March 9, 2007 - 1:56 pm

Included the Intel ACPI spec representative.

I have heard that Windows is somehow restricting the ports and memory
locations that are accessible via AML; I don't know any of the details.
Also, there are fears of an "AML virus" attacking the machine.

-

From: Pavel Machek
Date: Friday, March 2, 2007 - 7:22 am

I'm afraid it is about as hard as disassembling SMM BIOS to figure out

SMM is often invoked by timer, AC unplug, etc.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-

From: Jean Delvare
Date: Friday, March 2, 2007 - 7:03 am

Hi Pavel,


That's a secondary problem. The primary issue is the concurrent access
to resources, which causes lots of trouble that is hard to investigate.
If ACPI reserves the ports, then the SMBus or hardware monitoring
drivers (or any other conflicting driver) will cleanly fail to load,
which would be a move in the right direction. Ideally we would be able
to synchronize the accesses between ACPI and the other drivers, but if
we can't, I'd already be _very happy_ to just prevent conflicting
drivers from being loaded at the same time.


Correct, and on most notebooks, traditional hardware monitoring
drivers do not work anyway. Or I should say, used to not work. The
recent CPUs have embedded sensors which can be read using Rudolf
Marek's k8temp and coretemp drivers, this works on laptops as well.
Chances are good that future laptops will not include a separate
temperature sensor but will read the temperature from the CPU directly,
which will cause conflicts with our drivers.

Now the problem is that we can't blacklist SMBus and hardware
monitoring drivers on all laptops by default. There may be other
devices on the SMBus which the user can legitimately want to access,

What kind of whitelist do you have in mind? We can't realistically
maintain an ever-growing whitelist of hundreds of entries in the
kernel. We could block all laptops by default and maintain a white list
only for them, and a black list for other systems, that would probably
limit the maintenance work, maybe not to an acceptable level though.

Anyway I would definitely prefer the resource conflicts approach, as it


Great, looking forward.

-- 
Jean Delvare
-

From: Pavel Machek
Date: Friday, March 2, 2007 - 7:24 am

No idea, talk to Len Brown (or start reading the code) :-(. I don't

I'm afraid something like that is the way to go.

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-

From: Matthew Garrett
Date: Friday, March 2, 2007 - 7:57 am

How about this? It's informational only, but ought to result in 
complaints whenever ACPI tries to touch something that other hardware 
has reserved. We can't fail these accesses, but in theory we could 
consider some sort of locking layer that made it possible to interact 
anyway. I haven't even checked if this builds, but I think the concept 
is reasonable.

diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index d870fb2..088abd1 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -35,6 +35,7 @@
 #include <linux/kmod.h>
 #include <linux/delay.h>
 #include <linux/workqueue.h>
+#include <linux/ioport.h>
 #include <linux/nmi.h>
 #include <acpi/acpi.h>
 #include <asm/io.h>
@@ -338,6 +339,10 @@ acpi_status acpi_os_read_port(acpi_io_address port, u32 * value, u32 width)
 {
 	u32 dummy;
 
+	if (__check_region(&ioport_resource, port, width/8))
+		printk (KERN_INFO "ACPI read from allocated ioport %lx\n", 
+			port);
+
 	if (!value)
 		value = &dummy;
 
@@ -362,6 +367,10 @@ EXPORT_SYMBOL(acpi_os_read_port);
 
 acpi_status acpi_os_write_port(acpi_io_address port, u32 value, u32 width)
 {
+	if (__check_region(&ioport_resource, port, width/8))
+		printk (KERN_INFO "ACPI write to allocated ioport %lx\n", 
+			port);
+
 	switch (width) {
 	case 8:
 		outb(value, port);
@@ -387,6 +396,10 @@ acpi_os_read_memory(acpi_physical_address phys_addr, u32 * value, u32 width)
 	u32 dummy;
 	void __iomem *virt_addr;
 
+	if (__check_region(&iomem_resource, phys_addr, width/8))
+		printk (KERN_INFO "ACPI read from allocated iomem %lx\n", 
+			phys_addr);
+
 	virt_addr = ioremap(phys_addr, width);
 	if (!value)
 		value = &dummy;
@@ -415,6 +428,10 @@ acpi_os_write_memory(acpi_physical_address phys_addr, u32 value, u32 width)
 {
 	void __iomem *virt_addr;
 
+	if (__check_region(&iomem_resource, phys_addr, width/8))
+		printk (KERN_INFO "ACPI write to allocated iomem %lx\n", 
+			phys_addr);
+
 	virt_addr = ioremap(phys_addr, width);
 
 	switch (width) {

-- ...
From: Jean Delvare
Date: Friday, March 2, 2007 - 2:41 pm

Hi Matthew,


I like the patch, after adding some casts to the printf args it
compiles fine. However you print warnings each time a resource has been
reserved... without checking if it hasn't been reserved by ACPI itself!
My machine looks like this:

1000-107f : 0000:00:1f.0
  1000-1003 : ACPI PM1a_EVT_BLK
  1004-1005 : ACPI PM1a_CNT_BLK
  1008-100b : ACPI PM_TMR
  1010-1015 : ACPI CPU throttle
  1020-1020 : ACPI PM2_CNT_BLK
  1028-102b : ACPI GPE0_BLK
  102c-102f : ACPI GPE1_BLK

Given that these ports were reserved by ACPI it is perfectly legitimate
that ACPI accesses it, so we must not print a warning in this case. We
need to exclude from the test the regions whose "name" starts with
"ACPI", but I'm not sure how we can do that.

Thanks,
-- 
Jean Delvare
-
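
The exclusion discussed in the message above (no warning when the
conflicting region was reserved by ACPI itself) can be sketched with a
simplified resource tree. struct res, deepest_conflict() and should_warn()
are hypothetical names for this illustration, and the walk is an assumed
simplification, not the kernel's resource code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the kernel's struct resource. */
struct res {
	const char *name;
	unsigned long start, end;	/* inclusive range */
	struct res *child, *sibling;
};

/* Return the deepest resource overlapping [start, end], or NULL. */
static struct res *deepest_conflict(struct res *root, unsigned long start,
				    unsigned long end)
{
	struct res *hit = NULL, *r = root;

	while (r) {
		if (r->start <= end && start <= r->end) {
			hit = r;
			r = r->child;	/* look for a narrower match */
		} else {
			r = r->sibling;
		}
	}
	return hit;
}

/* Warn only when the owner's name does not start with "ACPI ". */
static bool should_warn(struct res *root, unsigned long port,
			unsigned int width_bits)
{
	struct res *c;

	assert(width_bits >= 8);
	c = deepest_conflict(root, port, port + width_bits / 8 - 1);
	return c != NULL && strncmp(c->name, "ACPI ", 5) != 0;
}
```

Applied to the listing above, a 32-bit access at 0x1008 resolves to
"ACPI PM_TMR" and stays silent, while an access at 0x1040 resolves to the
enclosing "0000:00:1f.0" window and is reported under this sketch's rules.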

From: Matthew Garrett
Date: Friday, March 2, 2007 - 2:46 pm

Oops! I'll look into fixing that. Thanks, that's an excellent point...

-- 
Matthew Garrett | mjg59@srcf.ucam.org
-

From: Jean Delvare
Date: Tuesday, March 6, 2007 - 2:28 pm

Hi Matthew,


Here is what I have come up with, by mixing your patch with Rudolf
Marek's one. Again this is only a reporting patch, but now it only
reports real unreserved accesses. I plan to use it for debugging
purposes.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
---
 drivers/acpi/osl.c |   72 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 72 insertions(+)

--- linux-2.6.21-rc2.orig/drivers/acpi/osl.c	2007-03-06 20:59:00.000000000 +0100
+++ linux-2.6.21-rc2/drivers/acpi/osl.c	2007-03-06 21:33:13.000000000 +0100
@@ -35,6 +35,7 @@
 #include <linux/kmod.h>
 #include <linux/delay.h>
 #include <linux/workqueue.h>
+#include <linux/ioport.h>
 #include <linux/nmi.h>
 #include <linux/acpi.h>
 #include <acpi/acpi.h>
@@ -370,6 +371,7 @@ u64 acpi_os_get_timer(void)
 acpi_status acpi_os_read_port(acpi_io_address port, u32 * value, u32 width)
 {
 	u32 dummy;
+	struct resource *conflict, res;
 
 	if (!value)
 		value = &dummy;
@@ -388,6 +390,23 @@ acpi_status acpi_os_read_port(acpi_io_ad
 		BUG();
 	}
 
+	res.flags = IORESOURCE_IO;
+	res.name = "_ACPI Access";
+	res.start = port;
+	res.end = port + width/8 - 1;
+
+	conflict = ____request_resource(&ioport_resource, &res);
+	while (conflict && conflict->child)
+		conflict = ____request_resource(conflict, &res);
+
+	if (conflict && strncmp(conflict->name, "ACPI ", 5)) {
+		printk (KERN_INFO "ACPI read from allocated ioport %lx, value %lx, width %d\n",
+			(unsigned long)port, (unsigned long)(*value), (int)width);
+	}
+
+	if (conflict == NULL)
+		release_resource(&res);
+
 	return AE_OK;
 }
 
@@ -395,6 +414,25 @@ EXPORT_SYMBOL(acpi_os_read_port);
 
 acpi_status acpi_os_write_port(acpi_io_address port, u32 value, u32 width)
 {
+	struct resource *conflict, res;
+
+	res.flags = IORESOURCE_IO;
+	res.name = "_ACPI Access";
+	res.start = port;
+	res.end = port + width/8 - 1;
+
+	conflict = ____request_resource(&ioport_resource, &res);
+	while (conflict && conflict->child)
+		conflict = ...
From: Bjorn Helgaas
Date: Friday, April 13, 2007 - 11:18 am

Sorry to join this discussion so late.

ACPI tells us the resources used by devices.  Today, we don't reserve
ACPI resources until a driver claims a device.  PCI does some sort of
reservation up front, before the driver claims devices.  Conceptually,
I think ACPI should do the same thing, and I don't think it's that
hard to do.

But breaking things like lmsensors would make the transition painful.

Bjorn
-

From: Pavel Machek
Date: Friday, April 13, 2007 - 1:07 pm

Problem seems to be that ACPI does _not_ tell us which ports it
accesses from AML code.

But we already found a lock we can take; AFAICT we know how to solve
this problem.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-

From: Bjorn Helgaas
Date: Friday, April 13, 2007 - 1:59 pm

I think that would violate at least the spirit of the ACPI spec.
The example in section 11.6 of the ACPI 3.0 spec shows a _TMP
method that runs an EC method to read the temp, and the EC ioport
usage is correctly declared in the EC device's _CRS method.

Of course, there are always BIOS defects.  But if we could make a
case that a BIOS that doesn't declare the resources used by the AML
is defective, we could add quirks to reserve the undeclared resources.

Chuck's last update (http://lkml.org/lkml/2007/2/20/136) says his
problem turned out to be unrelated to k8temp and may have gone away

This might solve it, but doesn't seem like a clean way to do it.
I don't like the idea of sharing a lock between drivers and ACPI.
k8temp happens to be x86-dependent, so we'll always have ACPI, but
in principle, we could have the same problem with an arch-independent
PCI driver that only has ACPI on x86 and ia64 platforms.

(BTW, if Chuck's problem was solved by the BIOS update, I assume
there *is* another instance of the problem that we're trying to
solve with this lock.)

Bjorn
-

From: Jean Delvare
Date: Sunday, April 15, 2007 - 2:41 am

Hi Bjorn,


Only realistic if the list of systems needing a quirk is small. Do you

We even have two solutions, but both have drawbacks.

The AML lock solution has a performance issue. It could be improved with a
better semaphore primitive, but that needs to be implemented first. It
also requires modifying all drivers ACPI might conflict with, and
that's a rather long list.

Rudolf Marek's port forwarding idea should perform better, and might
also be useful for other problems than ACPI vs. regular drivers,
however it makes resources bigger, and requires modifying the same long
list of drivers, and with specific code for every driver. I'm not sure
that it will be possible for all drivers in practice, as this more or

We can protect the additional driver code with #ifdef CONFIG_ACPI, as I
did in my proof-of-concept. Not particularly elegant, granted, but it
should work.

If something cleaner can be done, I'm all for it. Could you please
propose an implementation of your idea? It doesn't need to be perfect,

Yes, there are several other systems out there which are known to be
affected by the problem. And probably many others where ACPI accesses I/O
ports reserved by regular drivers and we are simply lucky that nothing
bad happens. My desktop system has such a motherboard (Jetway K8M8MS).
But SMP alternatives are becoming more popular, so we might start
seeing more problems in practice in a near future.

-- 
Jean Delvare
-

From: Bjorn Helgaas
Date: Sunday, April 15, 2007 - 1:31 pm

I don't know.  I confess that I don't clearly understand the problem
yet.  It sounds like the sensor drivers want to talk to hardware that
ACPI methods might also use.

But I missed the details, such as the specific devices in question,
which ports they use, how they are described in ACPI, which AML
methods use those ports, and which non-ACPI drivers also use them.

It also sounds like the non-ACPI drivers provide much more
functionality than ACPI exposes.  I'd like to understand this,
too, because an  obvious way to solve the problem would be to
drop the non-ACPI drivers.  Is this extra functionality available
on Windows?  If so, do we know whether Windows uses non-ACPI drivers
or whether they have some smarter way to use ACPI?  In the long
run, I think the easiest, most reliable route would be to use the
system in a similar way.  Then we'd be doing things the way the
manufacturer intended and we'd take advantage of all the Windows-
focused firmware testing.

Bjorn
-

From: Luca Tettamanti
Date: Sunday, April 15, 2007 - 1:59 pm

The original report was about the temperature sensors of K8 CPUs. It
happens that ACPI reads the sensors while the Linux driver is using them,
gets garbage, and shuts down the system. The problem is more
generic though, and applies to all hardware monitoring chips for which


Usually ACPI exposes 1 or 2 temperature readings (CPU and
motherboard), while the hw driver can also provide fan and voltage
measurements.

Vendors usually provide a monitoring utility for Windows that also
exposes this information. It's not known whether there's a way to
avoid conflicting accesses between ACPI and the raw driver; it's
possible that it's vendor-specific and not documented.

Luca
-

From: Bjorn Helgaas
Date: Sunday, April 15, 2007 - 5:57 pm

Yes, I saw that much, but that's not enough detail to craft a good
solution.

In the case of k8temp, the driver claims PCI devices with a certain
vendor and device ID.  PCI devices are mostly outside the scope of
ACPI.  There's a standard enumeration protocol, and a driver should
be able to claim any device it recognizes without fear of conflict.

I claim that an AML method that accesses a PCI device is
defective because the AML can't know whether a native driver
has claimed the device.

Sometimes the firmware can hide PCI devices so the OS
enumeration doesn't find them.  In that case, AML might
be able to safely use the PCI device, but the native
driver wouldn't be able to claim the device, so there
would be no conflict.  (Linux sometimes uses quirks to
"unhide" things like this, which could lead to a conflict
of our own making.)

I suspect that other sensor drivers may just probe for devices
at "known" addresses hard-coded in the driver.  This is a
problem because the ACPI model is that the OS learns about
all built-in devices via the ACPI namespace.  If it isn't
in the namespace, it shouldn't exist as far as the OS is
concerned.

So we could easily have the situation where ACPI uses a
sensor and does not expose it to the OS in the namespace.
In that case, the firmware expectation is that the OS
won't touch the device.  If the OS pokes around at the
magic addresses and happens to trip over the device, we

If ACPI accesses sensors but there is no native driver, there

I'd be surprised if Windows provided interfaces to coordinate
between two drivers.  My impression is that they really want
to have a single owner for a piece of hardware.  It would be
interesting to figure out how these monitoring utilities work.
Maybe the monitor and the AML both go through an embedded
controller driver and coordinate that way?

Bjorn
-

From: Luca Tettamanti
Date: Monday, April 16, 2007 - 2:14 pm

Usually sensors are attached to SMBUS or available in ISA IO space.
AFAIK they're not enumerated anywhere (at least I2C devices are not, you
poke at various addresses and see if something responds - not sure about


Hum, the utility I have here (Asus PC Probe) seems to use ACPI:

Pro2.dll:
[Ordinal/Name Pointer] Table
        [  11] OCAPI_ACPI_OC_PresetList
        [  12] OCAPI_ACPI_OC__MB_GetBoardName
        [  23] OCAPI_CheckAiGear
        [   0] OCAPI_CheckIntelPowerSaving
        [   6] OCAPI_CheckWorkable
        [   2] OCAPI_Close
        [   9] OCAPI_EnableQFan
        [  16] OCAPI_GENERAL_GetList
        [  14] OCAPI_GetCpuVoltRange
        [  13] OCAPI_GetCurrentCpuFrequency
        [  15] OCAPI_GetFanStartTemp
        [   3] OCAPI_GetHealthData
        [   4] OCAPI_GetHwSensorData
        [  22] OCAPI_GetMBIF
        [   8] OCAPI_GetQFanInfo
        [   5] OCAPI_HW_EnumerateOption
        [   7] OCAPI_HideQFan
        [   1] OCAPI_Initialization
        [  19] OCAPI_NQFAN_GetData
        [  21] OCAPI_NQFAN_GetList
        [  20] OCAPI_NQFAN_SetData
        [  17] OCAPI_QFAN_GetData
        [  18] OCAPI_QFAN_SetData
        [  10] OCAPI_SetQFanTarget
        [  24] ___CPPdebugHook

ASiO.dll:
[Ordinal/Name Pointer] Table
        [   1] ASIO_Close
        [   9] ASIO_GetCpuID
        [   3] ASIO_InPortB
        [   5] ASIO_InPortD
        [   7] ASIO_MapMem
        [   0] ASIO_Open
        [   4] ASIO_OutPortB
        [   6] ASIO_OutPortD
        [  10] ASIO_ReadMSR
        [   8] ASIO_UnmapMem
        [  11] ASIO_WriteMSR
        [   2] OC_GetCurrentCpuFrequency

It seems that Asus exposes monitoring data using "ATK0110" (enumerated
in DSDT); I see it both on my P5B-E motherboard and on my notebook (L3D)
(they have different methods though). Another motherboard with the same
device may actually call it "FOOBAR123" or "WTFISTHIS".

Problem is that ACPI methods are not documented at all (how am I
supposed to know that "G6T6" is the reading ...
From: Bjorn Helgaas
Date: Monday, April 16, 2007 - 3:28 pm

Yup, we have the same problem with other devices.  See the long list

Yes, I see that it's attractive to use a single w83627ehf.c driver.
For an ACPI driver, we'd have to build a list of PNP IDs, and possibly
information about which methods read which information.  That's
certainly more work.

On the other hand, the ACPI driver would avoid the synchronization
issues that started this whole thread.  That's a pretty compelling

Maybe Asus didn't hook up those readings on the board.  I would
guess that PC Probe doesn't expose the VSB or battery voltage either.

I'm sure you've seen these:
  http://lists.lm-sensors.org/pipermail/lm-sensors/2005-October/014050.html
  http://www.lm-sensors.org/wiki/AsusFormulaHacking

Looks like nobody took up the challenge, though :-)  It looks fun
to play with, if only I had the time and hardware.

Bjorn
-

From: Luca Tettamanti
Date: Tuesday, April 17, 2007 - 4:50 pm

PC Probe does not. But the lines are wired and the readings (from

Actually I haven't, I've happily ignored ACPI until now ;-) My DSDT
doesn't look too bad, I may give it a try...

Luca
-

From: Luca Tettamanti
Date: Sunday, April 22, 2007 - 9:55 am

It wasn't hard :) Temperature reading works fine. AML code is still a
bit obscure though: the readings are enumerated in two different
sections:

* TSIF, FSIF, VSIF: they have read/write methods, easy to understand (I'm
  using these right now)

* SITM/GITM and GGPR for enumeration: more settings are available and in
  part they overlap the others. The methods are very strange, for e.g.
  temperature GITM (which calls GIT6) returns a hard-coded magic value;
  other methods (e.g. CPU frequency, Q-FAN settings) write a random
  number somewhere in the system memory and return an obscure magic
  value... very cool :)


I'm posting the proof-of-concept driver. A few notes:
- only temperature reading is implemented: I just got my MS degree so
  I've been drun^Wbusy
- sysfs files are still RO
- coding style: yes, I know... will cleanup
- There's a gazillion of debugging printk's :)
- I've extended the hwmon sysfs interface, with a new "temp%d_name";
  ACPI knows how the sensor is wired
- You'll need the following patch (against current Linus' git):

diff --git a/drivers/acpi/namespace/nsutils.c b/drivers/acpi/namespace/nsutils.c
index 90fd059..97e1139 100644
--- a/drivers/acpi/namespace/nsutils.c
+++ b/drivers/acpi/namespace/nsutils.c
@@ -700,6 +700,7 @@ struct acpi_namespace_node *acpi_ns_map_handle_to_node(acpi_handle handle)
 
 	return (ACPI_CAST_PTR(struct acpi_namespace_node, handle));
 }
+EXPORT_SYMBOL(acpi_ns_map_handle_to_node);
 
 /*******************************************************************************
  *
@@ -736,6 +737,7 @@ acpi_handle acpi_ns_convert_entry_to_handle(struct acpi_namespace_node *node)
 	return ((acpi_handle) Node);
 ------------------------------------------------------*/
 }
+EXPORT_SYMBOL(acpi_ns_convert_entry_to_handle);
 
 /*******************************************************************************
  *
@@ -875,6 +877,7 @@ acpi_ns_get_node(struct acpi_namespace_node *prefix_node,
 	ACPI_FREE(internal_path);
 ...
From: Jean Delvare
Date: Tuesday, April 17, 2007 - 3:03 am

Hi Bjorn, Luca,


Yes, it is, and the latest news are that there were no problems with

True, and we are stepping back on this. See:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9208ee8286...

However, it's not so easy, as I have been explaining to Pavel lately.
Some machines have the ACPI vs. lm-sensors conflict while the SMBus
wasn't hidden. On others, the SMBus was hidden and there was no
conflict. So we are not going to just stop unhiding the SMBus and fix
all the problems that way. It's neither necessary, nor sufficient. The

Legacy ISA hardware monitoring chips (W83781D, W83782D, LM78, LM79)
were indeed probed at an arbitrary address (0x290). I pretty much doubt
that these old chips are found on systems with (working) ACPI though.

More recent "ISA" hardware monitoring chips are embedded in Super-I/O
chips, they are in fact LPC devices, not ISA. The base I/O address is
read from the Super-I/O configuration space, (0x2e/0x2f or 0x4e/0x4f),
just like serial ports, parallel ports, floppy disk controllers, IR,
etc. So these devices _are_ enumerated. Sometimes they are also listed
as PNP devices.
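
The enumeration step described above, reading the base I/O address through
the Super-I/O index/data pair, can be sketched against a simulated
configuration space. No real port I/O is performed. The register numbers
(0x07 for logical device select, 0x60/0x61 for the base address) follow a
common Winbond-style layout and are assumptions here, as are all function
names; real chips also need an unlock sequence first, which is omitted.

```c
#include <assert.h>
#include <stdint.h>

#define SIO_REG_LDSEL	0x07	/* logical device select (assumed) */
#define SIO_REG_ADDR	0x60	/* base address, high byte first (assumed) */

/* Simulated Super-I/O configuration space behind 0x2e/0x2f. */
struct superio {
	uint8_t index;		/* value last written to the index port */
	uint8_t ldn;		/* currently selected logical device */
	uint8_t cfg[16][256];	/* per-logical-device config registers */
};

/* Models outb(reg, 0x2e). */
static void sio_outb_index(struct superio *s, uint8_t reg)
{
	s->index = reg;
}

/* Models outb(val, 0x2f). */
static void sio_outb_data(struct superio *s, uint8_t val)
{
	if (s->index == SIO_REG_LDSEL)
		s->ldn = val & 0x0f;
	else
		s->cfg[s->ldn][s->index] = val;
}

/* Models inb(0x2f). */
static uint8_t sio_inb_data(const struct superio *s)
{
	if (s->index == SIO_REG_LDSEL)
		return s->ldn;
	return s->cfg[s->ldn][s->index];
}

/* Select a logical device and read its 16-bit base I/O address. */
static uint16_t sio_read_base(struct superio *s, uint8_t ldn)
{
	uint16_t hi, lo;

	assert(ldn < 16);
	sio_outb_index(s, SIO_REG_LDSEL);
	sio_outb_data(s, ldn);
	sio_outb_index(s, SIO_REG_ADDR);
	hi = sio_inb_data(s);
	sio_outb_index(s, SIO_REG_ADDR + 1);
	lo = sio_inb_data(s);
	return (uint16_t)((hi << 8) | lo);
}
```

Every step here is an index write followed by a data access, which is
exactly the kind of sequence that breaks if another agent touches the
index port in between.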

As for I2C/SMBus devices, they are indeed not enumerated. But I fail to

Interesting theory, but how does it fly in practice? Not much, I guess.
I doubt you could do anything useful with an ACPI-enabled system today


Asus does things like this, yeah, but I don't remember any other vendor
having such ACPI methods. It would make sense to write a driver for
this Asus stuff. One of the problems we have in hardware monitoring
drivers is that we usually don't know how the chip is wired on a given
motherboard. The ACPI data might help.

For the other vendors, my assumption is the same as yours: they either
ignore the conflict, or have proprietary, undocumented ways to get
around it. After all, there must be a reason why there is no native
sensors support in Windows and every vendor provides its own tool.

-- 
Jean Delvare
-

From: Rudolf Marek
Date: Sunday, February 18, 2007 - 3:43 pm

Hello all,

I got the DSDT from Chuck and it seems there is nothing interesting in it: no
declaration of PCI_config for the registers. If someone wants to check it, I can
send them the DSDT.

_TMP looks like this:

    Store (\_SB.PCI0.LPC0.EC0.RTMP, Local0)
    Store ("Current temp is: ", Debug)
    Store (Local0, Debug)
    Store (Local0, \_SB.CM25)
    Return (Add (0x0AAC, Multiply (Local0, 0x0A)))

This looks quite OK. LPC0.EC0 is the embedded controller I/O RAM at 0x62. Nothing
special. I guess some SMM interrupt is reading the PCI regs and sending them to the EC.
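
For reference, the arithmetic in the Return statement of _TMP above can be
sketched in Python. ACPI expects _TMP to report tenths of a kelvin, so the
method multiplies the Celsius reading from the EC by 10 and adds 0x0AAC
(2732, i.e. 273.2 K). The function names below are made up for illustration.

```python
KELVIN_OFFSET_DECI = 0x0AAC  # 2732 tenths of a kelvin, i.e. 273.2 K


def tmp_return_value(celsius):
    """Mirror Add (0x0AAC, Multiply (Local0, 0x0A)) from the DSDT."""
    return KELVIN_OFFSET_DECI + celsius * 0x0A


def deci_kelvin_to_celsius(value):
    """Undo the conversion, as a consumer of _TMP would."""
    return (value - KELVIN_OFFSET_DECI) / 10


for reading in (47, 128):
    raw = tmp_return_value(reading)
    print(reading, "C ->", raw, "->", deci_kelvin_to_celsius(raw), "C")
```

The conversion is linear, so a bogus 128 C report means the EC value in
Local0 was already wrong before _TMP did its arithmetic.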


Chuck, could you please run your script that reads the temps from ACPI
once a second and, in another console, run the following command (as root):

 watch -d -n 1 lspci -xxx -s 18.3

This will list the PCI registers every second and mark the differences. You should
see changes at 0xe6, which is your temperature. However, if you see 0xe4 changing
while the k8temp driver is NOT loaded, it means something else is doing that.
If you spot 0xe4 changing, please stop your script and check whether it is still
changing.
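
If watching the highlighted output by eye is awkward, two saved dumps can be
compared offline. The following is a minimal Python sketch that assumes the
usual `lspci -xxx` layout (a device header line followed by rows of an offset,
a colon and up to 16 hex bytes). The sample rows are made up and are not real
register contents.

```python
import re

# A hex row in `lspci -xxx` output: an offset, a colon, then up to 16 bytes.
HEX_ROW = re.compile(r"^([0-9a-f]{2,3}):((?: [0-9a-f]{2}){1,16})$")


def parse_config_dump(text):
    """Return {offset: byte} for every hex row found in one dump."""
    registers = {}
    for line in text.splitlines():
        match = HEX_ROW.match(line.strip().lower())
        if not match:
            continue  # device header or blank line
        base = int(match.group(1), 16)
        for i, byte in enumerate(match.group(2).split()):
            registers[base + i] = int(byte, 16)
    return registers


def changed_offsets(before, after):
    """List the offsets whose byte differs between two dumps."""
    old, new = parse_config_dump(before), parse_config_dump(after)
    return sorted(off for off in old.keys() & new.keys() if old[off] != new[off])


# Made-up sample rows, not real register contents.
BEFORE = ("00:18.3 Host bridge: example\n"
          "e0: 00 00 00 00 00 00 2f 00 00 00 00 00 00 00 00 00")
AFTER = ("00:18.3 Host bridge: example\n"
         "e0: 00 00 00 00 04 00 30 00 00 00 00 00 00 00 00 00")

print([hex(off) for off in changed_offsets(BEFORE, AFTER)])
```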


Maybe you will not be able to see any changes at all; then the SMM is doing
reads only. Also, you did not say how many cores or sensor places you see temps
for with the k8temp driver. Do you have a dual core with two possible places?

Thanks
Rudolf
-

From: Chuck Ebbert
Date: Tuesday, February 20, 2007 - 8:08 am

I blacklisted the k8temp driver (and the out-of-tree k8_edac driver
in Fedora) and the temps were still volatile, so that's not causing
it. Since then I've upgraded the system BIOS from F.06 to F.27 and
the problems _may_ have gone away. My own custom 2.6.19 kernel has
never been a problem, so I'm thinking it's one of these drivers
loaded by Fedora that I never even compile:

	i2c_core
	i2c_ec
	i2c_piix4
	asus_acpi (on a Compaq???)
	sbs


-

From: Dave Jones
Date: Tuesday, February 20, 2007 - 12:11 pm

On Tue, Feb 20, 2007 at 10:08:26AM -0500, Chuck Ebbert wrote:
 
 > 	i2c_core
 > 	i2c_ec
 > 	i2c_piix4
 > 	asus_acpi (on a Compaq???)
 > 	sbs

Something is pulling in asus_acpi as a dependency. I've never
figured out what the cause is.  For a long time I was thinking
that we had an explicit modprobe for it in an initscript, but
grepping for it in /etc turns up zip.

		Dave

-- 
http://www.codemonkey.org.uk
-

From: Jean Delvare
Date: Wednesday, February 21, 2007 - 9:17 am

How could it be, given that asus_acpi doesn't export any symbol?

-- 
Jean Delvare
-

From: Dave Jones
Date: Wednesday, February 21, 2007 - 10:37 am

On Wed, Feb 21, 2007 at 05:17:37PM +0100, Jean Delvare wrote:
 > On Tue, 20 Feb 2007 14:11:42 -0500, Dave Jones wrote:
 > > On Tue, Feb 20, 2007 at 10:08:26AM -0500, Chuck Ebbert wrote:
 > >  
 > >  > 	i2c_core
 > >  > 	i2c_ec
 > >  > 	i2c_piix4
 > >  > 	asus_acpi (on a Compaq???)
 > >  > 	sbs
 > > 
 > > Something is pulling in asus_acpi as a dependency. I've never
 > > figured out what the cause is.  For a long time I was thinking
 > > that we had an explicit modprobe for it in an initscript, but
 > > grepping for it in /etc turns up zip.
 > 
 > How could it be, given that asus_acpi doesn't export any symbol?
 
If I knew I'd have fixed it by now.

		Dave

-- 
http://www.codemonkey.org.uk
-

From: Dave Jones
Date: Wednesday, February 21, 2007 - 1:19 pm

On Wed, Feb 21, 2007 at 12:37:40PM -0500, Dave Jones wrote:
 > On Wed, Feb 21, 2007 at 05:17:37PM +0100, Jean Delvare wrote:
 >  > On Tue, 20 Feb 2007 14:11:42 -0500, Dave Jones wrote:
 >  > > On Tue, Feb 20, 2007 at 10:08:26AM -0500, Chuck Ebbert wrote:
 >  > >  
 >  > >  > 	i2c_core
 >  > >  > 	i2c_ec
 >  > >  > 	i2c_piix4
 >  > >  > 	asus_acpi (on a Compaq???)
 >  > >  > 	sbs
 >  > > 
 >  > > Something is pulling in asus_acpi as a dependency. I've never
 >  > > figured out what the cause is.  For a long time I was thinking
 >  > > that we had an explicit modprobe for it in an initscript, but
 >  > > grepping for it in /etc turns up zip.
 >  > 
 >  > How could it be, given that asus_acpi doesn't export any symbol?
 >  
 > If I knew I'd have fixed it by now.

Ah, Fedora has this horror in its initscripts (which explains why I missed
it in my grep):

# Initialize ACPI bits
if [ -d /proc/acpi ]; then
    for module in /lib/modules/$unamer/kernel/drivers/acpi/* ; do
        module=${module##*/}
        module=${module%.ko}
        modprobe $module >/dev/null 2>&1
    done
fi


This is there because there's no clean way for userspace to know whether
to load the system-specific stuff right now. Bill Nottingham pointed
out that we could add a /sys/class/dmi/modalias and appropriate MODULE_DMI
tags to the various modules like asus_acpi to make udev autoload them.
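
As a rough illustration of that idea, here is a minimal Python sketch of
matching a machine's DMI modalias string against per-module alias patterns
with shell-style globbing. The alias pattern, the modalias strings and the
modules_for() helper are all invented for illustration; they are not taken
from any real module or from udev.

```python
from fnmatch import fnmatchcase

# Hypothetical alias table: pattern -> module. Real patterns would come from
# the modules themselves; these strings are invented for illustration.
ALIASES = {
    "dmi:*:svnASUSTeK*:*": "asus_acpi",
}


def modules_for(modalias):
    """Return the modules whose alias pattern matches this machine's modalias."""
    return sorted(module for pattern, module in ALIASES.items()
                  if fnmatchcase(modalias, pattern))


# Invented modalias strings for two machines.
print(modules_for("dmi:bvnExample:svnASUSTeKComputerInc.:pnExample:"))
print(modules_for("dmi:bvnExample:svnCompaq:pnExample:"))
```

With a table like this, asus_acpi would only be loaded on machines whose DMI
data matches, rather than on every system that has /proc/acpi.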

		Dave

-- 
http://www.codemonkey.org.uk
-

From: Jean Delvare
Date: Thursday, February 22, 2007 - 9:37 am

Ah, this also explains why the i2c_ec and sbs drivers were loaded on

Something similar should be doable for i2c_ec, as it's only useful if a
given ACPI object is present. sbs, in turn, is only useful if i2c_ec is
loaded.

-- 
Jean Delvare
-

From: Hans de Goede
Date: Friday, February 23, 2007 - 12:13 am

I'm thinking that it might be a good idea to also use udev autoloading
through DMI info for the abituguru and abituguru3 drivers (review please). They
both only support about 12 motherboards. For the abituguru driver, DMI info
could also be used to automatically set the module options needed on the 2
oldest uguru-featuring Abit motherboards. What do you think about this?

Regards,

Hans


-

From: Jean Delvare
Date: Friday, February 23, 2007 - 12:47 am

Given that the uguru chips are hard (impossible) to detect and only a
small number of boards need it, yes, I think it's a good idea.

-- 
Jean Delvare
-

From: Jean Delvare
Date: Wednesday, February 21, 2007 - 7:54 am

Hi Chuck,



Presumably autoloaded by the ACPI subsystem, I guess your ACPI

i2c-piix4 will autoload if a supported PCI device is found on your
system. Assuming this is the same physical bus as i2c_ec is exposing,
it's no good to load both i2c-piix4 and i2c_ec at the same time.
Unfortunately i2c_ec doesn't request the I/O resources it uses so this
kind of conflict cannot be avoided currently.
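
To show why requesting the I/O resources matters, here is a toy Python model
of exclusive port claims. It is only a sketch of the idea, not the kernel's
implementation. The IoPortRegistry class, the region length of 8 ports and the
assumption that i2c_ec would touch the same ports are hypothetical.

```python
class IoPortRegistry:
    """Toy model of exclusive I/O port claims (not the kernel's implementation)."""

    def __init__(self):
        self._claims = []  # list of (start, end_exclusive, owner)

    def request_region(self, start, length, owner):
        """Claim [start, start + length); return False if any port is taken."""
        end = start + length
        for claimed_start, claimed_end, _ in self._claims:
            if start < claimed_end and claimed_start < end:
                return False  # overlaps an existing claim
        self._claims.append((start, end, owner))
        return True


registry = IoPortRegistry()
print(registry.request_region(0x8400, 8, "i2c-piix4"))  # first claim
print(registry.request_region(0x8404, 4, "i2c_ec"))     # overlapping claim
```

A driver that never asks for its ports bypasses this check entirely, so the
second driver loads without any sign of a conflict.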

Can you try to load the i2c-dev driver, then run the following commands
and report the results:
$ i2cdetect -l
For each bus listed:
$ i2cdetect N

This is a new battery driver used in conjunction with i2c_ec. I guess
you have a smart battery in your laptop which is accessed through
the SMBus. I found that this driver bypasses the i2c-core locking,
which is really bad. I reported it one week ago:
http://marc.theaimsgroup.com/?l=linux-acpi&m=117160531631100&w=2
(for some reason my original post wasn't archived)
My patch wasn't applied, but the problems you describe could well be
caused by this locking issue. So I suggest that you unload the sbs
driver and see if things get better. If they do, you could try to apply
my patch and load sbs again, and see if it fixes it.

-- 
Jean Delvare
-

From: Chuck Ebbert
Date: Wednesday, February 21, 2007 - 9:03 am

FWIW it's really an ATIIXP chipset, but supposedly PIIX4 compatible:

# i2cdetect -l
i2c-0   smbus           SMBus PIIX4 adapter at 8400             Non-I2C SMBus adapter
# i2cdetect 0
WARNING! This program can confuse your I2C bus, cause data loss and worse!
I will probe file /dev/i2c-0.
I will probe address range 0x03-0x77.
Continue? [Y/n] y
     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
00:          XX XX XX XX XX XX XX XX XX XX XX XX XX
10: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX
20: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX
30: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX
40: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX
50: 50 51 XX XX XX XX XX XX XX XX XX XX XX XX XX XX
60: XX XX XX XX XX XX XX XX XX 69 XX XX XX XX XX XX
70: XX XX XX XX XX XX XX XX
-

From: Jean Delvare
Date: Wednesday, February 21, 2007 - 9:22 am

No i2c_ec. Maybe your distribution is loading it by default for
everyone then.

Either way, it means you can forget right away about sbs, if i2c_ec

Only a couple EEPROMs and a clock chip on your SMBus, it's very
unlikely that ACPI accesses this at all. So I'd be surprised that
i2c-piix4 is causing any trouble.
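
The responding addresses can be read off the i2cdetect grid mechanically. The
following is a minimal Python sketch that assumes the fixed-width layout shown
in the output quoted in this thread (a two-digit row offset, a colon, then
three characters per address). The responding_addresses() helper is made up
for illustration. Which chip sits behind each address is my reading, not
something the code can tell.

```python
import string

# The grid as quoted in this thread.
GRID = """\
     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
00:          XX XX XX XX XX XX XX XX XX XX XX XX XX
10: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX
20: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX
30: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX
40: XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX
50: 50 51 XX XX XX XX XX XX XX XX XX XX XX XX XX XX
60: XX XX XX XX XX XX XX XX XX 69 XX XX XX XX XX XX
70: XX XX XX XX XX XX XX XX
"""


def responding_addresses(grid):
    """Return the bus addresses whose grid cell shows a hex address."""
    found = []
    for line in grid.splitlines():
        head, sep, rest = line.partition(":")
        if not sep or len(head) != 2:
            continue  # column header or unrelated text
        try:
            base = int(head, 16)
        except ValueError:
            continue
        for col in range(16):
            cell = rest[3 * col:3 * col + 3].strip()
            # "XX" and blank cells mean no device answered or was not probed.
            if len(cell) == 2 and all(c in string.hexdigits for c in cell):
                found.append(base + col)
    return found


print([hex(addr) for addr in responding_addresses(GRID)])
```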

This leaves asus_acpi as the only candidate? Better unload _all_ the
drivers you consider as suspects, and see if it changes anything. I
guess not.

-- 
Jean Delvare
-
