Re: LMSENSORS: 2.6.26-rc, enabling ACPI Thermal Zone support costs sensors

Previous thread: [PATCH] Fix serial_match_port() for dynamic major tty-device numbers by Guennadi Liakhovetski on Saturday, June 21, 2008 - 3:45 pm. (3 messages)

Next thread: [OLPC] sdhci: add quirk for the Marvell CaFe's vdd/powerup issue by Andres Salomon on Saturday, June 21, 2008 - 6:15 pm. (4 messages)
From: Rene Herman
Date: Saturday, June 21, 2008 - 5:47 pm

Good day.

On 2.6.26-rc and perhaps earlier, when I enable the ACPI Thermal Zone 
support (CONFIG_ACPI_THERMAL) I see in dmesg:

ACPI: LNXTHERM:01 is registered as thermal_zone0
ACPI: Thermal Zone [THRM] (56 C)

My /sys/class/hwmon/hwmon0 (a W83782D chip) becomes hwmon1, there's a 
new /sys/class/hwmon/hwmon0 and "sensors -s" craps out with:

# sensors -s
Can't access procfs/sysfs file
Kernel interface access error
For 2.6 kernels, make sure you have mounted sysfs and libsensors
was compiled with sysfs support!

# sensors --version
sensors version 2.10.6 with libsensors version 2.10.6

This is the slackware 12.1 (recent) standard version. What's wrong?

In case it's useful, my /etc/sensors.conf is at:

http://members.home.nl/rene.herman/sensors.conf

Rene.
--

From: Hans de Goede
Date: Sunday, June 22, 2008 - 12:28 am

I'm pretty sure this is caused by your lm_sensors using space being too old to 
support the new thermalzone stuff. You need at least 3.0.2 to support the 
thermalzone driver.

Regards,

Hans
--


On 22-06-08 09:28, Hans de Goede wrote:



I see. I was about to mark this up as Volkerding doing his usual "if it 
has a lower version number it must be better" thing but in this case it 
seems it's hwmon or ACPI which is to blame.

Firstly -- with CONFIG_ACPI_THERMAL selected my sensors work fine on 
2.6.25-rc7 with the above 2.10.6 lm_sensors userspace. Now, with 
2.6.26-rc (*) they do not as per above.

This is ABI breakage. I wouldn't care if my older lm_sensors userspace 
couldn't handle the ACPI Thermal Zone, but I do care that even having it 
breaks my other sensors; especially given the CONFIG_ACPI_THERMAL help 
text which can not be read as recommending to disable it:

   This driver adds support for ACPI thermal zones.  Most mobile and
   some desktop systems support ACPI thermal zones.  It is HIGHLY
   recommended that this option be enabled, as your processor(s)
   may be damaged without it.

Now, I'm actually usually a big fan of not dragging around old gunk 
forever, ABI be damned, but in this case this really won't do. 2.10.6 is 
a recent maintenance release and I see for the new 3.0 branch:

http://www.lm-sensors.org/wiki/Download

===
Most third party monitoring applications do not yet work with the 
library in this package. We are encouraging authors to port their 
applications to the new library. We already have patches for xsensors 
0.60, gkrellm-2.3.0, net-snmp-5.4.1 (configure with 
--with-mib-modules="ucd-snmp/lmsensorsMib" --with-ldflags="-lsensors"), 
xfce4-sensors-plugin-0.10.99.2, kdebase-3.5.8(ksysguard), 
sensors-applet-1.8.1 and ksensors-0.7.3-fedora-14.tar.gz (upstream is 
dead; this tarball contains a version with all Debian's changes + 2 
patches from Fedora, including lm_sensors-3.x support).
===

So it seems we have here a change to the kernel requiring a userspace 
basically no one is ready for and which at least the (again, recent) 
slackware 12.1 doesn't ship as a result. This is ABI breakage of the 
really bad kind.

If there's ...

No it is not. In 2.6.26-rcX, the ACPI thermal zones have grown a hwmon interface; 
that is, they register a hwmon device so that "sensors" and other applications 
using the lm_sensors ABI can read the zone temperatures through an existing ABI 
instead of adding yet another ABI.

The problem is that the hwmon entries for the thermal zone device lack a 
"device" symlink under /sys/class/hwmon/hwmonX, as they are not tied to a 
specific device. lm_sensors-2.10.x barfs on this. That would not be a problem 
if it simply skipped the new hwmon entry, which it does not understand 
(because of the missing device link, which is not a part of the documented 
ABI). Instead of skipping it, if I understand you correctly, it aborts and 
never gets to hwmon1 (which is 100% unchanged and should still work fine).
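
To illustrate the "skip instead of abort" behaviour described above, here is a
minimal Python sketch. It is not the libsensors code (which is C) and not the
actual fix; the function name list_hwmon_chips and the class_dir parameter are
made up for this example. The only thing taken from the thread is the layout:
entries under /sys/class/hwmon, some of which may lack a "device" link.

```python
import os


def list_hwmon_chips(class_dir="/sys/class/hwmon"):
    """Return (usable, skipped) lists of hwmon entry names.

    An entry without a "device" link is put on the skipped list instead
    of causing the whole scan to fail. (Hypothetical helper, for
    illustration only.)
    """
    usable, skipped = [], []
    if not os.path.isdir(class_dir):
        # No hwmon class directory at all: nothing to report.
        return usable, skipped
    for name in sorted(os.listdir(class_dir)):
        entry = os.path.join(class_dir, name)
        # os.path.exists() follows symlinks, so this covers both a
        # "device" symlink and a plain "device" directory.
        if os.path.exists(os.path.join(entry, "device")):
            usable.append(name)
        else:
            skipped.append(name)
    return usable, skipped


if __name__ == "__main__":
    usable, skipped = list_hwmon_chips()
    print("usable:", usable)
    print("skipped (no device link):", skipped)
```

With the setup from the original report, a scan written this way would be
expected to put hwmon0 (the thermal zone) on the skipped list and still reach
hwmon1 (the W83782D).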

I wonder what just plain "sensors" (without the -s) does.

Still this is an issue that needs fixing, but not on the kernel side, but 

Actually it's the lm_sensors userspace which is to blame, as instead of skipping the 
new hwmon device which it doesn't understand it aborts (at least that is what I 


Erm, if you look at that same page you will notice there are links to patches 
for almost every userspace package which uses lm_sensors. I know, as I wrote 
most of them; quite a few of them have been integrated by their respective upstreams.

Also "no one is ready for"? lm_sensors-3.0.x is the default in both Fedora 9 and 

I agree that requiring libsensors-3.0.2 for this is not a good solution, but I 
don't want to be crippling the kernel for what I believe is a bug in 2.10.x either.

So we need to do 2 or 3 things:
1) Find out if this really is as big an issue as you make it, maybe
    "sensors -s" is rightfully complaining about hwmon0 and then still happily
    doing its job for hwmon1?

    Again, what does plain "sensors" say? Does it still show the hwmon1
    readings, and are the limits what they should be after sensors -s?

2) Fix this in the 2.10.x series (which are still ...

[ ... ]

Now what? Yes it is. 2.6.25.7 works and 2.6.26-rcX with the same config 
options and the same userspace does not. What do you think ABI breakage 
is? It's not relevant that you feel/know that the lm_sensors userspace 
has a bug; it used to work, it's widely installed and the new kernel 

You don't need to. As said, just make it optional. The attached seems to 
be working for me.

Rene.

Know what? No it isn't. Just because some random userspace app breaks because 
certain _assumptions_ no longer hold true does not make something an ABI breakage.

I agree with you that the results are still no good though.

Know something else? I've just stopped caring about this issue, I'm not the 
author of the changes causing said breakage. I'm merely an lm_sensors (both 
userspace and kernel space) developer who was heavily involved in getting this 
fixed for lm_sensors-3.0.2, and I believe that adding yet another kconfig 
option which we then carry for years and years is _not_ a good solution. Some 
userspace utilities like udev sit very close to the kernel and sometimes a 
kernel update mandates a new udev. To me this is much the same.

But at the end of the day, I do not feel responsible for this as I'm not the 
author of the code causing the breakage. I'm just someone who knows the ins and 
outs and tried to help, but given the treatment and thanks I've been getting 
for my help I'm stopping with helping now.

Regards,

Hans
--


What on earth are you talking about? Could you please re-read? I didn't 
  "treat badly" you, hwmon, acpi or whatever.

I'm simply pointing out the problem that 2.6.26 is going to break all 
setups using lm_sensors 2.0 (which among many, many others includes 
every single slackware and derivative system on the planet).

We are not having a flamewar. If you think that every disagreement or 
pointing out of a problem constitutes as much, that in itself is a 
problem but it's not mine. I reported the problem and then posted a 
patch that solves it one particular way.

Another way to solve it _could_ be to just make up a device link if 
something generic is available so that sensors doesn't trip over it in 
the first place but I don't know if that's a good option. You might.

I haven't a clue what you're talking about. Treatment? What treatment? I 
just want to get the above mentioned problem fixed and didn't suggest 
anything else. Let's get the problem fixed.

Rene.
--


This also works for me and, if correct, is of course better than the 
CONFIG option. Wants a comment from the thermal_zone side (for which 
Zhang Rui seems the correct CC?) though.

Rene

Hi, Rene and Hans,

Thank you for your efforts on this issue and sorry for the late
response, I did not check my email during the whole weekend.

About the hwmon ABI, after the device symbolic link is created, are there
any other ABIs required in the device node?
If not, this patch seems to work, although it might break if the first
registered ACPI thermal zone device is unregistered, which ONLY happens
theoretically.

thanks,

--


Not that I know of, Jean ?

Regards,

Hans
--


Mmm. Because more thermal zones may share one hwmon interface I gather?
Do you feel this is an okay minimal fix for 2.6.26 or is there something 

--

From: Jean Delvare
Date: Monday, June 23, 2008 - 4:06 am

Hi Rui,


Not that I know of. The code currently in the kernel works just fine
with both lm-sensors SVN trunk (which will become 2.10.7 soon) and

If I remember correctly, there is more than one device in a given
thermal zone, so having a link pointing to one of them would create an
asymmetry. This basically means that nobody can use the device link in
question, so there's no point in creating it from a functional point of
view. If the goal is simply to solve the breakage that was reported by
Rene, the bug is in libsensors, so that's where it must be fixed (and
actually it already is.)

Thanks,
-- 
Jean Delvare
--

From: Jean Delvare
Date: Monday, June 23, 2008 - 3:56 am

Hi Rene,


You certainly did. I felt the same as Hans when reading the discussion
thread.

Now, you are lucky that I already know you and I know that you are
usually helpful and cooperative, so I know that you probably didn't
mean the aggressive tone you used. But presumably Hans doesn't know
you, so his reaction was understandable.

Hans is one of the main contributors to the hwmon subsystem and the
lm-sensors project in general. I would be grateful if the users he
tries to help could treat him with the respect he deserves. If he was
to leave the project, this would be a significant loss, and this is
something I really want to avoid.

Thanks,
-- 
Jean Delvare
--


Rene,
Thank you for reporting this.

I agree that this failure is an unwelcome surprise to those users
who upgrade to 2.6.26 but are still using libsensors <= 2.10.6.

Jean, Mark, Hans,

I'm actually fine with adding a temporary kernel config option
along the lines Rene suggested to ease the migration
to linux-2.6.26 for those users.

But the config option would need to be scheduled for removal
after a certain period (say 6 months) so we don't have to maintain
it forever.

More importantly, I think it would also have to be disabled by default
so that it would not have a negative impact on what we think are the
majority of properly configured systems.  After all, we fixed this
bug in user-space about 4 months ago and as you point out, the
distro upgrade path is actually quite well looked after.

So I'm not sure how useful it would be to the target users.
After they run into the problem, they'd probably google it
and find that they can either tweak a kernel config option
or upgrade libsensors.  And we'd prefer that they do the
latter rather than the former, yes?

just let me know.

thanks,
-Len
--

From: Jean Delvare
Date: Monday, June 23, 2008 - 12:54 pm

Hi Len,


If the option defaults to thermal zone being enabled, and users have to
rebuild their kernel to disable it, then I don't think it has any
value. Patching and rebuilding libsensors is no harder than
reconfiguring and rebuilding the kernel, so we can as well tell the
users to do the former.

If the option defaults to thermal zone being disabled, then it makes
some sense as a way to smooth the transition. If the help text is clear
enough (clearly saying for which versions of lm-sensors users can
enable the option and for which versions they would rather not) it
should work. The drawback being that users in a hurry might stick to
the default without reading the help text, and miss an opportunity to
have the ACPI thermal zones magically integrated into their favorite
monitoring application.

Whether avoiding the risk of easily fixable breakage for some users is
worth the temporary loss of new functionality for other users, I can't
really say. I think I wouldn't do it myself, but I'm not the only one
to decide. If we decide to do it, I have no objection.

Thanks,
-- 
Jean Delvare
--


Thank you for the reason. Yes, people upgrading libsensors would be 
preferred and this issue should show up in google now.

Frankly, if I had gone to the lm-sensors homepage and had seen a fixed 
2.x version available I'd no doubt have shrugged it off then and there 
so I might as well do so now that Delvare announced that it's _going_ to 
be there real soon. 2.10.7 should be painless enough as an upgrade, it's 
needing lm-sensors 3.0(.2) and all its listed accompanying patches to 
programs depending on libsensors which triggered this thread.

So, <shrug>. I know how to fix my own systems, others will be able to 
find out and once lm-sensors 2.10.7 is available, it's not a big deal 
anymore.

That said, here's the "make it optional" patch tweaked according to your 
comments. Feel free to drop it on the floor...

Rene.

Typo(s) in the feature-removal-schedule (only change).

Rene.

On 22-06-08 16:29, Hans de Goede wrote:


Same thing as sensors -s:

rene@7ixe4:~$ sensors 

Can't access procfs/sysfs file
Kernel interface access error
For 2.6 kernels, make sure you have mounted sysfs and libsensors
was compiled with sysfs support!

And no, the values are not being set by sensors -s, it's not just 
complaining but still working.

Rene.
--

From: Jean Delvare
Date: Monday, June 23, 2008 - 3:08 am

Hi Hans,


Technically speaking, Hans is right. The problem doesn't qualify as an
"ABI breakage". Just because a kernel update broke user-space doesn't
imply ABI breakage.

That being said, the truth is that we don't really care how the
breakage is called. It's broken and needs to be fixed, that's all we


Can you please update the wiki to say so? I was myself wondering
whether your patches had been integrated or not by now.

Maybe we could present things a bit better, for example with a table or
a bullet-list, listing all applications, with the minimum version that
has libsensors 3 support (for applications which are already updated
upstream) or a link to the patch (for those which are not.) This could
stay on the Download page, or be moved to a dedicated page. If you have
some time to do something like this, please do. If not, please tell me,

Note: openSuse 11.0 ships with both libsensors 2.10.6 and libsensors
3.0.2. sensors links with the latter, but other applications presumably
link with the former. This version of openSuse is the transition one.

Actually, requiring libsensors 3.0.2 wouldn't be such a bad solution.
libsensors 2.10.x lacks support for many new devices. It also needs
libsysfs, which most distributions are trying to get rid of.

You can't at the same time upgrade to the latest kernel and expect
legacy branches of all user-space tools to keep working. Legacy means,

In fact, it is already fixed. If you look carefully at:
http://www.lm-sensors.org/wiki/Download

You'll see that I maintain a list of recommended patches for the last
version of each branch of lm-sensors. For version 2.10.6, there are
currently 3 recommended patches, the first one being:
http://www.lm-sensors.org/changeset/5147

Which, you guessed it, fixes the problem reported by Rene.


No, the kernel does the right thing and does not need to be modified at
all.

-- 
Jean Delvare
--

From: Rene Herman
Date: Monday, June 23, 2008 - 3:24 am

No Jean, this is totally unacceptable. No matter how you want to call 
things, 2.6.26 is going to break important functionality on millions of 
systems and you simply do not get to do that. Can you comment on the 
last patch posted? It's trivial:

http://lkml.org/lkml/2008/6/22/243

Rene.

--

From: Jean Delvare
Date: Monday, June 23, 2008 - 4:57 am

Hi Rene,


No, it's not going to be the end of the world that you predict. Please
stop being alarmist, it really doesn't help.

We are going to break hardware monitoring for users who upgrade to
kernel 2.6.26 by themselves and have enabled option "THERMAL" and are
using lm-sensors <= 2.10.6. I suspect this is a relatively small number
of users, and these are also the ones who are presumably skilled enough
to go to http://www.lm-sensors.org/, find the patch they need, and
apply it to libsensors themselves. 

We are not going to break any system using a distribution kernel
because distributions test their kernel at least to some extent before
they release it, and that kind of breakage can't go unnoticed. So,
distributions which haven't completely switched to lm-sensors 3.x yet,
will see the breakage and patch their libsensors 2.10.6 to fix it. For
what it's worth, the patch in question is in openSuse since March 17th.

For distributions who have good maintainers, there should never be any
problem anyway. We maintain and publish a list of recommended patches.
A distribution with all these patches applied should avoid all known
problems, compatibility or otherwise.

Please also realize that I personally keep the maintainers of the
Fedora, openSuse and Debian lm-sensors packages informed when I update
the list of recommended patches. If I should forget to do so and they

It's trivial and wrong, so thanks but no thanks. The bug is in
libsensors, we fix it in libsensors.

-- 
Jean Delvare
--

From: Rene Herman
Date: Monday, June 23, 2008 - 5:35 am

Which is an option selected by ACPI_THERMAL, for which I quoted the help 
text earlier. Basically everyone with ACPI enabled will have it enabled. 

Yes, right. All not completely new systems, all completely new slackware 
and derived systems... "relatively" is a word very much needed here.

I really cannot believe you guys are actually arguing this. It seems 
that me being tired and short pulled this into senseless country but 
can we please concentrate on the issue?

libsensors dictated the ABI rule that the hwmon directories must have 
device backlinks; the new ACPI Thermal Zone hwmon interface breaks that 
bit of ABI. It is not relevant that that ABI may have gotten to be as a 
result of unfortunate programming on the userspace side -- the only 
thing relevant is that it IS. lm-sensors 2 is on millions of systems out 
there. This is not meant aggressively, or whatever you guys seem to want 

At times there can obviously be situations where it's fine to require 
new userspace but in this case we have a new userspace which hasn't even 
been released yet, we have a ton of _different_ userspace depending on 
that bit of core userspace, we have breakage of the important kind (as 
you no doubt know, sensors can be pretty vital, although admittedly it's 
not silent breakage at least) and we have an opportunity to just say 
"okay, we'll apply a 1 line patch and be done with it" which avoids any 

This cannot be the reason, because it's not wrong. We just need a device 
backlink. Basically, any single one will do. It's just about keeping 
lm-sensors 2 happy.

Rene.
--

From: Jean Delvare
Date: Monday, June 23, 2008 - 6:47 am

Hi Rene,

FYI: this is my last reply to you as far as this thread is concerned. I
have work to do, and this problem is already solved.


Correct, I had forgotten that it was enabled by ACPI_THERMAL. So indeed

Slackware 12.1 shipped with a 2.6.25 kernel and this isn't going to
change. So I am absolutely not worried about Slackware users in
general. The only users who will hit the breakage are the ones
upgrading their kernel themselves, and that's regardless of the

When you have the feeling that everybody else has gone crazy and you're

We don't care about how many systems use lm-sensors 2. We only care
about how many of these will upgrade to kernel 2.6.26 before they
upgrade to lm-sensors 2.10.7 or patch their lm-sensors 2.10.6. My take
is that these aren't that many people.

Seriously, the kind of "ABI breakage", as you insist on calling it,
happens all the time. Just looking at sensors and only counting the
very big changes, sensors were exposed in /proc in 2.4 kernels and are
now exposed in /sys in 2.6 kernels, and all hardware monitoring devices
were i2c devices to the kernel until kernel 2.6.13 and this is no
longer the case. Just try going a couple lm-sensors versions back while
still running your 2.6.25 kernel and you'll see it won't take long
before at least a specific case breaks.

And I'm fairly certain that we (the lm-sensors group) aren't especially
bad at that. Every other subsystem that evolves quickly must have the
same problems. That's exactly the reason why we have user-space
libraries interfacing with the kernel. When the kernel interfaces
evolve, we update the libraries to take the changes into account. It is
usually so smooth that you don't see it. This time it's a bit less
smooth because we've been too slow releasing 2.10.7. You can't get it
perfect all the time.

I'm really sorry that we don't live in an ideal world where everything

I'm arguing this because you're trying to frighten everyone with
"kernel ABI breakage" and "millions of users" when ...
From: Rene Herman
Date: Monday, June 23, 2008 - 7:06 am

As Zhang Rui said, this cannot happen in reality. I'll stop talking to 
you. Kernel side, it's not your problem anyway, it's an ACPI Thermal 
Zone one. Guess I'll go ask the 2.6.26 release manager if he feels that 
breaking existing lm-sensors 2 userspace systems is acceptable. I myself 
obviously know how to fix things by now.

Admittedly I need to find another hobby because the regularity with 
which people on this list piss me off is definitely disturbing. I also 
have this depressing notion that it might just be people pissing me off 
with frightening regularity period, but oh well.

Gardening... maybe I'll do gardening now.

Rene.
--

From: Matthew Garrett
Date: Monday, June 23, 2008 - 7:31 am

No, libsensors made an assumption about the ABI that turns out not to be 
true. The ABI hasn't changed, libsensors is just being exposed to a case 
it didn't previously see.

We've had this kind of change before. The ACPI backlight code changed in 
such a way that scripts that blindly wrote values instead of (correctly) 
reading the maximum brightness value broke. mmap's behaviour changed in 
such a way that it was no longer possible for vm86 to execute code that 
wasn't mapped as executable, breaking libx86. The applications in 
question were undeniably buggy. Those are examples that I was personally 
involved with - I'm sure there are others. Where userspace has made 
false assumptions, it's not the kernel's responsibility to continue to 
support those assumptions.
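
As an illustration of the backlight case, the following Python sketch reads
the maximum brightness first and clamps the requested value to it instead of
writing blindly. It assumes the sysfs backlight class layout, i.e. a
directory containing "brightness" and "max_brightness" files; the message
above does not say which interface the broken scripts used, and set_backlight
and device_dir are names invented for this example.

```python
import os


def set_backlight(device_dir, requested):
    """Write a brightness clamped to [0, max_brightness].

    device_dir is assumed to be a backlight class directory such as
    /sys/class/backlight/<device>. Returns the value actually written.
    (Hypothetical helper, for illustration only.)
    """
    # Read the driver's maximum instead of assuming a fixed range.
    with open(os.path.join(device_dir, "max_brightness")) as f:
        maximum = int(f.read().strip())
    value = max(0, min(int(requested), maximum))
    with open(os.path.join(device_dir, "brightness"), "w") as f:
        f.write(str(value))
    return value
```

A script written this way keeps working when the range of valid values
changes, which is the distinction being drawn between a buggy application and
a broken ABI.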

-- 
Matthew Garrett | mjg59@srcf.ucam.org
--

From: Rene Herman
Date: Monday, June 23, 2008 - 10:10 am

We are not going to agree. In this, it's not a random application, but the
one and only interface to sensors that's in use that breaks. It is all of
sensors support that breaks, all user interfaces, as they all depend on the
one libsensors. Sure, if some random application makes bad assumptions the
remedy is fixing the random application. If the one and only interface to
something breaks, it's the ABI that breaks.

And if people really insist on calling it FNOOZLEGLUM breakage instead of
ABI breakage, all for it. I love exciting words. It's just that I'm really
more interested in the "breakage" bit than anyone else in this thread it
seems.

Rene.
--

From: Hans de Goede
Date: Monday, June 23, 2008 - 6:51 am

To elaborate on that, let me add that I am the Fedora lm_sensors maintainer and 
as such I'm very much aware of this problem.

Regards,


Hans

--

From: Hans de Goede
Date: Sunday, June 22, 2008 - 12:30 am

I'm pretty sure this caused by your lm_sensors using space being too old to
support the new thermalzone stuff.

<Correction: that should of course read "userspace", not "using space", so the 
correct reply I was trying to send is>:

I'm pretty sure this is caused by your lm_sensors userspace being too old to
support the new thermalzone stuff. You need at least 3.0.2 to support the
thermalzone driver.

Regards,

Hans

--
