Hello all, forgive me for such a blunt description, but: "bad hardware - PASS, good hardware - FAIL" That's about what could be used as a summary for my recent observations about our ACPI PM-Timer usage ;) With a modified pmtmr_test on my stoneage system, I get the following sequence (relevant excerpt only): 652419 <== 65241b <=================== 65241d <== 65201e 65241f 652020 652421 652022 652423 652024 652425 652429 65202e 652431 652032 652433 652034 652435 652036 652437 652038 <== 65203a <==================== 65203c <== 65203e 652040 652042 652044 652046 652048 Result: catastrophic timer behaviour (a large backwards skip is possible), even in case we do a triple-read workaround, due to a floating bit at 0x0400 (possibly caused by underclocking from 400 to 150, but whatever...). And my system does pass the bootup PM-Timer check quite often despite this severe defect (2 in 4 bootups _did_ register my defective acpi_pm clocksource). That was the "PASS for broken hardware" part. Now to the "FAIL for sufficiently-good hardware" part: I realized that in historic versions (e.g. 2.6.12) read_pmtmr() encompassed the _entire_ "triple-reading due to latch bug" logic. Nowadays read_pmtmr() is the raw inline version of a single inl() only! However despite this large change, the initial hardware check (at init_acpi_pm_clocksource()) _kept using_ the now single-read read_pmtmr() as if nothing had happened. Why nobody realized this during review is beyond me. The result is that even on latch-bugged systems which are capable to be used successfully (via the triple-read workaround) the PM-Timer bootup check most likely can fail quite often now due to now doing buggy single-reads during initial check (cannot verify this in real life due to lack of affected P3 hardware here). Both issues are caused by very weak monotony verification in init_acpi_pm_clocksource(), thus that init check should get improved by leaps and bounds instead of stupidly ...
Hi Andreas,
this isn't the bug which is handled by the read-three-times-workaround.
Instead, that handels the following PIIX4 errata:
* PIIX4 Errata:
*
* The power management timer may return improper results when read.
* Although the timer value settles properly after incrementing,
* while incrementing there is a 3 ns window every 69.8 ns where the
* timer value is indeterminate (a 4.2% chance that the data will be
* incorrect when read). As a result, the ACPI free running count up
No surprise there -- it is the first time I see such an error; and it might
Well, we could do something like this for sure, but I haven't seen any other
=> might do this, but currently I'm not yet convinced whether we really need
it.
Best,
Dominik
acpi_pm.c: use proper read function also in errata mode.
When acpi_pm is used in errata mode (three reads instead of one), also the
acpi_pm init functions need to use three reads instead of just one.
Thanks to Andreas Mohr for spotting this issue.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.de>
diff --git a/drivers/clocksource/acpi_pm.c b/drivers/clocksource/acpi_pm.c
index 5ca1d80..2c00edd 100644
--- a/drivers/clocksource/acpi_pm.c
+++ b/drivers/clocksource/acpi_pm.c
@@ -151,13 +151,13 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SERVERWORKS, PCI_DEVICE_ID_SERVERWORKS_LE,
*/
static int verify_pmtmr_rate(void)
{
- u32 value1, value2;
+ cycle_t value1, value2;
unsigned long count, delta;
mach_prepare_counter();
- value1 = read_pmtmr();
+ value1 = clocksource_acpi_pm.read()
mach_countup(&count);
- value2 = read_pmtmr();
+ value2 = clocksource_acpi_pm.read()
delta = (value2 - value1) & ACPI_PM_MASK;
/* Check that the PMTMR delta is within 5% of what we expect */
@@ -177,7 +177,7 @@ static int verify_pmtmr_rate(void)
static int __init init_acpi_pm_clocksource(void)
{
- u32 value1, value2;
+ cycle_t value1, value2;
unsigned int i;
if (!pmtmr_ioport)
@@ -187,9 +187,9 @@ ...On Sun, 10 Aug 2008 18:29:20 +0200 if it's really only one board that's faulty... a dmi blacklist might be appropriate. -- If you want to reach me at my work email, use arjan@linux.intel.com For development, discussion and tips for power savings, visit http://www.lesswatts.org --
OK, right, technically this workaround is not related to this
different bug.
And it's in fact not this triple-read which has any weakness here but rather
Yeah, might be motherboard only, but likely still chipset-global,
since probably not too many people tried this beast with ACPI / acpi_pm even
(we're not even talking about the usual Linux ACPI 2001 blacklist limit
with this board, more like 1999, 1998 or even 1997 stoneage).
OK, dmidecode said:
Vendor: Award Software International, Inc.
Version: 4.51 PG
Release Date: 07/05/99
_DAMN_ you're fast! ;)
Technically it's related to the base type of cycle_t (i.e., u64 and thus
probably "unsigned long long"), thus %llx is the format specifier
Even if it's not a systematic chipset / layout error, then I'm sure there's always
the occasional custom-broken (read: damaged) system which would need a
useful check to avoid counter-related lockups.
IMHO the current init check is too weak, it will catch the very simplest
types of problems only, and that's not a good thing.
About Arjan's suggestion to use DMI blacklisting here: not the right
method here IMHO since one could easily catch such problems generically
and thus much more reliably than maintaining an ever-growing and thus
always-incomplete blacklist collection.
Anyway, he provided important input still ;)
Andreas Mohr
--
Hi Andreas,
what about this?
Best,
Dominik
acpi_pm.c: check for monotonicity
Expand the check for monotonicity by doing ten tests instead of one.
Applies on top of "acpi_pm.c: use proper read function also in errata mode."
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.de>
diff --git a/drivers/clocksource/acpi_pm.c b/drivers/clocksource/acpi_pm.c
index 2c00edd..f05c4fb 100644
--- a/drivers/clocksource/acpi_pm.c
+++ b/drivers/clocksource/acpi_pm.c
@@ -178,7 +178,7 @@ static int verify_pmtmr_rate(void)
static int __init init_acpi_pm_clocksource(void)
{
cycle_t value1, value2;
- unsigned int i;
+ unsigned int i, j, good = 0;
if (!pmtmr_ioport)
return -ENODEV;
@@ -187,24 +187,32 @@ static int __init init_acpi_pm_clocksource(void)
clocksource_acpi_pm.shift);
/* "verify" this timing source: */
- value1 = clocksource_acpi_pm.read();
- for (i = 0; i < 10000; i++) {
- value2 = clocksource_acpi_pm.read();
- if (value2 == value1)
- continue;
- if (value2 > value1)
- goto pm_good;
- if ((value2 < value1) && ((value2) < 0xFFF))
- goto pm_good;
- printk(KERN_INFO "PM-Timer had inconsistent results:"
- " 0x%#llx, 0x%#llx - aborting.\n", value1, value2);
- return -EINVAL;
+ for (j = 0; j < 10; j++) {
+ value1 = clocksource_acpi_pm.read();
+ for (i = 0; i < 10000; i++) {
+ value2 = clocksource_acpi_pm.read();
+ if (value2 == value1)
+ continue;
+ if (value2 > value1)
+ good++;
+ break;
+ if ((value2 < value1) && ((value2) < 0xFFF))
+ good++;
+ break;
+ printk(KERN_INFO "PM-Timer had inconsistent results:"
+ " 0x%#llx, 0x%#llx - aborting.\n",
+ value1, value2);
+ return -EINVAL;
+ }
+ udelay(300 * i);
+ }
+
+ if (good != 10) {
+ printk(KERN_INFO "PM-Timer had no reasonable result:"
+ " 0x%#llx - aborting.\n", value1);
+ return -ENODEV;
}
- printk(KERN_INFO "PM-Timer had no reasonable result:"
- " 0x%#llx - aborting.\n", ...Hi,
anyone willing to take these two patches? Possibly 2.6.27 material; hasn't
been in -next or -mm.
Best,
Dominik
The following changes since commit 796aadeb1b2db9b5d463946766c5bbfd7717158c:
Linus Torvalds (1):
Merge branch 'fixes' of git://git.kernel.org/.../davej/cpufreq
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6.git clocksource
Dominik Brodowski (2):
acpi_pm.c: use proper read function also in errata mode.
acpi_pm.c: check for monotonicity
drivers/clocksource/acpi_pm.c | 50 +++++++++++++++++++++++-----------------
1 files changed, 29 insertions(+), 21 deletions(-)
--
Expand the check for monotonicity by doing ten tests instead of one.
Applies on top of "acpi_pm.c: use proper read function also in errata mode."
CC: John Stultz <johnstul@us.ibm.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
drivers/clocksource/acpi_pm.c | 42 ++++++++++++++++++++++++----------------
1 files changed, 25 insertions(+), 17 deletions(-)
diff --git a/drivers/clocksource/acpi_pm.c b/drivers/clocksource/acpi_pm.c
index 2c00edd..f05c4fb 100644
--- a/drivers/clocksource/acpi_pm.c
+++ b/drivers/clocksource/acpi_pm.c
@@ -178,7 +178,7 @@ static int verify_pmtmr_rate(void)
static int __init init_acpi_pm_clocksource(void)
{
cycle_t value1, value2;
- unsigned int i;
+ unsigned int i, j, good = 0;
if (!pmtmr_ioport)
return -ENODEV;
@@ -187,24 +187,32 @@ static int __init init_acpi_pm_clocksource(void)
clocksource_acpi_pm.shift);
/* "verify" this timing source: */
- value1 = clocksource_acpi_pm.read();
- for (i = 0; i < 10000; i++) {
- value2 = clocksource_acpi_pm.read();
- if (value2 == value1)
- continue;
- if (value2 > value1)
- goto pm_good;
- if ((value2 < value1) && ((value2) < 0xFFF))
- goto pm_good;
- printk(KERN_INFO "PM-Timer had inconsistent results:"
- " 0x%#llx, 0x%#llx - aborting.\n", value1, value2);
- return -EINVAL;
+ for (j = 0; j < 10; j++) {
+ value1 = clocksource_acpi_pm.read();
+ for (i = 0; i < 10000; i++) {
+ value2 = clocksource_acpi_pm.read();
+ if (value2 == value1)
+ continue;
+ if (value2 > value1)
+ good++;
+ break;
+ if ((value2 < value1) && ((value2) < 0xFFF))
+ good++;
+ break;
+ printk(KERN_INFO "PM-Timer had inconsistent results:"
+ " 0x%#llx, 0x%#llx - aborting.\n",
+ value1, value2);
+ return -EINVAL;
+ }
+ udelay(300 * i);
+ }
+
+ if (good != 10) {
+ printk(KERN_INFO "PM-Timer had no reasonable result:"
+ " 0x%#llx ...When acpi_pm is used in errata mode (three reads instead of one), also the
acpi_pm init functions need to use three reads instead of just one.
Thanks to Andreas Mohr for spotting this issue.
CC: John Stultz <johnstul@us.ibm.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
drivers/clocksource/acpi_pm.c | 16 ++++++++--------
1 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/clocksource/acpi_pm.c b/drivers/clocksource/acpi_pm.c
index 5ca1d80..2c00edd 100644
--- a/drivers/clocksource/acpi_pm.c
+++ b/drivers/clocksource/acpi_pm.c
@@ -151,13 +151,13 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SERVERWORKS, PCI_DEVICE_ID_SERVERWORKS_LE,
*/
static int verify_pmtmr_rate(void)
{
- u32 value1, value2;
+ cycle_t value1, value2;
unsigned long count, delta;
mach_prepare_counter();
- value1 = read_pmtmr();
+ value1 = clocksource_acpi_pm.read()
mach_countup(&count);
- value2 = read_pmtmr();
+ value2 = clocksource_acpi_pm.read()
delta = (value2 - value1) & ACPI_PM_MASK;
/* Check that the PMTMR delta is within 5% of what we expect */
@@ -177,7 +177,7 @@ static int verify_pmtmr_rate(void)
static int __init init_acpi_pm_clocksource(void)
{
- u32 value1, value2;
+ cycle_t value1, value2;
unsigned int i;
if (!pmtmr_ioport)
@@ -187,9 +187,9 @@ static int __init init_acpi_pm_clocksource(void)
clocksource_acpi_pm.shift);
/* "verify" this timing source: */
- value1 = read_pmtmr();
+ value1 = clocksource_acpi_pm.read();
for (i = 0; i < 10000; i++) {
- value2 = read_pmtmr();
+ value2 = clocksource_acpi_pm.read();
if (value2 == value1)
continue;
if (value2 > value1)
@@ -197,11 +197,11 @@ static int __init init_acpi_pm_clocksource(void)
if ((value2 < value1) && ((value2) < 0xFFF))
goto pm_good;
printk(KERN_INFO "PM-Timer had inconsistent results:"
- " 0x%#x, 0x%#x - aborting.\n", value1, ...On Mon, 18 Aug 2008 21:03:25 +0200 A bare git URL is somewhat user-unfriendly. I guess this file falls under Thomas's git-hrt tree. I can queue the patches up and spam Thomas with them, but I'm at a bit of a loss regarding their priority due to the above questions. --
Hi Andrew, Indeed: on all affected hardware (some Intel ICH4, PIIX4 and PIIX4E chipsets) there's about a 4.2% chance that initialization of the ACPI PMTMR fails. On those chipsets, we need to read out the timer value at least three times to get a correct result, for every once in a while (i.e. within a 3 ns window every 69.8 ns) the read returns a bogus result. During normal operation we work around this issue, but during initialization reading a bogus value may Indeed: http://lkml.org/lkml/2008/8/10/77 -- quote: "Result: catastrophic timer behaviour (a large backwards skip is possible)," The current check for monotonicity is way too weak. And at least on one That would be great, thanks. Dominik --
On Mon, 18 Aug 2008 21:35:17 +0200 yeah, that does look like 2.6.27 material. And possibly 2.6.26.x and Could I trouble you to resend them as plain-old-patches, with full changelogs along with your thoughts about the suitablity for 2.6.2[5678].x please? I could attempt to put all that together here, but I'd probably end up with something inappropriate. Thanks. --
patches will be sent as replies to this message. Thanks for taking care of this; I should have written more meaningful texts. Also, I failed to remember the second patch hasn't been tested yet .. .. but still I think this is stuff for 2.6.27; not (yet) suitable for 2.6.26 and 2.6.25 based on the -stable rules. Best, Dominik --
From 13cc8b6ff0d8198102f60691c38c07151a693c7b Mon Sep 17 00:00:00 2001 From: Dominik Brodowski <linux@dominikbrodowski.net> Date: Sun, 10 Aug 2008 21:34:54 +0200 Subject: [PATCH 2/2] acpi_pm.c: check for monotonicity The current check for monotonicity is way too weak: Andreas Mohr reports ( http://lkml.org/lkml/2008/8/10/77 ) that on one of his test systems the current check only triggers in 50% of all cases, leading to catastrophic timer behaviour. To fix this issue, expand the check for monotonicity by doing ten consecutive tests instead of one. CC: John Stultz <johnstul@us.ibm.com> CC: Thomas Gleixner <tglx@linutronix.de> CC: Ingo Molnar <mingo@elte.hu> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> --- drivers/clocksource/acpi_pm.c | 42 ++++++++++++++++++++++++---------------- 1 files changed, 25 insertions(+), 17 deletions(-) diff --git a/drivers/clocksource/acpi_pm.c b/drivers/clocksource/acpi_pm.c index 2c00edd..f05c4fb 100644 --- a/drivers/clocksource/acpi_pm.c +++ b/drivers/clocksource/acpi_pm.c @@ -178,7 +178,7 @@ static int verify_pmtmr_rate(void) static int __init init_acpi_pm_clocksource(void) { cycle_t value1, value2; - unsigned int i; + unsigned int i, j, good = 0; if (!pmtmr_ioport) return -ENODEV; @@ -187,24 +187,32 @@ static int __init init_acpi_pm_clocksource(void) clocksource_acpi_pm.shift); /* "verify" this timing source: */ - value1 = clocksource_acpi_pm.read(); - for (i = 0; i < 10000; i++) { - value2 = clocksource_acpi_pm.read(); - if (value2 == value1) - continue; - if (value2 > value1) - goto pm_good; - if ((value2 < value1) && ((value2) < 0xFFF)) - goto pm_good; - printk(KERN_INFO "PM-Timer had inconsistent results:" - " 0x%#llx, 0x%#llx - aborting.\n", value1, value2); - return -EINVAL; + for (j = 0; j < 10; j++) { + value1 = clocksource_acpi_pm.read(); + for (i = 0; i < 10000; i++) { + value2 = clocksource_acpi_pm.read(); + if (value2 == ...
Hi, Technically spoken this log message could now be considered partially outdated... (we're doing 10 evaluations after all, not one with a precise end result). Seeing a define for those several open-coded 10 loops values would be nice. Andreas Mohr --
On Mon, 18 Aug 2008 22:18:44 +0200 Also it's a bit dodgy printing a cycle_t with %llx. We don't _know_ that cycle_t was implemented with `long long' - if this was always true, we wouldn't (or shouldn't) have a cycle_t at all. But it seems that it happens to work for all architectures which implement acpi. --
Hi,
Well, adding a format modifier just for this isn't necessary IMO, especially
as all acpi-implementing architectures seem to be fine. Regarding the other
issues: well, I don't care all that much about a define for something used
twice and this comment; but if you want to squash it into the other patch
I'm fine with it.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
diff --git a/drivers/clocksource/acpi_pm.c b/drivers/clocksource/acpi_pm.c
index f05c4fb..09d4650 100644
--- a/drivers/clocksource/acpi_pm.c
+++ b/drivers/clocksource/acpi_pm.c
@@ -175,6 +175,9 @@ static int verify_pmtmr_rate(void)
#define verify_pmtmr_rate() (0)
#endif
+/* Number of monotonicity checks to perform during initialization */
+#define ACPI_PM_MONOTONICITY_CHECKS 10
+
static int __init init_acpi_pm_clocksource(void)
{
cycle_t value1, value2;
@@ -187,7 +190,7 @@ static int __init init_acpi_pm_clocksource(void)
clocksource_acpi_pm.shift);
/* "verify" this timing source: */
- for (j = 0; j < 10; j++) {
+ for (j = 0; j < ACPI_PM_MONOTONICITY_CHECKS; j++) {
value1 = clocksource_acpi_pm.read();
for (i = 0; i < 10000; i++) {
value2 = clocksource_acpi_pm.read();
@@ -207,9 +210,9 @@ static int __init init_acpi_pm_clocksource(void)
udelay(300 * i);
}
- if (good != 10) {
- printk(KERN_INFO "PM-Timer had no reasonable result:"
- " 0x%#llx - aborting.\n", value1);
+ if (good != ACPI_PM_MONOTONICITY_CHECKS) {
+ printk(KERN_INFO "PM-Timer failed consistency check "
+ " (0x%#llx) - aborting.\n", value1);
return -ENODEV;
}
--
On all hardware (some Intel ICH4, PIIX4 and PIIX4E chipsets) affected by a
hardware errata there's about a 4.2% chance that initialization of the ACPI
PMTMR fails. On those chipsets, we need to read out the timer value at least
three times to get a correct result, for every once in a while (i.e. within
a 3 ns window every 69.8 ns) the read returns a bogus result. During normal
operation we work around this issue, but during initialization reading a
bogus value may lead to -EINVAL even though the hardware is usable.
Thanks to Andreas Mohr for spotting this issue.
CC: John Stultz <johnstul@us.ibm.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
drivers/clocksource/acpi_pm.c | 16 ++++++++--------
1 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/clocksource/acpi_pm.c b/drivers/clocksource/acpi_pm.c
index 5ca1d80..2c00edd 100644
--- a/drivers/clocksource/acpi_pm.c
+++ b/drivers/clocksource/acpi_pm.c
@@ -151,13 +151,13 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SERVERWORKS, PCI_DEVICE_ID_SERVERWORKS_LE,
*/
static int verify_pmtmr_rate(void)
{
- u32 value1, value2;
+ cycle_t value1, value2;
unsigned long count, delta;
mach_prepare_counter();
- value1 = read_pmtmr();
+ value1 = clocksource_acpi_pm.read()
mach_countup(&count);
- value2 = read_pmtmr();
+ value2 = clocksource_acpi_pm.read()
delta = (value2 - value1) & ACPI_PM_MASK;
/* Check that the PMTMR delta is within 5% of what we expect */
@@ -177,7 +177,7 @@ static int verify_pmtmr_rate(void)
static int __init init_acpi_pm_clocksource(void)
{
- u32 value1, value2;
+ cycle_t value1, value2;
unsigned int i;
if (!pmtmr_ioport)
@@ -187,9 +187,9 @@ static int __init init_acpi_pm_clocksource(void)
clocksource_acpi_pm.shift);
/* "verify" this timing source: */
- value1 = read_pmtmr();
+ value1 = clocksource_acpi_pm.read();
for (i = 0; i < 10000; i++) ...compiler got upset. Did you send the wrong version? --
gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu7) on amd64 worked fine... what did yours complain about? Best, Dominik --
next try...
From: Dominik Brodowski <linux@dominikbrodowski.net>
Date: Fri, 22 Aug 2008 23:49:21 +0200
Subject: [PATCH 1/2] acpi_pm.c: use proper read function also in errata mode.
On all hardware (some Intel ICH4, PIIX4 and PIIX4E chipsets) affected by a
hardware errata there's about a 4.2% chance that initialization of the ACPI
PMTMR fails. On those chipsets, we need to read out the timer value at least
three times to get a correct result, for every once in a while (i.e. within
a 3 ns window every 69.8 ns) the read returns a bogus result. During normal
operation we work around this issue, but during initialization reading a
bogus value may lead to -EINVAL even though the hardware is usable.
Thanks to Andreas Mohr for spotting this issue.
CC: John Stultz <johnstul@us.ibm.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
drivers/clocksource/acpi_pm.c | 16 ++++++++--------
1 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/clocksource/acpi_pm.c b/drivers/clocksource/acpi_pm.c
index 5ca1d80..860d033 100644
--- a/drivers/clocksource/acpi_pm.c
+++ b/drivers/clocksource/acpi_pm.c
@@ -151,13 +151,13 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SERVERWORKS, PCI_DEVICE_ID_SERVERWORKS_LE,
*/
static int verify_pmtmr_rate(void)
{
- u32 value1, value2;
+ cycle_t value1, value2;
unsigned long count, delta;
mach_prepare_counter();
- value1 = read_pmtmr();
+ value1 = clocksource_acpi_pm.read();
mach_countup(&count);
- value2 = read_pmtmr();
+ value2 = clocksource_acpi_pm.read();
delta = (value2 - value1) & ACPI_PM_MASK;
/* Check that the PMTMR delta is within 5% of what we expect */
@@ -177,7 +177,7 @@ static int verify_pmtmr_rate(void)
static int __init init_acpi_pm_clocksource(void)
{
- u32 value1, value2;
+ cycle_t value1, value2;
unsigned int i;
if (!pmtmr_ioport)
@@ -187,9 +187,9 @@ static int __init ...From: Dominik Brodowski <linux@dominikbrodowski.net> Date: Fri, 22 Aug 2008 23:51:23 +0200 Subject: [PATCH 2/2] acpi_pm.c: check for monotonicity The current check for monotonicity is way too weak: Andreas Mohr reports ( http://lkml.org/lkml/2008/8/10/77 ) that on one of his test systems the current check only triggers in 50% of all cases, leading to catastrophic timer behaviour. To fix this issue, expand the check for monotonicity by doing ten consecutive tests instead of one. CC: John Stultz <johnstul@us.ibm.com> CC: Thomas Gleixner <tglx@linutronix.de> CC: Ingo Molnar <mingo@elte.hu> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> --- drivers/clocksource/acpi_pm.c | 46 +++++++++++++++++++++++++--------------- 1 files changed, 29 insertions(+), 17 deletions(-) diff --git a/drivers/clocksource/acpi_pm.c b/drivers/clocksource/acpi_pm.c index 860d033..4eee533 100644 --- a/drivers/clocksource/acpi_pm.c +++ b/drivers/clocksource/acpi_pm.c @@ -21,6 +21,7 @@ #include <linux/errno.h> #include <linux/init.h> #include <linux/pci.h> +#include <linux/delay.h> #include <asm/io.h> /* @@ -175,10 +176,13 @@ static int verify_pmtmr_rate(void) #define verify_pmtmr_rate() (0) #endif +/* Number of monotonicity checks to perform during initialization */ +#define ACPI_PM_MONOTONICITY_CHECKS 10 + static int __init init_acpi_pm_clocksource(void) { cycle_t value1, value2; - unsigned int i; + unsigned int i, j, good = 0; if (!pmtmr_ioport) return -ENODEV; @@ -187,24 +191,32 @@ static int __init init_acpi_pm_clocksource(void) clocksource_acpi_pm.shift); /* "verify" this timing source: */ - value1 = clocksource_acpi_pm.read(); - for (i = 0; i < 10000; i++) { - value2 = clocksource_acpi_pm.read(); - if (value2 == value1) - continue; - if (value2 > value1) - goto pm_good; - if ((value2 < value1) && ((value2) < 0xFFF)) - goto pm_good; - printk(KERN_INFO "PM-Timer had inconsistent results:" - " 0x%#llx, ...
Hi, some minor comments: 300*10000 microseconds seems like a long time to me. Is this the If the inner loop runs out once, you alreay know that you will later abort here. Maybe move the check directly after the inner loop to avoid the additional delay (10*10000*300 microseconds = 30 seconds) in case of failure? I hope this helps, Jochen -- http://seehuhn.de/ --
On Mon, 18 Aug 2008 22:09:16 +0200 (cc stable) OK, thanks. I tagged these as Cc: <stable@kernel.org> [2.6.26.x, 2.6.25.x but not immediately] which is a bit rubbery. What do you think are the criteria for deciding when these are ready for the backports? Something like "after 2.6.28-rc1 if nothing blew up"? --
That sounds perfectly reasonable. Best, Dominik --
Hi, Just want to say thank you for promptly following up on this issue! I wanted to test your 2nd patch (improved init), but was unable to do so (currently too busy due to large changes, plus distance to production system, downtime NOT appreciated ;). Andreas Mohr --
| Greg KH | Og dreams of kernels |
| Jens Axboe | [PATCH 31/33] Fusion: sg chaining support |
| Arnd Bergmann | Re: finding your own dead "CONFIG_" variables |
| Mark Brown | [PATCH 2/2] Subject: natsemi: Allow users to disable workaround for DspCfg reset |
| Tony Breeds | [LGUEST] Look in object dir for .config |
git: | |
| Brian Downing | Re: Git in a Nutshell guide |
| John Benes | Re: master has some toys |
| Matthias Lederhofer | [PATCH 4/7] introduce GIT_WORK_TREE to specify the work tree |
| Alexander Sulfrian | [RFC/PATCH] RE: git calls SSH_ASKPASS even if DISPLAY is not set |
| Junio C Hamano | Re: Rss produced by git is not valid xml |
