Re: [PATCH 4/7] CPUFREQ: powernow-k8: Get transition latency from ACPI _PSS table

Previous thread: [PATCH 3/7] CPUFREQ: ondemand/conservative: sanitize sampling_rate restrictions by Thomas Renninger on Tuesday, February 3, 2009 - 9:46 am. (3 messages)

Next thread: Re: Performance counter API review was [patch] Performance Counters for Linux, v3 by Maynard Johnson on Tuesday, February 3, 2009 - 9:53 am. (1 message)
From: Thomas Renninger
Date: Tuesday, February 3, 2009 - 9:46 am

These are the same I sent out some days ago, but messed up the
cpufreq list address.

The patches are based against Dave's cpufreq git tree's fixes branch.
There is also the one from Mark Langsdorf (adjusted to latest tree)
included which got posted on the cpufreq list recently.
guilt should add the
From: Mark Langsdorf ...
tag right, so that he shows up as the author of this one.

Dave, can you add these to your tree, please.

Thanks,

    Thomas


--

From: Thomas Renninger
Date: Tuesday, February 3, 2009 - 9:46 am

This is the typical message you get if you plug in a CPU
which is newer than your BIOS. It's annoying seeing this
message for each core.

Signed-off-by: Thomas Renninger <trenn@suse.de>
---
 arch/x86/kernel/cpu/cpufreq/powernow-k8.c |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/cpufreq/powernow-k8.c b/arch/x86/kernel/cpu/cpufreq/powernow-k8.c
index 9accffb..9e312c5 100644
--- a/arch/x86/kernel/cpu/cpufreq/powernow-k8.c
+++ b/arch/x86/kernel/cpu/cpufreq/powernow-k8.c
@@ -743,7 +743,7 @@ static int find_psb_table(struct powernow_k8_data *data)
 	 * BIOS and Kernel Developer's Guide, which is available on
 	 * www.amd.com
 	 */
-	printk(KERN_ERR PFX "BIOS error - no PSB or ACPI _PSS objects\n");
+	printk(KERN_ERR FW_BUG PFX "No PSB or ACPI _PSS objects\n");
 	return -ENODEV;
 }
 
@@ -1154,11 +1154,11 @@ static int __cpuinit powernowk8_cpu_init(struct cpufreq_policy *pol)
 		 * an UP version, and is deprecated by AMD.
 		 */
 		if (num_online_cpus() != 1) {
-			printk(KERN_ERR FW_BUG PFX "Your BIOS does not provide"
-			       " ACPI _PSS objects in a way that Linux "
-			       "understands. Please report this to the Linux "
-			       "ACPI maintainers and complain to your BIOS "
-			       "vendor.\n");
+			WARN_ONCE(1, KERN_ERR FW_BUG PFX "Your BIOS does not "
+				  "provide ACPI _PSS objects in a way that "
+				  "Linux understands. Please report this to "
+				  "the Linux ACPI maintainers and complain to "
+				  "your BIOS vendor.\n");
 			goto err_out;
 		}
 		if (pol->cpu != 0) {
-- 
1.6.0.2

--

From: Andrew Morton
Date: Tuesday, February 3, 2009 - 10:09 pm

WARN_ONCE will also spew a stack backtrace, which seems inappropriate here.

There was talk of writing a simple ONCE() macro for this sort of thing:

	if (ONCE())
		printk(...)

but it never happened.
--

From: Thomas Renninger
Date: Wednesday, February 4, 2009 - 2:59 am

Yes. I saw WARN_ONCE(1, KERN_INFO..), which should be a KERN_ERR then
(if you get a backtrace you do not want to have the message suppressed?)
I added a static int print_once; hack now.

Thanks,

   Thomas
--

From: Thomas Renninger
Date: Tuesday, February 3, 2009 - 9:46 am

powernow-k8 driver should always try to get cpufreq info from ACPI.
Otherwise it will not be able to detect the transition latency correctly
which results in ondemand governor taking a wrong sampling rate which will
then result in sever performance loss.

Let the user not shoot himself in the foot and always compile in ACPI
support for powernow-k8.

This also fixes a wrong message if ACPI_PROCESSOR is compiled as a module and
#ifndef CONFIG_ACPI_PROCESSOR
path is chosen.

Signed-off-by: Thomas Renninger <trenn@suse.de>
---
 arch/x86/kernel/cpu/cpufreq/Kconfig       |   19 ++-----------------
 arch/x86/kernel/cpu/cpufreq/powernow-k8.c |   17 -----------------
 arch/x86/kernel/cpu/cpufreq/powernow-k8.h |    5 +----
 3 files changed, 3 insertions(+), 38 deletions(-)

diff --git a/arch/x86/kernel/cpu/cpufreq/Kconfig b/arch/x86/kernel/cpu/cpufreq/Kconfig
index efae3b2..b63b0db 100644
--- a/arch/x86/kernel/cpu/cpufreq/Kconfig
+++ b/arch/x86/kernel/cpu/cpufreq/Kconfig
@@ -87,30 +87,15 @@ config X86_POWERNOW_K7_ACPI
 config X86_POWERNOW_K8
 	tristate "AMD Opteron/Athlon64 PowerNow!"
 	select CPU_FREQ_TABLE
+	depends on ACPI && ACPI_PROCESSOR
 	help
-	  This adds the CPUFreq driver for mobile AMD Opteron/Athlon64 processors.
+	  This adds the CPUFreq driver for K8/K10 Opteron/Athlon64 processors.
 
 	  To compile this driver as a module, choose M here: the
 	  module will be called powernow-k8.
 
 	  For details, take a look at <file:Documentation/cpu-freq/>.
 
-	  If in doubt, say N.
-
-config X86_POWERNOW_K8_ACPI
-	bool
-	prompt "ACPI Support" if X86_32
-	depends on ACPI && X86_POWERNOW_K8 && ACPI_PROCESSOR
-	depends on !(X86_POWERNOW_K8 = y && ACPI_PROCESSOR = m)
-	default y
-	help
-	  This provides access to the K8s Processor Performance States via ACPI.
-	  This driver is probably required for CPUFreq to work with multi-socket and
-	  SMP systems.  It is not required on at least some single-socket yet
-	  multi-core systems, even if SMP is enabled.
-
-	  It is ...
From: Thomas Renninger
Date: Tuesday, February 3, 2009 - 9:46 am

It's not only useful for the ondemand and conservative governors, but
also for userspace daemons to know about the HW transition latency of
the CPU.
It is especially useful for userspace to know about this value when
the ondemand or conservative governors are run. The sampling rate
control value depends on it and for userspace being able to set sane
tuning values there it has to know about the transition latency.

Signed-off-by: Thomas Renninger <trenn@suse.de>
---
 Documentation/cpu-freq/user-guide.txt |   12 ++++++++++++
 drivers/cpufreq/cpufreq.c             |    3 +++
 2 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/Documentation/cpu-freq/user-guide.txt b/Documentation/cpu-freq/user-guide.txt
index e3443dd..57f23cb 100644
--- a/Documentation/cpu-freq/user-guide.txt
+++ b/Documentation/cpu-freq/user-guide.txt
@@ -152,6 +152,18 @@ cpuinfo_min_freq :		this file shows the minimum operating
 				frequency the processor can run at(in kHz) 
 cpuinfo_max_freq :		this file shows the maximum operating
 				frequency the processor can run at(in kHz) 
+cpuinfo_transition_latency	The time it takes on this CPU to
+				switch between two frequencies in nano
+				seconds. If unknown or known to be
+				that high that the driver does not
+				work with the ondemand governor, -1
+				(CPUFREQ_ETERNAL) will be returned.
+				Using this information can be useful
+				to choose an appropriate polling
+				frequency for a kernel governor or
+				userspace daemon. Make sure to not
+				switch the frequency too often
+				resulting in performance loss.
 scaling_driver :		this file shows what cpufreq driver is
 				used to set the frequency on this CPU
 
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index b55cb67..ae92986 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -455,6 +455,7 @@ static ssize_t show_##file_name				\
 
 show_one(cpuinfo_min_freq, cpuinfo.min_freq);
 show_one(cpuinfo_max_freq, ...
From: Thomas Renninger
Date: Tuesday, February 3, 2009 - 9:46 am

From: Mark Langsdorf <mark.langsdorf@amd.com>

At this time, the PowerNow! driver for K8 uses an experimentally
derived formula to calculate transition latency.  The value it
provides is orders of magnitude too large on modern systems.
This patch replaces the formula with ACPI _PSS latency values
for more accuracy and better performance.

I've tested it on two 2nd generation Opteron systems, a 3rd
generation Operton system, and a Turion X2 without seeing any
stability problems.

Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Thomas Renninger <trenn@suse.de>
---
 arch/x86/kernel/cpu/cpufreq/powernow-k8.c |   28 ++++++++++++++++++++++------
 1 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/cpu/cpufreq/powernow-k8.c b/arch/x86/kernel/cpu/cpufreq/powernow-k8.c
index 5c28b37..fb039cd 100644
--- a/arch/x86/kernel/cpu/cpufreq/powernow-k8.c
+++ b/arch/x86/kernel/cpu/cpufreq/powernow-k8.c
@@ -939,10 +939,25 @@ static void powernow_k8_cpu_exit_acpi(struct powernow_k8_data *data)
 	free_cpumask_var(data->acpi_data.shared_cpu_map);
 }
 
+static int get_transition_latency(struct powernow_k8_data *data)
+{
+	int max_latency = 0;
+	int i;
+	for (i = 0; i < data->acpi_data.state_count; i++) {
+		int cur_latency = data->acpi_data.states[i].transition_latency
+			+ data->acpi_data.states[i].bus_master_latency;
+		if (cur_latency > max_latency)
+			max_latency = cur_latency;
+	}
+	/* value in usecs, needs to be in nanoseconds */
+	return 1000 * max_latency;
+}
+
 #else
 static int powernow_k8_cpu_init_acpi(struct powernow_k8_data *data) { return -ENODEV; }
 static void powernow_k8_cpu_exit_acpi(struct powernow_k8_data *data) { return; }
 static void powernow_k8_acpi_pst_values(struct powernow_k8_data *data, unsigned int index) { return; }
+static int get_transition_latency(struct powernow_k8_data *data) { return 0; }
 #endif /* CONFIG_X86_POWERNOW_K8_ACPI */
 
 /* Take a frequency, and issue the fid/vid transition command */
@@ ...
From: Robert Hancock
Date: Sunday, February 8, 2009 - 11:35 am

Hopefully somebody is pushing this for .29? This seems to make a big 
difference in system responsiveness on my Athlon X2 system.
--

From: Thomas Renninger
Date: Tuesday, February 3, 2009 - 9:46 am

The same info can be obtained via the transition_latency sysfs file

Signed-off-by: Thomas Renninger <trenn@suse.de>
---
 Documentation/cpu-freq/governors.txt   |   10 ++++++++--
 drivers/cpufreq/cpufreq_conservative.c |    4 ++++
 drivers/cpufreq/cpufreq_ondemand.c     |    5 +++++
 3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/Documentation/cpu-freq/governors.txt b/Documentation/cpu-freq/governors.txt
index 5b0cfa6..9b18512 100644
--- a/Documentation/cpu-freq/governors.txt
+++ b/Documentation/cpu-freq/governors.txt
@@ -119,8 +119,14 @@ want the kernel to look at the CPU usage and to make decisions on
 what to do about the frequency.  Typically this is set to values of
 around '10000' or more.
 
-show_sampling_rate_(min|max): the minimum and maximum sampling rates
-available that you may set 'sampling_rate' to.
+show_sampling_rate_(min|max): THIS INTERFACE IS DEPRECATED, DON'T USE IT.
+You can use wider ranges now and the general
+cpuinfo_transition_latency variable (cmp. with user-guide.txt) can be
+used to obtain exactly the same info:
+show_sampling_rate_min = transtition_latency * 500    / 1000
+show_sampling_rate_max = transtition_latency * 500000 / 1000
+(divided by 1000 is to illustrate that sampling rate is in us and
+transition latency is exported ns).
 
 up_threshold: defines what the average CPU usage between the samplings
 of 'sampling_rate' needs to be for the kernel to make a decision on
diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index 0320962..5ba0a3f 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -140,11 +140,15 @@ static struct notifier_block dbs_cpufreq_notifier_block = {
 /************************** sysfs interface ************************/
 static ssize_t show_sampling_rate_max(struct cpufreq_policy *policy, char *buf)
 {
+	printk(KERN_INFO "CPUFREQ: conservative sampling_rate_max "
+	       "sysfs file is deprecated - used by: ...
From: Andrew Morton
Date: Tuesday, February 3, 2009 - 10:03 pm

I'd have thought that the user-irritation risk here is pretty high.

Would it make sense to throttle these or to make them once-per-boot
or something?

--

From: Thomas Renninger
Date: Tuesday, February 3, 2009 - 9:46 am

They were long enough set deprecated...

Update Documentation/cpu-freq/users-guide.txt:
The deprecated files listed there seen not to exist for some time anymore
already.

Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: Len Brown <lenb@kernel.org>
CC: Linux ACPI list <linux-acpi@vger.kernel.org>
---
 Documentation/cpu-freq/user-guide.txt |   16 -----
 arch/x86/kernel/cpu/cpufreq/Kconfig   |   11 ----
 drivers/acpi/processor_perflib.c      |  105 ---------------------------------
 3 files changed, 0 insertions(+), 132 deletions(-)

diff --git a/Documentation/cpu-freq/user-guide.txt b/Documentation/cpu-freq/user-guide.txt
index 57f23cb..75f4119 100644
--- a/Documentation/cpu-freq/user-guide.txt
+++ b/Documentation/cpu-freq/user-guide.txt
@@ -207,19 +207,3 @@ scaling_setspeed.		By "echoing" a new frequency into this
 				you can change the speed of the CPU,
 				but only within the limits of
 				scaling_min_freq and scaling_max_freq.
-				
-
-3.2 Deprecated Interfaces
--------------------------
-
-Depending on your kernel configuration, you might find the following 
-cpufreq-related files:
-/proc/cpufreq
-/proc/sys/cpu/*/speed
-/proc/sys/cpu/*/speed-min
-/proc/sys/cpu/*/speed-max
-
-These are files for deprecated interfaces to cpufreq, which offer far
-less functionality. Because of this, these interfaces aren't described
-here.
-
diff --git a/arch/x86/kernel/cpu/cpufreq/Kconfig b/arch/x86/kernel/cpu/cpufreq/Kconfig
index b63b0db..52c8398 100644
--- a/arch/x86/kernel/cpu/cpufreq/Kconfig
+++ b/arch/x86/kernel/cpu/cpufreq/Kconfig
@@ -230,17 +230,6 @@ config X86_E_POWERSAVER
 
 comment "shared options"
 
-config X86_ACPI_CPUFREQ_PROC_INTF
-	bool "/proc/acpi/processor/../performance interface (deprecated)"
-	depends on PROC_FS
-	depends on X86_ACPI_CPUFREQ || X86_POWERNOW_K7_ACPI || X86_POWERNOW_K8_ACPI
-	help
-	  This enables the deprecated /proc/acpi/processor/../performance
-	  interface. While it is helpful for debugging, the generic,
-	  ...
From: Len Brown
Date: Tuesday, February 3, 2009 - 10:13 pm

applied

thanks,
--
Len Brown, Intel Open Source Technology Center

--

From: Dave Jones
Date: Tuesday, February 3, 2009 - 9:48 am

On Tue, Feb 03, 2009 at 05:46:39PM +0100, Thomas Renninger wrote:
 > These are the same I sent out some days ago, but messed up the
 > cpufreq list address.
 > 
 > The patches are based against Dave's cpufreq git tree's fixes branch.
 > There is also the one from Mark Langsdorf (adjusted to latest tree)
 > included which got posted on the cpufreq list recently.
 > guilt should add the
 > From: Mark Langsdorf ...
 > tag right, so that he shows up as the author of this one.
 > 
 > Dave, can you add these to your tree, please.

Thanks Thomas,
 I'll get to these today and queue them up.

	Dave

-- 
http://www.codemonkey.org.uk
--

From: Dave Jones
Date: Tuesday, February 3, 2009 - 2:21 pm

On Tue, Feb 03, 2009 at 05:46:39PM +0100, Thomas Renninger wrote:
 > These are the same I sent out some days ago, but messed up the
 > cpufreq list address.
 > 
 > The patches are based against Dave's cpufreq git tree's fixes branch.

'fixes' is for stuff pending for .29, which this patchset seems out
of scope for.  Unfortunatly, there's a ton of checkpatch cleanups up
the 'next' branch which makes this impossible to apply.

Can you rediff against that branch please?
 
thanks,

	Dave

-- 
http://www.codemonkey.org.uk
--

From: Thomas Renninger
Date: Tuesday, February 3, 2009 - 4:29 pm

Mark's latency fix for powernow-k8 should go to .29.
It's a sever fix which makes powernow-k8 work fine with ondemand
without tweaking sampling rate. Otherwise you could hit sever
performance loss, e.g. there are machines out there which currently
only check for frequency changes every 1.2 seconds (see comment #24):
https://bugzilla.novell.com/show_bug.cgi?id=436717

According to Mark, Windows PowerNow! driver also uses these ACPI
values and it got tested on various different K8 CPUs.
It should be the first patch touching powernow-k8, so this
The rest is not critical, I will rediff everything. I expect you still
want to have the whole series rebased, even if you decide to add
above one to .29?

Thanks,

     Thomas
--

From: Dave Jones
Date: Tuesday, February 3, 2009 - 4:44 pm

On Wed, Feb 04, 2009 at 12:29:51AM +0100, Thomas Renninger wrote:
 
 > Mark's latency fix for powernow-k8 should go to .29.

Ok, I'll get that moved along to Linus

 > > Can you rediff against that branch please?
 > The rest is not critical, I will rediff everything. I expect you still
 > want to have the whole series rebased, even if you decide to add
 > above one to .29?

Yes please.  Thanks.

	Dave

-- 
http://www.codemonkey.org.uk
--

Previous thread: [PATCH 3/7] CPUFREQ: ondemand/conservative: sanitize sampling_rate restrictions by Thomas Renninger on Tuesday, February 3, 2009 - 9:46 am. (3 messages)

Next thread: Re: Performance counter API review was [patch] Performance Counters for Linux, v3 by Maynard Johnson on Tuesday, February 3, 2009 - 9:53 am. (1 message)