Re: [PATCH 2/4] Add cpufreq driver for the IBM PowerPC 750GX

Previous thread: [PATCH 01/09] block: misc updates by Tejun Heo on Monday, August 25, 2008 - 3:47 am. (12 messages)

Next thread: none
From: Kevin Diggs
Date: Monday, August 25, 2008 - 3:53 am

This adds the actual cpufreq driver for the 750GX. It supports all integer
ratios that are valid for the processor model and bus frequency.

It has two modes of operation. Normal mode uses all valid frequencies. In
minmaxmode, only the minimum and maximum are used. This provides the ability
for very low latency frequency switches.

There is also a sysfs attribute to have it switch off the unused PLL for
additional power savings.

This does NOT support SMP.

My name is Kevin Diggs and I approve this patch.

Signed-off-by: Kevin Diggs <kevdig@hypersurf.com>
Index: Documentation/DocBook/cf750gx.tmpl
===================================================================
--- /dev/null	2004-08-10 18:55:00.000000000 -0700
+++ Documentation/DocBook/cf750gx.tmpl	2008-08-20 10:00:14.000000000 -0700
@@ -0,0 +1,441 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
+	"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
+
+<book id="ppc750gx-cpufreq-guide">
+ <bookinfo>
+  <title>PowerPC 750GX (and FX?) cpufreq driver guide</title>
+
+  <authorgroup>
+   <author>
+    <firstname>Kevin</firstname>
+    <surname>Diggs</surname>
+   </author>
+  </authorgroup>
+
+    <revhistory>
+      <revision>
+	<revnumber>1.0 </revnumber>
+	<date>August 12, 2008</date>
+	<revremark>Initial revision posted to linuxppc-dev</revremark>
+      </revision>
+    </revhistory>
+
+  <copyright>
+   <year>2008</year>
+   <holder>Kevin Diggs</holder>
+  </copyright>
+
+  <legalnotice>
+   <para>
+    This documentation is free software; you can redistribute
+    it and/or modify it under the terms of the GNU General Public
+    License as published by the Free Software Foundation; either
+    version 2 of the License, or (at your option) any later
+    version.
+   </para>
+
+   <para>
+    This program is distributed in the hope that it will be
+    useful, but WITHOUT ANY WARRANTY; without even the implied
+    warranty ...
From: Arnd Bergmann
Date: Monday, August 25, 2008 - 4:43 am

Thanks for posting this driver and for your attention for detail
and for documentation in particular. Few people bother to write
documentation at this level.

I don't understand enough of cpufreq or your hardware to comment






Are these correct on any board? If they can be different
depending on the board design, it would be better to get

Is 0 a meaningful value for these? If it just means 'uninitialized',

Also, in general, try to avoid global variables here, even 
in file scope (static), but rather put all device specific


data is a void pointer, so you don't need the cast, and shouldn't


Don't initialize local variables in the declaration, as that will prevent

more whitespace damage. Maybe there is something wrong with your

The unlikely() here looks like overoptimization, drop it in favor of


Please go through all your dprintk and see if you really really need all of them.
Usually they are useful while you are doing the initial code, but only get in the

The conventional way to write this would be:

	result = -ENODEV;
	if (foo)
		goto out_unlock;

	result = 0;
	if (bar)
		goto out_unlock;

	return 0;

out_unlock:
	clear_bit(cf750gxmChangingPllBit, &cf750gx_v_state_bits);
	complete(&cf750gx_v_exit_completion);
out:

The comment does not really explain anything. If you just want to disable
code, use #if 0, but better drop it right away and add a comment about

These casts look wrong, cf750gx_v_lock_call_data is not a notifier_block.


This "wait for anything" use of wait_for_completion looks wrong, 
because once any other function has done the 'complete', you won't
wait here any more.



Is this purely a feature of the CPU or does it need logic
in the board design? If you need external hardware for it,
you need to check the device tree for the presence of that

ret = -ENOMEM;
if (!cf750gx_freq_table)
	goto err_simple;


	Arnd <><
--

From: Kevin Diggs
Date: Monday, August 25, 2008 - 5:57 pm

For reasons I'd rather not go into, my email address is not likely
Now that I look at it the _t is supposed to be at the end. It is
meant to indicate that this is a structure tag or type. I'll
They are a spceification of the processor itself. Should be
The first 3 are module parameters. For the first 2, 0 means
that they were not set. minmaxmode is a boolean. 0 is the
default of disabled.

When I was initially writing the code I figured I would
need the min and max core frequencies in several places.
As it turns out they are only used in the code
initialization routine (cf750gx_init()). I have made
them locals.

..._idle_pll_off is a boolean for a sysfs attribute. 0 is
the default of disabled.

..._min_max_mode is a boolean to hold the state of
minmaxmode. Seems to be only used to print the current
value.

..._state_bits is a global to maintain state.

Does the PowerPC suffer any performance penalties when
accessing shorts compared to ints? Can I save a few bytes
How big of a problem is this? I regret the decision to rip
the SMP stuff out. But it is kinda done. If absolutely
Just a coding style thing. I put declarations (or definitions -
I get the two confused?) on the same indent as the block they are
This was stolen from the ACPI Processor P-States Driver. Given the
This from a code readability standpoint? Or an efficiency one?
I think the cpufreq stuff has a debug configure option that
Originall I had something like:

	while(some_bit_is_still_set)
		sleep

I think you suggested I use a completion because it is in
fact simpler than a sleep. Now that I think about it also seems
intuitive to give the system a hint as to when something will
be done. 'complete' just means there is no timer pending (unless,
Purely a feature of the CPU. From what I know, voltage scaling
is an important part of reducing power comsumption. That would
be platform specific code. Anyone know if this is possible for
any 750GX based system? As far as I know it is not possible on

--

From: Arnd Bergmann
Date: Tuesday, August 26, 2008 - 4:29 am

[Empty message]
From: Kevin Diggs
Date: Wednesday, August 27, 2008 - 2:11 am

Since this is a boolean parameter I don't know? What if I change the
name to enable_minmax. And actually use the "bool" module parameter
I probably don't need both? I kinda treat the variables tied to module
Not that I have heard of. I thought it was lacking some hardware that
The choice I made here was to wait for the timer to fire rather than
to delete it. I think it is easier to just wait for it than to delete
it and try to get things back in order. Though since this is in a
module exit routine it probably does not matter. Also I would have to
check whether there was an analogous HRTimer call and call the right

--

From: Arnd Bergmann
Date: Wednesday, August 27, 2008 - 4:34 am

The reasoning on Linux is that you can tell from the declaration
whether something is global or automatic. In fact, functions should
be so short that you can always see all automatic declarations
when you look at some code.

If you use nonstandard variable naming, people will never stop
asking you about that, so it's easier to just to the same as

Module parameter names should be short, so just "minmax" would
be a good name, but better put the module_param() line right
after that. If it's a bool type, I would just leave out the

But you have marked as read/write in the module_param line.

I would prefer to just have the module parameter and have
a tool to modify that one.

If a module parameter only makes sense as read-only, you
should mark it as 0444 instead of 0644, but this one looks

No, I was wrong about the 603. The machine I was thinking of is actually

I think the module_exit() function should leave the frequency in a
well-defined state, so the easiest way to get there is probably
to delete the timer, and then manually set the frequency.

	Arnd <><
--

From: Geert Uytterhoeven
Date: Wednesday, August 27, 2008 - 4:40 am

The BeBox had a dual 603.

With kind regards,

Geert Uytterhoeven
Software Architect

Sony Techsoft Centre Europe
The Corporate Village 
From: Kevin Diggs
Date: Wednesday, August 27, 2008 - 2:01 pm

Ok. But leaving out the initialization will make me itch. Should I
also replace "override_min_core" with "mincore" (or "min_core")? And
"override_max_core" with "maxcore" (or "max_core")?

Leaving out the initializations makes me ... uneasy. It's ok to leave
I meant I treat them as read only from the code. That is why I have a
separate variable to change from the sysfs routines. I'll eliminate it
I really don't follow you here? If I let the timer fire then the cpu
(and the cpufreq sub-system) will be left in a well-defined state. I
don't understand why you want me to delete the timer and then
basically do manually what it was going to do anyway. There are two
calls to cpufreq_notify_transition(). One just before the modify_PLL()
call, with CPUFREQ_PRECHANGE as an argument, and one in the
pll_switch_cb() routine, with CPUFREQ_POSTCHANGE as an argument. I
would need to make sure that these are matched up.

Even without the HRTimer stuff being used the timer fires in less than
4 ms (@ 250 HZ). So I can't see the user actually trying to interrupt
a frequency change. With HRTimers it is 100 us.

--

From: Arnd Bergmann
Date: Thursday, August 28, 2008 - 12:46 am

Yes, that's how global and static variables are defined in C.

I'm still not convinced that it's actually correct if you call complete()
from the other places as well. You have three locations in your code where
you call complete() but only one for INIT_COMPLETION. The part that I don't
understand (and therefore don't expect other readers to understand) is how
the driver guarantees that only one complete() will be called on the
completion variable after it has been initialized.

What I meant with the well-defined state is that after unloading the module,
the CPU frequency should be the same as before loading the module, typically
the maximum frequency, but not the one that was set last.

	Arnd <><
--

From: Kevin Diggs
Date: Thursday, August 28, 2008 - 9:34 am

As you point out, there are three calls to complete().

The first is in the switch callback, in the idle_pll_off disabled branch.
This callback is triggered from the pll switch routine in pll_if. So that
means the call to _modify_PLL() in _target worked. So the complete() call
in _target will NOT be called. The complete() call in the lock callback
is only called in the idle_pll_off enabled branch.

As just mentioned, the second complete() call in the lock callback is
only called when idle_pll_off is enabled.

The final complete() call is in the unlock exit path in _target(). This
is an error path, where for one reason or another, there was no successful
call to _modify_PLL(). Thus there will be triggering of either callback.

There is another initialization of the completion:  the DECLARE_COMPLETION().
I think I will deadlock if I get unloaded before _target() is ever called.
This can happen. I may add a test_bit(...changing_pll_bit) condition on the
Thanks for taking the time to review and for the comments!

kevin

--

From: Kevin Diggs
Date: Wednesday, August 27, 2008 - 2:04 pm

Previous thread: [PATCH 01/09] block: misc updates by Tejun Heo on Monday, August 25, 2008 - 3:47 am. (12 messages)

Next thread: none