Re: [PATCH] pm_qos: Add system bus performance parameter

Previous thread: [GIT PULL] tegra fixes for 2.6.36 by Colin Cross on Thursday, August 26, 2010 - 8:40 pm. (1 message)

Next thread: Add system bus performance parameter by Saravana Kannan on Thursday, August 26, 2010 - 9:13 pm. (4 messages)
From: Saravana Kannan
Date: Thursday, August 26, 2010 - 9:13 pm

Some drivers/devices might need some minimum system bus performance to
provide acceptable service. Provide a PM QoS parameter to send these requests
to.

The new parameter is named "system bus performance" since it is generic enough
for the unit of the request to be frequency, bandwidth or something else that
might be appropriate. It's up to each implementation of the QoS provider to
define what the unit of the request would be.

Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
---
 kernel/pm_qos_params.c |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/kernel/pm_qos_params.c b/kernel/pm_qos_params.c
index 996a4de..1a44a67 100644
--- a/kernel/pm_qos_params.c
+++ b/kernel/pm_qos_params.c
@@ -93,12 +93,21 @@ static struct pm_qos_object network_throughput_pm_qos = {
 	.type = PM_QOS_MAX,
 };
 
+static BLOCKING_NOTIFIER_HEAD(system_bus_performance_notifier);
+static struct pm_qos_object system_bus_performance_pm_qos = {
+	.requests = PLIST_HEAD_INIT(system_bus_performance_pm_qos.requests, pm_qos_lock),
+	.notifiers = &system_bus_performance_notifier,
+	.name = "system_bus_performance",
+	.default_value = 0,
+	.type = PM_QOS_MAX,
+};
 
 static struct pm_qos_object *pm_qos_array[] = {
 	&null_pm_qos,
 	&cpu_dma_pm_qos,
 	&network_lat_pm_qos,
 	&network_throughput_pm_qos
+	&system_bus_performance_pm_qos
 };
 
 static ssize_t pm_qos_power_write(struct file *filp, const char __user *buf,
-- 
1.7.1.1
---
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
--

From: mark gross
Date: Thursday, August 26, 2010 - 11:41 pm

nack.

Change the name to system_bus_throughput_pm_qos assuming KBS units and
I'll ok it.  It needs to be portable and without units I think drivers
will start using magic numbers that will break when you go from a
devices with 16 to 32 bus with the same clock.

We had an email thread about this last year 
http://lkml.org/lkml/2009/12/31/143
I don't recall solution ever coming out of it.   I think you guys didn't
like the idea of using units.  Further I did post a patch adding
something like using units. Although I looks like I botch the post the
linux-pm as I can't seem to find it in the linux-pm archives :(
http://lkml.org/lkml/2010/4/22/213

Would you be ok with using throughput instead of a unit less performance
magic number?


--mark


--

From: skannan
Date: Friday, August 27, 2010 - 1:10 am

Ignoring other details for now, the biggest problem with throughput/KBps
units is that PM QoS can't handle it well in its current state. For KBps
the requests should be added together before it's "enforced". Just picking
the maximum won't work optimally.

Another problem with using KBps is that the available throughput is going
to vary depending on the CPU frequency since the CPU running at a higher
freq is going to use more bandwidth/throughput than the same CPU running
at a lower freq.

A KHz unit will side step both problems. It's not the most ideal in theory
but it's simple and gets the job done since, in our case, there aren't
very many fine grained levels of system bus frequencies (and corresponding
throughputs).

I understand that other architectures might have different practical
constraints and abilities and I didn't want to impose the KHz limitation
on them. That's the reason I proposed a parameter whose units is defined
by the "enforcer".

Thoughts?

Thanks,
Saravana
-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


--

From: Peter Zijlstra
Date: Friday, August 27, 2010 - 3:17 am

Like Mark said, unit-less constraints are impossible to use correctly.
What if a driver moves from one platform to another?

Also, the KHz thing you propose simply doesn't make sense, given a fixed
bus width, KBs and KHz have a fixed ratio, and thus you get the exact
same problem you initially had and refused to fix.


--

From: mark gross
Date: Friday, August 27, 2010 - 7:05 pm

um, if your modem SPI needs a min freq its really saying it needs a min
throughput (throughput is just a scaler times freq, and 8KBS is a 13 bit

I think your getting too wrapped up with this Hz thing and need write a

The problem is that doing this will result in too many one-off drivers
that don't port nicely to my architecture when I use the same
not really anything additional, other than I wonder why kbs isn't
working for you.  Perhaps I'm missing something subtle.

--mark

--

From: Saravana Kannan
Date: Friday, August 27, 2010 - 7:55 pm

I don't know how the network throughput is enforced, but if the unit is 
KBps and it's just doing a Max, then I think it's broken. If two clients 
request 50 KBps and your network can go till 200 KBps, you would still 
be requesting 50 KBps when you could have requested 100 KBps.


I think my point wasn't clear. Say the driver is doing mem read/write 
and it needs 10 MBps and the system bus maxes out at 20 MBps.

When the CPU is idling, and isn't using the system bus, it would be 
sufficient for the system bus to run at 10MBps speeds. But when CPU 
starts executing at full speed, it's going to eat up some bandwidth and 

Yes, I could just do this and call it a day. Although, in my opinon, 
it's a misrepresentation of the parameter since we aren't doing a 

Most of the drivers/devices that really need PM QoS and don't degrade 
gracefully are internal to SoCs. I can't think of too many external or 
loosely coupled devices that don't degrade gracefully. Anyway, 

I don't think that PM QoS in it's current state can meet all the real 
requirements of bus bandwidth configuration or management. Which is fine 
-- we need to start somewhere. Something like what Kevin mentions or a 
different method would be needed to get this right for the complex 
cases. I too would be very eager to join this discussion in one of the 
conferences (Linux Plumbers?).

My thought was that till that's ready (this has been in discussion for 
_at least_ 8 months) we could go with a PM QoS parameter (preferably 
without units) and switch to the new design when it's available. I would 
be more than glad to switch to the new/future design when it's available.

Did I convince you to allow a unit less parameter? :-)

-Saravana
-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
--

From: mark gross
Date: Saturday, August 28, 2010 - 3:52 pm

PM_QoS could do a summation, but keep in mind it pm_qos not qos.  pm_qos
is a best effort thing to constrain power management throttling, not
provide a true quality of service or deadline scheduling support.

If you stick to the full up quality of service mentality you quickly get
into discussions just like those around memory over commit.  Its really
hard to know when best effort or hard QoS is appropriate.

If you are trying to use pm_qos as a true qos interface then, its
definitely not up to the task.

example:  you have one 100Mb NIC in your box.  With PM QoS you could
have 4 user mode applications requesting 100Mbs PM_Q0S.  In this case
the right thing to do is to constrain the NIC PM to keep it full on and
the PHY going as fast as it can.  But you'll never get 400Mbs out of the
thing.

So far only max and min really have made sense for pm_qos but, if a case
pops up where summation makes more sense for aggregating the pm_qos

This is a quality of service problem not a power management problem.
For this the summation is more correct.  I don't think pm_qos will be

I agree with you on this.  The problem I'm reading into your words is a
pretty interesting one, but I don't think pm_qos is the right place to
start.  You seem to be looking for a QoS facility that will either
return an error, or bug, when a QoS request cannot be met and you want
guarantees not best efforts.

pm_qos != QoS

How far would you be willing to take it?  You're example is talking
about shared bus bandwidth, what about CPU cycles, deadline scheduling,
network bw,storage bw, RAM, and god knows what else?


I think plumbers would be a good place for such a discussion.  As is
this mailing list (or perhaps another if we are not talking about PM.)

I am now wondering if Kevin was thinking QoS or PM in what he was
talking about at the colab summit last spring.

We need to keep these requirements separated and understood WRT what
they are for.  I think there is a need for a separate QoS API and, ...
From: Kevin Hilman
Date: Monday, August 30, 2010 - 11:56 am

For me (and I think Saravana too), this is still all about power, but
it's closely tied to QoS.  

For things like busses, which are inherently shared, PM is tightly
coupled with "true" QoS, so I'm not sure I fully follow the distinction
being made between PM QoS and QoS.  Seems like the tradeoff is always

Using your example above, what if the 4 apps all request 10Mb/s?

What is best effort?  Leave the NIC in 10Mb/s mode, or bump up the power
state to 100Mb/s mode?

This decision is both QoS and PM related.  Without summation, the 'max'
request is still 10Mb/s so you would keep the lower power state.  But
you also know that none of the clients will get their requested rate.

There's some gray area here since there is a choice.  Was the point
of the request to keep the NIC at the *power-state* needed for 10Mb/s (a
PM request) or was the request saying the app wanted at least 10Mb/s (a
QoS request.)  

My understanding is that PM QoS is intended to limit power-state
throttling.  IOW, in the absence of PM QoS requests, the PM core code is
free to throttle the power of the devices/subsystems/busses etc.  If
requests are present, it is no longer free to throttle blindly.

The question here seems to be whether or not the PM core code should
also be free to increase the power state to meet a combination of PM QoS
requests.  To me this is still PM related.  Just like in race-to-idle
for the CPU, it might be better for overall power to go to the highter
state for a burst and then be able to fully throttle again.

Kevin
--

From: mark gross
Date: Tuesday, August 31, 2010 - 11:40 am

Now I get it!  For throughput we need to do a sum.  Ok, we need sum
comparator/performance aggregaters too!

Do we also need to figure out the max throughput and warn if the pm_qos
requests are going over?  I suppose the network stack could register
each device with a max bus bandwidth and pm_qos could warn on exceeding

I need to think on this a bit.  You are correct, and it looks like we

thanks for the example!  it really helped me to understand the issue
better.

--mark

--

From: Saravana Kannan
Date: Tuesday, August 31, 2010 - 3:38 pm

Kevin, Thanks for explaining exactly what I had in mind. I was caught up 
with other work and was glad to see this discussion moved forward.

I pretty much agree with all of Kevin's statements, so here is a 


In my opinion, here is where the "best effort" part, if any, comes in. 
PM QoS could do it's best to meet the QoS while keeping power low, but 
if the h/w can't support it, we let it run at highest performance and 

I'm not sure having both interfaces would work. Should a single client 
be allowed to keep the *power state* to what's needed for 10Mb/s? What 
happens if another client votes with "I need at least 20Mb/s"?

I think the "limit max power-state to X" should be a specific to each PM 
QoS parameter (not its clients) similar to how scaling_max_freq works 
for CPU freq and is not set by each client (process - it uses the CPU).

So, will be be adding a system bus thruput parameter? Is it going to 
have min comparator for now?

Btw, Mark, I'm a he. Not a she :-)

Thanks,
Saravana

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
--

From: mark gross
Date: Wednesday, September 1, 2010 - 7:28 am

yes, we need to add a summation aggregater to the pm_qos logic and
likely apply it to all the throughput pm_qos parameters.  You where
right about that point.  (but I'm not budging on the unit less

so we don't need to warn if the aggregate qos request exceeds the

I need to think some more on this buy its looking like for throughput
we may only want one type of interface because, as you say,  it will be

yes.  However; it follows the units of the pm_qos parameter abstraction


I made an effort to not use he/she in all my email's with you because I
couldn't tell from your name.  I must have screwed up in one of my
emails.

--mark

--

From: Saravana Kannan
Date: Wednesday, September 1, 2010 - 8:37 pm

That should work for now. If we see a strong reason for notifying QoS 


Ok. Who is going to add the summation "comparator"? I can write a patch 
for the system bus thruput parameter.

-Saravana

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
--

From: mark gross
Date: Thursday, September 2, 2010 - 7:09 am

I think the choice of units in the abstraction mostly defines what type

I'll do a patch by monday as an RFC but, I'm wondering if we want it in
the main line vrs an implementation that scales by bus integrated into
the driver model per what Kevin has been talking about....

--mgross

--

From: Saravana Kannan
Date: Friday, September 3, 2010 - 7:04 pm

The model Kevin is thinking about will probably need some time and work 
before it's usable. In the meanwhile, can we have this added?

Thanks,
Saravana

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
--

From: Saravana Kannan
Date: Friday, September 17, 2010 - 1:32 pm

Mark, Any updates?

-Saravana

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
--

From: Kevin Hilman
Date: Friday, August 27, 2010 - 7:31 am

With this current design, only one system-wide bus would be managed.
What if a platform has more than one independently scalable bus?

I think the only scalable way to handle this kind of thing is to have
per-device QoS constraints that can then be combined/aggregated by parent
devices/busses.

At LPC this year, I've proposed per-device QoS constraints[1] as a topic
for the PM mini-conf.  I hope some folks from the MSM camp can be there
for these discussions.

Kevin

--

From: Bryan Huntsman
Date: Friday, August 27, 2010 - 11:33 am

Yeah, I'm planning on rounding up some MSM folks for LPC this year. 
Power is a big concern for us so it would be good to join the 
discussion.  Initially, I was very keen on the per-device QoS contraints 
but I've since cooled on it.  For our HW, there's not a generic unit 
that can convey enough data for us to act on.  At least not w/o lookup 
tables, etc., at which point the unit loses it's value and becomes a 
generic handle.  I'm looking forward to a good group discussion on this 
topic.  Thanks.

- Bryan


-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
--

From: mark gross
Date: Friday, August 27, 2010 - 6:55 pm

I am looking forward to a good discussion too :)

It would be cool if we had a prototype for the per-device qos
constraint idea to use to help guide the discussion.

Power is a big concern for everyone, and everyone doing SOC's have
all have about the same issues to boot with spi, sdio i2c power domains
and clock domains.

If msm is having issues I bet IA and Omap are or soon will have the
exact same class of issues to solve the kernel.

--mark

--

From: mark gross
Date: Friday, August 27, 2010 - 7:09 pm

we have the same problem for mutliple network devices today.  its a

I think this path has a chance of fixing the current limitations of

I thought we were going to do some prototyping or hacking on this by
now. Who was going to take the first wack at this?  

--

From: mark gross
Date: Saturday, August 28, 2010 - 4:05 pm

I'm just realizing that Saravana may not be talking about a power
management problem.  I think she is talking about QoS, not constraining
the throttling of some platform bus or device. (i.e. not pm_qos)

I think its important to keep requirements clear. Are your ideas around
doing proper throttling constraints or are they about QoS proper?

I see a need for both but, as a QoS facility will likely call into
pm_qos to help grantee a qos it will be fundamentally different form
pm_qos.  

PM_QoS is a best effort API to constrain power state throttling.

So, what are you thinking about?  QoS or constraining power state
throttling?

--

From: mark gross
Date: Thursday, September 2, 2010 - 7:05 am

I thought a pm_qos like thing per bus would be a patch or you where
going to put up to the driver model. ;)

The current pm_qos would stick around for higher level pm_qos things.
So making the system bus and changing to a summation aggregation would
be  temporary thing.  

Or are you you saying we shouldn't put system_bus into pm_qos at all and
instead we should put effort into adding it to the driver model for
buses?

If so, then I don't know what to do next, I could prototype both I
suppose (well implement the sum+system_bus in pm_qos as an RFC patch,
and take a stab at the driver model thing. )


--

From: Rafael J. Wysocki
Date: Thursday, September 2, 2010 - 1:09 pm

Hmm, well, what's system_bus?

Rafael
--

From: mark gross
Date: Monday, September 6, 2010 - 10:42 pm

My understanding is that system_bus is a somewhat generic concept that
has meaning only in platform specific hardware configurations pm_qos
request class for device specific buses.  Memory, SDIO, SPI, i2c etc.

I'm not sure if it should be exposed up to user mode as an ABI.  Its
pretty abstract and its meaning is perhaps a bit mutable across target 
architectures at this time.

--mgross



--

From: Rafael J. Wysocki
Date: Tuesday, September 7, 2010 - 2:43 pm

No, it shouldn't.

Thanks,
Rafael
--

Previous thread: [GIT PULL] tegra fixes for 2.6.36 by Colin Cross on Thursday, August 26, 2010 - 8:40 pm. (1 message)

Next thread: Add system bus performance parameter by Saravana Kannan on Thursday, August 26, 2010 - 9:13 pm. (4 messages)