Re: [RFC][PATCH -mm 1/5] i/o controller documentation

Previous thread: [RFC][PATCH -mm 0/5] cgroup: block device i/o controller (v9) by Andrea Righi on Wednesday, August 27, 2008 - 9:07 am. (15 messages)

Next thread: [RFC][PATCH -mm 2/5] introduce struct res_counter_ratelimit by Andrea Righi on Wednesday, August 27, 2008 - 9:07 am. (1 message)
From: Andrea Righi
Date: Wednesday, August 27, 2008 - 9:07 am

Documentation of the block device I/O controller: description, usage,
advantages and design.

Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
---
 Documentation/controllers/io-throttle.txt |  377 +++++++++++++++++++++++++++++
 1 files changed, 377 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/controllers/io-throttle.txt

diff --git a/Documentation/controllers/io-throttle.txt b/Documentation/controllers/io-throttle.txt
new file mode 100644
index 0000000..09df0af
--- /dev/null
+++ b/Documentation/controllers/io-throttle.txt
@@ -0,0 +1,377 @@
+
+               Block device I/O bandwidth controller
+
+----------------------------------------------------------------------
+1. DESCRIPTION
+
+This controller limits the I/O bandwidth of specific block devices for
+specific process containers (cgroups) by imposing additional delays on I/O
+requests from those processes that exceed the limits defined in the control
+group filesystem.
+
+Bandwidth limiting rules offer better control over QoS than priority or
+weight-based solutions, which only give information about applications'
+relative performance requirements. Moreover, priority-based solutions are
+affected by performance bursts when only low-priority requests are submitted
+to a general purpose resource dispatcher.
+
+The goal of the I/O bandwidth controller is to improve performance
+predictability from the applications' point of view and provide performance
+isolation of different control groups sharing the same block devices.
+
+NOTE #1: If you're looking for a way to improve the overall throughput of the
+system, you should probably use a different solution.
+
+NOTE #2: The current implementation does not guarantee minimum bandwidth
+levels; QoS is implemented only by slowing down I/O "traffic" that exceeds the
+limits specified by the user; minimum I/O rate thresholds are supposed to be
+guaranteed if the user configures a proper I/O bandwidth partitioning of the
+block ...
From: Vivek Goyal
Date: Thursday, September 18, 2008 - 7:04 am

Hi Andrea,

I have a query. What's your use case for capping max bandwidth? I was
wondering whether proportional bandwidth would not cover it. That is, we
allocate a weight/share to every cgroup and limit the bandwidth based on
shares only in case of contention. Otherwise applications get unlimited
bandwidth. That is much like what the cpu controller does and, for that
matter, dm-ioband seems to be doing the same thing. Will you not get the same
kind of QoS here compared to max-bandwidth? The only thing probably missing is
what we call a hard limit: BW is available but you don't want a user to use
that BW unless the user has paid for it.

Thanks
Vivek
--
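
The distinction drawn above between proportional shares and a hard limit can
be sketched as follows. This is only a toy model in Python, not code from the
patch set or from dm-ioband: the function names, the dictionaries of per-cgroup
demands/weights/limits and the single "capacity" number for the disk are all
assumptions made for illustration.

```python
def proportional_share(demands, weights, capacity):
    """Split `capacity` among groups by weight, only under contention.

    Work-conserving: bandwidth a group does not ask for is handed to the
    remaining groups, so an uncontended group can use the whole device.
    All names and units here are hypothetical.
    """
    alloc = {g: 0.0 for g in demands}
    active = {g for g in demands if demands[g] > 0}
    remaining = float(capacity)
    while active and remaining > 1e-9:
        total_w = sum(weights[g] for g in active)
        # Groups whose demand fits inside their weighted share are satisfied.
        satisfied = {
            g for g in active
            if demands[g] <= remaining * weights[g] / total_w
        }
        if not satisfied:
            # Real contention: everybody left gets exactly the weighted share.
            for g in active:
                alloc[g] = remaining * weights[g] / total_w
            break
        for g in satisfied:
            alloc[g] = demands[g]
            remaining -= demands[g]
        active -= satisfied
    return alloc


def hard_cap(demands, limits):
    """Max-bandwidth policy: never exceed the limit, even on an idle disk."""
    return {g: min(demands[g], limits[g]) for g in demands}
```

With weights 3:1 and both groups saturating a device of capacity 100, the
proportional policy yields 75/25; if the second group goes idle, the first one
gets all 100. Under hard_cap() a group limited to 30 stays at 30 regardless of
how idle the device is, which is the "hard limit" case mentioned above.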

From: Andrea Righi
Date: Thursday, September 18, 2008 - 8:03 am

At the beginning my use case was to guarantee a certain level of
performance _predictability_. That means no more and no less than the
specified threshold (should I say this would be useful for real-time
apps? Maybe yes).

But at this stage of development IMHO it's worth implementing a more
generic solution, able to guarantee both min/max thresholds (to cover my
original use case) as well as the weight/share functionality to cover a
wider range of use cases (QoS for massive shared environments).

-Andrea
--

From: Vivek Goyal
Date: Thursday, September 18, 2008 - 8:33 am

Is "no more" harmful for an RT environment? Which RT application hates more
bandwidth than what it asked for? I could understand "no less", but you
mentioned in the past that implementing minimum guarantees is a lot harder.

I was thinking: what if we continue to stick to the current policy
of letting RT requests go first and let them use disk BW first?
cfq first dispatches requests of the RT class (based on their priority).
So in a simple implementation, the IO controller will simply let all RT class
requests go directly to the elevator and then let the elevator dispatch these
requests based on their RT prio. The IO controller will only buffer and control
requests of non-RT classes. This will make sure that we don't break existing
working RT applications and are still able to divide the remaining disk
BW among other non-RT tasks.

IMHO, once the above simple scheme is working, we can probably extend it to
provide additional levels of control.
 
Thanks
Vivek
--
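
The simple scheme described above (RT requests bypass the controller, only
non-RT requests are buffered and throttled) can be modelled roughly as below.
This is a user-space toy in Python, not kernel code: the class name, the
(io_class, prio, size) tuple layout, the per-tick byte budget and the
convention that a lower prio number means higher priority are all assumptions
made for illustration.

```python
from collections import deque

RT, BE = "rt", "be"  # hypothetical labels for RT and non-RT (best-effort) I/O


class IoController:
    """Toy controller: passes RT requests through, throttles everything else."""

    def __init__(self, budget_per_tick):
        self.budget_per_tick = budget_per_tick
        self.tokens = 0
        self.buffered = deque()  # non-RT requests held back by the controller
        self.elevator = []       # stand-in for the elevator's queue

    def submit(self, req):
        """req is a tuple (io_class, prio, size)."""
        if req[0] == RT:
            self.elevator.append(req)  # RT goes straight to the elevator
        else:
            self.buffered.append(req)  # non-RT is buffered and controlled

    def tick(self):
        """Release buffered non-RT requests as far as the budget allows."""
        self.tokens += self.budget_per_tick
        while self.buffered and self.buffered[0][2] <= self.tokens:
            req = self.buffered.popleft()
            self.tokens -= req[2]
            self.elevator.append(req)
        if not self.buffered:
            self.tokens = 0  # do not bank unused budget while idle

    def dispatch(self):
        """Elevator stand-in: RT first, ordered by prio, then the rest (FIFO)."""
        rt = sorted((r for r in self.elevator if r[0] == RT), key=lambda r: r[1])
        rest = [r for r in self.elevator if r[0] != RT]
        self.elevator = []
        return rt + rest
```

In this model an RT request is never delayed by the controller, so existing RT
applications behave as before, while the budget passed to the constructor
decides how quickly the remaining bandwidth is handed to non-RT requests.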

From: Andrea Righi
Date: Thursday, September 18, 2008 - 9:26 am

RT doesn't mean as fast as possible; the objective of RT is to meet
individual timing requirements. So the most important property for RT should
be predictability. If you know that an application requires exactly
T seconds to read a block from a device (no more, no less), well... in this
case you're not introducing uncertainty in your RT task.

And I agree for the "no-less" part. It's difficult, but there's surely

Sounds reasonable, since we want to give more guarantees to respect minimum bw
requirements for RT tasks.

-Andrea
--
