> Hi,
>
>> > Fernando Luis Vázquez Cao wrote:
>> > >>> This seems to be the easiest part, but the current cgroups
>> > >>> infrastructure has some limitations when it comes to dealing with block
>> > >>> devices: impossibility of creating/removing certain control structures
>> > >>> dynamically and hardcoding of subsystems (i.e. resource controllers).
>> > >>> This makes it difficult to handle block devices that can be hotplugged
>> > >>> and go away at any time (this applies not only to usb storage but also
>> > >>> to some SATA and SCSI devices). To cope with this situation properly we
>> > >>> would need hotplug support in cgroups, but, as suggested before and
>> > >>> discussed in the past (see (0) below), there are some limitations.
>> > >>>
>> > >>> Even in the non-hotplug case it would be nice if we could treat each
>> > >>> block I/O device as an independent resource, which means we could do
>> > >>> things like allocating I/O bandwidth on a per-device basis. As long as
>> > >>> performance is not compromised too much, adding some kind of basic
>> > >>> hotplug support to cgroups is probably worth it.
>> > >>>
>> > >>> (0)
http://lkml.org/lkml/2008/5/21/12
>> > >> What about using major,minor numbers to identify each device and account
>> > >> IO statistics? If a device is unplugged we could reset IO statistics
>> > >> and/or remove IO limitations for that device from userspace (i.e. by a
>> > >> deamon), but pluggin/unplugging the device would not be blocked/affected
>> > >> in any case. Or am I oversimplifying the problem?
>> > > If a resource we want to control (a block device in this case) is
>> > > hot-plugged/unplugged the corresponding cgroup-related structures inside
>> > > the kernel need to be allocated/freed dynamically, respectively. The
>> > > problem is that this is not always possible. For example, with the
>> > > current implementation of cgroups it is not possible to treat each block
>> > > device as a different cgroup subsytem/resource controlled, because
>> > > subsystems are created at compile time.
>> >
>> > The whole subsystem is created at compile time, but controller data
>> > structures are allocated dynamically (i.e. see struct mem_cgroup for
>> > memory controller). So, identifying each device with a name, or a key
>> > like major,minor, instead of a reference/pointer to a struct could help
>> > to handle this in userspace. I mean, if a device is unplugged a
>> > userspace daemon can just handle the event and delete the controller
>> > data structures allocated for this device, asynchronously, via
>> > userspace->kernel interface. And without holding a reference to that
>> > particular block device in the kernel. Anyway, implementing a generic
>> > interface that would allow to define hooks for hot-pluggable devices (or
>> > similar events) in cgroups would be interesting.
>> >
>> > >>> 3. & 4. & 5. - I/O bandwidth shaping & General design aspects
>> > >>>
>> > >>> The implementation of an I/O scheduling algorithm is to a certain extent
>> > >>> influenced by what we are trying to achieve in terms of I/O bandwidth
>> > >>> shaping, but, as discussed below, the required accuracy can determine
>> > >>> the layer where the I/O controller has to reside. Off the top of my
>> > >>> head, there are three basic operations we may want perform:
>> > >>> - I/O nice prioritization: ionice-like approach.
>> > >>> - Proportional bandwidth scheduling: each process/group of processes
>> > >>> has a weight that determines the share of bandwidth they receive.
>> > >>> - I/O limiting: set an upper limit to the bandwidth a group of tasks
>> > >>> can use.
>> > >> Use a deadline-based IO scheduling could be an interesting path to be
>> > >> explored as well, IMHO, to try to guarantee per-cgroup minimum bandwidth
>> > >> requirements.
>> > > Please note that the only thing we can do is to guarantee minimum
>> > > bandwidth requirement when there is contention for an IO resource, which
>> > > is precisely what a proportional bandwidth scheduler does. An I missing
>> > > something?
>> >
>> > Correct. Proportional bandwidth automatically allows to guarantee min
>> > requirements (instead of IO limiting approach, that needs additional
>> > mechanisms to achive this).
>> >
>> > In any case there's no guarantee for a cgroup/application to sustain
>> > i.e. 10MB/s on a certain device, but this is a hard problem anyway, and
>> > the best we can do is to try to satisfy "soft" constraints.
>>
>> I think guaranteeing the minimum I/O bandwidth is very important. In the
>> business site, especially in streaming service system, administrator requires
>> the functionality to satisfy QoS or performance of their service.
>> Of course, IO throttling is important, but, personally, I think guaranteeing
>> the minimum bandwidth is more important than limitation of maximum bandwidth
>> to satisfy the requirement in real business sites.
>> And I know Andrea's io-throttle patch supports the latter case well and it is
>> very stable.
>> But, the first case(guarantee the minimum bandwidth) is not supported in any
>> patches.
>> Is there any plans to support it? and Is there any problems in implementing it?
>> I think if IO controller can support guaranteeing the minimum bandwidth and
>> work-conserving mode simultaneously, it more easily satisfies the requirement
>> of the business sites.
>> Additionally, I didn't understand "Proportional bandwidth automatically allows
>> to guarantee min
>> requirements" and "soft constraints".
>> Can you give me a advice about this ?
>> Thanks in advance.
>>
>> Dong-Jae Kang
>
> I think this is what dm-ioband does.
>
> Let's say you make two groups share the same disk, and give them
> 70% of the bandwidth the disk physically has and 30% respectively.
> This means the former group is almost guaranteed to be able to use
> 70% of the bandwidth even when the latter one is issuing quite
> a lot of I/O requests.
>
> Yes, I know there exist head seek lags with traditional magnetic disks,
> so it's important to improve the algorithm to reduce this overhead.
>