Re: [PATCH 0/2] Traffic control cgroups subsystem

Previous thread: Laptop shock detection and harddisk protection by Tejun Heo on Wednesday, September 10, 2008 - 9:59 am. (27 messages)

Next thread: [PATCH 1/2] Traffic control cgroups subsystem by Ranjit Manomohan on Wednesday, September 10, 2008 - 10:42 am. (13 messages)
From: Ranjit Manomohan
Date: Wednesday, September 10, 2008 - 10:40 am

Incorporated fixes suggested by Li Zefan.

Please consider for net-next-2.6.

This patch provides a simple resource controller (cgroup_tc) based on the
cgroups infrastructure to manage network traffic. The cgroup_tc resource
controller can be used to schedule and shape traffic belonging to the task(s)
in a particular cgroup.

The implementation consists of two parts:

1) A resource controller (cgroup_tc) that is used to associate packets from
   a particular task belonging to a cgroup with a traffic control class id (
   tc_classid). This tc_classid is propagated to all sockets created by tasks
   in the cgroup and will be used for classifying packets at the link layer.

2) A new traffic control classifier (cls_cgroup) that can classify packets
   based on the tc_classid field in the socket to specific destination classes.
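
The two parts above can be modelled in user space as a rough sketch. This is not the patch's kernel code: the names Cgroup, Task, Socket and CgroupClassifier are hypothetical, and copying the classid once at socket creation is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Cgroup:
    """Hypothetical stand-in for a cgroup carrying a tc.classid value."""
    name: str
    tc_classid: int = 0  # assumption: 0 means "no classid assigned"


@dataclass
class Socket:
    """Hypothetical socket that remembers its creator's classid."""
    tc_classid: int


@dataclass
class Task:
    pid: int
    cgroup: Cgroup

    def create_socket(self) -> Socket:
        # Part 1: the cgroup's classid is copied into every socket
        # created by a task in that cgroup.
        return Socket(tc_classid=self.cgroup.tc_classid)


@dataclass
class CgroupClassifier:
    """Part 2: map a socket's classid to a destination traffic class."""
    filters: Dict[int, str] = field(default_factory=dict)

    def add_filter(self, value: int, classid: str) -> None:
        self.filters[value] = classid

    def classify(self, sock: Socket) -> Optional[str]:
        # None means no filter matched this socket's classid.
        return self.filters.get(sock.tc_classid)


# Usage: tag a cgroup with 0x1234 and steer its sockets to class 1:10.
file_transfer = Cgroup("file_transfer", tc_classid=0x1234)
task = Task(pid=4242, cgroup=file_transfer)
classifier = CgroupClassifier()
classifier.add_filter(0x1234, "1:10")
sock = task.create_socket()
print(classifier.classify(sock))
```

The classifier in this model only reads a field stored on the socket, so it never has to look up the task that owns a packet.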

An example of the use of this resource controller would be to limit
the traffic from all tasks in a file_transfer cgroup to 100 Mbit/s. We could
achieve this by doing:

# Make a cgroup of file transfer processes and assign it an arbitrary unique
# classid of 0x1234; this will be used later to direct packets.
mkdir -p /dev/cgroup
mount -t cgroup tc -otc /dev/cgroup
mkdir /dev/cgroup/file_transfer
echo 0x1234 > /dev/cgroup/file_transfer/tc.classid
echo $PID_OF_FILE_XFER_PROCESS > /dev/cgroup/file_transfer/tasks

# Now create an HTB class that rate limits traffic to 100 Mbit/s and attach
# a filter to direct all traffic from cgroup file_transfer to this new class.
tc qdisc add dev eth0 root handle 1: htb
tc class add dev eth0 parent 1: classid 1:10 htb rate 100mbit ceil 100mbit
tc filter add dev eth0 parent 1: handle 800 protocol ip prio 1 cgroup value 0x1234 classid 1:10
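
A note on the class handle used in these commands: tc writes handles such as 1:10 as major:minor in hexadecimal and stores them as one 32-bit value with the major number in the upper 16 bits. The sketch below shows only that encoding. The helper names parse_tc_handle and format_tc_handle are hypothetical and are not part of tc or of this patch.

```python
def parse_tc_handle(handle: str) -> int:
    """Convert a 'major:minor' handle (hex digits) to its 32-bit value."""
    major_s, _, minor_s = handle.partition(":")
    major = int(major_s, 16) if major_s else 0
    minor = int(minor_s, 16) if minor_s else 0
    if not (0 <= major <= 0xFFFF and 0 <= minor <= 0xFFFF):
        raise ValueError(f"handle out of range: {handle!r}")
    return (major << 16) | minor


def format_tc_handle(value: int) -> str:
    """Render a 32-bit handle value back as 'major:minor' in hex."""
    return f"{value >> 16:x}:{value & 0xFFFF:x}"


# The HTB class from the example: major 0x1, minor 0x10.
class_handle = parse_tc_handle("1:10")
print(hex(class_handle), format_tc_handle(class_handle))
```

In the example, 0x1234 is the value written to tc.classid for the cgroup, while 1:10 names the HTB class that the filter directs matching traffic to.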

Signed-off-by: Ranjit Manomohan <ranjitm@google.com>

---
--

From: David Miller
Date: Wednesday, September 10, 2008 - 1:22 pm

I definitely prefer Thomas Graf's work, this stuff is very ugly
and way overengineered.

So no, I won't consider for net-next-2.6, sorry.

--

From: Ranjit Manomohan
Date: Wednesday, September 10, 2008 - 1:44 pm

Could you be more specific? Thomas' work is almost identical to this
(except that he does not store the cgroup id in the socket, a trivial
change whose downsides I have pointed out).

Additionally, this approach makes only minor modifications to the core
networking stack. Which portions do you consider ugly and overengineered,
and what alternative implementations would you prefer?
Please see the follow-up I have sent to Thomas' proposal about why we
need this design approach to handle the inbound case.

I'd be ok if you accepted either change since we just want a standard
kernel mechanism to do this.

-Thanks,
--

From: Thomas Graf
Date: Wednesday, September 10, 2008 - 3:12 pm

WRT the inbound case, after some experiments I decided to drop the
ingress case entirely and stick to something as simple as possible for
egress. The reason is that associating a packet with a task at the
classifier level is a very expensive operation, and that cost is not
justified by the very limited capabilities of ingress shaping. Ingress
shaping is best effort at best. It works fairly well with a very limited
number of bulk data streams but usually fails
miserably in common congestion situations where a cgroup classifier

Agreed. I think your approach is very reasonable, but considering the
reasons I've given above and in the other thread, I found it could be done
in a simpler and more direct way.
--

From: Ranjit Manomohan
Date: Wednesday, September 10, 2008 - 4:37 pm

Could you elaborate on the failure cases? We have found this to be
useful in practice to prevent applications from reading large amounts
of data off the network so it would be nice if it were supported.

-Thanks,
--

From: Thomas Graf
Date: Wednesday, September 10, 2008 - 4:52 pm

It works fairly well for a small number of bulk streams when no packets
need to be dropped. The results get very inaccurate for a higher number
of smaller streams when packets start getting dropped. The problem is
simply that currently none of the congestion notification mechanisms
work in (internet) practice.

Therefore I think that it is not worth the effort. If you think
differently, I will be more than glad to review code. So far I haven't
seen anything that would work on ingress.
--
