Re: [lxc-devel] Poor bridging performance on 10 GbE

From: Daniel Lezcano
Date: Wednesday, March 18, 2009 - 3:10 am

Thanks for doing the benchmarking.
I ran similar tests two years ago, and there is an analysis of the
performance at:
http://lxc.sourceforge.net/network/benchs.php

It is not up to date, but that will give you some clues of what is 

Yeah, the macvlan interface is definitely the best in terms of
performance, but with the restriction that containers on the same
host cannot communicate with each other.

There are some discussions around that:

http://marc.info/?l=linux-netdev&m=123643508124711&w=2

The veth is a virtual device, hence it has no offloading. When
packets are sent out, the network stack looks at the NIC offloading
capability, which is not present. So the kernel computes the
checksums itself instead of letting the NIC do it, even if the packet is
transmitted through the physical NIC. This is a well-known issue related
to network virtualization, and Xen has developed a specific network driver:

Yes, bridging adds some overhead and AFAIR bridging + netfilter does 


I would recommend using the vanilla 2.6.29-rc8 kernel, because it
no longer needs patches; a lot of fixes were done in the network

The performance question is more related to the network virtualization
implementation and should be sent to netdev@ and containers@ (added in
the Cc of this email). Of course, people at lxc-devel@ will be
interested in these aspects, so lxc-devel@ is the right mailing list too.

Thanks for your testing
   -- Daniel
--

From: Ryousei Takano
Date: Wednesday, March 18, 2009 - 8:56 am

Hi Daniel,

I am using VServer because other virtualization mechanisms, including OpenVZ,
Xen, and KVM cannot fully utilize the network bandwidth of 10 GbE.

Here are the results of the netperf benchmark:
	Configuration (kernel)	Throughput (Mbps)
	vanilla (2.6.27-9)	9525.94
	VServer (2.6.27.10)	9521.79
	OpenVZ (2.6.27.10)	2049.89
	Xen (2.6.26.1)		1011.47
	KVM (2.6.27-9)		1022.42
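
For a quick comparison, the figures above can be expressed relative to the
vanilla kernel. This is only a sketch: it assumes all five numbers are netperf
throughput in the same unit (most likely 10^6 bits/s), and the names `results`
and `relative_to_baseline` are illustrative, not part of the benchmark.

```python
# Throughput figures copied from the results above (unit assumed to be Mbps).
results = {
    "vanilla": 9525.94,
    "VServer": 9521.79,
    "OpenVZ": 2049.89,
    "Xen": 1011.47,
    "KVM": 1022.42,
}


def relative_to_baseline(results, baseline="vanilla"):
    """Return each result as a percentage of the baseline throughput."""
    base = results[baseline]
    return {name: 100.0 * value / base for name, value in results.items()}


if __name__ == "__main__":
    for name, pct in relative_to_baseline(results).items():
        print(f"{name:8s} {pct:6.1f}% of vanilla")
```

Read this way, VServer is within a fraction of a percent of vanilla, while
OpenVZ, Xen, and KVM reach only a small share of it.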

I checked out the 2.6.29-rc8 vanilla kernel.
The performance after issuing lxc-start improved to 8.7 Gbps!
It's a big improvement, although some performance loss remains.

Best regards,
Ryousei Takano
--

From: Eric W. Biederman
Date: Wednesday, March 18, 2009 - 5:50 pm

Right.  I have been trying to figure out what the best way to cope

Bridging, last I looked, uses the least common denominator of hardware
offloads, which likely explains why adding a veth decreased your

Good question.  Any chance you can profile this and see where the
performance loss seems to be coming from?

Eric
--

From: Ryousei Takano
Date: Wednesday, March 18, 2009 - 10:37 pm

Hi Eric,

On Thu, Mar 19, 2009 at 9:50 AM, Eric W. Biederman
<ebiederm@xmission.com> wrote:

At least for now, LRO cannot coexist with bridging.
I found out that this issue is caused by a decrease in the MTU size.
Myri-10G's MTU size is 9000 bytes; the veth's MTU size is 1500 bytes.
After bridging the veth, the MTU size decreases from 9000 to 1500 bytes.
I changed the veth's MTU size to 9000 bytes, and then confirmed that
the throughput improved to 9.6 Gbps.
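
The MTU effect described above can be illustrated with a toy model. By default
a Linux bridge takes the smallest MTU among its ports, so attaching a
1500-byte veth pulls a 9000-byte bridge down to 1500. The sketch below only
models that rule and does not touch real interfaces; the class `Bridge` and
the port names `eth2` and `veth0` are made up for illustration.

```python
class Bridge:
    """Toy model: the bridge MTU follows its smallest port (assumed default behaviour)."""

    def __init__(self):
        self.ports = {}  # port name -> MTU in bytes

    def add_port(self, name, mtu):
        self.ports[name] = mtu

    def set_port_mtu(self, name, mtu):
        if name not in self.ports:
            raise KeyError(f"unknown port: {name}")
        self.ports[name] = mtu

    @property
    def mtu(self):
        # With no ports attached, fall back to the Ethernet default of 1500.
        return min(self.ports.values(), default=1500)


if __name__ == "__main__":
    br = Bridge()
    br.add_port("eth2", 9000)       # hypothetical name for the Myri-10G NIC
    print("NIC only:", br.mtu)
    br.add_port("veth0", 1500)      # veth created with the default MTU
    print("after adding veth:", br.mtu)
    br.set_port_mtu("veth0", 9000)  # the fix described in this message
    print("after raising veth MTU:", br.mtu)
```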

The throughput between LXC containers also improved to 4.9 Gbps
by changing the MTU sizes.
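
One likely reason the MTU matters this much is per-packet cost: for the same
bit rate, 1500-byte packets mean six times as many packets to process as
9000-byte packets. The arithmetic below is a rough sketch under a simplifying
assumption (every packet is exactly MTU-sized and header overhead is ignored);
`packets_per_second` is an illustrative helper, not a measurement tool.

```python
def packets_per_second(throughput_bps, mtu_bytes):
    """Approximate packet rate needed to carry throughput_bps with MTU-sized packets."""
    if mtu_bytes <= 0:
        raise ValueError("MTU must be positive")
    return throughput_bps / (mtu_bytes * 8)


if __name__ == "__main__":
    rate = 9.6e9  # 9.6 Gbps, the bridged throughput reported in this message
    for mtu in (1500, 9000):
        pps = packets_per_second(rate, mtu)
        print(f"MTU {mtu}: about {pps:,.0f} packets/s")
```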

So I propose adding lxc.network.mtu to the LXC configuration.
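
As a rough illustration of the proposal, a container configuration could carry
the MTU next to the other network keys, and the tool would read it when it
sets up the veth. The sketch below is an assumption about how that might look,
not the patch itself: `lxc.network.mtu` is the key proposed here, while the
sample values and the helpers `parse_lxc_config` and `network_mtu` are
hypothetical.

```python
SAMPLE_CONFIG = """\
# hypothetical container configuration
lxc.network.type = veth
lxc.network.link = br0
lxc.network.mtu = 9000
"""


def parse_lxc_config(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"malformed line: {line!r}")
        config[key.strip()] = value.strip()
    return config


def network_mtu(config, default=1500):
    """Return the configured MTU, or the default when the key is absent."""
    mtu = int(config.get("lxc.network.mtu", default))
    if mtu <= 0:
        raise ValueError("MTU must be positive")
    return mtu


if __name__ == "__main__":
    print(network_mtu(parse_lxc_config(SAMPLE_CONFIG)))
```

Falling back to 1500 when the key is missing keeps existing configurations
behaving as they do today.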

Best regards,
Ryousei Takano
--

From: Daniel Lezcano
Date: Thursday, March 19, 2009 - 2:08 am

Sounds good :)
Do you plan to send a patch?
--

From: Ryousei Takano
Date: Thursday, March 19, 2009 - 3:50 am

Hi Daniel,

On Thu, Mar 19, 2009 at 6:08 PM, Daniel Lezcano <dlezcano@fr.ibm.com> wrote:


Yes, I will post a patch as soon as possible.

Best regards,
Ryousei Takano
--
