Re: low overhead packet capturing on linux

From: Thomas Fjellstrom
Date: Tuesday, November 30, 2010 - 5:28 pm

I'm working on a little tool to monitor and measure bandwidth use on a VM 
host, down to keeping track of all guest and host bandwidth, including, 
eventually, per-layer-7-protocol use.

Right now I have a pretty simple setup: I set up an AF_PACKET socket, select on 
it, and read data as it comes in. Obviously, this has a fatal flaw: it takes up 
a rather large amount of CPU time just to capture the packets. On a GbE 
interface, it easily uses 60-80% CPU (on a 2.6 GHz AMD Phenom II core) 
just to capture the packets; trying to do anything fancy with them will likely 
cause the kernel to drop some packets.
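
For reference, the setup described above might look roughly like the following. This is a minimal sketch, not the poster's actual code: the function names `account` and `capture`, the per-source-MAC accounting, and the `max_frames` limit are assumptions made for illustration. Opening the socket needs CAP_NET_RAW (typically root) and only works on Linux.

```python
import select
import socket
from collections import Counter

ETH_P_ALL = 0x0003  # from <linux/if_ether.h>: match every protocol


def account(frame: bytes, totals: Counter) -> None:
    """Add the frame length to the byte total of its source MAC address."""
    if len(frame) < 14:          # shorter than an Ethernet header: ignore
        return
    src = frame[6:12].hex(":")   # bytes 6..11 hold the source MAC
    totals[src] += len(frame)


def capture(ifname: str, totals: Counter, max_frames: int) -> None:
    """Select on an AF_PACKET socket and read frames one at a time.

    Hypothetical helper for illustration; requires CAP_NET_RAW.
    """
    sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                         socket.htons(ETH_P_ALL))
    try:
        sock.bind((ifname, 0))   # protocol 0 keeps the one given above
        for _ in range(max_frames):
            # One select() plus one recv() per packet: two system calls
            # and a copy into user space for every single frame.
            select.select([sock], [], [])
            frame = sock.recv(65535)
            account(frame, totals)
    finally:
        sock.close()
```

Calling `capture("eth0", Counter(), 1000)` as root would tally the first 1000 frames seen on `eth0` (the interface name is only an example). The per-packet system calls and copies in the loop are where the CPU time described above goes.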

So what I'm looking for is a very low overhead way to capture packets. I've 
come up with a few ideas, some of which I have no idea if they'd even work.

One idea that came to mind (that doesn't entirely look possible) is using 
splice or vmsplice to get me as little copying as is necessary from the net 
device to my own chunk of memory. Even better if it can be a circular queue of 
sorts. I'd probably use one thread to just sit on the socket and manage the 
packets, and a second thread to actually do the accounting on the incoming 
packets.
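
The two-thread split could be sketched as below. This only illustrates the structure, under several assumptions: a bounded `queue.Queue` stands in for the circular queue, frames are copied rather than spliced, and `run_pipeline` and its parameters are hypothetical names. It does not show splice or vmsplice and is not a low-overhead implementation in itself.

```python
import queue
import threading
from collections import Counter

_DONE = object()  # sentinel telling the accounting thread to stop


def run_pipeline(frames, ring_size=1024):
    """Pass frames from a capture thread to an accounting thread.

    `frames` is any iterable of bytes objects; in a real tool it would be
    a generator reading from the capture socket.
    Returns (per-source-MAC byte totals, number of frames dropped).
    """
    ring = queue.Queue(maxsize=ring_size)
    totals = Counter()
    dropped = 0

    def capture():
        nonlocal dropped
        for frame in frames:
            try:
                ring.put_nowait(frame)   # never block the capture side
            except queue.Full:
                dropped += 1             # consumer fell behind: count the loss
        ring.put(_DONE)                  # blocking put so shutdown is not lost

    def accounting():
        while True:
            frame = ring.get()
            if frame is _DONE:
                break
            if len(frame) >= 14:         # need a full Ethernet header
                totals[frame[6:12].hex(":")] += len(frame)

    workers = [threading.Thread(target=capture),
               threading.Thread(target=accounting)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return totals, dropped
```

Dropping on a full queue, rather than blocking, mirrors what the kernel does when a reader cannot keep up, and the drop count makes that loss visible to the accounting side.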

Anyone have any pointers or tips for me?

-- 
Thomas Fjellstrom
thomas@fjellstrom.ca
--

From: Alexander Clouter
Date: Wednesday, December 1, 2010 - 2:21 am

...iptables?  You get packet and byte counters there for free and you 
can have a 'web, smtp, $service[0], $service[1], ... , other' easily 
enough.
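
As a rough illustration of reading those counters: `iptables-save -c` prefixes every rule with `[packets:bytes]`, so a small parser can turn a dump into per-rule totals. The chain name `ACCT`, the rules, and the numbers in `SAMPLE` are made up for this sketch, and `read_counters` is a hypothetical helper. Rules without a `-j` target only count and let the packet continue, so the final rule with no match counts everything that traverses the chain.

```python
import re
from typing import Dict, Tuple

# `iptables-save -c` prefixes each rule with "[packets:bytes]".
_RULE = re.compile(r"^\[(\d+):(\d+)\] -A (\S+)(?: (.*))?$")

# Made-up example input, in the format printed by `iptables-save -c`.
SAMPLE = """\
*filter
:ACCT - [0:0]
[120:96000] -A ACCT -p tcp -m tcp --dport 80
[30:4500] -A ACCT -p tcp -m tcp --dport 25
[7:420] -A ACCT
COMMIT
"""


def read_counters(dump: str, chain: str) -> Dict[str, Tuple[int, int]]:
    """Map each rule of `chain` to its (packets, bytes) counters.

    The key is the rule text after "-A <chain>"; a rule with no match
    criteria gets the empty string as its key.
    """
    counters = {}
    for line in dump.splitlines():
        m = _RULE.match(line.strip())
        if m and m.group(3) == chain:
            rule = m.group(4) or ""
            counters[rule] = (int(m.group(1)), int(m.group(2)))
    return counters
```

For example, `read_counters(SAMPLE, "ACCT")["-p tcp -m tcp --dport 80"]` gives the packet and byte counts of the port-80 rule. Polling the dump periodically and taking differences would yield bandwidth per rule.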

Five to eight years ago we (an ISP) used this at a previous workplace of 
mine to do xDSL traffic accounting for our users.

Cheers

-- 
Alexander Clouter
.sigmonster says: problem drinker, n.:
                  	A man who never buys.

--

From: Thomas Fjellstrom
Date: Wednesday, December 1, 2010 - 3:18 am

Not with full layer 7 support these days. None of the old things like ipp2p or 
l7-filter will even apply to anything remotely resembling a recent kernel.

Also I'm not sure it'll dynamically keep track of hosts. My solution will 


-- 
Thomas Fjellstrom
thomas@fjellstrom.ca
--

From: Pekka Pietikainen
Date: Wednesday, December 1, 2010 - 5:19 am

Have you checked out

http://public.lanl.gov/cpw/ (IIRC it's actually a part of recent libpcap,
but could be wrong) and http://www.ntop.org/PF_RING.html ?

--

From: Thomas Fjellstrom
Date: Wednesday, December 1, 2010 - 1:28 pm

Hi,

Thanks, yes, at least I've seen the cpw page, probably briefly looked at the 
PF_RING stuff before. But I'll take a closer look this time, thanks :)

When I was looking before, I was unduly rejecting things that required 
patching the kernel, or adding special drivers. But if it really can help I 
might as well take a look.

-- 
Thomas Fjellstrom
thomas@fjellstrom.ca
--

From: Henrique de Moraes Holschuh
Date: Thursday, December 2, 2010 - 7:49 am

Out-of-tree PF_RING :-(

I really wish someone would tackle this problem in a way suitable for
inclusion in mainline, now that we have very good generic backend
infrastructure for such stuff (such as high-speed ring buffers).

AFAIK, what we have right now simply can't cope well with wirespeed taps
(or implement sflow-style taps with low overhead) on very fast links.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh
--
