Re: [Patch 0/5] Network Drop Monitor

Previous thread: [PATCH] dm9601: new vendor/product IDs by Peter Korsgaard on Tuesday, March 3, 2009 - 8:22 am. (2 messages)

Next thread: [Patch 1/5] Network Drop Monitor: Add netlink protocol identifer by Neil Horman on Tuesday, March 3, 2009 - 10:00 am. (8 messages)
From: Neil Horman
Date: Tuesday, March 3, 2009 - 9:57 am

Create Network Drop Monitoring service in the kernel

A few weeks ago I posted an RFC requesting some feedback on a proposal that I
had to enhance our ability to monitor the Linux network stack for dropped
packets.  This patchset is the result of that RFC and its feedback.

Overview:

The Linux networking stack, from a user's point of view, suffers from four
shortcomings:

1) Consolidation: The ability to detect dropped network packets is spread out
over several proc file interfaces and various other utilities (tc,
/proc/net/dev, snmp, etc.).

2) Clarity: It is not always clear which statistics reflect dropped packets.

3) Ambiguity: The ability to understand the root cause of a lost packet is not
always clear (some stats are incremented at multiple points in the kernel for
subtly different reasons)

4) Performance: Interrogating all of these interfaces as they currently exist
requires a polling operation, and potentially requires the serialization of
various kernel operations, which can result in performance degradation.

Proposed solution: dropwatch

My proposed solution consists of 4 primary aspects:

A) A hook into kfree_skb to detect dropped packets.  Based on feedback from the
earlier RFC, there are relatively few places in the kernel where packets are
freed because they have been successfully received or sent (for lack of a
better term, end-of-line points).  The remaining calls to kfree_skb are made
because there is something wrong and the packet must be discarded.  I've split
kfree_skb into two calls: kfree_skb and kfree_skb_clean.  The latter is simply a
pass through to __kfree_skb, while the former adds a trace hook to capture a
pointer to the skb and the location of the call.

B) A trace hook to monitor the trace point in (A).  This records the locations
at which frames were dropped, and saves them for periodic reporting.

C) A netlink protocol to both control the enabling/disabling of the trace hook
in (B) and to deliver information on ...
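
To make (A) and (B) concrete, here is a minimal userspace sketch, not the code
from the patches.  Every name in it (skb_release, trace_drop,
sketch_kfree_skb, sketch_kfree_skb_clean, drop_count_at, the fixed-size table)
is hypothetical, struct sk_buff is reduced to a stub, and the call location is
passed in explicitly so the sketch stays portable; how the real hook obtains
the location is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stub; the real struct sk_buff is far larger. */
struct sk_buff {
    size_t len;
};

#define MAX_DROP_POINTS 64

/* One row per distinct location at which a packet was dropped. */
struct drop_point {
    const void *location;
    unsigned long count;
};

static struct drop_point drop_table[MAX_DROP_POINTS];
static size_t drop_points_used;

/* Stand-in for the trace hook in (B): count the drop against its location. */
static void trace_drop(const struct sk_buff *skb, const void *location)
{
    size_t i;

    (void)skb; /* a real hook could inspect the skb; this sketch does not */
    for (i = 0; i < drop_points_used; i++) {
        if (drop_table[i].location == location) {
            drop_table[i].count++;
            return;
        }
    }
    if (drop_points_used < MAX_DROP_POINTS) {
        drop_table[drop_points_used].location = location;
        drop_table[drop_points_used].count = 1;
        drop_points_used++;
    }
    /* Table full: the drop goes unrecorded in this sketch. */
}

/* Stand-in for __kfree_skb: release the buffer, nothing else. */
static void skb_release(struct sk_buff *skb)
{
    free(skb);
}

/* "Clean" free for end-of-line points: plain pass through, no tracing. */
static void sketch_kfree_skb_clean(struct sk_buff *skb)
{
    skb_release(skb);
}

/* Free for error paths: record the skb and location, then release. */
static void sketch_kfree_skb(struct sk_buff *skb, const void *location)
{
    assert(location != NULL);
    trace_drop(skb, location);
    skb_release(skb);
}

/* Lookup used when reporting: drops recorded for one location so far. */
static unsigned long drop_count_at(const void *location)
{
    size_t i;

    for (i = 0; i < drop_points_used; i++) {
        if (drop_table[i].location == location)
            return drop_table[i].count;
    }
    return 0;
}
```

The point of the split is that only the error-path free pays for the table
lookup; a periodic reporter would then walk drop_table and hand the counts to
userspace.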
From: Stephen Hemminger
Date: Tuesday, March 3, 2009 - 11:06 am

On Tue, 3 Mar 2009 11:57:47 -0500

It would be good to have a way to mask off certain tracepoints.
For example, when running a performance test, after measuring the number
of packets dropped by TX queue overflow, one could mask that point off and
see only the other drops.
--

From: Neil Horman
Date: Tuesday, March 3, 2009 - 11:54 am

I had actually considered that, yes.  I'd like to save it for a later release,
just to avoid adding too much into it at once, but I'll put it on the roadmap.
I'll probably add the ability to configure a filter list to the
protocol.
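
As a rough illustration of what such a filter list might look like (purely a
sketch: struct drop_filter, drop_filter_add and drop_filter_should_report are
hypothetical names, and nothing like this exists in the current patchset), the
monitor could keep a small list of locations to ignore and consult it before
reporting a drop:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FILTERS 16

/* Hypothetical filter list: drop locations the user asked to mask off. */
struct drop_filter {
    const void *ignored[MAX_FILTERS];
    size_t used;
};

/* Add a location to the mask; returns false if the list is already full. */
static bool drop_filter_add(struct drop_filter *f, const void *location)
{
    assert(f != NULL);
    if (f->used >= MAX_FILTERS)
        return false;
    f->ignored[f->used++] = location;
    return true;
}

/* True if a drop at this location should still be reported. */
static bool drop_filter_should_report(const struct drop_filter *f,
                                      const void *location)
{
    size_t i;

    assert(f != NULL);
    for (i = 0; i < f->used; i++) {
        if (f->ignored[i] == location)
            return false;
    }
    return true;
}
```

In the TX queue overflow case, the overflow location would be added to the
list once it has been measured, so that only the remaining drop points show
up in later reports.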

Thanks for the suggestion!
Neil
--
