Create Network Drop Monitoring service in the kernel
A few weeks ago I posted an RFC requesting some feedback on a proposal that I
had to enhance our ability to monitor the Linux network stack for dropped
packets. This patchset is the result of that RFC and its feedback.
The Linux networking stack, from a user's point of view, suffers from four problems:
1) Consolidation: The ability to detect dropped network packets is spread out
over several proc file interfaces and various other utilities (tc,
/proc/net/dev, snmp, etc.)
2) Clarity: The ability to discern which statistics reflect dropped packets is
not always clear
3) Ambiguity: The ability to understand the root cause of a lost packet is not
always clear (some stats are incremented at multiple points in the kernel for
subtly different reasons)
4) Performance: Interrogating all of these interfaces as they currently exist
requires a polling operation, and potentially requires the serialization of
various kernel operations, which can result in performance degradation.
Proposed solution: dropwatch
My proposed solution consists of 4 primary aspects:
A) A hook into kfree_skb to detect dropped packets. Based on feedback from the
earlier RFC, there are relatively few places in the kernel where packets are
dropped because they have been successfully received or sent (for lack of a
better term, end-of-line points). The remaining calls to kfree_skb are made
because there is something wrong and the packet must be discarded. I've split
kfree_skb into two calls: kfree_skb and kfree_skb_clean. The latter is simply a
pass through to __kfree_skb, while the former adds a trace hook to capture a
pointer to the skb and the location of the call.
B) A trace hook to monitor the trace point in (A). This records the locations
at which frames were dropped, and saves them for periodic reporting.
C) A netlink protocol to both control the enabling/disabling of the trace hook
in (B) and to deliver information on ...
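To make (A) and (B) concrete, here is a rough user-space sketch of the idea, not the code from the patchset. It assumes a stand-in struct sk_buff, a hypothetical hook pointer (drop_hook), a hypothetical fixed-size table (drop_table, DROP_SLOTS), and a helper named free_skb_untraced that plays the role of __kfree_skb. The call location is taken with __builtin_return_address(0), a GCC/Clang builtin, and kfree_skb is marked noinline so that the address belongs to its caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the kernel's struct sk_buff (the real one is far larger). */
struct sk_buff { size_t len; };

#define DROP_SLOTS 64              /* arbitrary capacity for this sketch */

struct drop_point {
    void *location;                /* call site that dropped the packet */
    unsigned long count;           /* drops seen there since last report */
};

static struct drop_point drop_table[DROP_SLOTS];
static size_t drop_used;

/* Hypothetical hook type: receives the skb and the dropping location. */
typedef void (*drop_hook_t)(struct sk_buff *skb, void *location);
static drop_hook_t drop_hook;      /* NULL while monitoring is disabled */

/* (B) Record one drop, merging repeats from the same location. */
static void record_drop(struct sk_buff *skb, void *location)
{
    (void)skb;                     /* a fuller hook could inspect the packet */
    for (size_t i = 0; i < drop_used; i++) {
        if (drop_table[i].location == location) {
            drop_table[i].count++;
            return;
        }
    }
    if (drop_used < DROP_SLOTS) {  /* new locations are ignored once full */
        drop_table[drop_used].location = location;
        drop_table[drop_used].count = 1;
        drop_used++;
    }
}

/* (B) Periodic report: copy up to max entries to out, then clear the table.
 * Entries beyond max are discarded in this sketch. */
static size_t report_drops(struct drop_point *out, size_t max)
{
    assert(out != NULL);
    size_t n = drop_used < max ? drop_used : max;
    for (size_t i = 0; i < n; i++)
        out[i] = drop_table[i];
    drop_used = 0;
    return n;
}

/* Plays the role of __kfree_skb: releases the buffer with no tracing. */
static void free_skb_untraced(struct sk_buff *skb) { free(skb); }

/* (A) End-of-line path: the packet was handled normally, so no trace. */
static void kfree_skb_clean(struct sk_buff *skb)
{
    if (skb)
        free_skb_untraced(skb);
}

/* (A) Drop path: hand the skb and the caller's address to the hook,
 * then free the buffer exactly as the clean path does. */
__attribute__((noinline))
static void kfree_skb(struct sk_buff *skb)
{
    if (!skb)
        return;
    if (drop_hook)
        drop_hook(skb, __builtin_return_address(0));
    free_skb_untraced(skb);
}
```

With drop_hook set to record_drop, a buffer released through kfree_skb_clean leaves the table untouched, while one released through kfree_skb adds (or increments) an entry keyed by the caller's address. A reporting path would call report_drops periodically and pass the copied entries on to user space.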
On Tue, 3 Mar 2009 11:57:47 -0500
It would be good to have a way to mask off certain tracepoints.
For example, when running a performance test, after measuring the number
of packets dropped by TX queue overflow, mask those and see only the others.
I had actually considered that, yes. I'd like to save it for a later release,
just to avoid adding too much into it at once, but I'll put it on the roadmap.
I'll probably add the ability to configure a filter list to the
Thanks for the suggestion!