"I have a question for the PF/ALTQ masters out there," Matthew Dillon began on the DragonFly BSD kernel mailing list, having recently switched from using a Cisco router to a DragonFly BSD server running PF. "I am trying to configure PF in a manner similar to what Cisco's fair-queue algorithm does. Cisco's algorithm basically hashes TCP and UDP traffic based on the port/IP pairs, creating a bunch of lists of backlogged packets and then schedules the packets at the head of each list." He went on to explain that he was unsuccessfully trying to configure the same thing with PF, "neither CBQ nor HFSC seem to work well. I can separate certain types of traffic but the real problem is when there are multiple TCP connections that are essentially classified the same, and one is hogging the outgoing bandwidth. So the question is, is there a PF solution for that or do I need to write a new ALTQ mechanic to implement fair queueing?"
Not finding a solution, he followed with a series of patches implementing what he needed. He explained the resulting logic noting, "unless something comes up I am going to commit this to DragonFly on Friday and call it done. I would be pleased if other projects picked up some or all of the work":
"The queues are scanned from highest priority to lowest priority; if the packet bandwidth on the queue does not exceed the bandwidth parameter and a packet is available, a packet will be chosen from that queue; if a packet is available but the queue has exceeded the specified bandwidth, the next lower priority queue is scanned (and so forth); if NO lower priority queues either have packets or are all over the bandwidth limit, then a packet will be taken from the highest priority queue with a packet ready; packet rate can exceed the queue bandwidth specification (but will not exceed the interface bandwidth specification, of course), but under full saturation the average bandwidth for any given queue will be limited to the specified value."
From: Matthew Dillon <dillon@...> Subject: Network transition complete + PF question Date: Apr 3, 2:08 am 2008

The network move is complete.

I have a question for the PF/ALTQ masters out there. I am trying to
configure PF in a manner similar to what Cisco's fair-queue algorithm
does. Cisco's algorithm basically hashes TCP and UDP traffic based
on the port/IP pairs, creating a bunch of lists of backlogged packets
and then schedules the packets at the head of each list.

I am trying to find something equivalent with PF and not having much
luck. Neither CBQ nor HFSC seem to work well. I can separate certain
types of traffic but the real problem is when there are multiple
TCP connections that are essentially classified the same, and one is
hogging the outgoing bandwidth.

So the question is, is there a PF solution for that or do I need to
write a new ALTQ mechanic to implement fair queueing?

If there is no current solution I have a pretty good idea how to
implement it. I can use PF's 'keep state' mechanism and then hash
the state structure pointer and store it in the packet header, then
implement a new ALTQ that takes that hash code and throws it into an
array of queues from which it fair-dequeues packets for output.

-Matt
From: Matthew Dillon <dillon@...> Subject: FairQ ALTQ for PF - Patch #1 Date: Apr 4, 12:28 am 2008

Ok, This is my first attempt at adding a fairq feature to ALTQ/PF.
It isn't perfect yet, but it appears to work reasonably well.

fetch http://apollo.backplane.com/DFlyMisc/fairq01.patch

It isn't hierarchical (at least not yet), but you can specify multiple
queues for each interface as long as you give them different priorities.

Here is an example configuration:

altq on vke0 fairq bandwidth 500Kb queue { normal, fair }
queue fair priority 1 bandwidth 100Kb fairq(buckets 64) qlimit 50
queue normal priority 2 bandwidth 400Kb fairq(buckets 64, default) qlimit 50

pass out on vke0 inet proto tcp from any to any keep state queue normal
pass out on vke0 inet proto tcp from any to 216.240.41.28 keep state queue fair

Here is how it works:

* The queues are scanned from highest priority to lowest priority.

* If the packet bandwidth on the queue does not exceed the bandwidth
  parameter and a packet is available, a packet will be chosen from
  that queue.

* If a packet is available but the queue has exceeded the specified
  bandwidth, the next lower priority queue is scanned (and so forth).

* If NO lower priority queues either have packets or are all over the
  bandwidth limit, then a packet will be taken from the highest priority
  queue with a packet ready.

* Packet rate can exceed the queue bandwidth specification (but
  will not exceed the interface bandwidth specification, of course),
  but under full saturation the average bandwidth for any given
  queue will be limited to the specified value.

Here is how the fair queueing works:

* You MUST specify 'keep state' in the related rules.

* keep state 'connections' will be given a fingerprint hash code which
  will be used to enqueue the mbuf in one of the N buckets (64 in our
  example) for each fair queue.

* When PF requests a packet from the fairq, a packet will be selected
  from each of the 64 buckets in a round-robin fashion.

Thus if you have a very hungry connection, it will not be able to
steal all the bandwidth (or queue up tons of packets to the actual
interface) from other connections within the queue.

Caveats and issues:

(1) The qlimit is per-bucket. So 64 buckets x 50 packets is, worst case,
    3200 packets. It's unlikely this would ever occur, but it's an issue
    that I haven't dealt with yet.

(2) Due to limitations on the number of buckets, multiple connections
    can end up in the same bucket. If one of those connections is a
    heavy hitter, the others will suffer.

    This could probably be fixed with further sorting or perhaps a
    different topology (e.g. like a tree instead of a fixed array).

Please Test! I have this running on my router box right now and
it appears to work very well.

-Matt
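As an illustration only, the two levels described in this message (the priority and bandwidth scan across queues, and the round-robin across hash buckets inside one queue) can be sketched in Python. This is not the ALTQ code from the patch. The names `FairQueue`, `select_packet`, and `measured_bps` are hypothetical, the convention that a larger priority number means higher priority is an assumption, and how the real code measures per-queue bandwidth is not shown in the thread.

```python
from collections import deque


class FairQueue:
    """One fairq queue: N buckets served round-robin (hypothetical model)."""

    def __init__(self, name, priority, bandwidth_bps, nbuckets=64, qlimit=50):
        self.name = name
        self.priority = priority            # assumption: larger = higher priority
        self.bandwidth_bps = bandwidth_bps  # the configured 'bandwidth' parameter
        self.measured_bps = 0.0             # recent rate; assumed updated elsewhere
        self.qlimit = qlimit                # per-bucket packet limit
        self.buckets = [deque() for _ in range(nbuckets)]
        self.next_bucket = 0                # round-robin pointer

    def enqueue(self, state_hash, packet):
        """Place a packet in the bucket chosen by its keep-state hash."""
        bucket = self.buckets[state_hash % len(self.buckets)]
        if len(bucket) >= self.qlimit:
            return False                    # bucket full: the packet is dropped
        bucket.append(packet)
        return True

    def has_packet(self):
        return any(self.buckets)

    def dequeue(self):
        """Take one packet from the next non-empty bucket, round-robin."""
        n = len(self.buckets)
        for i in range(n):
            idx = (self.next_bucket + i) % n
            if self.buckets[idx]:
                self.next_bucket = (idx + 1) % n
                return self.buckets[idx].popleft()
        return None


def select_packet(queues):
    """Scan queues from highest to lowest priority, honoring bandwidth limits."""
    fallback = None
    for q in sorted(queues, key=lambda q: q.priority, reverse=True):
        if not q.has_packet():
            continue
        if q.measured_bps <= q.bandwidth_bps:
            return q.dequeue()              # within its bandwidth: serve it
        if fallback is None:
            fallback = q                    # highest-priority queue over its limit
    if fallback is not None:
        return fallback.dequeue()           # everyone is over limit: serve anyway
    return None
```

In this model a connection that fills its own bucket delays only the flows that hash to the same bucket, which is caveat (2); the per-bucket `qlimit` is why the worst case is buckets times qlimit packets, as in caveat (1).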
From: Max Laier <max@...> Subject: Re: FairQ ALTQ for PF - Patch #1 Date: Apr 5, 1:18 pm 2008

On Friday 04 April 2008 06:28:22 Matthew Dillon wrote:
> Ok, This is my first attempt at adding a fairq feature to ALTQ/PF.
> It isn't perfect yet, but it appears to work reasonably well.
>
> fetch http://apollo.backplane.com/DFlyMisc/fairq01.patch

There is a WFQ discipline for ALTQ:
http://www.kame.net/dev/cvsweb2.cgi/kame/kame/sys/altq/ altq_wfq.{h,c}

It has never been integrated with pf, but I think using your approach of
passing a hash in the pkthdr this should be rather straight forward.

> It isn't hierarchical (at least not yet), but you can specify
> multiple queues for each interface as long as you give them different
> priorities.
>
> Here is an example configuration:
>
> altq on vke0 fairq bandwidth 500Kb queue { normal, fair }
> queue fair priority 1 bandwidth 100Kb fairq(buckets 64) qlimit 50
> queue normal priority 2 bandwidth 400Kb fairq(buckets 64, default)
> qlimit 50
>
> pass out on vke0 inet proto tcp from any to any keep state queue normal
> pass out on vke0 inet proto tcp from any to 216.240.41.28 keep state
> queue fair
>
> Here is how it works:
>
> * The queues are scanned from highest priority to lowest priority.
>
> * If the packet bandwidth on the queue does not exceed the
> bandwidth parameter and a packet is available, a packet will be chosen
> from that queue.
>
> * If a packet is available but the queue has exceeded the specified
> bandwidth, the next lower priority queue is scanned (and so
> forth).
>
> * If NO lower priority queues either have packets or are all over
> the bandwidth limit, then a packet will be taken from the highest
> priority queue with a packet ready.
>
> * Packet rate can exceed the queue bandwidth specification (but
> will not exceed the interface bandwidth specification, of
> course), but under full saturation the average bandwidth for any given
> queue will be limited to the specified value.
>
> Here is how the fair queueing works:
>
> * You MUST specify 'keep state' in the related rules.
>
> * keep state 'connections' will be given a fingerprint hash code
> which will be used to enqueue the mbuf in one of the N buckets (64 in
> our example) for each fair queue.
>
> * When PF requests a packet from the fairq, a packet will be
> selected from each of the 64 buckets in a round-robin fashion.
>
> Thus if you have a very hungry connection, it will not be able to
> steal all the bandwidth (or queue up tons of packets to the
> actual interface) from other connections within the queue.
>
> Caveats and issues:
>
> (1) The qlimit is per-bucket. So 64 buckets x 50 packets is, worst
> case, 3200 packets. It's unlikely this would ever occur, but it's an
> issue that I haven't dealt with yet.
>
> (2) Due to limitations on the number of buckets, multiple
> connections can end up in the same bucket. If one of those connections
> is a heavy hitter, the others will suffer.
>
> This could probably be fixed with further sorting or perhaps a
> different topology (e.g. like a tree instead of a fixed array).
>
> Please Test! I have this running on my router box right now and
> it appears to work very well.
>
> -Matt

--
/"\ Best regards, | mlaier@freebsd.org
\ / Max Laier | ICQ #67774661
X http://pf4freebsd.love2party.net/ | mlaier@EFnet
/ \ ASCII Ribbon Campaign | Against HTML Mail and News
From: Matthew Dillon <dillon@...> Subject: Re: FairQ ALTQ for PF - Patch #1 Date: Apr 5, 2:09 pm 2008

:There is a WFQ discipline for ALTQ:
:http://www.kame.net/dev/cvsweb2.cgi/kame/kame/sys/altq/ altq_wfq.{h,c}
:
:It has never been integrated with pf, but I think using your approach of
:passing a hash in the pkthdr this should be rather straight forward.

Ah, there we go. Wow, way back in 1997. The core of that code is
definitely fair-queue. I'm not sure what they are doing at the top
level, though, I don't see any prioritization or bandwidth control.
There's a queue->quota and a queue->weight that looks like it has
been partially coded but not finished. They are using a list of queues
instead of an array which I think is somewhat superior to what I'm
doing (bitmap of active queues with an iterator), but I think my
bandwidth and prioritization algorithm is a bit more advanced.

One thing I can theorize would be beneficial would be to record the
bandwidth being used by each sub-queue and then allow low bandwidth queues
to 'burst' data by moving the queue to the head of the list if it is
recognized as having low bandwidth and is otherwise empty. To prevent
starvation from having many low bw connections you'd keep another
counter which is reset when the round-robin encounters the queue normally
without it having been moved.

So, e.g. if you do a 'pounding the keyboard' test on an interactive
connection you would get interactive response. Right now with my
implementation if you pound the keyboard you get intermediate
responsiveness because the round-robin has to cycle around to that
queue before the packet gets sent. Maybe that is what they were trying
to control with the weighting variable.

I am going to research it a bit more. I kinda like my base better (well,
that's no surprise), but the list of queues approach WFQ takes has a lot
more flexibility.

-Matt
Matthew Dillon
From: Matthew Dillon <dillon@...> Subject: FairQ ALTQ for PF - Patch #2 Date: Apr 5, 6:15 pm 2008

After looking at WFQ (thanks to Max Laier for the reference!), and
reading a few papers on it, I've got the second version of my fairq
patch for ALTQ ready to go.

fetch http://apollo.backplane.com/DFlyMisc/fairq02.patch

This version removes the bitmap and the fixed array scan. It keeps
the fixed array of buckets but links the active buckets together into a
circular queue.

The 'hogs' option is now operational. This option allows a bucket to
drain in a burst (i.e. to not advance the round robin pointer) as long
as its bandwidth is less than the specified bandwidth.

My fair share scheduler is not yet weighted, but the new topology
makes it possible to implement a full blown fair share scheduler
(aka a weighted scheduler). I haven't decided whether I want to go
that far yet but in the meantime I did implement a quick hack to
insert new empty low-bandwidth queues (bw < hogs bw) at the head of
the circular list instead of the tail (kind of a poor-man's deadline
mechanic but not really). I'm considering my options. The new
circular list gives me a lot of flexibility.

Here is an example configuration. Also note that your kernel must
be compiled with the various ALTQ options, including the new
ALTQ_FAIRQ option.

-Matt

ports="{ 22, 25 }"

altq on vke0 fairq bandwidth 500Kb queue { normal, bulk }
queue bulk priority 1 bandwidth 100Kb \
    fairq(buckets 64, hogs 25Kb) qlimit 50
queue normal priority 2 bandwidth 400Kb \
    fairq(buckets 64, hogs 25Kb, default) qlimit 50

pass out on vke0 inet proto tcp from any to any \
    keep state queue normal
pass out on vke0 inet proto tcp from any to 216.240.41.28 port $ports \
    keep state queue bulk
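A rough Python model of the second patch's topology may help picture it: a fixed array of buckets, with only the non-empty ones linked into a ring, and a 'hogs' threshold below which a bucket keeps its turn. This is a sketch under stated assumptions, not the patch itself. `HogsFairQueue` and its fields are hypothetical names, the per-bucket `rate` is assumed to be measured elsewhere, a Python `deque` stands in for the circular list, and the head-insertion hack for new low-bandwidth queues is left out.

```python
from collections import deque


class HogsFairQueue:
    """Fixed bucket array plus a ring of active buckets (hypothetical model)."""

    def __init__(self, nbuckets=64, hogs_bps=25_000):
        self.buckets = [deque() for _ in range(nbuckets)]
        self.rate = [0.0] * nbuckets  # measured bits/s per bucket; assumed updated elsewhere
        self.active = deque()         # indices of non-empty buckets; head is served next
        self.hogs_bps = hogs_bps      # the 'hogs' parameter

    def enqueue(self, state_hash, packet):
        idx = state_hash % len(self.buckets)
        if not self.buckets[idx]:
            self.active.append(idx)   # bucket becomes active: link it at the tail
        self.buckets[idx].append(packet)

    def dequeue(self):
        if not self.active:
            return None
        idx = self.active[0]
        packet = self.buckets[idx].popleft()
        if not self.buckets[idx]:
            self.active.popleft()     # bucket drained: unlink it from the ring
        elif self.rate[idx] >= self.hogs_bps:
            self.active.rotate(-1)    # at or above the hogs rate: advance the pointer
        # otherwise the bucket is below the hogs rate and keeps bursting
        return packet
```

With a high-rate flow in one bucket and a low-rate flow in another, the low-rate bucket drains in one burst once its turn comes, while the hog gives up its turn after every packet.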
From: Matthew Dillon <dillon@...> Subject: Re: FairQ ALTQ for PF - Patch #2 Date: Apr 6, 2:39 pm 2008

This has been running well on my router and doesn't really affect
other ALTQ disciplines so I am going to go ahead and commit it
to clear room to port the probability keyword that Cedric mentioned,
before I get back to finishing up HAMMER.

-Matt
From: Matthew Dillon <dillon@...> Subject: Re: FairQ ALTQ for PF - Patch #2 Date: Apr 6, 6:36 pm 2008

:Matthew Dillon wrote:
:> This has been running well on my router and doesn't really affect
:> other ALTQ disciplines so I am going to go ahead and commit it
:> to clear room to port the probability keyword that Cedric mentioned,
:> before I get back to finishing up HAMMER.
:>
:> -Matt
:
:For some reason, since a week ago, your servers have been unreachable to
:Linux clients. The problem can be temporarily bypassed by setting the
:Linux sysctl net.ipv4.tcp_window_scaling to 0
:
:--
:Robert Luciani

It's got to be something PF (packet filter) is doing. I was using a
Cisco with the T1. I'm using a DFly box running PF with the DSL line.
I'm trying to track it down.

-Matt
From: Max Laier <max@...> Subject: Re: FairQ ALTQ for PF - Patch #2 Date: Apr 6, 7:31 pm 2008

On Monday 07 April 2008 00:36:29 Matthew Dillon wrote:
> :Matthew Dillon wrote:
> :> This has been running well on my router and doesn't really
> :> affect other ALTQ disciplines so I am going to go ahead and commit
> :> it to clear room to port the probability keyword that Cedric
> :> mentioned, before I get back to finishing up HAMMER.
> :>
> :> -Matt
> :
> :For some reason, since a week ago, your servers have been unreachable
> : to Linux clients. The problem can be temporarily bypassed by setting
> : the Linux sysctl net.ipv4.tcp_window_scaling to 0
> :
> :--
> :Robert Luciani
>
> It's got to be something PF (packet filter) is doing. I was using
> a Cisco with the T1. I'm using a DFly box running PF with the DSL
> line. I'm trying to track it down.

This is usually a symptom of creating state on a TCP packet other than the
initial SYN. Make sure you add "flags S/SA" to all your tcp keep state
rules. There is plenty on this in the FAQs and lists (freebsd-pf@ and
the OpenBSD pf list) for more detailed reference.

--
/"\ Best regards, | mlaier@freebsd.org
\ / Max Laier | ICQ #67774661
X http://pf4freebsd.love2party.net/ | mlaier@EFnet
/ \ ASCII Ribbon Campaign | Against HTML Mail and News
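For readers unfamiliar with the "flags S/SA" notation Max mentions: the part after the slash is a mask of TCP flags to examine, and the part before it is the set that must be on within that mask. So S/SA matches a packet with SYN set and ACK clear, which is the initial SYN of a connection. A small Python sketch of that test follows; the function names are made up for the illustration, and the flag values are the standard TCP header bits.

```python
# Standard TCP header flag bits.
TH_FIN, TH_SYN, TH_RST, TH_PUSH, TH_ACK = 0x01, 0x02, 0x04, 0x08, 0x10


def matches_flags(tcp_flags, want, mask):
    """pf-style 'flags want/mask': of the bits in mask, exactly those in want are set."""
    return (tcp_flags & mask) == want


def is_initial_syn(tcp_flags):
    """Equivalent of 'flags S/SA': SYN set, ACK clear, other flags ignored."""
    return matches_flags(tcp_flags, TH_SYN, TH_SYN | TH_ACK)
```

A rule limited this way only creates state from the first packet of a connection, so the state entry always sees the options carried in the handshake.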
From: Matthew Dillon <dillon@...> Subject: Re: FairQ ALTQ for PF - Patch #2 Date: Apr 6, 7:48 pm 2008

:> It's got to be something PF (packet filter) is doing. I was using
:> a Cisco with the T1. I'm using a DFly box running PF with the DSL
:> line. I'm trying to track it down.
:
:This is usually a symptom of creating state on a TCP packet other than the
:initial SYN. Make sure you add "flags S/SA" to all your tcp keep state
:rules. There is plenty on this in the FAQs and lists (freebsd-pf@ and
:the OpenBSD pf list) for more detailed reference.
:
:--
:/"\ Best regards, | mlaier@freebsd.org
:\ / Max Laier | ICQ #67774661

I kinda half understand that. Are you saying that because creating
state on other than the initial syn has no information on the window
scale (which is only handled in the SYN and SYN+ACK), that it will
blow up?

Here are two questions:

(1) I'm using keep state, not synproxy. Is PF still attempting to do
    window sequence space comparisons and dropping packets if they do
    not match? If it is, do you know where in the code that is
    (I've been staring at it a while trying to find just such a
    comparison but not having a whole lot of luck).

(2) If I restart PF, and do not create state for pre-existing connections,
    won't that blow up the classification of those connections?

    In particular, if there are a lot of flows going through the router
    and it drops some of its state, won't those flows wind up being
    left out of the state code from that point on? They would not be
    identifiable to the fairq code, then, which would be a fairly
    significant problem.

What I would like to do, if (1) is true, is modify PF to flag that the
state was created without a SYN, and have it automatically ignore
sequence space comparisons for that case.

-Matt
Matthew Dillon
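The window-scale problem discussed here can be shown with a little arithmetic. The TCP window scale factor is exchanged only in the SYN and SYN+ACK, so a state entry created from a later packet has no way to learn it. The Python sketch below is a deliberately simplified stand-in for a sequence-window check, not pf's actual validation code: `in_window` is a hypothetical helper, it ignores sequence-number wraparound, and the numbers are invented for the example.

```python
def in_window(seq, seg_len, last_ack, raw_window, wscale):
    """Simplified check: does [seq, seq + seg_len) fit in the advertised window?

    Ignores sequence-number wraparound; for illustration only.
    """
    window = raw_window << wscale  # real window = header field * 2**wscale
    return last_ack <= seq and seq + seg_len <= last_ack + window


# The receiver negotiated a scale factor of 7 in its SYN and now advertises
# 1000 in the header, which means a real window of 128000 bytes.
RAW_WINDOW, NEGOTIATED_WSCALE = 1000, 7

# A full-size segment 70000 bytes past the last ACK is legitimate ...
seen_handshake = in_window(70_000, 1460, 0, RAW_WINDOW, NEGOTIATED_WSCALE)
# ... but a tracker that never saw the SYN has to assume no scaling.
missed_handshake = in_window(70_000, 1460, 0, RAW_WINDOW, 0)

print(seen_handshake, missed_handshake)  # prints: True False
```

This is consistent with the workaround Robert Luciani reported: with net.ipv4.tcp_window_scaling set to 0 the advertised window is the real window, so a tracker that assumes no scaling no longer disagrees with the endpoints.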
From: Max Laier <max@...> Subject: Re: FairQ ALTQ for PF - Patch #2 Date: Apr 6, 8:32 pm 2008

On Monday 07 April 2008 01:48:28 Matthew Dillon wrote:
> :> It's got to be something PF (packet filter) is doing. I was
> :> using a Cisco with the T1. I'm using a DFly box running PF with the
> :> DSL line. I'm trying to track it down.
> :
> :This is usually a symptom of creating state on a TCP packet other than
> : the initial SYN. Make sure you add "flags S/SA" to all your tcp keep
> : state rules. There is plenty on this in the FAQs and lists
> : (freebsd-pf@ and the OpenBSD pf list) for more detailed reference.
> :
> :--
> :/"\ Best regards, | mlaier@freebsd.org
> :\ / Max Laier | ICQ #67774661
>
> I kinda half understand that. Are you saying that because creating
> state on other than the initial syn has no information on the
> window scale (which is only handled in the SYN and SYN+ACK), that it
> will blow up?

Right.

> Here are two questions:
>
> (1) I'm using keep state, not synproxy. Is PF still attempting to
> do window sequence space comparisons and dropping packets if they do
> not match? If it is, do you know where in the code that is
> (I've been staring at it a while trying to find just such a
> comparison but not having a whole lot of luck).

See the attached forward from the pf mailing list. The referenced paper is
a good read, too.

> (2) If I restart PF, and do not create state for pre-existing
> connections, won't that blow up the classification of those
> connections?

Yes, if you also flush states.

> In particular, if there are a lot of flows going through the router
> and it drops some of its state, won't those flows wind up being
> left out of the state code from that point on? They would not be
> identifiable to the fairq code, then, which would be a fairly
> significant problem.

Usually you won't drop active states. You'd simply time them out more
aggressively (see adaptive.{start,end} in pf.conf(5) if your version has
that already) or not allow a new state to be created.

> What I would like to do, if (1) is true, is modify PF to flag that
> the state was created without a SYN, and have it automatically ignore
> sequence space comparisons for that case.

It really depends on what you want to achieve. If you are after security
for a network of clients with bad/broken TCP stacks then leaving out the
window checks is not a good idea. I can see that there are cases where
you'd want to check only the (src,dst,proto)-tuple and pass every
matching packet regardless. Currently pf doesn't allow for this to
happen statefully and I don't think OpenBSD is going to make that change,
ever. If you think of pf as a security first and foremost mechanism this
makes sense. I'm also somewhat reluctant to make that change in FreeBSD,
otoh there are cases where you'd want that rope.

--
/"\ Best regards, | mlaier@freebsd.org
\ / Max Laier | ICQ #67774661
X http://pf4freebsd.love2party.net/ | mlaier@EFnet
/ \ ASCII Ribbon Campaign | Against HTML Mail and News
From: Matthew Dillon <dillon@...> Subject: Re: FairQ ALTQ for PF - Patch #2 Date: Apr 6, 9:26 pm 2008

:> (1) I'm using keep state, not synproxy. Is PF still attempting to
:> do window sequence space comparisons and dropping packets if they do
:> not match? If it is, do you know where in the code that is
:> (I've been staring at it a while trying to find just such a
:> comparison but not having a whole lot of luck).
:
:See the attached forward from the pf mailinglist. The referenced paper is
:a good read, too.

(reading that right now)

:> and it drops some of its state, won't those flows wind up being
:> left out of the state code from that point on? They would not be
:> identifiable to the fairq code, then, which would be a fairly
:> significant problem.
:
:Usually you won't drop active states. You'd simply time them out more
:aggressively (see adaptive.{start,end} in pf.conf(5) if your version has
:that already) or not allow a new state to be created.
:...
:It really depends on what you want to achieve. If you are after security
:for a network of clients with bad/broken TCP stacks then leaving out the
:window checks is not a good idea. I can see that there are cases where
:you'd want to check only the (src,dst,proto)-tuple and pass every
:matching packet regardless. Currently pf doesn't allow for this to
:happen statefully and I don't think OpenBSD is going to make that change,
:ever. If you think of pf as a security first and foremost mechanism this
:makes sense. I'm also somewhat reluctant to make that change in FreeBSD,
:otoh there are cases where you'd want that rope.
:
:--
:/"\ Best regards, | mlaier@freebsd.org
:\ / Max Laier | ICQ #67774661

Yah, we have the adaptive.start/end stuff. I think I have a pretty
good handle on the issues now. I understand NetBSD's viewpoint on
connection tracking.

But for my own network I am extremely uncomfortable allowing a router
to drop a good TCP connection, and even more uncomfortable having the
router control timeouts considering that the only way to overcome such
a situation in the face of overload would be to drop the keepalive
timeouts on all my machines down to fairly small values. I don't want
a reboot of my router to blow up the several hundred active TCP
connections from half a dozen servers that are running through it.

At the same time I really want to use the keep-state mechanic to serve
as a basis for caching that hash code for my fairq. I don't want to
roll my own like the WFQ code does... that would be a massive duplication
of work.

I think the solution is to add another flavor of keep state that
is explicitly meant for use with fairq (or fairq-like) mechanisms,
or for middle-of-network routing (versus edge routing), which want
that hash code or want some sort of identification entity for flows.

If I create a 'hash state' keyword that would be fairly obvious
in its function. It would basically operate the same as keep state,
but explicitly omit any checks which cannot be done if the state is
picked up in the middle of the connection.

I definitely want to make fairq portable to other OS's. What do you
think about a 'hash state' keyword? From a coding perspective it's
a little work in parse.y and maybe three or four conditionals in the
TCP state code (to omit the sequence space checks for that case).

-Matt
From: Max Laier <max@...> Subject: Re: FairQ ALTQ for PF - Patch #2 Date: Apr 6, 9:54 pm 2008

On Monday 07 April 2008 03:26:32 Matthew Dillon wrote:
> :> (1) I'm using keep state, not synproxy. Is PF still attempting
> :> to do window sequence space comparisons and dropping packets if they
> :> do not match? If it is, do you know where in the code that is (I've
> :> been staring at it a while trying to find just such a comparison but
> :> not having a whole lot of luck).
> :
> :See the attached forward from the pf mailinglist. The referenced
> : paper is a good read, too.
>
> (reading that right now)
>
> :> and it drops some of its state, won't those flows wind up being
> :> left out of the state code from that point on? They would not be
> :> identifiable to the fairq code, then, which would be a fairly
> :> significant problem.
> :
> :Usually you won't drop active states. You'd simply time them out more
> :aggressively (see adaptive.{start,end} in pf.conf(5) if your version
> : has that already) or not allow a new state to be created.
> :...
> :It really depends on what you want to achieve. If you are after
> : security for a network of clients with bad/broken TCP stacks then
> : leaving out the window checks is not a good idea. I can see that
> : there are cases where you'd want to check only the
> : (src,dst,proto)-tuple and pass every matching packet regardless.
> : Currently pf doesn't allow for this to happen statefully and I don't
> : think OpenBSD is going to make that change, ever. If you think of pf
> : as a security first and foremost mechanism this makes sense. I'm
> : also somewhat reluctant to make that change in FreeBSD, otoh there
> : are cases where you'd want that rope.
> :
> :--
> :/"\ Best regards, | mlaier@freebsd.org
> :\ / Max Laier | ICQ #67774661
>
> Yah, we have the adaptive.start/end stuff. I think I have a pretty
> good handle on the issues now. I understand NetBSD's viewpoint on
> connection tracking.
>
> But for my own network I am extremely uncomfortable allowing a
> router to drop a good TCP connection, and even more uncomfortable
> having the router control timeouts considering that the only way to
> overcome such a situation in the face of overload would be to drop the
> keepalive timeouts on all my machines down to fairly small values. I
> don't want a reboot of my router to blow up the several hundred active
> TCP connections from half a dozen servers that are running through it.
>
> At the same time I really want to use the keep-state mechanic to
> serve as a basis for caching that hash code for my fairq. I don't want
> to roll my own like the WFQ code does... that would be a massive
> duplication of work.

Agreed. The code in WFQ is historical when there was altqd and /dev/altq
and the altq_classifier. pf (or any firewall for that matter) really is
the place to do the classification.

> I think the solution is to add another flavor of keep state that
> is explicitly meant for use with fairq (or fairq-like) mechanisms,
> or for middle-of-network routing (versus edge routing), which want
> that hash code or want some sort of identification entity for
> flows.
>
> If I create a 'hash state' keyword that would be fairly obvious
> in its function. It would basically operate the same as keep
> state, but explicitly omit any checks which cannot be done if the state
> is picked up in the middle of the connection.
>
> I definitely want to make fairq portable to other OS's. What do
> you think about a 'hash state' keyword? From a coding perspective it's
> a little work in parse.y and maybe three or four conditionals in the
> TCP state code (to omit the sequence space checks for that case).

I think "reduced state tracking" and the fairq are orthogonal. You can
have either independent of each other. If I were to do reduced states,
I'd probably make it a "state-opt" (see pf.conf(5) BNF) so that it could
be applied to any keep state rule with various effects. This way you
could even do modulate state or synproxy state as long as you see the
initial SYN. If not, you fall back to creating a reduced state. This
option would, of course, also have a setting where it would always just
create a reduced state and be done with it.

As for the name ... maybe, 'extra-tcp-state' with a possible setting
of 'on' (default), 'off' and 'force-off' or something like that. This
could also be a global setting similar to the timeouts which can also be
set on a per-rule basis.

--
/"\ Best regards, | mlaier@freebsd.org
\ / Max Laier | ICQ #67774661
X http://pf4freebsd.love2party.net/ | mlaier@EFnet
/ \ ASCII Ribbon Campaign | Against HTML Mail and News
From: Matthew Dillon <dillon@...> Subject: Re: FairQ ALTQ for PF - Patch #2 Date: Apr 7, 3:50 am 2008

:...
:could even do modulate state or synproxy state as long as you see the
:initial SYN. If not, you fall back to creating a reduced state. This
:option would, of course, also have a setting where it would always just
:create a reduced state and be done with it.
:
:As for the name ... maybe, 'extra-tcp-state' with a possible setting
:of 'on' (default), 'off' and 'force-off' or something like that. This
:could also be a global setting similar to the timeouts which can also be
:set on a per-rule basis.
:
:\ / Max Laier | ICQ #67774661

I came across an interesting item. I believe (but I'm not entirely
sure if I am correct) that NetBSD implies S/SA for TCP keep
state and it no longer needs to be specified in the rule. Is this
correct? It makes sense since keep state is completely broken for
TCP if S/SA isn't specified sans the type of augmentation we've been
discussing.

With that in mind here is my proposed state_opt_item feature. I am
soliciting opinions on the feature:

[additions to state_opt_item]

pickups
    Specify that mid-stream pickups are to be allowed. The default
    is to NOT allow mid-stream pickups and implies flags S/SA for TCP
    connections. If pickups are enabled, flags S/SA are not implied
    for TCP connections and state can be created for any packet.

    The implied flags parameters need not be specified in either case
    unless you explicitly wish to override them, which also allows
    you to roll-up several protocols into a single rule.

    Certain validations are disabled when mid-stream pickups occur.
    For example, the window scaling options are not known for
    TCP pickups and sequence space comparisons must be disabled.

    This does not affect state representing fully quantified
    connections (for which the SYN/SYN-ACK passed through the routing
    engine). Those connections continue to be fully validated.

nopickups
    Specify that mid-stream pickups are not to be allowed. This is the
    default and this keyword does not normally need to be specified.
    However, if you are concerned about rule set portability then
    specifying this keyword guarantees flags S/SA for TCP connections,
    and pfctl generates a parse-time error if it doesn't understand the
    feature.

hashonly
    Implies pickups and maintains a state table entry but disables
    most validations whether or not the connection has been fully
    quantified. This feature is used if you do not wish to
    validate connection state, for example for a router operating in the
    center of a large network where such validations would be impossible
    to maintain.

    However, even though such validations may not be desired you may
    still require keep state for the purposes of driving the FAIRQ
    ALTQ. FAIRQ depends on keep state to generate the hash codes
    identifying the buckets in which it should place packets.

    You might also want to use this feature to identify high-bandwidth
    connections via the state table for analysis purposes, even at
    the center of a large network.

-Matt
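The proposal above boils down to two decisions made when the first packet of a TCP flow reaches a rule: whether a state entry is created, and whether sequence-space validation applies to it. The Python table below is one reading of the proposed text, not code from any PF implementation. `Pickup` and `state_decision` are hypothetical names, and "validation" is reduced to a single boolean even though the proposal says hashonly disables "most" validations.

```python
from enum import Enum

TH_SYN, TH_ACK = 0x02, 0x10  # standard TCP header flag bits


class Pickup(Enum):
    NOPICKUPS = "nopickups"  # proposed default: flags S/SA implied
    PICKUPS = "pickups"      # state may be created from any packet
    HASHONLY = "hashonly"    # implies pickups, validation disabled


def state_decision(option, tcp_flags):
    """Return (create_state, validate_sequence) for the first packet seen of a flow."""
    initial_syn = (tcp_flags & (TH_SYN | TH_ACK)) == TH_SYN
    if option is Pickup.NOPICKUPS:
        # Only an initial SYN creates state, and that state is fully validated.
        return (initial_syn, initial_syn)
    if option is Pickup.PICKUPS:
        # Always create state; validate only if the handshake was seen.
        return (True, initial_syn)
    # HASHONLY: keep an entry (for the fairq hash) but skip validation.
    return (True, False)
```

Under this reading, only `pickups` and `hashonly` let a flow that is already in progress when the router reboots regain a state entry, and with it a fairq bucket.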
From: Cédric Berger <cedric@...> Subject: Re: FairQ ALTQ for PF - Patch #2 Date: Apr 7, 9:09 am 2008

Matthew Dillon wrote:
> :...
> :could even do modulate state or synproxy state as long as you see the
> :initial SYN. If not, you fall back to creating a reduced state. This
> :option would, of course, also have a setting where it would always just
> :create a reduced state and be done with it.
> :
> :As for the name ... maybe, 'extra-tcp-state' with a possible setting
> :of 'on' (default), 'off' and 'force-off' or something like that. This
> :could also be a global setting similar to the timeouts which can also be
> :set on a per-rule basis.
> :
> :\ / Max Laier | ICQ #67774661
>
> I came across an interesting item. I believe (but I'm not entirely
> sure if I am correct) that NetBSD implies S/SA for TCP keep
> state and it no longer needs to be specified in the rule. Is this
> correct?

Yes, quoting http://www.openbsd.org/faq/pf/filter.html:

In OpenBSD 4.1 and later, the default flags S/SA are applied to all TCP
filter rules.

Since OpenBSD 4.1, "keep state" is also the default.

Cedric
From: Matthew Dillon <dillon@...> Subject: Re: FairQ ALTQ for PF - Patch #2 Date: Apr 7, 11:05 am 2008
:Yes, quoting http://www.openbsd.org/faq/pf/filter.html:
:
:In OpenBSD 4.1 and later, the default flags S/SA are applied to all TCP
:filter rules.
:
:Since OpenBSD 4.1, "keep state" is also the default.
:
:Cedric

I found the code. NetBSD doesn't seem to have adopted that change.
I'm not sure I want to adopt the keep state by default on pass
rules but S/SA clearly must be adopted and its default modified by
the new options (i.e. S/SA set by default (also for 'nopickups'),
and not set if 'pickups' or 'hashonly', since we want to pick up the
stream in the middle for the latter two).

Some of this stuff is starting to look a little overboard. I can see
having keep state on as a default if it didn't have such an adverse
effect on existing TCP streams on reboot, but it does and because it
does I don't think I want it turned on as a default in DragonFly.

Or, alternatively, we could turn it on by default in DragonFly but
as 'hashonly' unless a keep state directive is explicitly specified
in the rule. But then issues pop up where the administrator might not
have wanted keep state for everything due to extreme volumes and doing
that could blow out the areas he DID want keep state on. So, right now,
I'm inclined not to turn on keep state by default if it isn't specified
in the rule.

-Matt
Matthew Dillon
From: Matthew Dillon <dillon@...> Subject: Re: FairQ ALTQ for PF - Patch #2 Date: Apr 7, 2:42 pm 2008
:I concur. Keep state should be explicit. Furthermore, I don't expect
:keep state not to work across reboots. That's why I then write keep
:state flags S/SA. Something clearly need to be untangled here. Keep
:state should keep state as good as possible, but not reject connections.
:
:cheers
: simon

I figured out another reason why linux boxes couldn't connect to me.
I wasn't running keep state on incoming traffic, only outgoing. That
means the keep state didn't have the initial SYN packet from an
outside host making a connection into me. No initial SYN, no window
scaling info.

My current pickup check is not quite sufficient, either. I have to
check that the SYN was observed in both directions. Seeing just one
of the SYNs may not be enough. I'll have to re-read the window scaling
rules.

Max, or anyone... do you happen to remember whether window scaling
is negotiated the same for both directions or whether each direction
in a TCP connection can use a different scaling factor?

-Matt
Matthew Dillon
From: Max Laier <max@...> Subject: Re: FairQ ALTQ for PF - Patch #2 Date: Apr 7, 3:32 pm 2008
On Monday 07 April 2008 20:42:08 Matthew Dillon wrote:
> :I concur. Keep state should be explicit. Furthermore, I don't expect
> :keep state not to work across reboots. That's why I then write keep
> :state flags S/SA. Something clearly need to be untangled here. Keep
> :state should keep state as good as possible, but not reject
> : connections.
> :
> :cheers
> : simon
>
> I figured out another reason why linux boxes couldn't connect to
> me.
>
> I wasn't running keep state on incoming traffic, only outgoing.
> That means the keep state didn't have the initial SYN packet from an
> outside host making a connection into me. No initial SYN, no window
> scaling info.
>
> My current pickup check is not quite sufficient, either. I have to
> check that the SYN was observed in both directions. Seeing just
> one of the SYNs may not be enough. I'll have to re-read the window
> scaling rules.
>
> Max, or anyone... do you happen to remember whether window scaling
> is negotiated the same for both directions or whether each
> direction in a TCP connection can use a different scaling factor?

The latter, wouldn't make much sense if your peer could dictate a scaling
factor.

The wscale for the other direction is set here:
http://fxr.watson.org/fxr/source/net/pf/pf.c?v=DFBSD#L3810 ff. Note that
this is in the state tracking already, we are looking at the first packet
from src and TH_SYN is set (-> this is the SYN+ACK) from the peer.
dst.wscale was already set when the state was created:
http://fxr.watson.org/fxr/source/net/pf/pf.c?v=DFBSD#L2727 (where src is
the other end sending the initial SYN).

At least this is the way things behave when you have "flags S/SA".
--
/"\ Best regards, | mlaier@freebsd.org
\ / Max Laier | ICQ #67774661
X http://pf4freebsd.love2party.net/ | mlaier@EFnet
/ \ ASCII Ribbon Campaign | Against HTML Mail and News
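To make Max's answer concrete: under the TCP window scale option (RFC 7323), each side advertises its own shift count in its SYN, that shift applies to the windows that side later advertises, and scaling is only in effect if both SYNs carried the option. The C sketch below illustrates that rule only. The `peer` structure and `effective_window` function are hypothetical names for this example and are not the fields or code in pf.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WSCALE_MAX 14 /* largest shift count permitted by RFC 7323 */

/* Hypothetical per-direction record of what one peer's SYN announced. */
struct peer {
    bool    sent_wscale; /* the peer's SYN carried a window scale option */
    uint8_t wscale;      /* shift count that peer advertised */
};

/*
 * Window actually meant by 'sender' when it puts raw_window in a non-SYN
 * segment. The shift is the sender's own; the other side cannot dictate it.
 */
static uint32_t effective_window(const struct peer *sender,
                                 const struct peer *other,
                                 uint16_t raw_window)
{
    assert(sender != NULL && other != NULL);

    /* Scaling is off for the whole connection unless both SYNs offered it. */
    if (!sender->sent_wscale || !other->sent_wscale)
        return raw_window;

    uint8_t shift = sender->wscale > WSCALE_MAX ? WSCALE_MAX : sender->wscale;
    return (uint32_t)raw_window << shift;
}
```

This is why a state entry that missed either SYN cannot interpret the window field reliably: it lacks one side's shift count, or cannot tell whether scaling is enabled at all.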
From: Matthew Dillon <dillon@...> Subject: Re: FairQ ALTQ for PF - Patch #2 Date: Apr 7, 5:30 pm 2008
:The latter, wouldn't make much sense if your peer could dictate a scaling
:factor.
:
:The wscale for the other direction is set here:
:http://fxr.watson.org/fxr/source/net/pf/pf.c?v=DFBSD#L3810 ff. Note that
:this is in the state tracking already, we are looking at the first packet
:from src and TH_SYN is set (-> this is the SYN+ACK) from the peer.
:dst.wscale was already set when the state was created:
:http://fxr.watson.org/fxr/source/net/pf/pf.c?v=DFBSD#L2727 (where src is
:the other end sending the initial SYN).
:
:At least this is the way things behave when you have "flags S/SA".
:
:\ / Max Laier | ICQ #67774661

Got it. Oooh, that's nasty. It's confirming that the SYN is for
the other direction by testing the seqlo variable, which is non-zero
on the direction that already got the SYN, and zero on the direction
that hasn't. That code comment deserves to be expanded a bit :-)

Here's a new patch, changing the one SYN detect flag into two flags
and setting them in the proper places. 'pfctl -s state -v -v' now
reports three possible states: 'indeterminate', 'incomplete', and
'good'.

fetch http://apollo.backplane.com/DFlyMisc/pickups02.patch

I did some quick testing and all three states appear to work properly,
so if someone forgets to 'keep state' in both directions the state
output will say 'incomplete' instead of 'good'.

-Matt
Matthew Dillon
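One plausible reading of the two-flag scheme, sketched in C. The flag names and the exact mapping are assumptions: the message only says that seeing the SYN in one direction yields 'incomplete' rather than 'good', so treating "neither SYN seen" as 'indeterminate' is a guess for illustration, not a description of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits, one per direction in which a SYN was observed. */
#define SYN_SEEN_FWD 0x01u
#define SYN_SEEN_REV 0x02u

enum pickup_status {
    PICKUP_INDETERMINATE, /* assumed: no SYN seen, e.g. a mid-stream pickup */
    PICKUP_INCOMPLETE,    /* SYN seen in only one direction */
    PICKUP_GOOD           /* SYN seen in both directions */
};

/* Classify a state entry from its SYN-seen flags. */
static enum pickup_status classify_pickup(unsigned flags)
{
    unsigned seen = flags & (SYN_SEEN_FWD | SYN_SEEN_REV);

    if (seen == (SYN_SEEN_FWD | SYN_SEEN_REV))
        return PICKUP_GOOD;
    if (seen != 0)
        return PICKUP_INCOMPLETE;
    return PICKUP_INDETERMINATE;
}

/* Label matching the three words quoted from the pfctl output. */
static const char *pickup_status_name(enum pickup_status s)
{
    assert(s <= PICKUP_GOOD);
    switch (s) {
    case PICKUP_GOOD:       return "good";
    case PICKUP_INCOMPLETE: return "incomplete";
    default:                return "indeterminate";
    }
}
```

Only the 'good' case has both window scale factors available, which ties back to the per-direction negotiation discussed earlier in the thread.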
From: Matthew Dillon <dillon@...> Subject: FairQ ALTQ for PF - Patch #3 Date: Apr 9, 2:27 pm 2008
Ok, here is patch #3. This is the final patch short of bug fixes:
fetch http://apollo.backplane.com/DFlyMisc/pickups03.patch
* Added set keep-policy to set the default stateful inspection policy.
* Removed NetBSD's window scale patch.

After playing with keep state for the last few days I understand now
why OpenBSD made it the default. I wound up having to put it on every
single pass rule I had on my router. However, I continue to believe quite
strongly that keep state w/ flags S/SA is an inappropriate default due
to the adverse effect it has on pre-existing TCP connections, so I
wanted to come up with a solution that would be acceptable to projects
that might have a different opinion.

I came up with set keep-policy in your pf.conf. For example:
set keep-policy keep state (pickups)
This will cause all pass rules to use the specified policy by default,
so it does not have to be specified for each rule.

The policy can be overridden in each rule. I implemented the OpenBSD
'no keep' feature as well so it can also be turned off. I did not
see a similar feature to my 'set keep-policy' in OpenBSD.

I think this is the best solution. This way the fact that stateful
inspection is being used is explicitly specified in the pf.conf,
which should satisfy everyone, plus additional features such
as 'pickups' can be specified cleanly.

Unless something comes up I am going to commit this to DragonFly
on Friday and call it done. I would be pleased if other projects
picked up some or all of the work. Max, if you make fixes or further
enhancements to this for any porting you do to FreeBSD could you give
me a heads up? I'd like to keep them in sync at least for a little
while.

-Matt
From: Matthew Dillon <dillon@...> Subject: Re: FairQ ALTQ for PF - Patch #3 Date: Apr 9, 2:40 pm 2008
Er, in case it wasn't obvious from the content, that's PICKUPS patch #3,
not ALTQ patch #3. I borrowed the wrong Subject line.

-Matt
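The precedence described for set keep-policy (a per-rule setting wins, then the global default, otherwise no state) can be sketched in C as follows. The enum and function names are hypothetical and this is not code from the patch. The S/SA helper reflects the intent stated earlier in the thread: S/SA by default, including for 'nopickups', and not for 'pickups' or 'hashonly'. Whether the committed code behaves exactly this way is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical encoding of what a rule, or set keep-policy, may specify. */
enum keep_policy {
    KEEP_UNSET,     /* nothing specified */
    KEEP_NONE,      /* explicit 'no keep' */
    KEEP_NOPICKUPS, /* keep state, mid-stream pickups not allowed */
    KEEP_PICKUPS,   /* keep state (pickups) */
    KEEP_HASHONLY   /* keep state (hashonly) */
};

/* Resolve the policy a pass rule ends up with. */
static enum keep_policy effective_policy(enum keep_policy rule,
                                         enum keep_policy set_default)
{
    assert(rule <= KEEP_HASHONLY && set_default <= KEEP_HASHONLY);

    if (rule != KEEP_UNSET)
        return rule;        /* per-rule setting, including 'no keep', wins */
    if (set_default != KEEP_UNSET)
        return set_default; /* otherwise fall back to set keep-policy */
    return KEEP_NONE;       /* nothing specified anywhere: stay stateless */
}

/* Assumed from the thread: only the no-pickups form implies flags S/SA. */
static bool implies_flags_s_sa(enum keep_policy p)
{
    return p == KEEP_NOPICKUPS;
}
```

With the example line from the message as the global default, a rule that says nothing resolves to `KEEP_PICKUPS`, while a rule carrying 'no keep' stays stateless.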

Something similar on Linux?
Should be possible already
Should be possible already if you attach ESFQ to queues managed by HFSC
Traffic shaping
The linux equivalent of altq is the traffic shaping framework.
I'm not sure about the altq patch being discussed here, but the description of it sounds similar to the 'sfq' queue in linux.
Traffic shaping rocks, it really should get used more...
regular sfq user
i use sfq in my router pc at home. 3 years and counting.
fairq for freebsd
I hope that somebody will pick this up for freebsd, this is an additional feature that i _really_ want to get in the tree.