All:
I see that IFCAP_VLAN_MTU is available, but IFCAP_VLAN_HWTAGGING, as seen
in ti(4), is absent in em(4). Version 6.6.6 of em(4) elsewhere is
promising TOE (TCP Segment Offload) and already supports a "vlanhwtag" and
"jumbo frames".

For VLAN routing security boxes, this would be a big plus for a lot of
embedded SBCs that support only integrated Intel NICs.

Two such units that I've been looking at:
http://www.nycbug.org/?NAV=dmesgd;f_dmesg=;f_bsd=;f_nick=;f_descr=;dmesg...
http://www.nycbug.org/?NAV=dmesgd;f_dmesg=;f_bsd=;f_nick=;f_descr=;dmesg...l8*
-lava (Brian A. Seklecki - Pittsburgh, PA, USA)
http://www.spiritual-machines.org/
"this would be a big plus"?
sez who?
I say it doesn't make any difference at all.
now it is up to you to prove me wrong :)
(and I think you won't be able to)
--
Henning Brauer, hb@bsws.de, henning@openbsd.org
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting - Hamburg & Amsterdam
On Wed, 17 Oct 2007 10:52:34 +0200
Says my assumption that, unlike TOE, VLAN hardware tagging has been well accepted and established for some time. But only an objective test can tell...
I've been looking for details on an F/OSS L2-L4 Performance Benchmark System.
Someone on the lists recently posted results of tests using IXIA.
I don't know where I'm going to find the time, but I will eventually have a go at this.
--
Brian A. Seklecki <bseklecki@collaborativefusion.com>

IMPORTANT: This message contains confidential information and is intended only for the individual named. If the reader of this message is not an intended recipient (or the individual responsible for the delivery of this message to an intended recipient), please be advised that any re-use, dissemination, distribution or copying of this message is prohibited. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system.
One box with two em interfaces.
One 175kpps stream in each direction, 64 byte frame size.

Both interfaces with normal config = ~20% idle cpu
One interface converted to VLAN trunk with one vlan = ~15% idle cpu

No packet loss.
Just a 5 minute quick test, nothing too scientific.
/Tony
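
For scale, the 175kpps figure in the test above can be compared with the theoretical frame rate of an Ethernet link. The following is a minimal sketch, assuming the quoted 64 byte frame size includes the FCS and that each frame additionally costs 8 bytes of preamble/SFD and 12 bytes of inter-frame gap on the wire (the standard Ethernet values). The function names are illustrative only and do not come from the thread.

```python
# Per-frame overhead on the wire, in bytes (standard Ethernet values).
PREAMBLE_AND_SFD = 8
INTERFRAME_GAP = 12


def wire_bits(frame_bytes):
    """Bits one frame occupies on the wire, including preamble and gap."""
    return (frame_bytes + PREAMBLE_AND_SFD + INTERFRAME_GAP) * 8


def line_rate_pps(link_bps, frame_bytes):
    """Maximum frames per second a link can carry at a given frame size."""
    return link_bps / wire_bits(frame_bytes)


def wire_bps(pps, frame_bytes):
    """Link bandwidth consumed by a stream of `pps` frames per second."""
    return pps * wire_bits(frame_bytes)


if __name__ == "__main__":
    tested_pps = 175_000   # one stream, as reported in the test
    frame = 64             # bytes, assumed to include the FCS
    print(f"stream uses {wire_bps(tested_pps, frame) / 1e6:.1f} Mbit/s on the wire")
    for name, bps in (("100M", 100e6), ("1G", 1e9)):
        max_pps = line_rate_pps(bps, frame)
        print(f"{name}: max {max_pps:,.0f} pps, "
              f"tested stream is {tested_pps / max_pps:.1%} of that")
```

Under these assumptions each 175kpps stream is more than a Fast Ethernet link could carry at minimum frame size, but well below gigabit line rate.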
On Thu, 18 Oct 2007 14:16:59 +0100
Thanks! What was your IXIA platform? RHEL with gig interface or an appliance?
~BAS
--
Brian A. Seklecki <bseklecki@collaborativefusion.com>
I used an Optixia XM12, at least that's what I think it's called.
/Tony
I performed some quick additional tests with OpenBSD and vlans just
for the fun of it, although I believe these tests were more about OpenBSD's
performance with lots of interfaces.

If you want an OpenBSD router/firewall with 4000 interfaces don't go for a
low-end CPU =)

http://www.layer17.net/openbsd-test-vlan-quick.html
/Tony
right, we're not too efficient with lots and lots of interfaces. on the
agenda...
--
Henning Brauer, hb@bsws.de, henning@openbsd.org
The box can in its current state be a solid 100M router with 50
interfaces+uplinks on it.

Once I have a few moments free I'll check the impact of pf with urpf and
basic stateless filters enabled. Time to go look for a light sabre for my son.

/Tony
stateless filters? why oh why? they're SLOWER than stateful. far.
--
Henning Brauer, hb@bsws.de, henning@openbsd.org
Stateful filters on an internet router do not seem like a very good
idea to me. Traffic may exit and enter on different devices, it is another
limited resource, and it adds another layer of complexity.

/Tony
well, we need a knob for lose state tracking to allow these asymmetric
routing scenarios, it is on my agenda.
otherwise, either no filter at all or stateful. stateless is poop.
--
Henning Brauer, hb@bsws.de, henning@openbsd.org
What will happen when the limit of maximum concurrent states is reached?
Will it stop forwarding new flows?

/Tony
depends on the way you write your ruleset.
if you do nothing, exactly that happens.
--
Henning Brauer, hb@bsws.de, henning@openbsd.org
An incoming packet is either dropped or not, I don't see how the router can
do nothing.

Besides that, the environment I am looking at is an edge/peering router.
Basic filtering to protect infrastructure and, where possible, prevent
spoofing. I do not consider such an environment a suitable place for a
stateful device, as they normally change behaviour when the limit of states
is exceeded.

A router that has a major performance drop when a certain limit of flows is
reached is something I normally stay away from; a router that stops
forwarding new flows when a flow limit is reached is worse.

That is my reasoning for using stateless filters in my case.
If OpenBSD/pf has a solution that solves these stateful limitations I would
be very interested in understanding it.

/Tony
you misunderstood; if you do nothing to prevent that situation what you
well, you can go stateful up to a certain point and handle stuff above
stateless (better than dropping), like

pass out on X from $foo
pass in on X to $foo
pass out on X from $foo keep state (max 10000)
--
Henning Brauer, hb@bsws.de, henning@openbsd.org
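
The fallback described in the message above can be illustrated with a toy model. This is not pf code and does not reproduce pf's rule evaluation, timeouts or direction handling; it only sketches the behaviour under the assumption that a rule with a state cap stops creating states once the cap is reached, so a new flow is then handled by whatever stateless rule also matches. The class name `ToyFilter`, the `fallback` parameter and the flow labels are hypothetical.

```python
class ToyFilter:
    """Toy packet filter: capped state table plus an optional stateless fallback."""

    def __init__(self, max_states, fallback="drop"):
        self.max_states = max_states   # analogous to "keep state (max N)"
        self.fallback = fallback       # "pass" if a stateless pass rule exists
        self.states = set()

    def handle(self, flow):
        # Packets of an established flow always match their state first.
        if flow in self.states:
            return "state hit"
        # New flow: create a state while there is room in the table.
        if len(self.states) < self.max_states:
            self.states.add(flow)
            return "new state"
        # Table full: fall through to the stateless rule, or drop.
        return "stateless pass" if self.fallback == "pass" else "drop"


if __name__ == "__main__":
    with_fallback = ToyFilter(max_states=2, fallback="pass")
    no_fallback = ToyFilter(max_states=2)          # the "do nothing" case
    for flow in ("a", "b", "c", "a"):
        print(flow, with_fallback.handle(flow), "|", no_fallback.handle(flow))
```

In both variants the established flows keep working once the table is full; the variants differ only in what happens to new flows above the cap.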
To design a reliable IP network I would need the devices to be able to
handle the desired pps rate even when that state limit is exceeded.

Many routing devices have over the years achieved good performance by
different flow caching methods; we have over the years also learnt that
this is a bad thing in uncontrolled environments like the Internet.

A reliable IP router is wirespeed and stateless. There is no getting around
that. In my case I would verify that the box is wirespeed in the environment
I put it in; the fact that it can be faster under certain conditions is less
interesting.

/Tony
no, that is entirely bullshit, sorry.
if flow caching allows your device to work more efficiently in the usual
case, hey, excellent, you would be dumb to not use it.

this does NOT save you from either leaving enough headroom that you can
handle the packet rate when exceeding your state limit or at least

oh really.
I say it is bullshit.
there is no single wirespeed in all circumstances router on the market,
not even for fast ethernet. that is a marketing gag. a 10 MBit/s stream
of correctly and purposefully crafted packets brings each and every
router you can buy to its knees. if it works like an OpenBSD machine
with stateful filters which prefers established states in the overload
case, it doesn't suffer as badly as the stateless ones.
--
Henning Brauer, hb@bsws.de, henning@openbsd.org
No contradiction. If the requirement is to be wirespeed the forwarding

A Cisco 6509 SUP1A/MSFC2 could do around 10Mpps under normal conditions,
but not even 500kpps when flow count exceeded what it could handle
in hardware. Good boxes for the internal network, horrible for the
datacenter or internet core/edge.

Are the 10Mpps they can do relevant if the policy states all devices
should be wirespeed? If we were to use them would we enable the

Are you officially stating that the added complexity of stateful forwarding
does not increase the likelihood of unpredictable behaviour?

Something as simple as being able to forward packets independently
of the source/destination pattern and protocol hardly qualifies as
the specific/unknown case where you can make an 80Mpps-per-line-card
CRS-1 not even forward 10Mbps.

OpenBSD once shipped with a remote root compromise, this was addressed.
When we find new scenarios that can prevent the routers from performing
as expected we try to address that. There will always be unknown corner
cases showing up, and that we need to handle. We do not give up
and go out and buy Ford Pintos just because there is a possibility of
a new Mercedes blowing up from a slight nudge from behind.

No need to get aggressive, Henning.
I don't agree with you. I say that a stateless device in general is more
reliable than a stateful one.

Regards Tony
with the amount of states you can handle, I don't think it is a limit
very relevant in practice. Or, in other words, if you need to handle so
many more flows than we can handle statefully, you are at a point where

and I bet I can make up a 10 or maybe 100 Kpps stream that makes it fall

which is totally independent from specific implementations. this is

and I say that is totally poop. It is a marketing lie.
--
Henning Brauer, hb@bsws.de, henning@openbsd.org
I didn't get that opinion from marketing.
No matter, we disagree, let's leave it at that.

/Tony
well, yeah, nonetheless, I wanna point out the essence why stateful is
better (the way we do it in OpenBSD):

1) it moves the limit where the box starts to suffer from overload quite
far, or, in other words, the box can handle a much larger amount of
traffic before it starts to drop stuff. thus it can withstand bigger
amounts of (D)DoS too.
2) once it gets to that point, it is more selective in dropping packets
than a stateless box, as it prefers established connections. this
behaviour cannot be valued enough in (D)DoS type of situations.
--
Henning Brauer, hb@bsws.de, henning@openbsd.org
I wish to implement things in a way where the link is the limitation,
not the box. But there is no point in re-doing that discussion.

When I have some time free I'll test it in the lab to see that difference in
behaviour. Any ideas of when you will get around to handling asymmetric
traffic in a stateful way?

/Tony
I know very little, but I would like to note that some providers (
http://www.rayservers.com/ddos-protection ) deploy OpenBSD with the
express purpose of offering DDoS protection. That has to count for
something.

OTOH, Henning's word alone would be enough for me, because AFAIK
Henning wrote actual pertinent code and knows darn friggin well what
he's talking about. Did you contribute as much code to OpenBSD/pf as
Henning? Are you sure your understanding is deeper than his? (No
offense, by the way, all in good humour.)

Cheerio,
--ropers
Henning has committed more code than me. If you count in percent,
infinitely more. Does that mean that I don't know what I'm talking about?

I use OpenBSD because I like it, I think it is the best project I can find
on the net.
I don't believe a fan-boy attitude is an asset to the project, and that is
what you are contributing right now.

This is a view of an external peering link where I work now:
5 minute input rate 6165205000 bits/sec, 1036946 packets/sec
5 minute output rate 3134466000 bits/sec, 1000242 packets/sec
One link out of many, no DDoS going on. Maybe I should stick a rayserver on
it.

Correct me if I'm wrong, but Henning needs someone to argue with him and
pester him.

/Tony
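
The two counter lines quoted above also give the average packet size on that link, which is useful when comparing against the 64 byte test frames used earlier in the thread. A minimal sketch, assuming the bit counters and packet counters cover the same traffic and the same interval (exactly which header bytes the router includes in its bit count is not stated in the thread); the function name is illustrative:

```python
def avg_packet_bytes(bits_per_sec, packets_per_sec):
    """Average packet size in bytes implied by a bit rate and a packet rate."""
    return bits_per_sec / 8 / packets_per_sec


# Figures copied from the interface counters quoted in the message above.
inbound = avg_packet_bytes(6_165_205_000, 1_036_946)
outbound = avg_packet_bytes(3_134_466_000, 1_000_242)

if __name__ == "__main__":
    print(f"input:  about {inbound:.0f} bytes per packet")
    print(f"output: about {outbound:.0f} bytes per packet")
```

Both averages come out at several hundred bytes per packet, so the link carries about a million packets per second in each direction at sizes far above the 64 byte worst case.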
as I said before, you cannot buy a box that can handle 100M under all

if you keep pestering me, quickly, i keep forgetting it :)

lose (or loose? i keep mixing up) it'll be
--
Henning Brauer, hb@bsws.de, henning@openbsd.org
I'll throw this out there since it's been something on my mind for a
while:

Hardware VLAN tagging, TOE offload, IP/UDP/TCP Checksum offload,
interface polling are all ways to accelerate packet forwarding. How
about a standards-based hardware-software API equivalent to Cisco's
"CEF" or "MLS"?

The basics:
- layer 3 or layer 4 state ("flow") is identified and established using
software IP-forwarding.
- the software dynamically programs the switching hardware backplane
ASIC to accelerate forwarding the "flow" without further software
inspection (including Fragment Reassembly, etc.)

There is probably a huge market out there for commodity standards-based
hardware (if it could be done)

~BAS
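
The two "basics" listed above amount to a flow cache in front of a slow forwarding path. The following is a software-only toy sketch of that split, not Cisco's CEF or MLS and not any real hardware API: the first packet of a flow goes through a slow-path lookup, the result is stored under the flow key, and later packets of the same flow are answered from the table. All names (`FlowCache`, `toy_route`, the interface names and addresses) are hypothetical.

```python
from typing import Callable, Dict, Tuple

# Flow key: (source address, destination address, protocol, source port, destination port)
FlowKey = Tuple[str, str, int, int, int]


class FlowCache:
    """Toy fast path: caches the slow path's forwarding decision per flow."""

    def __init__(self, slow_path: Callable[[FlowKey], str]):
        self.slow_path = slow_path
        self.table: Dict[FlowKey, str] = {}
        self.slow_hits = 0
        self.fast_hits = 0

    def forward(self, key: FlowKey) -> str:
        out_if = self.table.get(key)
        if out_if is not None:
            # Known flow: answered from the table, no further inspection.
            self.fast_hits += 1
            return out_if
        # Unknown flow: full software lookup, then "program" the table.
        out_if = self.slow_path(key)
        self.table[key] = out_if
        self.slow_hits += 1
        return out_if


def toy_route(key: FlowKey) -> str:
    """Stand-in for a routing lookup: 10.x destinations go out em1."""
    return "em1" if key[1].startswith("10.") else "em0"


if __name__ == "__main__":
    cache = FlowCache(toy_route)
    flow = ("192.0.2.1", "10.0.0.5", 6, 40000, 80)
    for _ in range(3):
        print(cache.forward(flow))
    print("slow path:", cache.slow_hits, "fast path:", cache.fast_hits)
```

The sketch also shows the trade-off raised elsewhere in the thread: once a flow is in the table, per-packet checks in the slow path are skipped, and traffic made of many new flows never benefits from the fast path.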
not exactly a new idea. have a diff? :)
it is incredibly hard. we're slowly moving into a direction where this
becomes easier. slowly.
--
Henning Brauer, hb@bsws.de, henning@openbsd.org
We have hardware VLAN tagging support on many interfaces.
TOE helps not a single bit on routers and I don't trust TOE. Just think
about it: TOE is a TCP/IP stack in HW. With every network card generation
we get new features: DMA, IP checksumming, TCP checksumming, and each and
every one of these much simpler functions was cursed with tons of bugs.
I think there are probably 2 network cards that do the checksumming right,
all others have some more or less noticeable bugs in them. So do you think
that the HW designers will create a correct TOE engine?

How about a standards-based hardware-software API equivalent to Cisco's
"CEF" or "MLS"?

standards-based? with cisco? Cisco is not even able to follow standards
for easy stuff like VLAN etc.
CEF is a pure software gimmick. MLS needs a Layer-3 capable switch chip
which does all the work with its CAM. If you get me a PCI card with a L3
switching chip on it including a 500k entries CAM plus docu I will write a

Fragment Reassembly does not happen in the forwarding plane, it happens on
the end system. By doing "flow" based forwarding on the router you're no
longer able to do all the additional checks that pf(4) is doing in its

I doubt it, the necessary HW is just too expensive and complex.

and we don't actually need these on a non-edge router. I'd go so far

I totally agree with the statement that there is a huge market for
that - but getting supported, fully working hardware at reasonable
prices for it is indeed a gigantic challenge.
--
Henning Brauer, hb@bsws.de, henning@openbsd.org
I agree.
Just to confirm... you do not encourage the use of fragment reassembly
at forwarding points other than the network periphery?

We recently ran into some intermittent TCP connection stalls in a
network where end point systems were behind as many as three PF systems
end-point to end-point. "pfctl -x loud" had a direct correlation to the
stalls and reassemble debug activity output.

We didn't debug it too much because there was a mix of 3.7, 3.9, and 4.1
systems and we wanted to standardize on 4.2 before filing any
superfluous bug reports.
well, fragment reassembly probably doesn't hurt that much... don't
really think it makes too much sense in these scenarios either. On the
edge, yes, should be done.
I was more thinking about the sequence number tracking. We can't do

i have a hard time to remember what was in 3.7 or 3.9 :)
--
Henning Brauer, hb@bsws.de, henning@openbsd.org
