Hello,

I'm testing an OpenBSD 4.2 firewall with Iperf and I'm seeing very strange behaviour: when I reboot the backup node, the connection rate drops while the backup node is coming back up.

Iperf log:

[ 3] 233.0-234.0 sec 6.62 MBytes 55.5 Mbits/sec
[ 3] 234.0-235.0 sec 6.62 MBytes 55.5 Mbits/sec
[ 3] 235.0-236.0 sec 6.62 MBytes 55.5 Mbits/sec
[ 3] 236.0-237.0 sec 6.70 MBytes 56.2 Mbits/sec
[ 3] 237.0-238.0 sec 288 KBytes 2.36 Mbits/sec
[ 3] 238.0-239.0 sec 3.40 MBytes 28.5 Mbits/sec
[ 3] 239.0-240.0 sec 0.00 Bytes 0.00 bits/sec
[ 3] 240.0-241.0 sec 3.55 MBytes 29.8 Mbits/sec
[ 3] 241.0-242.0 sec 0.00 Bytes 0.00 bits/sec
[ 3] 242.0-243.0 sec 3.49 MBytes 29.3 Mbits/sec
[ 3] 243.0-244.0 sec 0.00 Bytes 0.00 bits/sec
[ 3] 244.0-245.0 sec 3.49 MBytes 29.3 Mbits/sec
[ 3] 245.0-246.0 sec 2.30 MBytes 19.3 Mbits/sec
[ 3] 246.0-247.0 sec 5.23 MBytes 43.9 Mbits/sec
[ 3] 247.0-248.0 sec 2.60 MBytes 21.8 Mbits/sec
[ 3] 248.0-249.0 sec 5.37 MBytes 45.0 Mbits/sec
[ 3] 249.0-250.0 sec 1.28 MBytes 10.7 Mbits/sec
[ 3] 250.0-251.0 sec 4.69 MBytes 39.3 Mbits/sec
[ 3] 251.0-252.0 sec 4.69 MBytes 39.3 Mbits/sec
[ 3] 252.0-253.0 sec 6.62 MBytes 55.5 Mbits/sec
[ 3] 253.0-254.0 sec 6.62 MBytes 55.5 Mbits/sec
[ 3] 254.0-255.0 sec 6.62 MBytes 55.5 Mbits/sec

That drop in throughput happens exactly when the rebooted node is coming back!

Iperf is running between one machine behind one firewall interface and another machine behind a different firewall interface. One machine runs OpenBSD and the other Linux.

Is there any reason for this behaviour? I do not expect the backup node to have any influence on the flow through the active node.

Related to this is a problem with pfsync. Sometimes I get a bad state after the backup firewall comes back, and then Iperf gets totally messed up, sometimes recovering and sometimes not. It makes no difference whether pfsync is configured with multicast or with syncpeer.

Log from the active node: Apr 10 ...
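The stall pattern in the Iperf log above can be picked out mechanically instead of by eye. The following is only a rough sketch: the function names, the 55.5 Mbits/sec baseline and the 50% threshold are illustrative assumptions, not anything iperf itself provides.

```python
import re

# Matches iperf interval lines such as:
#   [ 3] 237.0-238.0 sec 288 KBytes 2.36 Mbits/sec
LINE_RE = re.compile(
    r"\[\s*\d+\]\s+(\d+\.\d+)-\s*(\d+\.\d+)\s+sec\s+"
    r"[\d.]+\s+\w+\s+([\d.]+)\s+(bits|Kbits|Mbits|Gbits)/sec"
)

# Conversion factors from the unit iperf printed to Mbits/sec.
UNIT_TO_MBITS = {"bits": 1e-6, "Kbits": 1e-3, "Mbits": 1.0, "Gbits": 1e3}


def parse_intervals(text):
    """Return a list of (start, end, mbits_per_sec) tuples found in text."""
    intervals = []
    for match in LINE_RE.finditer(text):
        start, end, rate, unit = match.groups()
        intervals.append(
            (float(start), float(end), float(rate) * UNIT_TO_MBITS[unit])
        )
    return intervals


def find_drops(intervals, baseline_mbits, threshold=0.5):
    """Return the intervals whose rate is below threshold * baseline_mbits.

    baseline_mbits and threshold are values chosen by the caller
    (hypothetical defaults, not derived from iperf).
    """
    limit = baseline_mbits * threshold
    return [iv for iv in intervals if iv[2] < limit]


if __name__ == "__main__":
    sample = (
        "[ 3] 236.0-237.0 sec 6.70 MBytes 56.2 Mbits/sec\n"
        "[ 3] 237.0-238.0 sec 288 KBytes 2.36 Mbits/sec\n"
        "[ 3] 239.0-240.0 sec 0.00 Bytes 0.00 bits/sec\n"
    )
    for start, end, rate in find_drops(parse_intervals(sample), 55.5):
        print("drop at %.1f-%.1f sec: %.2f Mbits/sec" % (start, end, rate))
```

Lining up the timestamps of the flagged intervals with the backup node's boot messages would show which stage of its startup coincides with the stalls.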
John,
I ran a test using iperf on an external openbsd system (client) through a carp
firewall to an internal openbsd system (server). All systems are running
OpenBSD v4.2 with the latest patches.
external ---> CARP ---> internal
(iperf -i 1 -t 600 -c carp0) (iperf -s)
I did _not_ see any slow down through the MASTER when I rebooted the BACKUP
server. For example, I started the reboot of the BACKUP at 5 seconds and
the BACKUP finished rebooting at 102 seconds:
[ 3] 1.0- 2.0 sec 81.2 MBytes 681 Mbits/sec
[ 3] 2.0- 3.0 sec 82.3 MBytes 690 Mbits/sec
[ 3] 3.0- 4.0 sec 83.8 MBytes 703 Mbits/sec
[ 3] 4.0- 5.0 sec 86.6 MBytes 727 Mbits/sec -- start reboot
[ 3] 5.0- 6.0 sec 86.8 MBytes 728 Mbits/sec
[ 3] 6.0- 7.0 sec 86.3 MBytes 724 Mbits/sec
[ 3] 7.0- 8.0 sec 82.8 MBytes 695 Mbits/sec
[ 3] 8.0- 9.0 sec 86.7 MBytes 728 Mbits/sec
[ 3] 9.0-10.0 sec 85.8 MBytes 720 Mbits/sec
[ 3] 10.0-11.0 sec 86.1 MBytes 722 Mbits/sec
....cut....
[ 3] 96.0-97.0 sec 83.4 MBytes 699 Mbits/sec
[ 3] 97.0-98.0 sec 82.4 MBytes 692 Mbits/sec
[ 3] 98.0-99.0 sec 81.9 MBytes 687 Mbits/sec
[ 3] 99.0-100.0 sec 84.7 MBytes 710 Mbits/sec
[ 3] 100.0-101.0 sec 83.3 MBytes 699 Mbits/sec
[ 3] 101.0-102.0 sec 83.7 MBytes 702 Mbits/sec -- finished reboot
[ 3] 102.0-103.0 sec 83.3 MBytes 699 Mbits/sec
[ 3] 103.0-104.0 sec 83.6 MBytes 701 Mbits/sec
[ 3] 104.0-105.0 sec 85.3 MBytes 716 Mbits/sec
[ 3] 105.0-106.0 sec 83.4 MBytes 699 Mbits/sec
I also did not see any errors in the logs of either system running iperf
or on the firewalls. The load on the MASTER firewall was around 0.30.
Are the firewalls kernel patched? Are there any hardware failures to
report? Are the firewalls overloaded?
You are welcome to check out some of the "how to's" I have at
http://calomel.org if you need to.
--
Calomel @ http://calomel.org
Open Source Research and Reference
Hello,

This got even more interesting. After reading your email I had the idea to start turning off the various carp interfaces to see what the effect would be.

I have two onboard "Broadcom BCM5704C" nics and an "Intel PRO/1000MT QP (82546GB)" quad nic. One carp is configured on one onboard nic and two others on the quad nic.

I removed the two carps for the quad nic on the backup node and rebooted it a few times. There were no failures in the iperf test (I used a long run time to make sure it was always running across all the tests), which matches your tests and is the normal expected result.

Removing the onboard carp and activating one or both of the quad nic carps gives the failures I reported previously. Without pfsync active on the master node, I get a small failure in the iperf tests while the backup node is coming back. If I activate pfsync, I get the same small failure plus, sometimes, a total mess up of the iperf connection states.

So it seems the problem is happening with the quad nic. I don't see any performance problems with the quad nic, because I left iperf running for 2 days without any problem. CPU usage in interrupts is around 15% and load is 0.20 while doing the tests. The firewall is still not in production, so the only traffic is my test and internet junk being dropped.

The kernel is GENERIC 4.2 without any patches (I don't see any of them as relevant to this problem). I doubt there is a hardware problem, because the same thing happens if I exchange the nodes' roles as master and backup. I can't understand how the backup node can generate these results with a reboot.

While writing this I remembered to do another test. I destroyed the quad nic carps (with ifconfig carpX destroy) and then brought them back with sh /etc/netstart. Iperf keeps running smoothly this time... The master node receives the bulk update requests without any problems. I did this a few times and nothing happened.

Even more weird now!!! Something is being done when those interfaces come up for the first time after the reboot!
Any ideas ...
Is ACPI enabled? -J. On Apr 10, 2008, at 6:07 PM, "openbsd firewall" <openbsdfirewall@gmail.com
Hello,

It's booting with default behaviour so no ACPI enabled. Here's dmesg output for the backup node (master is exactly the same hardware).

Apr 10 17:40:23 bbq /bsd: OpenBSD 4.2 (GENERIC) #375: Tue Aug 28 10:38:44 MDT 2007
Apr 10 17:40:23 bbq /bsd: deraadt@i386.openbsd.org: /usr/src/sys/arch/i386/compile/GENERIC
Apr 10 17:40:23 bbq /bsd: cpu0: Dual-Core AMD Opteron(tm) Processor 1210 HE ("AuthenticAMD" 686-class, 1024KB L2 cache) 1.80 GHz
Apr 10 17:40:23 bbq /bsd: cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,CX16
Apr 10 17:40:23 bbq /bsd: real mem = 2146988032 (2047MB)
Apr 10 17:40:23 bbq /bsd: avail mem = 2068418560 (1972MB)
Apr 10 17:40:23 bbq /bsd: mainbus0 at root
Apr 10 17:40:23 bbq /bsd: bios0 at mainbus0: AT/286+ BIOS, date 02/08/08, BIOS32 rev. 0 @ 0xf0010, SMBIOS rev. 2.4 @ 0xfbb50 (50 entries)
Apr 10 17:40:23 bbq /bsd: bios0: vendor American Megatrends Inc. version "080011 " date 02/08/2008
Apr 10 17:40:23 bbq /bsd: bios0: Supermicro H8SSL-I2
Apr 10 17:40:23 bbq /bsd: pcibios0 at bios0: rev 2.1 @ 0xf0000/0x10000
Apr 10 17:40:23 bbq /bsd: pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xf4d40/176 (9 entries)
Apr 10 17:40:23 bbq /bsd: pcibios0: no compatible PCI ICU found: ICU vendor 0x1166 product 0x0205
Apr 10 17:40:23 bbq /bsd: pcibios0: PCI bus #3 is the last bus
Apr 10 17:40:23 bbq /bsd: bios0: ROM list: 0xc0000/0xb000 0xcb000/0x3000! 0xce000/0x1600 0xcf800/0x1600 0xd1000/0x1000
Apr 10 17:40:23 bbq /bsd: acpi at mainbus0 not configured
Apr 10 17:40:23 bbq /bsd: cpu0 at mainbus0
Apr 10 17:40:23 bbq /bsd: pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
Apr 10 17:40:23 bbq /bsd: ppb0 at pci0 dev 1 function 0 "ServerWorks HT-1000 PCI" rev 0x00
Apr 10 17:40:23 bbq /bsd: pci1 at ppb0 bus 1
Apr 10 17:40:23 bbq /bsd: ppb1 at pci1 dev 13 function 0 "ServerWorks HT-1000 PCIX" rev 0xb2
Apr 10 17:40:23 bbq /bsd: pci2 at ppb1 bus 2
Apr 10 17:40:23 bbq /bsd: ppb2 at pci2 dev 1 function 0 ...
I was implying that you should enable ACPI and try again. -J. On Apr 10, 2008, at 7:08 PM, "openbsd firewall" <openbsdfirewall@gmail.com
Same results with ACPI enabled on both nodes.
Let's see your dmesg with acpi enabled.
---
Jason Dixon
DixonGroup Consulting
http://www.dixongroup.net
Dmesg for backup node:

Apr 11 10:21:34 bbq /bsd: OpenBSD 4.2 (GENERIC) #375: Tue Aug 28 10:38:44 MDT 2007
Apr 11 10:21:34 bbq /bsd: deraadt@i386.openbsd.org: /usr/src/sys/arch/i386/compile/GENERIC
Apr 11 10:21:34 bbq /bsd: cpu0: Dual-Core AMD Opteron(tm) Processor 1210 HE ("AuthenticAMD" 686-class, 1024KB L2 cache) 1.80 GHz
Apr 11 10:21:34 bbq /bsd: cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,CX16
Apr 11 10:21:34 bbq /bsd: real mem = 2146988032 (2047MB)
Apr 11 10:21:34 bbq /bsd: avail mem = 2068418560 (1972MB)
Apr 11 10:21:34 bbq /bsd: mainbus0 at root
Apr 11 10:21:34 bbq /bsd: bios0 at mainbus0: AT/286+ BIOS, date 02/08/08, BIOS32 rev. 0 @ 0xf0010, SMBIOS rev. 2.4 @ 0xfbb50 (50 entries)
Apr 11 10:21:34 bbq /bsd: bios0: vendor American Megatrends Inc. version "080011 " date 02/08/2008
Apr 11 10:21:34 bbq /bsd: bios0: Supermicro H8SSL-I2
Apr 11 10:21:34 bbq /bsd: pcibios0 at bios0: rev 2.1 @ 0xf0000/0x10000
Apr 11 10:21:34 bbq /bsd: pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xf4d40/176 (9 entries)
Apr 11 10:21:34 bbq /bsd: pcibios0: no compatible PCI ICU found: ICU vendor 0x1166 product 0x0205
Apr 11 10:21:34 bbq /bsd: pcibios0: PCI bus #3 is the last bus
Apr 11 10:21:34 bbq /bsd: bios0: ROM list: 0xc0000/0xb000 0xcb000/0x3000! 0xce000/0x1600 0xcf800/0x1600 0xd1000/0x1000
Apr 11 10:21:34 bbq /bsd: acpi0 at mainbus0: rev 0
Apr 11 10:21:34 bbq /bsd: acpi0: tables DSDT FACP APIC OEMB
Apr 11 10:21:34 bbq /bsd: acpitimer at acpi0 not configured
Apr 11 10:21:34 bbq /bsd: acpiprt0 at acpi0: bus 0 (PCI0)
Apr 11 10:21:34 bbq /bsd: acpiprt1 at acpi0: bus 1 (P0P1)
Apr 11 10:21:34 bbq /bsd: acpiprt2 at acpi0: bus 2 (P1P2)
Apr 11 10:21:34 bbq /bsd: acpicpu at acpi0 not configured
Apr 11 10:21:34 bbq /bsd: acpicpu at acpi0 not configured
Apr 11 10:21:34 bbq /bsd: acpibtn at acpi0 not configured
Apr 11 10:21:34 bbq /bsd: acpibtn at acpi0 not configured
Apr 11 10:21:34 bbq /bsd: cpu0 at mainbus0
Apr 11 10:21:34 bbq ...
Hello,

Some news about this... If I change the vhid on the backup node, the problem doesn't occur, since the ARP entry for the master node is still in cache and the backup node now has a different MAC address on its carp interfaces. Of course, changing both vhid and IP doesn't give any trouble at all.

It seems the backup node is messing with ARP (maybe at switch level???) when it's coming back! All switches are Cisco 2900 and 3500. Is there any recommended configuration for these switches?

Thanks,
John
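For background on why changing the vhid changes the MAC address: CARP builds the virtual MAC of a carp interface from the vhid, using the prefix 00:00:5e:00:01 followed by the vhid as the last byte. A minimal sketch of that mapping (the function name is made up for illustration):

```python
def carp_virtual_mac(vhid):
    """Return the virtual MAC address a carp interface uses for a vhid.

    Hypothetical helper: it only formats the 00:00:5e:00:01:XX pattern,
    where XX is the vhid in hexadecimal.
    """
    if not 1 <= vhid <= 255:
        raise ValueError("vhid must be between 1 and 255")
    return "00:00:5e:00:01:%02x" % vhid


if __name__ == "__main__":
    # Two nodes sharing a vhid advertise the same virtual MAC,
    # so the switch sees that address on whichever port sent last.
    print(carp_virtual_mac(1))
    # A node with a different vhid uses a different MAC.
    print(carp_virtual_mac(2))
```

With the same vhid on both nodes, any frame the rebooting backup sends from the shared virtual MAC could move the switch's MAC table entry to the backup's port, which would be consistent with the symptom disappearing once the vhid differs.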
yes. involves a nice pack of explosives and a lighter.

that said, i have used these shitty things in a dark time long long ago, and they don't require special config w/ carp. just take care to not use port-security with static learning (they might use different words to confuse the matter) so that the carp mac is statically bound to one of the ports.

--
Henning Brauer, hb@bsws.de, henning@openbsd.org
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting - Hamburg & Amsterdam
