PF and States

Previous thread: ice-hockey by Youssef Ossama on Thursday, December 2, 2010 - 4:12 pm. (1 message)

Next thread: clients not receiving dhcp acks from dhcpd on bridge ports by Joel Wiramu Pauling on Thursday, December 2, 2010 - 11:40 pm. (2 messages)
From: Godesi
Subject: PF and States
Date: Thursday, December 2, 2010 - 9:22 pm

Hi,

We recently deployed OpenBSD 4.7 boxes to do load balancing in our
environment with relayd.

After a few hours we ran into a problem with the server going beyond
10,000 states.  After much research and reading of man pages, we set
the state limit to a "ridiculous" number: 100,000.  We also changed
the states to expire much faster.  We redeployed the box and everything
was normal for a few days, until we started having issues with the box
again.
This time there were about 20,000 states, and again pf/relayd started
having issues.  The box has about 4 GB of RAM, multiple cores, etc.  By
issues I mean we sometimes can't ssh to the box, can't get relayctl to
show hosts, etc.

Can someone who is an expert at this look at it and tell me what may be
wrong here?
I have a couple of questions:

1.  Do I need pf for relayd when I am not doing redirects?
2.  How many states can I "really" have on a box that has 4 GB of RAM?
Is it governed by how much memory is allocated to the kernel? (I read
that somewhere while googling.)  Can I change that?


Here is pf.conf.  Since the box is BEHIND a corporate Juniper
firewall, we didn't really need to block anything, so pf.conf is very
simple, and so is relayd.conf:

I would really appreciate any help.

ext_if="fxp0"
web_if="fxp1"

set loginterface $ext_if
set optimization aggressive
set skip on lo
set limit { states 100000  }


set timeout tcp.first           10
set timeout tcp.opening         10
set timeout tcp.established     60
set timeout tcp.closing         10
set timeout tcp.finwait         10
set timeout tcp.closed          10


pass quick on $ext_if
pass quick on $mgt_if


Here is the relayd.conf file:


# $OpenBSD: relayd.conf,v 1.13 2008/03/03 16:58:41 reyk Exp $
#
# Macros
#

images_vip="10.1.0.107"

#
# Global Options
#
interval 30
#timeout 180
#
# Each table will be mapped to a pf table.
#
table <webhosts> {   web01 web02  web03   web04   web05  web06 }
table <fallback> { 127.0.0.1 }

#
# Services ...
From: Ryan McBride
Date: Friday, December 3, 2010 - 1:41 am

More than 100,000. I haven't tested lately (planning to do so soon), but I


Not directly. In fact, having too much RAM in your box will COST you
memory, as more kernel memory is used up tracking all your RAM. So
cutting your ram to 2 GB will probably improve the upper limit, though
it doesn't seem that that's the limit you are hitting.


What does 'pfctl -vvsi' show when this problem is happening?

From: dabheeruz
Date: Friday, December 3, 2010 - 12:32 pm

Thanks Ryan! Unfortunately when this happened I was remote and could not
grab those stats.  But what should I be looking for in terms of badness?
Maybe I can quickly set up something to monitor for a particular stat.
Really appreciate your input.

Thx.


From: dabheeruz
Date: Wednesday, December 8, 2010 - 1:39 pm

Hi Ryan,

We are seeing the issue again and I am writing a script to get the 
"pfctl -vvsi" data at regular intervals.  Can you please point me to 
what values I should be looking out for?

Thanks
Parvinder Bhasin


From: Ryan McBride
Date: Wednesday, December 8, 2010 - 3:09 pm

You want to look for any of the counters in the Counters section besides
'match' increasing "A Lot". How much depends on your specific situation,
but if you get a feel for what you see when you're NOT having problems,
you should be able to see if any of the counters increases suddenly.

In your case, the most likely ones are:

	- memory
	- congestion
	- state-limit

From: dabheeruz
Date: Saturday, December 11, 2010 - 10:10 pm

Thanks Ryan!!

From: Henning Brauer
Date: Sunday, December 19, 2010 - 5:16 am

you're way off ;)
I had 2 million during a DDoS. things got a bit slow but everything
worked.

-- 
Henning Brauer, hb@bsws.de, henning@openbsd.org
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting

From: dabheeruz
Date: Sunday, December 19, 2010 - 1:08 pm

Hmm.. thanks guys.  I am stumped, as even with 100K states set in pf the
box was dying.  Dying meaning I couldn't ssh (intermittently), carp was
failing, relayd had intermittent failures on the checks, etc.

Using pftop I saw that there was only a slight increase in states
(around 15-20K total).  I tried a bunch of things, none of which
worked.  When the traffic was around 8-10K (total) states, the box was
responding perfectly well.  I am on 4.7 for amd64.  This has now
happened around 4 times and I am totally clueless as to what my next
troubleshooting step should be.  Wondering if there is some issue with
4.7 amd64.

From: Kevin Wilcox
Date: Monday, December 20, 2010 - 7:52 am

Henning - out of curiosity, what were the specs on that hardware?

My understanding was that pf won't use more than 1GB of RAM, which I
thought to equal about 1 million states, but I never verified that
information and now it's been so long I can't recall the source.

Obviously, my incorrectness probably exists on several levels here...

kmw

From: Gabriel Linder
Date: Tuesday, December 21, 2010 - 1:41 am

It may be interesting to know of any specific tweaks in that setup.

According to pf_var.h, a struct pf_state is roughly 212 bytes on amd64.
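
As a rough back-of-the-envelope check of what that figure implies, here is a small sketch. It counts only the 212 bytes quoted above for struct pf_state; real usage is likely higher, since other per-state data and allocator overhead are not included, so treat the results as a lower bound. The function names are made up for this example.

```python
# Figure quoted above: roughly 212 bytes per struct pf_state on amd64.
PF_STATE_BYTES = 212


def state_table_bytes(states, per_state=PF_STATE_BYTES):
    """Lower-bound memory for `states` entries, counting only struct pf_state."""
    return states * per_state


def mib(n_bytes):
    """Convert a byte count to MiB."""
    return n_bytes / (1024 * 1024)


# State counts mentioned in this thread: the default limit, the raised
# limit, and the 2 million seen during a DDoS.
for states in (10_000, 100_000, 2_000_000):
    print(f"{states:>9} states ~ {mib(state_table_bytes(states)):.1f} MiB")
```

By this count 100,000 states come to about 21 MB and 2 million to about 424 MB.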

From: Henning Brauer
Date: Tuesday, December 21, 2010 - 5:50 am

OpenBSD 4.8-stable (GENERIC) #1: Mon Oct  4 16:19:06 CEST 2010
    henning@terak.bsws.de:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz ("GenuineIntel" 686-class) 2.40 GHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,TM,SBF,SSE3,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM
real mem  = 1072128000 (1022MB)
avail mem = 1044631552 (996MB)
mainbus0 at root
bios0 at mainbus0: AT/286+ BIOS, date 08/25/07, BIOS32 rev. 0 @ 0xfd470, SMBIOS rev. 2.51 @ 0x3feeb000 (31 entries)
bios0: vendor Phoenix Technologies LTD version "6.00" date 08/25/2007
bios0: Supermicro PDSMi
acpi0 at bios0: rev 0
acpi0: sleep states S0 S1 S4 S5
acpi0: tables DSDT FACP MCFG HPET APIC BOOT ASF! SSDT SSDT
acpi0: wakeup devices DEV1(S5) EXP1(S5) PXHA(S5) EXP5(S5) EXP6(S5) PCIB(S5) KBC0(S1) MSE0(S1) COM1(S5) COM2(S5) USB1(S4) USB2(S4) USB3(S4) USB4(S4) EUSB(S4)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpihpet0 at acpi0: 14318179 Hz
acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: apic clock running at 268MHz
ioapic0 at mainbus0: apid 1 pa 0xfec00000, version 20, 24 pins
ioapic1 at mainbus0: apid 2 pa 0xfec10000, version 20, 24 pins
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (DEV1)
acpiprt2 at acpi0: bus 9 (EXP1)
acpiprt3 at acpi0: bus 10 (PXHA)
acpiprt4 at acpi0: bus 13 (EXP5)
acpiprt5 at acpi0: bus 14 (EXP6)
acpiprt6 at acpi0: bus 15 (PCIB)
acpicpu0 at acpi0: PSS
acpibtn0 at acpi0: PWRB
bios0: ROM list: 0xc0000/0xb000
ipmi at mainbus0 not configured
cpu0: Enhanced SpeedStep 2395 MHz: speeds: 900, 600 MHz
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 "Intel E7230 Host" rev 0xc0
ppb0 at pci0 dev 1 function 0 "Intel E7230 PCIE" rev 0xc0: apic 1 int 16 (irq 11)
pci1 at ppb0 bus 1
ppb1 at pci0 dev 28 function 0 "Intel 82801GB PCIE" rev 0x01: apic 1 int 17 (irq 12)
pci2 at ppb1 bus 9
ppb2 at pci2 dev 0 function ...
From: Jan Johansson
Date: Saturday, December 4, 2010 - 12:58 am

Are you convinced that it is a state problem?

In our tests we have found that a default setup of relayd will
handle 2540 connections and will then stop responding to new
connections. Might this be the limit you are seeing?

Our pf.conf is the default that comes with the install.

From: dabheeruz
Date: Sunday, December 5, 2010 - 6:16 pm

Hi Jan,

This actually happened again really late at night.  One strange thing
was that we have nagios set up to monitor CARP state, and the secondary
lb (same config etc.) had its carp interface in "init" state while the
primary relayd box was once again displaying problems: users sometimes
could get to the site and sometimes could not.  When I tried to ssh into
the box I couldn't, and only after a couple of retries was I finally
logged in.  When I tried "relayctl show hosts", "relayctl show sessions"
or any other command, I got an error.  When I looked at the PF states
there were around 20K.  I logged on to the secondary (backup carp) and
of course saw that it was confused.  These two boxes are connected
directly, no switches or anything.  It seems the secondary box also
wasn't able to fully communicate with the MASTER.  When the states were
back to around 8K, everything was back to normal and I could do
"relayctl show sessions" etc.

Very strange, this problem!  Is it PF or relayd?  I can't really tell,
but I have to come up with something soon, otherwise I will have to
part ways with this solution, which I really don't want to do :(

thx
