People have started trickling into the hackathon rooms as the morning wears on. The music is louder than yesterday, and discussions continue around the various tables. A CTV television crew arrives, circling the room and taking footage that will be distributed throughout Canada. One day earlier, a photographer came to take photos for an upcoming four-page spread in Forbes magazine. Discussions about tomorrow's teardown start, reflecting on how much effort and time is involved in packing everything up. But the primary focus remains on the many projects currently in progress that people still hope to get finished. At least, finished enough.
One of the projects that has multiple people involved is PF, OpenBSD's packet filter. The packet filter's original author, Daniel Hartmeier [interview], talked about his ongoing efforts and reflected on the evolution PF has seen in the past few years. Mike Frantzen talked about his work on improving the PF optimizer. Henning Brauer described his work to allow PF to filter on interface groups. And Ryan McBride [interview] spoke about his efforts to turn pfctl into more of a compiler offering a number of useful benefits.
Daniel Hartmeier started PF in June of 2001 as a replacement for IPFilter, which had been suddenly removed from OpenBSD due to licensing issues. PF first appeared in OpenBSD 3.0, released in December of 2001. Since then it has been actively improved and enhanced by numerous OpenBSD developers, and it has been widely adopted in the BSD communities, having been imported into NetBSD, FreeBSD, and DragonFly BSD.
I sat down with Daniel as he worked at the 2005 hackathon in Calgary. He was working on a large AMD64 laptop, displaying a simple screen managed by OpenBSD's default fvwm window manager, coding in a vi window with mutt open to the side. "I don't install much other software," he explained when I asked him about his working environment. "I have too many machines, and once you get used to something you need it everywhere. It's too much work getting things customized on all these machines." For that reason, he chooses tools that are provided in the base OpenBSD distribution, and those tools provide everything he needs.
I asked him what he thought of the current incarnation of PF, to which he replied, "it does everything I need it for, and it has gained a lot of features which I never use." He continued, "when we first started, people were saying they would use IPFilter for the first five years, waiting until PF was more mature. I guess we're there now." Indeed, in a conversation I had with some other OpenBSD developers over dinner here in a Calgary pub one night, they reflected on how widely used PF is by companies and government facilities all over the world, even though most of these institutions don't talk about it.
When I approached him, Daniel was working on a new PF extension to log the user ID and process ID of PF users. Daniel described the problem by referring to another OpenBSD developer's configuration, which involved a NAT gateway at a university. "It translates private IPs into a single public IP. They use authpf to log in and provide Internet access. However, once the addresses are translated there's no way to know which user created which sessions." His solution is to log the user ID and process ID with pflog. The effort involved extending some PF header structures, which requires coordination with multiple developers, something easier to do in person at a hackathon.
Otherwise, Daniel was focused on fixing bugs. "I have no long list of stuff to do," he noted. "I'll work on whatever shows up. Usually the best ideas just pop up in discussions when we can sit together and brainstorm." In particular, Daniel focuses on bug reports that stick around for more than a week: "that's when it gets interesting." He did note that he doesn't spend as much time working on PF as when the project began nearly four years ago: "I don't have as much time as when I started."
Mike Frantzen has been working on improving PF's optimizer logic. Essentially, the optimizer parses the ruleset and logically groups things in such a way that they can be evaluated more efficiently. "We define blocks inside a PF ruleset," Mike explains. "These blocks contain similar rules that we can reorder, condense, and otherwise make faster to work with." He gave an example with the existing optimizer in which someone had a 10,000 line ruleset generated automatically by a 1,000 line Perl script. "With the old optimizer, it dropped the rule set to maybe 3,000 lines. With the improved optimizer I hope to drop this to maybe 1,000 lines." The optimizations happen at startup time, usually taking less than a second, but in this more complex example taking around 3 minutes.
The key to the optimizer is to organize and condense things in such a way that the ruleset can be evaluated more quickly, but so that nothing logically changes. The new work that Mike is doing allows the optimizer not only to work within the different logical groups, but also to compare rules between the different logical groups. "That's the trick," he explained, "if two rules can be reordered, and we can gain performance, then we do it." The resulting ruleset may not look like anything that was originally written, but functionally it is identical. "The packet is passed or blocked," Mike summarized. "In the end, the optimizer has no effect on how we treat the packet."
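To make the idea of condensing a ruleset without changing its meaning concrete, here is a minimal Python sketch. It is not PF's optimizer: the `Rule` class, the `evaluate` and `optimize` functions, and the two transformations shown are all hypothetical, chosen only because they are easy to verify. The sketch assumes PF's last-matching-rule-wins evaluation with an implicit default of pass, and it ignores `quick`, logging, and every other rule option.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Rule:
    """A toy filter rule (hypothetical model, far simpler than a real PF rule)."""
    action: str               # "pass" or "block"
    src: FrozenSet[str]       # source addresses this rule matches
    port: Optional[int]       # destination port; None matches any port

    def matches(self, src: str, port: int) -> bool:
        return src in self.src and (self.port is None or self.port == port)


def evaluate(rules: List[Rule], src: str, port: int, default: str = "pass") -> str:
    """Last matching rule wins; with no match the default applies."""
    verdict = default
    for rule in rules:
        if rule.matches(src, port):
            verdict = rule.action
    return verdict


def optimize(rules: List[Rule]) -> List[Rule]:
    """Shrink the rule list without changing the verdict for any packet."""
    # Step 1: a rule is redundant if an identical rule appears later,
    # because under last-match evaluation the later copy decides anyway.
    deduped = [r for i, r in enumerate(rules) if r not in rules[i + 1:]]

    # Step 2: neighbouring rules that differ only in their source
    # addresses can be condensed into one rule covering both sets.
    merged: List[Rule] = []
    for rule in deduped:
        if merged and merged[-1].action == rule.action and merged[-1].port == rule.port:
            previous = merged.pop()
            rule = Rule(rule.action, previous.src | rule.src, rule.port)
        merged.append(rule)
    return merged
```

Calling `optimize` on a list of `Rule` objects returns a shorter list for which `evaluate` gives the same verdict on every packet, which is the property Mike describes: the output may look nothing like the input, but the packet is treated the same way.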
Henning Brauer's efforts on PF for this year's hackathon actually began earlier, when he flew into Montreal on May 7th and started doing cleanup work. "I cleaned up the interface abstraction code," he explained, "which makes sure that a ruleset can refer to interfaces that aren't there yet, such as a wireless card that isn't plugged in." Henning has a friendly, outgoing demeanor, taking me around the various tables when I first arrived, introducing me to the hackathon attendees.
His work on the interface abstraction code allows PF to filter on interface groups. Interface groups are configured outside of PF using ifconfig, allowing you to group one or more interfaces under an arbitrary name. For example, an interface group could be named "external", and could contain all your external network interfaces. "Inside of PF you can use groups of interfaces where you originally referred to an interface before," Henning explained. "This allows for a lot of stuff. For example, you can use this to write completely hardware independent rulesets." He went on to describe how you could take this single configuration and use it on multiple servers, even if each of them had a different number or type of physical interfaces. The configuration would work as well on a machine with one external interface as on another machine with several external interfaces, so long as each machine was properly configured with interface groups.
Henning went on to excitedly explain how the new interface group support in PF also works with parenthesis notation. Essentially, when a group is enclosed in parentheses, this tells PF to translate the group's interfaces to addresses dynamically, on the fly. Conversely, when the group is not enclosed in parentheses, PF translates the interfaces to addresses once, at load time. The advantage of doing this dynamically is that you don't have to reload your ruleset whenever there's a change in IP address. This is especially useful if you have interfaces that can join and leave a group while you're using it, such as a wireless interface.
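The difference between load-time and dynamic translation can be illustrated with a small Python sketch. This is a toy model, not PF code: the `LoadTimeRule` and `DynamicRule` classes, the `group_addresses` helper, the interface names, and the addresses are all hypothetical and exist only to show the two lookup strategies side by side.

```python
from typing import Dict, List, Set


def group_addresses(group: str,
                    groups: Dict[str, List[str]],
                    addrs: Dict[str, List[str]]) -> Set[str]:
    """Expand a group name to the addresses of its current member interfaces."""
    return {a for ifname in groups.get(group, []) for a in addrs.get(ifname, [])}


class LoadTimeRule:
    """Resolves the group once, when the rule is created (load time)."""

    def __init__(self, group, groups, addrs):
        self.addresses = group_addresses(group, groups, addrs)  # snapshot

    def matches(self, address: str) -> bool:
        return address in self.addresses


class DynamicRule:
    """Keeps the group name and resolves it again on every lookup."""

    def __init__(self, group, groups, addrs):
        self.group, self.groups, self.addrs = group, groups, addrs

    def matches(self, address: str) -> bool:
        return address in group_addresses(self.group, self.groups, self.addrs)


# Hypothetical machine: one wired interface in the "external" group.
groups = {"external": ["em0"]}
addrs = {"em0": ["192.0.2.10"], "wi0": ["198.51.100.7"]}

static_rule = LoadTimeRule("external", groups, addrs)
dynamic_rule = DynamicRule("external", groups, addrs)

# A wireless interface joins the group after the rules were loaded.
groups["external"].append("wi0")
```

After the wireless interface joins the group, only `dynamic_rule` matches its address; `static_rule` still holds the snapshot taken at load time and would need to be rebuilt, which corresponds to reloading the ruleset.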
Henning plans to stay in Canada for a few more weeks, moving on to Vancouver where he'll be staying with Ryan McBride. Speaking of friends back in Hamburg where he lives, he laughs and says, "friends say I live in Canada and come back to Germany for work. I'm not that bad, but I do spend quite some time here."
Ryan McBride has been working on pfctl: "I've been turning it into more of a compiler so it builds an entire rule set in userland before loading it into the kernel." He explained that currently pfctl loads each rule one at a time, limiting how much processing it can do. "By doing it all in userland first," Ryan continued, "it allows us to make the syntax more complex allowing us to do more, and it allows us to do far more esoteric optimizations such as over or across anchors. Once implemented, we'll probably also look into doing error detection or bogus rule set detection, spewing out warnings if the user creates rule sets that pass all traffic, or block all traffic." He went on to explain that those are just the obvious examples of things to check for, but with the entire ruleset in memory there are many more subtle things they can also check for and warn against. "I expect that once that's done, we'll think of other things we can do based on the same work," he added.
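The kind of whole-ruleset check Ryan mentions becomes possible once every rule is in memory at the same time. The following Python sketch shows one conservative way such a warning could be produced. It is not pfctl code: the `Rule` class and the `lint` function are hypothetical, and the check assumes last-matching-rule-wins evaluation with an implicit default of pass while ignoring `quick` and all other options.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rule:
    """A toy rule (hypothetical): an action plus a flag for 'matches everything'."""
    action: str        # "pass" or "block"
    match_all: bool    # True for an unconditional rule such as "block all"


def lint(rules: List[Rule]) -> List[str]:
    """Warn when every packet is certain to receive the same verdict."""
    # Start from the implicit default, then find the last unconditional rule.
    # After that rule, every packet carries its action as the current verdict.
    effective = "pass"
    tail = rules
    for index, rule in enumerate(rules):
        if rule.match_all:
            effective = rule.action
            tail = rules[index + 1:]

    # If no later rule has the opposite action, nothing can change the verdict.
    warnings = []
    if all(rule.action == effective for rule in tail):
        verb = {"pass": "passes", "block": "blocks"}[effective]
        warnings.append(f"rule set {verb} all traffic")
    return warnings
```

A check like this cannot be done reliably while rules are loaded one at a time, because the verdict for a packet depends on rules that have not been seen yet. With the full list available, `lint` can look at the rules that follow the last unconditional rule and decide whether any of them could still change the outcome.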
Functionality-wise, Ryan estimated that he was about one third complete. However, he had this same idea implemented a year ago at a previous hackathon, "but then Mike did the optimizer, and Daniel changed the way anchors work, and that made my earlier diff useless." He said this with a smile, noting that there were some nits with his earlier work: "I basically had to start from scratch." One of the final steps involves merging in some parser code that Theo wrote: "I just have to graft it in." He estimates that he has around 3 more days of hacking to get it fully working, and hopes to get it into the tree within the next month or so.
Ryan's own efforts aren't his prime focus at the hackathon. "Some people come to the hackathon and hack like crazy, but then they disappear for the rest of the year because they don't have time," he explained. "I'm trying to help them to get their stuff in quickly, because if they're working on it now we should take advantage of that." He added, "I know I can do my stuff later."
For the past year and a half, up until February, Ryan was paid by the OpenBSD project. "It came to an end because I felt it was better for the project. It's a lot of money for them to spend, and I felt it was better to spend it on someone else, to give someone else an opportunity to benefit from it." The initial arrangement had been for him to be paid by OpenBSD for only one year: "mentally that's what I had prepared for." But at the end of the year, there were projects that weren't finished yet, compelling him to continue working and finish them up. Now he does consulting work, spending half his time working with OpenBSD firewalls, and the other half working on general corporate security assessments, penetration testing, and the like.
Living in Vancouver, Ryan described Calgary as his second home. "On average, I come here every three to four months for about a week or so. I do a hike with Theo and/or Peter, then I do some hacking and hang out at Theo's place." He adds with a laugh, "I drink his booze if he's not around, then I go home again." The friendship between the OpenBSD hackers is very apparent.
Ryan and Henning will continue to work on PF for another 10 days or so after the hackathon, doing a large cleanup of the code. Ryan described their main focus: "we're cutting out code, and making it more readable. This builds a new foundation for adding cool new stuff later. We didn't want to do the big cleanup during the hackathon, because we didn't want to break it for the others," he explained. "New features tend to patch small areas of the code, but cleanup tends to touch a lot of files everywhere, making other people's diffs not merge correctly, and making it harder for people doing active development." He noted that after the hackathon there's generally a lull, which provides a perfect chance for them to focus on their cleanup.
Friday night was the last night that many of these developers and friends would have together, so as evening came many migrated from the hackathon rooms to a few Calgary pubs. Discussions continued as beer was drunk, with stories shared and quite a few ideas evolving as the night wore on. Eventually the bars closed down, and the group migrated back to the hackathon rooms in the hotel. Time didn't seem to have much meaning, with people still working as the sun began coming up outside. Being the last night, the room seemed more active than the night before. "Tomorrow night is for sleeping," Bob Beck explained to me, "not tonight."