One new attendee of this year's OpenBSD hackathon was Fernando Gont, a diverse individual from Argentina whose current job titles include teacher, technical writer, system administrator and network researcher. His presence at the hackathon was the result of an internet-draft he wrote about some flaws in the ICMP protocol, flaws he discovered while writing the "Security Considerations" of a different internet-draft titled "TCP's reaction to soft errors" for the IPv6 Operations working group. In researching that earlier draft, he considered various attacks against TCP using ICMP error messages, and proposed some extra validation that could be done as prevention. Following up, Fernando reviewed the IETF specifications for ICMP and TCP and was surprised to discover that they didn't propose similar validation checks, ultimately deciding to write his latest internet-draft highlighting the security impact.
Fernando was interested in discussing the ideas with his peers, but was concerned about vendors trying to patent his suggested fixes. He'd read some comments by OpenBSD creator Theo de Raadt [interview] which led him to believe that he could safely talk with Theo about his ICMP discoveries. Theo was impressed by the ideas, and as Fernando was already heading to BSDCan, Theo helped arrange for him to stay in Canada longer to attend CanSecWest and the OpenBSD hackathon. At the hackathon, Fernando worked around the clock to implement some of his suggested fixes into the OpenBSD networking stack, during which time I spoke with him.
The ICMP flaw is in the design of the protocol, not in any specific implementation. Theo explains, "here we have a 20 year old protocol, a part of the Internet infrastructure that hasn't been touched in 10 years and we were all sure was right, and now is cast in doubt." He went on to add, "these things have to be done carefully. We can't ignore the problem, which is what the IETF and the other vendors are telling us to do."
Three Blind ICMP Attacks:
Fernando stressed that the issues in ICMP are with the specification itself, "this makes the problem more important because it affects everyone, not just one implementation from a programmer mistake." He goes on to point out that the problem won't truly be fixed until the IETF specifications themselves are fixed, as it is from these specifications that vendors implement their systems. "Most vendors have, are, or will be implementing the recommended counter-measures in the near future," Fernando acknowledges, "however, vendors have not bothered to participate in the relevant IETF working group to update the existing specifications." Thus Fernando is concerned that future implementations will continue to be made following these outdated and now known-to-be-flawed specifications.
All three ICMP flaws can be exploited without sniffing network traffic, and do not require a "man in the middle". Unlike the earlier "slipping in the window" TCP reset attack [story], these ICMP-based TCP attacks don't require an attacker to guess a correct TCP sequence number, making it simpler to disrupt network traffic. As a brief overview, the three flaws are:
"Hard" ICMP Errors:
The ICMP protocol was first defined in RFC 792, published in September of 1981. Referring to TCP connections, ICMP errors are classified as either "hard" or "soft". A "hard" error results in the TCP connection being torn down, much the same as if a RST packet was received. There are three ICMP type 3 'destination unreachable' errors that are defined in RFC 1122 as hard errors. Code 2, 'protocol unreachable', code 3, 'port unreachable', and possibly code 4, 'fragmentation needed and don't fragment bit set' are all hard errors that if received can cause a TCP stack to tear down an existing connection. (Code 4 is only a 'hard' error if Path MTU discovery is not implemented.)
Other ICMP errors are considered "soft" errors. "Soft" errors are reported to the application affected, but the connection continues. Fernando's solution for the "hard" ICMP error flaw is to simply treat them like "soft" ICMP errors. "If treated that way," he said, "the stack becomes immune to the problem." As to why the ICMP stack was designed this way in the first place, "the basic idea of hard errors was to avoid keeping TCP connections from retrying and retrying lots of times," Fernando explained. "Maybe it made sense many years ago when you didn't have the processing power you have now, but these days there is no problem with just letting the TCP connection eventually timeout when there is a legitimate network problem."
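To make the hard/soft distinction concrete, here is a minimal Python sketch of how a stack might classify ICMP type 3 errors received for a TCP connection. The names Action, classify_unreachable and treat_hard_as_soft are hypothetical and are not taken from OpenBSD or any other stack; only the code numbers come from the description above.

```python
from enum import Enum


class Action(Enum):
    ABORT = "tear down the connection"
    REPORT = "report to the application, keep the connection"
    ADJUST_PMTU = "hand over to Path MTU discovery"


# ICMP type 3 (destination unreachable) codes named in the text.
PROTOCOL_UNREACHABLE = 2
PORT_UNREACHABLE = 3
FRAG_NEEDED_DF_SET = 4


def classify_unreachable(code, pmtud_enabled=True, treat_hard_as_soft=False):
    """Decide what to do with an ICMP type 3 error for a TCP connection.

    treat_hard_as_soft=True models the countermeasure described above:
    hard errors are merely reported and the connection is left alone.
    """
    if code == FRAG_NEEDED_DF_SET and pmtud_enabled:
        return Action.ADJUST_PMTU
    # Code 4 only counts as a hard error when PMTU discovery is not in use.
    hard = code in (PROTOCOL_UNREACHABLE, PORT_UNREACHABLE, FRAG_NEEDED_DF_SET)
    if hard and not treat_hard_as_soft:
        return Action.ABORT
    return Action.REPORT
```

With the default arguments a single spoofed 'port unreachable' is enough to abort the connection; with treat_hard_as_soft=True the same packet only produces a report, and a genuinely dead path is left to the normal TCP timeout.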
ICMP Source Quench:
ICMP type 4 code 0 packets are defined as "source quench" messages. When a router between two endpoints or the remote endpoint itself begins to run out of buffer space for processing incoming packets, it can send a source quench ICMP packet to the endpoint from where the traffic originated. As defined in RFC 792, when an endpoint receives a source quench packet it should slow the rate at which it is sending out packets. After ten minutes, the endpoint should gradually increase the rate at which it's sending packets up to the original rate.
Fernando's paper points out that source quench messages can also be abused. If the messages are spoofed at a high enough rate, a TCP connection can be slowed to a crawl. "While this would not reset the connection," Fernando explained, "it would certainly degrade the performance of the data transfer taking place." Fortunately the solution is simple, he goes on to explain: "you can just completely disable ICMP source quenching for TCP because the TCP protocol has its own handling for these conditions, and routers, as specified by RFC 1812, should not be sending source quench packets either."
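The effect of a sustained stream of spoofed source quench messages can be illustrated with a toy model. The sketch below is purely illustrative: the halving on each quench, the linear recovery, and every number in it are assumptions made for the example, not the behavior of any real TCP stack, and simulate_rate is a hypothetical name.

```python
def simulate_rate(ticks, quench_every, ignore_quench=False,
                  full_rate=1000.0, floor=1.0, recovery=50.0):
    """Return the sending rate after each tick in a toy sender model.

    quench_every: a spoofed source quench arrives every N ticks (0 = never).
    ignore_quench: models the countermeasure of disabling source quench
                   processing for TCP.
    """
    rate = full_rate
    history = []
    for tick in range(1, ticks + 1):
        quenched = quench_every > 0 and tick % quench_every == 0
        if quenched and not ignore_quench:
            # Assumed reaction: halve the rate, never dropping below the floor.
            rate = max(floor, rate / 2)
        else:
            # Assumed recovery: climb back towards the full rate.
            rate = min(full_rate, rate + recovery)
        history.append(rate)
    return history
```

In this model, simulate_rate(20, 1) ends pinned at the floor, while simulate_rate(20, 1, ignore_quench=True) stays at the full rate throughout, which is the point of simply ignoring the messages.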
Path MTU Discovery:
IP sessions are composed of many packets. The largest size of each of these packets is known as the maximum transmission unit, or MTU, and ideally it's sized for maximum throughput. If packets are too large, there's extra overhead for routers in between the endpoints that have to break the large packets into smaller fragments, and again overhead for the final endpoint that has to reassemble the fragments back into the original packets. If packets are too small, there's extra overhead creating and processing all the additional smaller packets. Additional research into the potential problems of fragmentation can be found in the 1987 paper "Fragmentation considered harmful" and the more recent "Fragmentation considered very harmful" from 2004. Thus, it's important to configure your endpoints to use an appropriate MTU, usually the maximum packet size that doesn't require fragmentation.
Path MTU Discovery is defined in RFC 1191, and is a technique that uses ICMP packets to dynamically discover the maximum transmission unit of an arbitrary internet path. Essentially, PMTU discovery begins by sending large packets with the "don't fragment" bit set in the IP header. The "don't fragment" bit tells routers along the way that the packet must not be broken into smaller pieces. If a router receives the packet and finds it is too big to forward, it drops the packet and replies to the original host with an ICMP error stating "packet too large and don't fragment bit set". Additionally, RFC 1191 defines the use of a header field in which the router that generated the ICMP error reports the MTU of the next hop. The originating host lowers the size of its packets to this MTU and tries again. The process continues until a packet successfully reaches the destination endpoint. In this way, the host is able to discover the best possible MTU for the current internet path.
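The discovery loop described above can be sketched in a few lines of Python. This is a simplified simulation, not a network implementation: the path is modeled as a list of per-hop MTUs, and discover_path_mtu and its parameters are hypothetical names chosen for the example.

```python
def discover_path_mtu(hop_mtus, initial_mtu):
    """Simulate Path MTU discovery along a path.

    hop_mtus: MTU of each link along the path, in order.
    initial_mtu: size of the first packet sent with "don't fragment" set.
    Returns (discovered_mtu, number_of_packets_sent).
    """
    size = initial_mtu
    probes = 0
    while True:
        probes += 1
        # The first link too small for the packet drops it, and the router
        # in front of that link reports the link's MTU in an ICMP error.
        reported = next((mtu for mtu in hop_mtus if mtu < size), None)
        if reported is None:
            return size, probes  # the packet reached the destination
        size = reported  # lower the packet size and try again


path = [1500, 1492, 1400, 1500]
print(discover_path_mtu(path, 1500))
```

For the example path the host sends three packets, at 1500, 1492 and 1400 bytes, before one gets through. Note that the host trusts the MTU value reported in each ICMP error, which is what the attack described next takes advantage of.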
In Fernando's third ICMP attack, ICMP error packets are spoofed saying "packet too large and don't fragment bit set", causing the endpoint to reduce the size of its packets to a smaller than optimal size, as small as 68 bytes. RFC 1191 recommends that once a system has reduced the Path MTU, it leave it at the reduced size for ten minutes before trying to increase it again, thus a sustained attack only requires sending one packet every ten minutes. With the increased number of smaller packets, the interrupt rate increases on both the client and the server, degrading the performance of both systems. Among the systems most susceptible to this type of attack are BGP routers, which must maintain long-lived TCP sessions with high data throughput. As the attack doesn't cause the session to abort, it's much more difficult to detect and can result in very slow data transmissions.
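A little arithmetic shows how much extra work a 68 byte MTU creates. The sketch below assumes 20 byte IP and 20 byte TCP headers with no options, and uses a 1500 byte MTU and a 1,000,000 byte transfer as illustrative figures; packets_needed is a hypothetical helper.

```python
import math


def packets_needed(total_bytes, mtu, ip_header=20, tcp_header=20):
    """Number of TCP segments needed to carry total_bytes at a given MTU."""
    payload = mtu - ip_header - tcp_header
    if payload <= 0:
        raise ValueError("MTU too small to carry any TCP payload")
    return math.ceil(total_bytes / payload)


normal = packets_needed(1_000_000, 1500)   # 1460 bytes of data per packet
attacked = packets_needed(1_000_000, 68)   # 28 bytes of data per packet
print(normal, attacked)                    # 685 35715
```

Under these assumptions the same transfer takes roughly 52 times as many packets, each of which must be generated, forwarded and acknowledged.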
The solution for this third attack is more complex than for the earlier types of attacks. Essentially, Fernando's solution is to delay the processing of the ICMP error messages. Instead of immediately reducing the MTU when a "packet too large and don't fragment bit set" ICMP error is received, the system can simply remember that it received the error and wait for an appropriate amount of time before acting on it. The appropriate amount of time depends on the network and is thus dynamically calculated, but essentially it is the average amount of time taken for a packet to make a round trip between the two endpoints, multiplied by a factor. If during that time you receive a delivery acknowledgment for the same packet that the ICMP error referred to, you know that the ICMP error wasn't real and it can safely be ignored. Alternatively, if after that amount of time no acknowledgment is received then you can act appropriately on the ICMP error, reducing the MTU.
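The delayed-processing idea can be sketched as a small state machine. This is only an illustration under stated assumptions, not the code that went into OpenBSD: the class and method names are hypothetical, the multiplication factor of 2 is an arbitrary example value, time is passed in explicitly, and sequence number wraparound is ignored for brevity.

```python
class DelayedPmtu:
    """Hold back a claimed MTU reduction until it can be checked against ACKs."""

    def __init__(self, mtu, round_trip_time, factor=2.0):
        self.mtu = mtu
        self.round_trip_time = round_trip_time
        self.factor = factor          # assumed multiplier, example value only
        self.pending = None           # (segment_seq, claimed_mtu, deadline)

    def on_icmp_too_big(self, segment_seq, claimed_mtu, now):
        # Remember the error instead of acting on it immediately.
        if self.pending is None and claimed_mtu < self.mtu:
            deadline = now + self.factor * self.round_trip_time
            self.pending = (segment_seq, claimed_mtu, deadline)

    def on_ack(self, acked_up_to):
        # Data from the segment named in the error was acknowledged, so the
        # packet got through and the ICMP error cannot have been real.
        if self.pending is not None and acked_up_to > self.pending[0]:
            self.pending = None

    def on_timer(self, now):
        # No acknowledgment arrived in time: treat the error as genuine.
        if self.pending is not None and now >= self.pending[2]:
            self.mtu = self.pending[1]
            self.pending = None
```

A spoofed error followed by a normal acknowledgment leaves the MTU untouched, while an error that is never contradicted by an acknowledgment lowers the MTU once the deadline passes.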
Additional generic countermeasure:
In addition to the first two countermeasures mentioned above, and inherently part of the third countermeasure, it is also possible to generically defend against ICMP attacks on TCP sessions by verifying the TCP sequence number of the packet contained within an ICMP error. This works because all ICMP error packets are required to contain the IP header and at least 8 more bytes of the packet that caused the error in the first place. In the case of TCP packets, these 8 bytes include the TCP sequence number, and thus this sequence number can be compared against the active session that generated the packet. If the sequence number is not within the sequence number window [story], the ICMP error is obviously not real and can be safely ignored. Evidently many vendors did not provide even this amount of prevention, which is why the ICMP issues described in Fernando's paper are so easy to exploit. While sequence number validation is a useful preventative measure, it is not enough by itself. Fernando notes, "it may serve as a counter-measure nowadays, but if in the future we begin to use larger windows, we will be facing the same problem again." He points to the earlier discussed counter-measures as the appropriate complete solution to the problem.
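As a rough illustration of the sequence number check, the Python sketch below pulls the TCP sequence number out of the data quoted in an ICMP error and tests it against the range of data the sender still has in flight. The function names and the snd_una/snd_max parameter names are assumptions made for the example, and a real stack would also match addresses and ports to find the connection first.

```python
import struct


def embedded_tcp_seq(icmp_payload):
    """Extract the TCP sequence number from the packet quoted in an ICMP error.

    icmp_payload holds the original IP header plus at least 8 more bytes.
    Returns None if too little of the original packet was quoted.
    """
    if not icmp_payload:
        return None
    ip_header_len = (icmp_payload[0] & 0x0F) * 4
    if len(icmp_payload) < ip_header_len + 8:
        return None
    # The first 8 bytes of a TCP header: source port, destination port, sequence.
    _src_port, _dst_port, seq = struct.unpack_from("!HHI", icmp_payload, ip_header_len)
    return seq


def seq_in_window(seq, snd_una, snd_max):
    """True if seq lies in [snd_una, snd_max), using 32-bit wraparound arithmetic.

    snd_una: oldest unacknowledged sequence number (assumed name).
    snd_max: highest sequence number sent so far (assumed name).
    """
    modulus = 2 ** 32
    return (seq - snd_una) % modulus < (snd_max - snd_una) % modulus
```

An ICMP error whose quoted sequence number falls outside that range does not refer to anything the connection currently has outstanding and can be dropped. As Fernando's remark about larger windows suggests, the check only narrows the attacker's odds; the wider the window, the easier a blind guess becomes.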
The politics of vulnerabilities:
Once Fernando understood the vulnerabilities he'd found in the ICMP protocol, he set out to report the problem safely so that it could be fixed in the many ICMP implementations that make up the Internet. To begin, he wrote an internet-draft which he submitted to the IETF in August of 2004. At that time he contacted CERT/CC and NISCC, and privately notified several open source projects including OpenBSD, NetBSD, FreeBSD and Linux, as well as larger vendors such as Microsoft, Cisco, and Sun Microsystems. He described the vulnerabilities to each of them, giving them an opportunity to address the issues before they became public.
Around this same time, Fernando began receiving emails from Cisco who had numerous technical questions about his solutions to the problems. He continued to reply thoroughly to all their questions, until two months later when he received an email from Cisco's lawyer claiming that Cisco held a patent on his work. He asked their lawyer for specifics, but they refused to reveal any details. For two more months this continued, until Fernando was cc'd on an email thread between Cisco, Linus Torvalds, and David Miller. Reading back through the thread, Fernando found where David Miller had asked Cisco how they could possibly patent sequence tracking as Linux had been doing it for many years, and later in the same thread Cisco noted that they had withdrawn their patent. Fernando found the experience frustrating, "a third party knew what it was all about before I did. One would expect the person who discovered the vulnerability to be the most involved, but that didn't happen. To this day I still don't know exactly what the patent was about, I'm only guessing it was about TCP sequence tracking based on the email thread I read."
While the patent issue was happening with Cisco, CERT/CC created a mailing list to allow vendors to communicate amongst themselves about the newly discovered vulnerability. "They blamed me for submitting my work," Fernando said in exasperation. "One of Cisco's managers of PSIRT said I was cooperating with terrorists, because a terrorist could have gotten the information in the paper I wrote!" Fernando was familiar with the intellectual property arguments surrounding last year's Slipping In The Window paper, so he had intentionally published his findings publicly to prevent them from being patented. "Then they accused me of working with terrorists, and even still tried to patent my work!" He noted that he now suspected that, had he actually worked exclusively with Cisco as they had requested, they probably would have managed to patent all of his ideas. "I decided to work this issue with NISCC, as they were much more responsive. But Cisco wanted me to work with CERT/CC. And as I didn't, maybe our relationship was harder than it should have been."
Fernando also found Microsoft difficult to work with. "Microsoft's acknowledgment policy says that you must report the issues to them 'confidentially'", he explained. As he chose to contact CERT and various open source projects as well, he claimed that they refused to give him credit for the discovery. Only with much effort did he finally get them to acknowledge that he had discovered the issue.
The actual disclosure of the ICMP issue was an adventure in itself, delayed multiple times. It was originally planned to be disclosed in January of 2005, but was repeatedly delayed until April 12th because many of the bigger vendors weren't ready yet with fixes. Fernando acknowledged, "CERTs don't have many choices here. They get paid for providing a so-called 'responsible disclosure' process. Suppose that they disclosed a security issue while Cisco or Microsoft were still vulnerable. Do you think they would keep their jobs if the bad guys began to attack the Cisco and Microsoft systems based on the information published by CERTs?" He went on to note, "I don't know what the community could do to educate vendors," suggesting that perhaps the public disclosure should happen no matter what after a couple of months of being announced privately. "Maybe after being hit by the media several times, then the big vendors would learn that they must become more responsive."
Fernando went on to point out that from his experience vendors seem to be more concerned about who gets credit for finding a flaw, rather than about actually fixing it. Fernando explained, "Cisco was worried about not giving me credit because they claimed to have been working on the problem for four years. They offered to set up a meeting with some people of Cisco Argentina to show me documentation that would prove they had been working on the Path MTU Discovery attack for more than a year. It didn't happen. Of course, that's ironic, as in the same way I could fake a document and say that I have been working on my draft privately for ten years. On the other hand, if it were true, then it would mean that Cisco takes about two years to address these issues. I would be concerned about this if I were one of their customers."
One week prior to the eventual disclosure, Fernando received a call from the CTO of Cisco Argentina who asked him for a copy of his resume. "He said he wanted to have a meeting with me, telling me they might have a job for me," Fernando shrugged. "The meeting was delayed a few times, then I never heard from him again. I wouldn't have thought much of it, but I mentioned it to other people and it turns out they'd had similar experiences. It seems this is a common practice for Cisco to offer someone work in the hopes you'll not talk to the media when the security issues are disclosed."
Following the public disclosure of Fernando's findings, the media began to discuss the flaws in ICMP. "Instead of contacting at least NISCC, or myself, they contacted the affected parties," Fernando explained. "For example, ZD-NET contacted Microsoft and thus came to the conclusion that there was no problem, that the only way for the attack to work was for the attacker to sniff traffic on the network. This isn't true! It seems the reporters hadn't even read the draft and thus didn't understand what is wrong with ICMP. What's even worse, it seems that neither they nor the contact at Microsoft realize what the word 'blind' means in each of the attacks." As discussed earlier, because most vendors didn't even check the TCP sequence number of packets within ICMP errors, an attacker could blindly spoof ICMP errors and thus trivially exploit these vulnerabilities.
The issues affecting the ICMP protocol are legitimate, and will need to be dealt with by all vendors and open source groups. Fernando told me that Linux quickly implemented the counter measure for ICMP Source Quench upon receiving his internet draft, and had already been working on prevention for blind connection reset attacks and on TCP sequence checking. He reported that FreeBSD also had been working on a counter-measure for blind connection reset attacks, and that they removed Source Quench processing and added additional TCP sequence checking upon receiving his internet draft. As far as Fernando is aware, NetBSD has not acted on his internet draft, and is thus still vulnerable.
As for OpenBSD, they were already working on implementing the counter measure for blind connection reset attacks. "In August of 2004," Fernando said, "Markus Friedl implemented the TCP sequence check in OpenBSD following my report." He went on to discuss efforts at the recent Hackathon in Calgary, "at the hackathon, Chad Loder and I worked with Markus to implement the counter-measure for ICMP Source Quench attacks, then we began to work on the counter-measure for the Path MTU Discovery attack." The PMTUD fix was the last to be merged, on June 30th, so now all ICMP fixes are in the OpenBSD -current source tree and will be part of OpenBSD 3.8. Regarding the PMTUD fix, Fernando notes, "other projects said they liked the idea, but wanted to hear about experiments with it. OpenBSD decided to implement it because it was clear to the project that it was the only definitive solution to the problem. This is a clear example of what being proactive is about: fixing problems before you really face them."