BSDCan 2008: Stream Control Transmission Protocol

Submitted by Jeremy
on May 17, 2008 - 3:52am

Randall Stewart of Cisco Systems gave a talk titled "SCTP, what it is and how to use it", discussing the Stream Control Transmission Protocol (SCTP). A paper that was displayed on the overhead projector before the talk began summarized:

"Integrated into FreeBSD 7.0 -- first standardized by the Internet Engineering Task Force (IETF) in October of 2000, in RFC 2960 and later updated by RFC 4960. SCTP is a message oriented protocol providing reliable end to end communication between two peers in an IP network."

Randall explained that SCTP is an alternative to both TCP and UDP. To describe SCTP, he suggested you start with TCP's features: reliable retransmission, congestion control, flow control, connection orientation, and selective acknowledgements. You then add more features on top: an "association" set up with a 4-way handshake, framing and ordered service, multistreaming, multihoming, and reachability monitoring.

Derived from notes taken at a one-hour BSDCan talk.

Handshakes
A TCP connection starts with a three-way handshake, SYN, SYN-ACK, ACK. When computer A sends computer B a SYN, computer B allocates memory to handle the new connection and replies with a SYN-ACK. If computer A is malicious, it can repeatedly send SYN packets with forged unique source IP addresses, causing computer B to continue allocating more and more memory as it tries to establish all of the requested connections, eventually exhausting the available buffers and preventing any more inbound connections, including valid ones. This is a SYN-Flood.

SCTP implements a four-way handshake that prevents a SYN-Flood style attack. In SCTP, computer A sends computer B an INIT packet. Computer B replies with an INIT_ACK packet containing a cookie, but does not yet set up any state. Computer A then has to reply with a COOKIE_ECHO, which computer B validates before replying with a COOKIE_ACK. Because computer B only allocates memory once a valid cookie comes back, forged INIT packets cost it nothing. The COOKIE_ECHO and COOKIE_ACK packets can also carry a data payload, so in spite of the four-way handshake, data can actually start transferring sooner than with TCP.
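The stateless cookie idea can be sketched in a few lines of C. This is only an illustration of the principle, not how any real stack does it: the cookie layout, the function names, and the toy keyed hash are all assumptions made for the example (a real implementation signs the cookie with a cryptographic MAC such as HMAC).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cookie layout; real SCTP cookies are implementation-defined. */
struct cookie {
    uint32_t peer_addr;  /* source address claimed in the INIT */
    uint16_t peer_port;
    uint32_t issued_at;  /* server clock, in seconds */
    uint64_t mac;        /* keyed checksum over the fields above */
};

/* Toy keyed hash (FNV-1a seeded with the secret). NOT cryptographically
 * secure; it stands in for the HMAC a real implementation would use. */
static uint64_t toy_mac(uint64_t secret, uint32_t addr, uint16_t port, uint32_t ts)
{
    uint8_t bytes[10];
    uint64_t h = 14695981039346656037ULL ^ secret;

    memcpy(bytes, &addr, 4);
    memcpy(bytes + 4, &port, 2);
    memcpy(bytes + 6, &ts, 4);
    for (size_t i = 0; i < sizeof(bytes); i++) {
        h ^= bytes[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* INIT received: build the cookie for the INIT_ACK. Nothing is stored. */
static struct cookie make_cookie(uint64_t secret, uint32_t addr, uint16_t port, uint32_t now)
{
    struct cookie c = { addr, port, now, 0 };
    c.mac = toy_mac(secret, addr, port, now);
    return c;
}

/* COOKIE_ECHO received: return 1 only if the cookie matches the sender,
 * has not expired, and carries a checksum made with our secret. */
static int cookie_is_valid(uint64_t secret, const struct cookie *c,
                           uint32_t addr, uint16_t port,
                           uint32_t now, uint32_t lifetime)
{
    assert(c != NULL);
    if (c->peer_addr != addr || c->peer_port != port)
        return 0;
    if (now - c->issued_at > lifetime)
        return 0;
    return c->mac == toy_mac(secret, c->peer_addr, c->peer_port, c->issued_at);
}
```

The server would call make_cookie() when an INIT arrives and cookie_is_valid() when the COOKIE_ECHO comes back, allocating connection state only after the second call succeeds.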

Multi-homed considerations
A multi-homed peer defines a "primary destination address", which was required by the IETF due to concerns with congestion control. Data is sent to the primary destination address unless it fails for any reason, in which case SCTP selects a new primary destination address from the remaining list.

TCP transfers data over a single path, and if that path fails the whole connection is aborted. Randall explained that SCTP supports alternative paths, so if one path fails SCTP will automatically continue sending data along a different path. For example, Server A could have multiple source IP addresses, and Server B could have multiple destination IP addresses, all on their own subnets. A connection can be defined between Server A and Server B using all available interfaces, and if any interface fails the protocol automatically starts using another of the available interfaces.
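The failover rule described above can be sketched as a small selection function. This is a simplified model, not code from any SCTP stack: the struct, its fields, and the function name are hypothetical, and a real implementation decides reachability from heartbeats and retransmission counters rather than a single flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-destination record; a real stack also tracks RTT,
 * congestion window, and error counters for each path. */
struct path {
    const char *addr;  /* peer address, as text for illustration */
    int reachable;     /* 1 while the path is considered up */
};

/* Return the index of the path to send on: the primary while it is
 * reachable, otherwise the next reachable alternate, or -1 if every
 * path is down. */
static int select_path(const struct path *paths, size_t n, size_t primary)
{
    assert(paths != NULL && primary < n);
    for (size_t i = 0; i < n; i++) {
        size_t idx = (primary + i) % n;
        if (paths[idx].reachable)
            return (int)idx;
    }
    return -1;
}
```

With three paths and the primary at index 0, marking index 0 unreachable makes select_path() return 1, which mirrors traffic moving to an alternate address without the association being torn down.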

Real World Usage
Randall pointed out that SS7 networks in Europe (which carry SMS messages) currently use SCTP.

At this point, it was asked if SCTP supported multicast transmissions, to which Randall replied it does not yet, but there is a draft RFC coming which will then in theory be implemented in FreeBSD.

SCTP Multistreaming
To explain multistreaming, Randall offered a diagram on one of his slides looking something like:

Host A    ---------->     Host B
send queues       receive queues
     stream 0 ----> 
     stream 1 ---->
     stream 2 ---->
receive queues       send queues
     <---- stream 0

It was explained that an application can set up multiple independent streams with SCTP, and data in one stream will be delivered regardless of what is happening to data in another stream. It was also noted that to SCTP, streams are one-way things, and even though most applications set them up in both directions it's not a requirement, as illustrated in the above ASCII diagram.

He then showed a short video to illustrate a real-world advantage of these features. The video showed two web browsers, each loading over a single connection a page made up of a little text and a series of progressive JPEGs. On the left side the connection used TCP, and on the right side it used SCTP. Both connections were subjected to roughly 4% packet loss. On the TCP side, things loaded slowly and in strict order, one image at a time. On the SCTP side, whatever had been downloaded of each progressive JPEG was displayed quickly, and additional images kept loading in spite of the dropped packets. Halfway through the video, much of the web page on the TCP connection was still blank, while on the SCTP connection everything was already visible. On fast and reliable networks you'd likely not notice the advantage much of the time, but on slower, less reliable networks the advantages become very obvious.

SCTP Socket Types
The SCTP socket API offers two forms of connections: one-to-one (SOCK_STREAM), which is the "TCP model", and one-to-many (SOCK_SEQPACKET), which is the "UDP model". The one-to-one model provides a "compatibility" mode allowing virtually seamless porting of TCP apps to SCTP. Using this mode, it's possible to convert a standard TCP application to SCTP with a simple search and replace followed by a recompile. Randall noted that there are patches available converting both Apache 2.x and Firefox to SCTP, providing a true SCTP connection in which both the browser and the server are using SCTP.

Randall noted that when using the one-to-one model, it's not possible to add a data payload to the COOKIE packets, as doing so would break drop-in TCP compatibility.

He then posted source code for a simplistic one-to-one example server:

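int sd, newfd;
socklen_t sosz;
struct sockaddr_in6 sin6;

/* IPv6 address family, one-to-one (TCP-style) SCTP socket */
sd = socket(AF_INET6, SOCK_STREAM, IPPROTO_SCTP);
/* a bind() to the server's address and port would normally go here */
listen(sd, 1);
while (1) {
  sosz = sizeof(sin6);  /* accept() overwrites this, so reset it each time */
  newfd = accept(sd, (struct sockaddr *)&sin6, &sosz);
  do_child_stuff(newfd, &sin6, sosz);
}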
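The "search and replace" port mentioned for the one-to-one model comes down to changing the protocol argument passed to socket(). The sketch below makes that visible by putting the choice behind a small switch; the enum and both function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical switch for illustration; not part of any real API. */
enum transport { TRANSPORT_TCP, TRANSPORT_SCTP };

/* Map the switch to the protocol number handed to socket(). */
static int transport_protocol(enum transport t)
{
    assert(t == TRANSPORT_TCP || t == TRANSPORT_SCTP);
    return t == TRANSPORT_SCTP ? IPPROTO_SCTP : IPPROTO_TCP;
}

/* Open a stream socket for the chosen transport. Returns the descriptor,
 * or -1 with errno set (for example when the kernel has no SCTP support).
 * bind(), listen(), accept(), read() and write() are used the same way
 * for both transports. */
static int open_stream_socket(enum transport t)
{
    return socket(AF_INET6, SOCK_STREAM, transport_protocol(t));
}
```

Everything after the socket() call stays as it was in the TCP version, which is what makes the one-to-one model a drop-in replacement.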

The one-to-many model was developed especially for peer-to-peer applications. You use the listen() call if you want to receive connections. Whether or not you call listen(), you can start sending data to any address (just like you would with a UDP socket). Randall explained that if an association does not exist, it will be implicitly created. Sending is non-blocking, and data will be queued until the connection is successfully established.

He went on to note that with a one-to-many model, select() on write will always return true. If, for example, you have 5 associations defined and the buffer for one association is full, it will just move on to the next one. If you need to write to a specific association, a special sctp_peeloff() call is available that converts that association into its own one-to-one socket.

One advantage of this model, he explained, is that SCTP applications don't need to track connection state; instead they simply send and receive. If they want to track connection state for whatever reason, they can subscribe to notifications, which identify each association by a 32-bit association-id. If you choose not to track the connection state, Randall noted that you should set the SCTP_AUTOCLOSE socket option, which will automatically and gracefully close connections that have been idle for more than 20 seconds.

He then posted source code for a simplistic one-to-many example server:

int sd, msg_flags;
ssize_t len;
socklen_t sosz;
struct sockaddr_in6 sin6;
struct sctp_sndrcvinfo snd_rcv;
char buf[8000];

/* SOCK_SEQPACKET selects the one-to-many model */
sd = socket(AF_INET6, SOCK_SEQPACKET, IPPROTO_SCTP);
listen(sd, 1);
while (1) {
  sosz = sizeof(sin6);  /* reset before every receive */
  msg_flags = 0;
  len = sctp_recvmsg(sd, buf, sizeof(buf), (struct sockaddr *)&sin6,
        &sosz, &snd_rcv, &msg_flags);
  do_msg_stuff(sd, buf, len, &sin6, &snd_rcv, msg_flags);
}

Socket Options
SCTP on FreeBSD has over 20 socket options, which can be set using setsockopt(), and read with getsockopt(). Some of these options are BSD specific.

Notifications
There are many transport events that can be communicated up to the application. In SCTP, Randall explained, these events are called notifications. A new receive message flag, MSG_NOTIFICATION, defined in socket.h, marks a received message as a notification rather than data. As data is read, the message flags are checked.

Eight types of notifications are defined:

  • SCTP_ASSOC_CHANGE:
    This is the most commonly used, and tells the application when associations come and go.
  • SCTP_PEER_ADDR_CHANGE:
    This notification is for when the primary address goes down, or when addresses are added or removed. For example, this might be used in a telephony application to trigger an alarm that a cable has been cut and needs to be fixed.
  • SCTP_REMOTE_ERROR:
    This should never be needed, but is intended for when the remote end of an association sends an error chunk that SCTP doesn't know how to handle.
  • SCTP_SEND_FAILED:
    When a send fails, the unsent data will be passed back up to the application through this notification. It includes a flag that indicates whether or not the data was ever written to the wire.
  • SCTP_SHUTDOWN_EVENT:
    Notifies when the peer shuts down.
  • SCTP_ADAPTION_INDICATION:
    "32-bits of information on cookie, RDDP -- remote dma protocol. Pass information." [Editor's comment: my notes aren't clear here, sorry.]
  • SCTP_PARTIAL_DELIVERY_EVENT:
    This notification is used if, for example, a 64k message is being sent, and the peer is only reading it 1k at a time, during which time the connection goes away for some reason.
  • SCTP_AUTHENTICATION_EVENT:
    Packet authentication, signed messages, key changes, peer changed keys, etc.

Each notification has its own structure that's passed up to the application. For example, the SCTP_ASSOC_CHANGE event notification uses the following structure:

    struct sctp_assoc_change {
    	uint16_t	sac_type;
    	uint16_t	sac_flags;
    	uint32_t	sac_length;
    	uint16_t	sac_state;
    	uint16_t	sac_error;
    	uint16_t	sac_outbound_streams;
    	uint16_t	sac_inbound_streams;
    	sctp_assoc_t	sac_assoc_id;
    };
    

Randall ran out of time as he started explaining the purpose of each of the variables within the above structure. He then opened the floor to questions.

Questions
Q: What do you think of people referring to SCTP as KSCTP, where the K stands for Kitchen Sink?
A: Randall replied by noting two interesting events that happened in his life in 1980. First, he went to a local computer store and bought 16 more K for his Atari 100, maxing it out at 64K. That same year, TCP was standardized. Randall explained that TCP is old, and was developed at a time when 64K was a lot of RAM. Since then, processors have become a lot faster, and computers have a lot more memory. He suggested that applications shouldn't have to do work that can now be handled by the transport protocol. He has heard of SCTP being called the Kitchen Sink Protocol, but pointed out that it only has a 130-page spec, compared to a much more complicated TCP road map. He suggested that SCTP is cleaner, though it has more overhead. He concluded that by providing these features in the transport protocol, each application no longer has to keep reimplementing the same features.

Typo?

Lawrence D'Oliveiro (not verified)
on
May 19, 2008 - 5:45am

What exactly is AF_INET5?
