From: Stefani Seibold <stefani@seibold.net>
Changelog:
31.12.2010 first proposal
01.01.2011 code cleanup and fixes suggested by Eric Dumazet
02.01.2011 kick away UDP-Lite support
           change spin_lock_irq into spin_lock_bh
           faster udpcp_release_sock
           base is now linux-next
02.01.2011 fix camel style
           fix coding style
           fix types in comments
           add per-socket max. connection limit (protects against abuse)
           make udpcp adjustable through /proc/sys/net/ipv4/udpcp_

UDPCP is a communication protocol specified by the Open Base Station
Architecture Initiative Special Interest Group (OBSAI SIG). The
protocol is based on UDP and is designed to meet the needs of "Mobile
Communication Base Station" internal communications. It is widely used by
the major network infrastructure suppliers.
The UDPCP communication service supports the following features:
- Connectionless communication for serial mode data transfer
- Acknowledged and unacknowledged transfer modes
- Retransmission algorithm
- Checksum algorithm using Adler32
- Fragmentation of long messages (disassembly/reassembly) to match the MTU
  during transport
- Broadcasting and multicasting of messages to multiple peers in
  unacknowledged transfer mode
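
To illustrate the checksum named in the feature list, here is a minimal user-space sketch of the standard Adler-32 algorithm. The function name adler32_sum is hypothetical and not taken from the patch, and which bytes of a UDPCP packet the checksum covers is not stated here, so the sketch simply sums an arbitrary buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADLER_MOD 65521u	/* largest prime below 2^16 */

/*
 * Plain Adler-32 over a byte buffer (hypothetical helper name).
 * a is 1 plus the sum of all bytes, b is the sum of all intermediate
 * values of a, both modulo 65521. The result is (b << 16) | a.
 */
uint32_t adler32_sum(const uint8_t *data, size_t len)
{
	uint32_t a = 1, b = 0;

	assert(data != NULL || len == 0);
	for (size_t i = 0; i < len; i++) {
		a = (a + data[i]) % ADLER_MOD;
		b = (b + a) % ADLER_MOD;
	}
	return (b << 16) | a;
}
```

For the nine ASCII bytes of "Wikipedia" this yields 0x11E60398, and an empty buffer yields 1. Kernel code would more likely reduce modulo 65521 only every few thousand bytes instead of on every byte.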
UDPCP supports application level messages up to 64 KBytes (limited by the
16-bit packet data length field). Messages that are longer than the MTU will
be fragmented to fit the MTU.
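
As a rough illustration of the fragmentation rule above, the following sketch computes how many fragments a message needs. The function name fragment_count and the hdr_overhead parameter are hypothetical. The per-packet header overhead (IP, UDP and UDPCP headers) is passed in by the caller because the UDPCP header size is not given here.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of fragments needed to carry msg_len payload bytes when every
 * packet must fit into mtu bytes and hdr_overhead bytes of each packet
 * are taken by headers. An empty message still needs one packet.
 */
size_t fragment_count(size_t msg_len, size_t mtu, size_t hdr_overhead)
{
	size_t payload;

	assert(mtu > hdr_overhead);
	payload = mtu - hdr_overhead;	/* payload bytes per fragment */
	if (msg_len == 0)
		return 1;
	return (msg_len + payload - 1) / payload;	/* round up */
}
```

With an MTU of 1500 and an assumed overhead of 40 bytes per packet, a 65535 byte message would be split into 45 fragments.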
UDPCP provides a reliable transport service that will perform message
retransmissions in case transport failures occur.
The code is also a nice example of how to implement a UDP-based protocol as
a kernel socket module.
Due to the nature of UDPCP, which has no sliding window support, latency has
a huge impact. Implementing it as a kernel module increases performance by
about a factor of 10, because there are no context switches and data packets
or ACKs are handled in the interrupt service.
There are no side effects on the network subsystems, so I ask to merge it
into ...

Hmm, so 'connections' is increased, never decreased. This seems a fatal flaw in this protocol, since a malicious user can easily fill the list with garbage and block regular communications.
--
You are right, there is no way to detect which connection is no longer needed. I have not designed this protocol, so I cannot fix it. But in our environment this will be used together with a firewall and/or IPsec. In this case it is safe.
--
Hmm, the first thing that springs into my head as a possible band-aid (which is probably wrong for many reasons I've not considered, so feel free to shoot it down) is: couldn't we use a timer (set to some outrageously high value by default and admin tunable) that would decrement 'connections' (discount dead connections) when there has not been any activity for a huge period of time? Kill off connections that have been idle for ages. Not perfect, but that would at least let the system recover after a while if a malicious client did something nasty with many connections...
--
Jesper Juhl <jj@chaosbits.net> http://www.chaosbits.net/
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please.
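
The idle-timeout idea suggested above could be sketched roughly as follows, in plain user-space C. The conn_slot structure, the expire_idle function and its parameters are all hypothetical and are not taken from the patch. A kernel version would presumably use jiffies and a timer instead of time_t, and would need locking around the table.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical per-peer connection slot; not taken from the patch. */
struct conn_slot {
	int in_use;		/* non-zero while the slot holds a connection */
	time_t last_activity;	/* time of the last packet seen for this peer */
};

/*
 * Free every slot that has been idle for longer than max_idle seconds
 * and decrement *connections for each one. Returns the number of slots
 * that were freed.
 */
size_t expire_idle(struct conn_slot *tab, size_t n, time_t now,
		   time_t max_idle, size_t *connections)
{
	size_t freed = 0;

	assert(tab != NULL && connections != NULL);
	for (size_t i = 0; i < n; i++) {
		if (tab[i].in_use && now - tab[i].last_activity > max_idle) {
			tab[i].in_use = 0;
			(*connections)--;
			freed++;
		}
	}
	return freed;
}
```

Such a sweep only bounds how long a stale entry can occupy a slot. It cannot tell a dead peer from a valid one that is merely quiet.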
This will not work for two reasons:
- First, there is no way to detect a dead connection. A connection can stay open for a very long time without data transfer.
- Second, it will not protect against an attack where all communication slots are eaten by an attacker, so that new valid connections will not be handled.

The only thing that is possible is to add an ioctl call which allows the user land client to cancel connections. But in my opinion this would be dead code, because white lists of trusted addresses would have to be maintained, and this would make upgrading an infrastructure too complicated.
--
Yep, and as UDP messages can easily be spoofed, this means you need more than a list of trusted addresses. You also need to encapsulate the thing in a secured layer.

Stefani, your implementation has very little chance of being added to the standard kernel, because it is not correctly layered or documented. Copying hundreds (thousands?) of lines from existing code only shows there is a design error in your proposal. It means every time we have to make a change in this code, we'll have to do it twice.

SUNRPC uses UDP/TCP sockets, and uses callbacks to existing UDP/TCP code; maybe you should take a look at it to implement a UDPCP stack in the kernel.

For instance, a pure socket API does not seem to be the correct choice for UDPCP, since a transmit should give a report to the user of the frame being delivered/acknowledged or not to/by the remote side? With send(), this means you have only one message in transit, no asynchronous handling. At the least you forgot to document the API and its restrictions.
--
I copied about 400 of 3000 lines, which were heavily modified to meet my needs. And I use only documented features of the Linux IP stack. So it is normal to have duplicate code for the basics. How can you do routing, how can you determine the MTU of the route? These are basics. Looking into other code to see how these things are handled is in my opinion the right way, since there are no functions provided to do this. Otherwise you can say the same about all the filesystem or PCI drivers, which also do a lot in the same way. But since this is the

I have looked around the whole Linux source code, also in the SUNRPC

This will be done through the error queue. The user client will receive

No, the messages will be queued. You can have more than one message in

API documentation is there, I can provide it under Documentation/udpcp.txt if you like. Here is the API documentation:

Socket interface programming manual

The socket interface is derived from the UDP sockets. All setsockopt(), getsockopt() and ioctl() kernel system calls which are valid for UDP sockets should work on UDPCP sockets. There are some extensions to the sockopt and ioctl interface for the UDPCP sockets. Include the C header file <net/udpcp.h> to use the UDPCP socket options and ioctl calls.

A UDPCP socket can be opened with socket(PF_INET, SOCK_DGRAM, PF_UDPCP). All operations which are valid for UDP sockets can also be performed with UDPCP sockets.

sockopt

setsockopt and getsockopt are defined as follows:

    int getsockopt(int sockfd, int level, int optname, void *optval, socklen_t *optlen);
    int setsockopt(int sockfd, int level, int optname, const void *optval, socklen_t optlen);

The level parameter for the UDPCP socket is SOL_UDPCP, where the following options are defined:

UDPCP_OPT_TRANSFER_MODE - set the default transfer mode. The optval is one of the following:
    UDPCP_NOACK - no ACK for the transmitted message is required
    UDPCP_ACK - an ACK for each transmitted message fragment is ...
Hmm, how can user land perform this task then? Is there an open source implementation of UDPCP? What are its problems? You say it's dog slow, I really wonder why. The UDP stack is pretty scalable these days, yet some improvements are possible.

Why not add generic helpers if you believe you miss some

These drivers are here because of high performance on top of high performance specs, while UDPCP is only a layer above UDP. If the problem comes from UDP being too slow, it'll be slow too.
--
Userspace is much more complicated and has more overhead than kernel space.

UDP is fast... but UDPCP depends extremely on latency due to the missing of

Maybe I don't have the knowledge, maybe I don't have the time. Getting new API functions into Linux is much more complicated than getting a new driver into Linux. I know what I am talking about, it took me one year to the

Because of latency. Handling UDPCP in the data_read() bh function is much faster:
- No context switch
- Assembly of multi-fragment messages is very efficient using skb buffer chaining.
- Immediately handling an ACK or data message saves a lot of latency

Implementing it in user space is too slow, due to the context switches. Also the sunrpc approach is not faster, due to its use of kernel threads, which are not better than user space (okay, a little bit, because the MMU is not switched).

The implementation is clean. I fixed all the issues I was asked to fix. The protocol now has absolutely no side effects. So I ask again for a merge into linux-next.

- Stefani
--
