Nine Years of UDT


This August marks the ninth year of UDT development. The project originated from SABUL (Simple Available Bandwidth Utilization Library), which I worked on together with Xinwei Hong (now at Microsoft) between 2001 and 2003. Around 2000, researchers had noticed that stock TCP (NewReno) was not efficient on the increasingly widespread OC-12 and 1GE networks connecting research labs around the world. SABUL was one of the first research projects to address the problem.

SABUL used UDP to transfer large data blocks and TCP to transfer control packets (e.g., for reliability). It ran very well for our own applications on private networks, but three areas required significant further work. First, the congestion control algorithm was not suitable for shared networks. Second, the use of TCP for the control channel limited the protocol’s design choices. Third, the API was not friendly to generic application development.

In the second half of 2002, I started to design a new protocol to remove these limitations. Bob later named this protocol UDT because it is built entirely on top of UDP and uses a single UDP socket for both data and control packets. The first version of UDT was released in early 2003. Compared to SABUL, UDT provides a streaming-style API that emulates TCP socket semantics, which was an important step toward gaining a large user community.

I spent about one year investigating congestion control algorithms. Having considered many approaches, I chose to modify the traditional loss-based AIMD (additive increase, multiplicative decrease) algorithm, which had worked stably with TCP. Delay-based approaches have several attractions, especially that they are less affected by packet loss unrelated to congestion, but they face a fundamental problem: learning the “base” delay value. An inaccurate “base” value can make a delay-based algorithm either too aggressive, if the base is overestimated, or too “friendly” otherwise (co-existing flows may simply kill it).
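As a rough illustration of the loss-based AIMD idea mentioned above, here is a minimal Python sketch. It shows textbook AIMD only, not UDT's modified algorithm; the function names, the step sizes, and the fixed-capacity loss model are all assumptions made for the example.

```python
def aimd_step(rate, loss_detected, increase=1.0, decrease=0.5, floor=1.0):
    """Return the next sending rate (packets per interval) under plain AIMD.

    The increase/decrease/floor values are illustrative defaults, not UDT's.
    """
    if loss_detected:
        # Multiplicative decrease: back off sharply on a loss signal.
        return max(floor, rate * decrease)
    # Additive increase: probe for spare bandwidth a little at a time.
    return rate + increase


def simulate(capacity, rounds, start=1.0):
    """Run AIMD against a hypothetical link that drops packets
    whenever the sending rate exceeds its capacity."""
    rate, history = start, []
    for _ in range(rounds):
        history.append(rate)
        rate = aimd_step(rate, loss_detected=rate > capacity)
    return history


if __name__ == "__main__":
    # The rate climbs linearly, overshoots the capacity, halves, and climbs again.
    print(simulate(capacity=4.0, rounds=8))
```

The sawtooth this produces is the reason plain AIMD is slow to fill very fast, long-distance links: after each loss, the additive climb back up takes many round trips.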

UDT uses the packet-pair technique to estimate the available bandwidth, so that it can rapidly discover and use large amounts of bandwidth while still sharing the link fairly and in a friendly manner with other flows.
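The general packet-pair idea can be sketched as follows: two packets sent back to back are spread apart by the bottleneck link, so the gap between their arrivals gives a bandwidth estimate of roughly packet size divided by gap. The Python below is a generic sketch under that assumption; the function name and the choice of a median filter to suppress outliers are illustrative, not a description of UDT's internals.

```python
from statistics import median


def packet_pair_estimate(arrival_gaps_s, packet_size_bytes):
    """Estimate bandwidth in bits per second from packet-pair arrival gaps.

    arrival_gaps_s: gaps (seconds) between the two packets of each probe pair.
    packet_size_bytes: size of the second packet in each pair.
    Returns None when no usable sample is available.
    """
    gaps = [g for g in arrival_gaps_s if g > 0]  # discard bogus samples
    if not gaps:
        return None
    # A median is less sensitive than a mean to pairs delayed by cross traffic.
    typical_gap = median(gaps)
    return packet_size_bytes * 8 / typical_gap


if __name__ == "__main__":
    # Hypothetical measurements: 1500-byte packets, gaps around 1.2 ms.
    samples = [0.0012, 0.0011, 0.0040, 0.0012, 0.0013]
    print(packet_pair_estimate(samples, 1500))
```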

In addition to its native congestion control algorithm, I have also implemented most of the major congestion control algorithms available by 2005. To this end, I made UDT a composable framework so that a new control algorithm can be implemented easily by overriding several callback functions. Because of feedback delay and unknown coexisting flows, there is no “perfect” control algorithm. Each algorithm may work well in some situations but behave poorly in others. UDT can be quickly customized to suit a specific environment.
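The "override a few callbacks" design described above might look something like the following Python sketch. The class names, hook names, and window arithmetic are hypothetical and chosen for illustration; they are not UDT's actual C++ interface.

```python
class CongestionControl:
    """Base class: the transport calls these hooks; subclasses override them.

    The default hooks do nothing, so the window stays fixed.
    """

    def __init__(self):
        self.window = 16.0  # packets allowed in flight (illustrative default)

    def on_ack(self):
        pass

    def on_loss(self):
        pass

    def on_timeout(self):
        pass


class HalvingControl(CongestionControl):
    """A simple window-based policy built only by overriding the hooks."""

    def on_ack(self):
        self.window += 1.0 / self.window  # gentle growth per acknowledgement

    def on_loss(self):
        self.window = max(2.0, self.window / 2)  # halve on loss

    def on_timeout(self):
        self.window = 2.0  # start over after a timeout


def replay(controller, events):
    """Feed a sequence of 'ack' / 'loss' / 'timeout' events to a controller."""
    handlers = {
        "ack": controller.on_ack,
        "loss": controller.on_loss,
        "timeout": controller.on_timeout,
    }
    for event in events:
        handlers[event]()
    return controller.window


if __name__ == "__main__":
    print(replay(HalvingControl(), ["ack", "ack", "loss", "ack"]))
```

Because the transport only ever talks to the base-class hooks, swapping in a different policy means passing a different subclass; nothing else in the sending path has to change.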

By 2005, UDT3 had already become production ready and had a large user community. As the user community grew, people started to use UDT in commodity networks with relatively small bandwidth (e.g., cable and DSL). An important feature of UDP accelerated this change: it is much easier to punch through a firewall with UDP than with TCP, and UDT is built entirely on UDP.

This is a completely different usage scenario from the original design goal. While UDT could easily scale to inter-continental multi-10GE networks, it did not scale well to high concurrency. This motivated the birth of UDT4 in 2007.

UDT4 introduces UDP multiplexing and buffer sharing to allow an application to start a very large number of UDT connections. UDP multiplexing makes firewall management and NAT punching much easier because a single UDP socket can carry multiple UDT connections. Buffer sharing significantly reduces memory usage. Today UDT can efficiently support 100,000 concurrent connections on a commodity computer. Scalability will improve further when the epoll API and session multiplexing over UDT connections are completed.
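To make the multiplexing idea concrete, here is a small Python sketch of how datagrams arriving on one UDP socket could be routed to many logical connections. The 4-byte connection ID header is a hypothetical layout invented for this example, not UDT's wire format, and the datagrams are handled in memory rather than read from a real socket.

```python
import struct
from collections import defaultdict

# Hypothetical header: a 4-byte connection ID in network byte order.
HEADER = struct.Struct("!I")


def pack(conn_id, payload):
    """Prefix a payload with the ID of the connection it belongs to."""
    return HEADER.pack(conn_id) + payload


class Demultiplexer:
    """Routes datagrams from one shared socket to per-connection queues."""

    def __init__(self):
        self.queues = defaultdict(list)

    def dispatch(self, datagram):
        """Queue the payload for its connection; return the ID, or None if dropped."""
        if len(datagram) < HEADER.size:
            return None  # too short to carry a header; drop it
        (conn_id,) = HEADER.unpack_from(datagram)
        self.queues[conn_id].append(datagram[HEADER.size:])
        return conn_id


if __name__ == "__main__":
    demux = Demultiplexer()
    for dgram in (pack(7, b"hello"), pack(9, b"other"), pack(7, b"world")):
        demux.dispatch(dgram)
    print(dict(demux.queues))
```

In a real program the datagrams would come from `recvfrom` on a single UDP socket, which is why only one port needs to be opened on the firewall or mapped through the NAT no matter how many connections are active.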

Over these years, I have received a great deal of useful feedback and learned a lot from users. In particular, many features were motivated by users’ requirements. Several users have even developed their own UDT implementations, while others have created and shared wrappers for non-C++ programming languages.

UDT has also benefited a lot from the open source approach, which helps it reach a larger community. Users can review the code, debug it, and submit bug fixes whenever necessary, which greatly improves code quality.

As a user-space protocol, UDT is able to incorporate relevant new technologies from computer networking and adapt to new network environments and usage scenarios. I am confident that UDT will continue to evolve and serve data transfer needs in more and more applications.

About Yunhong Gu
Yunhong is a computer scientist and open source software developer. He is the architect and lead developer of the open source software UDT, a versatile application-level UDP-based data transfer protocol that outperforms TCP in many cases, and of Sector/Sphere, cloud computing system software that supports distributed data storage, sharing, and processing. Yunhong earned a PhD in Computer Science from the University of Illinois at Chicago in 2005.

4 Responses to Nine Years of UDT

  1. 千里孤行 says:

    When will these two features, epoll API and session multiplexing over UDT connections, be available? Thanks.

  2. David says:

    Dear Yunhong,

    Thank you so very much for your truly pioneering and amazing work!! Society is truly in your debt.

    I have three questions if you would be so kind, and please pardon my ignorance regarding these topics which must strike you as obvious:

    1) Is there any specific congestion control algorithm which gives the highest speed for uploading and downloading on common consumer internet networks, such as 10 megabit/second down and 4 megabit/second up with 17ms ping?

    2) What is the best approach regarding adding security comparable to ssl/tls on top of udt, in terms of security and speed of use? The use case is in uploading files from a java applet to a custom multi-threaded udt server.

    3) Do you know of any open source projects which implement a multi-threaded udt4 server? If not, do you have any tips you’d like to share on the optimal design of one?

    Thanks again!!!

    Best,
    David

    • Yunhong Gu says:

      Hi, David

      Thanks for the comment. There is no “best” algorithm for all situations; an algorithm may be optimized for raw throughput, latency, or minimal impact on other traffic. UDT tries to be generic across these aspects. Also, protocol design (how packets are acknowledged, etc.) can have a significant impact.

      Security should work much as it does with TCP plus SSL/TLS. You can encrypt the data, send it over UDT, and decrypt it. The difficult part is key exchange, which you can also borrow from SSL (or simply use SSL to exchange the key and then use UDT for the subsequent data traffic).

      GridFTP can send data using UDT. Sector can also provide data serving over a file system interface.
