What do we know about TCP vs UDP?
The internet, blockchain, cloud computing – no matter the field, computers are most effective when they are connected to one another. However, the process of sending data, and making sure it reaches its destination intact, demands a complex interconnection of networks and nodes. Before delving into the specifics of the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP), it’s important to first take a step back.
All networking models aim to describe a similar process: how bits of data are sent across wires as electrical pulses, or via fiber optic cables in patterns of light. And, on the receiving end, how these bits of data are received and reassembled.
Transferring data from one application on one computer to another would technically be possible with a single protocol. However, this method would be incredibly fragile: any modifications in hardware would mean the entire application and protocol software need to be changed. To achieve the flexibility required for connecting devices, networking uses layered stacks of different protocols. Each layer in the stack has a specific function and interacts with the layers directly above and below it.
Today, the most comprehensive framework is the Open Systems Interconnection (OSI) model. Real-world stacks built around UDP and TCP merge or skip some of its layers, so OSI is the best jumping-off point for delving deeper.
A rapid-fire overview of the OSI model
Physical layer
At the most basic level, this layer is made up of the mechanical elements of data sending and reception. This determines how the pins and wires encode the 0s and 1s into a signal that is then sent to local media in the form of light, electrical, and radio signals.
Data Link Layer (DLL)
The data link layer acts as a middleman between a device’s physical components and the network-based data packets. The bit stream in the physical layer is essentially unstructured, and it’s this layer’s job to package it into frames. This takes the constant stream of bits and groups it into discrete units, a process called framing. Here, each frame is given a header that labels its source and intended destination, typically as hardware (MAC) addresses.
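As a rough illustration of framing, the sketch below wraps a payload in a simplified, Ethernet-like frame: a 14-byte header (destination address, source address, payload type) plus a CRC-32 trailer. The layout, the helper names `build_frame` and `parse_frame`, and the way the checksum is computed are assumptions for illustration; a real network card handles this in hardware and its bit-level details differ.

```python
import struct
import zlib

# Simplified header: 6-byte destination MAC, 6-byte source MAC, 2-byte type.
HEADER = struct.Struct("!6s6sH")


def build_frame(dst_mac: bytes, src_mac: bytes, ether_type: int, payload: bytes) -> bytes:
    """Prefix the payload with a header and append a CRC-32 trailer."""
    header = HEADER.pack(dst_mac, src_mac, ether_type)
    trailer = struct.pack("!I", zlib.crc32(header + payload))
    return header + payload + trailer


def parse_frame(frame: bytes):
    """Check the trailer, then split the frame back into its fields."""
    header, payload, trailer = frame[:HEADER.size], frame[HEADER.size:-4], frame[-4:]
    (expected,) = struct.unpack("!I", trailer)
    if zlib.crc32(header + payload) != expected:
        raise ValueError("frame corrupted in transit")
    dst_mac, src_mac, ether_type = HEADER.unpack(header)
    return dst_mac, src_mac, ether_type, payload


# Hypothetical addresses; 0x0800 is the type value used for IPv4 payloads.
frame = build_frame(b"\xaa" * 6, b"\xbb" * 6, 0x0800, b"hello")
```

Flipping any byte of `frame` before calling `parse_frame` raises `ValueError`, which is the kind of damage the trailer exists to catch.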
Network layer
With the frames handled, the network (or internet) layer is able to route packets of data from one device to another across networks. It relies on each device having an IP address, which is used to identify the device and determine the route that packets should take to reach it.
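To make the routing idea concrete, here is a minimal sketch of how a router might pick the next hop for a destination address, using Python’s standard `ipaddress` module and the common longest-prefix-match rule. The routing table and gateway names are invented for the example.

```python
import ipaddress

# Hypothetical routing table: (network, next hop). The /0 entry is the default route.
ROUTES = [
    (ipaddress.ip_network("10.0.0.0/8"), "gateway-a"),
    (ipaddress.ip_network("10.1.0.0/16"), "gateway-b"),
    (ipaddress.ip_network("0.0.0.0/0"), "default-gateway"),
]


def next_hop(destination: str) -> str:
    """Return the next hop of the most specific route containing the address."""
    address = ipaddress.ip_address(destination)
    matches = [(network, hop) for network, hop in ROUTES if address in network]
    # The longest prefix is the most specific match, so it wins.
    _, hop = max(matches, key=lambda match: match[0].prefixlen)
    return hop
```

Here `next_hop("10.1.2.3")` matches all three routes but resolves to `gateway-b`, because `/16` is the most specific of them.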
Transport layer
Sitting between the network and session layers, the transport layer focuses on delivering messages between applications running on different hosts. It keeps applications and services simpler, as the layers above it don’t need to consider the unique characteristics of every device’s communication network. This layer is home to the UDP and TCP protocols that we’ll get to shortly.
Session layer
The transport layer doesn’t care if two computers have already communicated – it just focuses on shuttling packets back and forth. The session layer organizes traffic into a logical beginning and end.
Presentation layer
The volume of these data packets can be very high. The presentation layer mainly focuses on encoding and compressing these packets in order to reduce their bandwidth demands before they leave the sender’s device.
Application Layer
Finally, this received data (or data about to be sent out) needs to be communicated to the user. While OSI is a useful model, keep in mind that no modern protocols follow its precise contours across all seven layers. For instance, TCP and UDP operate at the transport layer and rely on IP at the network layer beneath them.
What is TCP
The Transmission Control Protocol is one of the internet’s founding protocols. First specified in the 1970s and standardized as RFC 793 in 1981, it was designed to segment data into packets and deliver them reliably. In the decades since, it has grown and developed into the TCP/IP stack that is today’s standard. The TCP/IP model is split into 4 layers: link, internet (network), transport, and application.
As the name suggests, TCP is most influential at the transport layer. Before any data is sent, TCP establishes a connection between two network endpoints via a three-way handshake.
First, the client initiates the connection by sending a SYN (synchronize) message carrying a randomly chosen initial sequence number. When it is received, the server replies with a SYN-ACK: it acknowledges the client’s number by adding 1 to it, and includes a random initial sequence number of its own. Finally, the client acknowledges the SYN-ACK with an ACK message containing the server’s number plus 1. This virtual handshake ensures that both the sender and receiver are ready to communicate and know that the channel is established. If nothing is listening on the destination port, the server answers the SYN with a TCP RST (reset) packet instead of a SYN-ACK.
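Applications rarely see these individual messages, because the operating system performs the handshake on their behalf. The sketch below, which assumes a machine with a working loopback interface, shows where it happens in Python’s `socket` API: `create_connection` only returns once the handshake has completed, and connecting to a port with no listener typically fails with `ConnectionRefusedError`, which is how a reset surfaces to the caller. The function name `handshake_demo` is made up for the example.

```python
import socket


def handshake_demo():
    """Connect to a local listener, then to the same port after it has closed."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        listener.listen(1)
        listener.settimeout(2)
        address = listener.getsockname()

        # The kernel exchanges SYN, SYN-ACK and ACK inside this call.
        with socket.create_connection(address, timeout=2) as client:
            conn, peer = listener.accept()
            with conn:
                established = peer == client.getsockname()

    # The listener is closed now, so nothing accepts connections on that port.
    try:
        socket.create_connection(address, timeout=2).close()
        refused = False
    except ConnectionRefusedError:
        refused = True
    return established, refused
```

Calling `handshake_demo()` on a typical desktop system is expected to report one established connection followed by one refused attempt.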
It’s not just the initial process that relies on acknowledgment: a four-step termination confirms when each side has finished sending data. After all relevant pieces of data have been sent across, the client sends a FIN segment carrying its current sequence number, and the server acknowledges it. Once the server has finished sending its own data, it’s the server’s turn to send a FIN packet, which the client acknowledges, and the connection is terminated.
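This orderly shutdown can be observed from application code as well. In the sketch below (again assuming a loopback interface, with illustrative function names), the client calls `shutdown(socket.SHUT_WR)` to signal that it has finished sending, the server sees this as an empty read, sends its reply, and then closes its own side.

```python
import socket
import threading


def run_server(listener, received):
    """Read until the client signals it is done, then reply and close."""
    conn, _ = listener.accept()
    with conn:
        chunks = []
        while True:
            data = conn.recv(1024)
            if not data:              # empty read: the client's FIN has arrived
                break
            chunks.append(data)
        received.append(b"".join(chunks))
        conn.sendall(b"got %d bytes" % len(received[0]))
    # Leaving the with-block closes conn, which sends the server's own FIN.


def graceful_close_demo():
    received = []
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        worker = threading.Thread(target=run_server, args=(listener, received))
        worker.start()
        with socket.create_connection(listener.getsockname(), timeout=2) as client:
            client.sendall(b"hello over tcp")
            client.shutdown(socket.SHUT_WR)   # "I have no more data to send"
            reply = b""
            while True:
                data = client.recv(1024)
                if not data:          # empty read: the server has closed too
                    break
                reply += data
        worker.join(timeout=2)
    return received[0], reply
```

The point of the half-close is that the client can stop sending while still receiving the server’s reply, which is why each direction needs its own FIN.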
The importance of IP
Critical to this process is the fact that every connection needs to have a destination defined before being sent. Only after that can the TCP handshake take place. This is where the Internet Protocol (IP) acts as the other side of the TCP/IP coin.
Every device reachable on the internet is identified by an IP address. In the header of every data packet is the IP address of the intended recipient (alongside the sender’s), meaning data gets to where it’s needed. Once the IP address is identified, the TCP process can initiate the transferral of data. The data that’s received needs to get to the right application, however, which is why TCP packets include source and destination port numbers. Consider the IP address as a postcode, and the TCP port as a specific house number. Finally, TCP packets include a sequence number: similar to numbered pages in a letter, these let the receiver put the packets back in the order they were sent.
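The roles of port and sequence numbers can be sketched with a toy model. The `Segment` class and `reassemble` function below are simplifications invented for this example: the sequence number is treated as the byte offset of each payload, and duplicates, gaps, and retransmission are ignored.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    dst_port: int     # which application on the host should get the data
    seq: int          # byte offset of this payload within the stream
    payload: bytes


def reassemble(segments, port):
    """Pick out the segments for one port and restore their original order."""
    mine = [segment for segment in segments if segment.dst_port == port]
    ordered = sorted(mine, key=lambda segment: segment.seq)
    return b"".join(segment.payload for segment in ordered)


# Segments arriving out of order and interleaved with traffic for another port.
arrived = [
    Segment(dst_port=8080, seq=6, payload=b"world"),
    Segment(dst_port=2222, seq=0, payload=b"other app"),
    Segment(dst_port=8080, seq=0, payload=b"hello "),
]
```

With this input, `reassemble(arrived, 8080)` yields `b"hello world"` even though the second half arrived first.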
While TCP/IP has become the standard for safe and reliable data transmission over the internet, a major advantage of the IP approach is its flexibility – and TCP isn’t the only way to transport data from point A to point B.
What is UDP
The User Datagram Protocol (UDP) is a lightweight data transport protocol that works on top of IP’s foundations. Instead of TCP’s handshake process, however, UDP sends small, independent packets known as datagrams, without establishing a connection first. While this may appear to be risky, UDP is intended for applications that can tolerate dropped data without suffering a critical communications breakdown.
Each datagram is a self-sufficient unit and contains no information on the pieces of data coming before or afterwards. This way, instead of stalling on a backlog of delayed or retransmitted packets, a receiver can simply process each datagram as it arrives. This is helped by how little overhead UDP carries. Consider how TCP segments carry sequence numbers, acknowledgments, and window information: by dropping these and keeping only the ports, a length, and a checksum, the UDP header shrinks to 8 bytes, against TCP’s minimum of 20. With no connection state to track, these streamlined datagrams can be processed very quickly.
As a result, UDP is used in services that need near-real-time processing. Think live video streaming and computer games: they may require a lot of UDP datagrams, but their speed is necessary for a good user experience. And if a packet drops or goes missing, the real-life implications would be perhaps a missing pixel that lasts for a frame. Domain Name System (DNS) queries, which translate a human-readable domain name into an IP address, also typically use UDP. These requests are tiny, consisting of a single request and reply pair, but they need to happen quickly, making UDP a natural use case.
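For comparison with TCP, the following sketch sends a single datagram between two UDP sockets on the local machine. It assumes a loopback interface, where delivery is dependable in practice; across a real network the same `sendto` call could silently lose the datagram. The function name `udp_demo` is illustrative.

```python
import socket


def udp_demo(message: bytes = b"ping") -> bytes:
    """Send one datagram over loopback with no connection set-up."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        receiver.settimeout(2)            # give up rather than wait forever

        # No connect, no handshake: the datagram is simply addressed and sent.
        sender.sendto(message, receiver.getsockname())

        data, _source = receiver.recvfrom(2048)
        return data
```

Note that the sender gets no confirmation either way; if an application needs one, it has to build that on top of UDP itself.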
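The size difference is easiest to see by laying the two headers out with Python’s `struct` module. The field lists below follow the standard UDP header and the fixed part of the TCP header (without options); the helper `udp_header` is a made-up name, and its checksum is left at 0, which in IPv4 means that no checksum was computed.

```python
import struct

# UDP: source port, destination port, length, checksum (four 16-bit fields).
UDP_HEADER = struct.Struct("!HHHH")

# TCP without options: source port, destination port, sequence number,
# acknowledgment number, data offset/reserved, flags, window, checksum,
# urgent pointer.
TCP_HEADER = struct.Struct("!HHIIBBHHH")


def udp_header(src_port: int, dst_port: int, payload: bytes, checksum: int = 0) -> bytes:
    """Build a UDP header; the length field covers header plus payload."""
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, checksum)


header = udp_header(50000, 53, b"query")
```

`UDP_HEADER.size` comes to 8 bytes and `TCP_HEADER.size` to 20, which is the gap described above.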
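To show just how small such a request is, the sketch below builds the bytes of a basic DNS query for an address (A) record by hand. The function name and the default query ID are arbitrary choices for the example, and actually sending the bytes to a resolver over UDP port 53 is deliberately left out so the code runs without network access.

```python
import struct


def build_dns_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Encode a single-question DNS query for an A record."""
    # Header: ID, flags (0x0100 = recursion desired), one question,
    # and zero answer, authority, and additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)

    # The name is sent as length-prefixed labels ending with a zero byte.
    labels = hostname.rstrip(".").split(".")
    qname = b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels)
    qname += b"\x00"

    # QTYPE 1 = A record, QCLASS 1 = internet.
    question = qname + struct.pack("!HH", 1, 1)
    return header + question


query = build_dns_query("example.com")
```

The whole query for `example.com` is 29 bytes, comfortably inside a single datagram.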
Feature-by-feature differences between TCP and UDP
Feature | UDP (User Datagram Protocol) | TCP (Transmission Control Protocol) |
Connection | Connectionless, does not establish a connection before sending data. | Connection-oriented, establishes a three-way handshake before data transfer. |
Reliability | Does not guarantee delivery or order; error checking is limited to a checksum. | Acknowledgments, retransmission, and sequence numbers ensure delivery in the correct order. |
Speed | Fast. | Slower than UDP due to handshake, acknowledgment, and retransmission overhead. |
Data flow control | No flow control; sends data as fast as the sender can generate and the network can handle. | Flow control is managed through windowing techniques to adjust the data flow. |
Usage | Used for time-sensitive applications where loss of some data is tolerable (e.g., live audio/video streaming). | Used where reliable data delivery is crucial (e.g., web browsing, email). |
Header size | Smaller header size (8 bytes). | Larger header size (20 bytes minimum). |
Sequencing | Does not sequence data packets. | Sequences data packets to ensure ordered delivery. |
Some final security-flavored food for thought
When choosing between UDP and TCP within your own projects, keep in mind that, while UDP’s speed makes it an attractive prospect, it is inherently more susceptible to security issues such as spoofing and Denial of Service attacks. TCP’s handshake mechanism makes it much harder to complete a connection from a forged IP address, but note that TCP isn’t immune. SYN flood attacks abuse this back-and-forth mechanism: the attacker sends a stream of SYN messages and never follows up, forcing the server to hold resources while waiting for final ACK messages that never come.
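Why unanswered handshakes hurt can be shown with a toy model of the server’s bookkeeping. The `HalfOpenQueue` class, its capacity, and the addresses below are all invented for illustration; real operating systems use larger queues, timeouts, and defenses such as SYN cookies.

```python
class HalfOpenQueue:
    """Toy model of handshakes that are still waiting for their final ACK."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.pending = set()

    def on_syn(self, client) -> bool:
        """Record a new handshake, or drop it if the queue is already full."""
        if len(self.pending) >= self.capacity:
            return False
        self.pending.add(client)
        return True

    def on_ack(self, client) -> bool:
        """Complete a handshake, freeing its slot in the queue."""
        if client in self.pending:
            self.pending.remove(client)
            return True
        return False


def simulate_flood(capacity: int = 4) -> bool:
    """Fill the queue with handshakes that never finish, then try a real client."""
    queue = HalfOpenQueue(capacity)
    for i in range(capacity):
        # Forged senders (documentation-range addresses) that never send an ACK.
        queue.on_syn(("198.51.100.%d" % i, 40000 + i))
    # A legitimate client now finds no room left.
    return queue.on_syn(("192.0.2.10", 51000))
```

In this model `simulate_flood()` returns `False`: the genuine client is turned away because every slot is held by a handshake that will never complete.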