Transport Layer Protocols: TCP, UDP, and QUIC
Why should a system designer care about what happens between IP and the application? Because IP only answers one question: "which host should receive this packet?" It does not promise the packet will arrive, arrive once, or arrive in order. The transport layer sits directly above IP and turns that best-effort packet delivery into a usable conversation between processes on different machines.
Think of it this way. IP is like a postal service that delivers envelopes to a building's street address. The transport layer is the mailroom inside the building that reads the apartment number (port) and drops each envelope into the correct mailbox (process). Without that mailroom, every envelope piles up at the front door with no way to sort it.
Core Transport Responsibilities
The transport layer handles four core responsibilities:
- Port-based demultiplexing: the kernel uses source and destination ports to route traffic to the correct socket, so Nginx on port 443 and PostgreSQL on port 5432 each receive only their own packets. A port is a 16-bit number (0 to 65535), giving each host 65,536 possible endpoints per transport protocol.
- Segmentation and reassembly: a 5 MB response cannot cross a network path whose MTU is roughly 1500 bytes in one piece. The transport breaks it into segments and rebuilds the original payload on the receiving side. Without segmentation, large application messages would be impossible to send over physical networks with bounded frame sizes.
- Optional reliability and ordering: some transports guarantee every byte arrives in order; others deliberately skip that guarantee. This is the dimension that most strongly differentiates TCP, UDP, and QUIC from each other.
- Flow and congestion control: protecting the receiver from being overwhelmed (flow control) and protecting the network from being overloaded (congestion control). These are separate problems with separate mechanisms, and not every transport implements both.
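Port-based demultiplexing is easy to see in a few lines of code. The sketch below (stdlib only, loopback addresses and port 0 "pick any free port" are illustrative choices) binds two UDP sockets to different ports on the same host; the kernel routes each datagram to the socket bound to its destination port.

```python
import socket

# Two "services" on the same host, distinguished only by port number.
# The kernel demultiplexes incoming datagrams by destination port.
svc_a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
svc_a.bind(("127.0.0.1", 0))          # port 0 = let the OS pick a free port
svc_b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
svc_b.bind(("127.0.0.1", 0))

port_a = svc_a.getsockname()[1]
port_b = svc_b.getsockname()[1]

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.sendto(b"for A", ("127.0.0.1", port_a))
client.sendto(b"for B", ("127.0.0.1", port_b))

# Each socket sees only the datagram addressed to its own port.
msg_a, _ = svc_a.recvfrom(1024)
msg_b, _ = svc_b.recvfrom(1024)
print(msg_a, msg_b)   # b'for A' b'for B'
```

The same mechanism is what lets Nginx and PostgreSQL coexist on one machine: both receive IP packets addressed to the same host, but the transport layer sorts them by port.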
Three Protocols, Three Philosophies
Today three transport protocols matter for system design:
- TCP provides a reliable ordered byte stream. It handles retransmission, ordering, flow control, and congestion control automatically. The application writes bytes in and reads bytes out, and TCP handles everything in between.
- UDP provides lightweight independent datagrams with almost no built-in guarantees. It adds port-based demultiplexing and a checksum, and nothing more. The application decides what to do about loss, ordering, and congestion.
- QUIC runs over UDP but adds its own reliability, stream multiplexing, congestion control, and built-in TLS 1.3, all implemented in user space so it can evolve without waiting for kernel upgrades.
Why three? Because different workloads have fundamentally different relationships with failure. A bank transfer cannot afford to lose a single byte. A voice call cannot afford to wait 200 ms for a retransmission. A mobile web session cannot afford to restart from scratch every time the user's phone switches cell towers. No single transport protocol can optimize for all three of those constraints at once.
Transport choice is really failure-semantics choice. You are deciding whether the system should prefer completeness, timeliness, or evolvability when the network is imperfect, and the network is always imperfect.
Choosing a transport protocol is not a small implementation detail. It determines handshake cost, how the system behaves under packet loss, whether streams can make independent progress, and how quickly the protocol can evolve. Those choices shape latency, scalability, and incident behavior across the entire application. In a system design interview, explaining why you chose a transport protocol is often more valuable than naming one.
TCP exists to turn best-effort IP into a reliable ordered byte stream. It is connection-oriented, stateful, and conservative: it would rather delay delivery than hand the application incomplete or out-of-order data. That philosophy makes TCP the workhorse of the internet. HTTP, database protocols, email, SSH, and most business APIs all run on TCP because they value correctness over raw speed.
Why does TCP succeed so broadly? Because most applications need exactly this contract: "send these bytes, deliver them all, in order, and tell me if that becomes impossible." That contract is simple to program against. The application writes bytes to a socket and reads bytes from the other end. TCP handles segmentation, retransmission, ordering, flow control, and congestion control underneath. The application does not need to know about any of it.
Connection Setup: The Three-Way Handshake
Before any data moves, TCP performs a three-way handshake so both sides agree on connection state and synchronize initial sequence numbers:
- SYN: The client sends a segment with the SYN flag set and its chosen initial sequence number (ISN). This says "I want to start a conversation, and my byte numbering starts here."
- SYN-ACK: The server responds with its own SYN flag, its own ISN, and an acknowledgment of the client's ISN. This says "I agree to talk, my numbering starts here, and I received your starting number."
- ACK: The client acknowledges the server's ISN. Now both sides know each other's starting point and the connection is established.
This costs 1 RTT (round-trip time) before data can flow. On a cross-region path with 50-100 ms RTT, that handshake alone adds 50-100 ms of latency. Add TLS on top and you pay another 1-2 RTTs for encryption setup. For a user in Tokyo connecting to a server in Virginia, that can mean 300-400 ms before the first byte of application data moves.
Why pay this cost? Because the handshake establishes shared state that makes everything afterward reliable. Both sides agree on initial sequence numbers, window sizes, and supported options. Without this agreement, neither side would know how to interpret the byte stream or detect loss. The handshake also prevents stale packets from old connections from being mistakenly accepted as part of a new conversation.
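The setup cost is simple arithmetic, and it is worth being able to do it on a whiteboard. This back-of-envelope sketch (a toy helper, assuming TLS 1.3's 1-RTT handshake layered on TCP and ignoring processing time) shows how RTT multiplies through the handshake:

```python
# Rough rounds-trips before the first application byte can be sent,
# assuming TLS 1.3 (1 RTT) when TLS is layered on TCP.
def setup_ms(rtt_ms, tls=False):
    rtts = 1 + (1 if tls else 0)   # 3-way handshake, plus TLS if used
    return rtts * rtt_ms

for rtt in (10, 50, 100):
    print(f"RTT {rtt} ms: TCP {setup_ms(rtt)} ms, "
          f"TCP+TLS {setup_ms(rtt, tls=True)} ms")
# On a 100 ms path: 100 ms for TCP alone, 200 ms for TCP+TLS,
# all before the request itself even leaves the client.
```

Older TLS versions add another round trip on top, which is where the 3-RTT worst case mentioned later comes from.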

Reliable Delivery
After the handshake, TCP numbers every byte in the stream. The receiver sends cumulative acknowledgments indicating the next byte it expects, which implicitly confirms every byte before it. If the sender does not receive an ACK within a timeout (typically hundreds of milliseconds, calculated from measured round-trip times), it retransmits the missing data.
But waiting for a full timeout is slow. TCP has a faster mechanism: if the sender receives three duplicate ACKs for the same byte offset, it infers that one specific segment was lost while later segments arrived successfully. The receiver is saying "I keep getting new data but I am still missing byte N." The sender retransmits the missing segment immediately without waiting for the timer. This is called fast retransmit and it repairs gaps in tens of milliseconds rather than the hundreds of milliseconds a timeout would require.
Modern TCP implementations also support selective acknowledgments (SACK), which let the receiver report exactly which byte ranges it has received beyond the gap. This helps the sender retransmit only the missing pieces rather than resending everything from the gap onward.
The combination of sequence numbers, acknowledgments, and retransmission is what makes TCP reliable. The application never sees the loss. TCP hides it. But hiding loss takes time, and that hidden recovery latency is the cost of TCP's reliability guarantee.
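The cumulative-ACK and duplicate-ACK mechanics can be modeled in a few lines. This is a toy model, not real TCP (the segment size and the in-process "receiver" are illustrative): the receiver ACKs the next byte it expects, so a lost segment produces a run of duplicate ACKs that the sender can use as a retransmission trigger.

```python
# Toy model of cumulative ACKs and fast retransmit (not real TCP).
MSS = 1000  # hypothetical segment size in bytes

def receiver_ack(buffer, next_expected, seq, data_len):
    """Buffer a segment; return the cumulative ACK (next expected byte)."""
    buffer[seq] = data_len
    while next_expected in buffer:          # advance over contiguous data
        next_expected += buffer.pop(next_expected)
    return next_expected

buffer, next_expected = {}, 0
acks = []
# The segment starting at byte 1000 is lost; later segments still arrive.
for seq in (0, 2000, 3000, 4000):
    next_expected = receiver_ack(buffer, next_expected, seq, MSS)
    acks.append(next_expected)

print(acks)  # [1000, 1000, 1000, 1000]: the first ACK plus three duplicates

dup_acks = acks.count(acks[0]) - 1
if dup_acks >= 3:
    print("fast retransmit: resend segment starting at byte", acks[0])
```

The three repeated ACKs for byte 1000 are the receiver saying "new data keeps arriving, but I am still missing byte 1000," which is exactly the signal fast retransmit acts on.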
Flow Control and Congestion Control
TCP does two different kinds of protection:
- Flow control protects the receiver. The receiver advertises a receive window (rwnd) telling the sender how much unacknowledged data it can buffer. If the window shrinks to zero, the sender pauses until the receiver processes existing data and opens the window again. This prevents a fast server from overwhelming a slow client. Without flow control, a gigabit server could flood a mobile device's tiny buffer in milliseconds.
- Congestion control protects the network. TCP starts with a small congestion window (slow start), typically 10 segments (about 14 KB). It grows the window exponentially (doubling each RTT) until it detects loss or reaches a threshold. After loss, TCP cuts the window (typically by half) and then grows it additively, one segment per RTT. This pattern is called AIMD (additive increase, multiplicative decrease). The result is that TCP connections start cautiously and ramp up over several round trips. A fresh connection on a 100 ms path might take 500 ms or more to fully utilize available bandwidth because slow start must probe the path capacity gradually.
Flow control and congestion control are easy to confuse but they solve different problems. Flow control asks "can the receiver handle more data right now?" Congestion control asks "can the network path handle more data right now?" Both can throttle the sender, but for different reasons. In practice, the sender's effective rate is limited by whichever window is smaller: the receiver's advertised window or the congestion window.
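The AIMD pattern is easier to internalize from a small simulation. The sketch below is a simplified model (window in whole segments, one decision per RTT; real TCP variants differ, e.g. a timeout collapses the window to one segment rather than halving it): exponential growth in slow start, linear growth after the threshold, and a halving on loss.

```python
# Toy AIMD congestion window, in segments (simplified; not a real TCP stack).
def aimd_step(cwnd, ssthresh, loss):
    if loss:
        ssthresh = max(cwnd // 2, 2)        # multiplicative decrease
        cwnd = ssthresh
    elif cwnd < ssthresh:
        cwnd = min(cwnd * 2, ssthresh)      # slow start: double per RTT
    else:
        cwnd += 1                           # congestion avoidance: +1 per RTT
    return cwnd, ssthresh

cwnd, ssthresh = 10, 64                     # IW10 start, illustrative threshold
history = [cwnd]
for rtt in range(8):
    loss = (rtt == 4)                       # simulate one loss event
    cwnd, ssthresh = aimd_step(cwnd, ssthresh, loss)
    history.append(cwnd)

print(history)  # [10, 20, 40, 64, 65, 32, 33, 34, 35]
```

The sawtooth shape in `history` is the signature of AIMD: fast ramp, slow probing, sharp cut on loss, then slow probing again. It is also why a fresh connection takes several RTTs to reach full bandwidth.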
Head-of-Line Blocking
TCP's biggest limitation for modern multiplexed protocols is head-of-line blocking. Because TCP delivers a single ordered byte stream, if byte range 1000-1999 is lost, all later bytes (2000, 3000, 4000 and beyond) must wait until that gap is repaired, even if those later bytes belong to completely unrelated application-level streams.
This is precisely why HTTP/2 over TCP still suffers latency spikes under packet loss. HTTP/2 multiplexes many logical streams over one TCP connection, but TCP knows nothing about those streams. It sees a single sequence of bytes. One lost segment stalls everything behind it, regardless of which HTTP stream the lost data belonged to.
Under normal conditions with low loss rates (well under 1%), this is barely noticeable. But on lossy mobile networks or congested paths where loss rates climb to 1-5%, head-of-line blocking can cause tail latency to spike dramatically. Measurements show that HTTP/2 over TCP can actually perform worse than HTTP/1.1 with multiple connections under moderate loss, because HTTP/1.1's separate connections isolate failures while HTTP/2's single connection amplifies them. This limitation is the single biggest reason QUIC and HTTP/3 exist.
Reliable ordered delivery sounds universally good until you consider what it implies: one missing segment stalls every later byte on that connection. TCP's greatest strength and its biggest latency penalty come from the same design choice.
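Head-of-line blocking falls directly out of the in-order delivery rule, which a toy reassembly function makes concrete (sequence numbers here stand in for byte offsets):

```python
# Toy model of TCP's in-order delivery: the application can only read
# the longest contiguous prefix of what has arrived.
def deliverable(arrived):
    out, next_seq = [], 0
    while next_seq in arrived:
        out.append(next_seq)
        next_seq += 1
    return out

arrived = {0, 1, 3, 4, 5}            # segment 2 was lost in transit
print(deliverable(arrived))          # [0, 1]: 3, 4, 5 arrived but are stuck
arrived.add(2)                       # retransmission finally fills the gap
print(deliverable(arrived))          # [0, 1, 2, 3, 4, 5]: everything unblocks
```

Segments 3 through 5 sit fully received in the kernel buffer the entire time; the ordering guarantee, not the network, is what withholds them from the application.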
UDP goes in the opposite direction from TCP. It keeps the transport layer deliberately thin: no connection handshake, no retransmission, no in-order delivery, and no built-in congestion control. You send datagrams and the network does its best. If a datagram is lost, duplicated, or arrives out of order, UDP does nothing about it. The application either handles it or does not.
That sounds like a deficiency, but it is a deliberate design choice. UDP gives the application full control over what to do when the network misbehaves, rather than imposing TCP's "recover everything in order" policy. This makes UDP the thinnest useful transport: it adds port-based demultiplexing and a checksum on top of IP, and nothing more.

Why Thin Is Sometimes Exactly Right
A voice packet that arrives 500 ms late is worse than useless because the conversation has moved on. A game state update from 200 ms ago can be replaced by a fresher one. A DNS lookup is small enough that retrying at the application layer is simpler than carrying full TCP semantics for every query.
UDP has an 8-byte header compared to TCP's minimum 20 bytes. More importantly, it imposes zero round trips before the first data byte moves. For a DNS query that fits in a single packet, that means the answer can arrive in 1 RTT total rather than the 2+ RTTs TCP would need (1 for handshake, then 1 for the query itself). When a resolver makes thousands of lookups per second, that difference in setup cost matters.
Because UDP is connectionless, the server does not need to maintain per-connection state. A DNS server handling millions of queries does not allocate a socket buffer, track sequence numbers, or maintain a congestion window for each client. That statelessness is a major scalability advantage for high-volume, short-lived exchanges where the cost of maintaining TCP state for each client would dwarf the cost of the actual work.
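The "retry at the application layer" pattern that DNS resolvers use is short enough to sketch end to end. This is a self-contained toy (both endpoints live in one process, and the dropped first reply is simulated), but the client side — set a timeout, resend, give up after a few attempts — is the real shape of the pattern:

```python
import socket

# Minimal app-level retry over UDP: the client owns the loss-recovery policy.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))
addr = server.getsockname()

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.settimeout(0.2)                    # per-attempt deadline

def serve_once(drop):
    """Toy server step: read one query; answer only if not 'dropped'."""
    query, client_addr = server.recvfrom(1024)
    if not drop:
        server.sendto(b"answer:" + query, client_addr)

reply = None
for attempt in range(3):
    client.sendto(b"query", addr)
    serve_once(drop=(attempt == 0))       # simulate loss of the first reply
    try:
        reply, _ = client.recvfrom(1024)
        break
    except socket.timeout:
        continue                          # resend on the next iteration

print(reply)  # b'answer:query'
```

Notice what is absent: no handshake, no connection object, no per-client state on the server. The entire reliability story is those few lines in the client loop.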
Control, Not Just Speed
UDP is about control as much as speed. The application chooses which guarantees to rebuild, if any. Some UDP-based systems add:
- Sequence numbers so receivers can detect gaps and reordering
- Selective acknowledgments for targeted retransmission of only the critical events that must not be lost
- Forward error correction (FEC) so some percentage of loss can be repaired without any retransmission at all
- Application-level congestion behavior so the sender remains network-friendly and does not cause congestion collapse
Others accept occasional loss as normal and move on to fresher data. The important insight is that UDP is not "TCP minus features." It is a different philosophy: give the application raw access to datagrams and let the application decide what reliability means for its workload.
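The first item on that list, application-level sequence numbers, is a one-struct affair. This sketch (a hypothetical 4-byte header format, not any standard protocol) shows how a receiver detects both the gap and the reordering that UDP silently permits:

```python
import struct

# Sketch: app-level sequence numbers on top of raw datagrams, so the
# receiver can detect gaps and reordering that UDP itself ignores.
def pack(seq, payload):
    return struct.pack("!I", seq) + payload      # 4-byte big-endian header

def unpack(datagram):
    (seq,) = struct.unpack("!I", datagram[:4])
    return seq, datagram[4:]

received, highest = {}, -1
# Datagram 2 is lost; datagram 4 arrives before 3 (reordering).
for d in (pack(0, b"a"), pack(1, b"b"), pack(4, b"e"), pack(3, b"d")):
    seq, payload = unpack(d)
    received[seq] = payload
    highest = max(highest, seq)

gaps = {s for s in range(highest + 1) if s not in received}
print(sorted(received), "missing:", sorted(gaps))  # [0, 1, 3, 4] missing: [2]
```

What the application does with `gaps` is the design decision: request a retransmission (reliability), repair via FEC, or simply move on (freshness). UDP takes no position.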
Common UDP Use Cases
- DNS: single request-response exchanges where retrying a lost query at the application layer is cheaper than maintaining a full TCP connection for every lookup.
- Video streaming (RTP/SRTP): stale video frames are worthless to the viewer. The player skips ahead rather than waiting for a retransmission of a frame the user has already passed.
- Online gaming: the latest position update supersedes any lost older update. Retransmitting an outdated position would waste bandwidth and delay the current state.
- VoIP: a late audio sample creates a gap the listener has already passed. Humans tolerate brief silence better than the "underwater" effect that TCP retransmission creates.
The common thread is that all of these workloads can articulate a specific strategy for handling loss. That is the key test for whether UDP is appropriate. If the engineering team cannot explain what happens when a datagram disappears, UDP is the wrong choice.
Use UDP when the application can say something precise about loss: ignore it, replace it, or recover it selectively. If the only plan is to make it reliable somehow later, you probably wanted TCP or QUIC instead.
QUIC is not "UDP with a few extras." It is a full transport protocol implemented over UDP so it can evolve in user space and pass through internet middleboxes that already understand UDP. It combines transport behavior and TLS 1.3 into a single integrated protocol stack. QUIC was originally developed by Google and later standardized as RFC 9000 by the IETF. Today it powers HTTP/3, which major browsers and CDNs already support.
Why Build on UDP?
Deploying a brand-new IP transport would require updates to every firewall, NAT, and load balancer on the internet. That kind of rollout takes decades. TCP itself has barely evolved since the 1980s partly because middleboxes freeze the wire format. By running over UDP, QUIC can be deployed as a library update on endpoints while the network treats it as ordinary UDP traffic. This is the same pragmatism that made HTTPS successful: work within existing infrastructure rather than waiting for the entire network to upgrade.
Running in user space also means QUIC can ship improvements in application releases rather than OS kernel updates. QUIC's congestion control algorithms have evolved faster than TCP's for exactly this reason: a library update takes days, while a kernel rollout takes months or years. For companies that operate at scale and need to iterate on transport behavior, this velocity advantage is significant.
Independent Streams: Solving Head-of-Line Blocking
The most important design improvement over TCP is independent streams. A single QUIC connection can carry many streams, and each stream has its own sequence space and flow control. Loss on stream A does not force streams B and C to wait. QUIC still retransmits missing data for stream A, but it delivers already-received data from other streams immediately.
Why does this matter so much? Because modern web pages load dozens of resources simultaneously: CSS files, JavaScript bundles, images, API responses. With TCP, all of these share a single byte stream. With QUIC, each resource gets its own stream, and one slow resource does not block the others.

Compare this to HTTP/2 over TCP. TCP sees one byte stream. If segment 42 is lost, segments 43 through 100 must wait even though they may belong to completely different HTTP requests. QUIC eliminates this transport-level head-of-line blocking.
The practical impact is most visible on lossy networks. At a 1% packet loss rate, TCP-based HTTP/2 shows significant latency spikes because one lost segment stalls all multiplexed streams. QUIC-based HTTP/3 under the same conditions performs substantially better because only the affected stream stalls while others continue making progress. This is not a theoretical advantage. It is the measured reason Google, Cloudflare, and other large operators invested in QUIC deployment.
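The contrast between one global sequence space and per-stream sequence spaces can be shown with the same toy reassembly logic used earlier. This is a model, not QUIC itself; packet numbering and stream labels are illustrative:

```python
# Toy contrast: one shared in-order sequence (TCP-like) versus independent
# per-stream sequences (QUIC-like). A loss on stream A only stalls stream A.
def in_order_prefix(arrived):
    out, n = [], 0
    while n in arrived:
        out.append(n)
        n += 1
    return out

# TCP-like: everything shares one global sequence. Stream A's lost segment
# occupied global slot 1, so every later slot waits -- stream B included.
global_arrived = {0, 2, 3, 4, 5}
print(in_order_prefix(global_arrived))   # [0]: one gap blocks everything

# QUIC-like: each stream keeps its own sequence space.
# Stream A is missing its packet 1; stream B arrived complete.
per_stream = {"A": {0, 2}, "B": {0, 1, 2}}
for stream in sorted(per_stream):
    print(stream, in_order_prefix(per_stream[stream]))
# A [0]        : stalled only at its own gap
# B [0, 1, 2]  : fully deliverable, unaffected by A's loss
```

Scoping the ordering guarantee to the stream rather than the connection is the whole trick: loss recovery still happens, but its blast radius shrinks to one stream.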
Integrated Encryption and Faster Handshakes
TCP and TLS were designed as separate layers. That layering means a new TCP+TLS connection requires at least 2 RTTs before encrypted application data flows: 1 RTT for the TCP three-way handshake and 1 RTT for the TLS 1.3 handshake. With older TLS versions, this could be 3 RTTs total.
QUIC integrates TLS 1.3 directly into its transport handshake. A new connection completes both transport setup and encryption in a single round trip (1-RTT). On a 100 ms path, that saves 100 ms compared to TCP+TLS. More importantly, QUIC always encrypts. There is no unencrypted mode. Nearly all transport header fields are encrypted or authenticated as well, which prevents middleboxes from inspecting or modifying QUIC traffic and is one reason the protocol can evolve faster than TCP.
On a resumed connection (where the client has previously connected to the same server), QUIC can use 0-RTT: the client sends application data in the very first flight, reducing repeat-connection latency to essentially zero added round trips. For a mobile app that reconnects to the same API server hundreds of times per day, this eliminates the most common source of connection-setup latency.

Connection Migration
TCP binds a connection to a 4-tuple: source IP, source port, destination IP, destination port. If a phone switches from Wi-Fi to cellular, the IP changes and the TCP connection dies.
QUIC uses connection IDs instead. Both endpoints identify the connection by a cryptographic ID rather than by the IP/port tuple. When the client's IP changes, both sides recognize the new packets as belonging to the existing session. No new handshake is needed.
Consider a user streaming a video on a train. With TCP, every cell tower handoff can kill the connection and force a reconnect, requiring a new handshake, a new slow start ramp-up, and a visible stall in the video. With QUIC, the stream continues uninterrupted because the connection ID persists across IP changes. For mobile-heavy products, this single feature can measurably reduce connection failure rates and improve user experience metrics.
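The difference between keying a connection by 4-tuple and keying it by connection ID reduces to a dictionary lookup. This sketch uses made-up example addresses and a randomly generated ID standing in for a QUIC connection ID:

```python
import secrets

# Toy session tables: TCP-style keying by 4-tuple versus QUIC-style
# keying by connection ID. Only the latter survives an address change.
conn_id = secrets.token_hex(8)   # stand-in for a QUIC connection ID

tcp_sessions = {("203.0.113.5", 40000, "198.51.100.9", 443): "session-1"}
quic_sessions = {conn_id: "session-1"}

# Phone moves from Wi-Fi to cellular: new source IP and source port.
new_tuple = ("192.0.2.77", 51000, "198.51.100.9", 443)

print(tcp_sessions.get(new_tuple))   # None: the old 4-tuple no longer matches
print(quic_sessions.get(conn_id))    # session-1: same session continues
```

With TCP, the server has no way to associate packets from the new address with the old connection, so the application must reconnect. With QUIC, the connection ID travels in the packet, and the session simply follows the client to its new address.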
0-RTT lowers latency, but the early data can be replayed by an attacker. Treat 0-RTT as safe only for idempotent operations like GET requests. State-changing requests such as payments or writes should require the full handshake confirmation.
Now that you understand each protocol individually, the fastest way to compare them is to ask what each one optimizes. TCP optimizes reliable ordered delivery. UDP optimizes minimal built-in semantics and application-level control. QUIC optimizes modern reliable transport with faster evolution, better multiplexing, and built-in encryption.

| Dimension | TCP | UDP | QUIC |
| --- | --- | --- | --- |
| Connection setup | 1 RTT handshake, plus 1-2 RTTs for TLS | None | 1-RTT integrated (transport + TLS), 0-RTT on resumption |
| Reliability | Built in (retransmit, ACK) | None (app-defined) | Built in, per-stream |
| Ordering | Global byte-stream order | None | Per-stream order |
| Multiplexing | Single byte stream | N/A (independent datagrams) | Native independent streams |
| Encryption | Optional (TLS layered on top) | Optional (DTLS layered on top) | Always on (TLS 1.3 integrated) |
| Head-of-line blocking | Yes, across all data | No (no ordering) | No, across streams |
| Connection migration | No (bound to 4-tuple) | N/A (connectionless) | Yes (connection IDs) |
| Congestion control | Built in (slow start, AIMD) | None (app-defined) | Built in (pluggable algorithms) |
| Implementation | Kernel space | Kernel space | User space (library) |
| Evolution speed | Slow (kernel/OS updates) | N/A | Fast (app-level library updates) |
When to Use Each
- TCP: when you need a proven reliable ordered byte stream with broad compatibility: REST APIs, database connections, file transfers, email (SMTP/IMAP), SSH.
- UDP: when the application can reason precisely about loss and values low overhead: DNS lookups, real-time voice and video, game state updates, internal telemetry where occasional drops are acceptable.
- QUIC: when you want modern reliable transport with independent streams, built-in encryption, faster repeated connections, and connection migration: HTTP/3 web traffic, mobile applications, any new protocol that benefits from stream multiplexing without transport-level head-of-line blocking.
The important question is not "which is faster?" It is "which failure behavior matches the workload?" A payment request, a DNS lookup, and a live audio packet should not all want the same thing from the network.
One common mistake is assuming QUIC is strictly better than TCP because it is newer. QUIC has higher CPU cost than kernel TCP (because it runs in user space rather than the highly optimized kernel network stack), less mature tooling, and may not be worth the migration effort for workloads that are already well-served by TCP on stable networks.
Another common mistake is assuming UDP is always faster. Raw UDP avoids transport overhead, but once you add the reliability, encryption, and congestion behavior that most real applications need, the effective overhead approaches or exceeds what TCP provides out of the box. The right comparison is always the complete system, not just the transport header.
Application protocols sit on top of transport protocols, and the pairing is not arbitrary. Each application protocol assumes specific transport behavior, and violating that assumption breaks things. Understanding which application protocol rides on which transport helps you reason about performance characteristics, failure modes, and migration paths.
HTTP Family
- HTTP/1.1 runs on TCP. Each connection handles one request-response at a time. Browsers work around this by opening 6-8 parallel TCP connections per domain, but each connection still carries the overhead of a separate handshake and congestion window.
- HTTP/2 also runs on TCP. It adds multiplexed streams at the application layer so many requests can share a single connection. But it inherits TCP's transport-level head-of-line blocking, which partially negates the multiplexing benefit under packet loss. This is the fundamental motivation for HTTP/3.
- HTTP/3 runs on QUIC, not on raw UDP. It relies on QUIC's independent streams to finally eliminate transport-level head-of-line blocking for multiplexed HTTP. Each HTTP/3 stream maps to a QUIC stream, so loss on one request does not stall others.
Why did HTTP/3 choose QUIC over TCP? Because TCP cannot provide per-stream independence. No amount of application-layer framing fixes the fact that TCP delivers a single ordered byte stream underneath. HTTP/3 needed a transport that could scope loss recovery to individual streams.
This was not a theoretical concern. Google measured real-world improvements on mobile and high-latency networks after deploying QUIC at scale. Search latency, video rebuffering, and connection setup times all improved, particularly for users on lossy or high-RTT connections. Those measurements drove the IETF standardization effort that produced HTTP/3.
DNS
Classic DNS uses UDP for most lookups because queries are small and independent. A typical DNS query and response each fit in a single packet, making the overhead of a TCP handshake unnecessary. TCP handles zone transfers (where a secondary server copies the entire zone from a primary) and responses that exceed the UDP message size limit (originally 512 bytes, commonly extended to 4096 bytes with EDNS). Modern encrypted variants include DNS-over-HTTPS (DoH, carried over HTTP/2 on TCP or HTTP/3 on QUIC) and DNS-over-QUIC (DoQ), both designed to add privacy by encrypting DNS traffic.
RPC and Streaming
- gRPC uses HTTP/2 over TCP. That means gRPC inherits TCP's head-of-line blocking, which can matter for services with many concurrent RPCs over a single connection. Some experimental gRPC implementations explore QUIC, but TCP remains the production-standard path.
- WebSocket upgrades from an HTTP/1.1 connection over TCP, then maintains a persistent bidirectional TCP channel. This is common for real-time features like chat, notifications, and live dashboards.
- WebRTC media commonly uses UDP-based transports (SRTP over UDP) because media values timeliness over completeness. The signaling channel typically uses WebSocket or HTTP, but the media itself travels over UDP.
The pattern is consistent: application protocols choose transports whose failure behavior matches their needs. Reliable request-response protocols pair with reliable transports. Real-time media protocols pair with lightweight transports that prioritize freshness. Understanding these pairings helps you reason about system behavior during incidents and makes your architecture diagrams more precise.
In a system design interview, if you mention gRPC, be ready to explain that it runs on HTTP/2 over TCP. If you propose HTTP/3, be ready to explain that it runs on QUIC, not on raw UDP. The transport stack is part of the answer.
This lesson covered the three transport protocols that matter most for system design. Here is a simple framework for choosing between them:
- Choose TCP when correctness, ordered delivery, and mature operational behavior matter most. It is the default for APIs, databases, file transfers, and any workload where missing bytes are unacceptable.
- Choose UDP when the application can reason precisely about loss and wants lightweight datagrams. It is the fit for real-time voice, gaming, DNS lookups, and internal telemetry where occasional drops are acceptable.
- Choose QUIC when you want reliable modern transport with independent stream multiplexing, built-in encryption, faster repeated connections, and connection migration. It is the foundation of HTTP/3 and the direction the web is heading.
None of these transports is "the fastest" in every situation. The right one is the one whose failure behavior matches the job.
The Industry Trend
QUIC adoption is growing because it solves real problems that TCP cannot fix without breaking backward compatibility. Major platforms including Google, Meta, and Cloudflare already serve significant traffic over QUIC and HTTP/3. Browser support is broad. CDN support is broad. The trend line is clear.
But TCP is not going away. It has decades of kernel optimization, universal middlebox support, and battle-tested behavior for workloads that simply need a reliable ordered stream. Database replication, file transfer, internal RPC, and many other workloads will run on TCP for years to come.
UDP is not going away either. Real-time media, gaming, DNS lookups, and high-volume telemetry still benefit from its minimal overhead and application-level control. As long as there are workloads that value freshness over completeness, UDP will have a role.
Each protocol occupies a distinct point in the trade-off space between completeness, timeliness, and evolvability.
Interview Advice
When discussing transport protocols in a system design context, the most impressive thing you can do is not name the "right" protocol but articulate the trade-offs. Explain what failure looks like under each option, what latency costs the handshake adds, and whether the workload can tolerate loss, ordering gaps, or connection restarts. That reasoning is what separates a strong answer from a superficial one.