One Request, Three Layers: Where Work Actually Belongs

April 11, 2026

A user clicks once. Inside the system, that single action becomes a dependency graph that touches three layers in a specific order. Understanding which layer is supposed to do which kind of work is one of those quiet skills that separate engineers who fix outages from engineers who cause them.

The three layers, from outside in.

The edge layer is the first thing a request meets. CDN, WAF, global load balancer, ingress gateway. It does not know much about your business and that is the point. Its job is to terminate TLS, drop obvious garbage, cache static assets, enforce coarse rate limits, and pick which region or which pod gets the request. The work here is stateless, fast, and parallel.

The service layer is where business logic lives. Auth, profile, feed, search, recommendations, payments. This layer reads inputs, talks to its dependencies, and produces an answer. It is allowed to be slow. It is allowed to be complicated. It is not allowed to keep durable state of its own.

The data layer is where state actually lives. Primary databases, caches, queues, blob storage, search indexes. Everything that has to survive a service restart sits here. The contract this layer offers is durability and consistency, and you pay for it in latency and operational complexity.

Each layer has a job. Pushing work to the wrong one is how teams quietly accumulate problems.

Two examples I keep seeing.

Putting ACID-style multi-row consistency at the service tier. A team writes a "transaction coordinator" service that reads from two databases, decides, and writes back. It works in staging. In production, network blips cause partial writes. Reconciliation jobs proliferate. Half of the data team's roadmap becomes cleaning up rows that should never have existed. The fix is not better retry logic. Either keep the transaction inside one database that already does this correctly, or accept eventual consistency with an outbox pattern. The service tier is the wrong place to invent ACID.
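To make the outbox option concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (orders, outbox), the order_placed topic, and the publish callback are all hypothetical, chosen only for illustration. The idea is that the business row and the event describing it are committed in one local database transaction, and a separate relay delivers the event afterwards.

```python
import json
import sqlite3


def make_db():
    """Create an in-memory database with the two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id TEXT PRIMARY KEY, amount_cents INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE outbox ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " topic TEXT NOT NULL,"
        " payload TEXT NOT NULL,"
        " published INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def place_order(conn, order_id, amount_cents):
    """Write the order and its outbox event in a single local transaction."""
    payload = json.dumps({"order_id": order_id, "amount_cents": amount_cents})
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute(
            "INSERT INTO orders (id, amount_cents) VALUES (?, ?)",
            (order_id, amount_cents),
        )
        conn.execute(
            "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
            ("order_placed", payload),
        )


def relay_outbox(conn, publish):
    """Deliver unpublished events; mark each one only after publish returns."""
    rows = conn.execute(
        "SELECT id, topic, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for event_id, topic, payload in rows:
        # If publish raises, the row stays unpublished and is retried next run.
        publish(topic, json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (event_id,))
    return len(rows)
```

The database does the atomic part; the service only calls it. The relay gives at-least-once delivery, because a crash between publish and the UPDATE causes a resend, so downstream consumers are assumed to be idempotent.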

Putting rate limiting at the service tier instead of the edge. This is the failure mode worth burning into memory. A team adds per-user rate limits inside the API service. Fine, until someone points a botnet at a public endpoint. The traffic still has to be terminated, parsed, routed, and authenticated before the rate limiter can reject it. The bots eat your TLS handshakes, your load balancer connection budget, your service thread pools. The rate limiter returns clean 429s and the site is still down because the work of saying no happens too deep in the stack. Rate limits belong at the edge, where a request can be dropped before it consumes any service capacity.
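As a rough illustration of saying no early, the sketch below applies a per-address token bucket before the request is parsed, authenticated, or handed to a service. In practice this check usually lives in CDN, WAF, or gateway configuration rather than application code. The names here (TokenBucket, EdgeLimiter, handle) are hypothetical, and the bucket table grows without bound, which a real edge would cap or expire.

```python
import time


class TokenBucket:
    """Refills at `rate` tokens per second, up to `capacity`."""

    def __init__(self, rate, capacity, now):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = now

    def allow(self, now):
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class EdgeLimiter:
    """Coarse limiter keyed by client address, one bucket per address."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.buckets = {}  # unbounded in this sketch

    def admit(self, client_ip):
        now = self.clock()
        bucket = self.buckets.get(client_ip)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, now)
            self.buckets[client_ip] = bucket
        return bucket.allow(now)


def handle(limiter, client_ip, raw_request, service):
    """Decide from the source address alone, before any parsing or auth."""
    if not limiter.admit(client_ip):
        return 429  # rejected without touching the service tier
    return service(raw_request)
```

The decision needs only the source address, so a rejected request never reaches the service. Finer per-user limits can still sit behind this as a second line, but they are not what keeps the site up under a flood.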

Edge routes. Services compute. Data stores. Put each kind of work where it belongs and the system mostly stays out of your way.

Key takeaway

Edge routes traffic. Services compute. Data stores and moves state. Put work at the wrong layer and you will rediscover why the layers exist, usually during an outage.

Originally posted on LinkedIn.