sho.rt/aB3kZ9) that redirects visitors back to the original.
POST /api/v1/urls Create short URL (body: long_url, alias?, ttl?)
GET /{code} Redirect to original URL (301/302)
GET /api/v1/urls/{code} Get URL metadata + analytics
PATCH /api/v1/urls/{code} Update alias, TTL, or active status
DELETE /api/v1/urls/{code} Deactivate / delete a short URL
GET /api/v1/urls/{code}/stats Click analytics (time-series, referrers, geo)
GET /api/v1/users/{uid}/urls List all URLs for a user (paginated)
The URL shortening service sits behind a global CDN layer that caches the most popular redirects at edge nodes close to users, so the majority of traffic never reaches the origin. Requests that miss the edge flow through an API gateway — which handles rate limiting and authentication — and are routed to one of two stateless service pools: a read-optimised redirect service for resolution lookups, and a write service for creating and managing URLs.
The redirect service checks a Redis cluster first; on a cache hit it returns the destination URL in under 5ms. On a miss it falls back to the primary database, backfills Redis, and warms the CDN for subsequent requests. Every resolved click asynchronously emits an event to a message queue, which feeds a stream processor that aggregates click analytics into a time-series store — completely off the critical path so redirects are never slowed by analytics writes.
The write service draws short codes from a pre-generated token pool rather than a live counter, avoiding hot-spot contention at scale. Persistent state lives in a primary relational or key-value database with read replicas serving analytics queries. A background worker continuously replenishes the token pool and a scheduled job handles TTL expiry, keeping the hot path free of any housekeeping work.
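The pool-and-replenisher split described above can be sketched in a few lines. This is a minimal in-memory illustration, not the service's actual implementation: the `TokenPool` class, its `low_watermark` and `batch_size` parameters, and the 7-character length are assumptions made for the example, and a real deployment would back the pool with the database rather than a Python deque.

```python
import secrets
import string
from collections import deque

# 0-9, A-Z, a-z: 62 symbols (ordering is an arbitrary choice for this sketch)
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
CODE_LENGTH = 7  # assumed code length


class TokenPool:
    """Hypothetical in-memory stand-in for the pre-generated token pool."""

    def __init__(self, low_watermark=100, batch_size=500):
        self.low_watermark = low_watermark
        self.batch_size = batch_size
        self._free = deque()   # codes ready to hand out
        self._issued = set()   # every code ever generated, to avoid duplicates

    def replenish(self):
        """Background-worker step: top up the pool only when it runs low."""
        if len(self._free) >= self.low_watermark:
            return 0
        added = 0
        while added < self.batch_size:
            code = "".join(secrets.choice(ALPHABET) for _ in range(CODE_LENGTH))
            if code in self._issued:
                continue  # rare collision: draw again
            self._issued.add(code)
            self._free.append(code)
            added += 1
        return added

    def take(self):
        """Write-path step: pop a ready-made code; no generation happens here."""
        if not self._free:
            raise RuntimeError("token pool exhausted; the replenisher is behind")
        return self._free.popleft()
```

Because `take` only pops from a queue, the write path does no random generation or uniqueness checking of its own; that cost is paid by `replenish`, which would run on a timer or in a separate worker.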
Short-code generation is the most critical design decision. We need ~7-character codes that are globally unique and hard to guess. Two strategies are viable:
Base62 encoding of an ID: a central counter (or a distributed generator such as Twitter Snowflake) produces an increasing integer, which we Base62-encode. A 7-character Base62 string covers 62⁷ ≈ 3.5 trillion URLs; note that a full 64-bit Snowflake ID would need up to 11 characters, so a 7-character budget implies a smaller ID space. The downside is that sequential IDs produce predictable codes, making enumeration easy.
Pre-generated token pool: a background job pre-generates random Base62 codes and stores them in a "tokens" table, and the write service atomically marks one as used on demand. This avoids the hot-counter bottleneck and produces unpredictable codes, which makes it the preferred approach at scale.
The redirect path never writes to the primary database and has three tiers: CDN edge (fastest), Redis in-memory cache (fast), and DB (slowest, but the source of truth). On every cache miss, the tier below is queried and the result is backfilled upward. Click events are emitted asynchronously, so the redirect response is never blocked on analytics writes.
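The tiered lookup with upward backfill can be illustrated as follows. This is a simplified sketch under stated assumptions: each tier is modelled as a plain dict, `resolve` and `emit_click` are hypothetical names, and the click "queue" is just a callback. In a real deployment the CDN tier is typically populated by caching the origin's response rather than by an explicit write.

```python
def resolve(code, tiers, emit_click):
    """Look up `code` in tiers ordered fastest-first; the last tier is the DB.

    On a hit in a lower tier, copy the result into every faster tier above it.
    Returns the long URL, or None if the code is unknown.
    """
    for depth, tier in enumerate(tiers):
        long_url = tier.get(code)
        if long_url is not None:
            for upper in tiers[:depth]:
                upper[code] = long_url  # backfill upward
            emit_click(code)  # stand-in for a non-blocking queue publish
            return long_url
    return None
```

A usage example with three dict-backed tiers:

```python
cdn, redis_cache = {}, {}
db = {"aB3kZ9": "https://example.com/some/long/path"}
events = []

url = resolve("aB3kZ9", [cdn, redis_cache, db], events.append)
# After this call, both cdn and redis_cache hold the entry,
# so the next lookup is served from the first tier.
```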
The core urls table holds code (PK, indexed), long_url, user_id, created_at, expires_at, and is_active. A separate clicks table (or a time-series store like ClickHouse/TimescaleDB) stores code, timestamp, referrer, user_agent, country, and ip_hash. The tokens table for pre-generated codes holds token and used_at.
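As a rough illustration of the three tables and of what "atomically marks them used" could look like, the sketch below uses SQLite through Python's standard `sqlite3` module. The column types, the `open_db` and `claim_token` helpers, and the choice of SQLite are assumptions for the example; the column names come from the description above.

```python
import sqlite3
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE urls (
    code       TEXT PRIMARY KEY,
    long_url   TEXT NOT NULL,
    user_id    TEXT,
    created_at TEXT NOT NULL,
    expires_at TEXT,
    is_active  INTEGER NOT NULL DEFAULT 1
);
CREATE TABLE clicks (
    code       TEXT NOT NULL,
    timestamp  TEXT NOT NULL,
    referrer   TEXT,
    user_agent TEXT,
    country    TEXT,
    ip_hash    TEXT
);
CREATE TABLE tokens (
    token   TEXT PRIMARY KEY,
    used_at TEXT            -- NULL means the token is still available
);
"""


def open_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def claim_token(conn):
    """Claim one unused token, or return None if the pool is empty.

    The UPDATE re-checks `used_at IS NULL`, so if another writer claimed the
    same row first, rowcount is 0 and we retry with a different candidate.
    """
    now = datetime.now(timezone.utc).isoformat()
    while True:
        row = conn.execute(
            "SELECT token FROM tokens WHERE used_at IS NULL LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        cur = conn.execute(
            "UPDATE tokens SET used_at = ? WHERE token = ? AND used_at IS NULL",
            (now, row[0]),
        )
        conn.commit()
        if cur.rowcount == 1:
            return row[0]
```

The conditional `UPDATE` is what makes the claim safe under concurrency: only one writer can flip a given row from unused to used.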
Click events flow into Kafka. A stream processor (Flink or Lambda) aggregates them — total clicks, unique clicks by day, top referrers — and writes results to a read-optimised analytics store. Raw events are never queried directly.
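The kind of aggregation the stream processor performs can be shown on a small batch of events. This is a batch-style sketch in plain Python rather than Flink or Kafka code; the `aggregate` function, the event dict shape, and the ISO 8601 timestamp strings are assumptions, and "unique" is approximated here as distinct `ip_hash` values per code per day.

```python
from collections import Counter, defaultdict


def aggregate(events):
    """Roll up raw click events into the three summaries named above."""
    total = Counter()                 # code -> click count
    uniques = defaultdict(set)        # (code, day) -> distinct ip_hash values
    referrers = defaultdict(Counter)  # code -> referrer -> count
    for event in events:
        code = event["code"]
        day = event["timestamp"][:10]  # "YYYY-MM-DD" prefix of an ISO timestamp
        total[code] += 1
        uniques[(code, day)].add(event["ip_hash"])
        referrers[code][event["referrer"]] += 1
    return {
        "total_clicks": dict(total),
        "unique_clicks_by_day": {key: len(ips) for key, ips in uniques.items()},
        "top_referrers": {c: r.most_common(3) for c, r in referrers.items()},
    }
```

A streaming implementation would keep the same counters as windowed state and flush them to the analytics store periodically instead of returning them at the end.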
A scheduled worker scans the urls table for rows where expires_at < now() and marks them inactive. Redis TTLs are set to match the URL's expiry. This prevents serving stale redirects after expiry without scanning the DB on every request.
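Both halves of the expiry scheme, the cache TTL and the periodic sweep, can be sketched as follows. The function names and the in-memory row dicts are hypothetical stand-ins; a real worker would run an `UPDATE` against the urls table and pass the computed TTL to Redis when writing the cache entry.

```python
from datetime import datetime, timedelta, timezone


def cache_ttl_seconds(expires_at, now):
    """Seconds a cache entry may live so that it never outlasts the URL.

    Returns None for URLs without an expiry (cache without a TTL) and 0 for
    URLs that have already expired (do not cache).
    """
    if expires_at is None:
        return None
    remaining = int((expires_at - now).total_seconds())
    return max(0, remaining)


def sweep_expired(rows, now):
    """Scheduled-job step: deactivate rows whose expires_at is in the past.

    Mutates the rows in place and returns the codes that were deactivated.
    """
    deactivated = []
    for row in rows:
        expires_at = row["expires_at"]
        if row["is_active"] and expires_at is not None and expires_at < now:
            row["is_active"] = False
            deactivated.append(row["code"])
    return deactivated
```

Because the cache entry disappears on its own at expiry, the redirect path only needs to check `is_active` and `expires_at` when it falls through to the database.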