Designing a Simple URL Shortening Service: A TinyURL Approach - System Design

Requirements

Functional Requirements:

The core of the service is simple: given a long URL, generate a short alias (e.g. sho.rt/aB3kZ9) that redirects visitors back to the original.
Beyond that, users should be able to create custom aliases, set optional expiry dates, and view basic analytics (click counts, referrers, geolocation). Authenticated users get a dashboard to manage their links.

Non-Functional Requirements:

The system needs to handle ~100M redirects/day (~1,200 req/sec average, ~10K peak), with redirection latency under 10ms at P99. URLs and their mappings must be durable — no silent data loss.
The short code namespace must be globally unique, and we should aim for ~99.99% uptime. The read-to-write ratio is heavily skewed (~100:1), so the design optimises for reads.
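The averages quoted above follow from simple arithmetic. A minimal sketch, assuming the ~100:1 read-to-write ratio applies to daily volume (which implies roughly 1M new URLs per day, a figure the requirements do not state directly); all names are illustrative:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(events_per_day: float) -> float:
    """Average events per second for a given daily volume."""
    return events_per_day / SECONDS_PER_DAY


redirects_per_day = 100_000_000   # from the stated requirement
read_write_ratio = 100            # ~100:1 reads to writes
stated_peak_rps = 10_000          # ~10K peak, from the stated requirement

# Assumption: writes are derived from the ratio, not stated in the requirements.
writes_per_day = redirects_per_day / read_write_ratio

avg_read_rps = per_second(redirects_per_day)
avg_write_rps = per_second(writes_per_day)
peak_to_average = stated_peak_rps / avg_read_rps

print(f"reads/sec  ~ {avg_read_rps:,.0f}")
print(f"writes/sec ~ {avg_write_rps:,.1f}")
print(f"peak is ~{peak_to_average:.1f}x the average")
```

Rounded, that is about 1,157 redirects and 12 writes per second on average, with the stated 10K peak sitting at roughly 8.6 times the mean.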

API Design

POST    /api/v1/urls                 Create short URL (body: long_url, alias?, ttl?)
GET     /{code}                      Redirect to original URL (301/302)
GET     /api/v1/urls/{code}          Get URL metadata + analytics
PATCH   /api/v1/urls/{code}          Update alias, TTL, or active status
DELETE  /api/v1/urls/{code}          Deactivate / delete a short URL
GET     /api/v1/urls/{code}/stats    Click analytics (time-series, referrers, geo)
GET     /api/v1/users/{uid}/urls     List all URLs for a user (paginated)
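To make the create endpoint's body concrete, here is a hedged sketch of how the write service might validate it. The field names come from the endpoint list; the alias pattern, the http(s)-only rule, and treating ttl as a number of seconds are assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlsplit

# Assumed alias rule: 3 to 32 characters from a URL-safe set.
ALIAS_PATTERN = re.compile(r"[0-9A-Za-z_-]{3,32}")


@dataclass(frozen=True)
class CreateUrlRequest:
    long_url: str
    alias: Optional[str] = None
    ttl: Optional[int] = None  # assumed to be seconds until expiry


def validate_create_request(body: dict) -> CreateUrlRequest:
    """Check a decoded JSON body and return a typed request, or raise ValueError."""
    long_url = body.get("long_url")
    if not isinstance(long_url, str):
        raise ValueError("long_url is required")
    parts = urlsplit(long_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError("long_url must be an absolute http(s) URL")

    alias = body.get("alias")
    if alias is not None:
        if not isinstance(alias, str) or not ALIAS_PATTERN.fullmatch(alias):
            raise ValueError("alias must be 3-32 characters of [0-9A-Za-z_-]")

    ttl = body.get("ttl")
    if ttl is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(ttl, bool) or not isinstance(ttl, int) or ttl <= 0:
            raise ValueError("ttl must be a positive integer")

    return CreateUrlRequest(long_url=long_url, alias=alias, ttl=ttl)
```

A handler would call validate_create_request on the parsed body and map ValueError to a 400 response before any short code is allocated.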

High-Level Design

The URL shortening service sits behind a global CDN layer that caches the most popular redirects at edge nodes close to users, so the majority of traffic never reaches the origin. Requests that miss the edge flow through an API gateway — which handles rate limiting and authentication — and are routed to one of two stateless service pools: a read-optimised redirect service for resolution lookups, and a write service for creating and managing URLs.

The redirect service checks a Redis cluster first; on a cache hit it returns the destination URL in under 5ms. On a miss it falls back to the primary database, backfills Redis, and warms the CDN for subsequent requests. Every resolved click asynchronously emits an event to a message queue, which feeds a stream processor that aggregates click analytics into a time-series store — completely off the critical path so redirects are never slowed by analytics writes.

The write service draws short codes from a pre-generated token pool rather than a live counter, avoiding hot-spot contention at scale. Persistent state lives in a primary relational or key-value database, with read replicas serving dashboard and metadata queries; click aggregates are read from the time-series store instead. A background worker continuously replenishes the token pool and a scheduled job handles TTL expiry, keeping the hot path free of any housekeeping work.

Detailed Component Design

1. Short code generation

This is the most critical design decision. We need ~7-character codes that are globally unique and not guessable. Two viable strategies:

Base62 encoding of an ID: a central counter (or a distributed generator such as Twitter Snowflake) produces an increasing integer, which we Base62-encode. A 7-character Base62 string covers 62⁷ ≈ 3.5 trillion URLs. The downside is that sequential IDs produce predictable codes, making enumeration easy. Note also that Snowflake IDs are 64-bit values and can need up to 11 Base62 characters, so staying at 7 characters requires a compact counter.
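A minimal sketch of the encoding step. The alphabet order (digits, then uppercase, then lowercase) is an assumption; any fixed ordering of the 62 symbols works as long as encoder and decoder agree.

```python
import string

# Assumed symbol order: 0-9, A-Z, a-z.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
BASE = len(ALPHABET)  # 62
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def base62_encode(n: int) -> str:
    """Encode a non-negative integer as a Base62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, BASE)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))  # most significant digit first


def base62_decode(code: str) -> int:
    """Inverse of base62_encode; raises KeyError on characters outside the alphabet."""
    n = 0
    for ch in code:
        n = n * BASE + INDEX[ch]
    return n
```

With this alphabet, the largest ID that still fits in 7 characters is 62**7 - 1, which encodes to "zzzzzzz"; the next ID needs an eighth character.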

Pre-generated token pool — a background job pre-generates random Base62 codes, stores them in a "tokens" table, and atomically marks them used on demand. This avoids the hot counter bottleneck and produces unpredictable codes. It's the preferred approach at scale.
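The token pool can be sketched as follows, using an in-memory SQLite database as a stand-in for the real store. The table layout follows the tokens table described in this article; the function names and the select-then-conditionally-update claim step are assumptions, and a production database would more likely use its own row-locking or UPDATE ... RETURNING facilities.

```python
import secrets
import sqlite3
import string
from datetime import datetime, timezone

ALPHABET = string.digits + string.ascii_letters  # 62 symbols
CODE_LENGTH = 7


def open_pool() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tokens (token TEXT PRIMARY KEY, used_at TEXT)")
    return conn


def replenish(conn: sqlite3.Connection, count: int) -> int:
    """Add `count` fresh random codes; collisions are skipped by the primary key."""
    added = 0
    while added < count:
        code = "".join(secrets.choice(ALPHABET) for _ in range(CODE_LENGTH))
        with conn:  # commits on success
            cur = conn.execute(
                "INSERT OR IGNORE INTO tokens (token, used_at) VALUES (?, NULL)",
                (code,),
            )
        added += cur.rowcount  # 0 when the code already existed
    return added


def claim(conn: sqlite3.Connection) -> str:
    """Mark one unused token as used and return it."""
    while True:
        row = conn.execute(
            "SELECT token FROM tokens WHERE used_at IS NULL LIMIT 1"
        ).fetchone()
        if row is None:
            raise LookupError("token pool is empty")
        now = datetime.now(timezone.utc).isoformat()
        with conn:
            cur = conn.execute(
                # The used_at IS NULL guard makes the claim safe against a racing writer.
                "UPDATE tokens SET used_at = ? WHERE token = ? AND used_at IS NULL",
                (now, row[0]),
            )
        if cur.rowcount == 1:
            return row[0]
        # Someone else took this token first; try another.
```

The conditional UPDATE is what keeps two writers from receiving the same code: only one of them sees a row count of 1, and the other loops and picks a different token.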

2. Redirect service (the hot path)

The redirect path never writes to the primary database and has three tiers: CDN edge (fastest), Redis in-memory cache (fast), and DB (slowest, but the authoritative copy). On every cache miss, the layer below is queried and the result is backfilled upward. Click events are emitted asynchronously, so the redirect response is never blocked on analytics writes.
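The lookup order on the origin side can be sketched as a cache-aside read. Plain dictionaries stand in for Redis and the database, and a standard-library queue stands in for the message queue; the class and method names are hypothetical, and the CDN tier is omitted because it sits in front of this service.

```python
import queue
from typing import Optional


class RedirectService:
    """Cache-aside lookup with a fire-and-forget click event."""

    def __init__(self, cache: dict, db: dict, events: queue.Queue):
        self.cache = cache    # stand-in for Redis
        self.db = db          # stand-in for the primary database
        self.events = events  # stand-in for the message queue

    def resolve(self, code: str) -> Optional[str]:
        url = self.cache.get(code)
        if url is None:
            url = self.db.get(code)
            if url is None:
                return None          # caller would answer 404
            self.cache[code] = url   # backfill the cache for later requests
        self._emit_click(code)
        return url

    def _emit_click(self, code: str) -> None:
        try:
            self.events.put_nowait({"code": code})
        except queue.Full:
            # Assumed policy: drop the event rather than delay the redirect.
            pass
```

Dropping events when the queue is full is one possible policy, chosen here only to keep the redirect non-blocking; a real deployment might buffer locally or count the drops instead.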

3. Database schema

The core urls table holds code (PK, indexed), long_url, user_id, created_at, expires_at, and is_active. A separate clicks table (or a time-series store like ClickHouse/TimescaleDB) stores code, timestamp, referrer, user_agent, country, and ip_hash. The tokens table for pre-generated codes holds token and used_at.
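The three tables can be written out as DDL. This sketch uses SQLite through Python so that it runs as-is; the column names come from the description above, while the column types, the NOT NULL constraints, and the index on clicks are assumptions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE urls (
    code       TEXT PRIMARY KEY,
    long_url   TEXT NOT NULL,
    user_id    TEXT,                       -- NULL for anonymous links (assumed)
    created_at TEXT NOT NULL,              -- ISO-8601 timestamp (assumed)
    expires_at TEXT,                       -- NULL means no expiry
    is_active  INTEGER NOT NULL DEFAULT 1
);

CREATE TABLE clicks (
    code       TEXT NOT NULL,
    timestamp  TEXT NOT NULL,
    referrer   TEXT,
    user_agent TEXT,
    country    TEXT,
    ip_hash    TEXT
);
-- Assumed index for per-link, time-ordered queries.
CREATE INDEX clicks_by_code_time ON clicks (code, timestamp);

CREATE TABLE tokens (
    token   TEXT PRIMARY KEY,
    used_at TEXT                           -- NULL until the token is claimed
);
"""


def create_schema() -> sqlite3.Connection:
    """Create the three tables in a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn
```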

4. Analytics pipeline

Click events flow into Kafka. A stream processor (Flink or Lambda) aggregates them — total clicks, unique clicks by day, top referrers — and writes results to a read-optimised analytics store. Raw events are never queried directly.
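The aggregation itself is simple to illustrate in isolation. The sketch below processes a batch of events in plain Python rather than in Flink or Kafka; the event fields mirror the clicks columns, the timestamp is assumed to be epoch seconds, and "unique" is assumed to mean distinct ip_hash values per link per UTC day.

```python
from collections import Counter, defaultdict
from datetime import datetime, timezone
from typing import Iterable


def aggregate(events: Iterable[dict]) -> dict:
    """Roll click events up into totals, daily uniques, and top referrers."""
    total = Counter()
    uniques = defaultdict(set)        # (code, day) -> set of ip_hash values
    referrers = defaultdict(Counter)  # code -> referrer counts

    for event in events:
        code = event["code"]
        day = (
            datetime.fromtimestamp(event["timestamp"], tz=timezone.utc)
            .date()
            .isoformat()
        )
        total[code] += 1
        uniques[(code, day)].add(event["ip_hash"])
        referrers[code][event.get("referrer") or "(direct)"] += 1

    return {
        "total_clicks": dict(total),
        "unique_clicks_by_day": {key: len(ips) for key, ips in uniques.items()},
        "top_referrers": {c: r.most_common(3) for c, r in referrers.items()},
    }
```

A stream processor would maintain the same three structures incrementally per window instead of over a finished batch, and would flush them to the analytics store on a schedule.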

5. Expiry & cleanup

A scheduled worker scans the urls table for rows where expires_at < now() and marks them inactive. Redis TTLs are set to match each URL's expiry, so cached entries drop out on their own at the right moment. Together these keep expired links from being served without scanning the DB on every request.
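Both halves of the expiry logic can be sketched briefly. Timestamps are assumed to be epoch seconds, SQLite stands in for the real database, and the one-hour default cache lifetime (used as an upper bound on the TTL) is an assumption that goes slightly beyond simply matching the URL's expiry.

```python
import sqlite3
from typing import Optional

DEFAULT_CACHE_TTL = 3600  # assumed upper bound on cache lifetime, in seconds


def cache_ttl_seconds(expires_at: Optional[int], now: int) -> int:
    """Seconds a cache entry may live; 0 means the link has already expired."""
    if expires_at is None:
        return DEFAULT_CACHE_TTL
    return max(0, min(DEFAULT_CACHE_TTL, expires_at - now))


def deactivate_expired(conn: sqlite3.Connection, now: int) -> int:
    """Mark expired, still-active rows inactive and return how many changed."""
    with conn:  # commits on success
        cur = conn.execute(
            "UPDATE urls SET is_active = 0 "
            "WHERE is_active = 1 AND expires_at IS NOT NULL AND expires_at < ?",
            (now,),
        )
    return cur.rowcount
```

Capping the TTL at the remaining lifetime means a cached entry can never outlive its link, whichever of the two limits is reached first.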