Designing A Simple Url Shortening Service A TinyURL Approach - System Design

Requirements

Functional Requirements:

Create a short URL for a given long URL.
Return the long URL associated with a given short URL.

Non-Functional Requirements:

High Availability：99.9% uptime
Low Latency：Redirect < 10ms
Scale：每天 1億次寫入，100億次讀取
Durability：資料保存 5 年，約 91 TB

API Design

Write Path：

POST /api/v1/urls

Request: { long_url, custom_alias?, expires_at? } Response:

201 Created → { short_url, long_url, expires_at }

400 Bad Request → { error: "Invalid URL" }

409 Conflict → { error: "Alias already taken" }

Read Path：

GET /{short_code}

Response:

302 Redirect → Location: long_url

404 Not Found → { error: "URL not found" }

410 Gone → { error: "URL expired" }

High-Level Design

The system consists of the following components:

1. Load Balancer

- Distributes incoming traffic across multiple API servers

- Ensures no single server is overwhelmed

2. API Servers (Stateless)

- Handle write path: receive long URL, generate short code

- Handle read path: receive short code, return redirect

- Stateless design allows horizontal scaling

3. ID Generator

- Generates unique IDs using Auto Increment

- Converts ID to Base62 short code (6 characters)

- Ensures no duplicate short URLs

4. Cache (Redis)

- Stores frequently accessed short_code → long_url mappings

- Cache-Aside pattern: check cache first, fallback to DB

- Reduces database load for read

-heavy traffic (100:1 read/write)

5. Database (Primary + Replicas)

- Primary handles all writes

- Replicas handle reads (read/write separation)

- Sharding by hash(short_code) for horizontal scaling

Write Path: Client → Load Balancer → API Server → ID Generator → Database Primary → return short_url

Read Path: Client → Load Balancer → API Server → Redis Cache (Hit: 302 Redirect) → Cache Miss → Database Replica → write back to Cache → 302 Redirect

Detailed Component Design

Component 1: ID Generator

How it works:

- New request comes in → Database generates Auto Increment ID

- Convert ID to Base62 (characters: a-z, A-Z, 0-9 = 62 chars)

- Example: ID 10001 → Base62 → "2Bi"

- 6-digit Base62 = 62^6 ≈ 56.8 billion unique combinations

How it scales:

- Single ID Generator becomes a bottleneck at high traffic

- Solution: Use Snowflake ID (distributed ID generation)

- Each server generates its own unique ID

- No coordination needed between servers

Tradeoffs:

- Base62 vs Hash(MD5)

- Base62: no collision, but needs centralized ID generator

- MD5: no central dependency, but collision risk exists

- Cost of change: switching hash algorithm after launch

requires DB migration + all existing URLs break

- Conclusion: Base62 preferred for reliability

---

Component 2: Cache (Redis)

How it works:

- Pattern: Cache-Aside (Lazy Loading)

1. API Server checks Redis for short_code

2. Cache Hit → return long_url immediately (< 1ms)

3. Cache Miss → query Database → write result back to Cache

- Key: short_code, Value: long_url

How it scales:

- Redis Cluster for horizontal scaling

- Consistent Hashing to distribute keys across nodes

- Adding new Redis node only remaps ~20% of keys

Capacity:

- Hot URLs follow 80/20 rule

→ 20% of URLs receive 80% of traffic

→ Cache only needs to store top 20% to achieve >90% hit rate

- Estimated cache size: 20% × daily active URLs × 500bytes

Tradeoffs:

- TTL too short → low hit rate, high DB pressure

- TTL too long → stale data, wasted memory

- Solution: TTL 24 hours + LRU eviction policy

---

Component 3: Database

How it works:

- Schema:

TABLE urls

id BIGINT PRIMARY KEY AUTO_INCREMENT

short_code VARCHAR(8) UNIQUE INDEX

long_url VARCHAR(2048)

created_at DATETIME

expires_at DATETIME

How it scales:

- Read scaling → Replication (1 Primary + 2 Replicas)

- Primary: handles all writes

- Replicas: handle all reads

- Write scaling → Sharding by hash(short_code) % N

- Each shard handles a subset of data

- No cross-shard queries needed (always query by short_code)

Capacity:

- 100M writes/day × 365 days × 5 years = 182.5B rows

- 182.5B × 500 bytes ≈ 91 TB total storage

- With 10 shards → ~9 TB per shard (manageable)

Tradeoffs:

- Replication lag: Replica may return slightly stale data

→ Acceptable for this system (eventual consistency is fine)

- Sharding complexity: harder to query across shards

→ Not an issue here since all queries use short_code