Requirements
Functional Requirements:
- Create a short URL for a given long URL.
- Return the long URL associated with a given short URL.
Non-Functional Requirements:
- High Availability:99.9% uptime
- Low Latency:Redirect < 10ms
- Scale:每天 1億次寫入,100億次讀取
- Durability:資料保存 5 年,約 91 TB
API Design
Write Path:
POST /api/v1/urls
Request: { long_url, custom_alias?, expires_at? } Response:
201 Created → { short_url, long_url, expires_at }
400 Bad Request → { error: "Invalid URL" }
409 Conflict → { error: "Alias already taken" }
Read Path:
GET /{short_code}
Response:
302 Redirect → Location: long_url
404 Not Found → { error: "URL not found" }
410 Gone → { error: "URL expired" }
High-Level Design
The system consists of the following components:
1. Load Balancer
- Distributes incoming traffic across multiple API servers
- Ensures no single server is overwhelmed
2. API Servers (Stateless)
- Handle write path: receive long URL, generate short code
- Handle read path: receive short code, return redirect
- Stateless design allows horizontal scaling
3. ID Generator
- Generates unique IDs using Auto Increment
- Converts ID to Base62 short code (6 characters)
- Ensures no duplicate short URLs
4. Cache (Redis)
- Stores frequently accessed short_code → long_url mappings
- Cache-Aside pattern: check cache first, fallback to DB
- Reduces database load for read
-heavy traffic (100:1 read/write)
5. Database (Primary + Replicas)
- Primary handles all writes
- Replicas handle reads (read/write separation)
- Sharding by hash(short_code) for horizontal scaling
Write Path: Client → Load Balancer → API Server → ID Generator → Database Primary → return short_url
Read Path: Client → Load Balancer → API Server → Redis Cache (Hit: 302 Redirect) → Cache Miss → Database Replica → write back to Cache → 302 Redirect
Detailed Component Design
Component 1: ID Generator
How it works:
- New request comes in → Database generates Auto Increment ID
- Convert ID to Base62 (characters: a-z, A-Z, 0-9 = 62 chars)
- Example: ID 10001 → Base62 → "2Bi"
- 6-digit Base62 = 62^6 ≈ 56.8 billion unique combinations
How it scales:
- Single ID Generator becomes a bottleneck at high traffic
- Solution: Use Snowflake ID (distributed ID generation)
- Each server generates its own unique ID
- No coordination needed between servers
Tradeoffs:
- Base62 vs Hash(MD5)
- Base62: no collision, but needs centralized ID generator
- MD5: no central dependency, but collision risk exists
- Cost of change: switching hash algorithm after launch
requires DB migration + all existing URLs break
- Conclusion: Base62 preferred for reliability
---
Component 2: Cache (Redis)
How it works:
- Pattern: Cache-Aside (Lazy Loading)
1. API Server checks Redis for short_code
2. Cache Hit → return long_url immediately (< 1ms)
3. Cache Miss → query Database → write result back to Cache
- Key: short_code, Value: long_url
How it scales:
- Redis Cluster for horizontal scaling
- Consistent Hashing to distribute keys across nodes
- Adding new Redis node only remaps ~20% of keys
Capacity:
- Hot URLs follow 80/20 rule
→ 20% of URLs receive 80% of traffic
→ Cache only needs to store top 20% to achieve >90% hit rate
- Estimated cache size: 20% × daily active URLs × 500bytes
Tradeoffs:
- TTL too short → low hit rate, high DB pressure
- TTL too long → stale data, wasted memory
- Solution: TTL 24 hours + LRU eviction policy
---
Component 3: Database
How it works:
- Schema:
TABLE urls
id BIGINT PRIMARY KEY AUTO_INCREMENT
short_code VARCHAR(8) UNIQUE INDEX
long_url VARCHAR(2048)
created_at DATETIME
expires_at DATETIME
How it scales:
- Read scaling → Replication (1 Primary + 2 Replicas)
- Primary: handles all writes
- Replicas: handle all reads
- Write scaling → Sharding by hash(short_code) % N
- Each shard handles a subset of data
- No cross-shard queries needed (always query by short_code)
Capacity:
- 100M writes/day × 365 days × 5 years = 182.5B rows
- 182.5B × 500 bytes ≈ 91 TB total storage
- With 10 shards → ~9 TB per shard (manageable)
Tradeoffs:
- Replication lag: Replica may return slightly stale data
→ Acceptable for this system (eventual consistency is fine)
- Sharding complexity: harder to query across shards
→ Not an issue here since all queries use short_code