1. Requirements
Functional Requirements
- Generate short URL for a given long URL
- Redirect short URL β original URL
- Optional:
- Custom aliases
- Expiration time
- Analytics (click count)
Non-Functional Requirements
- Latency: < 50ms redirect
- Throughput: High read-heavy system (~100:1 read/write)
- Scalability: Billions of URLs
- Availability: Very high (redirect must not fail)
- Consistency: Eventual consistency acceptable for analytics
2. Estimations
Traffic
- Writes: ~10K/sec
- Reads (redirects): ~1M/sec
Storage
Assume:
- 100B URLs over time
- Each record ~200 bytes
π Total β 20 TB β requires distributed storage
Read/Write Pattern
- Highly read-heavy
- Requires aggressive caching
3. API Design
3.1 Create Short URL
POST /shorten
Request:
{
"long_url": "https://example.com/abc",
"custom_alias": "optional",
"expiry": "optional"
}
Response:
{
"short_url": "https://sho.rt/xyz123"
}
3.2 Redirect
GET /{short_code}
Response:
- HTTP 302 β redirect to long URL
3.3 Analytics (Optional)
GET /analytics/{short_code}
4. Data Storage & Design
4.1 Database (Primary Store)
Use:
- Distributed KV store (e.g. Amazon DynamoDB / Cassandra)
Schema:
short_code (PK)
long_url
created_at
expiry
user_id
4.2 Cache (Critical)
Use:
- Redis
Key:
short_code β long_url
5. High-Level Architecture (HLD)
System divided into:
5.1 Data Plane (Redirect Flow)
- Client hits short URL
- Request goes to CDN
- CDN checks cache:
- If hit β redirect
- If miss β go to backend
- Backend:
- Check Redis cache
- If miss β fetch from DB
- Return 302 redirect
5.2 Control Plane (Short URL Creation)
- Client sends long URL
- Service generates short code
- Store mapping in DB
- Cache in Redis
6. Detailed Breakdown
6.1 ID Generation (Make This Precise)
6.1 Short Code Generation (Robust Strategy)
Approach: Distributed Counter + Base62 Encoding
Step 1: Unique ID Generation
Use:
- Centralized ID service OR
- Distributed ID generator (e.g., Snowflake-like)
Properties:
- Globally unique
- Monotonically increasing
- No collisions
Step 2: Base62 Encoding
ID β Base62 β short_code
- Characters: aβz, AβZ, 0β9
- Compact representation
Example:
123456789 β "8M0kX"
Why This Works:
- No collisions
- Predictable length
- Efficient storage
Optional Enhancement:
- Add randomness to prevent predictability (security)
6.2 Read Optimization (Critical)
- CDN caching (edge)
- Redis caching (hot data)
π Most requests never hit DB
6.3 Write Path
- Generate ID
- Encode β short code
- Store in DB
- Populate cache
6.4 Scaling Strategy
Stateless App Servers
- No local state
- Horizontally scalable
DB Scaling
- Sharding by:
hash(short_code)
Cache Scaling
- Redis cluster
6.5 Hot Key Handling (Critical for Viral Traffic)
Problem:
- A single short URL (e.g., viral link) can receive millions of requests/sec
- Can overload:
- Redis
- Backend
- DB
Multi-Layer Solution:
1. CDN Caching (Primary Defense)
- Popular URLs cached at edge
- Majority of requests never reach backend
2. Redis Replication + Read Scaling
- Use replicated Redis nodes
- Hot keys served from replicas
3. Request Coalescing (Important)
- Multiple simultaneous cache misses:
Only 1 request hits DB
Others wait or reuse result
4. Local In-Memory Cache (Side Optimization)
- App servers cache hottest keys temporarily
5. Hot Key Detection
Track:
- Access frequency per key
If threshold exceeded:
- Promote to βhot tierβ cache
6. Dedicated Hot Key Tier (Advanced)
- Store extremely popular URLs in:
- Separate Redis cluster OR
- In-memory distributed cache
π Ensures:
- No single key overwhelms system
- Scales even for viral traffic
7. Additional Considerations
7.1 Burst Traffic Handling
- CDN absorbs sudden spikes
- Redis handles high QPS
- Backend protected from overload
7.2 Cache Miss Handling
- Redis miss β DB lookup
- Populate cache after read
7.3 Degraded Mode
Case 1: Redis Down
- Direct DB reads (higher latency)
Case 2: DB Down
- Serve from cache if available
Case 3: Full Failure
- Graceful degradation (temporary errors)
7.4 Expiration Handling
- TTL on DB + Redis
- Background cleanup jobs
7.5 Analytics
- Async logging
- Stream processing (Kafka)
7.6 Security
- Prevent abuse:
- Rate limiting
- Spam detection
Failure Handling & Recovery
Case 1: Redis Failure
- Fallback to DB reads
- Increased latency but system continues
Mitigation:
- Redis replication + failover
Case 2: Database Failure
- Serve from:
- CDN cache
- Redis cache
If cache miss:
- Return graceful error
Case 3: CDN Failure
- Requests hit backend directly
- Redis absorbs increased load
Case 4: App Server Failure
- Stateless β replaced automatically
- Load balancer reroutes traffic
Case 5: Partial Network Failure
- Retry with exponential backoff
- Circuit breaker to prevent cascading failures
Recovery Strategy
- Automatic failover (Redis / DB replicas)
- Health checks + auto-scaling
Observability
Track:
- Cache hit ratio
- DB latency
- Error rate
Tools:
- Prometheus
- Grafana
TTL & Data Lifecycle Management
Problem:
- Billions of URLs β unbounded growth
Solution:
1. Expiry at Creation
- Optional TTL per URL
2. TTL in Storage
- Redis:
- Native TTL support
- DB:
- Store expiry timestamp
3. Background Cleanup Jobs
- Periodic jobs delete expired entries
- Prevents storage bloat
4. Cold Data Archival (Optional)
- Move inactive URLs to cheaper storage
5. Cache Eviction Policy
- Redis uses:
- LRU / LFU eviction
π Ensures:
- Controlled storage growth
- Efficient memory usage
π Final Summary
- CDN + Redis caching ensures ultra-low latency
- Base62 encoding ensures compact unique URLs
- Distributed DB + sharding ensures scalability
- Stateless services ensure horizontal scaling
- Multi-layer caching handles extreme read load