Requirements
Functional Requirements:
- Create a short URL for a given long URL.
- Return the long URL associated with a given short URL.
Non-Functional Requirements:
- High Availability: The system should be operational and accessible at all times, ensuring that users can shorten and retrieve URLs without downtime.
- Low Redirect Latency: Users expect quick redirects when they click on a short URL, so minimizing latency is crucial for a good user experience.
- Horizontal Scalability: As your service grows, it should be able to handle increased load by adding more servers rather than upgrading existing ones.
API Design
Post request to /short-url Body:{url:"longurl1"}
returns:{shorturl}
So this would first validate the URL if it is correct or not so if correct then would proceed to back end then the bacon would have shot in url functionality then it was saving the database and return the URL.
Get: /short-url will fetch all the url which user has it and will alow params based and paginated
Post: /long-url Body:{url:shorturl} will check the db if this short url exist in our place and if found return the orignal url
High-Level Design
1. Entry Layer (API / Edge Tier)
When user hits the system:
For shortening:
POST /shorten
Responsibilities:
- ✅ URL validation (is it valid format?)
- ✅ Authentication (optional)
- ✅ Rate limiting (prevent abuse, e.g. 5 req/sec)
- ✅ Forward request to backend service
2. Write Path (URL Shortening Flow)
Step-by-step:
- User sends long URL
- API validates & rate limits
- Request goes to URL Shortener Service
Inside URL Shortener Service:
Step 1: Check cache (optional optimization)
- If already exists → return existing short URL
Step 2: Generate short code
Options:
- Base62(counter)
- Random string
- Hash
Example:
https://example.com/abc → "xYz123"
Step 3: Store in DB
shortCode → longURL
Step 4: Update cache
cache[xYz123] = longURL
Step 5: Return response
https://short.ly/xYz123
3. Read Path
When user clicks short URL:
GET /xYz123
Flow:
Step 1: Check cache (FAST path 🚀)
if exists → return immediately
Step 2: If cache miss → query DB
xYz123 → longURL
Step 3: Update cache
Step 4: Redirect user
HTTP 302 → original URL
short_code (PRIMARY KEY)
long_url
created_at
expiry (optional)
Key points:
- Indexed on
short_code - Optimized for fast reads
- Request always hits API
- Cache is checked inside system, not bypassing API
Correct understanding:
User → API → Cache → DB (if needed)
Let’s make it solid:
- Apply at API layer
- Example:
- 5 requests/sec per user
- Prevents:
- Spam
- System overload
| ComponentResponsibility | |
| API Layer | Validation, auth, rate limiting |
| Write Service | Generate short URL, store |
| Read Service | Handle redirects |
| Cache (Redis) | Fast lookup |
| DB | Persistent storage |
- “For identifier generation, I would use a distributed ID generator like Snowflake to ensure globally unique, collision-free IDs without coordination. These IDs are Base62 encoded to generate short URLs.”
- “For caching, I would use Redis with a cache-aside pattern and write-through on creation. LRU eviction ensures hot URLs stay cached, and TTL is used for expiring links.”
- “Consistency is handled via write-through updates and lazy cache population on misses, giving us eventual consistency with high performance.”
Here is a complete, production-ready system design for a TinyURL-like URL shortener, with all critical details included end-to-end.
1. High-Level Architecture
User → Load Balancer → App Servers
Write Path:
User → API → Identifier Service → Storage → Cache → Response
Read Path:
User → API → Cache → DB → Redirect
2. Core Components Overview
- Identifier Generation Service → generates unique short codes
- Storage Layer → stores URL mappings
- Redirection Service → handles high-volume reads
- Cache Layer → reduces latency and DB load
- Rate Limiter → protects system from abuse
3. Identifier Generation Service
Responsibility
Generate globally unique, scalable, collision-safe IDs
Design: Distributed Snowflake + Base62
ID Structure (64-bit)
| timestamp (41) | machine_id (10) | sequence (12) |
Flow
- Request hits app server
- Generate ID locally (no DB dependency)
- Convert ID → Base62
- Return short code
Algorithms & Complexity
- Bit manipulation → O(1)
- Base62 encoding → O(log₆₂ N)
Scaling
- Each server has unique
machine_id - No coordination required
- Horizontal scaling is linear
Throughput
- Per node: 4096 IDs/ms
- 100 nodes → millions/sec
Collision & Failure Handling
1. DB-Level Guarantee
UNIQUE(short_code)
2. Retry Logic
Generate → Insert
↓
Conflict → regenerate → retry (max 2–3 times)
3. Clock Drift Handling
- If clock goes backward:
- Pause generation OR
- Use last_timestamp + 1
4. Sequence Overflow
- If >4096 IDs/ms:
- Wait for next millisecond
Tradeoffs
Pros
- No DB bottleneck
- Highly scalable
- Practically collision-free
Cons
- Slight predictability
- Requires machine_id coordination
4. Redirection Service (Read Path)
Responsibility
Handle:
GET /abc123 → Redirect to original URL
Flow
- Load balancer routes request
- Check cache
- Cache HIT → return immediately
- Cache MISS → query DB → populate cache → return
Performance Goal
- Latency < 50ms
- 90%+ requests served from cache
Data Structures
Cache
- Key: short_code
- Value: long_url
DB Index
- Primary index on
short_code
Multi-Layer Scaling
1. CDN (optional)
- Edge caching for global users
2. Cache Layer
Use Redis
- O(1) lookup
- Handles majority traffic
3. DB Layer
- Only for cache misses
Tradeoffs
Pros
- Extremely fast reads
- Scales horizontally
Cons
- Cache misses hit DB
- Memory cost
5. Storage Layer
Responsibility
Store mapping:
short_code → long_url
Schema
Table: url_mapping
short_code (PK)
long_url
created_at
expiry_time
user_id (optional)
Indexing
- Primary:
short_code - Secondary:
user_id(optional)
Data Structures
- SQL → B-tree
- NoSQL → hash-based
Scaling
1. Sharding
hash(short_code) % N
- Even distribution
- Avoid hotspotting
2. Replication
- Primary → writes
- Replicas → reads
Capacity Estimation
- 100M URLs
- Avg 200B per URL
100M × 200B = 20GB
index overhead → ~40–50GB
Tradeoffs
| OptionProsCons | ||
| SQL | Strong consistency | Harder to scale |
| NoSQL | Scalable | Eventual consistency |
Failure Handling
- Replication failover
- Backup + restore
- Retry on transient failures
6. Caching Strategy (Detailed)
Cache Type
Use Redis
Cache Pattern
Hybrid Approach
- Write-through (on create/update)
- Cache-aside (on read)
TTL Strategy
1. Expiring URLs
TTL = expiry_time − current_time
2. Permanent URLs
- Default TTL: 24 hours
- Sliding TTL:
- Reset TTL on each access
3. Hot URLs
- TTL: 1–6 hours
- Frequently refreshed by traffic
Eviction Policy
- LRU (Least Recently Used)
Preventing Cache Bloat
1. Memory Limit
- Example: 10–20 GB cap
2. Admission Control
- Cache only after 2–3 hits
- Avoid caching one-time URLs
3. Negative Caching
short_code → NULL
TTL = 5 minutes
Cache Invalidation
On Update/Delete
DB update → Cache delete/update
Event-Based Invalidation
- Publish event:
URL_UPDATED / URL_DELETED
- Consumers remove from cache
TTL as Backup
- Ensures stale data eventually removed
Consistency Model
- Write-through ensures freshness
- Cache-aside ensures performance
- Eventual consistency acceptable
7. Rate Limiting
Where Applied
- URL creation API → strict limits
- Redirect API → relaxed limits (DDoS protection)
Algorithm: Token Bucket
Configuration Example
Capacity: 100 requests
Refill: 10 req/sec
Behavior
- Allows bursts up to 100
- Then throttles to 10/sec
Implementation
Use Redis
Key
rate_limit:{user_id}
Store
- Tokens
- Last refill timestamp
Burst Handling
- Within limit → allow
- Exceeded → reject
Response
429 Too Many Requests
Retry-After: 10
Client Backoff Strategy
Exponential backoff:
1s → 2s → 4s → 8s → capped
Advanced Protection
- Per IP limit
- Per user limit
- Global rate limit
Tradeoffs
| ApproachProsCons | ||
| Token Bucket | Handles bursts well | Slight complexity |
| Fixed Window | Simple | Burst spikes |
| Sliding Window | Accurate | Expensive |
8. End-to-End Flow
Write Path
User → API
→ Generate ID
→ Base62 encode
→ Store in DB
→ Update cache
→ Return short URL
Read Path
User → API
→ Cache lookup
→ HIT → Redirect
→ MISS → DB → Cache → Redirect
9. Key Design Characteristics
- O(1) ID generation
- O(1) cache lookup
- Horizontally scalable
- Read-optimized architecture
- Controlled memory usage via TTL + LRU
- Strong uniqueness guarantees
- Resilient to failures and high concurrency