My Solution for Designing a Simple URL Shortening Service: A TinyURL Approach with Score: 9/10
by dewdrop_pinnacle261
System requirements
Functional Requirements:
- Shorten a given long URL into a short, unique URL
- Redirect to the original long URL when the short URL is accessed
- Allow optional custom short URLs (like tinyurl.com/myalias)
- Track basic analytics: number of clicks per short URL
- Support expiration time on URLs (optional)
- Provide API for shortening and expanding URLs
Non-Functional Requirements:
- High availability
- Low latency (~10-50ms for redirect)
- Horizontal scalability to handle growth
- Consistency: strong for creation, eventual for analytics
- Durability: no loss of mappings on failures
- Security: prevent abuse (rate limiting, spam detection)
Capacity estimation
Let’s assume:
- 100M new URLs per month → ~40 URLs/sec
- 1B total URLs stored after a few years
- 10B redirection requests per month → ~4K requests/sec
Storage:
- Each record ~500 bytes → 1B × 500B = ~500GB total
- Write throughput ~40 ops/sec, Read throughput ~4K ops/sec
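As a quick sanity check on these figures, the back-of-the-envelope arithmetic (assuming ~2.6M seconds in a 30-day month) works out as follows:

```python
SECONDS_PER_MONTH = 30 * 24 * 3600            # ~2.6M seconds

new_urls_per_month = 100_000_000              # 100M writes/month
redirects_per_month = 10_000_000_000          # 10B reads/month
total_urls = 1_000_000_000                    # 1B stored records
record_size_bytes = 500

write_qps = new_urls_per_month / SECONDS_PER_MONTH    # ~39  -> "~40 URLs/sec"
read_qps = redirects_per_month / SECONDS_PER_MONTH    # ~3,900 -> "~4K requests/sec"
storage_gb = total_urls * record_size_bytes / 1e9     # ~500 GB

print(f"{write_qps:.0f} writes/sec, {read_qps:.0f} reads/sec, {storage_gb:.0f} GB")
```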
API design
POST /api/shorten
Body: { longUrl: "https://...", customAlias?: "myalias", expireAt?: "2025-12-31" }
Response: { shortUrl: "https://tinyurl.com/abc123" }
GET /:shortCode
Redirects to the original long URL
GET /api/analytics/:shortCode
Response: { clicks: 12345 }
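A minimal client-side sketch of how these endpoints could be exercised, using Python's requests library; the base URL and the example long URL are assumptions for illustration, while the paths and field names follow the spec above:

```python
import requests

BASE = "https://tinyurl.com"   # assumed host, for illustration only

# Shorten a long URL; customAlias and expireAt are optional
resp = requests.post(f"{BASE}/api/shorten", json={
    "longUrl": "https://example.com/some/very/long/path",
    "customAlias": "myalias",
    "expireAt": "2025-12-31",
})
short_url = resp.json()["shortUrl"]            # e.g., https://tinyurl.com/myalias

# The short URL answers with an HTTP redirect to the original location
redirect = requests.get(short_url, allow_redirects=False)
assert redirect.status_code == 302

# Basic click analytics for the short code
code = short_url.rsplit("/", 1)[-1]
clicks = requests.get(f"{BASE}/api/analytics/{code}").json()["clicks"]
```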
Database design
Schema (Relational / NoSQL hybrid example):
| Field | Type | Notes |
| --- | --- | --- |
| short_code | String (PK) | Unique 6-8 char code |
| long_url | String | Original URL |
| created_at | Timestamp | Creation time |
| expire_at | Timestamp | Optional expiration |
| click_count | Integer | For analytics |
Indexes:
- short_code → fast lookup
- expire_at → periodic cleanup
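As one possible concrete form of this schema, here is a sketch using SQLAlchemy's declarative mapping; column names and types mirror the table above, while the table name, column lengths, and index name are illustrative assumptions:

```python
from sqlalchemy import BigInteger, Column, DateTime, Index, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class UrlMapping(Base):
    __tablename__ = "url_mappings"                        # assumed table name

    short_code = Column(String(8), primary_key=True)      # unique 6-8 char code
    long_url = Column(String(2048), nullable=False)       # original URL
    created_at = Column(DateTime, nullable=False)         # creation time
    expire_at = Column(DateTime, nullable=True)           # optional expiration
    click_count = Column(BigInteger, default=0)           # for analytics

# Secondary index so a background worker can sweep expired rows efficiently
Index("ix_url_mappings_expire_at", UrlMapping.expire_at)
```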
High-level design
Components:
- API Gateway / Load Balancer
- URL Shortening Service → handles shorten requests
- Redirect Service → handles GET /:shortCode
- Database → persistent store (e.g., MySQL/Postgres, Redis for cache)
- Cache Layer (Redis/Memcached) → speed up hot redirects
- Analytics Service → tracks click counts asynchronously
- Background Workers → cleanup expired URLs, process analytics
Request flows
Shorten Flow:
- Client → API Gateway → URL Shortening Service
- Validate long URL, check custom alias availability
- Generate short code (base62/random), store in DB
- Return short URL
Redirect Flow:
- Client → API Gateway → Redirect Service
- Check Redis cache → fallback to DB if cache miss
- Redirect (HTTP 302) to long URL
- Send click event to Analytics Service (async)
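A sketch of this redirect path, assuming Flask and redis-py; lookup_in_db and publish_click_event stand in for the real DAO and queue producer, and the in-memory FAKE_DB exists only to keep the example self-contained:

```python
import redis
from flask import Flask, abort, redirect

app = Flask(__name__)
cache = redis.Redis(host="localhost", port=6379, decode_responses=True)  # assumed endpoint

FAKE_DB = {"abc123": "https://example.com/some/very/long/path"}  # stand-in for the real store

def lookup_in_db(short_code):
    return FAKE_DB.get(short_code)             # hypothetical DAO call in the real system

def publish_click_event(event):
    print("click event:", event)               # real version would publish to Kafka/SQS

@app.route("/<short_code>")
def follow(short_code):
    long_url = cache.get(short_code)            # 1. check the cache first
    if long_url is None:
        long_url = lookup_in_db(short_code)     # 2. fall back to the DB on a miss
        if long_url is None:
            abort(404)
        cache.set(short_code, long_url)         # warm the cache for the next hit
    publish_click_event({"short_code": short_code})   # 3. analytics stays off the hot path
    return redirect(long_url, code=302)         # 4. temporary redirect, as in the flow above
```

Using a 302 rather than a 301 is deliberate: a permanent redirect would let browsers cache the target and bypass the service, losing click counts.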
Analytics Flow:
- Analytics Service consumes events
- Batches updates to the DB or stores counts in a fast counter system
Detailed component design
a) Short Code Generator
- Use Base62 (a-zA-Z0-9) → 6-8 char codes
- Option 1: Random generation + check uniqueness in DB
- Option 2: Incremental counter + Base62 encode
- Option 3: Hash long URL + collision resolution
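A sketch of Options 1 and 2, assuming Base62 codes of 6-8 characters; the alphabet ordering is arbitrary, and the collision check against the DB is left to the caller:

```python
import secrets
import string

BASE62 = string.digits + string.ascii_lowercase + string.ascii_uppercase   # 62 characters

def encode_base62(n: int) -> str:
    """Option 2: Base62-encode a monotonically increasing counter (e.g., a DB sequence)."""
    if n == 0:
        return BASE62[0]
    out = []
    while n > 0:
        n, rem = divmod(n, 62)
        out.append(BASE62[rem])
    return "".join(reversed(out))

def random_code(length: int = 7) -> str:
    """Option 1: random code; the caller must check the DB for collisions and retry."""
    return "".join(secrets.choice(BASE62) for _ in range(length))

assert encode_base62(125) == "21"   # 2*62 + 1
```

At 7 characters, 62^7 ≈ 3.5 trillion possible codes, comfortably above the 1B URLs estimated earlier, so random collisions stay rare and counter-based codes will not run out.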
b) Cache Layer (Redis)
- Store short_code → long_url mappings
- Use TTL aligned with expire_at
- Use LRU eviction
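A sketch of the cache write, assuming redis-py; the 1-day default TTL is an assumption, and the point is simply that the cache entry never outlives expire_at:

```python
from datetime import datetime, timezone
import redis

cache = redis.Redis(host="localhost", port=6379, decode_responses=True)  # assumed endpoint

DEFAULT_TTL_SECONDS = 24 * 3600   # assumption: 1 day for URLs with no explicit expiry

def cache_mapping(short_code: str, long_url: str, expire_at: datetime | None) -> None:
    """Cache a short_code -> long_url mapping with a TTL capped by expire_at."""
    ttl = DEFAULT_TTL_SECONDS
    if expire_at is not None:
        remaining = int((expire_at - datetime.now(timezone.utc)).total_seconds())
        ttl = max(1, min(ttl, remaining))
    cache.set(short_code, long_url, ex=ttl)    # Redis drops the key once the TTL lapses
```

LRU eviction itself is server-side configuration (e.g., maxmemory-policy allkeys-lru) rather than application code.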
c) Analytics Service
- Async event queue (Kafka/SQS)
- Batch writes to DB or use Redis atomic counters
- Daily materialization of click stats
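One way to realize the "Redis atomic counters plus batch writes" idea, sketched with redis-py; save_batch stands in for a hypothetical DAO call, and the flush would run on a schedule (e.g., every minute):

```python
import redis

counters = redis.Redis(host="localhost", port=6379, decode_responses=True)  # assumed endpoint

def record_click(short_code: str) -> None:
    """Hot path: one atomic INCR per click, no DB write."""
    counters.incr(f"clicks:{short_code}")

def flush_counts_to_db(save_batch) -> None:
    """Background job: drain the Redis counters into the DB in one batch.

    save_batch is a hypothetical DAO callable taking {short_code: delta}.
    """
    batch = {}
    for key in counters.scan_iter(match="clicks:*"):
        delta = counters.getset(key, 0)        # atomic read-and-reset per key
        if delta and int(delta) > 0:
            batch[key.split(":", 1)[1]] = int(delta)
    if batch:
        save_batch(batch)   # e.g., UPDATE ... SET click_count = click_count + :delta
```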
Trade-offs/Tech choices
- Relational DB (MySQL) → strong consistency
- Redis cache → fast redirects, reduce DB load
- Async analytics → avoids slowing user redirects
- Base62 encoding → short, URL-safe codes
- Custom aliases → requires uniqueness checks, extra DB load
Failure scenarios/bottlenecks
DB failure: Use master-slave replication and backups
Cache failure: Fallback to DB, rebuild cache on access
Hot key (popular short code): Use cache, sharded counters
Code collision: Check DB, retry generation
Network partition: Favor availability on reads, quorum writes for consistency
API abuse (spam): Add rate limits, CAPTCHA, user auth
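One common way to implement the rate limits mentioned above is a token bucket per client. This is an in-process sketch; a production version would keep the bucket state in Redis so every API node shares the same view of a client's quota, and the 5 req/sec and burst-of-20 numbers are illustrative assumptions:

```python
import time

class TokenBucket:
    """Allows `rate_per_sec` sustained requests with bursts up to `burst`."""

    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# Example: each API key may create at most 5 short URLs/sec, bursting to 20
bucket = TokenBucket(rate_per_sec=5, burst=20)
if not bucket.allow():
    pass  # respond with HTTP 429 Too Many Requests
```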
Future improvements
Use CDN edge caching for global low-latency redirects
Add user accounts, link management dashboard
Support QR code generation
Add A/B testing for link targeting
Use a distributed ID generator (e.g., Twitter Snowflake); see the sketch after this list
Cross-region replication for disaster recovery
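A minimal sketch of the Snowflake-style layout referenced above (41-bit millisecond timestamp, 10-bit worker ID, 12-bit per-millisecond sequence); the custom epoch is an arbitrary assumption, and the resulting integer could be Base62-encoded with the helper shown earlier:

```python
import threading
import time

class SnowflakeLikeIds:
    """Generates roughly time-ordered 63-bit IDs that are unique per worker."""

    EPOCH_MS = 1_700_000_000_000   # assumed custom epoch (Nov 2023)

    def __init__(self, worker_id: int):
        assert 0 <= worker_id < 1024              # 10 bits
        self.worker_id = worker_id
        self.sequence = 0
        self.last_ms = -1
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = int(time.time() * 1000)
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & 0xFFF   # 12-bit sequence
                if self.sequence == 0:            # exhausted this millisecond, spin to the next
                    while now <= self.last_ms:
                        now = int(time.time() * 1000)
            else:
                self.sequence = 0
            self.last_ms = now
            return ((now - self.EPOCH_MS) << 22) | (self.worker_id << 12) | self.sequence
```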
Areas of Improvement
- Algorithm detail: Improve collision resistance (e.g., Bloom filters)
- Granularity: Per-region traffic monitoring
- Dynamic adjustments: Autoscaling backend services, dynamic cache TTLs
- Redis complexity: Use pipelining, Lua scripts for atomic operations
Potential Gaps / Edge Cases
- Edge Cases:
- Empty or invalid URLs
- Expired links
- Custom alias conflicts
- Network Partitions:
- Redis failover
- Database write quorum handling
Areas of Depth for Next Level
- Algorithm implementation:
- Distributed ID generation
- Conflict-free replicated data types (CRDTs) for click counts (see the sketch at the end of this section)
- Scalability enhancements:
- Partition DB by short code prefix
- Use of CDN edge nodes
- Cross-region consistency:
- Global DB clusters (e.g., CockroachDB)
- Async replication with conflict resolution
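To make the CRDT idea above concrete, here is a sketch of a grow-only counter (G-Counter), one of the simplest CRDTs: each region increments only its own slot, and merges take the per-node maximum, so replicas converge on the same click total regardless of message order or duplication. Node names are illustrative:

```python
class GCounter:
    """Grow-only counter CRDT for distributed click counting."""

    def __init__(self, node_id: str):
        self.node_id = node_id
        self.counts: dict[str, int] = {node_id: 0}

    def increment(self, amount: int = 1) -> None:
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())          # total clicks across all regions

    def merge(self, other: "GCounter") -> None:
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

# Two regions count clicks independently, then exchange and merge state
us, eu = GCounter("us-east"), GCounter("eu-west")
us.increment(3)
eu.increment(5)
us.merge(eu)
assert us.value() == 8
```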