My Solution for Designing a Simple URL Shortening Service: A TinyURL Approach
by whisper3949
Requirements
Functional Requirements:
- Create a short URL for a given long URL.
- Return the long URL associated with a given short URL.
Non-Functional Requirements:
- Low redirect latency
- High availability
- Horizontal scaling
- Durability
- Read-heavy optimization
- Minimal URL length
- Non-guessable short codes
API Design
Write Path - Create Short URL
POST /api/v1/urls
Headers:
Authorization: Bearer <token>
Content-Type: application/json
Request Body:
{
"long_url": "https://example.com/very/long/path/to/resouce",
"custom_alias": "my-link",
"expiry_date": "2027-01-01"
}
Response (201 Created):
{
"short_url": "https://short.ly/Ab3xK9z",
"short_code": "Ab3xK9z",
"long_url": "https://example.com/very/long/path/to/resource",
"created_at": "2026-04-15T10:30:00Z",
"expires_at": "2027-01-01T00:00:00Z"
}
Errors:
400 Bad Request - Invalid URL format
409 Conflict - Custom alias already exists
429 Too Many Requests - Rate limit exceeded
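To make the write path concrete, here is a minimal client-side sketch (Python with the requests library; the host and token are placeholders):

import requests

# Call the write path defined above; token and host are placeholders.
resp = requests.post(
    "https://short.ly/api/v1/urls",
    headers={"Authorization": "Bearer <token>",
             "Content-Type": "application/json"},
    json={"long_url": "https://example.com/very/long/path/to/resource",
          "custom_alias": "my-link",
          "expiry_date": "2027-01-01"},
)
if resp.status_code == 201:
    print(resp.json()["short_url"])   # e.g. https://short.ly/Ab3xK9z
elif resp.status_code == 409:
    print("custom alias already taken")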
Read Path - Redirect Short URL
GET /:shortCode
Example: GET https://short.ly/Ab3xK9z
Response (301 Moved Permanently):
HTTP/1.1 301 Moved Permanently
Location: https://example.com/very/long/path/to/resource
Why 301 and not 302?
301 = Permanent Redirect. Browsers cache it, so repeat clicks skip our servers entirely (lower latency, less load).
302 = Temporary Redirect. Every click comes back through the server, which is only preferable when per-click analytics must be exact, since cached 301s bypass the Analytics Service.
Errors:
404 Not Found
410 Gone - URL expired
High-Level Design
The system is split into two main paths that both enter through an API Gateway:
WRITE PATH (Create Short URL):
Client sends POST request with long URL → API Gateway handles rate limiting, authentication, and request validation → Routes to Shortener Service → Shortener Service calls ID Generation Service which uses Base62 encoding of a distributed counter with pre-allocated ranges to generate a unique 7-character short code → Shortener Service stores the short_code → long_url mapping in the NoSQL Database (partitioned by short_code for even distribution) → Pre-warms Redis Cache with the new mapping → Returns the short URL to the client.
READ PATH (Redirect Short URL):
User clicks short URL → API Gateway routes to Redirect Service → Redirect Service checks Redis Cache first (1ms latency). On cache HIT, immediately returns 301 Permanent Redirect to the original long URL. On cache MISS, queries the NoSQL Database (5-10ms), stores the result in Redis Cache for future requests, then returns 301 redirect.
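A minimal sketch of the Redirect Service endpoint, assuming Flask; resolve() is the cache-first lookup sketched under Component 2:

from flask import Flask, abort, redirect

app = Flask(__name__)

@app.route("/<short_code>")
def follow(short_code):
    # resolve() checks Redis first, then the DB (see Component 2).
    long_url = resolve(short_code)
    if long_url is None:
        abort(404)                       # unknown (or expired -> 410) code
    return redirect(long_url, code=301)  # browsers may cache this permanently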
KEY COMPONENTS:
- API Gateway: Single entry point for both paths. Handles rate limiting (Token Bucket), authentication via API keys, and request routing.
- Shortener Service: Handles URL creation logic. Validates input, coordinates with ID Generation Service, writes to database.
- Redirect Service: Handles URL resolution. Optimized for speed with cache-first approach.
- ID Generation Service (Base62): Generates unique short codes using counter-based Base62 encoding with range allocation across multiple servers. Zero collision risk.
- Redis Cache: Cache-Aside pattern. Caches top 20% of URLs (2GB). 24-hour TTL. Handles 80% of read traffic.
- NoSQL Database: DynamoDB or Cassandra. Partitioned by short_code. Stores all URL mappings with metadata.
- Analytics Service: Asynchronously tracks click counts via message queue. Does not affect redirect latency.
https://link.excalidraw.com/readonly/hTYiWAyA3tITszIxyM9B
Detailed Component Design
Component 1: ID Generation Service (Base 62)
A. How does it work?
Base62 counter with range allocation.
62 characters (a-z, A-Z, 0-9); 7 chars = 62^7 ≈ 3.5 trillion unique URLs.
Why Base62 and not an MD5 hash?
A Base62-encoded counter has zero collisions by construction. Truncating an MD5 hash of the long URL to 7 characters can collide, which would force a collision check on every write.
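A minimal sketch of the counter-to-code encoding (Python; the alphabet ordering is an arbitrary choice):

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode_base62(n: int, width: int = 7) -> str:
    # Repeatedly take the remainder mod 62 to build the code right-to-left.
    code = ""
    while n:
        n, rem = divmod(n, 62)
        code = ALPHABET[rem] + code
    # Zero-pad so every short code has a fixed length.
    return code.rjust(width, ALPHABET[0])

# 62**7 = 3,521,614,606,208 (~3.5 trillion codes)
print(encode_base62(125))  # -> "0000021"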
B. How does it scale?
Range allocation: Server 1 gets 1-1,000,000, Server 2 gets 1,000,001-2,000,000, and so on. Each server works independently; no coordination on the hot path. ZooKeeper/etcd assigns a new range when a server's range runs out.
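A sketch of the per-server allocator; fetch_and_add() stands in for a hypothetical atomic counter in ZooKeeper/etcd, and everything else runs locally:

RANGE_SIZE = 1_000_000

class IdAllocator:
    def __init__(self):
        self.next_id = None
        self.range_end = None

    def _refill(self):
        # The only coordination point: atomically reserve the next block.
        start = fetch_and_add(RANGE_SIZE)  # hypothetical ZooKeeper/etcd call
        self.next_id, self.range_end = start, start + RANGE_SIZE

    def next(self) -> int:
        if self.next_id is None or self.next_id >= self.range_end:
            self._refill()
        nid = self.next_id
        self.next_id += 1       # purely local, no network round trip
        return nid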
C. How big?
Write: 100M URLs/month ÷ ~2.5M sec/month = 40 QPS
Read (10:1 read/write ratio): 400 QPS
Storage: 100M/month × 200 B × 60 months = 1.2 TB
Cache: 34.5M reads/day × 0.20 × 200 B ≈ 1.4 GB -> provision 2 GB of Redis
At 40 write QPS, a single ID server is more than enough. Add more for redundancy, not capacity.
D. What if?
- Counter service fails? -> Servers keep generating from their pre-allocated ranges; URL creation continues.
- Two servers get the same range? -> ZooKeeper's atomic counter hands out each range exactly once.
- Custom alias requested? -> Check the DB first; return 409 if taken.
Fallback mechanism for ID Generation outages:
Primary: Distributed counter with Zookeeper range allocation
Fallback 1 (short outage < 5 min):
Each server pre-allocates a range of 1 million IDs locally.
If Zookeeper is down, server continues generating from its
local range. No disruption to URL creation.
Fallback 2 (extended outage > 5 min):
Switch to UUID-based generation (timestamp + server_id + random).
Format: milliseconds(41 bits) + shard_id(5 bits) + sequence(12 bits)
Similar to Twitter Snowflake. Guarantees uniqueness without
any central coordination.
Fallback 3 (split-brain — two Zookeeper leaders):
Range gaps are acceptable. If Server A thinks its range is
1M-2M and Server B also gets 1M-2M due to split-brain,
both generate IDs but with a server_id prefix to avoid collision.
After split-brain resolves, reconcile and reassign clean ranges.
Recovery: When Zookeeper recovers, servers request fresh ranges
and resume normal counter-based generation. No data migration needed.
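A sketch of the Fallback 2 generator described above: 41-bit millisecond timestamp + 5-bit shard_id + 12-bit sequence (58 bits total). Note that a 58-bit ID encodes to about 10 Base62 characters, so fallback codes come out longer than 7:

import time

EPOCH = 1700000000000  # custom epoch in ms (arbitrary assumption)

class SnowflakeLike:
    def __init__(self, shard_id: int):
        assert 0 <= shard_id < 32          # must fit in 5 bits
        self.shard_id = shard_id
        self.last_ms = -1
        self.seq = 0

    def next(self) -> int:
        ms = int(time.time() * 1000) - EPOCH
        if ms == self.last_ms:
            self.seq = (self.seq + 1) & 0xFFF    # 12-bit sequence
            if self.seq == 0:                    # exhausted this millisecond:
                while ms <= self.last_ms:        # spin until the clock advances
                    ms = int(time.time() * 1000) - EPOCH
        else:
            self.seq = 0
        self.last_ms = ms
        # timestamp | shard | sequence, packed into one integer
        return (ms << 17) | (self.shard_id << 12) | self.seq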
Component 2: Caching
How does it work?
Cache-Aside pattern with Redis.
Read: Check Redis -> HIT = return (1ms). MISS = query DB (5ms) -> store in Redis.
Write: Store in DB -> pre-warm cache.
TTL: 24 hours. LRU eviction when the cache is full.
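A minimal sketch of the pattern with redis-py; db_get()/db_put() stand in for the DynamoDB calls sketched under Component 3:

import redis

r = redis.Redis()
TTL = 24 * 3600  # 24-hour TTL, matching the policy above

def resolve(short_code: str):
    long_url = r.get(short_code)
    if long_url is not None:                 # cache HIT (~1 ms)
        return long_url.decode()
    long_url = db_get(short_code)            # cache MISS -> DB (~5-10 ms)
    if long_url is not None:
        r.set(short_code, long_url, ex=TTL)  # populate for the next reader
    return long_url

def create(short_code: str, long_url: str):
    db_put(short_code, long_url)             # durable write first
    r.set(short_code, long_url, ex=TTL)      # pre-warm the cache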
Eviction Policy: LRU (Least Recently Used)
- When Redis hits 2GB memory limit, LRU automatically removes
the URL that hasn't been accessed for the longest time.
- Criteria: Last-access timestamp. URLs clicked today stay,
URLs not clicked for weeks get evicted first.
- Impact on performance: LRU keeps cache hit rate at ~80%
because popular URLs are always recently accessed.
- Combined with 24hr TTL: even frequently accessed URLs get
a fresh DB read once per day, preventing stale data.
- Redis config: maxmemory-policy = allkeys-lru
Why Cache-Aside and not write-through?
Because we don't want to keep every new URL in the cache, only the ones people actually click. 80/20 rule: 20% of URLs get 80% of clicks. (New URLs are pre-warmed on write, but LRU evicts them quickly if nobody clicks them.)
How does it scale?
One Redis instance handles 2 GB easily; if traffic grows, scale horizontally with Redis Cluster or read replicas.
How big?
Daily reads: 400 QPS × 86,400 sec = 34.5M/day
Cache 20%: 34.5M × 0.20 × 200 bytes ≈ 1.4 GB -> provision 2 GB of Redis
What if?
- Redis crashes -> All reads hit the DB; latency rises from 1ms to 5-10ms, but the system keeps working, just slower.
- Cache stampede -> Distributed lock (SETNX): only one request fetches from the DB (see the sketch after this list).
- URL deleted -> Delete the Redis key immediately; the next read gets fresh data from the DB.
- Cache full -> LRU eviction removes the least recently used URLs automatically.
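A sketch of the stampede protection above: SET NX with a short safety TTL as the lock, so only one request rebuilds a missing key (reuses r, TTL, and db_get from the earlier sketches):

import time

def resolve_with_lock(short_code: str):
    long_url = r.get(short_code)
    if long_url is not None:
        return long_url.decode()
    lock_key = f"lock:{short_code}"
    if r.set(lock_key, "1", nx=True, ex=5):   # SETNX + 5s TTL so a crash can't wedge the lock
        try:
            long_url = db_get(short_code)     # only the lock holder touches the DB
            if long_url is not None:
                r.set(short_code, long_url, ex=TTL)
            return long_url
        finally:
            r.delete(lock_key)
    time.sleep(0.05)                          # lost the race: wait, then re-check the cache
    return resolve_with_lock(short_code)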
Component 3: Database
How does it work?
NoSQL database (DynamoDB). Partition key = short_code.
Why NoSQL: it's a simple key-value lookup with no joins, and horizontal scaling is built in.
Why not SQL: we don't need complex joins or strong ACID transactions, and sharding a relational database adds operational overhead.
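A sketch of the table access with boto3; the table name "urls" and its attributes are assumptions. The conditional put is what backs the 409 on duplicate custom aliases:

import boto3
from botocore.exceptions import ClientError

table = boto3.resource("dynamodb").Table("urls")

def db_put(short_code: str, long_url: str) -> bool:
    try:
        table.put_item(
            Item={"short_code": short_code, "long_url": long_url},
            ConditionExpression="attribute_not_exists(short_code)",
        )
        return True
    except ClientError as e:
        if e.response["Error"]["Code"] == "ConditionalCheckFailedException":
            return False   # code/alias already taken -> surface as 409
        raise

def db_get(short_code: str):
    item = table.get_item(Key={"short_code": short_code}).get("Item")
    return item["long_url"] if item else None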
How does it scale?
Partition by short_code -> even distribution (Base62 codes spread uniformly across partitions).
No hot partitions, because short codes are effectively random.
Read replicas for read scaling if needed.
How big?
100M URLs/month × 200 B × 60 months = 1.2 TB (5 years)
With 3x replication: 3.6 TB
What if?
- Hot partition -> The cache absorbs 80% of reads; the DB is barely touched.
- URL expires -> A background cleanup job removes expired rows.
- Need analytics -> Run an analytics service that consumes click events from the message queue and queries the DB for reports.
Handling cache miss impact on database:
Normal state: 80% cache hit rate → only 20% of reads hit DB
400 QPS reads × 0.20 = 80 queries/sec to DB (easily handled)
Worst case (Redis down): 100% cache miss → all 400 QPS hit DB
DB can handle 5,000-10,000 QPS, so 400 QPS is still fine.
Latency increases from 1ms to 5-10ms but no outage.
Cache warming after restart:
Cold cache gradually warms through natural traffic.
Within 1 hour, hit rate recovers to ~60%.
Within 24 hours, back to normal ~80%.
Protection against DB overload during cache miss spikes:
- Connection pooling: cap DB connections at 100 so a miss storm cannot exhaust the database.
- Circuit breaker: if DB latency exceeds 100ms, return cached stale data (better stale than slow).
- Request coalescing: if 100 users request the same uncached URL, only 1 query goes to the DB; the others wait for that result (see the sketch below).
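A sketch of single-process request coalescing: the first caller for a key owns the DB fetch, and later callers block on the same Future. (Cross-server coalescing is the SETNX lock from Component 2.)

import threading
from concurrent.futures import Future

_inflight: dict[str, Future] = {}
_guard = threading.Lock()

def coalesced_lookup(short_code: str):
    with _guard:
        fut = _inflight.get(short_code)
        owner = fut is None
        if owner:                       # first caller: register the in-flight fetch
            fut = Future()
            _inflight[short_code] = fut
    if owner:
        try:
            fut.set_result(db_get(short_code))   # the single DB query
        except Exception as exc:
            fut.set_exception(exc)
        finally:
            with _guard:
                _inflight.pop(short_code, None)
    return fut.result()                 # everyone gets the same answer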