Requirements
Functional Requirements:
- Create a short URL for a given long URL.
- Return the long URL associated with a given short URL.
Non-Functional Requirements:
- Low-latency redirects.
- High availability.
- Horizontal scalability for read-heavy traffic.
## 1. Create Short URL API (Write Path)
POST /api/v1/shorten
Purpose:
This API accepts a long URL and generates a unique short URL.
Request:
{
"long_url": "https://example.com/very-long-url"
}
Write Flow:
1. Client sends long URL request
2. Request reaches Load Balancer
3. Load Balancer routes request to App Server
4. App Server validates URL
5. Generate unique short code
6. Store mapping in Primary Database
7. Store mapping in Redis cache
8. Return generated short URL
Response:
{
"short_url": "https://tiny.ly/aX91BcQ"
}
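The write flow above can be sketched as a single function. This is a minimal sketch under stated assumptions, not the service's actual implementation: the dictionaries `db` and `cache` stand in for the Primary Database and Redis, and the names `shorten`, `is_valid_url`, `BASE_URL`, and the `generate_code` callable are hypothetical.

```python
from urllib.parse import urlparse

BASE_URL = "https://tiny.ly"  # domain taken from the example response

db = {}     # stand-in for the Primary Database: short_code -> long_url
cache = {}  # stand-in for Redis


def is_valid_url(url: str) -> bool:
    """Accept only absolute http(s) URLs that include a host."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def shorten(long_url: str, generate_code) -> dict:
    """Steps 4-8 of the write flow: validate, generate, store, cache, respond."""
    if not is_valid_url(long_url):
        raise ValueError("long_url must be an absolute http(s) URL")
    short_code = generate_code()   # step 5: strategy is injected by the caller
    db[short_code] = long_url      # step 6: durable write first
    cache[short_code] = long_url   # step 7: warm the cache for the first redirect
    return {"short_url": f"{BASE_URL}/{short_code}"}
```

Writing to the database before the cache means a failed database write never leaves a cached mapping that does not exist durably.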
## 2. Redirect API (Read Path)
GET /{short_code}
Example:
GET /aX91BcQ
Purpose:
This API redirects the user from the short URL to the original long URL.
Read Flow:
1. User requests short URL
2. Request reaches Load Balancer
3. Request routed to App Server
4. App Server first checks Redis cache
5. If cache miss → lookup Database
6. Populate Redis cache if DB hit
7. Return HTTP 302 Redirect response
Response:
HTTP/1.1 302 Found
Location: https://example.com/very-long-url
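The last step of the read flow, turning a lookup result into an HTTP response, might look like the sketch below. The function name `redirect_response` and the `lookup` callable are hypothetical, and returning 404 for an unknown short code is an assumption; the flow above does not say what happens when the code is not found.

```python
from typing import Callable, Optional


def redirect_response(short_code: str, lookup: Callable[[str], Optional[str]]):
    """Return (status, headers) for GET /{short_code}.

    `lookup` hides the cache and database behind one call and returns
    the long URL, or None when the short code is unknown.
    """
    long_url = lookup(short_code)
    if long_url is None:
        return 404, {}                   # assumed behaviour for unknown codes
    return 302, {"Location": long_url}   # temporary redirect to the original URL
```

A 302 is a temporary redirect, so clients are not expected to cache it permanently and each visit keeps reaching the service. A 301 would let browsers cache the target and skip the service on later visits.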
## High-Level Design
The URL shortening system is designed as a scalable distributed architecture optimized for heavy read traffic and low-latency redirects.
The system mainly consists of:
- Client
- Load Balancer
- Multiple App Servers
- Redis Cache
- Primary Database
- Read Replica Databases
The Load Balancer distributes requests across multiple app servers to support horizontal scaling and high availability.
App Servers handle:
- URL validation
- short code generation
- redirect handling
- cache interactions
Redis serves redirect lookups from memory, which matters because redirect requests far outnumber URL creation requests.
The Primary Database stores permanent URL mappings, while replica databases help scale read traffic and improve availability.
flowchart TD
A["Client / User"]
B["Load Balancer"]
C1["App Server 1"]
C2["App Server 2"]
C3["App Server 3"]
D["Redis Cache"]
E["Primary Database"]
F1["Read Replica 1"]
F2["Read Replica 2"]
A --> B
B --> C1
B --> C2
B --> C3
C1 --> D
C2 --> D
C3 --> D
C1 --> E
C2 --> E
C3 --> E
E --> F1
E --> F2
## 1. Short Code Generation Service
The short code generation service is responsible for generating a unique short code for each long URL submitted by users.
Initially, the system can use:
Database Auto Increment ID + Base62 Encoding
Flow:
1. Insert long URL into DB
2. DB generates unique numeric ID
3. Convert ID to Base62 string
4. Use generated Base62 value as short code
Example:
125789 → GS1 (Base62 digits 32, 44, 53)
Base62 characters, in digit order (a = 0, A = 26, 0 = 52):
[a-zA-Z0-9]
This provides compact and URL-friendly identifiers.
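A minimal sketch of the ID-to-code conversion, assuming the alphabet order a-z, A-Z, 0-9 as listed above. The function names are illustrative, and a different alphabet order would produce different codes for the same ID.

```python
ALPHABET = (
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789"
)  # digit order follows [a-zA-Z0-9]
BASE = len(ALPHABET)  # 62


def base62_encode(n: int) -> str:
    """Convert a non-negative database ID to its Base62 string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, BASE)
        digits.append(ALPHABET[rem])      # least significant digit first
    return "".join(reversed(digits))


def base62_decode(code: str) -> int:
    """Inverse of base62_encode; raises ValueError on unknown characters."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

With this scheme, codes of up to 7 characters cover 62^7 ≈ 3.5 trillion IDs.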
Scaling:
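Sequential IDs are predictable at any scale: anyone can enumerate short codes and infer how fast the system is growing. At very large scale, a single auto-increment counter also becomes a write bottleneck.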
The system can later evolve toward:
- Randomized Base62 tokens
- NanoID
- Snowflake-based distributed IDs
to support distributed generation and improve security.
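As an illustration of the Snowflake-based option, the sketch below packs a millisecond timestamp, a worker ID, and a per-millisecond sequence into one integer. The 41/10/12 bit split follows the commonly cited Snowflake layout; the epoch value, the class name, and the injectable `clock` are assumptions made for this example.

```python
import threading
import time

EPOCH_MS = 1_600_000_000_000  # assumed custom epoch in ms; any fixed past instant works
WORKER_BITS = 10              # up to 1024 generators
SEQUENCE_BITS = 12            # up to 4096 IDs per millisecond per worker
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeGenerator:
    def __init__(self, worker_id: int, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self.clock = clock
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = self.clock()
            if now < self.last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & MAX_SEQUENCE
                if self.sequence == 0:
                    # Sequence exhausted for this millisecond: spin until the next one.
                    while now <= self.last_ms:
                        now = self.clock()
            else:
                self.sequence = 0
            self.last_ms = now
            return (
                ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self.sequence
            )
```

Each app server can generate IDs without coordinating with the others, as long as worker IDs are unique. The resulting 64-bit values are time-ordered, and their Base62 form can be up to 11 characters long, so codes are longer than with a small auto-increment counter.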
Tradeoffs:
Auto Increment + Base62:
- Simple
- Collision-free
- Predictable IDs
Random Tokens:
- Hard to guess
- Requires collision checks
Snowflake IDs:
- Highly scalable
- More complex implementation
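The collision check that random tokens require can be sketched as a retry loop. This is an illustrative sketch: the `existing` set stands in for the database, and the 7-character length (matching the example code aX91BcQ) and the retry limit are assumptions.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # 62 characters
CODE_LENGTH = 7                                  # assumed code length


def random_code(length: int = CODE_LENGTH) -> str:
    """Draw a token from a cryptographically secure random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def create_unique_code(existing: set, max_attempts: int = 5) -> str:
    """Retry until a free code is found; `existing` stands in for the DB."""
    for _ in range(max_attempts):
        code = random_code()
        if code not in existing:
            existing.add(code)
            return code
    raise RuntimeError("could not find a free short code")
```

In a real database the check and the insert would need to be atomic, for example by relying on a unique constraint on short_code and retrying when the insert is rejected.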
## 2. Redis Cache Layer
The redirect operation is the most frequently used operation in the system.
Since:
Reads >>> Writes
the system uses Redis cache to reduce database load and improve redirect latency.
Cache Lookup Flow:
1. User requests short URL
2. App Server checks Redis
3. If found → immediate redirect
4. If cache miss → query Database
5. Populate Redis cache
6. Return redirect response
This pattern is called:
Cache Aside Pattern
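A minimal in-memory sketch of the cache-aside lookup follows. The dictionary stands in for Redis, `load_from_db` is a hypothetical callable for the database query, and the TTL is an assumption that the flow above does not specify.

```python
import time


class CacheAside:
    def __init__(self, load_from_db, ttl_seconds=3600, clock=time.monotonic):
        self.load_from_db = load_from_db  # called only on a cache miss
        self.ttl = ttl_seconds            # assumed expiry, not from the design
        self.clock = clock
        self.entries = {}                 # short_code -> (long_url, expires_at)

    def get(self, short_code):
        entry = self.entries.get(short_code)
        if entry is not None and entry[1] > self.clock():
            return entry[0]                        # hit: no database access
        long_url = self.load_from_db(short_code)   # miss or expired: query DB
        if long_url is not None:
            # Populate the cache so the next request is served from memory.
            self.entries[short_code] = (long_url, self.clock() + self.ttl)
        return long_url
```

The application, not the cache, decides when to read from the database, which is what distinguishes cache-aside from a read-through cache. Unknown codes are not cached in this sketch, so repeated requests for a missing code reach the database each time.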
Why Redis?
Redis provides:
- in-memory lookups
- millisecond latency
- high throughput
- distributed caching support
Scaling:
Redis can scale using:
- Redis replication
- Redis clustering
- partitioned cache nodes
Tradeoffs:
Benefits:
- Very fast reads
- Reduced DB load
- Improved latency
Drawbacks:
- Additional infrastructure
- Cache invalidation complexity
- Memory cost
## 3. Database Design and Scaling
The database stores the permanent mapping:
short_code → original_url
Example schema fields:
- id
- short_code
- original_url
- created_at
Indexing:
Indexing is applied on:
short_code
because redirect lookups happen continuously.
This significantly improves lookup speed.
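The schema and index can be tried out with SQLite from the Python standard library. This is a sketch: the table name `urls`, the index name, and the column types are assumptions, and a production database would use its own dialect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE urls (
        id           INTEGER PRIMARY KEY AUTOINCREMENT,
        short_code   TEXT NOT NULL,
        original_url TEXT NOT NULL,
        created_at   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
    -- Unique index: speeds up redirect lookups and rejects duplicate codes.
    CREATE UNIQUE INDEX idx_urls_short_code ON urls (short_code);
    """
)
conn.execute(
    "INSERT INTO urls (short_code, original_url) VALUES (?, ?)",
    ("aX91BcQ", "https://example.com/very-long-url"),
)


def lookup(short_code):
    """Return the original URL for a short code, or None if it is unknown."""
    row = conn.execute(
        "SELECT original_url FROM urls WHERE short_code = ?", (short_code,)
    ).fetchone()
    return row[0] if row else None
```

Making the index unique also gives the database a way to reject a colliding short code at insert time.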
Replication:
The system uses:
Primary DB + Read Replicas
Writes:
- Go to Primary DB
Reads:
- Served from Replica DBs
Benefits:
- Better scalability
- High availability
- Reduced read load on primary DB
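Read/write routing across a primary and its replicas might be sketched as follows. Dictionaries stand in for the databases, `replicate` simulates asynchronous replication, and falling back to the primary when a replica has no row is an assumed way of handling lag rather than something the design prescribes.

```python
import itertools


class ReplicatedStore:
    """Routes writes to the primary and reads to replicas in round-robin order."""

    def __init__(self, primary: dict, replicas: list):
        self.primary = primary
        self.replicas = replicas          # expects at least one replica
        self._next_replica = itertools.cycle(replicas)

    def write(self, short_code, long_url):
        self.primary[short_code] = long_url

    def replicate(self):
        """Simulate asynchronous replication catching up with the primary."""
        for replica in self.replicas:
            replica.update(self.primary)

    def read(self, short_code):
        value = next(self._next_replica).get(short_code)
        if value is None:
            # Assumed fallback: the replica may lag, so retry on the primary.
            value = self.primary.get(short_code)
        return value
```

The fallback trades some extra load on the primary for avoiding a false "not found" on a URL that was created moments ago.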
Sharding / Partitioning:
As data grows massively, a single database can become a bottleneck.
The system can scale using:
- Horizontal partitioning
- Sharding
Example:
hash(short_code) % N
This distributes records across multiple database shards.
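A sketch of the shard selection formula, assuming N = 4 and SHA-256 as the hash function (both are illustrative choices). Python's built-in `hash()` is not used because its result for strings varies between processes by default, which would route the same code to different shards on different app servers.

```python
import hashlib

NUM_SHARDS = 4  # N in the formula; the value is an assumption


def shard_for(short_code: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a short code to a shard index with a process-independent hash."""
    digest = hashlib.sha256(short_code.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then take it modulo N.
    return int.from_bytes(digest[:8], "big") % num_shards
```

With plain modulo, changing N remaps most keys to a different shard. Consistent hashing is a common way to limit that movement when shards are added.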
Tradeoffs:
Replication:
- Improves reads and availability
- Replication lag possible
Sharding:
- Massive scalability
- Operational complexity
Indexing:
- Faster lookups
- Extra storage and write overhead
## Additional Tradeoffs Discussion
Multiple App Servers:
Advantages:
- Better scalability
- Fault tolerance
- High availability
Disadvantages:
- Higher infrastructure cost
- Deployment complexity
- Monitoring overhead
Redis Caching:
Advantages:
- Faster redirects
- Reduced DB load
Disadvantages:
- Cache consistency challenges
- Additional memory cost
Database Sharding:
Advantages:
- Supports huge scale
Disadvantages:
- Complex operations
- Harder debugging
## Conclusion
The proposed URL shortening system is a highly scalable, distributed architecture for read-heavy traffic, optimized for:
- Low latency redirects
- High availability
- Horizontal scalability
- Fast lookups
Core scalability techniques include:
- Load Balancing
- Redis caching
- Database replication
- Indexing
- Sharding
The system can initially start simple using:
Primary DB + Redis + App Servers
and later evolve incrementally toward a distributed, large-scale architecture as traffic grows.