My Solution for Designing a Simple URL Shortening Service: A TinyURL Approach
by tidal_solstice279
Requirements
Tiny URL System Design
The following diagram illustrates the high-level architecture for a Tiny URL service, outlining the flow of a request from a client to the backend data stores.
flowchart TD
B["Client"]
n2["Load Balancer"]
n3["WAF"]
C{"Application Server"}
n4[("DynamoDB")]
n6[("Redis")]
B --> n2
n2 --> n3
n3 --> C
C --> n4
C --> n6
n4 --> n6
Architecture Flow Explanation:
- Client: The user's browser or application initiates a request (either to create a short URL or to resolve one).
- Load Balancer: The incoming traffic is distributed across multiple servers running the application logic. This ensures no single server becomes a bottleneck, providing scalability and high availability.
- Web Application Firewall (WAF): This layer filters out malicious traffic, such as SQL injection or DDoS attacks, before it reaches the application servers, enhancing security and reliability.
- Application Server: This is the core of the system. It handles the business logic:
- URL Creation: Generates a unique short code for a given long URL and stores the mapping in the database.
- URL Redirection: Receives a request for a short URL, looks up the corresponding long URL, and redirects the client.
- DynamoDB: The primary, persistent database. It stores the mappings between all short codes and their corresponding long URLs. Its managed nature offers high scalability and durability.
- Redis: Serves as a in-memory cache. Frequently accessed or recently resolved URL mappings are stored here to serve redirect requests with extremely low latency. The cache is populated from the database.
Functional Requirements:
- Create a short URL for a given long URL.
- Return the long URL associated with a given short URL.
Non-Functional Requirements:
The system is designed to meet the following key non-functional requirements:
- Low Latency: The most critical user-facing function, URL redirection, must be extremely fast. This is achieved through the use of an in-memory cache (Redis) to serve the vast majority of redirects, minimizing response times.
- High Availability & Reliability: The service must be operational 24/7. Strategies like using a Load Balancer for traffic distribution, multi-AZ deployments for databases, and redundant application servers ensure the system remains available even if individual components fail.
- Scalability: The system must handle a massive number of read (redirect) and write (URL creation) requests. The architecture is horizontally scalable:
- Read Scalability: Handled by adding more application servers behind the load balancer and using a distributed cache (Redis).
- Write Scalability: Handled by the inherently scalable nature of DynamoDB.
- Durability: Once a short URL is created, the mapping must not be lost. Using a persistent, reliable, and replicated database like DynamoDB ensures data is safely stored.
- Security: The WAF layer protects the application from common web exploits. Additionally, the system can implement measures to prevent link spam and abuse.
API Design
sequenceDiagram
participant C as Client
participant LB as Load Balancer
participant AS as Application Server
participant DD as DynamoDB
participant R as Redis
C->>LB: POST /api/v1/shorten
LB->>AS: Forward request
AS->>AS: Validate URL & generate short code
AS->>DD: Check if short code exists
DD-->>AS: Return existing record (or null)
alt Code already exists
AS-->>AS: Generate new code (retry)
else Code available
AS->>DD: Store URL mapping
DD-->>AS: Write confirmed
AS->>R: Cache URL mapping (optional)
R-->>AS: Cache stored
end
AS-->>LB: Return short URL response
LB-->>C: 201 Created
sequenceDiagram
participant C as Client
participant LB as Load Balancer
participant AS as Application Server
participant R as Redis
participant DD as DynamoDB
C->>LB: GET /abc123
LB->>AS: Forward redirect request
AS->>R: Get long URL for "abc123"
alt Found in Cache
R-->>AS: Return long URL
AS->>DD: Increment access counter (async)
else Not in Cache
R-->>AS: Cache miss
AS->>DD: Lookup long URL
DD-->>AS: Return long URL record
AS->>R: Cache the URL mapping
R-->>AS: Cache stored
end
alt URL Found
AS-->>LB: 301 Redirect with Location header
LB-->>C: 301 Moved Permanently
else URL Not Found
AS-->>LB: 404 Not Found
LB-->>C: 404 Error
end
sequenceDiagram
participant C as Client
participant LB as Load Balancer
participant AS as Application Server
participant DD as DynamoDB
participant R as Redis
C->>LB: GET /api/v1/analytics/abc123
LB->>AS: Forward analytics request
AS->>R: Check cached analytics
alt Cached analytics available
R-->>AS: Return cached data
else Not cached
R-->>AS: Cache miss
AS->>DD: Query URL metadata & click stats
DD-->>AS: Return analytics data
AS->>R: Cache analytics (5 min TTL)
R-->>AS: Cache stored
end
AS-->>LB: Return analytics response
LB-->>C: 200 OK with analytics data
sequenceDiagram
participant C as Client
participant LB as Load Balancer
participant AS as Application Server
participant DD as DynamoDB
participant R as Redis
C->>LB: DELETE /api/v1/urls/abc123
LB->>AS: Forward delete request
AS->>DD: Soft delete URL record
DD-->>AS: Delete confirmed
AS->>R: Invalidate cache entry
R-->>AS: Cache invalidated
AS-->>LB: 204 No Content
LB-->>C: 204 No Content
High-Level Design
1. Client Layer
- Web Browsers: Primary users accessing via browsers
- Mobile Applications: Native mobile apps using the API
- API Clients: Third-party integrations and developers
2. Traffic & Security Layer
- DNS Router: Routes traffic to nearest region (GeoDNS)
- CDN (CloudFront/Akamai): Caches static assets and frequently accessed redirects
- Load Balancer: Distributes traffic across application servers
- WAF (Web Application Firewall): Protects against DDoS, SQL injection, etc.
3. Application Layer (Microservices)
URL Service
- Handles short URL creation
- Validates URLs and generates unique codes
- Manages custom aliases and expiration
Redirect Service
- High-performance URL resolution
- Handles billions of redirect requests
- Minimal latency critical path
Analytics Service
- Tracks click statistics
- Generates reports and insights
- Handles data aggregation
Auth Service
- Manages user authentication/authorization
- API key management for developers
Async Processing
- Message Queue: Handles async tasks (analytics updates, cache warming)
- Worker Nodes: Process background jobs
4. Data Layer
Caching Layer
- Redis Cluster:
- URL mappings (short_code → long_url)
- Frequently accessed analytics
- Rate limiting data
- Session storage
Storage Layer
- DynamoDB:
url_mappingstable (short_code, long_url, user_id, created_at, expires_at, click_count)userstable (user_id, email, api_keys, quota)analyticstable (short_code, timestamp, referrer, country, device)
Detailed Component Design
## **1. Redirect Service - The Performance Critical Path**
### **How It Works**
```mermaid
flowchart TD
subgraph A [Request Processing]
R[Receive GET /short_code] --> V[Validate Short Code Format]
V --> RC[Redis Cache Lookup]
end
subgraph B [Cache Resolution Path]
RC --> CH{Cache Hit?}
CH -->|Yes| RH[Return 301 Redirect]
CH -->|No| DB[DynamoDB Lookup]
DB --> FC{Found in DB?}
FC -->|Yes| CU[Cache URL in Redis]
FC -->|No| NF[Return 404]
CU --> RH
end
subgraph C [Async Processing]
RH --> MQ[Send Analytics to Message Queue]
MQ --> W[Worker: Update Click Count]
W --> DBU[Update DynamoDB Counter]
end
```
**Core Algorithm:**
```python
def redirect_flow(short_code):
# Step 1: Validate input (62^6 possible codes = 56.8 billion)
if not is_valid_short_code(short_code):
return 400
# Step 2: Check cache (O(1) operation)
long_url = redis.get(f"url:{short_code}")
if long_url:
# Async update analytics
message_queue.send_analytics(short_code, metadata)
return 301 with Location: long_url
# Step 3: Cache miss - check database
url_mapping = dynamodb.get_item(
TableName='url_mappings',
Key={'short_code': short_code}
)
if not url_mapping or url_mapping.expired:
return 404
# Step 4: Cache warming with expiration
redis.setex(
f"url:{short_code}",
TTL=calculate_ttl(url_mapping),
value=url_mapping.long_url
)
# Step 5: Return redirect
message_queue.send_analytics(short_code, metadata)
return 301 with Location: url_mapping.long_url
```
### **Scaling Strategy**
```mermaid
graph TB
subgraph Scaling [Horizontal Scaling Approach]
LB[Load Balancer] --> RS1[Redirect Service 1]
LB --> RS2[Redirect Service 2]
LB --> RSN[Redirect Service N]
RS1 --> RC1[Redis Shard 1]
RS2 --> RC2[Redis Shard 2]
RSN --> RCN[Redis Shard N]
RC1 --> DB[DynamoDB Table]
RC2 --> DB
RCN --> DB
end
subgraph Metrics [Scaling Triggers]
CPU[CPU Utilization > 70%] --> AS[Auto Scaling Group]
LAT[P95 Latency > 50ms] --> AS
REQ[Requests/sec > 10K] --> AS
AS --> SCALE[Add Instances]
end
```
**Scaling Techniques:**
- **Stateless Services**: Redirect services are completely stateless
- **Consistent Hashing**: Redis cluster uses consistent hashing for data distribution
- **Connection Pooling**: Redis connection pools to handle high throughput
- **Read Replicas**: DynamoDB read replicas for cache miss scenarios
---
## **2. URL Generation Service - The Uniqueness Challenge**
### **How It Works**
```mermaid
flowchart TD
subgraph Generation [Short Code Generation Flow]
Input[Receive Long URL] --> Validate[Validate URL]
Validate --> Custom{Has Custom Alias?}
Custom -->|Yes| CheckCustom[Check Alias Availability]
CheckCustom --> Available{Alias Available?}
Available -->|No| ReturnError[Return 409 Conflict]
Custom -->|No| Generate[Generate Random Code]
Generate --> CheckDB[Check Code Uniqueness]
CheckDB --> Unique{Code Unique?}
Unique -->|No| Generate
Available -->|Yes| Store
Unique -->|Yes| Store
Store[Store in DynamoDB] --> Cache[Warm Cache]
Cache --> Return[Return Short URL]
end
```
### **Key Algorithms & Data Structures**
#### **Base62 Encoding for Short Codes**
```python
import string
BASE62 = string.digits + string.ascii_uppercase + string.ascii_lowercase
def base62_encode(num):
"""Convert integer to base62 string"""
if num == 0:
return BASE62[0]
result = []
while num:
num, rem = divmod(num, 62)
result.append(BASE62[rem])
return ''.join(reversed(result))
def generate_short_code():
"""Generate random 6-character short code"""
# Using 56.8 billion possible combinations (62^6)
import secrets
return ''.join(secrets.choice(BASE62) for _ in range(6))
```
#### **Distributed ID Generation**
```python
class SnowflakeIDGenerator:
def __init__(self, datacenter_id, worker_id):
self.datacenter_id = datacenter_id
self.worker_id = worker_id
self.sequence = 0
self.last_timestamp = -1
def generate_id(self):
timestamp = self.current_timestamp()
if timestamp == self.last_timestamp:
self.sequence = (self.sequence + 1) & 0xFFF # 12 bits
if self.sequence == 0:
# Wait for next millisecond
timestamp = self.wait_next_millis(self.last_timestamp)
else:
self.sequence = 0
self.last_timestamp = timestamp
# 64-bit ID: timestamp(41) + datacenter(5) + worker(5) + sequence(12)
return ((timestamp << 22) |
(self.datacenter_id << 17) |
(self.worker_id << 12) |
self.sequence)
```
### **Scaling Strategy**
```mermaid
graph LR
subgraph GenerationScaling [URL Generation Scaling]
LB[Load Balancer] --> G1[Generator 1]
LB --> G2[Generator 2]
LB --> G3[Generator N]
G1 --> DB1[(DynamoDB Shard 1)]
G2 --> DB2[(DynamoDB Shard 2)]
G3 --> DBN[(DynamoDB Shard N)]
G1 --> ID1[Snowflake ID Gen]
G2 --> ID2[Snowflake ID Gen]
G3 --> IDN[Snowflake ID Gen]
end
subgraph Partitioning [Database Partitioning]
ShortCode[short_code Partition Key] --> Sharding[Consistent Hashing]
Sharding --> P1[Partition A: a-m]
Sharding --> P2[Partition B: n-z]
Sharding --> P3[Partition C: 0-9]
end
```
**Collision Resolution Strategy:**
```python
def create_short_url(long_url, custom_alias=None, max_retries=3):
if custom_alias:
# For custom aliases, check existence first
if url_exists(custom_alias):
raise ConflictError("Custom alias already exists")
short_code = custom_alias
else:
# For auto-generated codes, retry on collision
for attempt in range(max_retries):
short_code = generate_short_code()
if not url_exists(short_code):
break
else:
# Increase length after max retries
short_code = generate_short_code(length=7)
return store_url_mapping(short_code, long_url)
```
---
## **3. Caching Layer - The Performance Multiplier**
### **Architecture & Data Structures**
```mermaid
graph TB
subgraph RedisArch [Redis Cluster Architecture]
Client[Application] --> Proxy[Redis Cluster Proxy]
Proxy --> Shard1[Shard 1: Slots 0-5460]
Proxy --> Shard2[Shard 2: Slots 5461-10922]
Proxy --> Shard3[Shard 3: Slots 10923-16383]
Shard1 --> Replica1[Replica 1-1]
Shard1 --> Replica2[Replica 1-2]
Shard2 --> Replica3[Replica 2-1]
Shard2 --> Replica4[Replica 2-2]
end
subgraph DataStructures [Redis Data Structures]
subgraph StringTypes [String Operations]
S1[SET url:abc123 "long_url"]
S2[SETEX url:abc123 3600 "long_url"]
S3[GET url:abc123]
end
subgraph HashTypes [Hash Operations]
H1[HSET analytics:abc123 clicks 1500 last_access timestamp]
H2[HINCRBY analytics:abc123 clicks 1]
H3[HGETALL analytics:abc123]
end
subgraph SortedSets [Sorted Sets for Ranking]
Z1[ZADD popular_urls 1500 "abc123"]
Z2[ZREVRANGE popular_urls 0 99 WITHSCORES]
Z3[ZINCRBY popular_urls 1 "abc123"]
end
end
```
### **Cache Strategies & Algorithms**
#### **Memory Management with LRU**
```python
# Redis configuration for LRU eviction
REDIS_CONFIG = {
'maxmemory-policy': 'allkeys-lru',
'maxmemory': '16gb',
'timeout': 300,
'lru_clock_resolution': 10
}
class AdaptiveTTL:
def calculate_ttl(self, url_mapping):
base_ttl = 3600 # 1 hour
# Popular URLs get longer TTL
if url_mapping.click_count > 1000:
return 86400 # 24 hours
# Recently created URLs get medium TTL
if is_recent(url_mapping.created_at):
return 7200 # 2 hours
return base_ttl
```
#### **Cache Warming Strategy**
```python
def predictive_cache_warming():
"""Warm cache with likely-to-be-accessed URLs"""
# 1. Popular URLs from analytics
popular_urls = get_top_urls(limit=1000)
for url in popular_urls:
warm_cache(url.short_code, url.long_url)
# 2. Recently created URLs
recent_urls = get_recent_urls(hours=24, limit=500)
for url in recent_urls:
warm_cache(url.short_code, url.long_url)
# 3. User's frequently used URLs (if authenticated)
user_popular = get_user_popular_urls(user_id, limit=100)
for url in user_popular:
warm_cache(url.short_code, url.long_url)
```
### **Scaling the Cache Layer**
```mermaid
graph TB
subgraph CacheScaling [Redis Cluster Scaling]
subgraph Current [Current Cluster]
C1[Shard 1] --> R1[Replica]
C2[Shard 2] --> R2[Replica]
C3[Shard 3] --> R3[Replica]
end
subgraph Scaled [Scaled Cluster]
S1[Shard 1] --> SR1[Replica]
S2[Shard 2] --> SR2[Replica]
S3[Shard 3] --> SR3[Replica]
S4[New Shard 4] --> SR4[New Replica]
S5[New Shard 5] --> SR5[New Replica]
end
Current -->|Resharding| Scaled
end
subgraph Monitoring [Cache Performance Monitoring]
HitRate[Hit Rate < 95%] --> ScaleUp[Increase Cache Size]
Memory[Memory > 85%] --> ScaleOut[Add Shards]
Latency[P99 > 5ms] --> Optimize[Optimize Commands]
end
```
**Performance Optimizations:**
- **Pipeline Operations**: Batch Redis commands
- **Lua Scripting**: Atomic operations on server side
- **Connection Pooling**: Reuse connections
- **Local Cache Layer**: L1 cache in application memory for hottest items