1. Requirements

Functional Requirements

Generate short URL for a given long URL
Redirect short URL → original URL
Optional:
- Custom aliases
- Expiration time
- Analytics (click count)

Non-Functional Requirements

Latency: < 50ms redirect
Throughput: High read-heavy system (~100:1 read/write)
Scalability: Billions of URLs
Availability: Very high (redirect must not fail)
Consistency: Eventual consistency acceptable for analytics

2. Estimations

Traffic

Writes: ~10K/sec
Reads (redirects): ~1M/sec

Storage

Assume:

100B URLs over time
Each record ~200 bytes

👉 Total ≈ 20 TB → requires distributed storage

Read/Write Pattern

Highly read-heavy
Requires aggressive caching

3. API Design

3.1 Create Short URL

POST /shorten

Request:

{
"long_url": "https://example.com/abc",
"custom_alias": "optional",
"expiry": "optional"
}

Response:

{
"short_url": "https://sho.rt/xyz123"
}

3.2 Redirect

GET /{short_code}

Response:

HTTP 302 → redirect to long URL

3.3 Analytics (Optional)

GET /analytics/{short_code}

4. Data Storage & Design

4.1 Database (Primary Store)

Use:

Distributed KV store (e.g. Amazon DynamoDB / Cassandra)

Schema:

short_code (PK)
long_url
created_at
expiry
user_id

4.2 Cache (Critical)

Use:

Redis

Key:

short_code → long_url

5. High-Level Architecture (HLD)

The system consists of Entry Layer, Data Plane (read path), and Control Plane (write path).

5.1 Entry Layer (Critical for Read Traffic)

Requests first hit:
- CDN (edge caching)
- Then Load Balancer

Responsibilities:

Absorb massive read traffic
Serve cached redirects directly
Protect backend from spikes

5.2 Read Path (Redirect Flow)

Client requests short URL
CDN:
- If cache hit → return redirect
- Else → forward to Load Balancer
Load Balancer → routes to App Server
App Server:
- Check Redis cache
- If miss → fetch from DB
Return HTTP 302 redirect

5.3 Write Path (Short URL Creation)

Client sends long URL
App server:
- Generates unique ID
- Encodes to short code
Stores mapping in DB
Caches in Redis

6. Detailed Breakdown

6.1 Identifier Generation (Distributed & Collision-Free)

Requirements:

Globally unique
High throughput (~10K/sec writes)
No coordination bottleneck
Compact output

Approach: Distributed ID Generator + Base62

Step 1: Generate Unique ID (Snowflake-style)

Structure:

[ timestamp | machine_id | sequence ]

Example:

41 bits → timestamp
10 bits → machine ID
12 bits → sequence

👉 Ensures:

No collisions
Works across multiple servers
Scales horizontally

Step 2: Base62 Encoding

Convert ID → Base62:

ID → Base62 → short_code

Properties:

Compact (6–8 chars)
URL-safe
Deterministic

Why Not Simple Counter?

Single point bottleneck ❌
Not scalable ❌

Optional Enhancement:

Add random bits to prevent predictability

6.2 Read Optimization (Critical)

CDN caching (edge)
Redis caching (hot data)

👉 Most requests never hit DB

6.3 Write Path

Generate ID
Encode → short code
Store in DB
Populate cache

6.4 Scaling Strategy

Stateless App Servers

No local state
Horizontally scalable

DB Scaling

Sharding by:

hash(short_code)

Cache Scaling

Redis cluster

6.5 Hot Key Handling (Critical for Viral Traffic)

Problem:

A single short URL (e.g., viral link) can receive millions of requests/sec
Can overload:
- Redis
- Backend
- DB

Multi-Layer Solution:

1. CDN Caching (Primary Defense)

Popular URLs cached at edge
Majority of requests never reach backend

2. Redis Replication + Read Scaling

Use replicated Redis nodes
Hot keys served from replicas

3. Request Coalescing (Important)

Multiple simultaneous cache misses:

Only 1 request hits DB
Others wait or reuse result

4. Local In-Memory Cache (Side Optimization)

App servers cache hottest keys temporarily

5. Hot Key Detection

Track:

Access frequency per key

If threshold exceeded:

Promote to “hot tier” cache

6. Dedicated Hot Key Tier (Advanced)

Store extremely popular URLs in:
- Separate Redis cluster OR
- In-memory distributed cache

👉 Ensures:

No single key overwhelms system
Scales even for viral traffic

7. Additional Considerations

7.1 Burst Traffic Handling

CDN absorbs sudden spikes
Redis handles high QPS
Backend protected from overload

7.2 Cache Miss Handling

Redis miss → DB lookup
Populate cache after read

7.3 Degraded Mode

Case 1: Redis Down

Direct DB reads (higher latency)

Case 2: DB Down

Serve from cache if available

Case 3: Full Failure

Graceful degradation (temporary errors)

7.4 Expiration Handling

TTL on DB + Redis
Background cleanup jobs

7.5 Analytics

Async logging
Stream processing (Kafka)

7.6 Security

Prevent abuse:
- Rate limiting
- Spam detection

Failure Handling & Recovery

Case 1: Redis Failure

Fallback to DB reads
Increased latency but system continues

Mitigation:

Redis replication + failover

Case 2: Database Failure

Serve from:
- CDN cache
- Redis cache

If cache miss:

Return graceful error

Case 3: CDN Failure

Requests hit backend directly
Redis absorbs increased load

Case 4: App Server Failure

Stateless → replaced automatically
Load balancer reroutes traffic

Case 5: Partial Network Failure

Retry with exponential backoff
Circuit breaker to prevent cascading failures

Recovery Strategy

Automatic failover (Redis / DB replicas)
Health checks + auto-scaling

Observability

Track:

Cache hit ratio
DB latency
Error rate

Tools:

Prometheus
Grafana

TTL & Data Lifecycle Management

Problem:

Billions of URLs → unbounded growth

Solution:

1. Expiry at Creation

Optional TTL per URL

2. TTL in Storage

Redis:
- Native TTL support
DB:
- Store expiry timestamp

3. Background Cleanup Jobs

Periodic jobs delete expired entries
Prevents storage bloat

4. Cold Data Archival (Optional)

Move inactive URLs to cheaper storage

5. Cache Eviction Policy

Redis uses:
- LRU / LFU eviction

👉 Ensures:

Controlled storage growth
Efficient memory usage

10. Security & Abuse Prevention

10.1 Rate Limiting

Apply per-user/IP limits using:
- Token bucket / sliding window
Implement at:
- API Gateway layer

👉 Prevents:

Spam URL creation
Abuse attacks

10.2 Spam Detection

Techniques:

1. URL Blacklist

Maintain list of malicious domains

2. Heuristic Checks

Excessive URL creation rate
Suspicious patterns

3. ML-Based Detection (Optional)

Classify URLs as safe/malicious

10.3 Safe Redirects

Warn users before redirecting to:
- Suspicious links
Use interstitial page

10.4 Authentication & Authorization

Auth required for:
- URL creation
- Analytics access

10.5 HTTPS Enforcement

All short URLs served over HTTPS

🏁 Final Summary

CDN + Redis caching ensures ultra-low latency
Base62 encoding ensures compact unique URLs
Distributed DB + sharding ensures scalability
Stateless services ensure horizontal scaling
Multi-layer caching handles extreme read load