1. Requirements

Functional Requirements

  • Generate short URL for a given long URL
  • Redirect short URL β†’ original URL
  • Optional:
    • Custom aliases
    • Expiration time
    • Analytics (click count)

Non-Functional Requirements

  • Latency: < 50ms redirect
  • Throughput: High read-heavy system (~100:1 read/write)
  • Scalability: Billions of URLs
  • Availability: Very high (redirect must not fail)
  • Consistency: Eventual consistency acceptable for analytics


2. Estimations

Traffic

  • Writes: ~10K/sec
  • Reads (redirects): ~1M/sec

Storage

Assume:

  • 100B URLs over time
  • Each record ~200 bytes

πŸ‘‰ Total β‰ˆ 20 TB β†’ requires distributed storage

Read/Write Pattern

  • Highly read-heavy
  • Requires aggressive caching


3. API Design

3.1 Create Short URL

POST /shorten

Request:

{ "long_url": "https://example.com/abc", "custom_alias": "optional", "expiry": "optional" }

Response:

{ "short_url": "https://sho.rt/xyz123" }

3.2 Redirect

GET /{short_code}

Response:

  • HTTP 302 β†’ redirect to long URL

3.3 Analytics (Optional)

GET /analytics/{short_code}



4. Data Storage & Design

4.1 Database (Primary Store)

Use:

  • Distributed KV store (e.g. Amazon DynamoDB / Cassandra)

Schema:

short_code (PK) long_url created_at expiry user_id

4.2 Cache (Critical)

Use:

  • Redis

Key:

short_code β†’ long_url




5. High-Level Architecture (HLD)

System divided into:

5.1 Data Plane (Redirect Flow)

  1. Client hits short URL
  2. Request goes to CDN
  3. CDN checks cache:
    • If hit β†’ redirect
    • If miss β†’ go to backend
  4. Backend:
    • Check Redis cache
    • If miss β†’ fetch from DB
  5. Return 302 redirect

5.2 Control Plane (Short URL Creation)

  1. Client sends long URL
  2. Service generates short code
  3. Store mapping in DB
  4. Cache in Redis






6. Detailed Breakdown

6.1 ID Generation (Make This Precise)

6.1 Short Code Generation (Robust Strategy)

Approach: Distributed Counter + Base62 Encoding

Step 1: Unique ID Generation

Use:

  • Centralized ID service OR
  • Distributed ID generator (e.g., Snowflake-like)

Properties:

  • Globally unique
  • Monotonically increasing
  • No collisions

Step 2: Base62 Encoding

ID β†’ Base62 β†’ short_code
  • Characters: a–z, A–Z, 0–9
  • Compact representation

Example:

123456789 β†’ "8M0kX"

Why This Works:

  • No collisions
  • Predictable length
  • Efficient storage

Optional Enhancement:

  • Add randomness to prevent predictability (security)

6.2 Read Optimization (Critical)

  • CDN caching (edge)
  • Redis caching (hot data)

πŸ‘‰ Most requests never hit DB

6.3 Write Path

  • Generate ID
  • Encode β†’ short code
  • Store in DB
  • Populate cache

6.4 Scaling Strategy

Stateless App Servers

  • No local state
  • Horizontally scalable

DB Scaling

  • Sharding by:
hash(short_code)

Cache Scaling

  • Redis cluster

6.5 Hot Key Handling (Critical for Viral Traffic)

Problem:

  • A single short URL (e.g., viral link) can receive millions of requests/sec
  • Can overload:
    • Redis
    • Backend
    • DB

Multi-Layer Solution:

1. CDN Caching (Primary Defense)

  • Popular URLs cached at edge
  • Majority of requests never reach backend

2. Redis Replication + Read Scaling

  • Use replicated Redis nodes
  • Hot keys served from replicas

3. Request Coalescing (Important)

  • Multiple simultaneous cache misses:
Only 1 request hits DB Others wait or reuse result

4. Local In-Memory Cache (Side Optimization)

  • App servers cache hottest keys temporarily

5. Hot Key Detection

Track:

  • Access frequency per key

If threshold exceeded:

  • Promote to β€œhot tier” cache

6. Dedicated Hot Key Tier (Advanced)

  • Store extremely popular URLs in:
    • Separate Redis cluster OR
    • In-memory distributed cache

πŸ‘‰ Ensures:

  • No single key overwhelms system
  • Scales even for viral traffic




7. Additional Considerations

7.1 Burst Traffic Handling

  • CDN absorbs sudden spikes
  • Redis handles high QPS
  • Backend protected from overload

7.2 Cache Miss Handling

  • Redis miss β†’ DB lookup
  • Populate cache after read

7.3 Degraded Mode

Case 1: Redis Down

  • Direct DB reads (higher latency)

Case 2: DB Down

  • Serve from cache if available

Case 3: Full Failure

  • Graceful degradation (temporary errors)

7.4 Expiration Handling

  • TTL on DB + Redis
  • Background cleanup jobs

7.5 Analytics

  • Async logging
  • Stream processing (Kafka)

7.6 Security

  • Prevent abuse:
    • Rate limiting
    • Spam detection

Failure Handling & Recovery

Case 1: Redis Failure

  • Fallback to DB reads
  • Increased latency but system continues

Mitigation:

  • Redis replication + failover

Case 2: Database Failure

  • Serve from:
    • CDN cache
    • Redis cache

If cache miss:

  • Return graceful error

Case 3: CDN Failure

  • Requests hit backend directly
  • Redis absorbs increased load

Case 4: App Server Failure

  • Stateless β†’ replaced automatically
  • Load balancer reroutes traffic

Case 5: Partial Network Failure

  • Retry with exponential backoff
  • Circuit breaker to prevent cascading failures

Recovery Strategy

  • Automatic failover (Redis / DB replicas)
  • Health checks + auto-scaling

Observability

Track:

  • Cache hit ratio
  • DB latency
  • Error rate

Tools:

  • Prometheus
  • Grafana


TTL & Data Lifecycle Management

Problem:

  • Billions of URLs β†’ unbounded growth

Solution:

1. Expiry at Creation

  • Optional TTL per URL

2. TTL in Storage

  • Redis:
    • Native TTL support
  • DB:
    • Store expiry timestamp

3. Background Cleanup Jobs

  • Periodic jobs delete expired entries
  • Prevents storage bloat

4. Cold Data Archival (Optional)

  • Move inactive URLs to cheaper storage

5. Cache Eviction Policy

  • Redis uses:
    • LRU / LFU eviction

πŸ‘‰ Ensures:

  • Controlled storage growth
  • Efficient memory usage



🏁 Final Summary

  • CDN + Redis caching ensures ultra-low latency
  • Base62 encoding ensures compact unique URLs
  • Distributed DB + sharding ensures scalability
  • Stateless services ensure horizontal scaling
  • Multi-layer caching handles extreme read load