Requirements


Functional Requirements:


  • Create a short URL for a given long URL.
  • Return the long URL associated with a given short URL.



Non-Functional Requirements:


  1. High Availability: The system should be operational and accessible at all times, ensuring that users can shorten and retrieve URLs without downtime.
  2. Low Redirect Latency: Users expect quick redirects when they click on a short URL, so minimizing latency is crucial for a good user experience.
  3. Horizontal Scalability: As your service grows, it should be able to handle increased load by adding more servers rather than upgrading existing ones.



API Design

Post request to /short-url Body:{url:"longurl1"}

returns:{shorturl}

So this would first validate the URL if it is correct or not so if correct then would proceed to back end then the bacon would have shot in url functionality then it was saving the database and return the URL.



Get: /short-url will fetch all the url which user has it and will alow params based and paginated



Post: /long-url Body:{url:shorturl} will check the db if this short url exist in our place and if found return the orignal url


High-Level Design



1. Entry Layer (API / Edge Tier)

When user hits the system:

For shortening:

POST /shorten

Responsibilities:

  • ✅ URL validation (is it valid format?)
  • ✅ Authentication (optional)
  • ✅ Rate limiting (prevent abuse, e.g. 5 req/sec)
  • ✅ Forward request to backend service


2. Write Path (URL Shortening Flow)

Step-by-step:

  1. User sends long URL
  2. API validates & rate limits
  3. Request goes to URL Shortener Service

Inside URL Shortener Service:

Step 1: Check cache (optional optimization)

  • If already exists → return existing short URL

Step 2: Generate short code

Options:

  • Base62(counter)
  • Random string
  • Hash

Example:

https://example.com/abc → "xYz123"

Step 3: Store in DB

shortCode → longURL

Step 4: Update cache

cache[xYz123] = longURL

Step 5: Return response

https://short.ly/xYz123

3. Read Path

When user clicks short URL:

GET /xYz123

Flow:

Step 1: Check cache (FAST path 🚀)

if exists → return immediately

Step 2: If cache miss → query DB

xYz123 → longURL

Step 3: Update cache

Step 4: Redirect user

HTTP 302 → original URL

short_code (PRIMARY KEY) long_url created_at expiry (optional)

Key points:

  • Indexed on short_code
  • Optimized for fast reads


  • Request always hits API
  • Cache is checked inside system, not bypassing API

Correct understanding:

User → API → Cache → DB (if needed)

Let’s make it solid:

  • Apply at API layer
  • Example:
    • 5 requests/sec per user
  • Prevents:
    • Spam
    • System overload

ComponentResponsibility
API LayerValidation, auth, rate limiting
Write ServiceGenerate short URL, store
Read ServiceHandle redirects
Cache (Redis)Fast lookup
DBPersistent storage
  • “For identifier generation, I would use a distributed ID generator like Snowflake to ensure globally unique, collision-free IDs without coordination. These IDs are Base62 encoded to generate short URLs.”
  • “For caching, I would use Redis with a cache-aside pattern and write-through on creation. LRU eviction ensures hot URLs stay cached, and TTL is used for expiring links.”
  • “Consistency is handled via write-through updates and lazy cache population on misses, giving us eventual consistency with high performance.”



Here is a complete, production-ready system design for a TinyURL-like URL shortener, with all critical details included end-to-end.

1. High-Level Architecture

User → Load Balancer → App Servers Write Path: User → API → Identifier Service → Storage → Cache → Response Read Path: User → API → Cache → DB → Redirect

2. Core Components Overview

  • Identifier Generation Service → generates unique short codes
  • Storage Layer → stores URL mappings
  • Redirection Service → handles high-volume reads
  • Cache Layer → reduces latency and DB load
  • Rate Limiter → protects system from abuse

3. Identifier Generation Service

Responsibility

Generate globally unique, scalable, collision-safe IDs

Design: Distributed Snowflake + Base62

ID Structure (64-bit)

| timestamp (41) | machine_id (10) | sequence (12) |

Flow

  1. Request hits app server
  2. Generate ID locally (no DB dependency)
  3. Convert ID → Base62
  4. Return short code

Algorithms & Complexity

  • Bit manipulation → O(1)
  • Base62 encoding → O(log₆₂ N)

Scaling

  • Each server has unique machine_id
  • No coordination required
  • Horizontal scaling is linear

Throughput

  • Per node: 4096 IDs/ms
  • 100 nodes → millions/sec

Collision & Failure Handling

1. DB-Level Guarantee

UNIQUE(short_code)

2. Retry Logic

Generate → Insert Conflict → regenerate → retry (max 2–3 times)

3. Clock Drift Handling

  • If clock goes backward:
    • Pause generation OR
    • Use last_timestamp + 1

4. Sequence Overflow

  • If >4096 IDs/ms:
    • Wait for next millisecond

Tradeoffs

Pros

  • No DB bottleneck
  • Highly scalable
  • Practically collision-free

Cons

  • Slight predictability
  • Requires machine_id coordination

4. Redirection Service (Read Path)

Responsibility

Handle:

GET /abc123 → Redirect to original URL

Flow

  1. Load balancer routes request
  2. Check cache
  3. Cache HIT → return immediately
  4. Cache MISS → query DB → populate cache → return

Performance Goal

  • Latency < 50ms
  • 90%+ requests served from cache

Data Structures

Cache

  • Key: short_code
  • Value: long_url

DB Index

  • Primary index on short_code

Multi-Layer Scaling

1. CDN (optional)

  • Edge caching for global users

2. Cache Layer

Use Redis

  • O(1) lookup
  • Handles majority traffic

3. DB Layer

  • Only for cache misses

Tradeoffs

Pros

  • Extremely fast reads
  • Scales horizontally

Cons

  • Cache misses hit DB
  • Memory cost

5. Storage Layer

Responsibility

Store mapping:

short_code → long_url

Schema

Table: url_mapping short_code (PK) long_url created_at expiry_time user_id (optional)

Indexing

  • Primary: short_code
  • Secondary: user_id (optional)

Data Structures

  • SQL → B-tree
  • NoSQL → hash-based

Scaling

1. Sharding

hash(short_code) % N
  • Even distribution
  • Avoid hotspotting

2. Replication

  • Primary → writes
  • Replicas → reads

Capacity Estimation

  • 100M URLs
  • Avg 200B per URL
100M × 200B = 20GB index overhead → ~40–50GB

Tradeoffs

OptionProsCons
SQLStrong consistencyHarder to scale
NoSQLScalableEventual consistency

Failure Handling

  • Replication failover
  • Backup + restore
  • Retry on transient failures

6. Caching Strategy (Detailed)

Cache Type

Use Redis

Cache Pattern

Hybrid Approach

  • Write-through (on create/update)
  • Cache-aside (on read)

TTL Strategy

1. Expiring URLs

TTL = expiry_time − current_time

2. Permanent URLs

  • Default TTL: 24 hours
  • Sliding TTL:
    • Reset TTL on each access

3. Hot URLs

  • TTL: 1–6 hours
  • Frequently refreshed by traffic

Eviction Policy

  • LRU (Least Recently Used)

Preventing Cache Bloat

1. Memory Limit

  • Example: 10–20 GB cap

2. Admission Control

  • Cache only after 2–3 hits
  • Avoid caching one-time URLs

3. Negative Caching

short_code → NULL TTL = 5 minutes

Cache Invalidation

On Update/Delete

DB update → Cache delete/update

Event-Based Invalidation

  • Publish event:
URL_UPDATED / URL_DELETED
  • Consumers remove from cache

TTL as Backup

  • Ensures stale data eventually removed

Consistency Model

  • Write-through ensures freshness
  • Cache-aside ensures performance
  • Eventual consistency acceptable

7. Rate Limiting

Where Applied

  • URL creation API → strict limits
  • Redirect API → relaxed limits (DDoS protection)

Algorithm: Token Bucket

Configuration Example

Capacity: 100 requests Refill: 10 req/sec

Behavior

  • Allows bursts up to 100
  • Then throttles to 10/sec

Implementation

Use Redis

Key

rate_limit:{user_id}

Store

  • Tokens
  • Last refill timestamp

Burst Handling

  • Within limit → allow
  • Exceeded → reject

Response

429 Too Many Requests Retry-After: 10

Client Backoff Strategy

Exponential backoff:

1s → 2s → 4s → 8s → capped

Advanced Protection

  • Per IP limit
  • Per user limit
  • Global rate limit

Tradeoffs

ApproachProsCons
Token BucketHandles bursts wellSlight complexity
Fixed WindowSimpleBurst spikes
Sliding WindowAccurateExpensive

8. End-to-End Flow

Write Path

User → API → Generate ID → Base62 encode → Store in DB → Update cache → Return short URL

Read Path

User → API → Cache lookup → HIT → Redirect → MISS → DB → Cache → Redirect

9. Key Design Characteristics

  • O(1) ID generation
  • O(1) cache lookup
  • Horizontally scalable
  • Read-optimized architecture
  • Controlled memory usage via TTL + LRU
  • Strong uniqueness guarantees
  • Resilient to failures and high concurrency