Designing A Simple Url Shortening Service A TinyURL Approach - System Design

Functional Requirements
URL Shortening: Generate a unique short alias for a given long URL.
Redirection: Redirect users to the original long URL when they access the short URL.
Custom Aliases: Allow users to provide a custom short ID (optional).
Expiration: URLs should expire after a default or user-defined time.
Analytics: Track click metrics (count, location, browser).
Non-Functional Requirements
High Availability: The system should be 99.99% available.
Low Latency: Redirection must be sub-100ms.
Scalability: Support 100M new URLs per month.
Read-Heavy: Optimize for a 100:1 read-to-write ratio.

API Design

// 1. Create Short URL POST /api/v1/shorten Request: { "long_url": "string", "custom_alias": "string?", "expire_at": "timestamp?" } Response: { "short_url": "https://tiny.url/abc123" } // 2. Redirect GET /{short_id} Response: 302 Found (Location: long_url) // 3. Analytics GET /api/v1/analytics/{short_id}
Response: { "click_count": 1500, "last_accessed": "timestamp" }

High-Level Design

CDN (CloudFront/Cloudflare): Caches redirection mapping at the edge to reduce origin latency.
Load Balancer (Layer 7): Distributes traffic to stateless API servers.
API Servers: Handle the logic for shortening and redirection.
Distributed ID Generator (Snowflake): Generates globally unique 64-bit IDs to avoid collisions.
Cache (Redis): Uses LRU policy to store the most active URL mappings.
Database (DynamoDB/Cassandra): A distributed NoSQL DB sharded by short_id for horizontal scaling.
Analytics Worker: Asynchronously processes click logs using a Message Queue (Kafka) to avoid blocking the main request path.

Detailed Component Design
1. Distributed ID Generation & Base62
To avoid a Single Point of Failure, we use a Ticket Server or Snowflake ID approach. The resulting 64-bit integer is converted to Base62 to produce a 7-character string. This allows for 627≈3.5 trillion unique IDs.
2. Database Sharding & Partitioning
To handle billions of rows, we shard the database by short_id. A consistent hashing algorithm ensures that a specific short_id always maps to the same database node, ensuring O(1) lookup time and preventing "scatter-gather" queries.
3. Caching Strategy & 80/20 Rule
Following the Pareto Principle, 20% of URLs generate 80% of traffic. We use a Redis Cluster with a Write-around cache strategy:
Read: Check Redis → if miss, check DB → update Redis.
Eviction: Least Recently Used (LRU) to keep only hot data in memory.
4. Handling Expiration
Instead of actively scanning the entire database, we use a Lazy Deletion strategy combined with a scheduled batch job. When a user accesses an expired link, we delete it and return 404. A low-priority background process cleans up remaining expired entries during off-peak hours.

Functional Requirements
URL Shortening: Generate a unique short alias for a given long URL.
Redirection: Redirect users to the original long URL when they access the short URL.
Custom Aliases: Allow users to provide a custom short ID (optional).
Expiration: URLs should expire after a default or user-defined time.
Analytics: Track click metrics (count, location, browser).
Non-Functional Requirements
High Availability: The system should be 99.99% available.
Low Latency: Redirection must be sub-100ms.
Scalability: Support 100M new URLs per month.
Read-Heavy: Optimize for a 100:1 read-to-write ratio.

API Design

// 1. Create Short URL POST /api/v1/shorten Request: { "long_url": "string", "custom_alias": "string?", "expire_at": "timestamp?" } Response: { "short_url": "https://tiny.url/abc123" } // 2. Redirect GET /{short_id} Response: 302 Found (Location: long_url) // 3. Analytics GET /api/v1/analytics/{short_id}
Response: { "click_count": 1500, "last_accessed": "timestamp" }

High-Level Design

CDN (CloudFront/Cloudflare): Caches redirection mapping at the edge to reduce origin latency.
Load Balancer (Layer 7): Distributes traffic to stateless API servers.
API Servers: Handle the logic for shortening and redirection.
Distributed ID Generator (Snowflake): Generates globally unique 64-bit IDs to avoid collisions.
Cache (Redis): Uses LRU policy to store the most active URL mappings.
Database (DynamoDB/Cassandra): A distributed NoSQL DB sharded by short_id for horizontal scaling.
Analytics Worker: Asynchronously processes click logs using a Message Queue (Kafka) to avoid blocking the main request path.

Detailed Component Design
1. Distributed ID Generation & Base62
To avoid a Single Point of Failure, we use a Ticket Server or Snowflake ID approach. The resulting 64-bit integer is converted to Base62 to produce a 7-character string. This allows for 627≈3.5 trillion unique IDs.
2. Database Sharding & Partitioning
To handle billions of rows, we shard the database by short_id. A consistent hashing algorithm ensures that a specific short_id always maps to the same database node, ensuring O(1) lookup time and preventing "scatter-gather" queries.
3. Caching Strategy & 80/20 Rule
Following the Pareto Principle, 20% of URLs generate 80% of traffic. We use a Redis Cluster with a Write-around cache strategy:
Read: Check Redis → if miss, check DB → update Redis.
Eviction: Least Recently Used (LRU) to keep only hot data in memory.
4. Handling Expiration
Instead of actively scanning the entire database, we use a Lazy Deletion strategy combined with a scheduled batch job. When a user accesses an expired link, we delete it and return 404. A low-priority background process cleans up remaining expired entries during off-peak hours.