Designing a Simple URL Shortening Service: A TinyURL Approach - System Design

Requirements

Functional Requirements:

Create a short URL for a given long URL.
Return the long URL associated with a given short URL.

Non-Functional Requirements:

High Availability: The system must be available around the clock, for example with a 99.99% uptime target (roughly 52 minutes of downtime per year).

Low Latency: Redirects must occur in less than 100 ms.

Read-Heavy: The read-to-write ratio is approximately 100:1.
Horizontal Scalability: The system must support horizontal scaling to handle growing data volumes (billions of records) and traffic without compromising performance.

API Design

POST /api/v1/shorten: Create a link. Accepts long_url in the request body and returns short_url.

GET /{short_url_id}: Redirect. Returns a 302 Found status with the long URL in the Location header.
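
The two endpoints can be sketched as plain functions over an in-memory store. This is only an illustration of the request/response contract, not a real server: the `short.example` domain, the placeholder ID scheme, and the 201, 400 and 404 status codes are assumptions, since the design above specifies only the 302 redirect.

```python
from urllib.parse import urlparse

# Hypothetical in-memory stand-in for the link store.
_links = {}
BASE_URL = "https://short.example"  # assumed domain, not part of the design


def shorten(long_url: str):
    """Sketch of POST /api/v1/shorten; returns (status, body)."""
    parts = urlparse(long_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        # Assumed validation rule and status code.
        return 400, {"error": "long_url must be an absolute http(s) URL"}
    # Placeholder ID; in the full design this comes from the ID generator.
    short_id = f"id{len(_links) + 1}"
    _links[short_id] = long_url
    return 201, {"short_url": f"{BASE_URL}/{short_id}"}


def redirect(short_url_id: str):
    """Sketch of GET /{short_url_id}; returns (status, headers)."""
    long_url = _links.get(short_url_id)
    if long_url is None:
        return 404, {}  # assumed behaviour for unknown IDs
    return 302, {"Location": long_url}
```

A client that follows redirects would receive the 302, read the Location header, and issue a second request to the long URL.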

High-Level Design

Entry Layer: We use an API Gateway and a Load Balancer to accept requests from clients, terminate TLS (SSL), and distribute traffic to the application servers.

Application Services:

Write Service: Creates short links, interacts with the ID Generator, and stores data in the database.

Read Service: Processes redirects, first checking the Redis cache and falling back to the database on a miss.
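
The read path is a cache-aside lookup. A minimal sketch, assuming plain dictionaries as stand-ins for Redis and the database (the `ReadService` class and its `db_reads` counter are hypothetical names used only for illustration):

```python
from typing import Optional


class ReadService:
    """Cache-aside lookup: try the cache first, then the database."""

    def __init__(self, cache: dict, db: dict):
        self.cache = cache  # stand-in for Redis
        self.db = db        # stand-in for the NoSQL store
        self.db_reads = 0   # counts how often the database is consulted

    def resolve(self, short_id: str) -> Optional[str]:
        long_url = self.cache.get(short_id)
        if long_url is not None:
            return long_url  # cache hit: the database is not touched
        self.db_reads += 1
        long_url = self.db.get(short_id)
        if long_url is not None:
            # Populate the cache so the next read is served from memory.
            self.cache[short_id] = long_url
        return long_url
```

With a 100:1 read-to-write ratio, most calls to `resolve` for popular links should take the first branch and never reach the database.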

Storage Layer: We use a NoSQL store such as Cassandra or DynamoDB for horizontal scaling and fast key-value lookups across billions of links.

Detailed Component Design

Key Generation: We use Base62 encoding (digits plus lowercase and uppercase letters). For a 7-character ID, we get 62^7 = 3,521,614,606,208 (about 3.5 trillion) unique combinations.

Caching: We use Redis to cache “hot” links. Traffic is usually skewed, with roughly 20% of links generating 80% of requests, so the cache significantly reduces the load on the database.
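
To make the Base62 key space concrete, here is a minimal sketch that turns a numeric ID into a fixed-width 7-character string and back. The ordering of the alphabet (digits, then lowercase, then uppercase) and the zero-padding to a fixed width are assumptions; any fixed ordering of the 62 symbols works the same way.

```python
import string

# 10 digits + 26 lowercase + 26 uppercase = 62 symbols (assumed ordering).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode(n: int, width: int = 7) -> str:
    """Encode a non-negative integer as a fixed-width Base62 string."""
    if not 0 <= n < BASE ** width:
        raise ValueError("n does not fit in the requested width")
    chars = []
    for _ in range(width):
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))  # most significant symbol first


def decode(s: str) -> int:
    """Inverse of encode."""
    n = 0
    for ch in s:
        n = n * BASE + ALPHABET.index(ch)
    return n


print(BASE ** 7)       # 3521614606208
print(encode(62))      # 0000010
```

Because the encoding is a bijection on the range 0 to 62^7 - 1, two distinct numeric IDs can never produce the same short string.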

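ID Collision & Concurrency: To avoid collisions under high load, we use a distributed ID generator, such as Snowflake-style IDs or a Key Generation Service (KGS). A KGS reserves a distinct key range for each application server and the server holds it in memory, so servers hand out IDs without coordinating on every request.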

Fault Tolerance: If a key generator fails, the application servers keep issuing IDs from the blocks they have already reserved in memory, so the system continues to operate autonomously until those blocks run out.
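
The range reservation and its failure behaviour can be sketched together. This is a single-process illustration under stated assumptions: `RangeAllocator` plays the role of the KGS, `WriteServer` is an application server, and the `available` flag is a hypothetical switch used only to simulate an outage. A real KGS would persist its counter and be reached over the network.

```python
import threading


class RangeAllocator:
    """Hypothetical KGS: hands out disjoint blocks of numeric IDs."""

    def __init__(self, block_size: int = 1000):
        self.block_size = block_size
        self._next_start = 0
        self._lock = threading.Lock()
        self.available = True  # set to False to simulate an outage

    def reserve(self) -> range:
        if not self.available:
            raise ConnectionError("key generation service unavailable")
        with self._lock:
            start = self._next_start
            self._next_start += self.block_size
        return range(start, start + self.block_size)


class WriteServer:
    """Application server that issues IDs from a block held in memory."""

    def __init__(self, allocator: RangeAllocator):
        self._allocator = allocator
        self._lock = threading.Lock()
        self._block = iter(allocator.reserve())

    def next_id(self) -> int:
        with self._lock:
            try:
                return next(self._block)
            except StopIteration:
                # Local block exhausted: only now is the allocator needed.
                self._block = iter(self._allocator.reserve())
                return next(self._block)
```

Since blocks never overlap, two servers cannot issue the same ID, and a server only contacts the allocator once per block. The block size sets the trade-off: larger blocks survive longer outages but waste more IDs when a server restarts and discards its unused range.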

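Cache Strategy: We use the LRU (Least Recently Used) policy to evict the entries that have gone longest without being accessed when the cache is full. To protect against a “Cache Stampede,” we add random jitter to the TTL (Time-to-Live) of each entry, so that many entries do not expire and fall through to the database at the same moment.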
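
Both ideas, LRU eviction and a randomized TTL, fit in a small in-process sketch. In the actual design Redis would perform the eviction and expiry itself; the `LruTtlCache` class, the one-hour base TTL, and the 10% jitter below are assumptions chosen only to show the mechanics.

```python
import random
import time
from collections import OrderedDict


class LruTtlCache:
    """LRU cache whose entries expire after a jittered TTL."""

    def __init__(self, capacity, base_ttl=3600.0, jitter=0.1, clock=time.monotonic):
        self.capacity = capacity
        self.base_ttl = base_ttl  # seconds (assumed default)
        self.jitter = jitter      # fraction of base_ttl, e.g. 0.1 = +/-10%
        self.clock = clock        # injectable for testing
        self._data = OrderedDict()  # key -> (value, expires_at)

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self.clock() >= expires_at:
            del self._data[key]  # expired: treat as a miss
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        # Spread expirations so entries written together do not expire together.
        ttl = self.base_ttl * (1 + random.uniform(-self.jitter, self.jitter))
        self._data[key] = (value, self.clock() + ttl)
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used
```

With a base TTL of 100 seconds and 10% jitter, each entry expires somewhere between 90 and 110 seconds after it was written, which spreads the resulting database reads over a 20-second window instead of a single instant.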

Rate Limiting: We implement the Token Bucket algorithm at the API Gateway level to limit the number of requests from a single IP address and protect the system from abuse.
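
A minimal sketch of a per-IP Token Bucket follows. The refill rate and bucket capacity are illustrative assumptions, and the `TokenBucket` and `IpRateLimiter` names are hypothetical. A production gateway would also need to expire idle buckets and share state across gateway instances, which this sketch omits.

```python
import time


class TokenBucket:
    """Bucket that refills at `rate` tokens per second up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable for testing
        self.tokens = float(capacity)  # start full to allow an initial burst
        self.updated = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        # Add the tokens earned since the last call, capped at capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class IpRateLimiter:
    """Keeps one independent bucket per client IP address."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self._args = (rate, capacity, clock)
        self._buckets = {}

    def allow(self, ip):
        bucket = self._buckets.get(ip)
        if bucket is None:
            bucket = self._buckets[ip] = TokenBucket(*self._args)
        return bucket.allow()
```

The capacity bounds the burst a single client can send, while the rate bounds its sustained throughput. A request for which `allow` returns False would typically be rejected with a 429 Too Many Requests response.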