Requirements


Functional Requirements:


  • Create a short URL for a given long URL.
  • Return the long URL associated with a given short URL.



Non-Functional Requirements:


  • High Availability: The system must be available 24/7 (e.g., 99.99%).


  • Low Latency: Redirects must occur in less than 100 ms.


  • Read-Heavy: The read-to-write ratio is approximately 100:1.
  • Horizontal Scalability: The system must support horizontal scaling to handle growing data volumes (billions of records) and traffic without compromising performance.


API Design

POST /api/v1/shorten — Create a link. Accepts long_url, returns short_url.


GET /{short_url_id} — Redirect. Returns a 302 Found status and a Location header.



High-Level Design

Entry Layer: We use an API Gateway and a Load Balancer to accept requests from clients, handle SSL certificates, and distribute traffic to the application servers.


Application Services:


Write Service: Creates short links, interacts with the ID Generator, and stores data in the database.


Read Service: Processes redirects, first checking Redis and then the DB.


Storage Layer: We use NoSQL (Cassandra/DynamoDB) for horizontal scaling and high-speed storage of billions of links.




Detailed Component Design

Key Generation: We use Base62 encoding. For a 7-character ID, we get 62^7 (about 3.5 trillion) unique combinations. Caching: Using Redis to cache “hot” links. Since 20% of links typically generate 80% of traffic, the cache will significantly reduce the load on the database.

ID Collision & Concurrency: To avoid collisions under high load, we use a Distributed ID Generator (such as Snowflake ID or the KGS—Key Generation Service), which reserves key ranges in server memory.


Fault Tolerance: If one key generator fails, the servers use pre-reserved ID blocks from memory, allowing the system to operate autonomously.


Cache Strategy: We use the LRU (Least Recently Used) policy to remove old keys. To protect against “Cache Stampede,” we set a random TTL (Time-to-Live) for each entry.


Rate Limiting: We implement the Token Bucket algorithm at the API Gateway level to limit the number of requests from a single IP address and protect the system from abuse.