Designing a Simple URL Shortening Service: A TinyURL Approach - System Design

Requirements

Functional Requirements:

Create a short URL for a given long URL.
Return the long URL associated with a given short URL.

Non-Functional Requirements:

Horizontal scalability.
Reliability.
High availability.
Low latency, especially on redirects.

API Design

We will have an API gateway that routes each endpoint to the service behind it.

We will have the following endpoints:

/shortUrl -> Points to the shorten service, which creates the short URL. It takes the long URL as the payload and returns a 200 status code on success and 403 on failure (for a malformed URL, 400 Bad Request would be the more conventional code).

/redirect -> Redirects the short URL to the original URL. Points to the redirection service, which performs the redirect. Returns 301 for the redirection.

/genID -> Generates an ID for a URL. Points to the ID generation service. Returns 200 on success and 403 on failure.
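
The three endpoints above can be sketched as plain Python functions that return a status code and a body. This is only an illustration under stated assumptions: the function names, the `url` and `short_id` field names, the in-memory dictionary, and the 404 for an unknown short URL are all hypothetical and not part of the design above. The 200, 403 and 301 codes follow the endpoint descriptions.

```python
import uuid
from typing import Dict, Tuple
from urllib.parse import urlparse

# Hypothetical in-memory store standing in for the real database.
_urls: Dict[str, str] = {}


def gen_id() -> Tuple[int, dict]:
    """/genID: return a fresh identifier (a random UUID in hex form)."""
    return 200, {"id": uuid.uuid4().hex}


def short_url(payload: dict) -> Tuple[int, dict]:
    """/shortUrl: validate the long URL and store it under a new ID."""
    long_url = payload.get("url", "")
    parsed = urlparse(long_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return 403, {"error": "invalid url"}  # failure code used in this design
    status, body = gen_id()
    if status != 200:
        return 403, {"error": "id generation failed"}
    _urls[body["id"]] = long_url
    return 200, {"short_id": body["id"]}


def redirect(short_id: str) -> Tuple[int, dict]:
    """/redirect: look up the long URL and answer with a 301."""
    long_url = _urls.get(short_id)
    if long_url is None:
        return 404, {"error": "not found"}  # assumption: not specified above
    return 301, {"Location": long_url}
```

In a real deployment these functions would sit behind the API gateway as separate services. They are kept in one module here only so the request and response shapes are easy to compare.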

High-Level Design


We will have a CDN/edge layer for our frontend, which talks to the load balancer. Behind the load balancer, the API gateway routes our APIs. A rate limiter protects against brute-force attacks. From there, a shorten request is routed to the shorten service, which creates the short URL using an ID generated by the ID generation service. The resulting mapping is then stored in the DB and in a cache, so that most lookups are served from the cache rather than the DB, which keeps the application fast. We will use either Redis or Memcached for caching and Postgres for the database.
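
The interplay between the cache and the DB described above is commonly implemented as a cache-aside lookup. The following is a minimal sketch of that pattern, assuming plain dictionaries in place of Redis/Memcached and Postgres. The class and method names are hypothetical, and `db_reads` exists only to make the effect of the cache visible.

```python
from typing import Dict, Optional


class UrlStore:
    """Cache-aside lookup; dicts stand in for the cache and the database."""

    def __init__(self) -> None:
        self.db: Dict[str, str] = {}     # stand-in for Postgres
        self.cache: Dict[str, str] = {}  # stand-in for Redis/Memcached
        self.db_reads = 0                # counts lookups that reach the DB

    def save(self, short_id: str, long_url: str) -> None:
        self.db[short_id] = long_url     # write to the DB first
        self.cache[short_id] = long_url  # then populate the cache

    def resolve(self, short_id: str) -> Optional[str]:
        if short_id in self.cache:       # cache hit: the DB is not touched
            return self.cache[short_id]
        self.db_reads += 1               # cache miss: fall back to the DB
        long_url = self.db.get(short_id)
        if long_url is not None:
            self.cache[short_id] = long_url  # refill the cache for next time
        return long_url
```

A real cache would also need an eviction policy and expiry times, which this sketch leaves out.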

Detailed Component Design


Since the frontend is served from the CDN, the edge can absorb a large share of the traffic, potentially millions of requests, before it reaches our services. The remaining requests are balanced by the load balancer and routed by the API gateway to the service behind the endpoint being accessed. To deter brute-force attacks, we will rate limit by IP address and by location (blacklisted countries).

The services mentioned will run in a Kubernetes cluster, which spins up pods for our services and can scale them easily. The tradeoff is that setting up Kubernetes is a hassle and can be time consuming. Once it is set up, though, it gives us the performance and scale required to handle a large number of requests.

Since the data being stored is just the short URL and its original URL, we can use a traditional RDBMS such as Postgres and store it in a single table. For the key we can use universally unique identifiers (UUIDs). With random (version 4) UUIDs, generating about a billion IDs per second for roughly 86 years gives only about a 50% chance of a single collision, which is more than enough for our use case. The tradeoff is that a UUID is 128 bits long (36 characters in its standard text form) and is not easy to read.

For caching we will use Redis, an in-memory key-value store, which is a natural fit for looking up the original URL by its short URL.
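
The UUID collision estimate can be checked with the standard birthday approximation, p ≈ 1 - exp(-n² / 2N), where n is the number of IDs generated and N is the number of possible IDs. The sketch below assumes random version 4 UUIDs, which carry 122 random bits. The function and variable names are illustrative only.

```python
import math

RANDOM_BITS = 122  # a version 4 UUID has 128 bits, 6 of which are fixed
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def collision_probability(n_ids: float, bits: int = RANDOM_BITS) -> float:
    """Approximate chance of at least one collision among n_ids random IDs."""
    space = 2.0 ** bits
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-(n_ids ** 2) / (2.0 * space))


def ids_for_probability(p: float, bits: int = RANDOM_BITS) -> float:
    """Number of random IDs at which the collision chance reaches p."""
    space = 2.0 ** bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p)))


# IDs needed for a 50% collision chance, and how long that takes
# at one billion IDs per second.
n_half = ids_for_probability(0.5)
years_at_billion_per_sec = n_half / 1e9 / SECONDS_PER_YEAR

# Collision chance after 100 years at one million IDs per second.
p_million_per_sec = collision_probability(1e6 * 100 * SECONDS_PER_YEAR)
```

Under these assumptions `n_half` comes out at roughly 2.7 × 10^18 IDs, or about 86 years at a billion IDs per second. At a million IDs per second for 100 years, the collision chance stays below one in a million.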