Requirements
Functional Requirements:
- Given a long URL, return a unique short URL
- Given a short URL, return the corresponding long URL
Non-Functional Requirements:
- The short URL should be generated in 10s of milliseconds
- The redirection from short URL should happen in 10s of milliseconds
Capacity Estimates:
- Short URL to long URL is read op and long to short is write op
- Assuming this is a read heavy application, i.e. a lot more read ops than write ops
- Assuming we get 10,000 reads per second and ~500 writes per second
- The system is highly available
API Design
This is a simple application, we only need two endpoints:
- POST /shortenURL{longURL}
- GET /getOriginalURL{shortURL}
High-Level Design
Let me start with a simple design, shown in the high level diagram. Clients make requests to the API Gateway, which checks the cache and returns the result. On cache miss, it forwards the request to the Application Server. The Application Server queries the Database and returns the response to the client as well as write the response to the cache.
Our database entry is a long URL and a corresponding short URL. Assuming 5 kb per entry, we are looking at writing 2.5 mb per second and around 250 gb per month. At this scale we can use a simple Relational database for the application, I'll use Postgres here.
For high availability, we will deploy multiple application servers and the gateway will use a load balancer to route incoming requests. We can use Kubernetes to spin-up application servers, check for health and scale up /down depending on traffic. We will also keep a read replica of our database in case of failures, to quickly switch over, so that we minimize any downtime.
Detailed Component Design
Application Servers:
The application servers serve two purpose, shorten a given long URL and get associated long URL given a short one. For shortening a URL, we can use randomly generated 20-30 character string, let's go with 24 characters. Using upper and lower case alphabets, number and special characters will give us a 72 unique characters. This gives us a space of 24 ** 72 size string space for the short URL, which easily handles ~10 billion URL on the internet. The collision probability is also less and on collision the application servers can retry once or twice.
Cache:
The Redis cache will be use to reduce load on our application servers. In case of cache miss, the application servers will fetch the data from the database and write it to the cache as well. We will use TTL as 30 minutes, to minimize cache size while keeping latency low. Since our data is static, we can even use CDNs to cache hot URL at edge, reducing latency further.
Database:
We need to store just one table in the database URL_MAP. But due to the scale of our use, we expect to store around 250 gb of data per month. As the database grows, our reads will take longer and longer. So we need to partition the database. We will shard the database by short_url, the random nature of the short URL ensures uniform distribution. For availability we will keep at least read replicas of the database so that we can promote any one to main in case of failure.