Requirements
Functional Requirements:
- Create a short URL for a given long URL.
- Return the long URL associated with a given short URL.
- Support an expiration date (TTL) for each short URL.
Non-Functional Requirements:
- The user can specify an expiration time; by default it is 10 years.
- Auto deletion of expired records.
- Read latency should be very low; write latency can be slightly higher.
- Duplicate records for the same URL should not be created.
- The system should be highly available and horizontal scaling should be possible.
API Design
Define the APIs expected from the system. This is your chance to analyze and define the read and write paths so that you can come up with the high-level design...
- POST request -> returns 201 Created on success, or 409 Conflict if a record for the URL already exists.
- GET request -> for a shortened URL, returns a 302 redirect to the full URL it represents.
- DELETE request -> returns 200 on successful deletion, or 404 if the record is not found.
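As a rough illustration of these three endpoints, here is a minimal in-memory sketch. The class and method names (ShortenerService, post, get, delete) and the placeholder short codes are hypothetical, and returning 404 from GET for a missing or expired code is an assumption, since the status codes above only cover the redirect case. A real service would sit behind an HTTP framework and a persistent store.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Roughly 10 years, the default expiration stated in the requirements.
DEFAULT_TTL = timedelta(days=3650)


@dataclass
class Record:
    long_url: str
    expires_at: datetime


class ShortenerService:
    """In-memory stand-in for the service; all names here are hypothetical."""

    def __init__(self):
        self._by_code = {}   # short code -> Record
        self._by_long = {}   # long URL -> short code, used to reject duplicates
        self._next_id = 1

    def post(self, long_url, ttl=DEFAULT_TTL, now=None):
        """Create a short code: 201 on success, 409 if the URL already has one."""
        now = now or datetime.now(timezone.utc)
        existing = self._by_long.get(long_url)
        if existing is not None:
            if self._by_code[existing].expires_at > now:
                return 409, existing
            self.delete(existing)  # expired: drop it before creating a new record
        code = f"c{self._next_id}"  # placeholder code, not a real encoding scheme
        self._next_id += 1
        self._by_code[code] = Record(long_url, now + ttl)
        self._by_long[long_url] = code
        return 201, code

    def get(self, code, now=None):
        """Resolve a short code: 302 with the long URL, or 404 (assumed)."""
        now = now or datetime.now(timezone.utc)
        rec = self._by_code.get(code)
        if rec is None or rec.expires_at <= now:
            return 404, None
        return 302, rec.long_url

    def delete(self, code):
        """Delete a short code: 200 if it existed, 404 otherwise."""
        rec = self._by_code.pop(code, None)
        if rec is None:
            return 404
        self._by_long.pop(rec.long_url, None)
        return 200
```

Keeping a second index from long URL to short code is one simple way to detect duplicates on the write path; it costs extra storage but keeps the 409 check to a single lookup.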
High-Level Design
Describe the overall system architecture. Identify the main components needed to solve the problem end-to-end. Use the diagramming tool to create a block diagram.
Detailed Component Design
Deep dive into 2-3 key components. Explain how they work, how they scale, discuss tradeoffs, capacity, and any relevant algorithms or data structures.
- The workload is read-intensive, typically about 100:1 reads to writes.
- Consistency vs. availability: the system should favor high availability. This rests on the assumption that a newly created short URL is not shared and used immediately, so a short delay before it becomes readable everywhere is acceptable.
- The expiration date is 10 years by default, so we need to store shortened URLs for at least 10 years.
- With 1 million QPS and a 100:1 read/write ratio, write requests would be about 10,000 per second, so over 10 years (about 315 million seconds) that is around 3.15 trillion records.
- The characters usable in a shortened URL are [0-9][A-Z][a-z], 62 characters in total, so a 7-character code supports 62^7, or about 3.52 trillion, distinct records.
- The client sends a GET request. Lookups are cached at the DNS level and in Redis; the database is hit only on a cache miss.
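The sizing and read-path points above can be checked with a short sketch. It assumes roughly 10,000 writes per second (1 million QPS at a 100:1 read/write ratio), a 365-day year, and a hypothetical counter-based scheme in which each new record gets a numeric ID that is base-62 encoded into a 7-character code; the notes above do not say how codes are generated, so that scheme is only an illustration. The resolve function uses plain dictionaries as stand-ins for Redis and the database.

```python
import string

# [0-9][A-Z][a-z]: the 62 characters allowed in a short code.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
CODE_LENGTH = 7

WRITES_PER_SECOND = 10_000            # assumed: about 1,000,000 QPS / 101, rounded
SECONDS_PER_YEAR = 365 * 24 * 3600
RETENTION_YEARS = 10

# Records written over the retention window vs. codes available.
records_needed = WRITES_PER_SECOND * SECONDS_PER_YEAR * RETENTION_YEARS
keyspace = len(ALPHABET) ** CODE_LENGTH


def encode_base62(n, length=CODE_LENGTH):
    """Encode a non-negative integer ID as a fixed-length base-62 string."""
    if not 0 <= n < len(ALPHABET) ** length:
        raise ValueError("id does not fit in the requested code length")
    chars = []
    for _ in range(length):
        n, rem = divmod(n, len(ALPHABET))
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def resolve(code, cache, db):
    """Cache-aside read: try the cache first, hit the database only on a miss."""
    long_url = cache.get(code)
    if long_url is None:
        long_url = db.get(code)
        if long_url is not None:
            cache[code] = long_url   # populate the cache for later reads
    return long_url


if __name__ == "__main__":
    print(f"records needed over 10 years: {records_needed:,}")
    print(f"7-character keyspace:         {keyspace:,}")
    print(encode_base62(62))
```

Under these assumptions the 7-character keyspace exceeds the 10-year record count by only about 12%, so a design that expects growth or sparse code allocation may want an eighth character.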