Requirements
Functional Requirements:
- R1 - Create a short URL for a given long URL.
- R2 - Return the long URL associated with a given short URL.
Non-Functional Requirements:
- R3 - Low latency
- R4 - High scalability
- R5 - High availability
- R6 - Zero chance of short URL collisions
- R7 - Horizontally Scalable
API Design
GET domain.io/link/{uuid} (Satisfies R2)
Request Body: None
HTTP Response: 301 to actual URL
Response Body: None
POST domain.io/link - Creates a unique link (Satisfies R1)
Request Body: { "url": "url to create a short link for" }
HTTP Headers may or may not contain a JWT.
HTTP Response: 200
Response Body: { "link": "domain.io/link/{uuid}" }
PUT domain.io/link/{uuid} - Updates a link's origin URL
Headers require a valid JWT
Request Body: { "url": "new origin URL for the link" }
HTTP Response: 200
Response Body: { "link": "domain.io/link/{uuid}" }
DELETE domain.io/link/{uuid} - Deletes a link and its origin URL
Headers require a valid JWT
Request Body: None
HTTP Response: 200
Response Body: { "link": "domain.io/link/{uuid}" }
Malformed requests will receive appropriate error codes. The update and delete APIs will only be usable if the user who created the link was authenticated on the platform at creation time.
High-Level Design
The frontend can use whatever library we choose and will make asynchronous API calls to the backend server. This server will store the creators, original links, and short links in the database.
The backend will use DynamoDB for high scalability and availability (R4 & R5), on the order of millions of requests per second. Transactional, conditional writes can also be used to ensure that one URL does not overwrite another (R6). An in-memory cache such as Redis will be used to respond to clients quickly when a URL has been visited recently. The cache satisfies R3 by avoiding unnecessary database queries.
This design takes a standard CRUD approach, whereby the read (GET) API simply returns a 301 redirect to the full original link. Anyone can create a link, but a user who wants to update or delete that link later must be logged into the platform when creating it. If a user is authenticated, their user ID will be stored in the database as the creator of the link. Any future request to update or delete the link will check the calling user against the creator and either complete the request or deny it.
The service itself is horizontally scalable (R7) because we can spin up as many servers as the traffic requires. All servers will sit behind a load balancer so that traffic is evenly distributed among them. Each server will have its own in-memory cache; if we later need a distributed cache instead of a strictly in-memory one, it can be added without changing the rest of the design. DynamoDB is itself horizontally scalable, since it partitions table data automatically as the table grows.
We will also place rate limiting in front of the service to prevent abuse and enforce throttling. Guidance will be given to customers on how to handle throttled requests.
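The creator check for update and delete requests can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the `Link` record, its field names, and the `can_modify` helper are hypothetical, and the caller ID is assumed to have already been extracted from a validated JWT.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Link:
    """Hypothetical shape of a stored link record."""
    uuid: str
    url: str
    creator_id: Optional[str]  # None when the link was created anonymously


def can_modify(link: Link, caller_id: Optional[str]) -> bool:
    """Return True only if the caller is the authenticated creator of the link."""
    if caller_id is None:
        # No valid JWT on the request, so the caller is anonymous.
        return False
    if link.creator_id is None:
        # Anonymous links have no owner, so nobody may update or delete them.
        return False
    return link.creator_id == caller_id
```

A handler would call `can_modify` before touching the database and deny the request when it returns False.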
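The design does not name a rate limiting algorithm. As one possible option, a token bucket per client could look like the sketch below. The `TokenBucket` class, its parameters, and the injectable clock are assumptions made for illustration only.

```python
import time
from typing import Callable


class TokenBucket:
    """Hypothetical per-client limiter: allows bursts up to `capacity` requests."""

    def __init__(self, capacity: float, refill_per_second: float,
                 now: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._now = now            # injectable clock, which makes testing easy
        self._tokens = capacity    # start with a full bucket
        self._last = now()

    def allow(self) -> bool:
        """Consume one token if available; otherwise signal a throttle."""
        current = self._now()
        elapsed = current - self._last
        self._last = current
        # Refill in proportion to elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity,
                           self._tokens + elapsed * self.refill_per_second)
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False  # the server would typically answer with HTTP 429
```

A throttled client would typically be advised to back off and retry after a short delay.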
Detailed Component Design
GET domain.io/link/{uuid}
When the server gets this request, it will not check auth. It will first check the Redis cache for the uuid. If the short URL has been seen before, the client gets a 301 for the original link immediately, without the need to hit the database. If it has not been seen before, the server will query the database for the corresponding uuid; should a match be found, the original link will be stored in the cache before the 301 response goes to the client.
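The read path above can be sketched as a cache-aside lookup. In this minimal illustration plain dictionaries stand in for Redis and DynamoDB, the `resolve` function is hypothetical, and returning 404 for an unknown uuid is an assumption, since the design only promises appropriate error codes.

```python
from typing import Dict, Optional, Tuple


def resolve(uuid: str, cache: Dict[str, str],
            database: Dict[str, str]) -> Tuple[int, Optional[str]]:
    """Return (status_code, redirect_location) for GET /link/{uuid}."""
    url = cache.get(uuid)
    if url is None:
        # Cache miss: fall back to the database.
        url = database.get(uuid)
        if url is None:
            return 404, None  # assumed behaviour for unknown links
        # Populate the cache before the response goes to the client.
        cache[uuid] = url
    return 301, url
```

After the first lookup, repeated requests for the same uuid are served from the cache without touching the database.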
POST domain.io/link - Creates a unique link
When creating a link, auth will be checked but is not required. An original URL is expected in the request body. When the request is received, a UUID will be created. The UUID, the creating user (if one is logged in), and the original link will all be stored in the database. Using a UUID helps with scalability, because any server can generate one without coordination, and makes collisions extremely unlikely. We will rule out collisions entirely by using DynamoDB conditional writes, where the condition is that the PK for the UUID does not already exist in the database. If the condition is violated, the server will simply generate a new UUID and attempt the write again. Before a response is sent to the client, the new link will also be written to the in-memory cache.
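The create-and-retry flow can be sketched as follows. This is a minimal illustration under stated assumptions: `InMemoryTable` and `ConditionFailed` are hypothetical stand-ins for the DynamoDB table and its ConditionalCheckFailedException, `put_if_absent` mimics a put with the condition `attribute_not_exists(pk)`, and the `max_attempts` cap is an added safety limit that the design does not specify.

```python
import uuid
from typing import Callable, Dict, Optional


class ConditionFailed(Exception):
    """Stand-in for DynamoDB's ConditionalCheckFailedException."""


class InMemoryTable:
    """Hypothetical stand-in for the links table; not a real DynamoDB client."""

    def __init__(self) -> None:
        self.items: Dict[str, dict] = {}

    def put_if_absent(self, pk: str, item: dict) -> None:
        # Mimics a put whose condition is attribute_not_exists(pk).
        if pk in self.items:
            raise ConditionFailed(pk)
        self.items[pk] = item


def create_link(table: InMemoryTable, cache: Dict[str, str], url: str,
                creator_id: Optional[str] = None,
                new_id: Callable[[], str] = lambda: str(uuid.uuid4()),
                max_attempts: int = 5) -> str:
    """Store the link under a fresh UUID, retrying if the key already exists."""
    for _ in range(max_attempts):
        link_id = new_id()
        try:
            table.put_if_absent(link_id, {"url": url, "creator_id": creator_id})
        except ConditionFailed:
            continue  # collision: generate a new UUID and try again
        cache[link_id] = url  # update the cache before responding
        return link_id
    raise RuntimeError("could not allocate a unique link id")
```

Because the existence check and the write happen in one conditional operation, an existing link is never overwritten, even when two servers generate the same UUID.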