Requirements
Functional Requirements:
- Create a short URL for a given long URL.
- Return the long URL associated with a given short URL.
Non-Functional Requirements:
- Scale: 1 million users and 100 million URLs. URLs never expire, because they may be embedded in emails and documents and must stay available.
- Scalability: to handle this volume of data we scale horizontally. The work is split across two services because there are about 10 times more reads than writes.
- Latency: approximately 5 ms
- High availability is prioritized over consistency
API Design
Post the URL
POST /v1/shortenURL
Request body
{
longURL:
}
Response body
{
longURL:
shortURL:
timeCreated:
}
Redirect the URL
GET /v1/{shortUrl}
Response: a 302 redirect with the Location header set to the original long URL. The browser follows the redirect automatically.
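The two endpoints above can be sketched as a single request handler. This is a minimal illustration, not the actual service: the in-memory `STORE`, the `handle` function, the 201 and 404 status codes, and the JSON key spellings `shortURL` and `timeCreated` are assumptions made for the example.

```python
import json
import time

# Hypothetical in-memory store standing in for the DB: short code -> long URL.
STORE = {}

def handle(method, path, body=None):
    """Return (status, headers, body) for the two endpoints in the API design."""
    if method == "POST" and path.rstrip("/") == "/v1/shortenURL":
        long_url = json.loads(body)["longURL"]
        # Placeholder code generation; the real shortening algorithm goes here.
        code = str(len(STORE) + 1)
        STORE[code] = long_url
        payload = {
            "longURL": long_url,
            "shortURL": "/v1/" + code,
            "timeCreated": int(time.time()),
        }
        # 201 Created is an assumption; the design does not name a status code.
        return 201, {"Content-Type": "application/json"}, json.dumps(payload)

    if method == "GET" and path.startswith("/v1/"):
        long_url = STORE.get(path[len("/v1/"):])
        if long_url is None:
            return 404, {}, ""
        # 302 with the Location header; the browser follows it automatically.
        return 302, {"Location": long_url}, ""

    return 404, {}, ""
```

A client would POST the long URL once, then issue a GET on the returned short URL and follow the Location header.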
High-Level Design
We need two services: one for shortening URLs and one for redirecting them.
Client: the user sends a long URL in the request body and gets a short URL in return.
Shortening service: handles the POST API. It creates a unique short URL for the given long URL using Base62 encoding and saves the mapping to the DB. The response contains the short URL, which is displayed in the UI.
Cache: holds the frequently accessed URLs. The redirection service checks the cache first and, on a cache miss, falls back to the DB.
Redirection service: the user enters a short URL, and the service fetches the corresponding long URL and redirects to it.
DB: stores the mapping between short and long URLs.
Detailed Component Design
Shortening Service: checks whether the long URL is already present in the DB. If it is, the service returns the existing short URL; otherwise it uses the shortening algorithm to create a new short URL and saves it to the DB.
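The check-then-create behaviour can be sketched as follows. The class name, the two dictionaries, and the counter-based placeholder code are assumptions for illustration. In particular, a key-value store keyed by short code would need some reverse index, such as `long_to_short` here, to answer "is this long URL already stored?".

```python
import itertools

class ShorteningService:
    """Get-or-create sketch: the same long URL always maps to the same short code."""

    def __init__(self):
        self.short_to_long = {}   # stand-in for the main DB table
        self.long_to_short = {}   # hypothetical reverse index for the lookup
        self._ids = itertools.count(1)

    def shorten(self, long_url):
        existing = self.long_to_short.get(long_url)
        if existing is not None:
            return existing  # already shortened: return the same short code
        # Placeholder code; the shortening algorithm would produce this value.
        code = str(next(self._ids))
        self.short_to_long[code] = long_url
        self.long_to_short[long_url] = code
        return code
```

In a distributed deployment the check and the write would have to be atomic (for example a conditional put), which this single-process sketch does not address.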
Shortening Algorithm: uses Base62 to shorten the URL. We do not use MD5 or SHA hashes because the digest is too long (32 hexadecimal characters for MD5, 40 or more for the SHA family), and trimming it down to a short code leads to collisions.
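To give a rough sense of the collision problem, the sketch below estimates how many pairs of URLs would share a code if codes were drawn uniformly at random, which is an assumption about how a trimmed hash behaves. The function name and the choice of a 7-character Base62 code are illustrative; the 100 million figure comes from the requirements.

```python
def expected_colliding_pairs(n_items, code_length, alphabet_size=62):
    """Expected number of item pairs that share a code, assuming uniform random codes."""
    space = alphabet_size ** code_length       # number of possible codes
    pairs = n_items * (n_items - 1) / 2        # number of distinct item pairs
    return pairs / space                       # each pair collides with probability 1/space

if __name__ == "__main__":
    estimate = expected_colliding_pairs(100_000_000, 7)
    print(f"expected colliding pairs: {estimate:.0f}")
```

Under these assumptions the estimate is on the order of a thousand colliding pairs, so a trimmed hash needs collision detection and retries, whereas encoding a unique ID does not.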
With Base62 encoding, a code of 7 to 8 characters is enough for a very large number of unique URLs: 7 characters give 62^7, about 3.5 trillion, possible codes.
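A minimal Base62 sketch, assuming each new URL is first given a unique non-negative integer ID (for example from a counter; the source of the ID is not specified in this design) that is then encoded. The alphabet order and the function names are choices made for the example.

```python
import string

# 0-9, a-z, A-Z: 62 symbols. The ordering is an arbitrary choice for this sketch.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)

def base62_encode(number):
    """Encode a non-negative integer as a Base62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number:
        number, remainder = divmod(number, BASE)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))  # most significant digit first

def base62_decode(code):
    """Decode a Base62 string back to the integer it represents."""
    number = 0
    for ch in code:
        number = number * BASE + ALPHABET.index(ch)
    return number
```

Because each ID is unique, the resulting codes are unique too, and every ID below 62^7 fits in at most 7 characters.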
Caching: several cache strategies are possible. Because the system is read-heavy, we can use a read-through cache.
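A read-through cache can be sketched like this: callers only talk to the cache, and the cache itself loads from the DB on a miss. The class name, the `loader` callback, the capacity, and the LRU eviction policy are assumptions for illustration; the design does not name an eviction policy.

```python
from collections import OrderedDict

class ReadThroughCache:
    """Read-through cache with LRU eviction (the eviction policy is an assumption)."""

    def __init__(self, loader, capacity=1000):
        self._loader = loader          # function: short code -> long URL or None (DB lookup)
        self._capacity = capacity
        self._items = OrderedDict()    # least recently used entry first

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)   # cache hit: mark as most recently used
            return self._items[key]
        value = self._loader(key)          # cache miss: read through to the DB
        if value is not None:
            self._items[key] = value
            if len(self._items) > self._capacity:
                self._items.popitem(last=False)  # evict the least recently used entry
        return value
```

Since URLs never change once created, cached entries do not need invalidation in this design, which keeps the cache simple.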
DB: we use NoSQL because the data is non-relational and we need very fast reads. We will use a key-value store.
Redirection Service: once a URL is shortened and the user enters the short URL in the browser, the GET API is called. It fetches the long URL from the DB and redirects to it.
Future Enhancements
We can add an expiry date and time to each short URL to reduce the amount of stored data.
Once a URL has expired, we can remove it from the DB permanently.
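One way the expiry idea could look is sketched below, with an in-memory store standing in for the DB. The class and method names, the per-URL `ttl_seconds` parameter, and the injectable `clock` are assumptions for illustration. Many key-value stores offer built-in TTL support that would replace the manual purge.

```python
import time

class ExpiringUrlStore:
    """In-memory stand-in for the DB; each row carries an expiry timestamp."""

    def __init__(self, clock=time.time):
        self._clock = clock   # injectable so the behaviour can be tested
        self._rows = {}       # short code -> (long URL, expiry in epoch seconds)

    def put(self, code, long_url, ttl_seconds):
        self._rows[code] = (long_url, self._clock() + ttl_seconds)

    def get(self, code):
        row = self._rows.get(code)
        if row is None:
            return None
        long_url, expires_at = row
        if self._clock() >= expires_at:
            del self._rows[code]   # lazily remove the expired row on read
            return None
        return long_url

    def purge_expired(self):
        """Permanently delete all expired rows and return how many were removed."""
        now = self._clock()
        expired = [code for code, (_, exp) in self._rows.items() if now >= exp]
        for code in expired:
            del self._rows[code]
        return len(expired)
```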
We can add custom short URLs for premium users. We would authenticate the user, check that the user is a premium user, and then save the URL together with the user ID in the DB, so that URLs become user-specific.
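The custom URL flow could be sketched as below. Authentication is assumed to have happened upstream, so the function receives an already verified `user_id`. The function name, the `premium_users` set, the alias rules (3 to 32 Base62 characters), and the stored record shape are hypothetical.

```python
import re

# Assumed alias rules for this sketch: 3 to 32 Base62 characters.
ALIAS_PATTERN = re.compile(r"[0-9A-Za-z]{3,32}")

def create_custom_url(db, premium_users, user_id, alias, long_url):
    """Store a custom alias for a premium user; db maps alias -> record."""
    if user_id not in premium_users:
        raise PermissionError("custom URLs are limited to premium users")
    if not ALIAS_PATTERN.fullmatch(alias):
        raise ValueError("alias must be 3 to 32 Base62 characters")
    if alias in db:
        raise ValueError("alias is already taken")
    # Saving the user ID with the URL makes the mapping user-specific.
    db[alias] = {"longURL": long_url, "userId": user_id}
    return alias
```

Custom aliases share the namespace with generated codes, so the uniqueness check has to cover both.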