System requirements
Functional:
- Shorten a given long URL into a short URL
- A short URL can have an expiry date
- A user can provide a customized short URL
- Free users and premium users
- Redirect the user to the long URL when the short URL is clicked
- The service should collect metrics such as click count
Non-Functional:
- Highly available: the service should always be running
- Users can create short URLs through an API endpoint
- Scalable
- Able to handle a large number of requests
- URL redirection should be fast
- Integration with third-party services through API endpoints
Capacity estimation
Assume the API endpoint receives 100 million unique create requests per month.
Links generated per second = 100 million / (30*24*60*60) ≈ 39 URLs per second.
Total unique URLs in one year = 1,200 million; if we store them for 10 years, that is 12,000 million (12 billion).
Assuming each record (long url, short url, expiry_date, created_date, user_id) takes 500 bytes, storage = 12,000 million * 500 bytes = 6 TB.
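The arithmetic can be sanity-checked with a few lines of Python. This is only a back-of-the-envelope sketch; the inputs (100 million new URLs per month, 10 years of retention, 500 bytes per record) are the assumptions of this estimate, and the constant names are made up for illustration.

```python
# Back-of-the-envelope capacity estimate; every input is an assumption.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60   # 2,592,000 seconds in a 30-day month
NEW_URLS_PER_MONTH = 100_000_000        # assumed create requests per month
RETENTION_YEARS = 10                    # assumed retention period
BYTES_PER_RECORD = 500                  # assumed size of one stored record

writes_per_second = NEW_URLS_PER_MONTH / SECONDS_PER_MONTH
urls_per_year = NEW_URLS_PER_MONTH * 12
total_urls = urls_per_year * RETENTION_YEARS
total_bytes = total_urls * BYTES_PER_RECORD

print(f"writes/second : {writes_per_second:.1f}")        # 38.6
print(f"URLs per year : {urls_per_year:,}")              # 1,200,000,000
print(f"URLs in 10 yrs: {total_urls:,}")                 # 12,000,000,000
print(f"storage       : {total_bytes / 10**12:.0f} TB")  # 6 TB
```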
API design
POST /api/app/create
request body: url: long_url
return 200 with the short url
GET /short_url/
return HTTP redirect to the long url
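A minimal sketch of the two endpoints as plain Python functions, to make the request and response shapes concrete. Everything here is an assumption for illustration: the function names, the in-memory dict standing in for the database, the placeholder short codes, the 400 and 404 error responses, and the choice of 302 as the redirect status.

```python
import itertools

# Hypothetical in-memory stand-ins; a real service would use the database.
_store = {}
_counter = itertools.count(1)

def handle_create(body):
    """POST /api/app/create with body {"url": long_url}; returns (status, json_body)."""
    long_url = body.get("url", "")
    if not long_url.startswith(("http://", "https://")):
        return 400, {"error": "url must start with http:// or https://"}
    short_code = f"u{next(_counter)}"   # placeholder code, not the base62 scheme
    _store[short_code] = long_url
    return 200, {"short_url": short_code}

def handle_redirect(short_code):
    """GET /<short_url>/; returns (status, headers)."""
    long_url = _store.get(short_code)
    if long_url is None:
        return 404, {}
    return 302, {"Location": long_url}  # 302 is an assumption; 301 is also common
```

A permanent redirect (301) lets browsers cache the mapping and reduces load, while a temporary one (302) keeps every click visible to the service, which matters if click counts are collected.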
We could choose base62 with a length of 6, which gives 62^6, about 56.8 billion possible short URLs.
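As a quick check of the keyspace, the sketch below compares the number of base62 codes of length 6 and 7 with the 12,000 million URLs expected over 10 years. The variable names are illustrative only.

```python
ALPHABET_SIZE = 62               # 0-9, a-z, A-Z
EXPECTED_URLS = 12_000_000_000   # 12,000 million URLs over 10 years (capacity estimate)

for length in (6, 7):
    keyspace = ALPHABET_SIZE ** length
    print(f"length {length}: {keyspace:,} codes, "
          f"{keyspace / EXPECTED_URLS:.1f}x the expected URL count")
```

Length 6 yields 56,800,235,584 codes, which already covers the estimate several times over; length 7 leaves far more headroom.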
Database design
We will be using MySQL.
We will have the following tables:
User
- id
- name
- premium
- created
- modified
Url
- id
- user_id (foreign key to User table)
- long_url
- short_url
- created
- modified
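The two tables could be sketched as follows. To keep the example runnable without a database server it uses Python's built-in sqlite3 module as a stand-in for MySQL; the column types, the defaults, and the unique constraint on short_url are assumptions, not part of the design above.

```python
import sqlite3

# Assumed column types; SQLite stands in for MySQL in this sketch.
SCHEMA = """
CREATE TABLE user (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    premium   INTEGER NOT NULL DEFAULT 0,   -- 0 = free user, 1 = premium user
    created   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    modified  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE url (
    id        INTEGER PRIMARY KEY,
    user_id   INTEGER NOT NULL REFERENCES user(id),
    long_url  TEXT NOT NULL,
    short_url TEXT NOT NULL UNIQUE,         -- unique index backs the redirect lookup
    created   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    modified  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")    # SQLite enforces foreign keys only when enabled
conn.executescript(SCHEMA)

conn.execute("INSERT INTO user (name, premium) VALUES (?, ?)", ("alice", 1))
conn.execute(
    "INSERT INTO url (user_id, long_url, short_url) VALUES (?, ?, ?)",
    (1, "https://example.com/some/long/path", "abc123"),
)
row = conn.execute(
    "SELECT long_url FROM url WHERE short_url = ?", ("abc123",)
).fetchone()
print(row[0])
```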
High-level design
With many concurrent requests a single server can go down, so we run multiple servers behind a load balancer.
We can also cache redirect lookups; any NoSQL key-value store can serve as the cache.
The database can be sharded, with master-slave replication.
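One simple way to decide which shard holds a given short URL is to hash the short code and take it modulo the number of shards. This is only a sketch: the shard count and the function name are hypothetical, and the design above does not say which sharding key or scheme is used.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count

def shard_for(short_code: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a short code to a shard index using a hash that is stable across processes."""
    digest = hashlib.sha256(short_code.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# The same code always maps to the same shard.
print(shard_for("abc123") == shard_for("abc123"))
```

Python's built-in hash() is randomized per process for strings, which is why the sketch uses hashlib. Plain modulo hashing remaps most keys when the shard count changes; consistent hashing is a common alternative if shards are expected to be added later.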
Request flows
The client sends a request to the server. The load balancer receives it first and forwards it to one of the servers, which passes it to the URL shortener service. After saving the mapping to the database, the service returns the short URL.
When the API receives a short URL, the request takes the same path through the load balancer, but instead of going to the URL shortener service it first checks the cache. On a cache miss it checks the database, and then returns a redirect.
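The redirect lookup follows the cache-aside pattern: check the cache, fall back to the database on a miss, and populate the cache for the next request. The sketch below uses plain dicts as stand-ins for both the cache and the Url table; the class and attribute names are hypothetical.

```python
class RedirectService:
    """Cache-aside lookup: check the cache first, fall back to the database."""

    def __init__(self, database: dict):
        self.database = database   # stand-in for the Url table: short_url -> long_url
        self.cache = {}            # stand-in for the NoSQL cache
        self.db_reads = 0          # counts how often the database is hit

    def resolve(self, short_url: str):
        long_url = self.cache.get(short_url)
        if long_url is not None:
            return long_url                    # cache hit
        self.db_reads += 1
        long_url = self.database.get(short_url)
        if long_url is not None:
            self.cache[short_url] = long_url   # populate the cache for next time
        return long_url                        # None means the short URL is unknown

service = RedirectService({"abc123": "https://example.com/some/long/path"})
first = service.resolve("abc123")    # miss: read from the database, then cached
second = service.resolve("abc123")   # hit: served from the cache
print(first == second, service.db_reads)  # True 1
```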
Detailed component design
To generate the short URL we will use base62 encoding, which gives a large number of URLs depending on the length: 62^6 is about 56.8 billion and 62^7 is about 3.5 trillion, so 6-7 characters are enough.
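A minimal base62 encoder and decoder could look like this. It assumes the short code is derived from a numeric id (for example the Url table's id); the design above does not say what value is encoded, so treat that and the function names as assumptions.

```python
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase  # 62 symbols
BASE = len(ALPHABET)

def encode_base62(number: int) -> str:
    """Encode a non-negative integer as a base62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number > 0:
        number, remainder = divmod(number, BASE)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))     # most significant digit first

def decode_base62(code: str) -> int:
    """Decode a base62 string back to the integer it represents."""
    number = 0
    for ch in code:
        number = number * BASE + ALPHABET.index(ch)   # raises ValueError on bad characters
    return number

print(encode_base62(62))           # 10
print(encode_base62(62**6 - 1))    # ZZZZZZ, the largest 6-character code
```

Encoding a sequential id keeps codes unique without collision checks, but it also makes them guessable; whether that matters depends on the requirements.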
Trade offs/Tech choices
Explain any trade offs you have made and why you made certain tech choices...
Failure scenarios/bottlenecks
Try to discuss as many failure scenarios/bottlenecks as possible.
Future improvements
What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?