System requirements


Functional:

  • Shorten a URL from a given long URL
  • A short URL can have an expiry date
  • A user can provide a customized short URL
  • Free users / premium users
  • Redirect the user to the long URL when the short URL is clicked
  • The service should collect metrics such as click count



Non-Functional:

  • Service should always be up and running (high availability)
  • Users can create short URLs through an API endpoint
  • Scalable
  • Able to handle a large number of requests
  • URL redirects should be fast
  • Integration with third-party APIs using API endpoints



Capacity estimation

Assuming that the number of unique requests coming to the API endpoint is 100 million per month,

then number of links generated per second = 100 million / (30*24*60*60) ≈ 40 URLs per second.

Total unique URLs in one year = 1,200 million, and if we store them for 10 years it will be 12,000 million.

Assuming that each record (long URL, short URL, expiry_date, created_date, user_id) takes 500 bytes, 12,000 million * 500 bytes = 6 TB.
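
The estimate can be checked with a short script. This is a sketch under explicit assumptions: 100 million new URLs per month, a 30-day month, 10 years of retention, and 500 bytes per record. The constant names are illustrative only.

```python
# Back-of-the-envelope capacity check.
# All inputs are assumptions taken from the estimate, not measured values.
URLS_PER_MONTH = 100_000_000
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 30-day month
RETENTION_YEARS = 10
BYTES_PER_RECORD = 500

writes_per_second = URLS_PER_MONTH / SECONDS_PER_MONTH
urls_per_year = URLS_PER_MONTH * 12
total_urls = urls_per_year * RETENTION_YEARS
total_bytes = total_urls * BYTES_PER_RECORD

print(f"writes per second: {writes_per_second:.1f}")
print(f"URLs per year: {urls_per_year:,}")
print(f"URLs over {RETENTION_YEARS} years: {total_urls:,}")
print(f"storage: {total_bytes / 10**12:.0f} TB")
```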






API design


POST /api/app/create

Request body: url: long_url

Returns 200 with the short URL


GET /short_url/

Returns an HTTP redirect to the long URL
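
To make the two endpoints concrete, here is a minimal sketch of their handlers as plain functions that return (status, headers, body). Several details are assumptions not stated above: the JSON body shape, the 400 and 404 error responses, the choice of 302 for the redirect, and the in-memory `_store` dictionary standing in for the database. The counter-based short code is a placeholder, not the real generation scheme.

```python
import json

# Hypothetical in-memory store standing in for the database.
_store = {}


def handle_create(body: str):
    """POST /api/app/create: expects a JSON body like {"url": "<long_url>"}."""
    long_url = json.loads(body).get("url")
    if not long_url:
        # Assumed error response; the design above only specifies the 200 case.
        return 400, {}, json.dumps({"error": "url is required"})
    # Placeholder code derived from a counter, only for this sketch.
    short_code = format(len(_store) + 1, "06d")
    _store[short_code] = long_url
    headers = {"Content-Type": "application/json"}
    return 200, headers, json.dumps({"short_url": short_code})


def handle_redirect(short_code: str):
    """GET /<short_url>/: answers with an HTTP redirect to the long URL."""
    long_url = _store.get(short_code)
    if long_url is None:
        return 404, {}, ""
    # 302 is an assumption; 301 is the other common choice for redirects.
    return 302, {"Location": long_url}, ""


status, headers, body = handle_create('{"url": "https://example.com/a/long/path"}')
code = json.loads(body)["short_url"]
print(status, code, handle_redirect(code))
```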



We could choose base62, and with a length of 6 we can store about 56.8 billion URLs (62^6).
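
The size of the key space for a given code length is just a power of 62. The sketch below computes it for lengths 6 and 7 and compares it with the 12,000 million URLs estimated for 10 years of storage; the function name `keyspace` is illustrative.

```python
ALPHABET_SIZE = 62  # digits 0-9, letters a-z and A-Z


def keyspace(length: int) -> int:
    """Number of distinct codes of exactly `length` base62 characters."""
    return ALPHABET_SIZE ** length


ESTIMATED_URLS = 12_000_000_000  # 10-year total from the capacity estimate

for n in (6, 7):
    size = keyspace(n)
    print(f"length {n}: {size:,} codes, enough for the estimate: {size > ESTIMATED_URLS}")
```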



Database design

We will be using MySQL.


We will have the following tables:

User

  • id
  • email
  • name
  • premium
  • created
  • modified

Url

  • id
  • user_id (foreign key with User table)
  • long_url
  • short_url
  • expiry_date
  • created
  • modified
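
The two tables can be sketched as a schema. To keep the example runnable without a database server, it uses Python's built-in sqlite3 module in place of MySQL, so the column types are SQLite's and would differ in MySQL. The UNIQUE constraints and the `expiry_date` column (taken from the capacity estimate) are assumptions, as are the sample rows.

```python
import sqlite3

# In-memory SQLite is used only so the sketch runs without a MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE user (
    id       INTEGER PRIMARY KEY,
    email    TEXT NOT NULL UNIQUE,          -- uniqueness is an assumption
    name     TEXT,
    premium  INTEGER NOT NULL DEFAULT 0,    -- 0 = free, 1 = premium
    created  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    modified TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE url (
    id          INTEGER PRIMARY KEY,
    user_id     INTEGER NOT NULL REFERENCES user(id),
    long_url    TEXT NOT NULL,
    short_url   TEXT NOT NULL UNIQUE,       -- unique index also serves redirect lookups
    expiry_date TEXT,                       -- assumed column; NULL means no expiry
    created     TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    modified    TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
""")

# Sample rows, purely illustrative.
user_id = conn.execute(
    "INSERT INTO user (email, name) VALUES (?, ?)",
    ("alice@example.com", "Alice"),
).lastrowid
conn.execute(
    "INSERT INTO url (user_id, long_url, short_url) VALUES (?, ?, ?)",
    (user_id, "https://example.com/some/long/path", "abc123"),
)

# The redirect path needs exactly this lookup: short_url -> long_url.
row = conn.execute(
    "SELECT long_url FROM url WHERE short_url = ?", ("abc123",)
).fetchone()
print(row[0])
```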





High-level design

There can be many concurrent requests, and a single server can go down under that load, so we could run multiple servers with a load balancer in front of them.


We could also use caching for redirection; any NoSQL cache can be used here.

The database can be sharded, along with master-slave replication.
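
One simple way to decide which shard holds a given short URL is to hash the short code and take the result modulo the number of shards. The sketch below assumes this hash-based scheme and a hypothetical shard count of 4; neither is fixed by the design above.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count


def shard_for(short_code: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a short code to a shard index in the range [0, num_shards)."""
    # A stable hash is used so every server computes the same shard
    # for the same code (Python's built-in hash() is randomized per process).
    digest = hashlib.sha256(short_code.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


for code in ("abc123", "Zx9Qa1", "000001"):
    print(code, "-> shard", shard_for(code))
```

With plain modulo, changing the number of shards moves most keys to a different shard, so a scheme such as consistent hashing may be preferable if shards are added often.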







Request flows

The client sends a request to the server, which is first received by the load balancer and forwarded to one of the servers. It then goes to the URL shortener service, which saves the mapping to the database and returns the short URL.

When our API gets a short URL, the same process happens, but instead of going to the URL shortener service, it first checks the cache and, on a miss, checks the database, then returns a redirect.
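
The read path described here is a cache-aside lookup. A minimal sketch follows, with plain dictionaries standing in for the NoSQL cache and the MySQL database; the names `cache`, `database`, and `resolve` are hypothetical, and populating the cache after a database hit is an assumption about how the cache gets filled.

```python
# Dictionaries stand in for the real cache and database in this sketch.
cache = {}
database = {"abc123": "https://example.com/some/long/path"}


def resolve(short_code: str):
    """Return (long_url, source), checking the cache before the database."""
    long_url = cache.get(short_code)
    if long_url is not None:
        return long_url, "cache"

    long_url = database.get(short_code)
    if long_url is None:
        return None, "miss"  # unknown short URL; the caller would answer 404

    # Populate the cache so the next redirect skips the database.
    cache[short_code] = long_url
    return long_url, "database"


print(resolve("abc123"))  # first lookup is served by the database
print(resolve("abc123"))  # second lookup is served by the cache
```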



Detailed component design

For generating the short URL we will use base62 encoding, which gives us a large number of URLs depending on the length; 6-7 characters is enough for our estimate.
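
A minimal sketch of base62 encoding follows. It assumes the short code is derived from a numeric ID, such as the auto-increment id of the Url table; the design does not fix the input, so treat that as one option. The alphabet order (digits, lowercase, uppercase) is also an arbitrary choice.

```python
import string

# 62 symbols: 0-9, a-z, A-Z. The ordering is an arbitrary choice for this sketch.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode_base62(n: int) -> str:
    """Encode a non-negative integer as a base62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def decode_base62(code: str) -> int:
    """Decode a base62 string back to the integer it represents."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n


url_id = 123_456_789  # hypothetical row id
code = encode_base62(url_id)
print(code, decode_base62(code) == url_id)
```

Codes produced this way are shorter than 6 characters for small IDs; they could be left-padded with the first alphabet symbol if a fixed length is wanted.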



Trade offs/Tech choices

Explain any trade offs you have made and why you made certain tech choices...






Failure scenarios/bottlenecks

Try to discuss as many failure scenarios/bottlenecks as possible.






Future improvements

What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?