System requirements


Functional:

List functional requirements for the system (Ask interviewer if stuck)...

  • Create a new shortened URL either in the browser or via the API
  • Keep shortened URLs alive for at least 10 days
  • Scale to 10 million clicks per day
  • Scale to 1 million new URLs generated per day
  • Report the number of clicks per shortenedURL


Non-Functional:

List non-functional requirements for the system...

  • Shortened URLs are not edited
  • No users
  • 1-to-1 mapping between shortened URL and URL


Capacity estimation

Estimate the scale of the system you are going to design...

  • 10 million clicks per day
  • 1 million new URLs per day
  • URLs are kept for 10 days, so about 10 million URLs are stored at any time (1 million per day for 10 days)
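
The daily figures above translate into fairly modest per-second rates. A quick back-of-the-envelope calculation, as a sketch: the 500 bytes per row is an assumption made here for illustration and is not part of the requirements.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

clicks_per_day = 10_000_000
new_urls_per_day = 1_000_000
retention_days = 10

# Average rates; real traffic is bursty, so peaks will be higher.
avg_reads_per_sec = clicks_per_day / SECONDS_PER_DAY
avg_writes_per_sec = new_urls_per_day / SECONDS_PER_DAY
read_write_ratio = clicks_per_day / new_urls_per_day
total_urls = new_urls_per_day * retention_days

# Assumption (not from the requirements): roughly 500 bytes per stored row.
ASSUMED_BYTES_PER_ROW = 500
storage_gb = total_urls * ASSUMED_BYTES_PER_ROW / 1e9

print(f"reads/s  ~ {avg_reads_per_sec:.0f}")
print(f"writes/s ~ {avg_writes_per_sec:.0f}")
print(f"rows     = {total_urls:,}")
print(f"storage  ~ {storage_gb:.0f} GB (under the assumed row size)")
```

So the average load is on the order of a hundred reads and about a dozen writes per second, with a 10:1 read-to-write ratio.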


API design

Define what APIs are expected from the system...

A REST API should be sufficient.

GET /shortenedURL should return a 302 Found with a Location header that contains the target URL.

GET /shortenedURL/clicks should return the number of clicks the shortenedURL has generated.

POST /create should return a 201 Created if the shortened URL was created, with the new shortened URL in the Location header and the response body.
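
A minimal sketch of how these three endpoints could map to status codes and headers. It is a plain function over in-memory dictionaries rather than a real web server. The handle function, the urls and clicks dictionaries, and the placeholder ID scheme are all hypothetical names used only for illustration.

```python
from typing import Dict, Tuple

Response = Tuple[int, Dict[str, str], str]  # (status, headers, body)

# Hypothetical in-memory stand-ins for the database.
urls: Dict[str, str] = {"abc123": "https://example.com/long/path"}
clicks: Dict[str, int] = {"abc123": 0}


def handle(method: str, path: str, body: str = "") -> Response:
    parts = [p for p in path.split("/") if p]

    # POST /create -> 201 Created, new shortened URL in Location and body.
    if method == "POST" and parts == ["create"]:
        short_id = f"id{len(urls)}"  # placeholder ID scheme, illustration only
        urls[short_id] = body
        clicks[short_id] = 0
        return 201, {"Location": f"/{short_id}"}, short_id

    # GET /shortenedURL/clicks -> 200 with the click count.
    if method == "GET" and len(parts) == 2 and parts[1] == "clicks" and parts[0] in urls:
        return 200, {}, str(clicks[parts[0]])

    # GET /shortenedURL -> 302 Found with the target in the Location header.
    if method == "GET" and len(parts) == 1 and parts[0] in urls:
        clicks[parts[0]] += 1
        return 302, {"Location": urls[parts[0]]}, ""

    return 404, {}, "not found"
```

For example, handle("GET", "/abc123") returns a 302 whose Location header is the stored URL, and a following handle("GET", "/abc123/clicks") reports one click.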


Database design

Defining the system data model early on will clarify how data will flow among different components of the system. Also you could draw an ER diagram using the diagramming tool to enhance your design...

  • 10 million unique rows
  • IDs consist of digits and uppercase and lowercase letters, derived from a SHA-256 hash of the URL itself
  • we should replicate the database to every region, with a


High-level design

You should identify enough components that are needed to solve the actual problem from end to end. Also remember to draw a block diagram using the diagramming tool to augment your design...


Server will have a load balancer in place.

Server will have a static IP address.

Server domain will resolve to a single anycast IP address, which means clients are routed to the nearest data center in network terms, usually the fastest one.

Server will replicate to different geo-distributed regions.

Server will serve multiple languages.


Server will accept either browser or API traffic.


Server will have two routes:

/create for creating new shortenedURLs

/shortenedURL for getting urls using shortenedURLs


Server will return a shortenedURL for create requests and a URL for get requests to an existing shortenedURL; otherwise it will return an error page.


Database will store the URL together with a hashed ID, which is the shortenedURL, and the number of clicks. For example, in /abc123 the hashed ID is abc123. The hashed ID is made of digits and uppercase or lowercase letters, based on a SHA-256 hash of the URL.
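
One way to turn a SHA-256 hash into digits and mixed-case letters is to treat the digest as a large integer and encode part of it in base 62. This is a sketch under assumptions: the 7-character length and the short_id name are choices made here, not part of the design above.

```python
import hashlib
import string

# 62 symbols: digits, uppercase letters, lowercase letters.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
ID_LENGTH = 7  # assumed length; 62**7 is about 3.5 trillion possible IDs


def short_id(url: str, length: int = ID_LENGTH) -> str:
    """Derive a fixed-length base-62 ID from the SHA-256 hash of the URL."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    number = int.from_bytes(digest, "big")
    chars = []
    for _ in range(length):
        # Peel off one base-62 digit at a time.
        number, index = divmod(number, len(ALPHABET))
        chars.append(ALPHABET[index])
    return "".join(chars)
```

The same URL always yields the same ID. Because only part of the hash is kept, two different URLs can still map to the same ID, so a collision check at creation time remains necessary.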


Request flows

Explain how the request flows from end to end in your high level design. Also you could draw a sequence diagram using the diagramming tool to enhance your explanation...


POST a URL to /create

this checks if there is already a shortenedURL for that URL

if yes, return that shortenedURL

if no, generate a new shortenedURL

check if the shortenedURL already exists for a different URL

if yes, generate a different shortenedURL and check again

if no, link the shortenedURL to the posted URL and return it to the client.
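
The create flow above can be sketched with two in-memory dictionaries standing in for the database. Since a hash of the URL alone always gives the same result, this sketch assumes an attempt counter is mixed into the hash so that a retry produces a different candidate. The candidate_id helper, the hex-prefix ID, and the retry limit are hypothetical.

```python
import hashlib
from typing import Dict

by_id: Dict[str, str] = {}   # shortenedURL -> URL
by_url: Dict[str, str] = {}  # URL -> shortenedURL


def candidate_id(url: str, attempt: int, length: int = 7) -> str:
    # Hypothetical scheme: hex prefix of SHA-256 over the URL plus an attempt counter.
    digest = hashlib.sha256(f"{url}#{attempt}".encode("utf-8")).hexdigest()
    return digest[:length]


def create(url: str, max_attempts: int = 5) -> str:
    # Step 1: reuse the existing shortenedURL if this URL was already posted.
    existing = by_url.get(url)
    if existing is not None:
        return existing

    # Step 2: generate candidates until one is free.
    for attempt in range(max_attempts):
        short = candidate_id(url, attempt)
        if short not in by_id:
            # Step 3: link the shortenedURL to the posted URL.
            by_id[short] = url
            by_url[url] = short
            return short

    raise RuntimeError("could not find a free shortenedURL")
```

Posting the same URL twice returns the same shortenedURL, and a candidate that is already taken by another URL is skipped in favor of the next attempt.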


GET a /shortenedURL

this checks if the shortenedURL still maps to a URL

if yes, return that URL

if no, return a 404 not found
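
A sketch of the lookup flow, assuming that "still a URL" means the entry exists and is younger than 10 days, and that each successful lookup counts as a click. The Entry class, the store dictionary, and passing the current time in as a parameter are illustrative choices.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

TTL_SECONDS = 10 * 24 * 60 * 60  # assumed expiry: exactly 10 days


@dataclass
class Entry:
    url: str
    created_at: float  # seconds since some epoch
    clicks: int = 0


store: Dict[str, Entry] = {}


def resolve(short: str, now: float) -> Tuple[int, Optional[str]]:
    """Return (status, target URL) for a GET on /shortenedURL."""
    entry = store.get(short)
    if entry is None or now - entry.created_at > TTL_SECONDS:
        return 404, None
    entry.clicks += 1  # count the click before redirecting
    return 302, entry.url
```

Passing the time in explicitly keeps the function easy to test; a real handler would read the clock itself.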



Detailed component design

Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...


  • Load balancer will be configured with rate and connection limits, specifically to protect against DDoS attacks.
  • Load balancer will accept traffic destined for our static IP address.
  • Anycast IP address will route clients to their closest regional data center hosting the site.
  • All queries to our domain and its endpoints will get regional responses.
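
The limits on the load balancer are not specified above. One common way to express a per-client request limit is a token bucket, sketched here as an assumption about what such a limit could look like rather than as the configuration of any particular load balancer.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    capacity: float        # maximum burst size
    refill_per_sec: float  # sustained requests per second
    tokens: float          # tokens currently available
    last: float            # time of the previous check, in seconds

    def allow(self, now: float) -> bool:
        """Refill based on elapsed time, then spend one token if available."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

With a capacity of 2 and a refill rate of 1 per second, a client can send two requests at once, is rejected on the third, and is allowed again one second later.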



Trade offs/Tech choices

Explain any trade offs you have made and why you made certain tech choices...


  • choosing to use a MySQL database
  • just one table with two columns: a unique hashedID and a unique URL (a click counter column would also be needed for the click-count requirement)
  • I chose this because it is simple and will get the job done based on the requirements
  • I also chose it because it will allow me to work easily with others as MySQL is very popular
  • Because we are not storing any user accounts, our GDPR exposure is smaller, although URLs and request logs can still contain personal data
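
A sketch of the single-table layout. It uses Python's built-in sqlite3 module as a stand-in so the example runs anywhere; the MySQL DDL would differ in details such as column types. The table and column names are assumptions, and the clicks column is added here only to cover the click-count requirement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for the MySQL database
conn.execute(
    """
    CREATE TABLE urls (
        hashed_id TEXT PRIMARY KEY,           -- the shortenedURL, unique
        url       TEXT NOT NULL UNIQUE,       -- 1-to-1 with hashed_id
        clicks    INTEGER NOT NULL DEFAULT 0  -- assumed column for click counts
    )
    """
)
conn.execute(
    "INSERT INTO urls (hashed_id, url) VALUES (?, ?)",
    ("abc123", "https://example.com"),
)


def insert_is_rejected(hashed_id: str, url: str) -> bool:
    """Return True if the uniqueness constraints refuse the row."""
    try:
        conn.execute(
            "INSERT INTO urls (hashed_id, url) VALUES (?, ?)", (hashed_id, url)
        )
        return False
    except sqlite3.IntegrityError:
        return True
```

The two uniqueness constraints enforce the 1-to-1 mapping in the database itself: a second row with the same hashed ID or the same URL is refused.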


Failure scenarios/bottlenecks

Try to discuss as many failure scenarios/bottlenecks as possible.


  • If two clients submit the same URL at the same time, it can lead to a race condition. This is okay since we plan to always generate the same shortenedURL per URL.
  • Bottlenecks could be regional, depending on traffic.
  • URL Validity
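
The race condition point can be illustrated with several threads shortening the same URL at once. Because the ID is derived only from the URL, every thread computes the same key, and the store ends up with a single entry no matter which writer wins. This is a sketch: the in-memory dictionary and the hex-prefix ID are stand-ins, and in a real database a unique constraint would play the role of setdefault.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from typing import Dict

store: Dict[str, str] = {}  # shortenedURL -> URL


def shorten(url: str) -> str:
    # Deterministic ID: the same URL always produces the same key.
    short = hashlib.sha256(url.encode("utf-8")).hexdigest()[:7]
    # setdefault keeps the first writer's value; later identical writes change nothing.
    store.setdefault(short, url)
    return short


url = "https://example.com/some/long/path"
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(shorten, [url] * 8))
```

All eight calls return the same shortenedURL and only one row is stored, which is why the duplicate submission is harmless in this design.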


Future improvements

What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?

  • editing URLs
  • users
  • custom shortened urls
  • click tracking analytics and trends using timestamps