Requirements


Functional Requirements:


  • Create a short URL for a given long URL.
  • Return the long URL associated with a given short URL.



Non-Functional Requirements:


  • Latency should be under 200 ms.
  • The system should be redundant.
  • The system should have high availability.
  • Redirects should have low latency.
  • The system should scale easily as the user base grows.


API Design


GET api/v1/short_url/full_url - checks whether the full URL exists. If it does, returns the short URL; if it does not, returns an error.


GET api/v1/short_url/short_url - checks whether the short URL exists. If it does, returns the full URL; if it does not, returns an error.


POST api/v1/short_url/full_url - adds the full URL and its shortened value to the DB. If the full URL is already present, returns a "URL already present" error.


PUT api/v1/short_url/full_url - updates the short URL stored for the given full URL.
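

The four operations above can be sketched as a small in-memory model. This is only an illustration of the contract, not the real service: the class name ShortUrlApi, the method names, the status codes (201, 404, 409) and the counter-based placeholder codes are all assumptions, and the update operation is modeled as a replace of the stored short URL.

```python
import itertools


class ShortUrlApi:
    """Hypothetical in-memory model of the four endpoints (no HTTP, no real DB)."""

    def __init__(self):
        self._by_full = {}    # full URL -> short URL
        self._by_short = {}   # short URL -> full URL
        self._ids = itertools.count(1)

    def _new_code(self):
        # Placeholder code generator (assumption); not the hashing scheme
        # used for real ID generation.
        return f"s{next(self._ids)}"

    def get_short(self, full_url):
        # GET by full URL: return the short URL or an error.
        code = self._by_full.get(full_url)
        return (200, code) if code is not None else (404, "full URL not found")

    def get_full(self, short_url):
        # GET by short URL: return the full URL or an error.
        full = self._by_short.get(short_url)
        return (200, full) if full is not None else (404, "short URL not found")

    def create(self, full_url):
        # POST: store a new mapping, or report that the URL already exists.
        if full_url in self._by_full:
            return (409, "URL already present")
        code = self._new_code()
        self._by_full[full_url] = code
        self._by_short[code] = full_url
        return (201, code)

    def update(self, full_url):
        # Update: replace the short URL stored for an existing full URL.
        old = self._by_full.get(full_url)
        if old is None:
            return (404, "full URL not found")
        del self._by_short[old]
        code = self._new_code()
        self._by_full[full_url] = code
        self._by_short[code] = full_url
        return (200, code)
```

Each method returns a (status, body) pair so the error cases described above are visible without an HTTP framework.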


High-Level Design


The user submits a URL to be shortened. The system first authenticates the user and prompts for login if the user is not authenticated. After authentication, the full URL is sent as the request payload through a load balancer, which routes it to the least-used server. Before creating a short URL, the server checks whether the URL already exists: first in the cache and, if there is no cache hit, in the DB. If it is not present in the DB, the server adds the short URL and long URL to the DB, with the long URL as the key. On success, the appropriate status code is returned to the user.


For redirection, the request follows the same path through the load balancer. The app first checks for a cache hit; on a miss, it fetches the full URL from the DB and redirects.


The servers will scale horizontally to distribute the load and provide redundancy. The data store gets its redundancy from replication, with reads served from multiple replicas.
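

The redirect read path (check the cache, fall back to the DB on a miss, then populate the cache) can be sketched as follows. This is a minimal illustration under assumptions: plain dicts stand in for Redis and the data store, and the names RedirectResolver, resolve and db_reads are hypothetical.

```python
class RedirectResolver:
    """Cache-aside lookup for redirects; dicts stand in for Redis and the DB."""

    def __init__(self, db):
        self.db = db        # short URL -> full URL (stand-in for the data store)
        self.cache = {}     # stand-in for the cache
        self.db_reads = 0   # counts how often the DB is hit

    def resolve(self, short_url):
        full_url = self.cache.get(short_url)
        if full_url is not None:
            return full_url                     # cache hit: no DB access
        self.db_reads += 1                      # cache miss: read from the DB
        full_url = self.db.get(short_url)
        if full_url is not None:
            self.cache[short_url] = full_url    # populate the cache for next time
        return full_url                         # None means "unknown short URL"


resolver = RedirectResolver({"abc1234": "https://example.com/some/long/path"})
resolver.resolve("abc1234")   # miss, goes to the DB
resolver.resolve("abc1234")   # hit, served from the cache
print(resolver.db_reads)      # 1
```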



Detailed Component Design


For ID generation, a hashing scheme will produce the short URLs and help avoid collisions when multiple URLs are generated at the same time. If one of the nodes fails, the load balancer will detect it through health checks and redirect traffic to the other nodes until the failed node is back up.
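

One possible way to generate IDs by hashing while handling collisions is sketched below. The specific choices are assumptions, not part of the design above: SHA-256 as the hash, a 7-character base-62 code, and a retry counter mixed into the input when a code is already taken by a different URL. The names short_code and assign_code are hypothetical.

```python
import hashlib

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def base62(n):
    """Encode a non-negative integer using the 62-character alphabet."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, r = divmod(n, 62)
        digits.append(ALPHABET[r])
    return "".join(reversed(digits))


def short_code(long_url, attempt=0, length=7):
    """Derive a fixed-length code from the URL and a retry counter."""
    digest = hashlib.sha256(f"{long_url}#{attempt}".encode()).digest()
    n = int.from_bytes(digest[:8], "big")
    # Pad short encodings, then keep the first `length` characters.
    return base62(n).rjust(length, ALPHABET[0])[:length]


def assign_code(long_url, taken):
    """Pick a code for long_url; `taken` maps short code -> long URL."""
    attempt = 0
    while True:
        code = short_code(long_url, attempt)
        owner = taken.get(code)
        if owner is None:
            taken[code] = long_url   # free code: claim it
            return code
        if owner == long_url:
            return code              # same URL already holds this code
        attempt += 1                 # collision with another URL: rehash
```

In a multi-node deployment the check-and-claim step would need to be atomic in the data store (for example a conditional insert); the dict here only illustrates the retry logic.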


The DB for storing the URLs should be a NoSQL DB, since the data is simple and the workload is read-heavy, so read performance for redirects matters most. NoSQL allows for easy scaling and very high performance. The data will be stored ...


The cache will be Redis, used at the application layer to reduce the number of times the DB is hit. The key will be the short URL: reads will far outnumber writes, so optimizing the performance of the read operation is more important. The cache will use a least-frequently-used (LFU) eviction strategy so that the most-used URLs stay cached. On a cache miss or TTL expiry, the data is read from the DB. For finding ...
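

The eviction and expiry behaviour described above can be illustrated with a small in-process model. This is a sketch under assumptions, not how Redis implements it: the class name LfuTtlCache, the exact-count LFU bookkeeping, and the injectable clock are all hypothetical, and capacity is assumed to be at least 1. In Redis itself, LFU eviction is a server setting (the maxmemory-policy option, for example allkeys-lfu) rather than application code.

```python
import time


class LfuTtlCache:
    """Toy LFU cache with per-entry TTL, keyed by short URL."""

    def __init__(self, capacity, ttl_seconds, clock=time.monotonic):
        self.capacity = capacity   # assumed >= 1
        self.ttl = ttl_seconds
        self.clock = clock         # injectable so expiry can be tested
        self._data = {}            # key -> (value, expires_at)
        self._hits = {}            # key -> number of successful reads

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None                     # miss: caller reads the DB
        value, expires_at = entry
        if self.clock() >= expires_at:
            self._drop(key)                 # TTL expired: treat as a miss
            return None
        self._hits[key] += 1
        return value

    def put(self, key, value):
        if key not in self._data and len(self._data) >= self.capacity:
            # Evict the least frequently read entry to make room.
            victim = min(self._hits, key=self._hits.get)
            self._drop(victim)
        self._data[key] = (value, self.clock() + self.ttl)
        self._hits.setdefault(key, 0)

    def _drop(self, key):
        del self._data[key]
        del self._hits[key]
```

A get that returns None covers both cases named above (cache miss and TTL expiry), so the caller handles them the same way: read from the DB and put the value back.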


The load balancer will use the least-connections method to choose the most suitable server.
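

Least-connections selection, combined with skipping nodes that fail a health check, can be sketched as below. The class and method names are hypothetical, and the health status is set by hand here; a real load balancer would update it from periodic health checks.

```python
class LeastConnectionsBalancer:
    """Pick the healthy server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {s: 0 for s in servers}   # server -> open connections
        self.healthy = set(servers)

    def mark_down(self, server):
        self.healthy.discard(server)            # failed a health check

    def mark_up(self, server):
        if server in self.active:
            self.healthy.add(server)            # node is back up

    def acquire(self):
        candidates = [s for s in self.active if s in self.healthy]
        if not candidates:
            raise RuntimeError("no healthy servers available")
        # Ties go to the server listed first.
        server = min(candidates, key=lambda s: self.active[s])
        self.active[server] += 1
        return server

    def release(self, server):
        if self.active[server] > 0:
            self.active[server] -= 1            # connection closed
```

For example, with servers "a" and "b" both idle, two calls to acquire() return "a" and then "b"; after mark_down("b"), every request goes to "a" until mark_up("b") is called.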