Requirements


Functional Requirements:


  • Create a short URL for a given long URL.
  • Return the long URL associated with a given short URL.
  • Traffic estimate: how many links will the system generate?
  • Let's assume that 100 million links are generated per day.
  • Per second, that is 100 million / 24 / 60 / 60.
  • That gives about 1,160 write operations per second.
  • Assume a read:write ratio of 10:1.
  • So the requirement is about 11,600 reads per second.
  • Let's say the service runs for 10 years and each URL record takes about 100 bytes. The total space is then
  • 100 bytes * 100 million * 365 * 10 = 36.5 TB
  • The short URL can be built from the characters (0-9), (a-z), and (A-Z).
  • So there are 62 possible characters at each position.
  • The length n needed to cover all the URLs must satisfy 62^n >= 100 million * 365 * 10 = 365 billion, so n = 7 (62^6 is about 56.8 billion, 62^7 is about 3.5 trillion).
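
The estimates above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming 365-day years, a 10-year lifetime, and 100 bytes per record; the constant and function names are illustrative only.

```python
# Back-of-the-envelope capacity estimate for the URL shortener.
LINKS_PER_DAY = 100_000_000   # assumed write volume
READ_WRITE_RATIO = 10         # 10 reads for every write
YEARS = 10                    # assumed service lifetime
BYTES_PER_URL = 100           # assumed size of one stored record
ALPHABET_SIZE = 62            # 0-9, a-z, A-Z

writes_per_sec = LINKS_PER_DAY / (24 * 60 * 60)
reads_per_sec = writes_per_sec * READ_WRITE_RATIO
total_links = LINKS_PER_DAY * 365 * YEARS
total_bytes = total_links * BYTES_PER_URL


def min_code_length(total: int, alphabet_size: int) -> int:
    """Smallest n such that alphabet_size ** n >= total."""
    n = 1
    while alphabet_size ** n < total:
        n += 1
    return n


code_length = min_code_length(total_links, ALPHABET_SIZE)

print(f"writes/sec: {writes_per_sec:.0f}")
print(f"reads/sec:  {reads_per_sec:.0f}")
print(f"storage:    {total_bytes / 1e12:.1f} TB")
print(f"code length: {code_length}")
```

The exact write rate is a little under the rounded 1,160 used above; the rounding does not change the conclusions.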


Non-Functional Requirements:


  • List the key non-functional requirements (eg low latency, scalability, reliability, etc.)...
  • The system should have low latency, high availability, scalability, and reliability: a valid short URL must always return its long URL.


API Design

Define the APIs expected from the system. This is your chance to analyze and define the read and write paths so that you can come up with the high-level design...

We can use REST APIs.

GET and POST requests are accepted.

GET (the user enters the short URL and the service fetches the long URL), e.g.:

https://shorterUrl

Type: GET

Response: 301 redirect to the long URL


POST

http://www.amazon.com.hdh/product/t3467

Stores the long URL together with its short URL in the database.
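
The read and write paths can be sketched as two plain handler functions. This is only an illustration: the in-memory `STORE`, the function names, and the 201 and 404 status codes are assumptions, and the short code is passed in because its generation is a separate concern.

```python
from typing import Dict, Tuple

# Hypothetical in-memory store standing in for the database:
# short code -> long URL.
STORE: Dict[str, str] = {}

Response = Tuple[int, Dict[str, str]]  # (status code, headers or body fields)


def handle_post(long_url: str, short_code: str) -> Response:
    """Write path: store the mapping and report the short code."""
    STORE[short_code] = long_url
    return 201, {"short_code": short_code}  # 201 is an assumed choice


def handle_get(short_code: str) -> Response:
    """Read path: answer with a 301 redirect to the long URL."""
    long_url = STORE.get(short_code)
    if long_url is None:
        return 404, {}  # assumed behaviour for unknown codes
    return 301, {"Location": long_url}
```

A 301 tells the browser the redirect is permanent, so it may cache it and skip the service on later visits.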


High-Level Design

Describe the overall system architecture. Identify the main components needed to solve the problem end-to-end. Use the diagramming tool to create a block diagram.

In the basic flow, the client calls the server with the long URL.

The server checks whether the URL is already cached; if so, it returns the short URL to the client.

If it is not cached, the server queries the database to get the short URL.
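
The cache-then-database lookup described here can be sketched as follows. The two dictionaries are hypothetical stand-ins for a real cache and a real database, and the function name is illustrative.

```python
from typing import Dict, Optional

cache: Dict[str, str] = {}     # hypothetical cache: long URL -> short URL
database: Dict[str, str] = {}  # hypothetical database stand-in


def lookup_short_url(long_url: str) -> Optional[str]:
    """Return the short URL for long_url, checking the cache first."""
    short_url = cache.get(long_url)
    if short_url is not None:
        return short_url  # cache hit: no database call needed

    short_url = database.get(long_url)
    if short_url is not None:
        cache[long_url] = short_url  # populate the cache for next time
    return short_url  # None means the URL has not been shortened yet
```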

In the database, each URL is stored as a row with 3 columns:

1) Unique ID

2) Longer URL

3) Shorter URL mapped to it
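
As a rough illustration of this three-column layout, here is a sketch using Python's built-in sqlite3 module. The table name, column names, and helper functions are assumptions, and a production system would likely use a different database.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    """
    CREATE TABLE url_mapping (
        id        INTEGER PRIMARY KEY,   -- 1) unique ID
        long_url  TEXT NOT NULL,         -- 2) longer URL
        short_url TEXT NOT NULL UNIQUE   -- 3) shorter URL mapped to it
    )
    """
)


def save_mapping(uid: int, long_url: str, short_url: str) -> None:
    """Insert one row; the with-block commits the transaction."""
    with conn:
        conn.execute(
            "INSERT INTO url_mapping (id, long_url, short_url) VALUES (?, ?, ?)",
            (uid, long_url, short_url),
        )


def find_long_url(short_url: str) -> Optional[str]:
    """Look up the long URL for a short URL, or None if absent."""
    row = conn.execute(
        "SELECT long_url FROM url_mapping WHERE short_url = ?", (short_url,)
    ).fetchone()
    return row[0] if row else None
```

The UNIQUE constraint on the short URL means the database itself rejects a colliding short URL.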


So we need to hash the long URL to convert it into its short form.





Detailed Component Design

Deep dive into 2-3 key components. Explain how they work, how they scale, discuss tradeoffs, capacity, and any relevant algorithms or data structures.

For hashing we can use different methods.

For example, one way is to hash the long URL with the CRC32 function, take the first 7 characters of the output, and check whether that value already exists in the database. If there is a collision, we append a predefined string to the long URL and hash it again.
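
A minimal sketch of this hash-and-retry approach, using `zlib.crc32` from the standard library. The `SALT` string and the `existing` dictionary (standing in for the database) are assumptions. Because CRC32 is rendered here as 8 hexadecimal characters, the 7-character code uses only 16 distinct symbols, not 62.

```python
import zlib
from typing import Dict

SALT = "#retry"                # hypothetical predefined string
existing: Dict[str, str] = {}  # stand-in for the DB: short code -> long URL


def crc32_hex(text: str) -> str:
    """CRC32 of text as an 8-character, zero-padded hex string."""
    return format(zlib.crc32(text.encode("utf-8")), "08x")


def shorten_with_crc32(long_url: str) -> str:
    """Hash, take 7 characters, and re-hash with SALT on collision."""
    hash_input = long_url
    while True:
        code = crc32_hex(hash_input)[:7]
        owner = existing.get(code)
        if owner is None:
            existing[code] = long_url  # free code: claim it
            return code
        if owner == long_url:
            return code                # this URL was already shortened
        hash_input += SALT             # collision: extend input and retry
```

Each retry costs one more database lookup, which is the main price of this method.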


The other way is Base62 conversion.

It generates a unique short URL using the 62-character alphabet (0-9, a-z, A-Z).

We apply Base62 encoding to the unique ID; since every ID is unique, the encoded short URL is unique too.
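
A minimal sketch of encoding a numeric ID with the 62-character alphabet from the requirements (0-9, a-z, A-Z), which is base 62. The ordering of the alphabet and the function names are assumptions.

```python
import string

# 62 symbols in the assumed order: digits, lowercase, uppercase.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def base62_encode(number: int) -> str:
    """Convert a non-negative integer ID to its base-62 string."""
    if number < 0:
        raise ValueError("ID must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number > 0:
        number, remainder = divmod(number, BASE)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))  # most significant digit first


def base62_decode(code: str) -> int:
    """Convert a base-62 string back to the integer ID."""
    number = 0
    for char in code:
        number = number * BASE + ALPHABET.index(char)
    return number
```

For example, `base62_encode(11157)` returns `"2TX"`, and `base62_decode("2TX")` returns `11157`.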

To generate the unique ID we can use Twitter's Snowflake algorithm, in which the 64 bits of the ID are divided into 4 parts:

1) Sign bit (1 bit), always 0

2) Timestamp (41 bits): milliseconds since a chosen epoch, which can be converted back into the actual date and time

3) Servers (10 bits: 5 for the datacenter ID and 5 for the machine ID)

4) Sequence number (12 bits): every millisecond produces a different timestamp, so this counter is what distinguishes IDs requested within the same millisecond
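
A minimal sketch of a Snowflake-style generator. It assumes the commonly described 64-bit layout of 1 unused sign bit, 41 timestamp bits, 10 server bits, and 12 sequence bits. `EPOCH_MS`, the class name, and the injectable `clock` are illustrative assumptions, not Twitter's actual code.

```python
import threading
import time
from typing import Callable, Optional

EPOCH_MS = 1_600_000_000_000  # hypothetical custom epoch, in milliseconds
SERVER_BITS = 10
SEQUENCE_BITS = 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeGenerator:
    def __init__(self, server_id: int,
                 clock: Optional[Callable[[], int]] = None) -> None:
        if not 0 <= server_id < (1 << SERVER_BITS):
            raise ValueError("server_id does not fit in 10 bits")
        self.server_id = server_id
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self) -> int:
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                # Same millisecond: bump the sequence counter.
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Counter exhausted: wait for the next millisecond.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (((now - EPOCH_MS) << (SERVER_BITS + SEQUENCE_BITS))
                    | (self.server_id << SEQUENCE_BITS)
                    | self._sequence)


def timestamp_ms(snowflake_id: int) -> int:
    """Recover the Unix timestamp in milliseconds from an ID."""
    return (snowflake_id >> (SERVER_BITS + SEQUENCE_BITS)) + EPOCH_MS
```

Because the timestamp occupies the highest bits, IDs from one server sort by creation time.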


The second method never produces collisions, but because the IDs are predictable, the short URLs of other links can be guessed, which is a security concern.


We should use load balancing to distribute calls across servers; because the service is stateless, this is easy to implement.

To protect against DDoS attacks we can use a rate limiter that allows only a certain number of API calls per second per user. We can use an algorithm such as leaky bucket to implement it.
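
A minimal sketch of a per-user leaky bucket, in the "bucket as a meter" variant: each request adds one unit to the user's bucket, the bucket drains at a fixed rate, and a request that would overflow it is rejected. The class name, parameters, and injectable `clock` are assumptions; a real deployment would typically keep the counters in a shared store.

```python
import time
from typing import Callable, Dict, Tuple


class LeakyBucketLimiter:
    def __init__(self, capacity: int, leak_rate_per_sec: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity            # max requests held in the bucket
        self.leak_rate = leak_rate_per_sec  # requests drained per second
        self._clock = clock
        # user ID -> (current level, time of last update)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, user_id: str) -> bool:
        """Return True if this user's request fits in their bucket."""
        now = self._clock()
        level, last = self._buckets.get(user_id, (0.0, now))
        # Drain the bucket for the time elapsed since the last request.
        level = max(0.0, level - (now - last) * self.leak_rate)
        if level + 1 > self.capacity:
            self._buckets[user_id] = (level, now)
            return False  # bucket would overflow: reject
        self._buckets[user_id] = (level + 1, now)
        return True
```

With `capacity=2` and `leak_rate_per_sec=1.0`, a user can send two requests at once and then roughly one per second.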