Requirements


Functional Requirements:


  • Create a short URL for a given long URL.
  • Return the long URL associated with a given short URL.



Non-Functional Requirements:


  1. Availability: The system should be highly available.
  2. Scalability: The system should scale to 100M users and tolerate request spikes.
  3. Latency: Read and write requests should have very low latency.
  4. Consistency:
    1. Two users resolving the same short URL should get the same long URL.
    2. Two users creating a short URL for the same long URL should get the same short URL.
    3. A user who creates a short URL for a long URL and later requests one again for the same long URL should get the same short URL.
  5. Durability:
    1. Once a short URL is created, its data must never be lost.
  6. Reliability:
    1. The system should behave correctly and deliver the functional requirements even during failures, request spikes, and other outages.


API Design

  1. Create Short URL
    1. HTTP Method: POST
    2. Endpoint: /url-shortener
    3. Request: the long URL in the request body
    4. Response: the short URL
    5. Function: get(user_id, longUrl)
  2. Get the Long URL
    1. HTTP Method: GET
    2. Endpoint: /url-extractor
    3. Request: the short URL as a request parameter
    4. Response: the corresponding long URL
    5. Function: String getLongUrl(String shortUrl)
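
The two operations above can be sketched as a small in-memory class. This is only an illustration under stated assumptions: the class name, the method names create_short_url and get_long_url, the short.example domain, and the counter-based code are all hypothetical and not part of the design. The reverse lookup from long URL to short URL is one simple way to satisfy the consistency requirement that the same long URL always yields the same short URL.

```python
from itertools import count


class UrlShortenerApi:
    """In-memory stand-in for the two API operations (illustration only)."""

    def __init__(self, base="https://short.example/"):  # hypothetical domain
        self._base = base
        self._ids = count(1)        # hypothetical ID source, not the real generator
        self._long_by_short = {}    # short URL -> long URL
        self._short_by_long = {}    # long URL -> short URL (for idempotent creates)

    def create_short_url(self, user_id, long_url):
        # user_id is accepted only to mirror the signature in the API design;
        # this sketch does not use it.
        existing = self._short_by_long.get(long_url)
        if existing is not None:
            return existing         # same long URL -> same short URL
        short_url = f"{self._base}{next(self._ids):x}"
        self._long_by_short[short_url] = long_url
        self._short_by_long[long_url] = short_url
        return short_url

    def get_long_url(self, short_url):
        # Returns None when the short URL is unknown.
        return self._long_by_short.get(short_url)
```

In a real deployment the two dictionaries would be replaced by the database and cache described in the following sections.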



High-Level Design

Because the application is both read-heavy and write-heavy, reads and writes are served by separate applications: one set of application servers returns the long URL for a short URL, and a separate set of application servers generates and stores the short URL for a given long URL.

The URL mapping is persisted in the database and, at the same time, added to the cache.

When a read request arrives, the application first looks up the short URL in the cache. On a hit, it returns the long URL to the user; on a miss, it looks up the database and writes the result into the cache before returning it to the user.
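
The read path described above is commonly called cache-aside. A minimal sketch follows, assuming the cache and the database can both be modeled as dict-like objects. ReadService and db_reads are hypothetical names used only for illustration.

```python
class ReadService:
    """Cache-first lookup with a database fallback (illustration only)."""

    def __init__(self, cache, database):
        self.cache = cache          # dict-like stand-in for the cache
        self.database = database    # dict-like stand-in for the database
        self.db_reads = 0           # counts fallbacks, to make cache hits visible

    def get_long_url(self, short_url):
        long_url = self.cache.get(short_url)
        if long_url is not None:
            return long_url                       # cache hit
        self.db_reads += 1
        long_url = self.database.get(short_url)   # cache miss: fall back to the database
        if long_url is not None:
            self.cache[short_url] = long_url      # populate the cache before returning
        return long_url
```

Unknown short URLs are not cached in this sketch, so repeated lookups for a missing key reach the database every time.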

The cache eviction policy will be least frequently used (LFU): the URLs that have been read the fewest times are removed from the cache first.
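
Evicting the entry with the fewest reads can be sketched as follows. This is a deliberately simple version, assuming a small cache: it scans all entries on eviction, which is linear in the cache size, whereas production caches use more efficient structures. LfuCache is a hypothetical name.

```python
class LfuCache:
    """Evicts the entry that has been read the fewest times (illustration only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._values = {}   # key -> value
        self._hits = {}     # key -> number of reads

    def get(self, key):
        if key not in self._values:
            return None
        self._hits[key] += 1
        return self._values[key]

    def put(self, key, value):
        if self.capacity <= 0:
            return
        if key not in self._values and len(self._values) >= self.capacity:
            # Linear scan for the least-read key; ties go to the oldest insertion.
            victim = min(self._hits, key=self._hits.get)
            del self._values[victim]
            del self._hits[victim]
        self._values[key] = value
        self._hits.setdefault(key, 0)
```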




Detailed Component Design

The following are the key components of the system:

  1. API Gateway
    1. The API Gateway routes each request to a read server or a write server based on the endpoint.
    2. It distributes requests across servers using the round-robin technique.
    3. It applies rate limiting per user address: one user cannot send more than 20 requests within 30 seconds.
  2. Read Application Server
    1. Read application servers return the long URL for a given short URL.
    2. This application can scale, and it first looks up the cache for the long URL corresponding to the short URL.
    3. If it finds a match in the cache, it returns it to the user; otherwise, as a fallback, it fetches the long URL from the database and puts it into the cache before returning it to the user.
  3. Write Application Server
    1. The write application is also scalable; each application server instance is assigned a numeric range, which helps maintain uniqueness in URL generation.
    2. Once a short URL is generated, it is persisted in the database.
  4. Cache
    1. Caching is the key component for avoiding database hits on read requests.
    2. The cache stores the short URL as the key and the long URL as the value.
    3. The cache eviction policy is least frequently used (LFU): the short URLs that users read least often are removed from the cache.
  5. Database
    1. The database is partitioned by hash key: the hash of the short URL.
    2. We can have a fixed set of database partitions here.
    3. For a write operation, the application first computes the hash of the short URL to identify the partition and then issues the database query.
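
The write path described above (a per-server numeric range plus hash partitioning) can be sketched as follows. Several details are assumptions rather than part of the design: the base-62 encoding of the numeric ID, the SHA-256 based partition hash, the partition count of 4, and the names WriteServer, encode_base62, and partition_for. Partitions are modeled as plain dictionaries.

```python
import hashlib
import string

# Assumed alphabet for turning a numeric ID into a short code (base 62).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
NUM_PARTITIONS = 4  # assumed size of the fixed partition set


def encode_base62(number):
    """Encode a non-negative integer using the 62-character alphabet."""
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number > 0:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def partition_for(short_code):
    """Pick a partition from a stable hash of the short code."""
    # hashlib is used because Python's built-in hash() of str varies per process.
    digest = hashlib.sha256(short_code.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


class WriteServer:
    """One write server that owns the ID range [range_start, range_end)."""

    def __init__(self, range_start, range_end, partitions):
        self._next_id = range_start
        self._range_end = range_end     # exclusive upper bound
        self._partitions = partitions   # list of dict-like partition stores

    def create(self, long_url):
        if self._next_id >= self._range_end:
            raise RuntimeError("ID range exhausted; a new range must be assigned")
        short_code = encode_base62(self._next_id)
        self._next_id += 1
        # Hash the short code first, then write to the selected partition.
        self._partitions[partition_for(short_code)][short_code] = long_url
        return short_code
```

Because the ranges assigned to different servers do not overlap, two servers can generate codes concurrently without coordinating on each request. How ranges are handed out and renewed is outside the scope of this sketch.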