Requirements


Functional Requirements:


  • Allow users to upload and store text or code snippets.
  • Generate a unique shareable URL for each paste.
  • Enable retrieval of paste content by URL.
  • Support expiration and TTL for pastes.
  • Allow paste owners or the system to delete a paste before its natural expiration.



Non-Functional Requirements:


  1. Availability: The system should be highly available.
  2. Scalability: The system should scale to 100M users and tolerate request spikes.
  3. Latency: Read and write requests should have very low latency.
  4. Consistency:
    1. Two users requesting a URL for the same content should not receive the same URL.
  5. Durability:
    1. Once a URL is created, the data behind it must not be lost until the paste expires or is deleted.
  6. Reliability:
    1. The system should behave correctly and deliver the functional requirements even during failures, request spikes, and other outages.


API Design

  • Upload and Store text/code

HTTP POST: /api/v1/paste

Request

Body

{

"msgBody":""

}

Header

userid=23423

Response

{

"url":""

}

 

  • Get content by URL
    • HTTP GET: /api/v1/paste/{paste-id}
    • Response: returns the paste content body
  • Delete the pasted content
    • HTTP DELETE: /api/v1/paste/{paste-id}
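
The three endpoints above can be sketched as a single dispatch function. This is a minimal, framework-free illustration, not a definitive implementation: the host name, the status codes, the error bodies, and the counter-based ID are all assumptions made for the example, and an in-memory dict stands in for real storage.

```python
import itertools
import json

BASE_URL = "https://paste.example"   # hypothetical host, not part of the design
PREFIX = "/api/v1/paste"

_store = {}                  # paste-id -> content (in-memory stand-in for storage)
_ids = itertools.count(1)    # placeholder ID source, for this sketch only


def handle_request(method, path, headers=None, body=None):
    """Dispatch one request and return (status_code, response_body)."""
    headers = headers or {}

    # POST /api/v1/paste: store the content and return its URL.
    if method == "POST" and path == PREFIX:
        if "userid" not in headers:
            return 400, {"error": "missing userid header"}
        content = json.loads(body)["msgBody"]
        paste_id = str(next(_ids))
        _store[paste_id] = content
        return 201, {"url": f"{BASE_URL}{PREFIX}/{paste_id}"}

    if path.startswith(PREFIX + "/"):
        paste_id = path[len(PREFIX) + 1:]

        # GET /api/v1/paste/{paste-id}: return the paste content body.
        if method == "GET":
            if paste_id not in _store:
                return 404, {"error": "paste not found"}
            return 200, _store[paste_id]

        # DELETE /api/v1/paste/{paste-id}: remove the paste.
        if method == "DELETE":
            if _store.pop(paste_id, None) is None:
                return 404, {"error": "paste not found"}
            return 204, None

    return 405, {"error": "unsupported method or path"}
```

A call such as handle_request("POST", "/api/v1/paste", {"userid": "23423"}, '{"msgBody": "hello"}') returns a 201 status together with a body containing the new paste URL.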




High-Level Design

The execution consists of three major flows:

    1. Creating the paste and generating the unique paste-id
      1. The user sends a request with the required details; the request first hits the API gateway.
      2. The API gateway routes the request to the appropriate service and service instance.
      3. When the request reaches the service, it first creates the unique paste-id from a combination of the user ID and a timestamp.
      4. The paste content is pushed to object storage, and the object storage ID is mapped to the paste-id.
      5. The metadata is persisted in the database.
      6. Once all of the above operations complete successfully, the transaction is committed, and the paste-id and data are pushed to the cache with an expiry time.
      7. The user receives the URL for accessing the created paste.
    2. Retrieving the paste by ID
      1. Any user can access the paste content by URL.
      2. A read request first goes to the API gateway, which routes it to the appropriate service and service instance.
      3. When the request reaches the service, it first looks up the paste-id in the cache.
      4. On a cache hit, the result is returned to the user straight from the cache. On a cache miss, the service looks up the paste-id in the database, reads the object storage ID from the paste metadata, and fetches the paste content from object storage.
      5. Before returning the result of a cache miss, the service puts the paste into the cache.
    3. Deleting the paste by user request or batch processing
      1. To delete an existing paste, the user sends a request with the paste-id.
      2. The entry with the given paste-id is deleted from the database and removed from object storage.
      3. Once the database entry is removed, the corresponding cache entry is invalidated.
      4. A scheduled batch job runs at a configured interval and looks up pastes that have expired.
      5. Expired pastes are deleted from the database and invalidated in the cache.
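
The three flows above can be sketched with in-memory stand-ins for the cache, the database, and object storage. This is a rough illustration under stated assumptions, not the actual service code: the function names, the "obj-" object ID format, and the explicit now parameter (used to keep the example deterministic) are hypothetical. The sweep here also removes the stored object, which is an assumption beyond what the flow states.

```python
import time

cache = {}          # paste-id -> (content, expires_at); stand-in for the cache
database = {}       # paste-id -> {"object_id": ..., "expires_at": ...}
object_store = {}   # object-id -> content; stand-in for object storage


def create_paste(paste_id, content, ttl_seconds, now=None):
    """Flow 1: store the content, persist metadata, then warm the cache."""
    now = time.time() if now is None else now
    object_id = f"obj-{paste_id}"            # hypothetical object ID format
    expires_at = now + ttl_seconds
    object_store[object_id] = content
    database[paste_id] = {"object_id": object_id, "expires_at": expires_at}
    cache[paste_id] = (content, expires_at)


def get_paste(paste_id, now=None):
    """Flow 2: cache first; on a miss, read metadata, fetch, and re-cache."""
    now = time.time() if now is None else now
    hit = cache.get(paste_id)
    if hit is not None and hit[1] > now:
        return hit[0]
    meta = database.get(paste_id)
    if meta is None or meta["expires_at"] <= now:
        return None
    content = object_store[meta["object_id"]]
    cache[paste_id] = (content, meta["expires_at"])
    return content


def delete_paste(paste_id):
    """Flow 3: remove from database and object storage, then invalidate cache."""
    meta = database.pop(paste_id, None)
    if meta is not None:
        object_store.pop(meta["object_id"], None)
    cache.pop(paste_id, None)
    return meta is not None


def sweep_expired(now=None):
    """Batch job body: delete every paste whose expiry time has passed."""
    now = time.time() if now is None else now
    expired = [pid for pid, meta in database.items() if meta["expires_at"] <= now]
    for pid in expired:
        delete_paste(pid)
    return expired
```

Checking the expiry time on the read path as well means a paste that has expired but has not yet been swept is never served.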



Detailed Component Design

  1. API Gateway
    1. Every user request for paste creation or retrieval first hits the API gateway.
    2. The API gateway routes the request to the appropriate service and instance.
    3. It performs load balancing so that traffic is distributed across instances.
    4. This lets the application scale horizontally, with load spread across the added instances.
  2. Caching
    1. Newly created and frequently used pastes are stored in the cache.
    2. Pastes are cached with a Least Frequently Used (LFU) eviction policy.
    3. The cache key is the paste-id, and the value is the paste metadata and details.
    4. A distributed Redis cache manages the cached data.
  3. Object Storage
    1. The paste data is kept in object storage.
    2. The object storage ID is mapped to the paste-id so that the paste details can be tracked.
  4. Paste-creation-service
    1. This service is scalable and serves user requests such as creating a URL for a given paste and deleting a paste.
    2. The user sends a request to create a paste along with its content. The system generates a unique ID for each paste from the combination userId + ddmmyyhhmmss. This scheme avoids duplicate IDs across users, and for the same user as long as that user creates at most one paste per second, since the timestamp has one-second resolution.
    3. Because the paste-id contains the user ID and the content creation date and time, users who create pastes concurrently at the same instant still get different paste-ids: their user IDs are unique even though the creation time is the same.
    4. A hash of the content is generated and persisted. The content hash is used to detect whether the same user has already generated a URL for the same paste content.
    5. The paste content is uploaded to object storage, and its ID is mapped to the paste-id.
    6. The paste metadata is persisted in the relational database, and the paste-id and content are added to the cache with an expiry time.
    7. A user can also delete a paste they created by supplying the paste-id. Once the paste is deleted, its entries are removed from object storage and the cache.
  5. Paste-retrieval-service
    1. This service must be highly scalable because the system is read-heavy: there are far more reads than writes.
    2. It is responsible for serving users' read requests.
    3. Multiple instances of the application run to achieve horizontal scaling.
    4. A user request first goes through the API gateway, which distributes the load and calls an appropriate instance of the application.
    5. The service first looks up the given paste-id in the cache. On a hit, it returns the paste details from there, which reduces latency.
    6. On a cache miss, it reads from the database, adds the paste details to the cache, and then returns them to the user.
  6. Batch-processor-service
    1. The batch processor service has a job scheduler that triggers at a configured interval.
    2. It looks up pastes that have expired and deletes them.
    3. Once a paste is deleted from the database, it is invalidated at the cache layer as well.
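
The paste-id scheme and content hash described under Paste-creation-service can be sketched as follows. This is only an illustration under assumptions: SHA-256 is an assumed hash choice, the function names and the in-memory seen map are hypothetical, and returning the existing paste-id when the same user resubmits identical content is one possible reading of how the hash is used.

```python
import hashlib
from datetime import datetime, timezone


def make_paste_id(user_id, created_at):
    """Build userId + ddmmyyhhmmss from a user ID and a creation time."""
    return f"{user_id}{created_at.strftime('%d%m%y%H%M%S')}"


def content_hash(content):
    """Hex SHA-256 digest of the paste content (hash choice is an assumption)."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


seen = {}   # (user_id, content hash) -> paste-id; in-memory stand-in


def create_or_reuse(user_id, content, created_at):
    """Return the existing paste-id if this user already pasted this content."""
    key = (user_id, content_hash(content))
    if key in seen:
        return seen[key]
    paste_id = make_paste_id(user_id, created_at)
    seen[key] = paste_id
    return paste_id


created = datetime(2024, 3, 5, 14, 7, 9, tzinfo=timezone.utc)
print(make_paste_id(23423, created))   # prints 23423050324140709
```

Because the timestamp component has one-second resolution, two calls to make_paste_id for the same user within the same second yield the same ID; a per-user sequence number or a random suffix would be one way to cover that case.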