Requirements


Functional Requirements:


  • the user is able to create and save a text
  • the user is able to share a text
  • the user is able to set the time to live (TTL) of a text
  • the user is able to delete a text
  • the system assigns a unique id to each text
  • the user is able to retrieve the unique URL specific to their text



Non-Functional Requirements:


  • availability: the system has 99.9% uptime
  • scalability: the system is able to handle 100,000 users at peak traffic (100,000 req/s)
  • reliability: a stored text remains unchanged until it expires or is deleted
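
To make the availability target concrete, the short calculation below converts 99.9% uptime into an allowed downtime budget. The function name and the choice of a 365-day year and a 30-day month are assumptions made for illustration.

```python
def downtime_budget_hours(availability_percent: float, period_hours: float) -> float:
    """Return the hours of downtime allowed in a period at the given availability."""
    return (1 - availability_percent / 100) * period_hours


HOURS_PER_YEAR = 365 * 24      # 8760, ignoring leap years
HOURS_PER_30_DAYS = 30 * 24    # 720

yearly_hours = downtime_budget_hours(99.9, HOURS_PER_YEAR)               # about 8.76 hours
monthly_minutes = downtime_budget_hours(99.9, HOURS_PER_30_DAYS) * 60    # about 43.2 minutes
```

In other words, 99.9% uptime leaves room for roughly 8.76 hours of downtime per year, or about 43 minutes per 30-day month.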



API Design


The APIs are exposed through an API gateway, which applies a rate limiting algorithm keyed on the source IP.
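
The design does not name a specific rate limiting algorithm. As one possible illustration, the sketch below uses a token bucket per source IP. The class name, the capacity and refill values, and the in-memory storage are all assumptions; a real gateway would typically provide this as a managed feature.

```python
import time


class IpRateLimiter:
    """Token bucket per source IP. In-memory only, for illustration."""

    def __init__(self, capacity, refill_per_sec, clock=time.monotonic):
        self.capacity = capacity              # maximum burst per IP (assumed value)
        self.refill_per_sec = refill_per_sec  # sustained requests/second per IP (assumed value)
        self.clock = clock                    # injectable clock, so the sketch is testable
        self._buckets = {}                    # source_ip -> (tokens, last_seen)

    def allow(self, source_ip):
        """Return True if the request may proceed, False if it should be rejected."""
        now = self.clock()
        # A new IP starts with a full bucket.
        tokens, last_seen = self._buckets.get(source_ip, (self.capacity, now))
        # Refill in proportion to the elapsed time, capped at the bucket capacity.
        tokens = min(self.capacity, tokens + (now - last_seen) * self.refill_per_sec)
        allowed = tokens >= 1
        if allowed:
            tokens -= 1
        self._buckets[source_ip] = (tokens, now)
        return allowed
```

A rejected request would normally be answered with HTTP status 429 (Too Many Requests).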


PATH: /paste

REST: POST

BODY:

text: String: the content of each paste

RESPONSE:

Success:

status_code: 200

paste_id: id

time_to_live: timestamp



PATH: /uuid

REST: POST

RESPONSE:

Success:

status_code: 200

uuid: str: an 8-character string of random alphanumeric characters
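
A minimal sketch of how such an id could be generated, assuming Python's standard `secrets` module as the source of randomness. The function names, the retry limit, and the `is_taken` callback (a stand-in for a lookup against the id store) are hypothetical.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # 62 alphanumeric characters
ID_LENGTH = 8                                    # gives 62**8, about 2.18e14, possible ids


def generate_paste_id(length: int = ID_LENGTH) -> str:
    """Return a random alphanumeric id of the given length."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def generate_unique_paste_id(is_taken, max_attempts: int = 5) -> str:
    """Retry on collision; is_taken(candidate) reports whether an id is already used."""
    for _ in range(max_attempts):
        candidate = generate_paste_id()
        if not is_taken(candidate):
            return candidate
    raise RuntimeError("could not find a free paste id")
```

Random ids can collide, so the uniqueness check (or a unique constraint in the database) is still needed even though the id space is large.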


PATH: /paste/${id}

REST: GET

RESPONSE:

Success:

status_code: 200

text: str


PATH: /paste

REST: DELETE

RESPONSE:

Success:

status_code: 200







High-Level Design




Detailed Component Design


Caching:

We use a Redis cluster as the cache.

To invalidate cache entries we combine a time to live (TTL), which expires items after a set period, with a least recently used (LRU) eviction policy, which removes the item that has gone unused the longest.
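
The interaction of the two policies can be sketched with a small in-memory model. This is not Redis code: Redis provides per-key expiry and an LRU `maxmemory-policy` itself. The class name `TtlLruCache` and its parameters are assumptions made for illustration.

```python
import time
from collections import OrderedDict


class TtlLruCache:
    """In-memory model of a cache with per-item TTL and LRU eviction."""

    def __init__(self, max_items, clock=time.monotonic):
        self.max_items = max_items
        self.clock = clock
        self._items = OrderedDict()  # key -> (value, expires_at), oldest use first

    def set(self, key, value, ttl_seconds):
        self._items[key] = (value, self.clock() + ttl_seconds)
        self._items.move_to_end(key)             # mark as most recently used
        while len(self._items) > self.max_items:
            self._items.popitem(last=False)      # evict the least recently used item

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:           # TTL elapsed: treat as a miss
            del self._items[key]
            return None
        self._items.move_to_end(key)             # a read counts as a use
        return value
```

An item therefore leaves the cache either because its TTL has elapsed or because it was the least recently used entry when space ran out, whichever happens first.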

The cache can also benefit from a write-back strategy, which improves its read/write performance. If a paste is dropped before it reaches the database, the loss is not critical, and we can always send an apology letter.
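
A minimal sketch of the write-back idea, assuming a plain dictionary as a stand-in for the database. The class and method names are hypothetical. Writes are acknowledged once they reach the cache and are persisted later by `flush`, which is why a crash before the flush can lose a paste.

```python
class WriteBackCache:
    """Writes land in the cache first and reach the backing store on flush."""

    def __init__(self, backing_store):
        self.backing_store = backing_store  # stand-in for the database (a dict here)
        self.cache = {}
        self.dirty = set()                  # keys written but not yet persisted

    def write(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)                 # persisted later, not on the write path

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        value = self.backing_store.get(key)
        if value is not None:
            self.cache[key] = value         # populate the cache on a miss
        return value

    def flush(self):
        """Persist all dirty entries; return how many were written."""
        flushed = 0
        for key in list(self.dirty):
            self.backing_store[key] = self.cache[key]
            self.dirty.discard(key)
            flushed += 1
        return flushed
```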


Clean up:

We schedule a daily cron job on a serverless service such as Lambda that scans the database for expired pastes and deletes them. The cache needs no separate sweep because its entries already carry their own TTL. On reads, we also check whether the retrieved paste has expired and, if so, refuse to return it.
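
The two checks can be sketched as follows, using SQLite from the Python standard library as a stand-in for the real database. The `pastes` table, its columns, and the function names are assumptions made for illustration.

```python
import sqlite3

# Hypothetical table: expires_at is a Unix timestamp in seconds.
SCHEMA = """
CREATE TABLE IF NOT EXISTS pastes (
    paste_id   TEXT PRIMARY KEY,
    expires_at REAL NOT NULL
)
"""


def is_expired(expires_at: float, now: float) -> bool:
    """Read-time guard: an expired paste is never returned, even if still stored."""
    return expires_at <= now


def purge_expired(conn: sqlite3.Connection, now: float) -> int:
    """Body of the daily job: delete expired rows and return how many were removed."""
    cursor = conn.execute("DELETE FROM pastes WHERE expires_at <= ?", (now,))
    conn.commit()
    return cursor.rowcount
```

Because the job runs only once a day, an expired paste can remain in the database for up to a day, which is why the read-time guard is still needed.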


S3 bucket:

The content of a paste has to be stored somewhere, and its size is unbounded. Storing it directly in PostgreSQL is therefore unsuitable, as large text values would hurt database performance. Instead, we store the text in an S3 bucket and use PostgreSQL to map each paste id to the URI of its content.
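
The split between content and index can be sketched with local stand-ins: an in-memory class in place of the S3 bucket and SQLite in place of PostgreSQL. The table name `paste_index`, the `pastes/<id>` key layout, and all function names are assumptions made for illustration.

```python
import sqlite3


class InMemoryBlobStore:
    """Stand-in for an S3 bucket: maps an object URI to its bytes."""

    def __init__(self, bucket: str):
        self.bucket = bucket
        self._objects = {}

    def put(self, key: str, data: bytes) -> str:
        uri = f"s3://{self.bucket}/{key}"
        self._objects[uri] = data
        return uri

    def get(self, uri: str) -> bytes:
        return self._objects[uri]


def create_schema(conn):
    # The database row holds only the id and the URI, never the text itself.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS paste_index ("
        "paste_id TEXT PRIMARY KEY, content_uri TEXT NOT NULL)"
    )


def save_paste(conn, blobs, paste_id: str, text: str) -> str:
    """Store the content in the blob store, then record id -> URI in the database."""
    uri = blobs.put(f"pastes/{paste_id}", text.encode("utf-8"))
    conn.execute(
        "INSERT INTO paste_index (paste_id, content_uri) VALUES (?, ?)",
        (paste_id, uri),
    )
    conn.commit()
    return uri


def load_paste(conn, blobs, paste_id: str):
    """Look up the URI by id, then fetch the content; None if the id is unknown."""
    row = conn.execute(
        "SELECT content_uri FROM paste_index WHERE paste_id = ?", (paste_id,)
    ).fetchone()
    if row is None:
        return None
    return blobs.get(row[0]).decode("utf-8")
```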


Idempotency:

To implement idempotency, every operation in the system needs a unique id. We also need to keep track of which unique ids are being worked on or have completed, so that if an id reappears we know the operation has already been handled and can skip it.
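
A minimal in-memory sketch of this bookkeeping, assuming a single process. The class name `IdempotentExecutor` and the choice to reject an id that is still in progress are assumptions; a production system would keep this state in a shared store.

```python
class IdempotentExecutor:
    """Runs each operation id at most once and replays the stored result on repeats."""

    def __init__(self):
        self._results = {}         # operation_id -> result of a completed operation
        self._in_progress = set()  # ids currently being worked on

    def run(self, operation_id, operation):
        if operation_id in self._results:
            # The id has been seen and completed: skip the work, return the old result.
            return self._results[operation_id]
        if operation_id in self._in_progress:
            raise RuntimeError(f"operation {operation_id} is already in progress")
        self._in_progress.add(operation_id)
        try:
            result = operation()
            self._results[operation_id] = result
            return result
        finally:
            self._in_progress.discard(operation_id)
```

With this in place, a client that retries a request with the same operation id gets the original result back instead of creating a second paste.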