Requirements
Functional Requirements:
- the user is able to create and save text
- the user is able to share text
- the user is able to set a time to live for the text
- the user is able to delete the text
- the system assigns a unique id to each text
- the user is able to retrieve their text through its unique URL
Non-Functional Requirements:
- availability: the system has 99.9% uptime
- scalability: the system is able to handle 100,000 users at peak traffic (100,000 req/s)
- reliability: a stored text remains unchanged until it expires or is deleted
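As a quick check of what the availability target allows, the downtime budget can be computed directly. This is plain arithmetic; the 365-day year and 30-day month are assumptions made for the example.

```python
# Downtime budget implied by an availability target (pure arithmetic).
def allowed_downtime_seconds(availability: float, period_seconds: float) -> float:
    """Return how many seconds of downtime fit in the period."""
    return (1.0 - availability) * period_seconds


YEAR_SECONDS = 365 * 24 * 3600      # assumption: 365-day year
MONTH_SECONDS = 30 * 24 * 3600      # assumption: 30-day month

yearly_hours = allowed_downtime_seconds(0.999, YEAR_SECONDS) / 3600
monthly_minutes = allowed_downtime_seconds(0.999, MONTH_SECONDS) / 60

print(f"per year:  {yearly_hours:.2f} hours")
print(f"per month: {monthly_minutes:.1f} minutes")
```

Under these assumptions 99.9% uptime leaves roughly 8.76 hours of downtime per year, or about 43 minutes per month.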
API Design
The APIs are exposed through an API gateway, which applies rate limiting per source IP.
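The notes do not name a rate limiting algorithm, so the following is only a minimal sketch that assumes a token bucket keyed by source IP. The class name `IpRateLimiter` and its parameters are hypothetical, and a real gateway would normally provide this as configuration rather than application code.

```python
import time
from typing import Callable, Dict, Tuple


class IpRateLimiter:
    """Token bucket per source IP: each request spends one token."""

    def __init__(self, capacity: int, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        # source IP -> (tokens remaining, time of last update)
        self.buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, source_ip: str) -> bool:
        now = self.clock()
        # A new IP starts with a full bucket.
        tokens, last = self.buckets.get(source_ip, (float(self.capacity), now))
        # Refill in proportion to elapsed time, never above capacity.
        tokens = min(float(self.capacity),
                     tokens + (now - last) * self.refill_per_second)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self.buckets[source_ip] = (tokens, now)
        return allowed


limiter = IpRateLimiter(capacity=2, refill_per_second=1.0)
print(limiter.allow("203.0.113.7"))  # first request from this IP is allowed
```

The bucket capacity sets the burst size and the refill rate sets the sustained request rate per IP.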
PATH: /paste
REST: POST
BODY:
text: String: the content of each paste
RESPONSE:
Success:
status_code: 200
paste_id: id
time_to_live: timestamp
PATH: /uuid
REST: POST
RESPONSE:
Success:
status_code: 200
uuid: str: a string of 8 random alphanumeric characters
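One way the id behind /uuid could be generated is sketched below. It assumes the alphabet is upper and lower case ASCII letters plus digits. The function names and the `is_taken` callback (a stand-in for a lookup in the paste table) are hypothetical.

```python
import secrets
import string
from typing import Callable

ALPHABET = string.ascii_letters + string.digits  # 62 symbols
ID_LENGTH = 8  # 62**8 is about 2.18e14 possible ids


def generate_paste_id() -> str:
    """Return 8 random alphanumeric characters."""
    return "".join(secrets.choice(ALPHABET) for _ in range(ID_LENGTH))


def generate_unique_paste_id(is_taken: Callable[[str], bool],
                             max_attempts: int = 5) -> str:
    """Retry on collision; is_taken is a hypothetical existence check."""
    for _ in range(max_attempts):
        candidate = generate_paste_id()
        if not is_taken(candidate):
            return candidate
    raise RuntimeError("could not find a free paste id")


existing_ids = set()
new_id = generate_unique_paste_id(lambda candidate: candidate in existing_ids)
existing_ids.add(new_id)
print(new_id)
```

Random ids can collide, so the uniqueness check (or a unique constraint in the database) is what actually guarantees one id per text.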
PATH: /paste/${id}
REST: GET
RESPONSE:
Success:
status_code: 200
text: str
PATH: /paste/${id}
REST: DELETE
RESPONSE:
Success:
status_code: 200
High-Level Design
Detailed Component Design
Caching:
We use a Redis cluster as our cache.
To invalidate cache entries we combine a time to live (TTL), which expires items after a fixed period, with a least recently used (LRU) eviction policy, which removes the item that has gone unused the longest.
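To make the combination of TTL and LRU concrete, here is a minimal in-process model of the behaviour. It is not Redis code: in Redis the same effect would typically come from per-key expiry plus a `maxmemory-policy` such as `allkeys-lru`. The class `TtlLruCache` is hypothetical.

```python
import time
from collections import OrderedDict
from typing import Callable, Optional, Tuple


class TtlLruCache:
    """Entries expire after their TTL; when full, the LRU entry is evicted."""

    def __init__(self, max_items: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_items = max_items
        self.clock = clock
        # key -> (value, expiry time); order runs from least to most recent.
        self.items: "OrderedDict[str, Tuple[str, float]]" = OrderedDict()

    def put(self, key: str, value: str, ttl_seconds: float) -> None:
        self.items[key] = (value, self.clock() + ttl_seconds)
        self.items.move_to_end(key)
        while len(self.items) > self.max_items:
            self.items.popitem(last=False)  # evict least recently used

    def get(self, key: str) -> Optional[str]:
        entry = self.items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.items[key]  # expired: drop it and report a miss
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return value


cache = TtlLruCache(max_items=2)
cache.put("abc12345", "hello", ttl_seconds=60)
print(cache.get("abc12345"))
```

TTL bounds how stale an entry can get, while LRU bounds memory use by dropping whatever has not been read recently.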
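The cache can also benefit from a write-back strategy, which improves write performance because writes are acknowledged from the cache and flushed to the database later. The trade-off is that a paste can be dropped if the cache fails before the flush. For this system that is not critical, and we can always send an apology letter.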
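The write-back idea can be sketched as follows, assuming a plain dictionary stands in for the database and `flush` is called periodically by some background task. The class name and methods are hypothetical.

```python
from typing import Dict, Optional, Set


class WriteBackCache:
    """Writes land in the cache first and reach the store only on flush."""

    def __init__(self, backing_store: Dict[str, str]) -> None:
        self.backing_store = backing_store  # stand-in for the database
        self.cache: Dict[str, str] = {}
        self.dirty: Set[str] = set()        # keys not yet persisted

    def write(self, key: str, value: str) -> None:
        self.cache[key] = value
        self.dirty.add(key)

    def read(self, key: str) -> Optional[str]:
        if key in self.cache:
            return self.cache[key]
        value = self.backing_store.get(key)
        if value is not None:
            self.cache[key] = value  # populate the cache on a miss
        return value

    def flush(self) -> int:
        """Persist all dirty entries; return how many were written."""
        flushed = 0
        for key in sorted(self.dirty):
            self.backing_store[key] = self.cache[key]
            flushed += 1
        self.dirty.clear()
        return flushed


database: Dict[str, str] = {}
cache = WriteBackCache(database)
cache.write("abc12345", "hello")
print("abc12345" in database)  # not persisted until flush() runs
cache.flush()
print(database["abc12345"])
```

Anything still in `dirty` when the cache is lost never reaches the database, which is exactly the risk the paragraph above accepts.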
Clean up:
We schedule a daily cron job on a serverless service such as Lambda that checks the database for expired pastes and deletes them. The cache needs no separate sweep because its entries already carry their own TTL. On reads, we also check whether the retrieved paste has expired and, if so, refuse to return it.
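Both halves of the clean-up, the daily sweep and the read-time check, can be sketched as below. The example uses an in-memory SQLite database purely as a stand-in for PostgreSQL so that it runs anywhere. The table layout (`pastes` with `id` and `expires_at`) and the function names are assumptions.

```python
import sqlite3
import time


def create_schema(conn: sqlite3.Connection) -> None:
    # Assumed layout: one row per paste with its expiry as a Unix timestamp.
    conn.execute(
        "CREATE TABLE pastes (id TEXT PRIMARY KEY, expires_at REAL NOT NULL)"
    )


def delete_expired(conn: sqlite3.Connection, now: float) -> int:
    """Body of the daily job: remove expired rows, return how many."""
    cursor = conn.execute("DELETE FROM pastes WHERE expires_at <= ?", (now,))
    conn.commit()
    return cursor.rowcount


def is_paste_live(conn: sqlite3.Connection, paste_id: str, now: float) -> bool:
    """Read-time check: an expired paste is treated as if it did not exist."""
    row = conn.execute(
        "SELECT expires_at FROM pastes WHERE id = ?", (paste_id,)
    ).fetchone()
    return row is not None and row[0] > now


conn = sqlite3.connect(":memory:")
create_schema(conn)
now = time.time()
conn.execute("INSERT INTO pastes VALUES (?, ?)", ("old00001", now - 60))
conn.execute("INSERT INTO pastes VALUES (?, ?)", ("new00001", now + 3600))
print(is_paste_live(conn, "old00001", now))  # expired but not yet swept
print(delete_expired(conn, now))             # number of rows removed
```

The read-time check matters because a paste can expire up to a day before the sweep removes it.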
S3 bucket:
We have to store the content of each paste somewhere, and its size is unbounded. Storing the content directly in PostgreSQL is therefore unsuitable, because large text values would hurt database performance. Instead, we store the text in an S3 bucket and use PostgreSQL to map each paste id to the URI of its content.
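The id-to-URI mapping could look roughly like this. To keep the sketch runnable without cloud credentials, an in-memory class stands in for the S3 bucket and SQLite stands in for PostgreSQL. The bucket name `paste-content`, the key layout `pastes/<id>`, and all function names are hypothetical.

```python
import sqlite3
from typing import Dict, Optional

BUCKET = "paste-content"            # hypothetical bucket name
URI_PREFIX = f"s3://{BUCKET}/"


class InMemoryObjectStore:
    """Stand-in for an S3 bucket: maps object keys to bytes."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def put(self, key: str, body: bytes) -> None:
        self.objects[key] = body

    def get(self, key: str) -> bytes:
        return self.objects[key]


def create_schema(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE pastes (id TEXT PRIMARY KEY, content_uri TEXT NOT NULL)"
    )


def save_paste(conn: sqlite3.Connection, store: InMemoryObjectStore,
               paste_id: str, text: str) -> str:
    """Write the content to the object store, then record only its URI."""
    key = f"pastes/{paste_id}"
    store.put(key, text.encode("utf-8"))
    uri = URI_PREFIX + key
    conn.execute("INSERT INTO pastes (id, content_uri) VALUES (?, ?)",
                 (paste_id, uri))
    conn.commit()
    return uri


def load_paste(conn: sqlite3.Connection, store: InMemoryObjectStore,
               paste_id: str) -> Optional[str]:
    """Look up the URI by id, then fetch the content it points to."""
    row = conn.execute("SELECT content_uri FROM pastes WHERE id = ?",
                       (paste_id,)).fetchone()
    if row is None:
        return None
    key = row[0][len(URI_PREFIX):]
    return store.get(key).decode("utf-8")


conn = sqlite3.connect(":memory:")
create_schema(conn)
store = InMemoryObjectStore()
print(save_paste(conn, store, "abc12345", "hello world"))
print(load_paste(conn, store, "abc12345"))
```

The database row stays small and fixed in size no matter how large the paste is, which is the point of the split.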
Idempotency:
To implement idempotency, every operation in the system carries a unique id. We keep track of which ids are being worked on or have already completed. If an id reappears, we know the operation has already been handled and skip it.
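A minimal single-process sketch of this bookkeeping follows. It assumes the caller supplies the operation id, and it keeps state in memory, whereas a real deployment would need a shared store so that all servers see the same ids. The class `IdempotentExecutor` is hypothetical.

```python
from typing import Any, Callable, Dict, Set


class IdempotentExecutor:
    """Runs each operation id at most once and replays the stored result."""

    def __init__(self) -> None:
        self.in_progress: Set[str] = set()
        self.completed: Dict[str, Any] = {}

    def run(self, operation_id: str, operation: Callable[[], Any]) -> Any:
        if operation_id in self.completed:
            # Seen before: skip the work and return the earlier result.
            return self.completed[operation_id]
        if operation_id in self.in_progress:
            raise RuntimeError(f"operation {operation_id} is already running")
        self.in_progress.add(operation_id)
        try:
            result = operation()
            self.completed[operation_id] = result
            return result
        finally:
            self.in_progress.discard(operation_id)


created = []


def create_paste() -> str:
    created.append("paste")
    return "abc12345"


executor = IdempotentExecutor()
executor.run("req-1", create_paste)
executor.run("req-1", create_paste)  # retry with the same id
print(len(created))                  # the paste was created only once
```

Returning the stored result on a repeat, rather than an error, lets a client safely retry after a timeout.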