System requirements


Functional:

  1. Allow users to input a large text and store it in the system.
  2. Generate a unique URL for each text input that users can access (the user can also name the text).
  3. Automatically delete the stored text after 30 days.
  4. Support retrieving the stored input text by the generated URL.
  5. Allow users to delete the paste they have stored.


Non-Functional:

  1. The system should handle a large number of concurrent users and store a large volume of text data efficiently.
  2. Text storage and retrieval should have low latency to provide a seamless user experience: P99 latency should be under 300 ms.
  3. The system should be highly available.
  4. The generated pastebin URL should be hard to guess, to prevent unauthorized access to the pasted content.
  5. Once content has been stored, it must not be lost unless it is deleted or has expired, and content delivery should be reliable.


API design


// The API to post the text to store

POST v1/text/

paste: string // the text to store

language: enum

response: 201 created (textId)


GET v1/text/{textId}

response: 200 -

paste: string // text to return

language: enum

404 - paste not found

400 - bad request


DELETE v1/text/{textId}

response: 204

404 - The content doesn't exist

403 - The user does not have permission to delete the content
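
The three endpoints can be sketched as plain functions that return a status code and a body. This is only an illustration, not a definitive implementation: the PasteService class, its method names, the in-memory dict standing in for the database, the owner_id argument standing in for authentication, and the use of 403 for a non-owner delete are all assumptions of the sketch.

```python
import secrets
from dataclasses import dataclass


@dataclass
class Paste:
    paste: str
    language: str
    owner_id: str


class PasteService:
    """In-memory stand-in for the server; a dict plays the role of the database."""

    def __init__(self):
        self._store = {}

    def post_text(self, owner_id, paste, language):
        # POST v1/text/ -> 201 with the new textId, or 400 for an empty paste.
        if not paste:
            return 400, {"error": "bad request"}
        text_id = secrets.token_urlsafe(8)  # placeholder id scheme for the sketch
        self._store[text_id] = Paste(paste, language, owner_id)
        return 201, {"textId": text_id}

    def get_text(self, text_id):
        # GET v1/text/{textId} -> 200 with the paste, or 404 if unknown.
        item = self._store.get(text_id)
        if item is None:
            return 404, {"error": "paste not found"}
        return 200, {"paste": item.paste, "language": item.language}

    def delete_text(self, owner_id, text_id):
        # DELETE v1/text/{textId} -> 204, 404 if unknown, 403 if not the owner.
        item = self._store.get(text_id)
        if item is None:
            return 404, {"error": "content does not exist"}
        if item.owner_id != owner_id:
            return 403, {"error": "no permission to delete"}
        del self._store[text_id]
        return 204, None
```

Checking for existence before checking ownership mirrors the order of the error codes listed above; a stricter design might return 404 in both cases so that a non-owner cannot probe which ids exist.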


High-level design


The system consists of a client, a server, and a database, which together serve the basic requests. To remove expired texts from the database, a separate cron job runs daily.
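
A minimal sketch of what the daily cleanup job might run. It assumes the Text table lives in SQLite and that expireTime is stored as epoch seconds; both are assumptions made for illustration, and purge_expired is a hypothetical name.

```python
import sqlite3
import time


def purge_expired(conn, now=None):
    """Delete every row whose expireTime has passed; return how many were removed."""
    now = time.time() if now is None else now
    with conn:  # commits on success, rolls back if the statement fails
        cur = conn.execute("DELETE FROM Text WHERE expireTime <= ?", (now,))
    return cur.rowcount
```

Because the job runs only once a day, a paste can remain in the database for up to a day after it expires, so the read path would probably also need to compare expireTime with the current time before returning a paste.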



Detailed component design


  1. The server. We need a good algorithm to generate the URL of each text. Using the characters [0-9], [a-z], and [A-Z] gives an alphabet of 62 characters. Initially the URL key should be at least 3 characters long, which allows 62^3 = 238,328 distinct keys; as more users join and use the service, we can move to longer keys. Longer keys also make URLs harder to guess, which non-functional requirement 4 asks for. The key should be a random string; if it collides with an existing key, generate a new one.
  2. Database. Initially the database is small enough to fit on a single machine. To maintain high availability, however, we should replicate it. We can start with one replica in a master-slave configuration: all writes go to the master, which synchronizes its data to the slaves, and the slaves can serve reads.
  3. Once the user base and the database have grown large enough, the database should be sharded. Because the number of users is expected to keep growing, consistent hashing should be used to distribute the stored content across shards.
  4. Similarly, the server should be replicated to serve all requests, and more machines should be added to the server layer as more users join the platform.
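
The key generation described in item 1 can be sketched as follows. The function names, the default key length of 8, the retry limit, and the in-memory set used as the collision check are assumptions of this sketch; in a real deployment the check would more likely be a unique constraint in the database, so that checking and inserting happen atomically.

```python
import secrets
import string

# [0-9][a-z][A-Z]: 62 characters in total.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def keyspace(length):
    """Number of distinct keys of the given length."""
    return len(ALPHABET) ** length


def random_key(length):
    """Draw a key from a cryptographically strong random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def generate_unique_key(existing, length=8, max_attempts=10):
    """Generate a random key, regenerating it whenever it collides with a known key."""
    for _ in range(max_attempts):
        key = random_key(length)
        if key not in existing:
            existing.add(key)
            return key
    raise RuntimeError("too many collisions; consider a longer key")
```

The secrets module is used instead of random because the URLs are meant to be hard to guess. Repeated collisions are a signal that the key space is filling up and the key length should grow.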

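The consistent hashing mentioned for sharding can be sketched as a hash ring with virtual nodes. This is a simplified illustration under stated assumptions: the HashRing class, the shard names, the choice of SHA-256 for ring positions, and the count of 100 virtual nodes per shard are all hypothetical.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring; each shard owns several virtual nodes on the ring."""

    def __init__(self, nodes=(), vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (position, node) pairs
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _position(label):
        # Map a string to a 64-bit position on the ring.
        digest = hashlib.sha256(label.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add_node(self, node):
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._position(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def node_for(self, key):
        """Return the shard owning the first virtual node at or after the key."""
        if not self._ring:
            raise LookupError("ring is empty")
        # (pos,) sorts before every (pos, node), so this finds the first entry >= pos.
        idx = bisect.bisect_left(self._ring, (self._position(key),))
        return self._ring[idx % len(self._ring)][1]  # wrap around the ring
```

The property that matters for a growing system is that adding a shard only moves the keys that now belong to the new shard; every other key stays where it was, unlike hash(key) % N, where most keys move when N changes.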



Database design


Table: Text

id (varchar): ID of the text

ownerId (varchar): customerId of the creator of the text

textToStore (varchar): the text to store

uri: URI of the text

insertTime (DateTime): when the text was stored

expireTime (DateTime): when the text expires (30 days after insertTime)
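
For illustration, the table could be created in SQLite as below. The column types, the uniqueness constraint on uri, the index on expireTime, the storage of times as epoch seconds, and the helper names are assumptions of this sketch rather than part of the design above.

```python
import sqlite3
import time

THIRTY_DAYS = 30 * 24 * 60 * 60  # retention period in seconds

SCHEMA = """
CREATE TABLE IF NOT EXISTS Text (
    id          TEXT PRIMARY KEY,
    ownerId     TEXT NOT NULL,
    textToStore TEXT NOT NULL,
    uri         TEXT NOT NULL UNIQUE,
    insertTime  INTEGER NOT NULL,  -- epoch seconds
    expireTime  INTEGER NOT NULL   -- epoch seconds
);
-- Lets the cleanup job find expired rows without scanning the whole table.
CREATE INDEX IF NOT EXISTS idx_text_expire ON Text (expireTime);
"""


def open_db(path=":memory:"):
    """Open a database and make sure the Text table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def insert_text(conn, text_id, owner_id, body, uri, now=None):
    """Store a text that expires 30 days after it is inserted."""
    now = int(time.time()) if now is None else now
    with conn:  # commit on success, roll back on error
        conn.execute(
            "INSERT INTO Text VALUES (?, ?, ?, ?, ?, ?)",
            (text_id, owner_id, body, uri, now, now + THIRTY_DAYS),
        )
```

The primary key on id means a colliding key is rejected with sqlite3.IntegrityError, which a server could catch to trigger regeneration of the key.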