System requirements
Functional:
- Create, view, edit, and delete a paste
- Search pastes by content
Non-Functional:
- Performance: the system must respond quickly.
- Availability: the service must keep running even when a server is down.
- Scalability: the system must scale horizontally to handle more clients.
Capacity estimation
- Assume 1 million clients, each creating 10 pastes per day at 10 KB per paste. That is 10 million pastes (about 100 GB) per day, or roughly 36.5 TB of storage per year.
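The arithmetic behind the estimate can be checked with a short script. This is a sketch that uses only the figures stated above; the decimal unit conversions (1 GB = 10^6 KB, 1 TB = 10^9 KB) and the derived writes-per-second figure are assumptions added for illustration.

```python
# Back-of-the-envelope storage estimate using the figures from the bullet above.
CLIENTS = 1_000_000
PASTES_PER_CLIENT_PER_DAY = 10
PASTE_SIZE_KB = 10

# Assumption: decimal units (1 GB = 10**6 KB, 1 TB = 10**9 KB).
KB_PER_GB = 10**6
KB_PER_TB = 10**9
SECONDS_PER_DAY = 86_400

pastes_per_day = CLIENTS * PASTES_PER_CLIENT_PER_DAY
daily_kb = pastes_per_day * PASTE_SIZE_KB
daily_gb = daily_kb / KB_PER_GB
yearly_tb = daily_kb * 365 / KB_PER_TB
# Average write rate, assuming pastes are spread evenly over the day.
writes_per_second = pastes_per_day / SECONDS_PER_DAY

print(f"pastes per day:    {pastes_per_day:,}")
print(f"storage per day:   {daily_gb:.0f} GB")
print(f"storage per year:  {yearly_tb:.1f} TB")
print(f"average writes/s:  {writes_per_second:.0f}")
```

The yearly figure excludes replication and index overhead, so real storage needs would likely be a multiple of it.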
API design
- create: http://XXX/create, args: text, title, bucket
- view: http://XXX/view, args: bucket + title
- edit: http://XXX/edit, args: bucket + title, plus the new text
- delete: http://XXX/delete, args: bucket + title
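The four operations can be sketched as an in-memory class to make the argument lists concrete. This is not an HTTP server: the class name `PasteApi`, the boolean return values, the rejection of duplicate creates, and the `text` parameter on `edit` are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

Key = Tuple[str, str]  # (bucket, title), the lookup key used by every endpoint


class PasteApi:
    """Hypothetical in-memory stand-in for the create/view/edit/delete endpoints."""

    def __init__(self) -> None:
        self._pastes: Dict[Key, str] = {}

    def create(self, text: str, title: str, bucket: str) -> bool:
        key = (bucket, title)
        if key in self._pastes:
            return False  # assumption: creating an existing paste is rejected
        self._pastes[key] = text
        return True

    def view(self, bucket: str, title: str) -> Optional[str]:
        # Returns None when the paste does not exist.
        return self._pastes.get((bucket, title))

    def edit(self, bucket: str, title: str, text: str) -> bool:
        key = (bucket, title)
        if key not in self._pastes:
            return False
        self._pastes[key] = text  # assumption: edit replaces the whole content
        return True

    def delete(self, bucket: str, title: str) -> bool:
        return self._pastes.pop((bucket, title), None) is not None


api = PasteApi()
api.create(text="hello", title="note", bucket="b1")
print(api.view("b1", "note"))
```

Because every operation is addressed by bucket + title, the same pair can serve directly as the storage key.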
Database design
- We use a NoSQL store so that we can hold a very large number of documents. The key is bucket + title; the content is stored in one column, and the other metadata is stored in the same record.
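One possible shape for such a record is sketched below. The class name `PasteRecord`, the `/` separator in the key, and the example metadata field are assumptions; the design only states that the key is bucket + title and that content and metadata live together.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PasteRecord:
    """Hypothetical record layout: one key, one content column, plus metadata."""

    bucket: str
    title: str
    content: str
    metadata: Dict[str, str] = field(default_factory=dict)

    @property
    def key(self) -> str:
        # Assumption: bucket and title are joined with "/" and the bucket
        # itself contains no "/", so keys stay unambiguous.
        return f"{self.bucket}/{self.title}"

    def to_row(self) -> Dict[str, object]:
        # The row as it might be written to a key-value or document store.
        return {
            "key": self.key,
            "content": self.content,
            "metadata": dict(self.metadata),
        }


record = PasteRecord("b1", "note", "hello", {"owner": "alice"})
print(record.to_row())
```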
High-level design
Request flows
- Create: the client creates a paste, and the paste is stored in the database.
- View: the client views a paste; the cache is checked first, and on a miss the paste is read from the database.
- Delete: the client deletes a paste; the document is moved from the database to Trash.
- Update: the original version is first fetched from the cache or the database. During editing the document is held in the UpdateServer, and it is eventually persisted to the database.
- Query by content: the IndexServer holds inverted indexes of the documents. The client first queries the IndexServer, then fetches the matching documents from the cache and the database.
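The content-query flow relies on an inverted index, which maps each term to the set of documents containing it. A minimal sketch follows; the class name `InvertedIndex`, the lowercase alphanumeric tokenizer, and the AND semantics for multi-word queries are assumptions, not details given in the design.

```python
import re
from collections import defaultdict
from typing import Dict, Optional, Set


class InvertedIndex:
    """Hypothetical term -> document-key index for content search."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[str]] = defaultdict(set)

    @staticmethod
    def _tokens(text: str) -> Set[str]:
        # Assumption: terms are lowercase runs of letters and digits.
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def add(self, doc_key: str, content: str) -> None:
        # Note: re-indexing an edited document would also need to remove
        # its old terms; that step is omitted in this sketch.
        for term in self._tokens(content):
            self._postings[term].add(doc_key)

    def search(self, query: str) -> Set[str]:
        # Assumption: a document matches only if it contains every query term.
        result: Optional[Set[str]] = None
        for term in self._tokens(query):
            docs = self._postings.get(term, set())
            result = set(docs) if result is None else result & docs
        return result if result is not None else set()


index = InvertedIndex()
index.add("b1/a", "hello world")
index.add("b1/b", "hello there")
print(sorted(index.search("hello")))
```

The search returns document keys only; the documents themselves would then be fetched from the cache or the database, as in the view flow.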
Detailed component design
- For updates, content is not synced back to the database directly; it is first saved in the UpdateServer.
- For reads, the client checks the cache and the UpdateServer first; only if neither has the document is it read from the database.
- In the UpdateServer, updates are stored as temporary files with timestamps.
- Edits are saved every 10 seconds; if a document has not been updated for 1 minute, it is persisted back to the database.
- For content queries, the IndexServer builds an inverted index for fast lookup.
- The index builder uses a message queue (MQ) to fetch new documents.
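The UpdateServer behavior described above (buffer edits, flush after one idle minute, read from the buffer and cache before the database) might look like the sketch below. The class name `UpdateBuffer`, the in-memory dictionaries standing in for temporary files, the cache, and the database, the injected clock, and the choice to check pending edits before the cache are all assumptions.

```python
from typing import Callable, Dict, List, Optional, Tuple

IDLE_FLUSH_SECONDS = 60  # "not updated for 1 minute"


class UpdateBuffer:
    """Hypothetical UpdateServer core: holds in-progress edits until they go idle."""

    def __init__(self, database: Dict[str, str], cache: Dict[str, str],
                 clock: Callable[[], float]) -> None:
        self.database = database
        self.cache = cache
        self.clock = clock
        # key -> (latest text, time of last update); stands in for timestamped tmp files.
        self._pending: Dict[str, Tuple[str, float]] = {}

    def save_edit(self, key: str, text: str) -> None:
        # Called by the editor's autosave (every 10 seconds in the design).
        self._pending[key] = (text, self.clock())

    def flush_idle(self) -> List[str]:
        # Persist every document that has not been updated for a full minute.
        now = self.clock()
        idle = [k for k, (_, ts) in self._pending.items()
                if now - ts >= IDLE_FLUSH_SECONDS]
        for key in idle:
            text, _ = self._pending.pop(key)
            self.database[key] = text
            self.cache.pop(key, None)  # assumption: drop the stale cached copy
        return idle

    def read(self, key: str) -> Optional[str]:
        # Assumption: pending edits are freshest, so they are checked first.
        if key in self._pending:
            return self._pending[key][0]
        if key in self.cache:
            return self.cache[key]
        text = self.database.get(key)
        if text is not None:
            self.cache[key] = text  # fill the cache on a miss
        return text


fake_now = [0.0]
buffer = UpdateBuffer({"b1/note": "v1"}, {}, lambda: fake_now[0])
buffer.save_edit("b1/note", "v2")
fake_now[0] = 60.0
print(buffer.flush_idle(), buffer.database)
```

Injecting the clock keeps the flush policy testable without waiting a real minute; a production service would call `flush_idle` from a periodic background task.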
Trade offs/Tech choices
- We use NoSQL instead of MySQL because a NoSQL store scales horizontally more easily for a large volume of documents, and our access pattern is a simple key lookup on bucket + title.
Failure scenarios/bottlenecks
- The servers, cache, UpdateServer, IndexServer, and database all have replicas to support availability.
- For the UpdateServer, we can add temporary storage to hold documents that are being edited. It works at the document level and holds only the latest update of each document.
Future improvements
- Trash holds the deleted documents, so if a user wants to roll back a delete, we can move the data from Trash back to the database.
- For retention, a notification server can message users when a document is about to expire.
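Both improvements can be sketched briefly: a Trash that supports restore, and a helper that picks the documents whose expiry falls inside a warning window. The names `TrashStore` and `expiring_soon`, the dictionary-backed storage, and the rule that a restore fails when the key has been reused are assumptions for illustration.

```python
from typing import Callable, Dict, List, Tuple


class TrashStore:
    """Hypothetical Trash: deletes move documents here so they can be restored."""

    def __init__(self, database: Dict[str, str], clock: Callable[[], float]) -> None:
        self.database = database
        self.clock = clock
        self._trash: Dict[str, Tuple[str, float]] = {}  # key -> (text, deleted at)

    def delete(self, key: str) -> bool:
        if key not in self.database:
            return False
        self._trash[key] = (self.database.pop(key), self.clock())
        return True

    def restore(self, key: str) -> bool:
        # Assumption: restoring fails if a new paste already uses the same key.
        if key not in self._trash or key in self.database:
            return False
        text, _ = self._trash.pop(key)
        self.database[key] = text
        return True


def expiring_soon(expires_at: Dict[str, float], now: float,
                  warn_seconds: float) -> List[str]:
    # Keys that have not expired yet but will within the warning window.
    return sorted(k for k, t in expires_at.items() if 0 <= t - now <= warn_seconds)


store = TrashStore({"b1/note": "hello"}, lambda: 0.0)
store.delete("b1/note")
store.restore("b1/note")
print(store.database)
print(expiring_soon({"a": 100.0, "b": 1000.0}, now=60.0, warn_seconds=60.0))
```

A notification server could run `expiring_soon` on a schedule and message the owner of each returned key.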