System requirements
Functional:
List functional requirements for the system (Ask the chat bot for hints if stuck.)...
- Store Text: The service should allow users to store plain text, formatted text (like paragraphs or quotes), and code snippets.
- Syntax Highlighting: Users should have the ability to format their pastes with syntax highlighting for various programming languages when submitting code blocks.
- Code Block Formatting: When a user pastes programming code, the service should automatically recognize and format it as a code block for better readability.
- Generate Random URL: Upon saving a paste, the system should generate a unique, random URL that users can share to access the paste.
- View Paste via URL: Anyone with the generated URL should be able to view the content of the paste, subject to the visibility settings (public or private).
- Set Expiration Date: Users can set an expiration date or duration for the paste. After this period, the paste should be marked for automatic deletion.
- Automatic Deletion: Once the expiration date is reached, the paste should be automatically deleted from the database, and the URL should become invalid.
Non-Functional:
List non-functional requirements for the system...
- Service Level Agreement (SLA): The system should provide users with a selection of paste expiration times, including options for 5, 10, 30, 60, and 90 days.
- System Scalability: The service should be designed to handle a burst capacity of at least 10,000 concurrent users. The system should have mechanisms in place to monitor usage and trigger alerts when nearing capacity limits.
- Alerting Mechanism: An alerting system should notify the development team when the user count approaches the 10,000 threshold, prompting them to consider increasing horizontal scaling capacity (e.g., to 20,000 or 30,000 users).
- User Feedback during Overload: When the system is at capacity or under heavy load, it should provide users with a clear message, such as "Please try again later, the service is currently overloaded."
Capacity estimation
Estimate the scale of the system you are going to design...
- Horizontal Scaling:
- The system should be designed to support horizontal scaling, allowing additional servers to be added seamlessly to manage increased user load. This ensures high availability and performance without disrupting the existing infrastructure.
- User Limit Notification:
- Implement a monitoring system that tracks user capacity. When the number of concurrent users approaches a defined limit (e.g., 10,000 users), an alert should be generated to notify the engineering team to provision additional servers.
- Geographical Distribution:
- To improve latency and user experience, the service should consider deploying servers in multiple geographic locations. This can include multiple states within a single country or across various countries to ensure that users connect to the nearest server.
- Disaster Recovery:
- In case of server outages, data loss, or other catastrophes, the system should have a disaster recovery plan. This could involve redundancy, such as backup servers in different regions, and data replication across servers to ensure business continuity.
- Load Balancing:
- Use load balancers to distribute incoming traffic across multiple servers, ensuring that no single server becomes a bottleneck.
API design
Define what APIs are expected from the system...
API Endpoints
- Create Paste (POST /api/pastes)
- Description: Creates a new paste with the provided text and generates a unique URL.
- Request Body:
{"userId":"uniqueUserId","text":"This is the content of the paste.","expiration":"30"// Could be 5, 10, 30, 60, or 90 days}- Process:
- Check for duplicate URLs and generate a new random URL if necessary.
- Store the paste in the database with the generated URL as the key.
- Response:
- Success (201):
{"url":"https://pastebin.com/example123","message":"Paste successfully created."}- Error (409 - Conflict): If unable to generate a unique URL after multiple attempts.
{"error":"Unable to generate a unique URL. Please try again."}
- Retrieve Paste (GET /api/pastes/{url})
- Description: Retrieves the content of a paste using its unique URL.
- Request Parameters:
url: The unique URL identifier for the paste (as part of the endpoint).
- Response:
- Success (200):
{"url":"https://pastebin.com/example123","text":"This is the content of the paste."}- Error (404 - Not Found): If the paste does not exist or has expired.
{"error":"Paste not found or has been deleted."}
Notes
- Update and Delete Operations: As specified, there is no separate API for updates or deletions since the pasted content has an expiration date and will be automatically removed.
Database design
Defining the system data model early on will clarify how data will flow among different components of the system. Also you could draw an ER diagram using the diagramming tool to enhance your design...
Database Design with MongoDB
Collection Name: pastes
Document Structure: Each document in the collection will represent a paste. The structure will take the form of a key-value pair where:
- Key: Randomly generated URL (as the document key).
- Value: A JSON object containing relevant data.
Here's a proposed structure for your JSON object:
{ "url": "https://pastebin.com/example123", "text": "This is the content of the paste.", "expiration": ISODate("YYYY-MM-DDTHH:mm:ssZ"), // Set the expiration date/time here "ttl": 2592000 // Time in seconds (30 days)}
Key Components in the Document:
- URL:
- Holds the full URL (the website name plus the random URL fragment).
- Text:
- Contains the entire message that the user has pasted.
- Expiration:
- A timestamp (ISO date format) that indicates when the paste should expire. This can be derived from the user's selected expiration (e.g., 5 days, 10 days, 30 days) converted into a date format.
- TTL (Time To Live):
- In seconds, this property allows you to set how long the document should exist in the database before it is automatically deleted.
- For example:
- 5 days = 5 * 24 * 60 * 60 = 432000 seconds
- 10 days = 864000 seconds
- 30 days = 2592000 seconds
- 60 days = 5184000 seconds
- 90 days = 7776000 seconds
High-level design
You should identify enough components that are needed to solve the actual problem from end to end. Also remember to draw a block diagram using the diagramming tool to augment your design. If you are unfamiliar with the tool, you can simply describe your design to the chat bot and ask it to generate a starter diagram for you to modify...
Request flows
Explain how the request flows from end to end in your high level design. Also you could draw a sequence diagram using the diagramming tool to enhance your explanation...
Detailed component design
Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...
Trade offs/Tech choices
Explain any trade offs you have made and why you made certain tech choices...
Failure scenarios/bottlenecks
Try to discuss as many failure scenarios/bottlenecks as possible.
Future improvements
What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?