System requirements
Functional:
- Unregistered users can create and share pastes, but they cannot update or delete them, and they cannot view a list of pastes they created earlier.
- Registered users can create and share pastes. They can also view the pastes they created earlier and update or delete them.
- Anyone with the URL can view a paste.
- The system should provide a basic moderation feature for monitoring and reporting harmful pastes.
- For registered users, we can offer analytics on their pastes, such as view counts and expiration times.
- Pastes created by unregistered users expire after 30 days. Pastes created by registered users do not expire.
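The expiration rule in the last requirement can be sketched as a small helper. This is only an illustration: the names `expiration_for`, `is_expired`, and `ANONYMOUS_TTL` are hypothetical, and it assumes an unregistered creator is represented by an owner ID of `None`.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# 30-day lifetime for pastes from unregistered users (from the requirement).
ANONYMOUS_TTL = timedelta(days=30)


def expiration_for(created_at: datetime, owner_id: Optional[int]) -> Optional[datetime]:
    """Return the expiry timestamp, or None when the paste never expires."""
    if owner_id is None:  # assumption: None marks an unregistered creator
        return created_at + ANONYMOUS_TTL
    return None  # registered creator: no expiry


def is_expired(expires_at: Optional[datetime], now: datetime) -> bool:
    """A paste with no expiry timestamp is never expired."""
    return expires_at is not None and now >= expires_at


if __name__ == "__main__":
    created = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(expiration_for(created, owner_id=None))
    print(expiration_for(created, owner_id=42))
```

Storing "no expiry" as a missing value rather than a far-future date keeps the 30-day rule easy to change later without rewriting existing rows.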
Non-Functional:
- The system should be highly available.
- Reads and writes should have low latency.
- We'll provide read-your-writes consistency for the creator of a paste and eventual consistency for all other reads.
- Data should be durable: once a paste is stored, it should not be lost.
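One common way to get read-your-writes consistency on top of eventually consistent replicas is to send a client's reads to the primary for a short window after that client writes. The sketch below assumes a primary/replica database setup, which the requirements do not specify. The class `ReplicaRouter` and the 5-second lag budget are hypothetical.

```python
import time
from typing import Callable, Dict, Optional


class ReplicaRouter:
    """Decide whether a read goes to the primary or to a replica."""

    def __init__(self, lag_budget_s: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._lag_budget_s = lag_budget_s        # assumed upper bound on replication lag
        self._clock = clock
        self._last_write: Dict[str, float] = {}  # client id -> time of last write

    def record_write(self, client_id: str) -> None:
        """Call this after a client successfully creates or updates a paste."""
        self._last_write[client_id] = self._clock()

    def choose(self, client_id: Optional[str]) -> str:
        """Recent writers read from the primary; everyone else reads a replica."""
        if client_id is not None:
            last = self._last_write.get(client_id)
            if last is not None and self._clock() - last < self._lag_budget_s:
                return "primary"
        return "replica"


if __name__ == "__main__":
    router = ReplicaRouter()
    router.record_write("user-1")
    print(router.choose("user-1"), router.choose("user-2"))
```

Only the creator pays the cost of hitting the primary, and only briefly. All other readers stay on replicas and see the paste once replication catches up.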
Capacity estimation
We assume 1 million daily active users within 3 years.
Each user creates 1 paste per day.
The average size of a paste is 1 KB.
We assume a read/write ratio of 10:1.
That gives 10^6 writes per day, so the average write QPS is around 10^6 / 86,400 ≈ 12.
The average read QPS is around 10 × 12 ≈ 120.
Each day we add about 1 GB of new data, or about 365 GB per year.
With 3 replicas of each paste, that's around 1.1 TB per year.
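The arithmetic behind these estimates can be checked with a few lines of Python. The inputs are the assumptions stated above. The variable names are illustrative, 1 KB is taken as 1,000 bytes, and the QPS figures are daily averages rather than peaks.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

daily_active_users = 1_000_000
pastes_per_user_per_day = 1
paste_size_bytes = 1_000        # 1 KB (decimal)
read_write_ratio = 10           # 10 reads for every write
replication_factor = 3

writes_per_day = daily_active_users * pastes_per_user_per_day
write_qps = writes_per_day / SECONDS_PER_DAY
read_qps = write_qps * read_write_ratio

daily_bytes = writes_per_day * paste_size_bytes
yearly_bytes = daily_bytes * 365
yearly_bytes_replicated = yearly_bytes * replication_factor

print(f"average write QPS: {write_qps:.1f}")
print(f"average read QPS:  {read_qps:.1f}")
print(f"new data per day:  {daily_bytes / 1e9:.1f} GB")
print(f"new data per year: {yearly_bytes / 1e9:.0f} GB")
print(f"with replication:  {yearly_bytes_replicated / 1e12:.1f} TB per year")
```

At roughly a dozen writes and a little over a hundred reads per second on average, the load is modest even with a generous multiplier for peak traffic.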
API design
To register a new user, we have a RESTful API:
POST /v1/users. The request body looks like the following:
{ user_name, pass_word_Hashed, email, phone}
To let a user log in, we have a RESTful API:
POST /v1/login. I'll skip the details to save some time.
To let a user log out, we have a RESTful API:
DELETE /v1/login. I'll skip the details to save some time.
To create a paste, the RESTful API is
POST /v1/pastes. The request body looks like
{content}
To read a paste, the RESTful API is
GET /v1/pastes/{paste_id}. The response looks like
{content, creator, expiration}
To update a paste (registered users only), the API is
PUT /v1/pastes/{paste_id}
To delete a paste (registered users only), the API is
DELETE /v1/pastes/{paste_id}
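The behavior behind the four paste endpoints, including the rule that only the registered creator may update or delete, can be sketched with an in-memory store. This is a stand-in for the real service, not its implementation. `PasteStore`, its method names, and the 8-character random ID are hypothetical, and a missing paste is signaled with `KeyError` where the HTTP layer would return 404.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Paste:
    paste_id: str
    content: str
    creator: Optional[int]  # None for an unregistered creator (assumption)


class PasteStore:
    """In-memory stand-in for the service behind /v1/pastes."""

    def __init__(self) -> None:
        self._pastes: Dict[str, Paste] = {}

    def create(self, content: str, user_id: Optional[int] = None) -> Paste:
        # POST /v1/pastes: anyone may create a paste.
        paste_id = secrets.token_urlsafe(6)      # 6 random bytes -> 8 URL-safe chars
        while paste_id in self._pastes:          # regenerate on the rare collision
            paste_id = secrets.token_urlsafe(6)
        paste = Paste(paste_id, content, user_id)
        self._pastes[paste_id] = paste
        return paste

    def read(self, paste_id: str) -> Paste:
        # GET /v1/pastes/{paste_id}: anyone with the ID may read.
        return self._pastes[paste_id]

    def _owned_by(self, paste_id: str, user_id: Optional[int]) -> Paste:
        paste = self._pastes[paste_id]
        if user_id is None or paste.creator != user_id:
            raise PermissionError("only the registered creator may modify a paste")
        return paste

    def update(self, paste_id: str, content: str, user_id: Optional[int]) -> Paste:
        # PUT /v1/pastes/{paste_id}
        paste = self._owned_by(paste_id, user_id)
        paste.content = content
        return paste

    def delete(self, paste_id: str, user_id: Optional[int]) -> None:
        # DELETE /v1/pastes/{paste_id}
        self._owned_by(paste_id, user_id)
        del self._pastes[paste_id]


if __name__ == "__main__":
    store = PasteStore()
    paste = store.create("hello", user_id=1)
    store.update(paste.paste_id, "hello again", user_id=1)
    print(store.read(paste.paste_id).content)
```

Because the ownership check rejects a `user_id` of `None`, pastes created by unregistered users can never be modified, which matches the functional requirements.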
Database design
Based on the use case and the estimated capacity, a SQL database is sufficient. We'll need two tables for this system.
The first table is the User table. The schema looks like
userID - PrimaryKey
userName
phone
The second table is the Paste table. The schema looks like
PasteID - PrimaryKey
PasteContent
ownerID
expiration
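The two tables can be tried out with an in-memory SQLite database. This is a sketch under assumptions, not a production schema. The column types, the snake_case names, the nullable `owner_id` for unregistered creators, the ISO-8601 text format for `expiration`, and the index on `owner_id` are all illustrative choices that the schema above does not specify.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    user_id   INTEGER PRIMARY KEY,
    user_name TEXT NOT NULL UNIQUE,
    phone     TEXT
);
CREATE TABLE pastes (
    paste_id   TEXT PRIMARY KEY,
    content    TEXT NOT NULL,
    owner_id   INTEGER REFERENCES users(user_id),  -- NULL for unregistered creators
    expiration TEXT                                -- ISO-8601 UTC; NULL means no expiry
);
-- Supports "show me the pastes I created" for registered users.
CREATE INDEX idx_pastes_owner ON pastes(owner_id);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

conn.execute("INSERT INTO users (user_id, user_name, phone) VALUES (1, 'alice', NULL)")
conn.execute("INSERT INTO pastes VALUES ('abc123', 'hello', 1, NULL)")
conn.execute("INSERT INTO pastes VALUES ('def456', 'anon', NULL, '2030-01-31T00:00:00Z')")

# List the pastes owned by one registered user.
rows = conn.execute(
    "SELECT paste_id FROM pastes WHERE owner_id = ? ORDER BY paste_id", (1,)
).fetchall()
print(rows)
```

The index on `owner_id` is what keeps the "view my earlier pastes" query cheap as the Paste table grows. Lookups by `paste_id` are already covered by the primary key.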
High-level design
You should identify enough components that are needed to solve the actual problem from end to end. Also remember to draw a block diagram using the diagramming tool to augment your design. If you are unfamiliar with the tool, you can simply describe your design to the chat bot and ask it to generate a starter diagram for you to modify...
Request flows
Explain how the request flows from end to end in your high level design. Also you could draw a sequence diagram using the diagramming tool to enhance your explanation...
Detailed component design
Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...
Trade offs/Tech choices
Explain any trade offs you have made and why you made certain tech choices...
Failure scenarios/bottlenecks
Try to discuss as many failure scenarios/bottlenecks as possible.
Future improvements
What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?