System requirements
Functional:
List functional requirements for the system (Ask the chat bot for hints if stuck.)...
- A user can post a new tweet, and the tweet is displayed on the homepage
- All of the user's friends can see the post
- The user can see the latest posts from all of their friends, sorted by timestamp
Non-Functional:
List non-functional requirements for the system...
- Consistency
- Scalability
- High performance
Capacity estimation
Estimate the scale of the system you are going to design...
- 1000 users post per minute
- each user posts one tweet of 1000 characters: 1000 * 4 bytes = 4000 bytes
- 1000 * 4000 bytes = 4 million bytes / min (4 MB / min)
- 1 day = 24 hours * 60 mins = 1440 mins
- 4 million bytes * 1440 mins = 5760 million bytes / day (5.76 GB / day)
- 5760 million bytes * 30 days = 172800 million bytes / month = 172.8 GB / month
- 172.8 GB * 12 months = 2073.6 GB / yr (about 2 TB / yr)
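The arithmetic above can be checked with a short script. This is only a sketch: it reuses the same inputs (1000 posts per minute, 1000 characters per tweet, 4 bytes per character) and assumes 30-day months, 12 months per year, and decimal gigabytes (1 GB = 10**9 bytes). The constant and function names are illustrative.

```python
# Back-of-the-envelope storage estimate. All inputs are assumptions
# taken from the capacity list, not measured values.
POSTS_PER_MINUTE = 1000      # one tweet per posting user per minute
CHARS_PER_TWEET = 1000
BYTES_PER_CHAR = 4           # worst-case width of a UTF-8 character
MINUTES_PER_DAY = 24 * 60    # 1440
DAYS_PER_MONTH = 30          # simplification
MONTHS_PER_YEAR = 12

bytes_per_tweet = CHARS_PER_TWEET * BYTES_PER_CHAR
bytes_per_minute = POSTS_PER_MINUTE * bytes_per_tweet
bytes_per_day = bytes_per_minute * MINUTES_PER_DAY
bytes_per_month = bytes_per_day * DAYS_PER_MONTH
bytes_per_year = bytes_per_month * MONTHS_PER_YEAR


def to_gb(n_bytes: int) -> float:
    """Convert bytes to decimal gigabytes (1 GB = 10**9 bytes)."""
    return n_bytes / 10**9


for label, value in [("day", bytes_per_day),
                     ("month", bytes_per_month),
                     ("year", bytes_per_year)]:
    print(f"storage per {label}: {to_gb(value):.2f} GB")
```

The estimate covers raw tweet text only; indexes, replicas, and metadata would add to it.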
API design
Define what APIs are expected from the system...
Endpoints:
- get: api/user_name
  - returns the homepage of this user
- post: api/user_name
  - body: {user_id: 'molly123', content: 'hello world', timestamp: '04-23-2024 15:30:00'}
  - returns: 200 OK on success
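A minimal sketch of how the two endpoints might behave, written as plain Python functions rather than a real web framework. The names `post_tweet`, `get_homepage`, `FRIENDS`, and `TWEETS` are hypothetical, the in-memory structures stand in for the real storage layer, and the 400 and 404 error codes are assumptions that the design above does not specify.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%m-%d-%Y %H:%M:%S"  # matches '04-23-2024 15:30:00'

# Hypothetical in-memory stand-ins for the storage layer.
FRIENDS: dict[str, set[str]] = {"molly123": {"sam42"}, "sam42": {"molly123"}}
TWEETS: list[dict] = []


def post_tweet(user_name: str, body: dict) -> tuple[int, dict]:
    """post: api/user_name. Validate the body and store the tweet."""
    required = {"user_id", "content", "timestamp"}
    if not required.issubset(body):
        return 400, {"error": "missing fields"}
    if body["user_id"] != user_name or len(body["content"]) > 1000:
        return 400, {"error": "invalid tweet"}
    try:
        posted_at = datetime.strptime(body["timestamp"], TIMESTAMP_FORMAT)
    except ValueError:
        return 400, {"error": "bad timestamp"}
    TWEETS.append({"user_id": user_name,
                   "content": body["content"],
                   "posted_at": posted_at})
    return 200, {"status": "ok"}


def get_homepage(user_name: str) -> tuple[int, dict]:
    """get: api/user_name. Own and friends' tweets, newest first."""
    if user_name not in FRIENDS:
        return 404, {"error": "unknown user"}
    visible = FRIENDS[user_name] | {user_name}
    feed = sorted((t for t in TWEETS if t["user_id"] in visible),
                  key=lambda t: t["posted_at"], reverse=True)
    return 200, {"tweets": [t["content"] for t in feed]}
```

Parsing the timestamp into a `datetime` before sorting matters here, because the month-first string format does not sort chronologically as text.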
Database design
Defining the system data model early on will clarify how data will flow among different components of the system. Also you could draw an ER diagram using the diagramming tool to enhance your design...
- user_table:
  - user_id
  - user_name
  - user_email
  - friends_ids
  - created_time
- content_table:
  - post_id
  - user_id
  - post_content
  - timestamp
- user_table has a one-to-many relationship with content_table: one user can have many posts, linked by user_id
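The two tables could be sketched with Python's built-in `sqlite3` module as below. The column types, the JSON-text encoding of `friends_ids`, the ISO 8601 timestamp strings, and the index are all assumptions made for illustration; the design above only names the columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
conn.executescript("""
CREATE TABLE user_table (
    user_id      TEXT PRIMARY KEY,
    user_name    TEXT NOT NULL,
    user_email   TEXT NOT NULL,
    friends_ids  TEXT NOT NULL DEFAULT '[]',  -- JSON array of user_ids (assumption)
    created_time TEXT NOT NULL                -- ISO 8601 string (assumption)
);
CREATE TABLE content_table (
    post_id      INTEGER PRIMARY KEY,
    user_id      TEXT NOT NULL REFERENCES user_table(user_id),
    post_content TEXT NOT NULL,
    timestamp    TEXT NOT NULL                -- ISO 8601 sorts chronologically as text
);
-- Speeds up "latest posts for a user" lookups.
CREATE INDEX idx_content_user_time ON content_table (user_id, timestamp);
""")

# One user with two posts illustrates the one-to-many relationship.
conn.execute(
    "INSERT INTO user_table VALUES (?, ?, ?, ?, ?)",
    ("molly123", "Molly", "molly@example.com", "[]", "2024-04-01T09:00:00"),
)
conn.executemany(
    "INSERT INTO content_table (user_id, post_content, timestamp) VALUES (?, ?, ?)",
    [("molly123", "hello world", "2024-04-23T15:30:00"),
     ("molly123", "second post", "2024-04-23T16:00:00")],
)
rows = conn.execute(
    "SELECT post_content FROM content_table "
    "WHERE user_id = ? ORDER BY timestamp DESC",
    ("molly123",),
).fetchall()
print(rows)
```

A separate friendship table would be the more conventional relational choice than a `friends_ids` column, but the sketch keeps the column as listed.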
High-level design
You should identify enough components that are needed to solve the actual problem from end to end. Also remember to draw a block diagram using the diagramming tool to augment your design. If you are unfamiliar with the tool, you can simply describe your design to the chat bot and ask it to generate a starter diagram for you to modify...
- See the high level diagram
B[client] --> A{CDN}
B[client] --> E{load balancer} --> C{server}
C --> F{cache}
C --> D[master Database]
D --> R{replication 1}
D --> P{replication 2}
Request flows
Explain how the request flows from end to end in your high level design. Also you could draw a sequence diagram using the diagramming tool to enhance your explanation...
- When a user logs in to their Twitter page, the content is pulled from the closest region through the CDN.
- Once on the page, the user can view the latest tweets from the friends they follow, served from the cache. If the cache does not contain all the required posts, the rest are pulled from the database, stored in the cache, and then displayed on the user's page.
- If the user posts a new tweet, the post is stored in the database.
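The read and write paths described above follow a cache-aside pattern, which can be sketched as follows. `PostStore` and `TimelineReader` are hypothetical names, the dictionary stands in for a real cache, timestamps are plain integers, and invalidating the author's cache entry on write is an assumption the flow above does not state.

```python
class PostStore:
    """Hypothetical stand-in for the database: user_id -> [(timestamp, content)]."""

    def __init__(self, posts):
        self.posts = posts
        self.reads = 0  # counts database reads, to show cache hits

    def latest_posts(self, user_id):
        self.reads += 1
        return sorted(self.posts.get(user_id, []), reverse=True)


class TimelineReader:
    """Cache-aside reads: try the cache first, fall back to the database."""

    def __init__(self, store):
        self.store = store
        self.cache = {}  # user_id -> that user's posts, newest first

    def timeline(self, friend_ids, limit=20):
        merged = []
        for friend_id in friend_ids:
            if friend_id not in self.cache:  # cache miss: load and remember
                self.cache[friend_id] = self.store.latest_posts(friend_id)
            merged.extend(self.cache[friend_id])
        merged.sort(reverse=True)  # newest first across all friends
        return merged[:limit]

    def post(self, user_id, timestamp, content):
        # Writes go to the database; the stale cache entry is dropped
        # (an assumption, so the next read reloads fresh data).
        self.store.posts.setdefault(user_id, []).append((timestamp, content))
        self.cache.pop(user_id, None)


store = PostStore({"a": [(1, "a1"), (3, "a2")], "b": [(2, "b1")]})
reader = TimelineReader(store)
print(reader.timeline(["a", "b"]))
```

A second call to `timeline` with the same friends is served entirely from the cache, so `store.reads` does not increase until one of those friends posts again.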
Detailed component design
Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...
load balancer 1
- It distributes requests across the servers so that the system can scale horizontally
- The distribution method is least loaded: each new request goes to the server with the lowest number of active connections at that moment. This avoids overloading any single server and improves the efficiency of the system
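The selection rule for load balancer 1 can be sketched in a few lines. `LeastConnectionsBalancer` and its method names are hypothetical; a real load balancer would also handle health checks and concurrent access, which this sketch leaves out.

```python
class LeastConnectionsBalancer:
    """Send each new request to the server with the fewest active connections."""

    def __init__(self, servers):
        # server name -> number of currently active connections
        self.active = {server: 0 for server in servers}

    def acquire(self):
        """Pick the least-loaded server and count the new connection."""
        # Ties go to the server listed first, since dicts keep insertion order.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        """Call when a request finishes so the server's count drops again."""
        if self.active[server] > 0:
            self.active[server] -= 1


lb = LeastConnectionsBalancer(["s1", "s2", "s3"])
first_three = [lb.acquire() for _ in range(3)]  # one request lands on each server
lb.release("s2")                                # s2 finishes its request
next_server = lb.acquire()                      # s2 now has the fewest connections
print(first_three, next_server)
```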
load balancer 2
- It prevents the master database from being a single point of failure. If the master fails, load balancer 2 promotes one of the replicas to master until the original master recovers.
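The failover behaviour of load balancer 2 might look like the sketch below. `DatabaseRouter` and its methods are hypothetical, health status is reported by hand rather than by real health checks, and the sketch ignores the data resynchronisation a recovered master would need before taking writes again.

```python
class DatabaseRouter:
    """Route writes to the master, promoting a replica while the master is down."""

    def __init__(self, master, replicas):
        self.original_master = master
        self.master = master
        self.replicas = list(replicas)
        self.healthy = {node: True for node in [master, *replicas]}

    def report_health(self, node, is_healthy):
        """Record a health-check result and re-evaluate who is master."""
        self.healthy[node] = is_healthy
        self._elect()

    def _elect(self):
        if self.healthy[self.original_master]:
            self.master = self.original_master  # fail back once it recovers
            return
        if self.master is not None and self.healthy[self.master]:
            return  # keep the replica that was already promoted
        # Promote the first healthy replica, or None if every node is down.
        self.master = next((r for r in self.replicas if self.healthy[r]), None)

    def write_target(self):
        if self.master is None:
            raise RuntimeError("no healthy database node available")
        return self.master


router = DatabaseRouter("master", ["replica1", "replica2"])
router.report_health("master", False)
print(router.write_target())  # a replica now takes the writes
```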
cache
Trade offs/Tech choices
Explain any trade offs you have made and why you made certain tech choices...
Failure scenarios/bottlenecks
Try to discuss as many failure scenarios/bottlenecks as possible.
Future improvements
What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?