System requirements
Functional:
Users must be able to post a tweet (message in text form)
Users must be able to see other users' tweets
Users must be able to follow/unfollow other users
Users must be able to favorite other users' tweets
Non-Functional:
99.9% Availability
The system should support 10,000 DAU
The system should update a feed of tweets in real-time
Capacity estimation
10,000 DAU
2 tweets per user per day
5 favorites per user per day
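The figures above translate into daily and per-second load as in the rough sketch below. The 300 bytes per stored tweet is an assumption made only for illustration and is not part of the requirements.

```python
# Back-of-the-envelope load estimate from the stated inputs.
DAU = 10_000
TWEETS_PER_USER_PER_DAY = 2
FAVORITES_PER_USER_PER_DAY = 5
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Assumption (not in the design): average stored size of one tweet row.
ASSUMED_BYTES_PER_TWEET = 300

tweets_per_day = DAU * TWEETS_PER_USER_PER_DAY        # 20,000
favorites_per_day = DAU * FAVORITES_PER_USER_PER_DAY  # 50,000

# Average write rates; real traffic will peak well above the average.
avg_tweet_writes_per_sec = tweets_per_day / SECONDS_PER_DAY
avg_favorite_writes_per_sec = favorites_per_day / SECONDS_PER_DAY

# Yearly tweet storage under the assumed row size.
storage_bytes_per_year = tweets_per_day * 365 * ASSUMED_BYTES_PER_TWEET

print(f"tweets/day: {tweets_per_day}, favorites/day: {favorites_per_day}")
print(f"avg tweet writes/s: {avg_tweet_writes_per_sec:.2f}")
print(f"avg favorite writes/s: {avg_favorite_writes_per_sec:.2f}")
print(f"tweet storage/year: {storage_bytes_per_year / 1e9:.2f} GB")
```

At well under one write per second on average, a single database instance can absorb this load, which is consistent with the simple design choices made later.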
API design
Tweet API:
post_tweet (POST) - posts a tweet
fetch_tweets (GET) - fetches the most recent tweets from followed users
favorite_tweet (POST) - increments a tweet's favorite count by 1
Follow API:
follow_user (POST) - follows a user
unfollow_user (POST) - unfollows a user
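One way to pin down the API surface is a small endpoint table with a request validator, sketched below. Only the endpoint names, HTTP methods, and the Tweet/Follow grouping come from the design; the parameter names (user_id, content, limit, tweet_id, target_user_id) and the validate_request helper are hypothetical.

```python
from typing import Dict, NamedTuple, Tuple


class Endpoint(NamedTuple):
    method: str              # HTTP method from the design
    api: str                 # "tweet" or "follow"
    params: Tuple[str, ...]  # assumed parameter names (illustrative only)


ENDPOINTS: Dict[str, Endpoint] = {
    "post_tweet": Endpoint("POST", "tweet", ("user_id", "content")),
    "fetch_tweets": Endpoint("GET", "tweet", ("user_id", "limit")),
    "favorite_tweet": Endpoint("POST", "tweet", ("user_id", "tweet_id")),
    "follow_user": Endpoint("POST", "follow", ("user_id", "target_user_id")),
    "unfollow_user": Endpoint("POST", "follow", ("user_id", "target_user_id")),
}


def validate_request(name: str, method: str, payload: dict) -> None:
    """Raise ValueError if a call does not match the endpoint table."""
    endpoint = ENDPOINTS.get(name)
    if endpoint is None:
        raise ValueError(f"unknown endpoint: {name}")
    if method != endpoint.method:
        raise ValueError(f"{name} expects {endpoint.method}, got {method}")
    missing = [p for p in endpoint.params if p not in payload]
    if missing:
        raise ValueError(f"{name} is missing parameters: {missing}")
```

For example, validate_request("post_tweet", "POST", {"user_id": "u1", "content": "hi"}) passes silently, while the same call with GET raises ValueError.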
Database design
Tweet DB Design (relational DB):
content (string)
favorites (int)
created_by (string)
created_on (dateTime)
User DB Design (key-value store):
username (string)
user_id (string)
Follower DB Design (graph):
user_id (string)
following (list of strings)
followers (list of strings)
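The three stores could be prototyped as in the sketch below, using SQLite for the relational tweet table and plain in-memory structures standing in for the key-value store and the graph. Several details are assumptions: the tweet_id primary key is not in the design, timestamps are stored as ISO 8601 text, the user store is keyed by user_id, and following/followers are held as sets rather than lists so that follows stay unique.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import Dict, Set

# Tweet DB (relational). tweet_id is an assumed surrogate key.
TWEET_SCHEMA = """
CREATE TABLE IF NOT EXISTS tweets (
    tweet_id   INTEGER PRIMARY KEY,
    content    TEXT    NOT NULL,
    favorites  INTEGER NOT NULL DEFAULT 0,
    created_by TEXT    NOT NULL,  -- user_id of the author
    created_on TEXT    NOT NULL   -- ISO 8601 timestamp
);
"""


def create_tweet_table(conn: sqlite3.Connection) -> None:
    conn.executescript(TWEET_SCHEMA)


# User DB (key-value): user_id -> username. A dict stands in for the store.
users: Dict[str, str] = {}


# Follower DB (graph): one node per user with outgoing and incoming edges.
@dataclass
class FollowNode:
    user_id: str
    following: Set[str] = field(default_factory=set)
    followers: Set[str] = field(default_factory=set)


follow_graph: Dict[str, FollowNode] = {}


def follow(follower_id: str, followee_id: str) -> None:
    """Record the edge on both endpoints so either direction is one lookup."""
    a = follow_graph.setdefault(follower_id, FollowNode(follower_id))
    b = follow_graph.setdefault(followee_id, FollowNode(followee_id))
    a.following.add(followee_id)
    b.followers.add(follower_id)
```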
High-level design
At a high level, the system contains two main API services: a tweet managing service which handles tweets and a user managing service which deals with user data. The tweet managing service interacts with the tweet database, which stores all tweets. The user managing service interacts with the user database, which contains all information for users. Finally, both services interact with the follower database, which contains the details of followers/following for each user.
Request flows
When a user sends a request, it first reaches the load balancer, which forwards it to the least busy web server. The web server processes the request and routes it to the appropriate service (the tweet managing service or the user managing service). Once the service has finished processing, the response is returned to the client.
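"Least busy" can be approximated by tracking the number of in-flight requests per server and picking the minimum, as in this minimal sketch. The class name, method names, and server labels are hypothetical; a production load balancer would also handle health checks and concurrency.

```python
class LeastBusyBalancer:
    """Route each request to the server with the fewest in-flight requests."""

    def __init__(self, servers):
        # In-flight request count per server, all starting at zero.
        self.active = {server: 0 for server in servers}

    def acquire(self):
        """Pick the least busy server and count the new request against it."""
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        """Call when the server has finished handling a request."""
        self.active[server] -= 1


balancer = LeastBusyBalancer(["web-1", "web-2"])
first = balancer.acquire()   # both idle, so the first listed server is chosen
second = balancer.acquire()  # web-1 now has one request, so web-2 is chosen
balancer.release(first)      # web-1 finishes its request
third = balancer.acquire()   # web-1 is idle again and is chosen
```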
Detailed component design
Tweet posting: When a user posts a tweet, the tweet is sent to the tweet managing service, and the appropriate database entry is created.
Tweet fetching: When a user loads their feed, the request is sent to the tweet managing service, which first fetches the user ids of the accounts that user follows from the follower database. It then queries the tweet database for the top X tweets created by those users, sorted by creation time with the most recent first.
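The read-time feed query can be sketched against SQLite as below. The table layout mirrors the tweet fields from the database design; the function signature, the default limit, and the sample rows are assumptions for illustration, and the list of followed ids is passed in directly rather than read from a follower database.

```python
import sqlite3


def fetch_tweets(conn, following, limit=20):
    """Return the most recent tweets written by the followed users."""
    if not following:
        return []
    # One "?" placeholder per followed user keeps the query parameterized.
    placeholders = ",".join("?" for _ in following)
    query = (
        "SELECT content, created_by, created_on FROM tweets "
        f"WHERE created_by IN ({placeholders}) "
        "ORDER BY created_on DESC LIMIT ?"
    )
    return conn.execute(query, [*following, limit]).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tweets (content TEXT, favorites INTEGER DEFAULT 0, "
    "created_by TEXT, created_on TEXT)"
)
conn.executemany(
    "INSERT INTO tweets (content, created_by, created_on) VALUES (?, ?, ?)",
    [
        ("hello", "alice", "2024-01-01T10:00:00"),
        ("lunch", "bob", "2024-01-01T12:00:00"),
        ("not followed", "carol", "2024-01-01T13:00:00"),
    ],
)

# A reader who follows alice and bob sees bob's newer tweet first.
feed = fetch_tweets(conn, ["alice", "bob"], limit=10)
```

ISO 8601 timestamps sort correctly as text, which is why ORDER BY created_on works here. An index on (created_by, created_on) would likely be needed once the table grows.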
Tweet favoriting: When a user favorites a tweet, a request is sent to the tweet managing service, which increments that tweet's favorites count in the tweet database by 1.
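Doing the increment inside a single UPDATE statement, rather than reading the count and writing it back, avoids lost updates when two favorites arrive at once. The sketch below assumes a tweet_id primary key, which the database design does not list, and a hypothetical favorite_tweet function signature.

```python
import sqlite3


def favorite_tweet(conn, tweet_id):
    """Atomically add 1 to a tweet's favorites; return False if no such tweet."""
    cursor = conn.execute(
        "UPDATE tweets SET favorites = favorites + 1 WHERE tweet_id = ?",
        (tweet_id,),
    )
    conn.commit()
    return cursor.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tweets (tweet_id INTEGER PRIMARY KEY, content TEXT, "
    "favorites INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO tweets (tweet_id, content) VALUES (1, 'hello')")

favorite_tweet(conn, 1)
favorite_tweet(conn, 1)
count = conn.execute(
    "SELECT favorites FROM tweets WHERE tweet_id = 1"
).fetchone()[0]
```

Because the design does not record who favorited a tweet, nothing here prevents the same user from favoriting the same tweet twice.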
Trade offs/Tech choices
Trade offs: The feed is generated on read rather than on write for every user. This makes feeds slower to load, but avoids the unnecessary work of building feeds that are never viewed. Additionally, feed generation happens inside the service at request time, so it could become slower as the tweet database grows, since each query has to filter through more content. However, this is simpler to implement and cheaper to build initially, which is acceptable given the low number of DAU. Finally, tweet favoriting does not keep track of which users favorited a tweet, and there is no way to sort by popularity. This limits the feature, but again makes the implementation faster and cheaper.
Failure scenarios/bottlenecks
The biggest bottleneck is the tweet database: the number of tweets could grow rapidly, and the current read-time feed query might not scale well, even with database sharding. Additionally, each database currently runs as a single instance, which is a single point of failure and does not scale, so database replication is necessary as well. Finally, the services could be overloaded when a large number of users are online, which would increase latency.
Future improvements
Future improvements include database replication and sharding for better scalability. Additionally, creating feeds on write for more popular users and on read for less popular users could improve feed load times. It would also be useful to move feed creation into its own service that does not rely on as many database reads. Finally, a message queue or a leaky-bucket rate limiter could keep the services from being overloaded while keeping the solution simple.
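A leaky-bucket rate limiter can be sketched as a counter that drains at a fixed rate and rejects requests once it is full. The version below is a single-process illustration only; the class name, parameters, and the explicit now argument (used to make the behavior deterministic) are assumptions, and a real deployment would need shared state across servers.

```python
import time


class LeakyBucket:
    """Leaky bucket used as a meter: each request adds 1, the level drains over time."""

    def __init__(self, capacity, leak_rate_per_sec, now=None):
        self.capacity = capacity            # maximum burst the bucket can hold
        self.leak_rate = leak_rate_per_sec  # sustained requests per second
        self.level = 0.0
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None):
        """Return True if the request fits in the bucket, False if it must be rejected."""
        if now is None:
            now = time.monotonic()
        # Drain the bucket for the time elapsed since the last call.
        elapsed = now - self.last
        self.level = max(0.0, self.level - elapsed * self.leak_rate)
        self.last = now
        if self.level + 1 <= self.capacity:
            self.level += 1
            return True
        return False


bucket = LeakyBucket(capacity=2, leak_rate_per_sec=1.0, now=0.0)
results = [
    bucket.allow(now=0.0),  # True: bucket level goes to 1
    bucket.allow(now=0.0),  # True: bucket level goes to 2 (full)
    bucket.allow(now=0.0),  # False: no room left
    bucket.allow(now=1.0),  # True: one unit drained after 1 second
]
```

Rejected requests could be answered with an error immediately or placed on a message queue for later processing, which is the other option mentioned above.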