System requirements
Functional:
Users:
- Users can sign up for a new account using email and password
- Users can delete their account
- Users can change their email or password
- Users can follow other users
Timeline:
- Users can see the tweets of users they follow on their timeline
- Timeline is ordered from newest to oldest tweets
- When a timeline is refreshed, the newest tweets are displayed
Tweets:
- Users can compose a tweet of up to 280 characters
- Users can share a tweet by reposting it on their timeline
- Users can share a tweet by copying a link and sending it via some other application
- Users can favorite a tweet
Non-Functional:
- Latency should be minimized, especially for timeline reads, which are the most frequent operation
- Availability should be maximized
Capacity estimation
Assume 500 million monthly active users. If 50% of them are active on a given day, that gives 250 million daily active users.
Traffic is unlikely to be spread evenly over the course of the day. Assume 50% of hours are low traffic (10% of daily active users active per hour), 25% are medium traffic (25% active per hour), and 25% are high traffic (50% active per hour). The peak is then 125M users in an hour. If requests are spread evenly within that hour, at one request per user, the peak is 125M / 3,600 s, or about 35k requests/second.
Any time a user lands on Twitter, their timeline loads, so the get-timeline operation peaks at about 35k requests/second.
Most users aren't posting all the time. Assuming a worst case where 10% of active users post, tweet writes peak at about 3.5k requests/second.
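The arithmetic behind these estimates can be checked with a short script. This is only a sketch of the back-of-the-envelope numbers above; the constant names are made up for illustration, and the "one request per active user per hour" simplification is the same assumption the estimate already makes.

```python
# Back-of-the-envelope capacity estimate using the assumptions stated above.
MONTHLY_ACTIVE_USERS = 500_000_000
DAILY_ACTIVE_RATIO = 0.5         # assumed: half of monthly users are active daily
PEAK_HOURLY_ACTIVE_RATIO = 0.5   # assumed: half of daily users are active in a peak hour
POSTING_RATIO = 0.1              # assumed worst case: 10% of active users post
SECONDS_PER_HOUR = 3600

daily_active_users = int(MONTHLY_ACTIVE_USERS * DAILY_ACTIVE_RATIO)
peak_hourly_users = int(daily_active_users * PEAK_HOURLY_ACTIVE_RATIO)

# One timeline load per active user, spread evenly across the peak hour.
peak_timeline_reads_per_sec = peak_hourly_users / SECONDS_PER_HOUR
peak_tweet_writes_per_sec = peak_timeline_reads_per_sec * POSTING_RATIO

print(f"Daily active users:      {daily_active_users:,}")
print(f"Peak hourly users:       {peak_hourly_users:,}")
print(f"Peak timeline reads/sec: {peak_timeline_reads_per_sec:,.0f}")
print(f"Peak tweet writes/sec:   {peak_tweet_writes_per_sec:,.0f}")
```

The exact figures come out slightly under the rounded 35k and 3.5k used in the text, which is fine for sizing purposes.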
API design
User:
- Create User
- Update User
- Delete User
- Follow User
Timeline:
- Get timeline for user
Tweets:
- Post tweet
- Like tweet
- Delete tweet
Database design
User table
- userId (Primary Key) (string)
- username (string)
- follows (list of userIds)
Tweet table
- tweetId (Primary Key) (string)
- date (timestamp)
- userId (Foreign Key) (string)
- type (string) (enum: Tweet | Like | Repost)
- content (string)
- likes (number)
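As a rough illustration, the two tables could be modeled with plain Python dataclasses. This is a sketch, not a real schema: the snake_case field names, the TweetType enum class, the MAX_TWEET_LENGTH constant, and the validate_tweet helper are all hypothetical names chosen here, and the 280-character check simply mirrors the functional requirement.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import List

MAX_TWEET_LENGTH = 280  # from the functional requirements


class TweetType(Enum):
    TWEET = "Tweet"
    LIKE = "Like"
    REPOST = "Repost"


@dataclass
class User:
    user_id: str                      # primary key
    username: str
    follows: List[str] = field(default_factory=list)  # IDs of followed users


@dataclass
class Tweet:
    tweet_id: str                     # primary key
    user_id: str                      # foreign key to User.user_id
    type: TweetType
    content: str
    likes: int = 0
    date: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def validate_tweet(tweet: Tweet) -> None:
    """Reject tweets whose content exceeds the character limit."""
    if len(tweet.content) > MAX_TWEET_LENGTH:
        raise ValueError(f"tweet exceeds {MAX_TWEET_LENGTH} characters")


alice = User(user_id="u1", username="alice")
hello = Tweet(tweet_id="t1", user_id=alice.user_id,
              type=TweetType.TWEET, content="hello world")
validate_tweet(hello)  # passes silently for a short tweet
```

Storing follows as a list on the user row keeps the timeline lookup to a single read, at the cost of large rows for users who follow many accounts.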
High-level design
We will have two microservices: UserService and TweetService
- UserService will be responsible for creating and managing users
- TweetService will be responsible for all tweets
User Service:
- This will have one table: Users
- This will have 4 APIs:
  - CreateUser
  - UpdateUser
  - DeleteUser
  - FollowUser
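A minimal in-memory sketch of the User Service APIs might look like the following. It assumes a dictionary stands in for the Users table; the method names, the get_follows helper, and the error handling are illustrative choices, not part of the design above.

```python
from typing import Dict, List


class UserService:
    """In-memory stand-in for the Users table and its APIs (illustrative only)."""

    def __init__(self) -> None:
        self._users: Dict[str, dict] = {}

    def create_user(self, user_id: str, username: str) -> dict:
        if user_id in self._users:
            raise ValueError(f"user {user_id} already exists")
        self._users[user_id] = {"userId": user_id, "username": username, "follows": []}
        return self._users[user_id]

    def update_user(self, user_id: str, username: str) -> dict:
        user = self._users[user_id]  # raises KeyError for unknown users
        user["username"] = username
        return user

    def delete_user(self, user_id: str) -> None:
        del self._users[user_id]
        # Also drop the deleted user from everyone else's follows list.
        for user in self._users.values():
            if user_id in user["follows"]:
                user["follows"].remove(user_id)

    def follow_user(self, follower_id: str, followee_id: str) -> None:
        if followee_id not in self._users:
            raise KeyError(followee_id)
        follows = self._users[follower_id]["follows"]
        if followee_id not in follows:  # following twice is a no-op
            follows.append(followee_id)

    def get_follows(self, user_id: str) -> List[str]:
        return list(self._users[user_id]["follows"])
```

Scanning every user on delete is only acceptable in a toy like this; a real store would need an index or an asynchronous cleanup to remove stale follows.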
Tweet Service:
- This will have one table: Tweets
- This will have 4 APIs:
  - GetTimelineForUser
  - PostTweet
  - LikeTweet
  - DeleteTweet
Request flows
Users:
- Create/Update/Delete will be typical CRUD database operations
Tweets:
- Get Timeline:
  - First fetch all follows for the user from the Users table
  - For each followed user, fetch the 5 most recent tweets from the Tweets table
  - Merge the fetched tweets, sort them by date from newest to oldest, and return them
- Post Tweet:
  - Writes a tweet to the database
- Like Tweet:
  - Writes a like to the database. A stream attached to the table then finds the original tweet and increments its likes value
- Delete Tweet:
  - Deletes a tweet from the database
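The tweet request flows above can be sketched in memory as follows. This is a simplified illustration under several assumptions: a dictionary stands in for the Tweets table, the follows mapping stands in for the lookup against the Users table, a counter replaces real timestamps, and the like counter is incremented inline rather than by a separate stream consumer. All class and method names are hypothetical.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, List

MAX_TWEET_LENGTH = 280


@dataclass
class Tweet:
    tweet_id: str
    user_id: str
    content: str
    date: int       # logical timestamp; a real system would store a datetime
    likes: int = 0


class TweetService:
    def __init__(self, follows: Dict[str, List[str]], per_user_limit: int = 5) -> None:
        self._follows = follows            # stand-in for the Users table lookup
        self._tweets: Dict[str, Tweet] = {}
        self._clock = itertools.count(1)   # monotonically increasing "time"
        self._per_user_limit = per_user_limit

    def post_tweet(self, user_id: str, content: str) -> Tweet:
        if len(content) > MAX_TWEET_LENGTH:
            raise ValueError("tweet exceeds 280 characters")
        ts = next(self._clock)
        tweet = Tweet(tweet_id=f"t{ts}", user_id=user_id, content=content, date=ts)
        self._tweets[tweet.tweet_id] = tweet
        return tweet

    def like_tweet(self, tweet_id: str) -> None:
        # In the design, a stream consumer performs this increment asynchronously.
        self._tweets[tweet_id].likes += 1

    def delete_tweet(self, tweet_id: str) -> None:
        self._tweets.pop(tweet_id, None)

    def get_timeline(self, user_id: str) -> List[Tweet]:
        timeline: List[Tweet] = []
        for followed in self._follows.get(user_id, []):
            authored = [t for t in self._tweets.values() if t.user_id == followed]
            authored.sort(key=lambda t: t.date, reverse=True)
            timeline.extend(authored[: self._per_user_limit])  # 5 most recent
        timeline.sort(key=lambda t: t.date, reverse=True)      # newest first
        return timeline


service = TweetService({"alice": ["bob", "carol"]})
service.post_tweet("bob", "first")
service.post_tweet("carol", "second")
service.post_tweet("bob", "third")
print([t.content for t in service.get_timeline("alice")])
```

Building the timeline at read time keeps writes cheap, but each timeline load costs one query per followed user, which matters at the roughly 35k reads/second estimated earlier.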
Detailed component design
Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...
Trade offs/Tech choices
Explain any trade offs you have made and why you made certain tech choices...
Failure scenarios/bottlenecks
Try to discuss as many failure scenarios/bottlenecks as possible.
Future improvements
What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?