System requirements
Functional:
1) Post a tweet
2) View the home page feed - tweets from the users you are following
3) Like a tweet
Non-Functional:
1) Scalable
2) Eventual Consistency
3) Highly Available
4) Low latency - 100 to 200 ms
Capacity estimation
DAU - 50 million, 2 tweets per user per day
200 accounts followed / user * 2 tweets = 400 tweets per user per day for the activity feed
50 million * 2 = 100 million tweets/day
Storage
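Per tweet: 280 bytes for the tweet text + 220 bytes for the remaining fields = 500 bytes
100 million * 500 bytes
= 50 GB / day
50 GB * 30 = 1500 GB / month
2 replicas (3 copies in total)
1500 GB * 3 = 4500 GB / month
4500 GB * 12 = 54 TB / year for storing the tweets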
Number of Requests
100 million writes per day
= 100 million / (24 * 3600)
≈ 1,157 writes / second on average
50 million users * 400 feed tweets = 20 billion reads per day
= 20 billion / (24 * 3600)
≈ 231,481 reads / second on average
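The estimate can be recomputed with a few lines of Python. This is only a sketch of the arithmetic: the inputs are the assumptions stated in this section (50 million DAU, 2 tweets per user, 200 accounts followed, 500 bytes per tweet, 2 replicas, a 30-day month), and the variable names are made up for illustration.

```python
# Back-of-the-envelope capacity estimate.
# Every input below is an assumption taken from the notes, not a measured value.
SECONDS_PER_DAY = 24 * 3600

dau = 50_000_000                 # daily active users
tweets_per_user_per_day = 2
accounts_followed = 200
bytes_per_tweet = 280 + 220      # tweet text + remaining fields
copies = 3                       # primary + 2 replicas

tweets_per_day = dau * tweets_per_user_per_day
feed_tweets_per_user = accounts_followed * tweets_per_user_per_day
reads_per_day = dau * feed_tweets_per_user

storage_gb_per_day = tweets_per_day * bytes_per_tweet / 1e9
storage_gb_per_month = storage_gb_per_day * 30 * copies
storage_tb_per_year = storage_gb_per_month * 12 / 1000

writes_per_second = tweets_per_day / SECONDS_PER_DAY
reads_per_second = reads_per_day / SECONDS_PER_DAY

print(f"tweets/day:        {tweets_per_day:,}")
print(f"reads/day:         {reads_per_day:,}")
print(f"storage GB/day:    {storage_gb_per_day:,.0f}")
print(f"storage GB/month:  {storage_gb_per_month:,.0f} (with replicas)")
print(f"storage TB/year:   {storage_tb_per_year:,.0f} (with replicas)")
print(f"writes/second:     {writes_per_second:,.0f}")
print(f"reads/second:      {reads_per_second:,.0f}")
```

These are daily averages; peak traffic would be higher, and this sketch does not model it.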
API design
Read API - GET (POST also accepted) - builds the user's feed
/tweet/read?id=user_id
{
tweet_id
tweet
count of reactions (likes, claps, dislikes) on that tweet
}
Write API - POST request
/tweet/post/
body :
{
userid
tweet
date timestamp
browser
OS
}
Follow API - GET request
/tweet/followers/?id=user_id
Returns the list of user_ids that the given user follows
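The payloads above could be modelled as typed structures. The following is a minimal sketch, not a definitive schema: the class names `PostTweetRequest` and `FeedItem` and the `validate` helper are hypothetical, the field names follow the notes, and the 280-character check reflects the tweet length limit this design assumes.

```python
from dataclasses import dataclass, asdict, field
from typing import Dict

MAX_TWEET_CHARS = 280  # tweet length limit assumed by this design


@dataclass
class PostTweetRequest:
    """Body of the write API (hypothetical class name)."""
    userid: str
    tweet: str
    timestamp: str
    browser: str
    os: str

    def validate(self) -> None:
        # Reject empty tweets and tweets over the character limit.
        if not self.tweet:
            raise ValueError("tweet must not be empty")
        if len(self.tweet) > MAX_TWEET_CHARS:
            raise ValueError("tweet exceeds 280 characters")


@dataclass
class FeedItem:
    """One entry in the read API response (hypothetical class name)."""
    tweet_id: str
    tweet: str
    reactions: Dict[str, int] = field(default_factory=dict)


request = PostTweetRequest("u1", "hello world", "2024-01-01T00:00:00Z", "Firefox", "Linux")
request.validate()
item = FeedItem("t1", request.tweet, {"likes": 3, "claps": 1})
print(asdict(item))
```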
Database design
A NoSQL database is preferred over a relational one: a flexible schema keeps development fast when the data structure changes, and horizontal scaling is easier.
This is a read-heavy system.
Tables :
1) users table
id, userid, username, first, last, age, gender
2) tweets table
userid, tweetid, tweet_text
3) followers table
userid, followerid
4) user_likes table
userid, tweetid, reaction (clap, like)
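To show how the followers and user_likes tables answer the two main lookups (who follows a user, and how many reactions a tweet has), here is a small in-memory sketch. The row types and the helper functions `followers_of` and `reaction_counts` are hypothetical; a real store would use indexed queries rather than list scans.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Set


class Follow(NamedTuple):
    userid: str      # the account being followed
    followerid: str  # the account that follows it


class UserLike(NamedTuple):
    userid: str
    tweetid: str
    reaction: str    # e.g. "clap" or "like"


def followers_of(follows: Iterable[Follow], userid: str) -> Set[str]:
    """Return the ids of everyone who follows `userid`."""
    return {row.followerid for row in follows if row.userid == userid}


def reaction_counts(likes: Iterable[UserLike], tweetid: str) -> Counter:
    """Count reactions of each kind on one tweet."""
    return Counter(row.reaction for row in likes if row.tweetid == tweetid)


follows = [Follow("alice", "bob"), Follow("alice", "carol"), Follow("bob", "alice")]
likes = [UserLike("bob", "t1", "like"), UserLike("carol", "t1", "clap"),
         UserLike("alice", "t1", "like"), UserLike("bob", "t2", "like")]

print(followers_of(follows, "alice"))
print(reaction_counts(likes, "t1"))
```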
High-level design
flowchart TD
A[client] ---> B[Load balancer]
B --->C[API Gateway, authentication]
C --> D[Rate limiter and Service discovery]
D --> E[Fanout services]
D --> H[Read service]
E --> F[DB - NoSQL]
F --> L[Replica]
E --> G[Caching]
H --> G
H --> F
F --> |pipeline|I[Data warehouse]
I --> J[Monitoring and Alerts]
I --> K[Reporting, Billing]
Request flows
Detailed component design
Fanout Service :
- find the followers of the user who posted a tweet and write the tweet to each follower's entry in a temporary feed-building table
- this cache/table keeps accumulating the tweets posted by the users each person is following
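The fan-out step can be sketched as follows. This is an illustration under assumptions, not the actual service: `FanoutService`, `on_tweet`, and `FEED_LIMIT` are hypothetical names, the follower map and feed table are plain in-memory structures, and the cap of 400 entries per feed is borrowed from the activity feed estimate.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Set

FEED_LIMIT = 400  # assumed cap on precomputed feed entries per user


class FanoutService:
    """Fan-out on write: push each new tweet id into every follower's feed."""

    def __init__(self, followers: Dict[str, Set[str]]):
        # followers maps an author id to the ids of the users following them.
        self.followers = followers
        # Stand-in for the temporary feed-building table: user id -> newest-first tweet ids.
        self.feeds: Dict[str, Deque[str]] = defaultdict(lambda: deque(maxlen=FEED_LIMIT))

    def on_tweet(self, author_id: str, tweet_id: str) -> int:
        """Write the tweet id to each follower's feed; return how many feeds were updated."""
        targets = self.followers.get(author_id, set())
        for follower_id in targets:
            # appendleft keeps the newest tweet first; the deque drops the oldest when full.
            self.feeds[follower_id].appendleft(tweet_id)
        return len(targets)


fanout = FanoutService({"alice": {"bob", "carol"}, "bob": {"carol"}})
fanout.on_tweet("alice", "t1")
fanout.on_tweet("bob", "t2")
print(list(fanout.feeds["carol"]))
```

The cost of a write grows with the author's follower count, which is why accounts with very large followings are usually handled separately.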
Read Service :
When a user logs in, instead of fetching tweets from every followed account, we can read the prebuilt feed from the temp table and the cache.
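A minimal sketch of that read path, assuming the feed table maps a user id to a newest-first list of tweet ids and that tweet content is looked up in the cache first and in the database on a miss. `ReadService`, `get_feed`, and the plain dicts standing in for the cache and the database are all hypothetical.

```python
from typing import Dict, List


class ReadService:
    """Serve a feed from the precomputed feed table, using the cache before the database."""

    def __init__(self, feed_table: Dict[str, List[str]],
                 cache: Dict[str, str], tweets_db: Dict[str, str]):
        self.feed_table = feed_table  # user id -> newest-first tweet ids
        self.cache = cache            # tweet id -> tweet text
        self.tweets_db = tweets_db    # tweet id -> tweet text (source of truth)

    def get_feed(self, user_id: str, limit: int = 20) -> List[dict]:
        items = []
        for tweet_id in self.feed_table.get(user_id, [])[:limit]:
            text = self.cache.get(tweet_id)
            if text is None:
                # Cache miss: read from the database and populate the cache.
                text = self.tweets_db[tweet_id]
                self.cache[tweet_id] = text
            items.append({"tweet_id": tweet_id, "tweet": text})
        return items


cache: Dict[str, str] = {}
service = ReadService({"u1": ["t2", "t1"]}, cache, {"t1": "hello", "t2": "world"})
print(service.get_feed("u1"))
```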
Caching : for users who have a large following, say more than 1000 followers
> cache the tweet_ids, content and likes/reactions of tweets posted by celebrities, since these are very read-heavy
LFU eviction
Write-back cache
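LFU eviction can be illustrated with a small cache that tracks a hit count per key and evicts the least frequently used key when full. This is a simplified sketch: `LFUCache` is a hypothetical class, eviction scans all keys (linear time), ties go to the oldest inserted key, and the write-back behaviour is not modelled.

```python
from typing import Any, Dict, Hashable, Optional


class LFUCache:
    """Fixed-capacity cache that evicts the least frequently used key."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.values: Dict[Hashable, Any] = {}
        self.hits: Dict[Hashable, int] = {}

    def get(self, key: Hashable) -> Optional[Any]:
        if key not in self.values:
            return None
        self.hits[key] += 1
        return self.values[key]

    def put(self, key: Hashable, value: Any) -> None:
        if key not in self.values and len(self.values) >= self.capacity:
            # Evict the key with the fewest hits; min() returns the first such key on ties.
            victim = min(self.hits, key=self.hits.get)
            del self.values[victim]
            del self.hits[victim]
        self.values[key] = value
        self.hits[key] = self.hits.get(key, 0) + 1


tweets = LFUCache(capacity=2)
tweets.put("t1", "celebrity tweet")
tweets.put("t2", "rarely read tweet")
tweets.get("t1")
tweets.get("t1")
tweets.put("t3", "new tweet")  # cache is full, so the least used key is evicted
print(sorted(tweets.values))
```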
Trade offs/Tech choices
Tweets limited to 280 characters
Number of replicas
Rate limiting to mitigate DDoS attacks
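The notes do not say which rate limiting algorithm is used. One common choice is a token bucket, sketched below as an assumption rather than as this design's actual limiter. `TokenBucket`, its parameters, and the injectable `clock` are hypothetical names chosen for illustration.

```python
import time
from typing import Callable


class TokenBucket:
    """Token bucket: allows bursts up to `capacity`, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = clock()

    def allow(self) -> bool:
        """Return True if the request may proceed, consuming one token."""
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# A fake clock makes the behaviour deterministic for the example.
fake_now = [0.0]
bucket = TokenBucket(rate=1.0, capacity=2, clock=lambda: fake_now[0])
burst = [bucket.allow(), bucket.allow(), bucket.allow()]
fake_now[0] = 1.0  # one second later, one token has been refilled
after_wait = [bucket.allow(), bucket.allow()]
print(burst, after_wait)
```

In practice there would be one bucket per client (user id or IP address), kept in a shared store so that every gateway instance sees the same counts.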
Failure scenarios/bottlenecks
Database read failure - serve reads from the replicas
Stale cache entries - evict old entries with LRU
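Falling back to a replica on a read failure might look like the sketch below. It assumes each database node is represented by a callable that either returns the value or raises `ConnectionError`; `ReplicatedReader` and the node functions are hypothetical stand-ins for real database clients.

```python
from typing import Callable, List

Node = Callable[[str], str]  # a node takes a key and returns the stored value


class ReplicatedReader:
    """Try the primary first, then each replica in order, until one read succeeds."""

    def __init__(self, primary: Node, replicas: List[Node]):
        self.primary = primary
        self.replicas = replicas

    def read(self, key: str) -> str:
        last_error = None
        for node in [self.primary, *self.replicas]:
            try:
                return node(key)
            except ConnectionError as exc:
                last_error = exc  # remember the failure and try the next node
        raise ConnectionError("all database nodes failed") from last_error


def failing_primary(key: str) -> str:
    raise ConnectionError("primary is down")


def healthy_replica(key: str) -> str:
    return {"t1": "hello"}[key]


reader = ReplicatedReader(failing_primary, [healthy_replica])
print(reader.read("t1"))
```

With eventual consistency, a replica may briefly return older data than the primary, which this design accepts.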
Future improvements
Alerts
Notification system to notify all followers when a user posts