System requirements
Functional:
List functional requirements for the system (Ask the chat bot for hints if stuck.)...
Tweets stored forever, but can be deleted
Can tweet videos, text, pictures up to 100 megabytes
Can reply to other tweets
Have a user page with your own tweets
Can follow other users
Can see tweets of those you follow
Can search for users
Non-Functional:
List non-functional requirements for the system...
High scalability (500 million daily active users)
High availability
Low latency refresh and video loading
Capacity estimation
Estimate the scale of the system you are going to design...
Assuming 500 million daily active users, each posting 5 tweets a day at an average size of 20 megabytes, the system ingests 2.5 billion tweets, or about 50 petabytes, per day. Over 10 years that is roughly 182,500 petabytes (about 182.5 exabytes).
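The arithmetic behind this estimate can be checked with a short script. This is a minimal sketch that assumes decimal units (1 PB = 10^9 MB) and 365-day years; the constant and function names are illustrative only.

```python
# Back-of-the-envelope storage estimate using the figures stated above.
DAILY_ACTIVE_USERS = 500_000_000
TWEETS_PER_USER_PER_DAY = 5
AVG_TWEET_SIZE_MB = 20           # average size, attachments included
MB_PER_PB = 1_000_000_000        # assumption: decimal units, 1 PB = 10^9 MB


def daily_storage_pb() -> float:
    """Petabytes of new tweet data written per day."""
    tweets_per_day = DAILY_ACTIVE_USERS * TWEETS_PER_USER_PER_DAY
    return tweets_per_day * AVG_TWEET_SIZE_MB / MB_PER_PB


def storage_over_years_pb(years: int) -> float:
    """Petabytes accumulated over the given number of 365-day years."""
    return daily_storage_pb() * 365 * years


print(daily_storage_pb())         # 50.0
print(storage_over_years_pb(10))  # 182500.0
```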
API design
Define what APIs are expected from the system...
post_tweet() - post a tweet
delete_tweet() - delete a tweet
reply_tweet() - reply to a tweet
like_tweet() - like a tweet
toggle_follow() - toggle whether you follow a user
get_followed_tweets() - get tweets from people you follow
search() - search for tweets / users
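To make the API surface concrete, here is a toy, single-process sketch of the calls listed above. The class name, parameter names, and return values are all assumptions made for illustration; the real system would sit behind the API gateway and the databases described later.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class InMemoryTwitter:
    """Toy stand-in for the API; everything lives in local dictionaries."""
    tweets: dict = field(default_factory=dict)     # tweet id -> tweet dict
    following: dict = field(default_factory=dict)  # user -> set of followed users
    _ids: count = field(default_factory=count, repr=False)

    def post_tweet(self, user, text, attachment=None, reply_to=None):
        tweet_id = next(self._ids)
        self.tweets[tweet_id] = {
            "id": tweet_id, "user": user, "likes": 0,
            "text": text, "attachment": attachment, "reply_to": reply_to,
        }
        return tweet_id

    def delete_tweet(self, user, tweet_id):
        # Only the author may delete a tweet (an assumed rule).
        tweet = self.tweets.get(tweet_id)
        if tweet is None or tweet["user"] != user:
            return False
        del self.tweets[tweet_id]
        return True

    def reply_tweet(self, user, tweet_id, text):
        return self.post_tweet(user, text, reply_to=tweet_id)

    def like_tweet(self, tweet_id):
        self.tweets[tweet_id]["likes"] += 1

    def toggle_follow(self, follower, followee):
        """Return True if now following, False if just unfollowed."""
        followed = self.following.setdefault(follower, set())
        if followee in followed:
            followed.remove(followee)
            return False
        followed.add(followee)
        return True

    def get_followed_tweets(self, user):
        followed = self.following.get(user, set())
        return [t for t in self.tweets.values() if t["user"] in followed]

    def search(self, query):
        """Naive substring search over handles and tweet text."""
        users = sorted({t["user"] for t in self.tweets.values() if query in t["user"]})
        tweets = [t for t in self.tweets.values() if query in t["text"]]
        return {"users": users, "tweets": tweets}
```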
Database design
Defining the system data model early on will clarify how data will flow among different components of the system. Also you could draw an ER diagram using the diagramming tool to enhance your design...
Use a sharded master database hosted in the USA. Add read replicas by region, for example in India, Australia, and Germany, for better load times.
Use NoSQL databases for their high scalability. Store content like:
{
id:
user: (@ handle)
likes: (int)
date: (milliseconds)
content: {
text:
attachment:
}
}
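The document shape above could be expressed as typed records. This is a sketch only: it assumes the date is stored as milliseconds since the Unix epoch and that the attachment field holds a URL to media served elsewhere (for example by the CDN) rather than the media bytes themselves. Neither assumption is stated in the schema.

```python
import time
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class Content:
    text: str = ""
    attachment: Optional[str] = None  # assumption: URL of the stored media


@dataclass
class Tweet:
    id: str
    user: str        # @ handle
    content: Content
    likes: int = 0
    # assumption: creation time in milliseconds since the Unix epoch
    date: int = field(default_factory=lambda: int(time.time() * 1000))


tweet = Tweet(id="1", user="@alice", content=Content(text="hello"))
doc = asdict(tweet)  # nested dict, ready to hand to a document store
```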
Also place a content delivery network (CDN) in each region to deliver static content, with the origin in the USA.
All reads go to the regional database. All writes go to the master database, which eventually propagates them to the regional databases. This gives low read latency at the cost of eventual consistency: a regional read may briefly return stale data.
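The read and write paths can be illustrated with a small in-memory model. This is a sketch under simplifying assumptions: one master, dictionaries standing in for databases, and an explicit replicate() call standing in for asynchronous propagation. The class and method names are hypothetical.

```python
from collections import deque


class ReplicatedStore:
    """Writes go to the master; reads are served by regional replicas."""

    def __init__(self, regions):
        self.master = {}
        self.replicas = {region: {} for region in regions}
        self.pending = deque()  # changes not yet applied to the replicas

    def write(self, key, value):
        self.master[key] = value
        self.pending.append((key, value))

    def read(self, region, key):
        # May return stale data (or None) until replicate() has run.
        return self.replicas[region].get(key)

    def replicate(self):
        """Apply all pending changes to every regional replica."""
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas.values():
                replica[key] = value


store = ReplicatedStore(["india", "australia", "germany"])
store.write("tweet-1", "hello")
stale = store.read("india", "tweet-1")   # None: not yet propagated
store.replicate()
fresh = store.read("india", "tweet-1")   # "hello"
```

The gap between the two reads is the eventual-consistency window that this design accepts.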
High-level design
You should identify enough components that are needed to solve the actual problem from end to end. Also remember to draw a block diagram using the diagramming tool to augment your design. If you are unfamiliar with the tool, you can simply describe your design to the chat bot and ask it to generate a starter diagram for you to modify...
We need DNS to resolve the service's hostname to an IP address, a load balancer to distribute requests, and API gateways and databases in each region. We also want a content delivery network in each region to hold static content.
Request flows
Explain how the request flows from end to end in your high level design. Also you could draw a sequence diagram using the diagramming tool to enhance your explanation...
The client makes a request. DNS first resolves the hostname to an IP address, and the request is sent to a load balancer. A read request goes to the regional API gateway and then to the closest regional database, which returns the data to the client. A write request is queued for the master database, which applies the update and then propagates the change to the regional databases.
Detailed component design
Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...
If both the regional and master databases are sharded, they should scale horizontally. Capacity can be added by inserting new shards that take over part of an existing shard's hash range.
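One common way to add shards "in between" existing hash ranges is consistent hashing. The sketch below is an assumption about how this could be done, not something the design specifies; the shard names, the use of MD5, and the number of virtual nodes are illustrative choices.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


class HashRing:
    """Consistent-hash ring with virtual nodes."""

    def __init__(self, shards, vnodes=100):
        self.vnodes = vnodes
        self._points = []  # sorted positions on the ring
        self._owner = {}   # position -> shard name
        for shard in shards:
            self.add_shard(shard)

    def add_shard(self, shard):
        # Each shard owns many small arcs, which evens out the load.
        for i in range(self.vnodes):
            point = _hash(f"{shard}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = shard

    def shard_for(self, key):
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing([f"shard-{i}" for i in range(4)])
keys = [f"tweet-{i}" for i in range(1000)]
before = {k: ring.shard_for(k) for k in keys}
ring.add_shard("shard-4")
after = {k: ring.shard_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Only the keys in `moved` change owner, and each of them moves to the new shard, so adding capacity does not reshuffle the whole dataset.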
The data should be split across two kinds of database. One is a non-relational tweet store (for example MongoDB) that holds all tweets in the schema above. The other is a graph database (such as Neo4j) that holds only the follow relationships between user IDs. For the API "get_followed_tweets()", we first query Neo4j for the users you follow, then query the database shards for tweets by those users. Each user's recent tweets can be read from a user-level dataset that is kept sorted by time.
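The lookup described above amounts to a fan-out followed by a merge of already-sorted timelines. In this sketch a plain dictionary stands in for the graph database and another for the per-user, time-sorted datasets; the function signature and the `limit` parameter are assumptions.

```python
import heapq
from itertools import islice


def get_followed_tweets(user, follows, timelines, limit=10):
    """Return the newest tweets from the users that `user` follows.

    follows:   user -> set of followed users (stand-in for the graph database)
    timelines: user -> list of tweets sorted newest first (stand-in for the
               per-user, time-sorted dataset)
    """
    followed = follows.get(user, set())
    per_user = (timelines.get(u, []) for u in followed)
    # Each timeline is already sorted, so a k-way merge avoids a full sort.
    merged = heapq.merge(*per_user, key=lambda t: t["date"], reverse=True)
    return list(islice(merged, limit))


follows = {"@alice": {"@bob", "@carol"}}
timelines = {
    "@bob": [{"id": 3, "date": 300}, {"id": 1, "date": 100}],
    "@carol": [{"id": 2, "date": 200}],
    "@dave": [{"id": 4, "date": 400}],  # not followed by @alice
}
feed = get_followed_tweets("@alice", follows, timelines)
print([t["id"] for t in feed])  # [3, 2, 1]
```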
Trade offs/Tech choices
Explain any trade offs you have made and why you made certain tech choices...
I used non-relational databases because they scale horizontally more easily: they do not enforce a strict schema or cross-table joins. I used a graph database to represent relationships between users because follow-relationship lookups will likely be faster there.
Failure scenarios/bottlenecks
Try to discuss as many failure scenarios/bottlenecks as possible.
If the database fills up and the shard key has no unused values left, there is no room to grow and no easy way to scale. You would then have to widen the shard key space and redistribute the entire database, which could take significant time. It is better to choose a shard key with a large number of possible values from the start.
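The effect of shard key cardinality can be seen in a small experiment. The sketch below assumes hash-based placement over 16 shards and compares a hypothetical low-cardinality key (a posting region, which is not part of the schema above) with the tweet id.

```python
import hashlib

NUM_SHARDS = 16


def shard_of(shard_key_value: str) -> int:
    digest = hashlib.md5(shard_key_value.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS


def shards_used(values) -> int:
    """Number of distinct shards that receive at least one record."""
    return len({shard_of(v) for v in values})


# "region" is a hypothetical field used only to show a low-cardinality key.
REGIONS = ["usa", "india", "australia", "germany"]
tweets = [{"id": str(i), "region": REGIONS[i % 4]} for i in range(10_000)]

by_region = shards_used(t["region"] for t in tweets)  # at most 4 shards
by_id = shards_used(t["id"] for t in tweets)          # spreads over all shards
```

With only four distinct key values, at most four shards can ever hold data, no matter how many shards are provisioned.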
Future improvements
What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?
In the future, having each regional CDN periodically pull locally popular tweets from the database and cache them could reduce the load on the database.
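A regional cache of popular tweets could follow a least-recently-used policy. This is a minimal sketch assuming a read-through cache with a fixed capacity; the class name, the `fetch_from_db` callback, and the `db_reads` counter are illustrative, and cache invalidation on delete is left out.

```python
from collections import OrderedDict


class TweetCache:
    """Read-through LRU cache placed in front of the regional database."""

    def __init__(self, fetch_from_db, capacity=1000):
        self._fetch = fetch_from_db
        self._capacity = capacity
        self._items = OrderedDict()  # least recently used entry first
        self.db_reads = 0            # how often the database was hit

    def get(self, tweet_id):
        if tweet_id in self._items:
            self._items.move_to_end(tweet_id)  # mark as recently used
            return self._items[tweet_id]
        self.db_reads += 1
        tweet = self._fetch(tweet_id)
        self._items[tweet_id] = tweet
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)    # evict least recently used
        return tweet


db = {1: "a", 2: "b", 3: "c"}
cache = TweetCache(db.get, capacity=2)
cache.get(1)
cache.get(1)  # served from the cache, no database read
```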