System requirements


Functional:

List functional requirements for the system (Ask the chat bot for hints if stuck.)...

Tweets stored forever, but can be deleted

Can tweet videos, text, pictures up to 100 megabytes

Can reply to other tweets

Have a user page with your own tweets

Can follow other users

Can see tweets of those you follow

Can search for users


Non-Functional:

List non-functional requirements for the system...

High scalability (500 million daily active users)

High availability

Low latency refresh and video loading



Capacity estimation

Estimate the scale of the system you are going to design...

Assuming 500 million daily active users with 5 tweets a day and an average size of 20 megabytes, that is 10 petabytes per day or 36,500 petabytes over 10 years.




API design

Define what APIs are expected from the system...

post_tweet() - post a tweet

delete_tweet() - delete a tweet

reply_tweet() - reply to a tweet

like_tweet() - like a tweet

toggle_follow() - toggle whether you follow a user

get_followed_tweets() - get tweets from people you follow

search() - search for tweets / users


Database design

Defining the system data model early on will clarify how data will flow among different components of the system. Also you could draw an ER diagram using the diagramming tool to enhance your design...

Use a master database which is sharded in the USA. Add replica databases by region, perhaps add one in india, austrailia, and germany, for better load times.

Use NoSQL databases for their high scalability. Store content like:

{

id:

user: (@ handle)

likes: (int)

date: (miliseconds)

content: {

text:

attachment:

}

}


Have a content delivery network in each region too that delivers static content. Put the master one in the USA.


Reads all go to the regional database. Writes all go to the master database, which propagates to all the other databases eventually. This achieves low latency and eventual consistency.


High-level design

You should identify enough components that are needed to solve the actual problem from end to end. Also remember to draw a block diagram using the diagramming tool to augment your design. If you are unfamiliar with the tool, you can simply describe your design to the chat bot and ask it to generate a starter diagram for you to modify...

We need a DNS for IP addresses, a load balancer to distribute requests, and api gateways and databases by region. We also want a content delivery network by region to hold static content.




Request flows

Explain how the request flows from end to end in your high level design. Also you could draw a sequence diagram using the diagramming tool to enhance your explanation...

The client makes a request. First the DNS gives them an IP address. Then the request is sent to a load balancer. If it is a read request, it goes to the regional API gateway then to the closest regional database. If it is a write request, it gets queued up to the master database which updates then propagates the change to the regional databases. Read requests finally transfer the regional database to the client.




Detailed component design

Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...

If we use sharding in the regional and master database, the scaling should be great. They can scale by adding new databases for each hash in between.


The database information should be broken up into two types: one is a tweet level non relational database (use MongoDB) which contains all tweets in the schema above. The other is a graph database (like NeoJ4) which just contains the relationships between UserID's. For the api "get_followed_tweets()", we will use NeoJ4 to get all your followers. Then we will query the database shards to get tweets by those users. We can query a user level dataset for their recent tweets. That user level dataset will be auto sorted by time.


Trade offs/Tech choices

Explain any trade offs you have made and why you made certain tech choices...


I used non relational databases because they scale better due to not having to adhere to strict queries. I used a graph database to represent relationships between users because it will likely be faster.



Failure scenarios/bottlenecks

Try to discuss as many failure scenarios/bottlenecks as possible.


If the database gets overloaded with storage, and the number of shard keys runs out, that would lead to no more room in the database and no easy way to scale. Then you would have to expand the number of shard keys and re store your entire database which could take significant time. It is better to select a shard key with a large number of values first.



Future improvements

What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?


In the future, adding frequent requests between the CDN and the database to cache locally popular tweets could reduce the load on the database.