System requirements


Functional:

  • User is able to send text to the server
  • User has a list of other users they are friends with
  • User can favorite tweets
  • Users should be able to see the latest tweets from their friends


Non-Functional:

  • User information
      • Store user information and follower/following relationships
      • User settings/preferences
  • Tweets
      • Store content of the tweet, including media, and when it was posted
      • Store interactions associated with the tweet



Capacity estimation

Estimate the scale of the system you are going to design...


  • Support 100 million users
  • Each user can create 5 tweets per day, so 500 million tweets per day
  • Daily active users: 10% of all users, so 10 million users browsing tweets on the platform each day
  • If each daily active user requests 10 tweets, we're reading 100 million tweets per day
  • Each user has on average 100 followers, so the system holds about 10 billion follower relationships in total
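
The arithmetic behind these bullets can be checked with a few lines of Python. This is only a sketch of the estimate: the per-second rates are an addition that assumes traffic is spread evenly over the day, and the read figure assumes each daily active user makes a single request for 10 tweets.

```python
# Back-of-the-envelope numbers taken from the bullets above.
TOTAL_USERS = 100_000_000
TWEETS_PER_USER_PER_DAY = 5
DAILY_ACTIVE_PERCENT = 10
TWEETS_PER_REQUEST = 10
AVG_FOLLOWERS_PER_USER = 100
SECONDS_PER_DAY = 24 * 60 * 60

tweets_written_per_day = TOTAL_USERS * TWEETS_PER_USER_PER_DAY
daily_active_users = TOTAL_USERS * DAILY_ACTIVE_PERCENT // 100
tweets_read_per_day = daily_active_users * TWEETS_PER_REQUEST
follow_relationships = TOTAL_USERS * AVG_FOLLOWERS_PER_USER

# Assumption: load is uniform over the day (real traffic has peaks).
writes_per_second = tweets_written_per_day / SECONDS_PER_DAY
reads_per_second = tweets_read_per_day / SECONDS_PER_DAY

print(f"tweets written/day:   {tweets_written_per_day:,}")
print(f"daily active users:   {daily_active_users:,}")
print(f"tweets read/day:      {tweets_read_per_day:,}")
print(f"follow relationships: {follow_relationships:,}")
print(f"avg writes/sec:       {writes_per_second:,.0f}")
print(f"avg reads/sec:        {reads_per_second:,.0f}")
```

Under these particular assumptions the system writes more tweets than it reads, which suggests the read estimate (one request of 10 tweets per active user per day) is conservative.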




API design

Define what APIs are expected from the system...

POST /tweet takes a JSON body of

{

user_id: ,

text: string,

media: bytes

}


GET /news_feed/

Returns the latest N tweets from the user's friends (the users they follow)


PUT /user

{

user_name,

email,

profile_pic

}


POST /user_settings

{

notification_setting,

...

}


POST /follow_user

{

user_id: ,

following:

}



Database design

Defining the system data model early on will clarify how data will flow among different components of the system. Also you could draw an ER diagram using the diagramming tool to enhance your design...


User metadata

  • User UUID (primary key)
  • username
  • email
  • created time
  • profile picture s3 path


User settings

  • User UUID (primary key)
  • Notification settings
  • Privacy settings


Tweets store

  • Each record will contain at least:
      • User UUID
      • created time
      • media S3 paths
      • text
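
As a rough illustration, the three record types above could be written as Python dataclasses. The field names and types are assumptions chosen to mirror the bullets. The settings are modelled as plain dictionaries only because the design does not spell out their shape.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List
from uuid import UUID, uuid4


def _utc_now() -> datetime:
    """Timezone-aware creation timestamp."""
    return datetime.now(timezone.utc)


@dataclass
class UserMetadata:
    user_id: UUID                     # primary key
    username: str
    email: str
    profile_picture_s3_path: str
    created_at: datetime = field(default_factory=_utc_now)


@dataclass
class UserSettings:
    user_id: UUID                     # primary key
    notification_settings: Dict[str, bool] = field(default_factory=dict)
    privacy_settings: Dict[str, bool] = field(default_factory=dict)


@dataclass
class Tweet:
    user_id: UUID                     # author of the tweet
    text: str
    media_s3_paths: List[str] = field(default_factory=list)
    created_at: datetime = field(default_factory=_utc_now)


# Hypothetical usage; the bucket name is made up for the example.
alice = UserMetadata(uuid4(), "alice", "alice@example.com",
                     "s3://example-bucket/profiles/alice.png")
tweet = Tweet(alice.user_id, "hello world")
```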


High-level design

You should identify enough components that are needed to solve the actual problem from end to end. Also remember to draw a block diagram using the diagramming tool to augment your design. If you are unfamiliar with the tool, you can simply describe your design to the chat bot and ask it to generate a starter diagram for you to modify...


  • The Tweets Microservice interacts with the sharded SQL database for storing user-related information such as user metadata and preferences.
  • User metadata like user name, email, etc., is stored in the User Metadata Table.
  • User preferences and settings are housed in the User Preferences Table.
  • The Media Store (S3) stores images and videos. Profile picture paths are stored in the User Metadata Table, and tweet media paths are stored with each record in the tweets store.
  • The tweets store will be a NoSQL database, because we want low latency for both reads and writes.
  • Follower/following relationships are maintained in the Follower/Following key-value store. For each user UUID, we store a list of followers and a list of users they follow.
  • This store would have to handle about 10 billion follower relationships (100 million users with 100 followers each on average).




Request flows

Explain how the request flows from end to end in your high level design. Also you could draw a sequence diagram using the diagramming tool to enhance your explanation...


The flow is as follows. A user hits the POST endpoint to create a tweet, and we write it to the NoSQL database along with the author's user UUID and the timestamp. When a user hits the GET endpoint to populate their news feed, we fetch the list of users they follow from the key-value store and query the NoSQL database for the most recent tweets by those users.
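
A minimal in-memory sketch of this flow is shown below. The two dictionaries stand in for the NoSQL tweets store and the key-value store, and the function names `post_tweet` and `news_feed` are hypothetical handlers for the POST and GET endpoints, not part of any real framework.

```python
import time
import uuid
from collections import defaultdict

# Stand-ins for the real stores (assumption: simple in-memory dicts).
tweets_by_user = defaultdict(list)    # user_id -> list of tweet records
following_by_user = defaultdict(set)  # user_id -> ids of users they follow


def post_tweet(user_id, text, created_at=None):
    """Handler sketch for POST /tweet: store the tweet under its author."""
    tweet = {
        "tweet_id": str(uuid.uuid4()),
        "user_id": user_id,
        "text": text,
        "created_at": time.time() if created_at is None else created_at,
    }
    tweets_by_user[user_id].append(tweet)
    return tweet


def news_feed(user_id, limit=10):
    """Handler sketch for GET /news_feed/: newest tweets from followed users."""
    candidates = []
    for followee_id in following_by_user.get(user_id, ()):
        candidates.extend(tweets_by_user.get(followee_id, ()))
    candidates.sort(key=lambda t: t["created_at"], reverse=True)
    return candidates[:limit]
```

Every feed request here reads from each followed user's tweets, which is the cost the pull-based design pays at read time.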


Detailed component design

Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...


We can shard the SQL database to distribute the load and allow for better scalability as the user base grows. We can also add read-only replicas if the database is read more often than it is updated. Because the user IDs are UUIDs, hashed sharding on the user ID should give an even distribution across shards.
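
One way to sketch hashed sharding on the user ID is to hash the UUID bytes and take the result modulo the shard count. The shard count of 16 and the function name `shard_for_user` are assumptions for illustration.

```python
import hashlib
import uuid

NUM_SHARDS = 16  # hypothetical shard count


def shard_for_user(user_id: uuid.UUID, num_shards: int = NUM_SHARDS) -> int:
    """Map a user UUID to a shard index in [0, num_shards)."""
    # Use a stable hash; Python's built-in hash() is not stable across runs.
    digest = hashlib.sha256(user_id.bytes).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


user_id = uuid.uuid4()
print(f"user {user_id} lives on shard {shard_for_user(user_id)}")
```

A plain modulo mapping moves most keys when the number of shards changes. Consistent hashing is a common alternative if shards are expected to be added or removed often.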


We will also shard the NoSQL database based on the user UUID. We will fetch tweets by user UUIDs. We can then route the request for a user's tweets to different shards and parallelize these requests if we have to fetch tweets for more than one user.
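
The parallel fetch could look roughly like the scatter-gather sketch below. The callables `shard_of` and `query_shard` are hypothetical: the first maps a user ID to its shard, the second stands in for a query against one shard of the tweets store.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def fetch_recent_tweets(user_ids, shard_of, query_shard, limit=10):
    """Group user ids by shard, query the shards in parallel, merge results."""
    ids_by_shard = defaultdict(list)
    for user_id in user_ids:
        ids_by_shard[shard_of(user_id)].append(user_id)
    if not ids_by_shard:
        return []

    # One request per shard, regardless of how many users live on it.
    with ThreadPoolExecutor(max_workers=len(ids_by_shard)) as pool:
        futures = [
            pool.submit(query_shard, shard, ids)
            for shard, ids in ids_by_shard.items()
        ]
        tweets = [tweet for future in futures for tweet in future.result()]

    # Merge step: newest first, then trim to the requested page size.
    tweets.sort(key=lambda t: t["created_at"], reverse=True)
    return tweets[:limit]
```

Grouping by shard first keeps the number of requests bounded by the number of shards rather than by the number of followed users.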


We will store the follower/following relationships in a key-value store, keeping a list of followers and a list of following for each user, keyed by the user's UUID. This duplicates data, since each relationship appears in two lists, and an update has to read the current list before writing the new value, but retrieving followers/following is fast. This approach is a more flexible and scalable solution, especially for users with imbalanced follower/following ratios.
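
A small in-memory sketch of this layout is shown below. The class name `FollowStore` and its methods are hypothetical, and a dictionary stands in for the key-value store.

```python
class FollowStore:
    """Per-user record holding both a followers set and a following set."""

    def __init__(self):
        self._kv = {}  # user_id -> {"followers": set, "following": set}

    def _record(self, user_id):
        return self._kv.setdefault(
            user_id, {"followers": set(), "following": set()}
        )

    def follow(self, follower_id, followee_id):
        if follower_id == followee_id:
            raise ValueError("a user cannot follow themselves")
        # The same relationship is written twice, once on each side.
        self._record(follower_id)["following"].add(followee_id)
        self._record(followee_id)["followers"].add(follower_id)

    def unfollow(self, follower_id, followee_id):
        self._record(follower_id)["following"].discard(followee_id)
        self._record(followee_id)["followers"].discard(follower_id)

    def followers(self, user_id):
        record = self._kv.get(user_id)
        return set(record["followers"]) if record else set()

    def following(self, user_id):
        record = self._kv.get(user_id)
        return set(record["following"]) if record else set()
```

Both lookups are a single key read. The cost is that `follow` and `unfollow` touch two records, which a real store would need to keep consistent if one of the two writes fails.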


In the diagram, there is only one tweets microservice, but we can actually run many instances of it, since the service is stateless.


Trade offs/Tech choices

Explain any trade offs you have made and why you made certain tech choices...


We can consider storing the follower and following relationships in SQL. We can accomplish this in a single table, so there will not be any data duplication. However, querying for the list of followers of a given user would become expensive once the table holds about 10 billion rows. The Twitter service is also likely to have many more reads than writes on the follower/following relationships, so we chose to optimize retrieval.
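
For comparison, the single-table SQL alternative might look like the sketch below, using SQLite from the Python standard library purely for illustration. The table, column, and index names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE follows (
        follower_id TEXT NOT NULL,
        followee_id TEXT NOT NULL,
        PRIMARY KEY (follower_id, followee_id)
    )
    """
)
# The primary key serves "who does X follow"; this index serves
# "who follows X".
conn.execute("CREATE INDEX idx_follows_followee ON follows (followee_id)")


def follow(follower_id, followee_id):
    # Each relationship is a single row, so nothing is duplicated.
    conn.execute(
        "INSERT OR IGNORE INTO follows VALUES (?, ?)",
        (follower_id, followee_id),
    )


def followers_of(user_id):
    rows = conn.execute(
        "SELECT follower_id FROM follows WHERE followee_id = ? "
        "ORDER BY follower_id",
        (user_id,),
    )
    return [row[0] for row in rows]


def following_of(user_id):
    rows = conn.execute(
        "SELECT followee_id FROM follows WHERE follower_id = ? "
        "ORDER BY followee_id",
        (user_id,),
    )
    return [row[0] for row in rows]
```

Writes are a single insert, but every follower lookup is an index scan over one very large table, which is the read cost the key-value design tries to avoid.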


Failure scenarios/bottlenecks

Try to discuss as many failure scenarios/bottlenecks as possible.


The NoSQL database that contains tweets could become a bottleneck if a shard becomes unavailable. A user would not be able to fetch tweets that reside on an unavailable shard. This failure scenario also applies to the follower/following store.


The User SQL database can become a bottleneck when querying across 100 million records.


Future improvements

What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?


To mitigate the failure scenario above, we can consider having read-only replicas of the tweets database and the follower/following store in different geographical locations, so that when a shard becomes unavailable, we can still redirect read requests to a replica of that shard.
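
The redirect logic can be sketched as a read that tries the primary copy of a shard first and then each replica in turn. The exception `ShardUnavailable` and the function `read_with_failover` are hypothetical names, and each node is modelled as a plain callable.

```python
class ShardUnavailable(Exception):
    """Raised by a node that cannot serve reads right now."""


def read_with_failover(key, primary, replicas):
    """Read `key` from the primary copy, falling back to read-only replicas."""
    last_error = None
    for node in [primary, *replicas]:
        try:
            return node(key)
        except ShardUnavailable as exc:
            last_error = exc  # remember the failure and try the next copy
    raise ShardUnavailable(
        f"no copy of the shard holding {key!r} is reachable"
    ) from last_error
```

Replicas may lag behind the primary, so a read served this way can return slightly stale tweets or follower lists.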


We can consider having read-only replicas of the user SQL database to help mitigate the querying slowness.


The current system is pull-based, so the client has to hit the tweets microservice to grab a list of tweets to display for a user. We can consider making the system more push-based, so we can send push notifications to users to view tweets. We can add a message queue to the tweets microservice, or add a separate microservice, to publish events to consumers and pre-load the client app with tweets, so they are ready to be viewed when the user opens the app from the push notification.
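
A rough sketch of the push-based variant: new tweets are published to a queue, and a consumer copies each one into a precomputed feed per follower. The in-process `queue.Queue` stands in for a real message queue, and `FEED_SIZE`, `publish_tweet`, and `drain_events` are hypothetical names.

```python
import queue
from collections import defaultdict, deque

FEED_SIZE = 100  # hypothetical cap on precomputed tweets per user

events = queue.Queue()  # stand-in for the message queue
feeds = defaultdict(lambda: deque(maxlen=FEED_SIZE))  # user_id -> newest first


def publish_tweet(tweet):
    """Producer side: called by the tweets microservice after a write."""
    events.put(tweet)


def drain_events(followers_of):
    """Consumer side: push every queued tweet into each follower's feed."""
    processed = 0
    while True:
        try:
            tweet = events.get_nowait()
        except queue.Empty:
            return processed
        for follower_id in followers_of(tweet["user_id"]):
            # appendleft keeps the newest tweet first; maxlen drops the oldest.
            feeds[follower_id].appendleft(tweet)
        processed += 1
```

This moves work from read time to write time, so a user with a very large number of followers makes each tweet expensive to fan out.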