Design Twitter - System Design

System requirements

Functional:

User can post a tweet (less than 100 characters)
User can follow other users to see their tweets
User can search a single user's tweets
User can search tweets by text

Non-Functional:

Latency - a user should be able to view their timeline in under one second.
Availability - after a tweet is created, it should be visible to all other users within seconds.
Scalability - the system should handle growth in the number of users and tweets.
Durability - once stored, a user's tweets are not lost.

Capacity estimation

Assume there are 1 million daily active users, each posting one tweet per day on average. A tweet object is about 200B, so we store about 200MB of new data per day without compression. That's 73GB per year, or about 365GB over 5 years.

On the user side, each user object is less than 100B, so 1 million users take about 100MB in total. Even if we forecast growth to 1 billion users, this is still only 100GB of data. In addition, there is the followership entity. It would follow a power-law distribution: a few power users are highly connected, while most are not. We can assume 10 followers per user on average. That's 10 million follow relationships, or about 100MB assuming roughly 10B per row.
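The storage arithmetic above can be checked with a short script. This is only a back-of-envelope sketch: the inputs are the assumptions stated in this section, and FOLLOW_ROW_BYTES is an extra assumption chosen to match the roughly 100MB figure for followership data.

```python
# Back-of-envelope storage estimate. All inputs are assumptions from the text.
MB, GB = 10**6, 10**9

DAILY_ACTIVE_USERS = 1_000_000
TWEETS_PER_USER_PER_DAY = 1
TWEET_BYTES = 200
USER_BYTES = 100
AVG_FOLLOWERS = 10
FOLLOW_ROW_BYTES = 10  # assumption: compact (follower_id, followed_id) row

tweet_bytes_per_day = DAILY_ACTIVE_USERS * TWEETS_PER_USER_PER_DAY * TWEET_BYTES
tweet_bytes_per_year = tweet_bytes_per_day * 365
tweet_bytes_5_years = tweet_bytes_per_year * 5

user_bytes = DAILY_ACTIVE_USERS * USER_BYTES
follow_rows = DAILY_ACTIVE_USERS * AVG_FOLLOWERS
follow_bytes = follow_rows * FOLLOW_ROW_BYTES

print(f"tweets per day:    {tweet_bytes_per_day / MB:.0f} MB")
print(f"tweets per year:   {tweet_bytes_per_year / GB:.0f} GB")
print(f"tweets in 5 years: {tweet_bytes_5_years / GB:.0f} GB")
print(f"users:             {user_bytes / MB:.0f} MB")
print(f"follows:           {follow_bytes / MB:.0f} MB ({follow_rows:,} rows)")
```

Note that 200MB per day works out to 73GB per year, so five years of tweets is about 365GB.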

To estimate the request rate: we assumed 1 million write requests per day, which is about 12 requests/s on average. This is not a big number, even if we assume the traffic arrives during active hours rather than evenly over 24 hours. The factor to consider here is fanout, which we discuss next along with reads.
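As a rough check of the write rate, the sketch below computes the average rate, a peak rate, and the rate after fanout. ACTIVE_HOURS is an assumption made only for illustration; the text does not specify how concentrated the traffic is.

```python
# Rough write-rate estimate. ACTIVE_HOURS is an illustrative assumption.
SECONDS_PER_DAY = 24 * 60 * 60
WRITES_PER_DAY = 1_000_000
ACTIVE_HOURS = 8          # assumption: all traffic arrives within 8 hours
AVG_FOLLOWERS = 10

avg_write_rps = WRITES_PER_DAY / SECONDS_PER_DAY
peak_write_rps = WRITES_PER_DAY / (ACTIVE_HOURS * 60 * 60)
fanout_write_rps = avg_write_rps * AVG_FOLLOWERS  # one timeline write per follower

print(f"average writes/s:     {avg_write_rps:.1f}")
print(f"peak writes/s:        {peak_write_rps:.1f}")
print(f"writes/s with fanout: {fanout_write_rps:.1f}")
```

Even with fanout, the average stays in the low hundreds of writes per second under these assumptions.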

For reads, the most important piece is the timeline. It requires reading tweets from followed users. One strategy that won't scale well is to join on request. If a user follows 10 users on average, each timeline request has to pull tweets from 10 users, creating roughly 10x load on the database. For a power user who follows many accounts, it may be 1000x.
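To make the cost of joining on request concrete, here is a minimal in-memory sketch. The dictionaries stand in for the Tweet and Follow tables, and the function names are hypothetical. It assumes each user's tweets are appended in created_at order.

```python
import heapq
import itertools
from collections import defaultdict

# Hypothetical in-memory stand-ins for the Tweet and Follow tables.
tweets_by_user = defaultdict(list)  # user_id -> [(created_at, tweet_id, content)], oldest first
following = defaultdict(set)        # follower_id -> ids of users they follow

def post_tweet(user_id, created_at, tweet_id, content):
    tweets_by_user[user_id].append((created_at, tweet_id, content))

def timeline_on_read(user_id, limit=20):
    """Build the timeline at request time: one lookup per followed user, then merge."""
    # Each per-user stream is newest first, so a descending merge keeps global order.
    per_user = [reversed(tweets_by_user[u]) for u in following[user_id]]
    merged = heapq.merge(*per_user, reverse=True)
    return list(itertools.islice(merged, limit))

following[1] = {2, 3}
post_tweet(2, created_at=100, tweet_id=1, content="a")
post_tweet(3, created_at=105, tweet_id=2, content="b")
post_tweet(2, created_at=110, tweet_id=3, content="c")
page = timeline_on_read(1, limit=2)
print(page)
```

The number of per-user lookups grows with the number of followed users, which is the 10x (or 1000x) load described above.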

The alternative is an early push strategy (fan-out on write): when a user posts, write the tweet to the timeline of every user who follows them. On average this creates 10 times more writes, but reads become much simpler. The caveat is that the number of followers varies widely (a power-law distribution), so a tweet from a popular user would have high write latency. The standard technique is to use a message queue and process the fan-out asynchronously. The other aspect is that storage cost also increases by 10x if we write every copy to disk. We can consider keeping only the recent timeline in cache, so that disk storage is minimized, but the cache would then hold a replicated copy of the timeline per user.
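The push strategy can be sketched with an in-process queue standing in for a real message queue and a bounded deque standing in for the per-user cached timeline. All names here and the TIMELINE_MAX cap are assumptions for illustration, not part of the design above.

```python
import queue
from collections import defaultdict, deque

TIMELINE_MAX = 800  # assumption: cap on cached entries per user

followers = defaultdict(set)  # followed_id -> ids of their followers
timelines = defaultdict(lambda: deque(maxlen=TIMELINE_MAX))  # user_id -> tweet ids, newest first
fanout_queue = queue.Queue()  # stand-in for a message queue

def post_tweet(author_id, tweet_id):
    """Write path: enqueue and return, so latency does not depend on follower count."""
    # The tweet itself would be persisted to the database here.
    fanout_queue.put((author_id, tweet_id))

def drain_fanout():
    """Worker: copy each queued tweet id onto every follower's timeline."""
    while True:
        try:
            author_id, tweet_id = fanout_queue.get_nowait()
        except queue.Empty:
            return
        for follower_id in followers[author_id]:
            # With maxlen set, appendleft drops the oldest entry when full.
            timelines[follower_id].appendleft(tweet_id)
        fanout_queue.task_done()

def read_timeline(user_id, limit=20):
    """Read path: a single lookup, no join."""
    return list(timelines[user_id])[:limit]

followers[1] = {2, 3}
post_tweet(1, "t1")
post_tweet(1, "t2")
before = read_timeline(2)  # fan-out has not run yet
drain_fanout()
after = read_timeline(2)
print(before, after)
```

The gap between before and after is the delay introduced by asynchronous processing; it has to stay within the "visible within seconds" requirement.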

To estimate the cache size: new tweets add about 200MB of data per day. Suppose we replicate each tweet 10 times and keep only the most recent 30 days. That requires 200MB * 10 * 30 = 60GB of memory, which can be supported with Redis or Memcached.
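The cache estimate depends mostly on the retention window and the fanout factor. A small sketch, using only the numbers assumed above, shows how the size moves when the retention window changes:

```python
# Cache sizing sketch. Inputs are the assumptions from the text.
MB, GB = 10**6, 10**9

NEW_TWEET_BYTES_PER_DAY = 200 * MB
FANOUT_FACTOR = 10    # average number of followers per user
RETENTION_DAYS = 30

def timeline_cache_bytes(retention_days=RETENTION_DAYS, fanout=FANOUT_FACTOR):
    """Total bytes needed to keep every replicated tweet for the retention window."""
    return NEW_TWEET_BYTES_PER_DAY * fanout * retention_days

cache_bytes = timeline_cache_bytes()
print(f"30 days: {cache_bytes / GB:.0f} GB")
print(f" 7 days: {timeline_cache_bytes(retention_days=7) / GB:.0f} GB")
```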

API design

POST /tweet should authenticate the user and post tweet content under that user.
GET /tweet should show the recent tweets from a specific user. It would also support pagination.
DELETE /tweet should accept user id and tweet id to facilitate deletion.
GET /followed_tweets should return the tweets relevant to a user who follows other users: a mix of recent tweets from the users they follow.

So to support the timeline, the client makes paginated calls to the GET endpoint. A user can submit a new tweet via POST /tweet, or remove their old tweets via DELETE /tweet.
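The pagination mentioned for the GET endpoints is not specified further. One common option is cursor-based pagination, sketched below. The paginate function, the cursor parameter, and the response shape are all assumptions for illustration. It also assumes tweet ids increase over time.

```python
def paginate(tweets, limit=20, cursor=None):
    """Return one page of tweets plus the cursor for the next page.

    tweets: list of dicts with an "id" key, sorted by id descending (newest first).
    cursor: id of the last tweet on the previous page, or None for the first page.
    """
    if cursor is not None:
        tweets = [t for t in tweets if t["id"] < cursor]
    page = tweets[:limit]
    # Only hand out a cursor when there is something left to fetch.
    next_cursor = page[-1]["id"] if len(tweets) > limit else None
    return {"tweets": page, "next_cursor": next_cursor}

all_tweets = [{"id": i} for i in range(5, 0, -1)]
first = paginate(all_tweets, limit=2)
second = paginate(all_tweets, limit=2, cursor=first["next_cursor"])
third = paginate(all_tweets, limit=2, cursor=second["next_cursor"])
print(first, second, third)
```

A cursor keeps pages stable when new tweets arrive between calls, which an offset-based scheme would not.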

Database design

Three entities: User, Tweet, and Follow

User:

id primary key
username
email

Tweet:

id primary key
user_id foreign key
content
created_at

Follow:

follower_id
followed_id

See ER diagram
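The three entities could be expressed as tables along these lines. This is a sketch using SQLite for convenience. The plural table names, the uniqueness and NOT NULL constraints, the composite key on follows, and the index are assumptions. The length check reflects the "less than 100 characters" requirement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE users (
    id       INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE,
    email    TEXT NOT NULL UNIQUE
);
CREATE TABLE tweets (
    id         INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(id),
    content    TEXT NOT NULL CHECK (length(content) < 100),
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
-- Assumed index: supports "recent tweets from a specific user".
CREATE INDEX tweets_by_user ON tweets (user_id, created_at);
CREATE TABLE follows (
    follower_id INTEGER NOT NULL REFERENCES users(id),
    followed_id INTEGER NOT NULL REFERENCES users(id),
    PRIMARY KEY (follower_id, followed_id)
);
""")

conn.executemany(
    "INSERT INTO users (id, username, email) VALUES (?, ?, ?)",
    [(1, "alice", "alice@example.com"), (2, "bob", "bob@example.com")],
)
conn.execute("INSERT INTO follows (follower_id, followed_id) VALUES (1, 2)")
conn.execute("INSERT INTO tweets (user_id, content) VALUES (2, 'hello')")

# Timeline for user 1, built by joining on request.
rows = conn.execute(
    """
    SELECT t.content
    FROM tweets t
    JOIN follows f ON f.followed_id = t.user_id
    WHERE f.follower_id = ?
    ORDER BY t.created_at DESC
    LIMIT 20
    """,
    (1,),
).fetchall()
print(rows)
```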

High-level design

Request flows

In this diagram:

The user sends a POST request to the API to create a new tweet.
The API server saves the tweet in the database and confirms the action.
When the user requests followed tweets, the API retrieves the relevant tweets from the database and returns them to the user.

Detailed component design

Trade offs/Tech choices

Failure scenarios/bottlenecks

Future improvements
