System requirements


Functional:

A user should be able to:

  • Compose a tweet and post it
  • Follow another user
  • View tweets of users you follow in your home feed
  • View a feed of suggested content from accounts that are recommended based on popularity (not necessarily accounts you follow)
  • Favorite other users' tweets


Non-Functional:

  • Scalability: this system should be highly scalable. We want to be able to support many simultaneous users, potentially in different parts of the world
  • Response time: reads should be as fast as possible; we can tolerate longer response times for writes
  • Consistency: eventual consistency is acceptable in this system. Users do not need to see the newest updates instantly
  • Security: we need to ensure that users are authenticated in order to post, make changes to their account, and access the content from the users they follow




Capacity estimation


User Base:

  • Let's assume daily active users (DAU) is 500 million.

Traffic:

We can estimate the traffic from the DAU figure.

  1. Tweeting: each user tweets about 2 times per day, so 1 billion tweets per day total
  2. Home feed: each user loads their home feed about 10 times per day, 5 billion home feed loads per day total
  3. Favorite: each user favorites about 1 tweet per day, so 500 million favorites per day total
  4. Following: each user follows about 200 accounts, so 100 billion follow relationships total

Queries Per Second:

  1. Writes (tweets): 500M*2/3600/24 ≈ 11.6k QPS
  2. Reads (home feed loads): 500M*10/3600/24 ≈ 57.9k QPS
  3. Favorites: 500M*1/3600/24 ≈ 5.8k QPS
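
The average rates can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the DAU and per-user figures from this section; the function and variable names are illustrative. These are daily averages, so peak traffic would be higher.

```python
SECONDS_PER_DAY = 24 * 3600
DAU = 500_000_000  # daily active users, from the estimate above


def average_qps(actions_per_user_per_day: float, dau: int = DAU) -> float:
    """Average queries per second for an action performed N times per user per day."""
    return dau * actions_per_user_per_day / SECONDS_PER_DAY


tweet_qps = average_qps(2)      # tweets posted
feed_qps = average_qps(10)      # home feed loads
favorite_qps = average_qps(1)   # favorites

print(f"tweets:    {tweet_qps:,.0f} QPS")
print(f"feed:      {feed_qps:,.0f} QPS")
print(f"favorites: {favorite_qps:,.0f} QPS")
```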

Data size:

  1. Tweets: 1 billion tweets per day, each up to 140 characters. Allowing for encoding overhead, let's assume 300 bytes per tweet. That is about 300 GB (roughly 280 GiB) per day, or about 110 TB (roughly 100 TiB) per year.
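
The storage figure can be worked through the same way. This sketch assumes the 300 bytes per tweet stated above and shows both decimal (GB, TB) and binary (GiB, TiB) units, since the two differ by several percent at this scale.

```python
TWEETS_PER_DAY = 1_000_000_000
BYTES_PER_TWEET = 300  # assumed size per tweet, from the estimate above

daily_bytes = TWEETS_PER_DAY * BYTES_PER_TWEET
yearly_bytes = daily_bytes * 365

daily_gb = daily_bytes / 10**9      # decimal gigabytes
daily_gib = daily_bytes / 2**30     # binary gibibytes
yearly_tb = yearly_bytes / 10**12   # decimal terabytes
yearly_tib = yearly_bytes / 2**40   # binary tebibytes

print(f"per day:  {daily_gb:.0f} GB ({daily_gib:.1f} GiB)")
print(f"per year: {yearly_tb:.1f} TB ({yearly_tib:.1f} TiB)")
```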



API design

Tweeting

  • POST: user ID, content of tweet

Home Feed

  • GET: user ID, page #

Following/Unfollowing

  • POST: user ID, followed user ID

Favoriting/Unfavoriting

  • POST: user ID, tweet ID
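
As a rough illustration, the four endpoints can be written down as a small table of methods and required parameters. The paths and names below are hypothetical, chosen only for this sketch; only the HTTP methods and parameter lists come from the design above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    method: str
    path: str        # hypothetical path, not specified in the design
    required: tuple  # names of the required parameters


API = {
    "post_tweet": Endpoint("POST", "/tweets", ("user_id", "content")),
    "home_feed": Endpoint("GET", "/feed/home", ("user_id", "page")),
    "follow": Endpoint("POST", "/follows", ("user_id", "followed_user_id")),
    "favorite": Endpoint("POST", "/favorites", ("user_id", "tweet_id")),
}


def missing_params(name: str, params: dict) -> list:
    """Return the required parameters that a request to `name` is missing."""
    return [p for p in API[name].required if p not in params]


print(missing_params("post_tweet", {"user_id": "u1"}))
```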




Database design

An RDBMS with the following tables:


Users table

  • user ID (UUID)
  • name (string)
  • email (string)
  • created at (timestamp)
  • updated at (timestamp)
  • etc.

Tweets table

  • tweet ID (UUID)
  • user ID (UUID)
  • tweet content (string)
  • created at (timestamp)

User Favorites table - many to many

  • user ID (UUID)
  • tweet ID (UUID)
  • created at (timestamp)
  • deleted at (timestamp)

User Followers table - many to many

  • user ID (UUID)
  • followed user ID (UUID)
  • created at (timestamp)
  • deleted at (timestamp)
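
The tables above could be declared roughly as follows. This is a sketch using SQLite through Python's sqlite3 module purely so it runs anywhere; the production database would differ. UUIDs are stored as text, and the composite primary keys, the unique email constraint, and the index on tweets are assumptions added for the example, not part of the design above.

```python
import sqlite3
import uuid

SCHEMA = """
CREATE TABLE users (
    user_id    TEXT PRIMARY KEY,
    name       TEXT NOT NULL,
    email      TEXT NOT NULL UNIQUE,               -- uniqueness is an assumption
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE tweets (
    tweet_id   TEXT PRIMARY KEY,
    user_id    TEXT NOT NULL REFERENCES users(user_id),
    content    TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
-- Assumed index to support "recent tweets by user" lookups.
CREATE INDEX idx_tweets_user_created ON tweets(user_id, created_at);
CREATE TABLE user_favorites (
    user_id    TEXT NOT NULL REFERENCES users(user_id),
    tweet_id   TEXT NOT NULL REFERENCES tweets(tweet_id),
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    deleted_at TEXT,                                -- NULL while the favorite is active
    PRIMARY KEY (user_id, tweet_id)
);
CREATE TABLE user_followers (
    user_id          TEXT NOT NULL REFERENCES users(user_id),
    followed_user_id TEXT NOT NULL REFERENCES users(user_id),
    created_at       TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    deleted_at       TEXT,                          -- NULL while the follow is active
    PRIMARY KEY (user_id, followed_user_id)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

alice, bob = str(uuid.uuid4()), str(uuid.uuid4())
conn.executemany(
    "INSERT INTO users (user_id, name, email) VALUES (?, ?, ?)",
    [(alice, "alice", "alice@example.com"), (bob, "bob", "bob@example.com")],
)
conn.execute(
    "INSERT INTO user_followers (user_id, followed_user_id) VALUES (?, ?)",
    (alice, bob),
)
conn.execute(
    "INSERT INTO tweets (tweet_id, user_id, content) VALUES (?, ?, ?)",
    (str(uuid.uuid4()), bob, "hello"),
)

# Tweets from the accounts alice currently follows, newest first.
rows = conn.execute(
    """
    SELECT t.content
    FROM tweets AS t
    JOIN user_followers AS f ON f.followed_user_id = t.user_id
    WHERE f.user_id = ? AND f.deleted_at IS NULL
    ORDER BY t.created_at DESC
    """,
    (alice,),
).fetchall()
print(rows)
```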




High-level design


Rate limiter

  • Protect against DoS attacks and ensure fair usage
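
The design does not name a rate-limiting algorithm. One common option is a token bucket, sketched below under that assumption; the class name, capacity, and refill rate are illustrative. The clock is injectable so the behaviour can be checked without sleeping.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `refill_per_sec`."""

    def __init__(self, capacity: int, refill_per_sec: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Add tokens for the elapsed time, never exceeding the capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Example with a fake clock: 2-request burst, 1 request per second refill.
fake_time = [0.0]
bucket = TokenBucket(capacity=2, refill_per_sec=1.0, clock=lambda: fake_time[0])
burst = [bucket.allow() for _ in range(3)]
fake_time[0] = 1.0  # one second later, one token has been refilled
after_wait = bucket.allow()
print(burst, after_wait)
```

In practice one bucket would be kept per user or per client IP, so that a single heavy client cannot exhaust the allowance of others.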

Load balancer

  • Use consistent hashing to distribute load across servers
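
A consistent hashing ring can be sketched as follows. This is a minimal illustration, not a production balancer: the server names, the use of MD5 as the hash, and the number of virtual nodes per server are all assumptions made for the example.

```python
import bisect
import hashlib


class HashRing:
    """Map keys to servers so that removing a server only moves that server's keys."""

    def __init__(self, servers, vnodes: int = 100):
        self._ring = []  # sorted list of (hash, server) points on the ring
        for server in servers:
            self.add(server, vnodes)

    @staticmethod
    def _hash(key: str) -> int:
        return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

    def add(self, server: str, vnodes: int = 100) -> None:
        # Each server gets several virtual nodes to spread its load evenly.
        for i in range(vnodes):
            bisect.insort(self._ring, (self._hash(f"{server}#{i}"), server))

    def remove(self, server: str) -> None:
        self._ring = [(h, s) for h, s in self._ring if s != server]

    def lookup(self, key: str) -> str:
        if not self._ring:
            raise LookupError("no servers in the ring")
        # First point at or after the key's hash, wrapping around the ring.
        idx = bisect.bisect_left(self._ring, (self._hash(key),))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["server-a", "server-b", "server-c"])
keys = [f"user-{i}" for i in range(200)]
before = {k: ring.lookup(k) for k in keys}
ring.remove("server-c")
after = {k: ring.lookup(k) for k in keys}
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved} of {len(keys)} keys moved after removing one server")
```

Only the keys that were on the removed server are reassigned, which is the property that makes this preferable to plain modulo hashing when servers come and go.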

CDN

  • Serve cached data to localized areas to minimize response time

Services

  • Services to support major functions: user service (user management and following), tweet service (tweeting and favoriting), feed service (loading user feed)

Cache

  • Application-level caching

Database

  • Use a relational database that is optimized to scale horizontally: Amazon Aurora, CockroachDB, etc.



Request flows


  • The client sends a request, which is first handled by the rate limiter and the load balancer.
  • The request is then directed toward the servers; if the CDN holds a cached copy of the response, it serves the response directly.
  • On a CDN miss, the request is routed to the appropriate service.
  • The service reads from the cache and the database to build the response.





Detailed component design


Cache

  • Use Redis as the cache
  • Use the cache-aside update strategy
  • Use TTL-based invalidation
  • Use LRU eviction
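
The three strategies above fit together as sketched here. An in-memory class stands in for Redis so that the example is self-contained; the class, the `get_feed` helper, and the `load_from_db` callback are hypothetical names, and the TTL and size limit are illustrative values.

```python
import time
from collections import OrderedDict


class TTLLRUCache:
    """In-memory stand-in for the cache: TTL expiry plus LRU eviction."""

    def __init__(self, max_items: int, ttl_seconds: float, clock=time.monotonic):
        self.max_items = max_items
        self.ttl = ttl_seconds
        self.clock = clock
        self._data = OrderedDict()  # key -> (expires_at, value), oldest first

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:  # TTL invalidation
            del self._data[key]
            return None
        self._data.move_to_end(key)     # mark as most recently used
        return value

    def set(self, key, value) -> None:
        self._data[key] = (self.clock() + self.ttl, value)
        self._data.move_to_end(key)
        if len(self._data) > self.max_items:
            self._data.popitem(last=False)  # evict the least recently used entry


def get_feed(user_id, cache, load_from_db):
    """Cache-aside read: try the cache, fall back to the database, then populate."""
    feed = cache.get(user_id)
    if feed is None:
        feed = load_from_db(user_id)
        cache.set(user_id, feed)
    return feed


fake_time = [0.0]
db_calls = []


def load_from_db(user_id):
    db_calls.append(user_id)
    return [f"tweet-for-{user_id}"]


cache = TTLLRUCache(max_items=2, ttl_seconds=60, clock=lambda: fake_time[0])
get_feed("u1", cache, load_from_db)  # miss: loads from the database
get_feed("u1", cache, load_from_db)  # hit: served from the cache
fake_time[0] = 61                    # entry has expired
get_feed("u1", cache, load_from_db)  # miss again after the TTL
print(db_calls)
```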

Services

  • User service: handles all requests to create/update user accounts and handle following/unfollowing
    • Writes associated with creating and following users will be done asynchronously to reduce response time and load on the server
  • Tweet service: handles all requests to post and favorite tweets
    • Writes associated with posting and favoriting tweets will be done asynchronously to reduce response time and load on the server
  • Feed service
    • Checks the cache for feed data for the requesting user
    • On a cache miss, generates the feed data itself
    • For home feed: queries the most recent tweets from followed users, results are paginated
    • For popular feed: queries the most popular recent tweets across Twitter based on like count, results are paginated
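
The two feed queries can be sketched in memory as follows. This assumes each followed account's tweets are already available newest first, and merges them lazily so that only the requested page is materialized. The `Tweet` type, the function names, and the page size are illustrative, and ranking purely by like count is the simplest reading of the popular feed described above.

```python
import heapq
from dataclasses import dataclass
from itertools import islice


@dataclass(frozen=True)
class Tweet:
    tweet_id: str
    user_id: str
    created_at: float  # e.g. a Unix timestamp
    like_count: int = 0


def home_feed(followed_timelines, page: int, page_size: int = 20):
    """Merge per-user timelines (each sorted newest first) and return one page."""
    merged = heapq.merge(
        *followed_timelines, key=lambda t: t.created_at, reverse=True
    )
    start = page * page_size
    return list(islice(merged, start, start + page_size))


def popular_feed(recent_tweets, page: int, page_size: int = 20):
    """Rank recent tweets by like count and return one page."""
    ranked = sorted(recent_tweets, key=lambda t: t.like_count, reverse=True)
    start = page * page_size
    return ranked[start:start + page_size]


a = [Tweet("a2", "a", 5.0, like_count=1), Tweet("a1", "a", 1.0, like_count=9)]
b = [Tweet("b2", "b", 4.0, like_count=3), Tweet("b1", "b", 2.0, like_count=0)]

first_page = [t.tweet_id for t in home_feed([a, b], page=0, page_size=3)]
second_page = [t.tweet_id for t in home_feed([a, b], page=1, page_size=3)]
popular = [t.tweet_id for t in popular_feed(a + b, page=0, page_size=2)]
print(first_page, second_page, popular)
```

Page-number pagination re-merges from the start on every request; a cursor based on the last seen timestamp would avoid that, at the cost of a slightly different API.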

Database

  • I will be using a combination of replication and sharding. There will be a master-slave replication pattern, with a number of slave instances to support the high volume of reads this system will handle.
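
With this replication pattern, the data access layer has to send writes to the master and spread reads across the slave instances. A minimal routing sketch, assuming simple round-robin over the read instances; the class and node names are hypothetical, and replication lag is ignored here, which is consistent with the eventual consistency requirement.

```python
import itertools


class ReadWriteRouter:
    """Route writes to the master and reads round-robin across read instances."""

    def __init__(self, master: str, read_instances):
        self.master = master
        # With no read instances configured, reads fall back to the master.
        self._readers = itertools.cycle(read_instances) if read_instances else None

    def route(self, is_write: bool) -> str:
        if is_write or self._readers is None:
            return self.master
        return next(self._readers)


router = ReadWriteRouter("db-master", ["db-read-1", "db-read-2"])
write_target = router.route(is_write=True)
read_targets = [router.route(is_write=False) for _ in range(3)]
print(write_target, read_targets)
```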


Trade offs/Tech choices


Database

  • We could have chosen a NoSQL database, as they are known to scale well horizontally and handle large amounts of data
  • However, because the structure of Twitter's data seems relatively stable and we will increasingly need to represent our data in a relational way, I chose an RDBMS
  • Additionally, there are strategies that can help us scale even if we use an RDBMS

Populating the Feed

  • There are two approaches to populating users' feeds: pushing each new tweet to the author's followers when it is posted, or pulling tweets from all followed users when a user requests their feed
  • The pull strategy requires heavier processing in our service to collect tweets from all the accounts a user follows, but it avoids pushing messages to a large number of followers every time a user tweets
  • The push model would also require us to store each user's feed (as the pushed messages need to go somewhere), which would need additional storage
  • I have chosen to proceed with the pull strategy to avoid the requirement for additional storage and to avoid building a message queue system
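
A back-of-envelope comparison of the two strategies, using only the figures from the capacity estimate. It assumes the average number of followers per user equals the average number of accounts followed (200), ignores caching, and ignores the skew from very popular accounts, so both numbers are rough upper bounds rather than predictions.

```python
DAU = 500_000_000
TWEETS_PER_USER_PER_DAY = 2
FEED_LOADS_PER_USER_PER_DAY = 10
AVG_FOLLOWED = 200  # accounts followed per user, from the capacity estimate
AVG_FOLLOWERS = AVG_FOLLOWED  # assumption: averages match across the user base

# Push: every tweet is written into each follower's stored feed.
push_feed_writes_per_day = DAU * TWEETS_PER_USER_PER_DAY * AVG_FOLLOWERS

# Pull: every uncached feed load reads the timelines of all followed accounts.
pull_timeline_reads_per_day = DAU * FEED_LOADS_PER_USER_PER_DAY * AVG_FOLLOWED

print(f"push: {push_feed_writes_per_day:,} feed writes per day")
print(f"pull: {pull_timeline_reads_per_day:,} timeline reads per day (no cache)")
```

The pull model trades the write amplification and feed storage of the push model for read-time work, which is why the feed cache matters for this design.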



Failure scenarios/bottlenecks


  • Some users may have a very high number of followers and generate very high traffic when they post. Our system will need to be able to handle these bursts
  • If we do not manage resources effectively, asynchronous jobs could back up when many write requests arrive at the same time
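
One way to keep the asynchronous write backlog from growing without limit is a bounded queue that rejects new jobs when full, so the caller can retry later or return an error. The sketch below uses asyncio only to illustrate the idea; the queue size, job names, and helper functions are hypothetical, and a real system would more likely use a dedicated job queue.

```python
import asyncio


def try_enqueue(queue: asyncio.Queue, job: str) -> bool:
    """Accept the job if there is room; otherwise signal backpressure."""
    try:
        queue.put_nowait(job)
        return True
    except asyncio.QueueFull:
        return False


async def worker(queue: asyncio.Queue, processed: list) -> None:
    while True:
        job = await queue.get()
        processed.append(job)  # stand-in for the actual database write
        queue.task_done()


async def main():
    queue = asyncio.Queue(maxsize=2)  # illustrative bound on pending writes
    processed = []
    # Three writes arrive before any worker runs; the third is rejected.
    accepted = [try_enqueue(queue, f"tweet-{i}") for i in range(3)]
    task = asyncio.create_task(worker(queue, processed))
    await queue.join()  # wait until the accepted jobs are processed
    task.cancel()
    return accepted, processed


accepted, processed = asyncio.run(main())
print(accepted, processed)
```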



Future improvements


How to address issues outlined in the previous section:

  • We can adjust our caching strategy to favor "hot" data so that we avoid making the full trip for these requests


Additional features we could add in the next iteration of the system:

  • Improving the algorithm for what kind of content is surfaced on the home feed: based on previous engagement by the user with other content, etc.
  • Allowing users to make their profiles private or public to control who can see their tweets
  • Content moderation: flagging and/or removing inappropriate content
  • Send/receive notifications whenever a followed user tweets, likes a tweet, or performs another significant interaction