System requirements


Functional:

  1. User Registration and Authentication: Allow users to create an account, log in, and log out securely.
  2. Compose and Share Tweets: Users should be able to create, edit, and delete tweets.
  3. Follow Users: Users can track updates from other users by following them.
  4. Favorite Tweets: Users can indicate their appreciation for specific tweets by favoriting them.
  5. Timeline: Display a user's timeline that includes tweets from users they follow.
  6. Search: Allow users to search for specific tweets or users.
  7. Notifications: Notify users of interactions like new followers, mentions, or favorites.
  8. Trending Topics: Display popular topics or hashtags.
  9. Privacy Settings: Allow users to control their account privacy.
  10. Report and Block: Provide options to report inappropriate content and block users.




Non-Functional:

  1. Scalability: the system should scale out easily as the number of users and tweets grows.
  2. Availability: the system should be highly available.




Capacity estimation

  1. Total Tweets Created per Day: 100 million users * 5 tweets/user = 500 million tweets/day
  2. Total Tweets Read per Day: 100 million users * 20 tweets/user = 2 billion tweets/day
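  3. Tweet size: roughly 280 bytes for a text-only tweet, and roughly 2 MB plus some additional bytes for a tweet with a media attachment.
  4. Storage: assuming 20% of tweets have media and tweets are kept for 5 years, media takes about 500 million * 20% * 2 MB = 200 TB/day (text adds only about 140 GB/day), so the total storage is roughly 200 TB/day * 365 days * 5 years = 365 PB.
  5. QPS of tweet creation: 500 million / 86,400 seconds is roughly 5,800 on average.
  6. QPS of tweet read: 2 billion / 86,400 seconds is roughly 23,000 on average.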
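
The arithmetic behind these estimates can be checked with a short script. This is a minimal sketch that uses only the assumptions stated in the list (100 million users, 5 writes and 20 reads per user per day, 280-byte text, 2 MB media on 20% of tweets, 5-year retention); treating 2 MB as 2,000,000 bytes and a year as 365 days are additional assumptions.

```python
# Back-of-the-envelope capacity check using the assumptions listed above.
USERS = 100_000_000
TWEETS_WRITTEN_PER_USER = 5
TWEETS_READ_PER_USER = 20
TEXT_BYTES = 280
MEDIA_BYTES = 2_000_000        # 2 MB in decimal units (assumption)
MEDIA_FRACTION = 0.2           # 20% of tweets carry media
RETENTION_DAYS = 5 * 365       # 5 years, ignoring leap days (assumption)
SECONDS_PER_DAY = 86_400

tweets_per_day = USERS * TWEETS_WRITTEN_PER_USER
reads_per_day = USERS * TWEETS_READ_PER_USER

text_bytes_per_day = tweets_per_day * TEXT_BYTES
media_bytes_per_day = tweets_per_day * MEDIA_FRACTION * MEDIA_BYTES
total_bytes = (text_bytes_per_day + media_bytes_per_day) * RETENTION_DAYS

# Average rates; peak traffic would be some multiple of these.
write_qps = tweets_per_day / SECONDS_PER_DAY
read_qps = reads_per_day / SECONDS_PER_DAY

print(f"storage over 5 years: {total_bytes / 1e15:.0f} PB")  # 365 PB
print(f"average write QPS: {write_qps:.0f}")                 # 5787
print(f"average read QPS: {read_qps:.0f}")                   # 23148
```

Media dominates the storage figure: text contributes well under 1 PB over the whole retention period.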



API design

RESTful API:

Create user: POST v1/users/:{username}

parameters:

action: CreateUser

password: {password}


Login user: POST v1/users/:{username}

parameters:

action: Login

device: deviceid

password: {password}

Response:

encrypted-token: contains information such as uid, username, and deviceid
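
To make the token concrete, here is a minimal sketch of issuing and verifying one. It is an illustration under stated assumptions, not the design's actual format: the Python standard library has no authenticated encryption, so this version signs the payload with HMAC-SHA256 instead of encrypting it, which means the payload is tamper-evident but readable. `SECRET_KEY`, the `exp` field, and the function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET_KEY = b"demo-secret"  # hypothetical; a real key would come from a secret store


def issue_token(uid, username, deviceid, ttl_seconds=3600, now=None):
    """Return '<payload>.<signature>', both base64url-encoded."""
    now = time.time() if now is None else now
    claims = {"uid": uid, "username": username, "deviceid": deviceid,
              "exp": int(now) + ttl_seconds}  # expiry is an added assumption
    payload = json.dumps(claims, separators=(",", ":")).encode("utf-8")
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode("ascii") + "." +
            base64.urlsafe_b64encode(signature).decode("ascii"))


def verify_token(token, now=None):
    """Return the claims dict if the token is authentic and unexpired, else None."""
    now = time.time() if now is None else now
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return None
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):  # constant-time comparison
        return None
    claims = json.loads(payload)
    if claims["exp"] < now:
        return None
    return claims
```

With a token of this kind, any server that holds the key can check a request without a database lookup, which is what lets a ConnectionServer verify the token on its own.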


Logout user: POST v1/users

parameters:

encrypted-token: {token data}

action: Logout


Follow another user: POST v1/follows

parameters:

encrypted-token: {token data}

followee: {another-username}


Create tweet: POST v1/tweets

parameters:

encrypted-token: {token data}

text: string of up to 280 bytes

media-type: video/picture/etc

media-data: binary bytes of up to 2 MB
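
The size limits above could be enforced at the API layer before the request is queued. The following is a sketch only: the `CreateTweetRequest` class and `validate_create_tweet` function are hypothetical names, and it assumes "280 bytes" means UTF-8 encoded length and "2 MB" means 2,000,000 bytes.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_TEXT_BYTES = 280
MAX_MEDIA_BYTES = 2_000_000  # "2 MB" in decimal units (assumption)


@dataclass
class CreateTweetRequest:
    token: str
    text: str
    media_type: Optional[str] = None
    media_data: Optional[bytes] = None


def validate_create_tweet(req: CreateTweetRequest) -> List[str]:
    """Return a list of validation errors; an empty list means the request is acceptable."""
    errors = []
    if not req.token:
        errors.append("missing encrypted-token")
    # The limit is in bytes, so multi-byte characters count more than once.
    if len(req.text.encode("utf-8")) > MAX_TEXT_BYTES:
        errors.append("text exceeds 280 bytes")
    if (req.media_type is None) != (req.media_data is None):
        errors.append("media-type and media-data must be supplied together")
    if req.media_data is not None and len(req.media_data) > MAX_MEDIA_BYTES:
        errors.append("media-data exceeds 2 MB")
    return errors
```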


Favorite a tweet: POST v1/tweets/favorite

parameters:

encrypted-token: {token data}

tweet-id: {tweet id}


Report a tweet: POST v1/tweets/report

parameters:

encrypted-token: {token data}

tweet-id: {tweet id}


Block a user: POST v1/users

parameters:

encrypted-token: {token data}

action: BlockUser

blocked: {another username}


Get timeline: GET v1/users/me

parameters:

encrypted-token: {token data}





Database design

Users table: meta info of all users, including: uid, username, salt, saltedHash. Primary key: uid. Shard key: uid. Secondary index on: username.


Devices table: uid, deviceid, devicestr. Primary key: uid, deviceid. Shard key: uid.


Objects table: objId, objUrl, objBytes. Primary key: objId. Shard key: objId.


Tweets table: tweetId, text, mediaType, mediaId. Primary key: tweetId. Shard key: tweetId.


Posts table: uid, tweetId. Primary key: uid, tweetId. Shard key: uid.


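Follower table: uid, followeeid (one row per user that uid follows). Primary key: uid, followeeid. Shard key: uid.


Followee table: uid, followerid (one row per user that follows uid). Primary key: uid, followerid. Shard key: uid.


Favoritor table: uid, tweetId (one row per tweet that uid has favorited). Primary key: uid, tweetId. Shard key: uid.


Favoritee table: tweetId, uid (one row per user who has favorited tweetId). Primary key: tweetId, uid. Shard key: tweetId.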
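
The shard keys above determine which database node holds a row. The sketch below shows one simple way to route a row, assuming hash-modulo placement over a fixed number of shards; `NUM_SHARDS`, `shard_for`, and `route` are hypothetical names, and the source does not say which placement scheme is used.

```python
import hashlib

NUM_SHARDS = 64  # hypothetical shard count

# Shard key per table, taken from the table definitions above.
SHARD_KEYS = {
    "Users": "uid", "Devices": "uid", "Objects": "objId",
    "Tweets": "tweetId", "Posts": "uid", "Follower": "uid",
    "Followee": "uid", "Favoritor": "uid", "Favoritee": "tweetId",
}


def shard_for(key, num_shards=NUM_SHARDS):
    """Map a shard-key value to a shard index with a stable hash."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def route(table, row):
    """Return the shard index that should store this row."""
    return shard_for(row[SHARD_KEYS[table]])


# Rows that share a uid land on the same shard, so "all posts by a user"
# or "everyone a user follows" is a single-shard query.
same_shard = route("Posts", {"uid": 42, "tweetId": 7}) == route(
    "Follower", {"uid": 42, "followeeid": 9})
```

Hash-modulo placement moves most keys when the shard count changes; consistent hashing is a common alternative when shards are added often.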


High-level design

Client requests arrive at LoadBalancer, which routes each one to UserApi or TweetApi depending on the request type.


UserApi is responsible for user creation/authentication/follow/unfollow/favorite. It is backed by UserDB and the corresponding UserCache, plus FavoritorDB and FavoriteeDB.


TweetApi is responsible for tweet creation/favorite/fanout. It is backed by TweetDB and the corresponding TweetCache.


UniqueIdGenerator is responsible for generating globally unique IDs, which can be used as objId and tweetId.
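
The source does not say how UniqueIdGenerator produces IDs. One common approach is a Snowflake-style layout that packs a millisecond timestamp, a worker ID, and a per-millisecond sequence number into a 64-bit integer. The sketch below assumes that layout; the bit widths, `EPOCH_MS`, and the class interface are illustrative choices, not part of the original design.

```python
import threading
import time


class UniqueIdGenerator:
    """Snowflake-style IDs: timestamp | worker id | sequence (an assumed layout)."""

    EPOCH_MS = 1_600_000_000_000  # hypothetical custom epoch
    WORKER_BITS = 10              # up to 1024 generator instances
    SEQ_BITS = 12                 # up to 4096 IDs per millisecond per worker

    def __init__(self, worker_id):
        if not 0 <= worker_id < (1 << self.WORKER_BITS):
            raise ValueError("worker_id out of range")
        self._worker_id = worker_id
        self._lock = threading.Lock()
        self._last_ms = -1
        self._seq = 0

    @staticmethod
    def _now_ms():
        return int(time.time() * 1000)

    def next_id(self):
        with self._lock:
            now = self._now_ms()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._seq = (self._seq + 1) & ((1 << self.SEQ_BITS) - 1)
                if self._seq == 0:
                    # Sequence exhausted for this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._now_ms()
            else:
                self._seq = 0
            self._last_ms = now
            return (((now - self.EPOCH_MS) << (self.WORKER_BITS + self.SEQ_BITS))
                    | (self._worker_id << self.SEQ_BITS)
                    | self._seq)
```

IDs built this way sort roughly by creation time, which is convenient for a tweetId because newer tweets get larger IDs.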


After a client logs in, it's connected to a ConnectionServer, and the information about "which device is connected to which ConnectionServer" is stored in ConnectionManager.


When a client posts a new tweet, the tweet will be pushed to all its followers. When a client favorites a tweet, that'll be pushed to all clients that are reading that tweet.



Request flows

sequenceDiagram

  title User Login

  autonumber

  Client->>LoadBalancer: LoginRequest

  LoadBalancer->>UserApi: LoginRequest

  UserApi->>UserService: LoginRequest

  UserService->>UserCache: Lookup user

  UserService->>UserDB: If not found from cache, lookup from DB

  UserService->>UserCache: Update cache if user is found from DB

  UserService->>UserApi: If user not found, return "unknown user"

  UserApi->>Client: Return "unknown user"

  UserService->>UserService: If user found, compute hash of password plus salt, compare the computed hash with the saltedHash

  UserService->>UserApi: If the two hashes do not match, return failure

  UserApi->>Client: Return failure

  UserService->>UserService: If the two hashes match, generate encrypted token

  UserService->>UserApi: return encrypted token

  UserApi->>Client: Return encrypted token

  Client->>LoadBalancer: CreateWSConnectionRequest

  LoadBalancer->>ServiceDiscovery: Pick a ConnectionServer, and route the CreateWSConnectionRequest to it

  ConnectionServer->>Client: Verify the encrypted token, and accept the connection

  ConnectionServer->>ConnectionManager: put the client/connectionserver info
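
The password check in the login flow above can be sketched as follows. The source only says "hash of password plus salt"; using PBKDF2-HMAC-SHA256 as the hash, the iteration count, and the in-memory `users_by_name` dict standing in for UserCache/UserDB are all assumptions made for illustration.

```python
import hashlib
import hmac
import os

PBKDF2_ITERATIONS = 100_000  # hypothetical work factor


def hash_password(password, salt):
    """Derive the saltedHash from a password and a per-user salt."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt,
                               PBKDF2_ITERATIONS)


def create_user_record(username, password):
    """Build the fields stored in the Users table for a new account."""
    salt = os.urandom(16)
    return {"username": username, "salt": salt,
            "saltedHash": hash_password(password, salt)}


def login(users_by_name, username, password):
    """Mirror the flow: unknown user, failure on mismatch, otherwise success."""
    user = users_by_name.get(username)
    if user is None:
        return "unknown user"
    computed = hash_password(password, user["salt"])
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(computed, user["saltedHash"]):
        return "failure"
    return "ok"  # the real flow would generate the encrypted token here
```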




sequenceDiagram

  title Post a tweet

  autonumber

  Client->>LoadBalancer: PostTweetRequest

  LoadBalancer->>TweetApi: PostTweetRequest

  TweetApi->>TweetService: PostTweetRequest

  TweetService->>UniqueIdGen: generate tweetId

  TweetService->>PostTweetMessageQueue: add PostTweetRequest along with tweetId into queue

  TweetService->>TweetApi: return success

  TweetApi->>Client: return success

  loop

  TweetWorker->>PostTweetMessageQueue: retrieve message which contains tweetId and PostTweetRequest.

  TweetWorker->>ObjDB: if the tweet contains media, use tweetId_m as objId, and insert new row into ObjDB for the contained media.

  TweetWorker->>TweetDB: add new row to TweetDB.

  TweetWorker->>PostDB: Add new row of uid/tweetId to PostDB

  TweetWorker->>FanoutMessageQueue: put uid/tweetid

  TweetWorker->>PostTweetMessageQueue: ack message

  end

  loop

  FanoutWorker->>FanoutMessageQueue: retrieve message of uid/tweetid

  FanoutWorker->>FolloweeCache: lookup uid to get followers

  FanoutWorker->>FolloweeDB: if not found in cache, lookup uid in DB

  FanoutWorker->>FolloweeCache: update cache if found in DB

  FanoutWorker->>CelebrityCache: if uid is a celebrity, add the tweetid to CelebrityCache

  FanoutWorker->>FollowerCache: if uid is not a celebrity, add the tweetid to FollowerCache of each follower.

  FanoutWorker->>ConnectionManager: for each follower, find the ConnectionServer where the follower is connected

  FanoutWorker->>ConnectionServer: send RPC about "new tweets available"

  ConnectionServer->>Client: push "new tweets available" to client

  FanoutWorker->>FanoutMessageQueue: ack message

  end
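
The fanout loop above can be sketched in a few lines. This is a simplified in-memory model, not the actual worker: plain dicts stand in for FolloweeCache/FolloweeDB, ConnectionManager, CelebrityCache, and FollowerCache, and the rule "a celebrity is a user with at least `celebrity_threshold` followers" is an assumption, since the source does not define what makes a user a celebrity.

```python
from collections import defaultdict


class FanoutWorker:
    def __init__(self, followers_by_uid, connections, celebrity_threshold=10_000):
        self.followers_by_uid = followers_by_uid  # stands in for FolloweeCache/FolloweeDB
        self.connections = connections            # stands in for ConnectionManager: uid -> server
        self.celebrity_threshold = celebrity_threshold  # hypothetical cutoff
        self.celebrity_cache = defaultdict(list)  # author uid -> tweetIds
        self.follower_cache = defaultdict(list)   # follower uid -> tweetIds
        self.notifications = []                   # (server, follower uid, tweetId) sent

    def handle(self, uid, tweet_id):
        """Process one uid/tweetid message from the fanout queue."""
        followers = self.followers_by_uid.get(uid, [])
        if len(followers) >= self.celebrity_threshold:
            # Celebrity: store once per author instead of once per follower.
            self.celebrity_cache[uid].append(tweet_id)
        else:
            # Regular user: write the tweetId into each follower's cache.
            for follower in followers:
                self.follower_cache[follower].append(tweet_id)
        # Notify only followers that currently have a live connection.
        for follower in followers:
            server = self.connections.get(follower)
            if server is not None:
                self.notifications.append((server, follower, tweet_id))
```

Storing a celebrity's tweet once avoids millions of cache writes per tweet, at the cost of merging celebrity tweets into the timeline at read time.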



sequenceDiagram

  title User retrieves timeline

  autonumber

  loop until no timeline data is retrieved

  Client->>LoadBalancer: RetrieveTimelineRequest

  LoadBalancer->>TweetApi: RetrieveTimelineRequest

  TweetApi->>TweetService: RetrieveTimelineRequest

  TweetService->>TimelineCache: get the first N tweetIds

  TweetService->>TweetCache: for each tweetId, find tweet content

  TweetService->>TweetDB: if not found in cache, lookup from DB

  TweetService->>TweetCache: update cache if found from DB

  TweetService->>TweetApi: compose all the tweet contents along with other necessary info, send them as response

  TweetApi->>Client: send the response

  end
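
The read path above is a cache-aside lookup repeated page by page. Here is a minimal sketch with dicts standing in for TimelineCache, TweetCache, and TweetDB; the offset-based paging and the function names are assumptions, since the source only says the client loops until no timeline data is retrieved.

```python
def get_timeline_page(uid, offset, n, timeline_cache, tweet_cache, tweet_db):
    """Return (tweets, next_offset); next_offset is None when the timeline is exhausted."""
    tweet_ids = timeline_cache.get(uid, [])[offset:offset + n]
    if not tweet_ids:
        return [], None
    tweets = []
    for tweet_id in tweet_ids:
        tweet = tweet_cache.get(tweet_id)
        if tweet is None:
            tweet = tweet_db.get(tweet_id)   # cache miss: fall back to the DB
            if tweet is None:
                continue                     # e.g. the tweet was deleted
            tweet_cache[tweet_id] = tweet    # populate the cache for later reads
        tweets.append(tweet)
    return tweets, offset + len(tweet_ids)


def fetch_timeline(uid, page_size, timeline_cache, tweet_cache, tweet_db):
    """Client-side loop: keep requesting pages until none are left."""
    result, offset = [], 0
    while offset is not None:
        page, offset = get_timeline_page(uid, offset, page_size,
                                         timeline_cache, tweet_cache, tweet_db)
        result.extend(page)
    return result
```

Returning an explicit next offset, instead of stopping on an empty page, keeps the loop going when every tweet on a page happens to have been deleted.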





Detailed component design





Trade offs/Tech choices

Explain any trade offs you have made and why you made certain tech choices...






Failure scenarios/bottlenecks

Try to discuss as many failure scenarios/bottlenecks as possible.






Future improvements

What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?