System Requirements


Functional -


  1. Users can send messages for up to 200 characters, message will be displayed to the user's followers
  2. Users can follow or unfollow other users
  3. Users can like messages from other users
  4. A user's home feed will be comprised of messages from the users it follows in reverse chronological order, newest one appearing first
  5. Users can block certain users, which would mean that their posts will stop appearing on their home feed
  6. Users should be able to manage their profiles
  7. Users should be able to see their and other user's profiles


Non-Functional -


  1. Scalability - System should be scalable, with the expectation of a growing user base in the future
  2. Availability - System must be available at all times
  3. Consistency - We are okay with a delay in the new posts being populated in a user's home feed. We are aiming for eventual consistency
  4. Latency - Home page should be populated in a second at maximum
  5. Security - Content moderation and anti abuse protection can be implemented, it is not covered here though



Capacity Estimation


  1. We assume a total of 1 billion users
  2. We assume 500M daily active users and a maximum message size of 200 bytes. We also add in overheads to a message and assume each message needs 500 bytes of data
  3. We assume that each user sends 2 messages a day. Thus, we generate 1 billion messages each day, which need 2 * 500 * 10^9 bytes of data each day, which is 10^12 bytes = 1 TB data
  4. In a month, we generate 30 TB data
  5. In a year, 360 TB data
  6. Factoring in replication, we need 1 PB data each year
  7. Over 10 years, we would need 10 PB data



API Design


  1. User Profile Related APIs
    1. Create new account
      1. /account/create
      2. POST
      3. Input
        1. Username
        2. First name
        3. Last name
        4. Date of birth
        5. Bio
        6. Profile picture
      4. Output
        1. Success message of the form 2XX, 4XX in case of error
        2. A unique UserID
    2. View existing profile
      1. /account/view
      2. GET
      3. Input
        1. UserID
      4. Output
        1. JSON with user profile
        2. 2XX message in case of success, 4XX is case of failure
    3. Edit Profile
      1. /account/edit
      2. PUT
      3. Input
        1. UserID
        2. First name
        3. Last name
        4. Bio
        5. Profile picture link
      4. Output
        1. 2XX message in case of success, 4XX is case of failure
    4. Delete Profile
      1. /account/delete
      2. DEL
      3. Input
        1. UserID
      4. Output
        1. 2XX message in case of success, 4XX is case of failure
  2. Message Related APIs
    1. Send Message
      1. /message/create
      2. POST
      3. Input
        1. UserID
        2. Content
        3. Link
      4. Output
        1. 2XX message in case of success, 4XX is case of failure
      5. Each message will have a unique ID generated for it by a Universally Unique ID (UUID) generator
    2. Edit Message
      1. /message/edit
      2. PUT
      3. Input
        1. UserID
        2. MessageID
        3. Content
        4. Link
      4. Output
        1. 2XX message in case of success, 4XX is case of failure
    3. Like Message
      1. /message/like
      2. PUT
      3. Input
        1. UserID
        2. MessageID
      4. Output
        1. 2XX message in case of success, 4XX is case of failure
    4. Get comments on a post
      1. /message/comments/get
      2. GET
      3. Input
        1. UserID
        2. MessageID
      4. Output
        1. JSON containing all of the comments on a post
        2. 2XX message in case of success, 4XX is case of failure
    5. Comment on a post
      1. /message/comment/post
      2. POST
      3. Input
        1. UserID
        2. Comment data
      4. Output
        1. 2XX message in case of success, 4XX is case of failure
  3. User Related APIs
    1. Follow a user
      1. /user/follow
      2. POST
      3. Input
        1. FollowerID
        2. FolloweeID
      4. Output
        1. 2XX message in case of success, 4XX is case of failure
    2. Unfollow a user
      1. /user/unfollow
      2. PUT
      3. Input
        1. FollowerID
        2. FolloweeID
      4. Output
        1. 2XX message in case of success, 4XX is case of failure
  4. Home Feed APIs
    1. /feed
    2. Get home feed
      1. GET
      2. Input
        1. UserID
        2. Offset, which is the number of posts the user is requesting
      3. Output
        1. JSON with needed fields
        2. 2XX message in case of success, 4XX is case of failure
      4. Offset can be a large value when the home feed is first populated, as the user reaches the end of the feed, we can make this API call again and have a smaller value for the offset, like 5. We can keep doing this to make the user feel like they are doing an infinite scroll
      5. Offset can be device specific, for example, for mobile phones, the offset can be a small value as phones do not need to display a lot of posts at once



Database Design

  1. Based on the capacity estimations done earlier, a relational database is not advised
  2. We do not utilize joins here in any capacity here, another reason to not use a relational database
  3. We opt for a non-relational database that is based on the document style store to effectively store posts made by the user
  4. MongoDB is a good choice here since it is non-relational, a document style store and allows schema flexibility
  5. The data model will be as given below.
    1. messages
      1. Stores post related data
      2. messageID Primary Key
      3. userID Index
      4. created_at
      5. links
      6. content
      7. like_count
    2. user
      1. Stores user information
      2. userID Primary Key
      3. email
      4. phone
      5. name
      6. date_of_birth
    3. follow
      1. Stores user follower relationship. This can be used to quickly find the users that follow a particular user. We can then quickly know that when a user makes a post, all the users that follow it need to see the post
      2. userID Primary Key
      3. followerList List of users that follow this particular user
  6. Instead of storing this in a document style format, we can store this relationship in a graph based non-relational database like neo4j
  7. The like counter would be a massive scalability challenge. If a celebrity person makes a post, it will be liked by a large number of people. This will make the like counter for that particular post a hot spot for change. To protect against this, we would implement a distributed counter service for users with over let's say 10,000 users. This is a number that can be configured later. In this case, shared counters would do the counting using multiple different counters implemented in memory, using an in-memory store like Redis and once in a while, let's say every 5 minutes, we would hit our document store and update the like counter


High-level design

  1. A content delivery network service like Akamai or Cloudfare can be used to serve all the static content
  2. A rate limiter can protect against denial of service attacks, intentional or otherwise
  3. Tweet service's job is to receive requests for new tweets and populate them in the document store. It can queue up messages in a message queue, which would decouple components and the same message can be utilized by analytics or other services as well
  4. Tweet analyzer pulls messages from the message queue and analyzes them for inappropriate content, hashtags etc.
  5. Home feed service handles populating the home feed of a user
  6. Data should be shared by region as users of a certain region would be more interested in posts of that region. Globally trending posts could be cached in the CDN and served to all regions
  7. Data should be replicated among data centers for greater fault tolerance


flowchart TD CL[client] --> CDN[CDN] CDN --> RL[Rate Limiter] RL --> LB[Load Balancer] LB --tweet()--> TW[Tweet Service] TW --Store--> DB[DB: MongoDB] TW --> MQ[Message Queue] TA[Tweet Analyzer] --Pull Message--> MQ TA --Update-->DB CA[Content Analyzer] --Pull Message-->MQ LB --home_feed()--> Home[Home Feed Service] Home --> DB Home --> Cache[Redis Cache] LB --like()-->CT[Sharded Counter] CT-->DB




Request flows

  1. Populate Home Feed
    1. Get the user IDs of the users following the person making the post.
    2. Populate the post in the cache of all the users.
  2. Get Home Feed
    1. Check if posts are present in the post cache, if yes, get them, if not, hit the DB
  3. Follow User
    1. Go to the user ID which our current user wants to follow
    2. Add ID to the follower list



Detailed component design


Score of Tweet = weight_like * number_of_likes + weight_followers * number of followers on the tweet's author + weight_retweet * number_of_retweets


The top-K posts to be shown on a user's home page can cached for quick retrieval. The top posts can be found by inserting all of the posts in a max heap data structure and then picking the top K posts. The max heapify operation would run based on the tweet score calculated above.


The cached data can be stored in a caching system like Redis, this does not need to be persisted on disk since it is subject to change frequently.


For users that do not log in frequently, getting top K posts can be considered a waste of computing power. We can stop calculating the top K posts for users that have not logged in in the last 7 days, where the number of days is a configurable parameter. For these users, the posts can be populated on login and then these users can be added to the circulation of users for whom posts are cached.



Trade offs/Tech choices


  1. We a non-relational database instead of a relational one, the relational database would have provided better consistency although it would not have as scalable as a non-relational database like MongoDB
  2. Cassandra would be better than MongoDB in terms of scalability but a column store would not have allowed us flexibility in storing our data
  3. Storing post data becomes very easy because of a document store like MongoDB
  4. Since at no point in time can we lose messages, we opt for a log based message queue like Apache Kafka instead of something like RabbitMQ since we can never lose messages and the order of messages should be preserved as well



Failure scenarios/bottlenecks


  1. A celebrity makes a post that is liked by millions of other users and is retweeted by thousands of users, this can overwhelm the server. A distributed counter service can help in this situation
  2. If a celebrity with millions of followers makes a post, then post must be populated in the feed of millions of users, this can overwhelm the post service


Future Improvements


  1. Adding the functionality for a user to re-post a post made by a different user
  2. Adding filters to post to filter content on basis of certain words
  3. Post moderation service to prevent hateful speech, graphic content
  4. Ability to hide posts from a certain user
  5. Ability to handle deletion of posts by a user, dealing with choices like deletion completely or archival
  6. Ability to handle deleting a user's account, dealing with choices like complete deletion or archival
  7. Ability to have public posts or to have posts directed to a certain segment of chosen users