System requirements


Functional:

  1. User Registration and Authentication:
    • Users should be able to create accounts with unique usernames and passwords.
    • Users should be able to log in securely using their credentials.
  2. Compose and Share Tweets:
    • Users should be able to compose tweets with a character limit (e.g., 280 characters per tweet).
    • Users should be able to post tweets to their profile for others to see.

Tweets can also include multimedia:

    1. Image Uploads:
      • Users should be able to upload images with a specific file size limit (e.g., up to 5 MB).
      • The platform should support common image formats such as JPEG, PNG, and GIF.
    2. Video Uploads:
      • Users should be able to upload videos, possibly with a duration limit (e.g., up to 2 minutes).
      • Support common video formats such as MP4, MOV, and WebM.
    3. Post Composition:
      • Users should be able to include images or videos alongside their tweets.
      • The character limit for tweets might still apply when including multimedia (e.g., a reduced character count when uploading a video/image).
    4. Media Display:
      • The platform should display images and videos properly in the feed, ensuring good quality and accessibility.
      • Users may also have the option to view media in full-screen mode.
    5. Storage and Retrieval:
      • Efficient storage solutions must be in place for handling multimedia files, considering the scale of uploads.
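
The multimedia limits above could be enforced at composition time with a check along these lines. This is a minimal sketch, not a definitive implementation: the `Media` type, the `validate_tweet` function, and the constant names are hypothetical, and the limits are only the example values quoted in the requirements (280 characters, 5 MB, 2 minutes).

```python
from dataclasses import dataclass

# Example limits from the requirements; real values are a product decision.
MAX_TWEET_CHARS = 280
MAX_IMAGE_BYTES = 5 * 1024 * 1024   # "5 MB", assumed here to mean 5 MiB
MAX_VIDEO_SECONDS = 120             # 2 minutes
IMAGE_FORMATS = {"jpeg", "png", "gif"}
VIDEO_FORMATS = {"mp4", "mov", "webm"}


@dataclass
class Media:
    kind: str                # "image" or "video"
    fmt: str                 # file format, e.g. "png"
    size_bytes: int = 0      # used for images
    duration_s: float = 0.0  # used for videos


def validate_tweet(text: str, media: list) -> list:
    """Return a list of problems; an empty list means the tweet is acceptable."""
    problems = []
    if len(text) > MAX_TWEET_CHARS:
        problems.append("text exceeds character limit")
    for item in media:
        fmt = item.fmt.lower()
        if item.kind == "image":
            if fmt not in IMAGE_FORMATS:
                problems.append(f"unsupported image format: {fmt}")
            if item.size_bytes > MAX_IMAGE_BYTES:
                problems.append("image exceeds size limit")
        elif item.kind == "video":
            if fmt not in VIDEO_FORMATS:
                problems.append(f"unsupported video format: {fmt}")
            if item.duration_s > MAX_VIDEO_SECONDS:
                problems.append("video exceeds duration limit")
        else:
            problems.append(f"unknown media kind: {item.kind}")
    return problems
```

Returning a list of problems rather than raising on the first one lets the client show every issue to the user in a single round trip.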



  1. Follow Users:
    • Users should be able to follow other users to see their tweets in their timeline.
    • Users should be able to unfollow others to stop seeing their tweets.
  2. Like and Share Tweets:
    • Users should be able to like tweets to show appreciation.
    • Users should be able to retweet tweets to share them with their followers.
  3. Timeline and News Feed:
    • Users should have a personalized timeline displaying tweets from users they follow.
    • Users should be able to view a general news feed with popular and trending tweets.
  4. Search and Discover:
    • Users should be able to search for other users, hashtags, or specific tweets.
    • Users should be able to discover new content and users based on their interests.
  5. Notifications:
    • Users should receive notifications for new followers, likes, retweets, and mentions.
    • Users should be able to manage their notification preferences.
  6. Profile Management:
    • Users should be able to edit their profile information, including bio, profile picture, and header image.
    • Users should be able to view their tweets, followers, following, and likes.


Non-Functional:

  1. Performance:
    • The system should support a high number of simultaneous users, estimated to handle at least 100,000 active users concurrently.
    • The average response time for fetching tweets or loading user profiles should not exceed 1 second under normal load.
  2. Scalability:
    • The architecture should be scalable to handle growth in users and data without significant degradation in performance. This might mean supporting millions of users and tweets over time.
    • The system should be capable of horizontal scaling, allowing for additional servers to be added as the user base grows.
  3. Reliability:
    • The system should ensure 99.9% uptime to provide a consistently available service.
    • The application should handle failures gracefully, with mechanisms in place for data recovery.
  4. Availability:
    • The service should be available 24/7, with provisions for planned maintenance to limit downtime.
    • Redundant systems should be in place to ensure high availability across different geographical regions.
  5. Security:
    • The platform should implement strong security measures, including data encryption in transit and at rest, to protect user information.
    • User authentication must follow best practices (e.g., two-factor authentication) to prevent unauthorized access.
  6. Usability:
    • The user interface should be intuitive and easy to navigate, enabling users to perform actions (like composing tweets or checking notifications) with minimal effort.
    • Accessibility features should be included to cater to users with disabilities.
  7. Data Retention and Compliance:
    • The system should comply with relevant data protection regulations, such as GDPR or CCPA, ensuring proper data handling and user consent.
    • Data retention policies should be established to manage the storage and deletion of tweets and user information.



Capacity estimation

  • Assume 2,500,000 daily active users (DAU), of whom 20% post 5 tweets per day. Tweets consist of text, images, and video, so assume an average tweet size of 3 MB. That gives 2,500,000 x 20% x 5 x 3 MB = 7.5 TB per day, or 7.5 TB x 365 ≈ 2.74 PB per year.
  • Per the non-functional requirements, the system must handle at least 100,000 concurrent active users. Assuming each of them issues about one read request per second, read throughput is about 100,000 RPS. Assuming 20% of them are also posting, write throughput is about 20,000 RPS.
  • At peak, assume traffic is 3 times normal: about 300,000 RPS for reads and 60,000 RPS for writes.
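
The arithmetic above can be reproduced with a short script. This is only a sketch: the function and constant names are hypothetical, decimal units are assumed (1 TB = 1,000,000 MB), and the one-read-per-second figure per concurrent user is an assumption implied by the estimate rather than a measured value.

```python
# Inputs taken from the capacity estimate; integer math avoids rounding noise.
DAU = 2_500_000
POSTING_PERCENT = 20          # share of daily users who post
TWEETS_PER_POSTER = 5
AVG_TWEET_MB = 3              # average size including media

CONCURRENT_USERS = 100_000
READS_PER_USER_PER_SECOND = 1  # assumption: one read per concurrent user per second
WRITING_PERCENT = 20
PEAK_FACTOR = 3


def estimate() -> dict:
    """Return the storage and throughput figures derived from the inputs."""
    tweets_per_day = DAU * POSTING_PERCENT // 100 * TWEETS_PER_POSTER
    storage_tb_per_day = tweets_per_day * AVG_TWEET_MB / 1_000_000
    storage_pb_per_year = storage_tb_per_day * 365 / 1_000
    read_rps = CONCURRENT_USERS * READS_PER_USER_PER_SECOND
    write_rps = CONCURRENT_USERS * WRITING_PERCENT // 100
    return {
        "tweets_per_day": tweets_per_day,
        "storage_tb_per_day": storage_tb_per_day,
        "storage_pb_per_year": storage_pb_per_year,
        "read_rps": read_rps,
        "write_rps": write_rps,
        "peak_read_rps": read_rps * PEAK_FACTOR,
        "peak_write_rps": write_rps * PEAK_FACTOR,
    }


if __name__ == "__main__":
    for name, value in estimate().items():
        print(f"{name}: {value}")
```

Keeping the inputs as named constants makes it easy to rerun the estimate when an assumption such as the average tweet size changes.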


API design

All API endpoints must be accessible over HTTPS only.

All of the following endpoints require a user token, except registration and login.


User Management

  • POST /user/registration. Takes user input and creates a user account
  • POST /user/login. Takes user credentials and logs the user in
  • POST /user/notification. Manages the user's notification preferences
  • POST /user/{user_id}/follow. Follows another user
  • POST /user/{user_id}/unfollow. Unfollows another user
  • POST /user/{user_id}/profile. Edits the user profile
  • GET /user/{user_id}/tweets. Gets the user's tweets
  • GET /user/{user_id}/likes. Gets the user's likes
  • GET /user/{user_id}/followers. Gets the user's followers
  • GET /user/{user_id}/following. Gets the users this user follows


Tweets

  • POST /tweet/post. Post tweets
  • POST /tweet/{tweet_id}/share. Share/retweet a tweet with followers
  • GET /user/timeline. Get user timeline
  • GET /tweet/news. Get general and popular tweets
  • GET /tweet/search. Search for specific tweets
  • GET /tweet/discover. Discover content based on user interests
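
The two access rules for the API (HTTPS only, and a user token everywhere except registration and login) could be checked at the gateway roughly as follows. This is a sketch under stated assumptions: `PUBLIC_ENDPOINTS`, `requires_token`, `check_request`, and the returned decision strings are hypothetical names, and the token is only checked for presence, not validity.

```python
# Endpoints that may be called without a user token, per the API design.
PUBLIC_ENDPOINTS = {
    ("POST", "/user/registration"),
    ("POST", "/user/login"),
}


def requires_token(method: str, path: str) -> bool:
    """Every endpoint needs a token except registration and login."""
    return (method.upper(), path) not in PUBLIC_ENDPOINTS


def check_request(method: str, path: str, scheme: str, token=None) -> str:
    """Return a decision string for an incoming request.

    A real gateway would also verify the token's signature and expiry;
    here only its presence is checked.
    """
    if scheme.lower() != "https":
        return "reject_insecure"
    if requires_token(method, path) and not token:
        return "reject_unauthenticated"
    return "allow"
```

For example, `check_request("GET", "/user/timeline", "https")` is rejected as unauthenticated, while the same call with a token is allowed.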


Database design

User Table

  • user_id
  • user_name
  • password
  • bio
  • profile_picture_url
  • header_image_url
  • notification_preference
  • created_on
  • mfa_enable


User Relation table 1

  • user_id
  • following_user_id


User Relation table 2

  • user_id
  • followed_by_user_id


Tweet Table

  • tweet_id
  • user_id
  • hashtag
  • tweet_text
  • tweet_image_url
  • tweet_video_url
  • created_on
  • views


Tweet status table (could be a NoSQL store because of the throughput)

  • tweet_id
  • event_type (either liked or retweeted)
  • user_id
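
The tables above can be written down as a schema for experimentation. The sketch below uses an in-memory SQLite database purely for illustration; the table names, column types, and keys are assumptions that the design does not specify, and the tweet status table is shown relationally even though the design suggests it may live in a NoSQL store.

```python
import sqlite3

# Column names follow the table listings; types and keys are assumptions.
SCHEMA = """
CREATE TABLE users (
    user_id                 INTEGER PRIMARY KEY,
    user_name               TEXT NOT NULL UNIQUE,
    password                TEXT NOT NULL,      -- salted hash, never plain text
    bio                     TEXT,
    profile_picture_url     TEXT,
    header_image_url        TEXT,
    notification_preference TEXT,
    created_on              TEXT NOT NULL,
    mfa_enable              INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE user_following (      -- User Relation table 1
    user_id           INTEGER NOT NULL,
    following_user_id INTEGER NOT NULL,
    PRIMARY KEY (user_id, following_user_id)
);
CREATE TABLE user_followers (      -- User Relation table 2
    user_id             INTEGER NOT NULL,
    followed_by_user_id INTEGER NOT NULL,
    PRIMARY KEY (user_id, followed_by_user_id)
);
CREATE TABLE tweets (
    tweet_id        INTEGER PRIMARY KEY,
    user_id         INTEGER NOT NULL,
    hashtag         TEXT,
    tweet_text      TEXT,
    tweet_image_url TEXT,
    tweet_video_url TEXT,
    created_on      TEXT NOT NULL,
    views           INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE tweet_status (
    tweet_id   INTEGER NOT NULL,
    event_type TEXT NOT NULL CHECK (event_type IN ('liked', 'retweeted')),
    user_id    INTEGER NOT NULL,
    PRIMARY KEY (tweet_id, event_type, user_id)
);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Create all tables on the given connection."""
    conn.executescript(SCHEMA)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    print([row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")])
```

The composite primary key on the tweet status table is one way to keep a user from liking or retweeting the same tweet twice.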



High-level design

flowchart TD

B["User"];

C{"Load Balancer"};

D{"API Gateway"};

E["User management service"];

F["Tweet service"];

G["Discover service"];

I["CDN"]

H["Regulation Service"];

n1[("User Table")];

n2[("User Relation Table")];

n3[("Tweet Table")];

n4[("Tweet status Table")];

n5[("Blob")]

B --> C;

C --> D;

D --> E;

D --> F;

D --> G;

E --> n1

E --> n2

F --> n3

F --> n4

F --> n5

G --> n1

G --> n3

H --> n1

H --> n3

D --> I

I --> n5



Request flows

sequenceDiagram

participant A as User

participant B as User Management Service

participant C as Tweet service

participant D as Database

participant E as Discover Service

participant F as Regulation Service


A->>+B: Register

B->>+D: create user

B->>-A: user created

A->>+B: Login

B->>D: Authenticate

B->>-A: User token

A->>+B: With user token, follow/unfollow users

A->>C: With user token, read/post/like/retweet tweets

A->>E: With user token, fetch news and popular tweets

E->>D: discover new content and users based on their interests

F->>D: Delete data based on regulation





Detailed component design

All services are independent and stateless so that they can be scaled out easily.


User Management Service

  • Store only a salted hash of each user's password, never the password itself.
  • Integrate with a third-party multi-factor authentication service to enhance security.
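
Storing only a salted hash could look like the sketch below. It uses PBKDF2 from Python's standard library as one possible choice; the design does not name an algorithm, so the algorithm, the iteration count, the storage format, and the function names are all assumptions.

```python
import hashlib
import hmac
import secrets

# Illustrative work factor; tune it to the deployment's hardware.
DEFAULT_ITERATIONS = 200_000


def hash_password(password: str, iterations: int = DEFAULT_ITERATIONS) -> str:
    """Return 'iterations$salt_hex$hash_hex' using a fresh random salt."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return f"{iterations}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    iterations_text, salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        bytes.fromhex(salt_hex),
        int(iterations_text),
    )
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))
```

Because the salt is random per user, two users with the same password get different stored values, and `hmac.compare_digest` avoids leaking information through comparison timing.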


Tweet Service

  • Keep popular tweets in a cache to meet the performance requirement.
  • For multimedia tweets, the actual content (images, videos) is stored in blob storage and served through the CDN.
  • When a user posts, likes, or retweets a tweet, an event is generated and sent to a message queue. A fanout service consumes from the queue and notifies all users who follow the author.
  • When a user gets a new follower or is mentioned, an event is also sent to a message queue, and a service consumes the message and notifies the corresponding users.
  • An aggregation service aggregates each user's likes, following, and followers.
  • When a user views their timeline, a service generates the content based on the accounts the user follows. The result can be cached with an expiry time so that repeated requests within a short period do not hit the service again.
  • Given the volume of tweets, the tweet data should be sharded by tweet ID. This spreads load to reduce the chance of hotspots and lets us add servers with little data relocation (for example, by using consistent hashing).
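
The event flow through the message queue and the fanout service can be sketched in-process as follows. Python's `queue.Queue` stands in for a real message broker, and the `FanoutService` class, its method names, and the event fields are hypothetical; a production system would run the consumer as a separate, horizontally scaled service.

```python
from collections import defaultdict
from queue import Empty, Queue


class FanoutService:
    """Consumes tweet events and delivers them to the author's followers."""

    def __init__(self):
        self.events = Queue()                   # stand-in for the message queue
        self.followers = defaultdict(set)       # author_id -> follower ids
        self.notifications = defaultdict(list)  # user_id -> delivered events

    def follow(self, follower_id, author_id):
        self.followers[author_id].add(follower_id)

    def publish(self, event_type, author_id, tweet_id):
        """Called by the tweet service on post, like, or retweet."""
        self.events.put(
            {"type": event_type, "author_id": author_id, "tweet_id": tweet_id}
        )

    def drain(self) -> int:
        """Process all queued events; return the number of notifications sent."""
        delivered = 0
        while True:
            try:
                event = self.events.get_nowait()
            except Empty:
                return delivered
            # An author with no followers produces no notifications.
            for follower_id in self.followers.get(event["author_id"], ()):
                self.notifications[follower_id].append(event)
                delivered += 1
```

Because the producer only enqueues an event, posting stays fast even when the author has many followers; the cost of notifying them is paid asynchronously by the consumer.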


Discover Service

  • This is a backend service that may run periodically to generate new content recommendations based on user interests.
  • This service is also responsible for generating the general news feed of popular and trending tweets.


Regulation Service

  • This service is used only by admins to remove content as required for compliance.


Database

  • All tables are replicated across data centers to meet the availability requirement, so that if one data center fails, traffic fails over to another.
  • For this system, I choose availability over consistency, which means data is eventually consistent while it syncs across replicas.



Trade offs/Tech choices

  • I chose availability over consistency because the service should be available 24/7 and tweet data does not need to be real time; a short delay is not noticeable to most users. It also reduces response time, since we do not have to wait for all replicas to sync before responding to the user.
  • Use a NoSQL database for the tweet status table, since it is read and written frequently. Some NoSQL stores also offer an in-memory option for better performance, and NoSQL databases are easy to scale horizontally.
  • Use a message queue to absorb peak traffic. Users may see a delay before receiving notifications, but this is acceptable for this system.
  • Use a cache for better performance. This adds complexity, since we need to handle cache updates and cache invalidation.



Failure scenarios/bottlenecks

  • If a celebrity posts a tweet that is not yet in the cache, all read requests will hit the database and may bring it down.




Future improvements

  • Consider writing celebrity tweets directly into the cache for a set period so that read requests are served from the cache.
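
One way to realize this improvement is to warm the cache at write time for authors above a follower threshold, with an expiry on each entry. The sketch below is illustrative only: `TTLCache`, `post_tweet`, `read_tweet`, and the threshold value are hypothetical, and plain dictionaries stand in for the database and a distributed cache.

```python
import time

# Hypothetical cut-off for treating an author as a celebrity.
CELEBRITY_FOLLOWER_THRESHOLD = 1_000_000


class TTLCache:
    """A tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = {}  # key -> (value, expires_at)

    def put(self, key, value):
        self._items[key] = (value, self._clock() + self._ttl)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]  # drop the expired entry
            return None
        return value


def post_tweet(db, cache, tweet_id, text, author_followers):
    """Write to the database; pre-warm the cache for celebrity authors."""
    db[tweet_id] = text
    if author_followers >= CELEBRITY_FOLLOWER_THRESHOLD:
        cache.put(tweet_id, text)


def read_tweet(db, cache, tweet_id):
    """Return (text, source), where source is 'cache' or 'database'."""
    cached = cache.get(tweet_id)
    if cached is not None:
        return cached, "cache"
    text = db[tweet_id]
    cache.put(tweet_id, text)  # cache-aside fill for later readers
    return text, "database"
```

With the pre-warm in place, the first wave of reads for a celebrity tweet is served from the cache instead of reaching the database; ordinary tweets still follow the usual cache-aside path.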