System requirements


Functional:

List functional requirements for the system...

  1. Users should be able to post tweets (140 chars)
  2. In the tweet, the user can attach images and videos in addition to 140 chars of text
  3. Users can share other tweets
  4. Users can favorite other tweets
  5. Users can view a timeline or feed of tweets to track updates of other users


Non-Functional:

List non-functional requirements for the system...

  1. Scalable - must handle a large number of tweets and a large data store (see capacity estimation below); the workload is read-heavy, so plan for horizontal scaling behind a load balancer
  2. Highly available - the service should stay up even at the cost of strict consistency, which is acceptable for a read-heavy workload
  3. Reliable - no single points of failure (the page should always load)
  4. Low latency - page load time under 1 s




Capacity estimation

Estimate the scale of the system you are going to design...

  1. DAU: 30 million users
  2. 1/10 of DAU post 5 tweets a day: 3 million * 5 = 15 million tweets posted a day
  3. 1/5 of tweets contain video/images -> 3 million images/videos a day
  4. Text -> 140 chars = 140 bytes (assuming 1 byte per char) * 15 million = 2100 MB or 2.1 GB of data every day
  5. Media -> 30 MB per image/video * 3 million = 90 TB per day
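  6. Traffic: plan for peak periods (events, live tweeting); there are 86,400 seconds in one day
  7. Peak writes: 100 million users posting in a day / 86,400 s ≈ 1,160, or roughly 1000 writes/second
  8. Peak reads: 200 million users reading in a day / 86,400 s ≈ 2,300, or roughly 2000 reads/second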
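
The daily figures above can be checked with a short script. This is a minimal sketch that uses only the numbers from the list; the 1 byte per character and decimal units (1 GB = 10^9 bytes) are assumptions. Note that 3 million media items at 30 MB each come to 90 TB per day, and that the average (non-peak) write rate is well below the peak figures.

```python
SECONDS_PER_DAY = 86_400

dau = 30_000_000
posting_users = dau // 10                  # 1/10 of DAU post
tweets_per_day = posting_users * 5         # 5 tweets each
media_per_day = tweets_per_day // 5        # 1/5 of tweets carry media

text_bytes_per_day = tweets_per_day * 140             # assumes 1 byte per char
media_bytes_per_day = media_per_day * 30 * 10**6      # 30 MB per media item

avg_writes_per_sec = tweets_per_day / SECONDS_PER_DAY

print(f"tweets/day:   {tweets_per_day:,}")
print(f"text/day:     {text_bytes_per_day / 10**9:.1f} GB")
print(f"media/day:    {media_bytes_per_day / 10**12:.0f} TB")
print(f"avg writes/s: {avg_writes_per_sec:.0f}")
```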


Summary:

  1. Need scalable DB to store this much data
  2. Since this is a read-heavy system, will need read replicas + caching to improve performance
  3. Need object store for images/videos -> can leverage CDN for regional locations as a caching mechanism
  4. To process reads/writes, need multiple application servers (horizontal scaling) behind a load balancer that distributes the load
  5. Can additionally have a cache between app servers and DB
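
The cache between the app servers and the DB could look roughly like the read-through cache below. This is only a sketch: the class name `ReadThroughCache`, the `load_from_db` callback, and the LRU eviction policy are all assumptions for illustration, and a real deployment would more likely use a shared cache service than an in-process dictionary.

```python
from collections import OrderedDict


class ReadThroughCache:
    """LRU cache in front of a slower lookup (e.g. a database read)."""

    def __init__(self, load_from_db, capacity=1000):
        self._load = load_from_db          # hypothetical DB read function
        self._capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)       # mark as most recently used
            return self._items[key]
        value = self._load(key)                # cache miss: go to the database
        self._items[key] = value
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)    # evict least recently used
        return value
```

Repeated reads of the same key hit the database only once until the entry is evicted.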




API design

Define what APIs are expected from the system...


  1. postTweet
     • Arguments
         • Username: str, username of user posting tweet
         • User location: [lat, long], location of user
         • User device: str, device from which tweet was posted
         • Text: tweet text
         • Images: optional image
         • Videos: optional video
     • Logic
         • Validate text constraints (140 chars) -> can be done on the frontend
         • Upload images + video to S3
         • Write data to database (user metadata, tweet info)
         • Generate a tweet ID
     • Response (REST API)
         • 201 successfully created
         • 500 internal server error (something went wrong on the backend)
         • 400-level user validation issue
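
The postTweet logic can be sketched as a small handler. This is an illustrative sketch, not a definitive implementation: `post_tweet`, `upload_media`, and the in-memory `TWEETS` dict are hypothetical names, the media upload is a stub rather than a real S3 call, and 400 is used as the specific 400-level code.

```python
import uuid
from datetime import datetime, timezone

MAX_TWEET_CHARS = 140
TWEETS = {}  # hypothetical in-memory stand-in for the tweet table


def upload_media(blob: bytes) -> str:
    """Stub for the object-store upload; returns a made-up object key."""
    return f"media/{uuid.uuid4().hex}"


def post_tweet(username, text, location=None, device=None, images=(), videos=()):
    """Return (status_code, body) using the response codes listed above."""
    if not username or not text or len(text) > MAX_TWEET_CHARS:
        return 400, {"error": "invalid username or tweet text"}
    try:
        media_keys = [upload_media(m) for m in (*images, *videos)]
        tweet_id = uuid.uuid4().hex
        TWEETS[tweet_id] = {
            "tweet_id": tweet_id,
            "username": username,
            "text": text,
            "location": location,
            "device": device,
            "media_keys": media_keys,
            "created_at": datetime.now(timezone.utc).isoformat(),
        }
    except Exception:
        return 500, {"error": "internal server error"}
    return 201, {"tweet_id": tweet_id}
```

The length check is repeated on the server here even though the frontend can enforce it, since a client may bypass the frontend.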


  1. shareTweet
     • Arguments
         • Tweet ID: str, ID of the tweet to share
         • Username: str, user who retweeted
     • Logic
         • New tweet created in database with reference to original tweet (need a column marking whether it is a retweet)
         • Increment original tweet's retweet count
         • Queue the retweet for timeline generation so it reaches the feeds of the retweeting user's followers
     • Response
         • 200 successfully retweeted
         • Contains the info needed to send a notification to the original poster
  2. favouriteTweet
     • Arguments
         • Tweet ID: str, ID of tweet to favourite
         • Username: str, user who favourited
     • Logic
         • Original tweet's favourite count is incremented
         • Add the user to the tweet's list of usernames that have favourited it
         • Update user's list of favourited tweets
     • Response
         • 200 successfully favourited
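
The share and favourite logic above can be sketched against an in-memory store. The names `tweets`, `user_favourites`, `share_tweet`, and `favourite_tweet` are hypothetical, and two behaviours are assumptions not stated above: a 404 for an unknown tweet ID, and treating a repeated favourite by the same user as a no-op.

```python
import itertools

tweets = {
    "t1": {"author": "alice", "text": "hello", "retweet_of": None,
           "retweet_count": 0, "favourite_count": 0, "favourited_by": set()},
}
user_favourites = {}          # username -> set of favourited tweet IDs
_ids = itertools.count(2)     # toy ID generator for new tweets


def share_tweet(tweet_id, username):
    original = tweets.get(tweet_id)
    if original is None:
        return 404, {"error": "tweet not found"}   # assumption, see lead-in
    new_id = f"t{next(_ids)}"
    tweets[new_id] = {"author": username, "text": original["text"],
                      "retweet_of": tweet_id, "retweet_count": 0,
                      "favourite_count": 0, "favourited_by": set()}
    original["retweet_count"] += 1
    # The body carries what is needed to notify the original poster.
    return 200, {"tweet_id": new_id, "notify": original["author"]}


def favourite_tweet(tweet_id, username):
    tweet = tweets.get(tweet_id)
    if tweet is None:
        return 404, {"error": "tweet not found"}   # assumption, see lead-in
    if username not in tweet["favourited_by"]:     # repeat favourite is a no-op
        tweet["favourited_by"].add(username)
        tweet["favourite_count"] += 1
        user_favourites.setdefault(username, set()).add(tweet_id)
    return 200, {"favourite_count": tweet["favourite_count"]}
```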


  1. viewFeed
     • Arguments
         • Username: str, user whose feed we need to get
     • Logic
         • Can have a cache that gets updated with the latest 50 tweets for a given user
         • We simply query the cache and return tweets
         • Every time a new tweet is posted that's relevant to this user, it gets queued for feed generation
     • Response
         • 200 successful
         • 500 internal failure
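
The feed cache can be sketched with a bounded deque per user. This is a simplified sketch: `followers`, `feed_cache`, `follow`, `fan_out`, and `view_feed` are hypothetical names, and the fan-out runs inline here, whereas the design above would push it through a queue.

```python
from collections import defaultdict, deque

FEED_SIZE = 50

followers = defaultdict(set)   # author -> usernames following them
# username -> newest-first tweet IDs, capped at FEED_SIZE entries
feed_cache = defaultdict(lambda: deque(maxlen=FEED_SIZE))


def follow(follower, author):
    followers[author].add(follower)


def fan_out(author, tweet_id):
    """Push a new tweet into the cached feed of every follower."""
    for user in followers[author]:
        # With maxlen set, appending on the left drops the oldest on the right.
        feed_cache[user].appendleft(tweet_id)


def view_feed(username):
    """Serve the feed straight from the cache."""
    return 200, list(feed_cache[username])
```

Reads never touch the tweet table in this sketch; the cost moves to write time, which grows with the author's follower count.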



Database design

Defining the system data model early on will clarify how data will flow among different components of the system. Also you could draw an ER diagram using the diagramming tool to enhance your design...


Object store: S3

  1. Stores images, video


Database: NoSQL (DynamoDB) because we need something highly scalable (horizontal) since we'll be storing millions of tweets each day

  • Trade-off: we give up strong (ACID-style) consistency, but that should be okay: a small delay before tweets render on the page, or stale tweets instead of the latest on each reload, is acceptable
  1. Users
     • Email
     • Username
     • User since date
     • Tagline
     • Tweets
         • Pointers to tweets in tweet DB (foreign key)
     • Favourited tweets
         • Pointers to tweets in tweet DB
     • Following
         • Users they follow
     • Followers
         • Users that follow them
     • Number of followers
  2. Tweets
     • Tweet ID (primary key)
     • Text
     • Image/video pointers to object store
     • Date created (sort key)
     • Location of tweet
     • Posted from device
     • Retweet count
     • Favourited count
     • Whether it's a retweet or the original tweet
     • User who created tweet (foreign key)
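
The two entities above can be written down as plain record types to make the fields and references concrete. This is a sketch of the shape of the data only, not a DynamoDB table definition; the field names are hypothetical, and the follower count is derived from the followers set here rather than stored as its own attribute.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Tweet:
    tweet_id: str
    author: str                        # username of the creator
    text: str
    created_at: datetime
    media_keys: list[str] = field(default_factory=list)  # object-store pointers
    location: Optional[tuple[float, float]] = None       # (lat, long)
    device: Optional[str] = None
    retweet_of: Optional[str] = None   # original tweet ID if this is a retweet
    retweet_count: int = 0
    favourite_count: int = 0

    @property
    def is_retweet(self) -> bool:
        return self.retweet_of is not None


@dataclass
class User:
    username: str
    email: str
    user_since: datetime
    tagline: str = ""
    tweet_ids: list[str] = field(default_factory=list)
    favourited_tweet_ids: set[str] = field(default_factory=set)
    following: set[str] = field(default_factory=set)
    followers: set[str] = field(default_factory=set)

    @property
    def follower_count(self) -> int:
        return len(self.followers)
```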


Common queries:

  1. Timeline: get the 50 most recent tweets from people the user follows
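
If the timeline were computed at read time instead of served from a cache, the query amounts to merging each followed user's tweets by recency and taking the first 50. The sketch below assumes each author's tweets are already available as a newest-first list of `(created_at, tweet_id)` pairs; `timeline` and `tweets_by_author` are hypothetical names.

```python
import heapq
from itertools import islice


def timeline(following, tweets_by_author, limit=50):
    """Merge per-author newest-first lists and return the newest tweet IDs."""
    streams = (tweets_by_author.get(author, []) for author in following)
    # reverse=True expects every input sorted from largest to smallest.
    merged = heapq.merge(*streams, reverse=True)
    return [tweet_id for _, tweet_id in islice(merged, limit)]
```

Because the merge is lazy, only about `limit` entries are pulled in total, regardless of how many tweets each followed user has.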


Indexing:

  1. User -> get all tweets by a given user in date range X
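
The access pattern behind this index is "partition by user, sort by creation date, then range-scan". The sketch below models that in memory with a sorted list per user and binary search; `tweets_by_user`, `index_tweet`, and `tweets_in_range` are hypothetical names, and the range is treated as half-open (start inclusive, end exclusive) as an assumption.

```python
from bisect import bisect_left, insort
from collections import defaultdict
from datetime import datetime

tweets_by_user = defaultdict(list)   # username -> sorted [(created_at, tweet_id)]


def index_tweet(username, created_at, tweet_id):
    """Insert while keeping the user's entries sorted by creation time."""
    insort(tweets_by_user[username], (created_at, tweet_id))


def tweets_in_range(username, start, end):
    """Tweet IDs by `username` with start <= created_at < end, oldest first."""
    entries = tweets_by_user[username]
    # ("" sorts before any tweet ID, so each probe lands on the first entry
    # at or after the given timestamp.)
    lo = bisect_left(entries, (start, ""))
    hi = bisect_left(entries, (end, ""))
    return [tweet_id for _, tweet_id in entries[lo:hi]]
```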


Potential areas/issues:

  1. Tweet gets deleted -> dangling reference in favourited tweets
  2. User gets deleted -> do we keep all their tweets?
  3. Follow/unfollow -> the following list of one user and the followers list of the other must be updated together
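
For the dangling-reference case, one possible mitigation (an assumption, not something decided above) is to resolve favourites lazily at read time and prune IDs whose tweets no longer exist. `resolve_favourites` is a hypothetical helper, and `tweets` stands in for the tweet table.

```python
def resolve_favourites(favourited_ids, tweets):
    """Return (live tweets, pruned ID list), dropping IDs of deleted tweets."""
    live_ids = [tid for tid in favourited_ids if tid in tweets]
    return [tweets[tid] for tid in live_ids], live_ids
```

The caller can write the pruned ID list back to the user record, so stale pointers are cleaned up gradually without a fan-out on every delete.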





High-level design

You should identify enough components that are needed to solve the actual problem from end to end. Also remember to draw a block diagram using the diagramming tool to augment your design. If you are unfamiliar with the tool, you can simply describe your design to the chat bot and ask it to generate a starter diagram for you to modify...






Request flows

Explain how the request flows from end to end in your high level design. Also you could draw a sequence diagram using the diagramming tool to enhance your explanation...






Detailed component design

Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...






Trade offs/Tech choices

Explain any trade offs you have made and why you made certain tech choices...






Failure scenarios/bottlenecks

Try to discuss as many failure scenarios/bottlenecks as possible.






Future improvements

What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?