System requirements
Functional:
support an API for a client to "tweet"
support an API for a client to "load timeline"
support an API for a user to "follow" another user
timeline is shown in reverse chronological order
timeline is filled with tweets from users that the client follows
tweets are limited to 300 characters
tweets support text only
Non-Functional:
availability - for good user experience
low latency - for good user experience
scalability - for potential rapid user adoption
Capacity estimation
1M users
5 tweets per day per user
5M tweets per day at up to 300 characters per tweet
Storage:
~1.5 GB / day of tweets (assuming 1 byte per character)
Compute:
1M users * 5 tweets / day / 24 / 3600 ~ 58 tweets/sec
assume a read/write ratio of 4:1
where 1 read is 1 request to "load timeline"
4 * 5M tweets = 20M reads / day / 24 / 3600 ~ 231 reads/sec
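The arithmetic above can be re-checked with a short script. This is only a sketch of the estimate: the constant names are made up for illustration, and the 1 byte per character figure is an assumption (it holds for ASCII text only).

```python
# Back-of-the-envelope capacity estimate using the figures from the notes.
USERS = 1_000_000
TWEETS_PER_USER_PER_DAY = 5
MAX_CHARS_PER_TWEET = 300
BYTES_PER_CHAR = 1            # assumption: ASCII-only text, 1 byte per character
READS_PER_WRITE = 4           # read/write ratio of 4:1
SECONDS_PER_DAY = 24 * 3600

tweets_per_day = USERS * TWEETS_PER_USER_PER_DAY
storage_bytes_per_day = tweets_per_day * MAX_CHARS_PER_TWEET * BYTES_PER_CHAR
writes_per_sec = tweets_per_day / SECONDS_PER_DAY
reads_per_day = tweets_per_day * READS_PER_WRITE
reads_per_sec = reads_per_day / SECONDS_PER_DAY

print(f"tweets/day:   {tweets_per_day:,}")
print(f"storage/day:  {storage_bytes_per_day / 1e9:.1f} GB")
print(f"writes/sec:   {writes_per_sec:.0f}")
print(f"reads/sec:    {reads_per_sec:.0f}")
```

These are daily averages; peak traffic would be higher, so the numbers are a lower bound for sizing.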
API design
POST /tweet?userid=
GET /timeline?range=
POST /follow?userid1=
Database design
user table
user_id uuid
profile_info string
profile_photo url
# could represent this many-to-many relationship with a graph db
followers table
follower uuid
followee uuid
tweets table
tweet_id int
user_id uuid
content string
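As a rough sketch, the three tables above could be expressed with Python's built-in sqlite3 module. Several details here are assumptions rather than part of the notes: uuids are stored as text, a `created_at` column is added so the timeline can be ordered in reverse chronological order, the composite primary key on the followers table is one way to make duplicate follows a no-op, and the helper names (`follow`, `tweet`, `load_timeline`) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    user_id       TEXT PRIMARY KEY,   -- uuid stored as text
    profile_info  TEXT,
    profile_photo TEXT                -- url
);
CREATE TABLE followers (
    follower TEXT NOT NULL,
    followee TEXT NOT NULL,
    PRIMARY KEY (follower, followee)  -- a repeated follow cannot create a second row
);
CREATE TABLE tweets (
    tweet_id   INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id    TEXT NOT NULL,
    content    TEXT NOT NULL CHECK (length(content) <= 300),
    created_at INTEGER NOT NULL       -- assumption: unix timestamp, not in the notes
);
""")

def follow(follower, followee):
    # INSERT OR IGNORE skips the row if the (follower, followee) pair already exists.
    conn.execute("INSERT OR IGNORE INTO followers VALUES (?, ?)", (follower, followee))

def tweet(user_id, content, created_at):
    cur = conn.execute(
        "INSERT INTO tweets (user_id, content, created_at) VALUES (?, ?, ?)",
        (user_id, content, created_at),
    )
    return cur.lastrowid

def load_timeline(user_id, limit=20):
    # Tweets from everyone the user follows, newest first.
    return conn.execute(
        """
        SELECT t.tweet_id, t.user_id, t.content
        FROM tweets t
        JOIN followers f ON f.followee = t.user_id
        WHERE f.follower = ?
        ORDER BY t.created_at DESC, t.tweet_id DESC
        LIMIT ?
        """,
        (user_id, limit),
    ).fetchall()

for uid in ("alice", "bob"):
    conn.execute("INSERT INTO users (user_id) VALUES (?)", (uid,))
follow("alice", "bob")
follow("alice", "bob")                 # duplicate follow, ignored
tweet("bob", "first", created_at=1)
tweet("bob", "second", created_at=2)
print(load_timeline("alice"))
```

Reading the timeline straight from the join is the simplest version; the timeline cache described later exists to avoid running this query on every load.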
High-level design
client request goes to a load balancer, which routes it to the appropriate API server
Tweet - request goes onto the tweet message queue and is then written to the tweets table
Follow - runs checks to prevent duplicate follows and other edge cases, then writes to the followers table (or the graph db, if one is used)
load_timeline - reads from a timeline cache that is periodically rehydrated by a fanout service
if the cached data is exhausted, we contact the fanout service directly, based on the range sent, for either a page of older tweets or a fresh timeline (re-reading the tweets table for new tweets)
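The load_timeline read path described above might look roughly like the following. This is a minimal in-memory sketch, not a definitive design: `FanoutService`, `TimelineService`, and the interpretation of `range` as an `(offset, limit)` pair are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Tweet:
    tweet_id: int
    user_id: str
    content: str

class FanoutService:
    """Hypothetical stand-in that builds timelines from the tweets store."""

    def __init__(self, tweets, following):
        self.tweets = tweets          # list of Tweet, oldest first
        self.following = following    # user_id -> set of followed user_ids

    def build_page(self, user_id, offset, limit):
        followed = self.following.get(user_id, set())
        newest_first = [t for t in reversed(self.tweets) if t.user_id in followed]
        return newest_first[offset:offset + limit]

class TimelineService:
    def __init__(self, fanout, cache_size=3):
        self.fanout = fanout
        self.cache_size = cache_size
        self.cache = {}               # user_id -> list of Tweet, newest first

    def rehydrate(self, user_id):
        # Called periodically by the fanout job in the real design.
        self.cache[user_id] = self.fanout.build_page(user_id, 0, self.cache_size)

    def load_timeline(self, user_id, offset=0, limit=2):
        cached = self.cache.get(user_id, [])
        if offset + limit <= len(cached):
            return cached[offset:offset + limit], "cache"
        # Cache exhausted (or cold): ask the fanout service for this page directly.
        return self.fanout.build_page(user_id, offset, limit), "fanout"

tweets = [Tweet(i, "bob", f"tweet {i}") for i in range(1, 6)]
service = TimelineService(FanoutService(tweets, {"alice": {"bob"}}))
service.rehydrate("alice")
first_page, first_source = service.load_timeline("alice", offset=0)
second_page, second_source = service.load_timeline("alice", offset=2)
print(first_source, [t.tweet_id for t in first_page])
print(second_source, [t.tweet_id for t in second_page])
```

Offset-based paging can skip or repeat tweets if new ones arrive between requests; a cursor based on the last seen tweet would avoid that, at the cost of a slightly more involved API.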
Request flows
Explain how the request flows from end to end in your high level design. Also you could draw a sequence diagram using the diagramming tool to enhance your explanation...
Detailed component design
tweets table scales by decoupling the tweet request from the write to the tweets db; at peak times, when there are lots of tweets, we can either scale out the servers to handle more writes or allow the queue to back up for a while until the db catches up
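The decoupling described above can be illustrated with an in-process queue and a single writer thread. This is only a sketch: `queue.Queue` stands in for the tweet message queue, a plain list stands in for the table, and the function names are hypothetical.

```python
import queue
import threading

tweet_queue = queue.Queue()   # stand-in for the tweet message queue
tweets_table = []             # stand-in for the table the writer persists to
STOP = object()               # sentinel that tells the writer to exit

def post_tweet(user_id, content):
    """API handler: validate, enqueue, and return without touching the db."""
    if len(content) > 300:
        raise ValueError("tweets are limited to 300 characters")
    tweet_queue.put((user_id, content))
    return "accepted"

def db_writer():
    """Consumer: drains the queue at whatever rate the db can sustain."""
    while True:
        item = tweet_queue.get()
        try:
            if item is STOP:
                return
            tweets_table.append(item)
        finally:
            tweet_queue.task_done()

# A burst arrives while no writer is running, so the queue simply backs up.
for i in range(100):
    post_tweet("u1", f"tweet {i}")
backlog = tweet_queue.qsize()

# Once a writer is available, the backlog is drained in order.
worker = threading.Thread(target=db_writer)
worker.start()
tweet_queue.join()
tweet_queue.put(STOP)
worker.join()
print(backlog, len(tweets_table))
```

Starting more writer threads (or consumers, with a real queue) corresponds to the "scale the servers" option; leaving one writer and letting the backlog grow corresponds to the "let the queue back up" option.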
timeline cache
- timeline cache reduces the latency of loading the timeline when the user opens the app
- ensures content is ready at read time
- scales by using a distributed cache service like Redis
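One way to picture how a distributed timeline cache spreads load is to hash each user id onto a shard and keep a bounded, newest-first list per user. The sketch below is in-memory only and does not use Redis; `ShardedTimelineCache`, the shard count, and the per-user entry cap are assumptions for illustration.

```python
import hashlib
from collections import deque

class ShardedTimelineCache:
    """In-memory stand-in for a distributed cache, sharded by user id."""

    def __init__(self, num_shards=4, max_entries=800):
        self.shards = [{} for _ in range(num_shards)]
        self.max_entries = max_entries

    def _shard_for(self, user_id):
        # Stable hash so the same user always maps to the same shard.
        digest = hashlib.sha256(user_id.encode("utf-8")).digest()
        return self.shards[int.from_bytes(digest[:8], "big") % len(self.shards)]

    def push(self, user_id, tweet_id):
        shard = self._shard_for(user_id)
        if user_id not in shard:
            shard[user_id] = deque(maxlen=self.max_entries)
        # Newest entry goes to the front; the oldest falls off the back when full.
        shard[user_id].appendleft(tweet_id)

    def read(self, user_id, limit=20):
        timeline = self._shard_for(user_id).get(user_id, ())
        return list(timeline)[:limit]

cache = ShardedTimelineCache(num_shards=4, max_entries=3)
for tweet_id in range(1, 6):
    cache.push("alice", tweet_id)
print(cache.read("alice"))
```

Plain modulo sharding remaps most keys when the shard count changes, so a real deployment would more likely rely on the cache service's own partitioning scheme.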
Trade offs/Tech choices
running the fanout service asynchronously means the timeline might not contain the most recent tweets at load time
this trades freshness for decoupling the fanout service (which can run in batches at a regular interval) from the tweet write path
this can be mitigated by having the server that handles the tweet API also send the new tweet to the fanout service
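The mitigation in the last line, where the tweet handler hands each new tweet to the fanout service, could be sketched as follows. The structures and names (`followers_of`, `timeline_cache`, `fanout_on_write`) are hypothetical in-memory stand-ins, not the actual services.

```python
from collections import defaultdict

followers_of = defaultdict(set)      # followee -> set of follower ids
timeline_cache = defaultdict(list)   # user_id -> tweet ids, newest first
tweets_table = {}                    # tweet_id -> (author_id, content)

def follow(follower, followee):
    followers_of[followee].add(follower)

def fanout_on_write(author_id, tweet_id):
    """Push the new tweet to the front of each follower's cached timeline."""
    for follower in followers_of[author_id]:
        timeline_cache[follower].insert(0, tweet_id)

def post_tweet(author_id, tweet_id, content):
    tweets_table[tweet_id] = (author_id, content)
    # The tweet handler notifies the fanout service directly instead of
    # waiting for the next batch run.
    fanout_on_write(author_id, tweet_id)

follow("alice", "carol")
follow("bob", "carol")
post_tweet("carol", 1, "hello")
post_tweet("carol", 2, "hello again")
print(timeline_cache["alice"], timeline_cache["bob"])
```

The cost of this push grows with the author's follower count, so it moves work onto the write path; the batch fanout can still run alongside it to repair any missed updates.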
Failure scenarios/bottlenecks
Try to discuss as many failure scenarios/bottlenecks as possible.
Future improvements
What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?