System requirements
Functional:
- Be able to post text online
Non-Functional:
- Available
- Low latency
- Scalable
Capacity estimation
- On average a post would be 1000 characters -> 1000 bytes -> 1 KB (assuming 1 byte per character)
- On average about 10 new posts per second -> 600 posts per minute
- On average 10,000 characters written per second -> 10 KB per second
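The arithmetic above can be checked with a short script. This is a rough sketch only: the per-day and per-year totals are extrapolations from the stated averages (they are not figures from the notes), and the constant names are made up for illustration.

```python
# Back-of-envelope capacity estimate using the averages listed above.
AVG_POST_BYTES = 1_000      # ~1000 characters, assuming 1 byte per character
POSTS_PER_SECOND = 10       # average write rate from the estimate above
SECONDS_PER_DAY = 24 * 60 * 60

posts_per_minute = POSTS_PER_SECOND * 60
write_bytes_per_second = AVG_POST_BYTES * POSTS_PER_SECOND

# Extrapolation (assumes the average rate holds around the clock).
posts_per_day = POSTS_PER_SECOND * SECONDS_PER_DAY
storage_bytes_per_day = posts_per_day * AVG_POST_BYTES
storage_bytes_per_year = storage_bytes_per_day * 365

print(f"{posts_per_minute} posts/min, {write_bytes_per_second / 1_000:.0f} KB/s written")
print(f"{storage_bytes_per_day / 1e6:.0f} MB/day, {storage_bytes_per_year / 1e9:.1f} GB/year")
```

Under these assumptions the text itself grows by well under a terabyte per year, so raw storage is unlikely to be the first constraint.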
API design
External:
- POST /post -> Stores the submitted text and returns the URL of the new post
- GET /getPost -> Returns the post stored at the given URL
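To make the two endpoints concrete, here is a minimal sketch of what their handlers might accept and return. Everything beyond the two routes is an assumption: the handler names, the `postId` query parameter, the response fields, and the in-memory dictionary standing in for real storage.

```python
import uuid

# Hypothetical in-memory stand-in for the real storage layer.
_posts = {}


def handle_create_post(user_id: str, text: str) -> dict:
    """Sketch of POST /post: store the text and return the new post's URL."""
    post_id = uuid.uuid4().hex
    _posts[post_id] = {"postId": post_id, "userId": user_id, "text": text}
    # The URL shape (query parameter "postId") is an assumption.
    return {"status": 201, "postId": post_id, "url": f"/getPost?postId={post_id}"}


def handle_get_post(post_id: str) -> dict:
    """Sketch of GET /getPost: look the post up by its ID."""
    post = _posts.get(post_id)
    if post is None:
        return {"status": 404, "error": "post not found"}
    return {"status": 200, **post}
```

A client would call `handle_create_post("u1", "hello")`, keep the returned `postId`, and later pass it to `handle_get_post`.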
Database design
User Table:
- Username
- First Name
- Last Name
- UserID
Post Table:
- S3 Key
- UserID
- Author
- LastUpdatedTimestamp
- PostID
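The two tables could be modeled roughly as follows. The field names mirror the columns listed above; the Python types, and the reading of Author as a denormalized display name, are assumptions rather than part of the design.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class User:
    user_id: str
    username: str
    first_name: str
    last_name: str


@dataclass
class Post:
    post_id: str
    user_id: str            # references User.user_id
    author: str             # assumed to be a display name copied from the user
    s3_key: str             # where the post text lives in S3
    last_updated: datetime  # LastUpdatedTimestamp column


# Example rows with made-up values.
alice = User(user_id="u1", username="alice", first_name="Alice", last_name="Ng")
post = Post(
    post_id="p1",
    user_id=alice.user_id,
    author=alice.username,
    s3_key="posts/p1",
    last_updated=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)
```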
High-level design
We could shard the database by the time of day the post was last updated.
Because a post is most likely written once and then read many times,
this system would likely be read-heavy
- Use replicas to reduce the load on the primary DB
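One way to picture the sharding and replica ideas together is the sketch below. It assumes four shards, UTC hours as the "time of day", and round-robin reads across replicas; none of these specifics come from the notes, and the names `shard_for` and `ShardNodes` are hypothetical.

```python
from datetime import datetime, timezone
from itertools import cycle

NUM_SHARDS = 4  # assumption: the notes do not give a shard count


def shard_for(last_updated: datetime, num_shards: int = NUM_SHARDS) -> int:
    """Map the hour of day (0-23, in UTC) of the last update to a shard index."""
    hour = last_updated.astimezone(timezone.utc).hour
    return hour * num_shards // 24


class ShardNodes:
    """One shard: writes go to the primary, reads rotate across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = cycle(replicas) if replicas else None

    def node_for(self, is_write: bool):
        # Fall back to the primary when the shard has no replicas.
        if is_write or self._replicas is None:
            return self.primary
        return next(self._replicas)
```

With four shards, hours 0-5 map to shard 0, 6-11 to shard 1, and so on. Note that under this scheme a lookup needs the post's last-updated time to find its shard, and an edited post may move to a different shard.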
User sends request
The load balancer distributes the request across the servers
For GET requests we could check the cache first (an LRU cache sized to hold the most-visited 20% of posts) and, on a cache miss, read the post from the DB
For POST requests we would store the text in S3 and then store the post metadata in the posts DB
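The read and write paths described above can be sketched with a small LRU cache in front of the storage layer. This is an illustration under stated assumptions: plain dictionaries stand in for S3 and the posts DB, the `posts/<id>` key format is invented, and the cache capacity is a fixed number rather than 20% of posts.

```python
from collections import OrderedDict


class LRUCache:
    """Evicts the least recently used entry once capacity is exceeded."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used


class PostService:
    def __init__(self, cache: LRUCache):
        self.cache = cache
        self.blob_store = {}  # stand-in for S3: s3_key -> text
        self.posts_db = {}    # stand-in for the posts DB: post_id -> metadata

    def create_post(self, post_id: str, user_id: str, text: str) -> None:
        """Write path: text goes to blob storage, metadata to the posts DB."""
        s3_key = f"posts/{post_id}"  # hypothetical key format
        self.blob_store[s3_key] = text
        self.posts_db[post_id] = {"user_id": user_id, "s3_key": s3_key}

    def get_post(self, post_id: str):
        """Read path: check the cache first, fall back to DB + blob storage."""
        cached = self.cache.get(post_id)
        if cached is not None:
            return cached
        row = self.posts_db.get(post_id)
        if row is None:
            return None
        text = self.blob_store[row["s3_key"]]
        self.cache.put(post_id, text)  # populate the cache for later reads
        return text
```

Populating the cache only on reads keeps rarely read posts out of it, which fits the goal of holding just the most-visited posts.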
Request flows
Client -> Load Balancer -> Server -> Cache -> DB -> Back to user
Detailed component design
Trade offs/Tech choices
Failure scenarios/bottlenecks
A spike in new posts could become a bottleneck on writes -> we shard the DB to spread the load
A very large number of posts in the DB could slow down finding the right post to send back to the user -> we shard and use replicas
Future improvements
- Adding images to the posts
- Any features that a large number of users are requesting