System requirements
# Functional requirements
-Create paste given text (max 10000 char (<=80KB - compression)) and expiration date (max 1 year)
-Get Paste by url/id
-Login
-Create User
-List your pastes
-Update/Delete a paste
# Non-functional Req
-99.999% availability
-99.999% reliability
-~100e6 total users
- <100ms p50 response time
- <1s p90 response time
-Support 1K read per minute (rpm)
-Support 1K write per minute (wpm)
Capacity estimation
Scale/Capacity Estimation
-Total users: 100e6 users
-Total Storage: 1e8 * ~1Kb = 1e11 B = 100 GB (low/medium)
-User read: 1% login daily (read) - 1e6 rpd ~ 12 rps (low)
-User write: 1% create or update daily (write) - 1e6 wpd ~ 12 wps (low)
Paste
-Read: 1% of users a daily read => 1e6 rpd ~ 12wps
-Write: 1% of users a daily write => 1e6 rpd ~ 12rps
-Paste Read/write data: ~12*100KB = 1.2 MBps ~ 10 Mbps(low/medium)
-Total Storage: 1e6 writes per day => 365e6 writes per year => 356e6 @100KB => 365e8 KB = 365e11 B = 365e2 GB ~ 36 TB (medium/large)
Access
User and Paste
- Simple CRUD queries
- Transactional
API design
# API
- /paste/create
Http: POST, Content: '{ user: 12, text: "pastetext", expires: 20260501 }'
- /paste/id. HTTP: GET
- /login - POST: '{user id:'[email protected]', passdhash: '12wefasdf34qfads' }'
- /logout - POST: ''
- /paste/id - HTTP DELETE
Database design
# Model
User
-id: long (32 B)
-passw hash: nvarchar (512 B)
-email: (64 chars - 512 B)
Paste
-id: long (32 B)
-user id: long (32 B, foreign key to User)
-text: text (80 KB)
-creation date: timestamp (32B)
-expiration date: timestamp (32B)
total: <100KB
# DB
User:
(recommended) - MySql or PostgreSQL or another RDMS (OLTP)
Pro:
- handles easily the required scale
- provides ACID (atomic, consistent and isolated transactions) required to user account handling
Cons: can hit scalability limits
(alternative) - PlanetScale (scalable RDMS)
Pros: handles scalability risks being a scalable RDMS based on MySQL, managed (handles DB updates etc).
Cons: can be more costly than an in-house MySQL server.
- it can be a risk if the user count increases 10x or 100x
(alternative) - DynamoDB or another Key/Value store (OLTP)
- Pros: scales well for current load and beyond, automatic scaling. Cons: may affect user experience due to read after write errors due to eventual consistency
(alternative) - MongoDB Document-based (OLTP)
Similar to Key/Value stores.
(do not use) - Any non-OLTP db such as OLAP databases - do not support well the transactional based access pattern
Paste table
(recommended) - DynamoDB or other managed OLTP Key/value stores
Pros: scale well and beyond for the requirement when setting an adecuate partitioning key, scales both in query load as in storage requirement.
Cons: can be expensive
(alternative) - MySql or PostgreSql with a partitioning schema
Pros: can handle the query load and storage requirements when setting an adecuate partitioning strategy, can be cheaper than managed KV stores
Cons: can be more complex to set up due to manual partitioning and replication logic
(alternative) - Search engine such as Solr, which is a Document-based db, with index on the user id
Pros: handle the scale, native support for complex text searches, decent support for normal transaction queries
(do not use) - OLAP databases - do not support well the transactional based access pattern
Logs:
timestamp
message
Event metrics:
timestamp
dimensions: ??
value:
DB
Log managment
(recommended) AXIOM
Pros: feature rich (charts, searches, alerts)
Cons: can more complex to set up with AWS services than alternatives
(alternative) AWS cloudwatch
Pros: better integration with aws services
Cons: can be very expensive
Metrics
(recommended) Influx
Pros: feature rich, massive scale
Cons: can more complex to set up with AWS services than alternatives
(alternative) AWS Cloudwatch
Pros: better integration with aws services, decent scale
Cons: can be very expensive, feautures may be limited
High-level design
# High-Level Architecture
User (u) -> Api Gateway (G)
G -> LoadBalancer (LB)
LB -> Servers (SVs)
SVs -CRUD-> MySQL (MYSQL) (User table)
SVs -CUD-> DynamoDB (DDB) (Paste table)
SVs -reads-> Redis (R) (in-memory cache)
We can add a layer for service observability
G, LB, SVs, MYSQL, DDB -> AXIOM (send logs to AXIOM)
G, LB, SVs, MYSQL, DDB -> Influx (send metrics to Influx)
Request flows
Explain how the request flows from end to end in your high level design. Also you could draw a sequence diagram using the diagramming tool to enhance your explanation...
Detailed component design
Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...
Trade offs/Tech choices
Tradeoff:
Using KV storage for the Paste table allows for future scalability but limits the complexity of search queries. The cost is higher. Duplicates costs due to a second index to speed queries by user id.
To ensure low latency, availability and no regression, added layer for Service observability and log management which adds complexity but is essential in meeting the requirements
Failure scenarios/bottlenecks
Try to discuss as many failure scenarios/bottlenecks as possible.
Future improvements
What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?