System requirements


Functional:


  1. Post text snippets (pastes) online with a time to live (TTL)
  2. View pastes that others have posted


Non-Functional:


  1. The system should be highly scalable and able to handle a large volume of requests and stored data
  2. The system should be highly available, with low latency




Capacity estimation


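  • Storage: 1 KB per paste x 1 million pastes a day x 365 days x 5 years = 1.825 billion KB, or about 1.8 TB of storage for 5 years
  • Throughput: 1 million pastes a day / 24 hours / 60 minutes / 60 seconds ≈ 11.6 pastes (writes) per second. Assuming 5 reads per write, 11.6 x 5 ≈ 58, or roughly 60 reads per second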
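
The estimate can be checked with a few lines of Python. This is only a sketch of the arithmetic: the input figures come from the bullets above, the 5:1 read-to-write ratio is the assumption implied by the "x 5" step, and decimal units (1 TB = 10^9 KB) are assumed.

```python
# Back-of-the-envelope capacity estimate using the figures from the bullets above.
PASTE_SIZE_KB = 1
PASTES_PER_DAY = 1_000_000
RETENTION_YEARS = 5
READ_WRITE_RATIO = 5  # assumption implied by the "x 5" in the estimate

SECONDS_PER_DAY = 24 * 60 * 60

# Storage: every paste is kept for the whole retention window (worst case).
total_kb = PASTE_SIZE_KB * PASTES_PER_DAY * 365 * RETENTION_YEARS
total_tb = total_kb / 1000**3  # decimal units: 1 TB = 10^9 KB

# Throughput: average rates, ignoring daily peaks.
writes_per_second = PASTES_PER_DAY / SECONDS_PER_DAY
reads_per_second = writes_per_second * READ_WRITE_RATIO

print(f"storage over {RETENTION_YEARS} years: {total_tb:.3f} TB")
print(f"writes per second: {writes_per_second:.1f}")
print(f"reads per second: {reads_per_second:.1f}")
```

These are averages, so peak traffic would need extra headroom on top.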




API design


  • Upload Paste

Request

POST api/v1/paste

Body:

{
  "content": "dear diary...",
  "expires_in": "24 hours"
}

Response

{
  "id": "xyz123"
}


  • Get Paste

Request

GET api/v1/paste/{id}


Response

{
  "content": "dear diary..."
}


  • Delete Paste

Request

DELETE api/v1/paste/{id}


Response

204 No Content status code
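
As an illustration of how the upload body might be validated before a paste is stored, here is a minimal sketch. The function names and the set of accepted units are hypothetical; the API above only shows "24 hours" as an example value, and treating expires_in as optional is an assumption.

```python
import re
from datetime import timedelta
from typing import Optional, Tuple

# Hypothetical unit table: the API example only shows "24 hours".
_UNIT_SECONDS = {"minute": 60, "hour": 3600, "day": 86400}
_EXPIRES_RE = re.compile(r"\s*(\d+)\s*(minute|hour|day)s?\s*")


def parse_expires_in(value: str) -> timedelta:
    """Turn a string such as "24 hours" into a timedelta, or raise ValueError."""
    if not isinstance(value, str):
        raise ValueError("expires_in must be a string")
    match = _EXPIRES_RE.fullmatch(value.lower())
    if match is None:
        raise ValueError(f"unsupported expires_in value: {value!r}")
    amount = int(match.group(1))
    if amount == 0:
        raise ValueError("expires_in must be positive")
    return timedelta(seconds=amount * _UNIT_SECONDS[match.group(2)])


def validate_upload_body(body: dict) -> Tuple[str, Optional[timedelta]]:
    """Check the POST body and return (content, ttl); ttl is None if omitted."""
    content = body.get("content")
    if not isinstance(content, str) or not content:
        raise ValueError("content must be a non-empty string")
    raw_ttl = body.get("expires_in")
    ttl = parse_expires_in(raw_ttl) if raw_ttl is not None else None
    return content, ttl
```

A ValueError here would map to a 400 response. Sending a number of seconds instead of a free-form string would avoid the parsing step entirely.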


Database design


  • Metadata is kept in DynamoDB, and the contents of each paste are kept in object storage (for example AWS S3). When a new paste is created, a unique_id is generated along with a content_key pointer to the stored content. Both are written to the DynamoDB metadata table, with unique_id as the key so that two pastes cannot collide.
  • To serve a paste, the retrieval service queries the metadata by unique_id, reads the content_key pointer, and fetches the content from object storage. Keeping only the small metadata rows in DynamoDB reduces overhead and keeps lookups fast.

Paste_Tbl Schema

unique_id (key)

user

content_key

expiration_date
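
A minimal sketch of the metadata record, assuming the unique_id and content_key fields described in the bullets above. The class name, the created_at field, and the to_item helper are illustrative assumptions. The expiry is stored as epoch seconds because DynamoDB's TTL feature expects a numeric epoch-seconds attribute.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class PasteMetadata:
    """One row of the metadata table; the paste text itself lives elsewhere."""
    unique_id: str
    content_key: str                      # pointer into object storage
    created_at: datetime
    expiration_date: Optional[datetime]   # None means the paste never expires

    def is_expired(self, now: datetime) -> bool:
        return self.expiration_date is not None and now >= self.expiration_date

    def to_item(self) -> dict:
        """Flatten to a plain dict with timestamps as epoch seconds."""
        item = {
            "unique_id": self.unique_id,
            "content_key": self.content_key,
            "created_at": int(self.created_at.timestamp()),
        }
        if self.expiration_date is not None:
            item["expiration_date"] = int(self.expiration_date.timestamp())
        return item


created = datetime(2024, 1, 1, tzinfo=timezone.utc)
meta = PasteMetadata("abc123", "pastes/abc123", created, created + timedelta(hours=24))
print(meta.to_item())
```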



High-level design


  • The client connects to a load balancer (AWS Network Load Balancer, chosen for the high volume of reads and writes), which distributes requests across servers in different regions and availability zones to avoid a single point of failure and reduce latency for clients around the globe. Load can be distributed round-robin, or with a weighted policy that favors areas with higher population and traffic (for example NYC).
  • A CDN (AWS CloudFront) reduces latency by caching paste content close to the users most likely to request it (same language, same region, etc.). A pull-based CDN is used since the content is mostly static.
  • An API gateway (AWS API Gateway) receives the RPC calls and, depending on the request, routes each one to either a write queue or a read queue (handled through Kafka). For write requests we use a unique identifier service to provide the paste id and make sure it is unique, which keeps the system consistent. We use queues because we are dealing with many write and read requests per second (about 11 pastes and 60 reads), and two separate queues help reduce latency by keeping reads from waiting behind writes. A push-based queue is also used to delete existing pastes once their expiration date passes, using the /delete RPC.
  • For read requests we use a Redis cache to reduce latency. It follows a simple LFU (least frequently used) eviction policy, since it's common for a few very popular pastes to be read repeatedly and their content doesn't change.
  • DynamoDB is used for storage because the data is simple key-value and lookups need low latency. Reads are about 5x writes, so we add read replicas kept up to date through asynchronous replication, which reduces latency at the cost of potentially stale data.
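
To make the LFU eviction idea concrete, here is a small in-process sketch. It is not how Redis implements it (Redis uses an approximated LFU, enabled through a maxmemory-policy such as allkeys-lfu). The class name is hypothetical, and eviction does an O(n) scan for clarity.

```python
from collections import OrderedDict
from typing import Dict, Optional


class LFUCache:
    """Evicts the least frequently used key; ties go to the least recently used."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._values: "OrderedDict[str, str]" = OrderedDict()  # oldest use first
        self._hits: Dict[str, int] = {}

    def get(self, key: str) -> Optional[str]:
        if key not in self._values:
            return None
        self._hits[key] += 1
        self._values.move_to_end(key)  # mark as most recently used
        return self._values[key]

    def put(self, key: str, value: str) -> None:
        if key in self._values:
            self._values[key] = value
            self._hits[key] += 1
            self._values.move_to_end(key)
            return
        if len(self._values) >= self.capacity:
            # min() returns the first minimum in iteration order, i.e. the
            # least recently used key among those with the lowest hit count.
            victim = min(self._values, key=lambda k: self._hits[k])
            del self._values[victim]
            del self._hits[victim]
        self._values[key] = value
        self._hits[key] = 1


cache = LFUCache(capacity=2)
cache.put("viral", "popular paste")
cache.put("rare", "seldom read paste")
cache.get("viral")
cache.get("viral")
cache.put("new", "fresh paste")  # evicts "rare", the least frequently used
```

The popular paste survives the eviction even though it was inserted first, which is the behavior the LFU bullet relies on.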




Request flows


  • POST - The request contains the text and an optional expiry time. The load balancer forwards the call to the paste upload service, which asks its local unique identifier service for a six-character token such as abc123, streams the text into object storage under that key, then records a metadata row (id, content_key, created_at, expires_at) in the distributed database. If either step fails, the operation is rolled back. Finally it primes the Redis cache with the content, using a TTL that matches the expiry, and returns https://…/paste/abc123 to the client. Total latency is in the tens of ms.
  • GET - The browser is directed to the read servers, which first query the cache. On a cache miss they read the metadata row; if the row is missing or expired, they return 404 Not Found. If the metadata is found, the server fetches the content from blob storage using content_key and streams it to the client, and the Redis cache is updated with the retrieved content. The CDN also reduces latency by serving viral pastes from locations closer to the user.
  • DELETE - Every row is written with a TTL. A periodic job evicts expired data from the cache and removes expired entries from S3.
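
The three flows can be sketched end to end with in-memory dictionaries standing in for object storage, the metadata table, and Redis. Everything here (class, method, and key names) is a hypothetical illustration of the steps above, not a real client for any of those services.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class PasteService:
    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self.objects: Dict[str, str] = {}                # stand-in for object storage
        self.metadata: Dict[str, dict] = {}              # stand-in for the metadata table
        self.cache: Dict[str, Tuple[str, float]] = {}    # stand-in for Redis
        self._clock = clock

    def upload(self, paste_id: str, content: str, ttl_seconds: float) -> str:
        content_key = f"pastes/{paste_id}"
        expires_at = self._clock() + ttl_seconds
        self.objects[content_key] = content              # step 1: store the text
        try:
            self._write_metadata(paste_id, content_key, expires_at)  # step 2
        except Exception:
            self.objects.pop(content_key, None)          # roll back step 1
            raise
        self.cache[paste_id] = (content, expires_at)     # prime cache, TTL = expiry
        return paste_id

    def _write_metadata(self, paste_id: str, content_key: str, expires_at: float) -> None:
        self.metadata[paste_id] = {
            "content_key": content_key,
            "created_at": self._clock(),
            "expires_at": expires_at,
        }

    def get(self, paste_id: str) -> Optional[str]:
        now = self._clock()
        cached = self.cache.get(paste_id)
        if cached is not None and cached[1] > now:
            return cached[0]                             # cache hit
        row = self.metadata.get(paste_id)
        if row is None or row["expires_at"] <= now:
            return None                                  # caller maps this to 404
        content = self.objects[row["content_key"]]
        self.cache[paste_id] = (content, row["expires_at"])
        return content

    def sweep_expired(self) -> int:
        """Periodic job: drop expired rows, objects, and cache entries."""
        now = self._clock()
        expired = [pid for pid, row in self.metadata.items() if row["expires_at"] <= now]
        for pid in expired:
            row = self.metadata.pop(pid)
            self.objects.pop(row["content_key"], None)
            self.cache.pop(pid, None)
        return len(expired)
```

Passing in the clock makes expiry testable without sleeping. In a real deployment the two writes in upload would not share a transaction, so the rollback would itself need retries or a cleanup job.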




Detailed component design


  • Servers need to scale to meet changes in traffic over the day or week, e.g. on weekends. If running on EC2 instances, this can be done with AWS Auto Scaling groups.
  • Cache usage can be tuned through TTLs depending on the size of the cache, e.g. shorter TTLs when memory is tight.
  • Database replication should scale well; asynchronous replication avoids any impact on write performance. If necessary, the databases could be sharded by user or by geographical location.





Trade offs/Tech choices


  • Asynchronous replication of the database gives us lower latency than synchronous replication, but at the cost of consistency: replicas may serve stale data.
  • A NoSQL key-value store is a good fit for low latency, but it could be more constraining than document-based storage if we later need to add features or columns.
  • SQL storage would be better for enforcing consistency through ACID properties, but we trade that for the better scalability and lower latency of NoSQL key-value storage.



Failure scenarios/bottlenecks


  • Sharp spikes in traffic are a possible bottleneck. If a paste goes viral, the auto scaling groups must scale out to meet demand, and more database read replicas may be needed.
  • The unique identifier service could be a single point of failure: if it goes down, users can't create new pastes.
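
One way to soften this dependency, offered as an assumption rather than part of the design above, is to let each upload server generate random six-character tokens itself and retry on collision. In the sketch below the alphabet, the function names, and the is_taken callback are all hypothetical. In practice the uniqueness check would have to be an atomic conditional write to the metadata table, not a separate lookup.

```python
import secrets
import string
from typing import Callable

ALPHABET = string.ascii_lowercase + string.digits  # 36 symbols, in the style of "abc123"
TOKEN_LENGTH = 6


def new_token() -> str:
    """Return a random six-character token drawn from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(TOKEN_LENGTH))


def allocate_id(is_taken: Callable[[str], bool], max_attempts: int = 5) -> str:
    """Generate tokens until one is free, giving up after max_attempts."""
    for _ in range(max_attempts):
        token = new_token()
        if not is_taken(token):
            return token
    raise RuntimeError("could not find a free id")


existing = {"abc123"}
paste_id = allocate_id(lambda token: token in existing)
existing.add(paste_id)
```

Note that 36^6 is about 2.2 billion ids, which is close to the roughly 1.8 billion pastes in the five-year estimate. If pastes are rarely expired, a longer token or a larger alphabet (62^6 is about 56.8 billion) would keep collisions rare.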



Future improvements


  • Users could create their own accounts and add comments to their own pastes or to others'.
  • With accounts, users could also save their pastes to their account.
  • Users could post images, which would require blob storage integration (AWS S3).