Design Pastebin - System Design

System requirements

Functional:

Users

Web Interface
API Access

Users can paste text snippets for a short period of time - Casual Users

Character limit is 20 k

Size of the paste per user - 200 MB

Automatically deleted after 30 days

Non-Functional:

Total users - 100,000

Daily active users = 10,000

Availability - 99.9 %

Latency - 500 ms

Key Business KPIs:

Number of shared snippets created
Number of Daily Active Users

Capacity estimation

Let's first calculate the request rate (QPS):

QPS = 100,000 users * 5 pastes per day / ~100,000 seconds per day (86,400 rounded up) = 5 pastes/second

Peak QPS = 2 * QPS = 10 pastes/second

Read to write ratio is 5:1

So read QPS = 5 * 5 = 25 reqs/second (50 reqs/second at peak)

Storage

10,000 * 5 * 200 * 10^6 bytes = 10 TB / day

For 30 days

10 TB * 30 = 300 TB

Cache Size = 10 TB * 0.2 = 2 TB (caching 20% of one day's data)
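
The estimates above can be re-derived with a short script. This is only a sketch under assumptions the notes do not state outright: each user creates 5 pastes per day, a day is taken as exactly 86,400 seconds, and every paste is counted at the 200 MB figure used in the storage line. The function and parameter names are illustrative.

```python
SECONDS_PER_DAY = 86_400
TB = 10**12  # decimal terabyte, matching the 10^6 bytes-per-MB convention above


def estimate(total_users=100_000, daily_active_users=10_000,
             pastes_per_user_per_day=5, bytes_per_paste=200 * 10**6,
             read_write_ratio=5, retention_days=30, cache_fraction=0.2):
    """Return the capacity figures as a dict; every default is an assumption."""
    write_qps = total_users * pastes_per_user_per_day / SECONDS_PER_DAY
    daily_storage_tb = (daily_active_users * pastes_per_user_per_day
                        * bytes_per_paste / TB)
    return {
        "write_qps": write_qps,
        "peak_write_qps": 2 * write_qps,
        "read_qps": read_write_ratio * write_qps,
        "daily_storage_tb": daily_storage_tb,
        "retained_storage_tb": daily_storage_tb * retention_days,
        "cache_tb": daily_storage_tb * cache_fraction,
    }


if __name__ == "__main__":
    for name, value in estimate().items():
        print(f"{name}: {value:.1f}")
```

With 86,400 seconds per day the write rate comes out near 5.8 per second; rounding the day up to 100,000 seconds gives the 5 per second used in the notes.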

API design

Create paste

/api/v1/create_paste - POST method - 201 Created

payload{

user_id:

paste_name:

expiry:

content:

}

Get Paste

/api/v1/paste_id - GET method - 200 OK if found, 404 if not found

{

paste_id

}

Delete Paste

/api/v1/paste_id - DELETE method

{

paste_id

}
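
As a rough illustration of how the create payload could be checked against the functional requirements (20k character limit, deletion after 30 days), here is a minimal validation sketch. The 400 and 413 status codes, the ISO 8601 format for expiry, and the choice to clamp long expiries to 30 days are assumptions, not part of the API above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

MAX_CHARS = 20_000                 # character limit from the requirements
MAX_LIFETIME = timedelta(days=30)  # pastes are deleted after 30 days


@dataclass
class CreatePasteRequest:
    user_id: str
    paste_name: str
    expiry: datetime
    content: str


def validate_create(payload: dict, now: datetime) -> tuple:
    """Return (status, CreatePasteRequest) on success or (status, error message)."""
    required = ("user_id", "paste_name", "expiry", "content")
    missing = [field for field in required if field not in payload]
    if missing:
        return 400, "missing fields: " + ", ".join(missing)
    if len(payload["content"]) > MAX_CHARS:
        return 413, "content exceeds the 20k character limit"
    try:
        expiry = datetime.fromisoformat(payload["expiry"])
    except (TypeError, ValueError):
        return 400, "expiry must be an ISO 8601 timestamp"
    if expiry.tzinfo is None:
        expiry = expiry.replace(tzinfo=timezone.utc)  # assume UTC when no offset is given
    expiry = min(expiry, now + MAX_LIFETIME)          # clamp to the 30 day maximum
    if expiry <= now:
        return 400, "expiry must be in the future"
    return 201, CreatePasteRequest(payload["user_id"], payload["paste_name"],
                                   expiry, payload["content"])
```

The caller passes a timezone-aware `now`, which keeps the function deterministic and easy to test.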

Database design

We will need two tables

One for storing user data and one for storing metadata for the pastes themselves

user Table

user_id, name, email, createdAt

paste Table

paste_id, user_id, paste_name, expiryDate

The data is well suited to a NoSQL data store, mainly because we do not need strong ACID properties and the data is not very relational. It is also easy to scale as we expand to more countries.

For storing the paste itself we will use a blob store like S3, which is well suited to this purpose.
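
The two records and the link from metadata to the blob can be sketched as plain data classes. The field names follow the tables above; the `blob_key` layout and the `is_expired` helper are hypothetical additions for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class User:
    user_id: str
    name: str
    email: str
    created_at: datetime


@dataclass(frozen=True)
class PasteMetadata:
    paste_id: str
    user_id: str
    paste_name: str
    expiry_date: datetime

    def blob_key(self) -> str:
        # Hypothetical object key under which the paste body is stored in the blob store.
        return f"pastes/{self.user_id}/{self.paste_id}"

    def is_expired(self, now: datetime) -> bool:
        # A paste counts as expired once the current time reaches its expiry date.
        return now >= self.expiry_date
```

Keeping only metadata in the database and deriving the blob key from it means the paste body never has to pass through the NoSQL store.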

High-level design


Request flows

Usecase 1: Creating a paste

The user visits the website, pastes the content, and gives it a paste name. The request goes to the LB, which picks an app server hosting the API, and the create paste API is called. The app server stores the metadata in the NoSQL DB and stores the paste in the S3 storage system.

Usecase 2: Reading a paste

When the user wants to access a paste, they enter its short URL. We then call the GET API, which fetches the metadata from the DB and the file from S3 and returns the paste together with its metadata. If the paste is not found, the API returns a 404 and the UI shows an error.
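
The two flows can be sketched with in-memory dictionaries standing in for the NoSQL DB and S3. The class and method names are illustrative, the ID generator is a placeholder for the key generation service, and treating an expired paste as a 404 is an assumption.

```python
import itertools
from datetime import datetime, timedelta, timezone


class PasteService:
    """In-memory sketch of the create and read flows on the app server."""

    def __init__(self, next_paste_id):
        self.metadata_db = {}  # stand-in for the NoSQL metadata table
        self.blob_store = {}   # stand-in for the S3 bucket
        self._next_paste_id = next_paste_id  # stand-in for the key generation service

    def create_paste(self, user_id, paste_name, content, expiry):
        paste_id = self._next_paste_id()
        # Same order as the flow above: metadata first, then the paste body.
        self.metadata_db[paste_id] = {
            "paste_id": paste_id,
            "user_id": user_id,
            "paste_name": paste_name,
            "expiryDate": expiry,
        }
        self.blob_store[paste_id] = content
        return 201, {"paste_id": paste_id}

    def get_paste(self, paste_id, now):
        meta = self.metadata_db.get(paste_id)
        content = self.blob_store.get(paste_id)
        # Missing metadata, a missing blob, or an expired paste all map to 404.
        if meta is None or content is None or now >= meta["expiryDate"]:
            return 404, None
        return 200, {**meta, "content": content}


if __name__ == "__main__":
    ids = itertools.count(1)
    service = PasteService(lambda: f"p{next(ids)}")
    now = datetime.now(timezone.utc)
    _, created = service.create_paste("u1", "notes", "hello", now + timedelta(days=30))
    print(service.get_paste(created["paste_id"], now))
```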

Detailed component design

Next, how we generate the alias for a paste. The alias comes from a key generator service. Keys are pregenerated and stored in the key gen DB. Every time a create request reaches the app server, it calls the key gen service, which fetches an unused key and marks it as used in the key gen DB.
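
A minimal sketch of the pregenerate-then-mark-used idea follows. Two in-memory sets stand in for the key gen DB, and the key length, alphabet, and pool size are assumptions chosen for illustration.

```python
import secrets
import string
import threading

ALPHABET = string.ascii_letters + string.digits  # assumed base-62 alphabet


class KeyGenService:
    """Hands out pregenerated keys and marks each one as used exactly once."""

    def __init__(self, pool_size=1000, key_length=6):
        self._lock = threading.Lock()
        self._unused = set()  # stand-in for the unused-keys table
        self._used = set()    # stand-in for the used-keys table
        # Pregenerate the pool; the set silently drops any random duplicates.
        while len(self._unused) < pool_size:
            key = "".join(secrets.choice(ALPHABET) for _ in range(key_length))
            self._unused.add(key)

    def next_key(self):
        # The lock ensures two concurrent requests never receive the same key.
        with self._lock:
            if not self._unused:
                raise RuntimeError("key pool exhausted")
            key = self._unused.pop()
            self._used.add(key)
            return key

    def remaining(self):
        with self._lock:
            return len(self._unused)
```

In a real deployment the move from unused to used would have to be a single atomic operation in the key gen DB; the lock plays that role here.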

Trade offs/Tech choices

I have used S3 as the storage layer because it offers features that help the system scale and support some of the KPIs mentioned above:

As the system becomes popular we will need more storage, and S3 offers virtually unlimited storage. We can set up lifecycle rules for automatic cleanup to reclaim space, we get high data durability, and versioning can help with collaborative editing.
A NoSQL DB makes it easy to evolve the schema without running expensive migrations that may cause downtime. It is also easy to scale horizontally. The trade-off is that we will not have strong ACID guarantees.
Because we are targeting low latency, we accept eventual consistency between the NoSQL database and S3 rather than strongly consistent writes across both.
We will use Redis as a cache to speed up reads; it also supports sorting and built-in data structures.
Implement background processes to ensure data consistency between the NoSQL DB and S3, and make writes that span S3 and the NoSQL DB behave atomically so that they can be rolled back if necessary.
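
One way to approximate the rollback and background-consistency ideas is a compensating delete plus a periodic sweep. This is a sketch only: dictionaries stand in for the NoSQL DB and S3, the function names are hypothetical, and a compensating delete is not a true atomic transaction.

```python
def create_with_rollback(metadata_db, put_blob, paste_id, meta, content):
    """Write metadata, then the blob; undo the metadata write if the blob write fails."""
    metadata_db[paste_id] = meta
    try:
        put_blob(paste_id, content)
    except Exception:
        metadata_db.pop(paste_id, None)  # compensating delete, not a real transaction
        raise


def reconcile(metadata_db, blob_store):
    """Background sweep: drop records that exist in only one of the two stores."""
    missing_blob = set(metadata_db) - set(blob_store)  # metadata without a paste body
    orphan_blob = set(blob_store) - set(metadata_db)   # paste body without metadata
    for paste_id in missing_blob:
        del metadata_db[paste_id]
    for paste_id in orphan_blob:
        del blob_store[paste_id]
    return {"missing_blob": missing_blob, "orphan_blob": orphan_blob}
```

A production sweep would also need a grace period so that it does not delete records belonging to writes that are still in flight.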

Failure scenarios/bottlenecks

The NoSQL (Cassandra) database is a single point of failure.
The key generator DB is also a single point of failure.
The Key Gen service and the application server are also single points of failure.
The API will need to be rate-limited to prevent abuse by developers.
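
Rate limiting is commonly done with a token bucket per client. The sketch below is one possible approach, not a prescribed design; the rate, the capacity, and the injectable clock are assumptions for illustration.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity` requests and refills at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock             # injectable so the limiter can be tested
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def allow(self, cost=1.0):
        now = self._clock()
        # Refill in proportion to the elapsed time, capped at the bucket capacity.
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

The app server would keep one bucket per user_id or API key and could answer with HTTP 429 whenever `allow()` returns False.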

Future improvements

We will need a secondary replica of the NoSQL DB and the key generation DB so that if a primary fails, the secondary can be promoted to primary.
We will implement autoscaling to handle increased load.
We will need monitoring and alerting to be in place to check the health of the service.
We will need blob storage in multiple Availability Zones to increase data durability.
We will use Prometheus for monitoring.
We will also put a cache in front of the key gen DB to make key fetching more efficient.
We will also use asynchronous replication between the primary and secondary DBs to keep the secondary up to date with the primary.
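
To make the trade-off of asynchronous replication concrete, here is a toy model of a primary, a secondary, and a replication backlog. It is a sketch with hypothetical names and does not describe any particular database.

```python
from collections import deque


class ReplicatedStore:
    """Toy primary/secondary pair with an asynchronous replication log."""

    def __init__(self):
        self.primary = {}
        self.secondary = {}
        self._log = deque()  # writes applied on the primary but not yet on the secondary

    def write(self, key, value):
        # The write is acknowledged as soon as the primary has it.
        self.primary[key] = value
        self._log.append((key, value))

    def replicate(self, max_entries=None):
        """Apply pending writes to the secondary; returns how many were applied."""
        count = len(self._log) if max_entries is None else min(max_entries, len(self._log))
        for _ in range(count):
            key, value = self._log.popleft()
            self.secondary[key] = value
        return count

    def promote_secondary(self):
        """Fail over to the secondary; returns keys whose latest write was lost."""
        lost = [key for key, _ in self._log]
        self._log.clear()
        self.primary = self.secondary
        self.secondary = {}
        return lost
```

Any write still sitting in the log at failover time is lost, which is the price of not waiting for the secondary on every write.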