Design Pastebin - System Design

System requirements

Functional:

Users can create an account; an account is required to create pastes
Users can create text-based pastes
Users can generate a unique URL for each paste
Users can set an expiry date for a paste (default 30 days, max 365 days)
Anyone with a paste's URL can view the paste
A paste can be at most 1 MB in size
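
The expiry and size rules above can be sketched as two small validation helpers. This is only an illustration: the function names are hypothetical, and it assumes the 1 MB limit applies to the UTF-8 encoded bytes and that 1 MB means 1,000,000 bytes.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_PASTE_BYTES = 1_000_000          # assumption: 1 MB = 1,000,000 bytes
DEFAULT_EXPIRY = timedelta(days=30)  # default from the requirements
MAX_EXPIRY = timedelta(days=365)     # maximum from the requirements


def resolve_expiry(now: datetime, requested: Optional[timedelta] = None) -> datetime:
    """Return the absolute expiry time, applying the default and the maximum."""
    ttl = DEFAULT_EXPIRY if requested is None else requested
    if ttl <= timedelta(0):
        raise ValueError("expiry must be in the future")
    if ttl > MAX_EXPIRY:
        raise ValueError("expiry may not exceed 365 days")
    return now + ttl


def validate_content(text: str) -> bytes:
    """Encode the paste as UTF-8 and enforce the size limit on the encoded bytes."""
    data = text.encode("utf-8")
    if len(data) > MAX_PASTE_BYTES:
        raise ValueError("paste exceeds 1 MB")
    return data


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(resolve_expiry(now))               # 30 days from now
    print(len(validate_content("hello")))    # size in bytes
```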

Non-Functional:

Low latency: pastes should be accessed with minimal latency
Durability: users expect a paste to stay reachable, consistent, and unmodified until it expires
Availability: users want to be able to access a paste at any time
Scalability: system should handle a growing number of users and pastes

Capacity estimation

Monthly users: 1M
Monthly created pastes/user: 5
Monthly created pastes: 5M
Annual created pastes: 60M
Storage / yr: 60M x 1 MB = 60 TB (worst case, every paste at the maximum size)
Because the object storage estimate already assumes the worst case, metadata is not included here; it should be very small by comparison
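
The arithmetic above can be checked with a few lines of Python. The metadata size per paste (500 bytes) and the 30-day month are assumptions added for illustration only; they are not part of the original estimate.

```python
MONTHLY_USERS = 1_000_000
PASTES_PER_USER_PER_MONTH = 5
MAX_PASTE_BYTES = 1_000_000        # 1 MB worst case per paste
METADATA_BYTES_PER_PASTE = 500     # assumption: rough size of one paste row

pastes_per_month = MONTHLY_USERS * PASTES_PER_USER_PER_MONTH
pastes_per_year = pastes_per_month * 12

# Worst-case content storage per year, in terabytes (decimal units).
content_tb_per_year = pastes_per_year * MAX_PASTE_BYTES / 1e12

# Metadata storage per year, in gigabytes, under the assumed row size.
metadata_gb_per_year = pastes_per_year * METADATA_BYTES_PER_PASTE / 1e9

# Average write rate, assuming a 30-day month and evenly spread traffic.
avg_writes_per_second = pastes_per_month / (30 * 24 * 3600)

if __name__ == "__main__":
    print(f"pastes/year:      {pastes_per_year:,}")
    print(f"content TB/year:  {content_tb_per_year:.0f}")
    print(f"metadata GB/year: {metadata_gb_per_year:.0f}")
    print(f"avg writes/s:     {avg_writes_per_second:.2f}")
```

Under these assumptions the metadata comes to roughly 30 GB per year against 60 TB of content, which supports leaving it out of the estimate.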

API design

POST /api/auth/register Register a new user

POST /api/auth/login Authenticate and get a token

POST /api/pastes Create a new paste

GET /api/pastes/{paste_id} Retrieve a paste by its unique URL

DELETE /api/pastes/{paste_id} Delete a paste (only the owner)

PATCH /api/pastes/{paste_id} Update a paste (only the owner)
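
The design does not say how `paste_id` is generated. One possible approach, shown here purely as an assumption, is a random base-62 string with a collision check against the metadata store. The function name, the 8-character length, and the `exists` callback are all hypothetical.

```python
import secrets
import string
from typing import Callable

ALPHABET = string.ascii_letters + string.digits  # 62 URL-safe characters
ID_LENGTH = 8  # assumption: 62**8 is about 2.2e14 possible ids


def new_paste_id(exists: Callable[[str], bool], max_attempts: int = 5) -> str:
    """Draw random ids until one is unused; `exists` checks the metadata store."""
    for _ in range(max_attempts):
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(ID_LENGTH))
        if not exists(candidate):
            return candidate
    raise RuntimeError("could not allocate a unique paste id")


if __name__ == "__main__":
    taken = set()
    paste_id = new_paste_id(lambda candidate: candidate in taken)
    taken.add(paste_id)
    print(f"/api/pastes/{paste_id}")
```

In a real deployment the uniqueness check would more likely rely on a unique constraint on the id column, so that concurrent writers cannot both claim the same id.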

Database design

Core entities:

user
paste

Relationship between entities is one to many (user to paste)

User Schema:

id
username
email
password (hashed)
created_at
updated_at

Paste Schema

id
user_id (set index)
title
storage_key
created_at
updated_at
expires_at (set index)

The content of each paste is stored in S3. A separate database holds the user and paste tables; a paste row only references the object through its storage key and does not contain the saved text itself.
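
The two schemas could be expressed as SQL along the following lines. This is a sketch using SQLite through Python's standard `sqlite3` module so that it runs anywhere; the column types, the `password_hash` column name, the constraints, and the index names are assumptions, and a production system would likely use a different database.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    id            INTEGER PRIMARY KEY,
    username      TEXT NOT NULL UNIQUE,
    email         TEXT NOT NULL UNIQUE,
    password_hash TEXT NOT NULL,          -- never store the plain password
    created_at    TEXT NOT NULL,
    updated_at    TEXT NOT NULL
);

CREATE TABLE pastes (
    id          TEXT PRIMARY KEY,         -- the id used in the paste URL
    user_id     INTEGER NOT NULL REFERENCES users(id),
    title       TEXT,
    storage_key TEXT NOT NULL UNIQUE,     -- key of the content object in S3
    created_at  TEXT NOT NULL,
    updated_at  TEXT NOT NULL,
    expires_at  TEXT NOT NULL
);

-- Indexes called for in the schema: list pastes per user, find expired pastes.
CREATE INDEX idx_pastes_user_id ON pastes(user_id);
CREATE INDEX idx_pastes_expires_at ON pastes(expires_at);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Create both tables and the two indexes on the given connection."""
    conn.executescript(SCHEMA)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    create_schema(connection)
    for (name,) in connection.execute("SELECT name FROM sqlite_master ORDER BY name"):
        print(name)
```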

High-level design

flowchart TD

n0["Frontend Application"];

n1["Backend Server"];

n2[("S3")];

n3[("MetaData")];

n0 --> n1;

n1 --> n2;

n1 --> n3;

Request flows

The high-level design shows the main components and their interactions. A frontend application handles user requests and forwards them to the API server (the backend server in the diagram), which is responsible for the CRUD operations listed in the API design. Paste content is stored in S3, and the metadata store holds everything else, including users and the metadata about pastes. Each object in S3 is referenced from the metadata by its storage key, which links a paste to its actual content.
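
The create and read flows can be sketched with in-memory dictionaries standing in for S3 and the metadata database. Everything here is illustrative: the function names, the `pastes/{paste_id}` key layout, and the choice to write the content before the metadata row are assumptions, not part of the original design.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

object_store: dict = {}  # stand-in for S3: storage_key -> content bytes
metadata: dict = {}      # stand-in for the paste table: paste_id -> row


def create_paste(paste_id: str, user_id: int, content: bytes,
                 now: datetime, ttl: timedelta = timedelta(days=30)) -> dict:
    """Store the content first, then the metadata row that references it."""
    storage_key = f"pastes/{paste_id}"     # assumed key layout
    object_store[storage_key] = content
    row = {
        "id": paste_id,
        "user_id": user_id,
        "storage_key": storage_key,
        "created_at": now,
        "expires_at": now + ttl,
    }
    metadata[paste_id] = row
    return row


def read_paste(paste_id: str, now: datetime) -> Optional[bytes]:
    """Resolve the metadata row, check expiry, then fetch the content."""
    row = metadata.get(paste_id)
    if row is None or row["expires_at"] <= now:
        return None
    return object_store[row["storage_key"]]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    create_paste("abc123", user_id=1, content=b"hello", now=now)
    print(read_paste("abc123", now))
```

Writing the object before the metadata row means a failure between the two steps leaves at worst an unreferenced object, rather than a metadata row that points at missing content.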

Detailed component design

flowchart TD

n0["Client"];

n1["CDN"];

n2[("Object Storage")];

n3["Load Balancer"];

n4["Write Service"];

n6["Read Service"];

n10[("Master Metadata")];

n11[("Slave Metadata")];

n13[("Slave Metadata")];

n16[("Metadata Cache")];

n17["[Async] Cleanup Service"];

n18["Message Broker [Replicator]"];

n0 --> n1;

n1 --> n2;

n0 --> n3;

n3 --> n4;

n3 --> n6;

n4 --> n10;

n4 --> n2;

n6 --> n11;

n6 --> n13;

n6 --> n16;

n17 --> n10;

n17 --> n2;

n4 --> n18;

n18 --> n13;

n18 --> n11;

Trade offs/Tech choices

First, we put a CDN in front of the S3 object storage. Serving objects through the CDN lets us leverage its cache and its geo-optimized delivery.

The client hits a load balancer / API gateway because the backend is split into a write service and a read service. The main reason for the split is the leader-based replication we use for the metadata store. The system is very read heavy, so a single database instance would hit scalability and availability limits; replicas let reads scale out and stay available.

We choose one master and multiple slaves (two in this case, but this can easily be extended). All write requests go exclusively to the master.

Now the replicas need to be kept in sync. Here we favor asynchronous replication and accept eventual consistency as the trade-off. Consistency is important in our case, but it does not need to be immediate or strong; availability and durability matter more. So we introduce a message broker that notifies all replicas about changes made on the master.
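
A toy model of this replication scheme makes the eventual-consistency window visible. The `Master`, `Replica`, and `Broker` classes are hypothetical in-memory stand-ins; a real broker would deliver changes continuously and durably rather than on an explicit `deliver_all()` call.

```python
from collections import deque
from typing import Optional


class Replica:
    """Read-only copy that applies changes in the order it receives them."""

    def __init__(self) -> None:
        self.data: dict = {}

    def apply(self, change: tuple) -> None:
        key, value = change
        if value is None:
            self.data.pop(key, None)   # a None value represents a delete
        else:
            self.data[key] = value


class Broker:
    """Keeps one pending queue per replica and delivers changes later."""

    def __init__(self, replicas: list) -> None:
        self.pending = [(replica, deque()) for replica in replicas]

    def publish(self, change: tuple) -> None:
        for _, queue in self.pending:
            queue.append(change)

    def deliver_all(self) -> None:
        for replica, queue in self.pending:
            while queue:
                replica.apply(queue.popleft())


class Master:
    """Accepts all writes and publishes each change without waiting."""

    def __init__(self, broker: Broker) -> None:
        self.data: dict = {}
        self.broker = broker

    def write(self, key: str, value: Optional[dict]) -> None:
        if value is None:
            self.data.pop(key, None)
        else:
            self.data[key] = value
        self.broker.publish((key, value))   # asynchronous: no acknowledgement


if __name__ == "__main__":
    replicas = [Replica(), Replica()]
    broker = Broker(replicas)
    master = Master(broker)
    master.write("abc123", {"title": "demo"})
    print("before delivery:", replicas[0].data)   # replica is still stale
    broker.deliver_all()
    print("after delivery: ", replicas[0].data)
```

The window between `write` and `deliver_all` is where a read from a replica can miss a paste that was just created.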

Because the system is read heavy, we introduce a cache for the metadata, using Redis. We aim for a high cache hit ratio.
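
The read path through the cache could follow the cache-aside pattern, sketched below. To keep the example self-contained, a small in-process `TTLCache` class stands in for Redis; the class, the function name, and the 60-second TTL are assumptions for illustration.

```python
import time
from typing import Callable, Optional


class TTLCache:
    """Tiny in-process stand-in for Redis with per-entry expiry."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict = {}   # key -> (expires_at, value)

    def get(self, key: str) -> Optional[dict]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]   # drop the stale entry on access
            return None
        return value

    def set(self, key: str, value: dict) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)


def get_paste_metadata(paste_id: str, cache: TTLCache,
                       load_from_replica: Callable[[str], Optional[dict]]) -> Optional[dict]:
    """Cache-aside read: try the cache, fall back to a replica, then populate."""
    cached = cache.get(paste_id)
    if cached is not None:
        return cached
    row = load_from_replica(paste_id)
    if row is not None:
        cache.set(paste_id, row)
    return row


if __name__ == "__main__":
    database = {"abc123": {"id": "abc123", "storage_key": "pastes/abc123"}}
    cache = TTLCache(ttl_seconds=60)
    print(get_paste_metadata("abc123", cache, database.get))  # miss, then fill
    print(get_paste_metadata("abc123", cache, database.get))  # served from cache
```

Capping the cache TTL at or below the remaining lifetime of a paste would keep expired pastes from being served out of the cache.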

Lastly, we need to handle the expiry date. A separate service performs the cleanup asynchronously. It can be triggered by a cron job, or it can consume events from the same message broker so that the write path notifies it about the expiry dates of pastes. It then deletes the expired pastes and cleans up both the object storage and the metadata.
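
A single run of the cleanup job might look like the sketch below, again with dictionaries standing in for the object storage and the metadata table. The function name, the batch size, and the order of deletion are assumptions.

```python
from datetime import datetime, timedelta, timezone


def cleanup_expired(metadata: dict, object_store: dict,
                    now: datetime, batch_size: int = 1000) -> int:
    """Delete up to `batch_size` expired pastes and return how many were removed."""
    expired_ids = [
        paste_id for paste_id, row in metadata.items()
        if row["expires_at"] <= now
    ][:batch_size]

    for paste_id in expired_ids:
        storage_key = metadata[paste_id]["storage_key"]
        # Delete the content first. If the job crashes here, the metadata row
        # still exists, so the next run finds the paste and retries safely.
        object_store.pop(storage_key, None)
        del metadata[paste_id]
    return len(expired_ids)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    store = {"pastes/old": b"stale", "pastes/new": b"fresh"}
    rows = {
        "old": {"storage_key": "pastes/old", "expires_at": now - timedelta(days=1)},
        "new": {"storage_key": "pastes/new", "expires_at": now + timedelta(days=1)},
    }
    print(cleanup_expired(rows, store, now), "paste(s) removed")
```

Against a real database, the scan for expired rows would be a range query on the indexed `expires_at` column instead of a full pass over all rows.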

Failure scenarios/bottlenecks

Because we use leader-based replication with a single leader, we need to consider leader failover so that writes are not blocked when the leader fails. The single leader is effectively a single point of failure, so we need to make it resilient.

The number of slave replicas may need to increase in the future.

Future improvements

Even though we leverage the CDN cache for the objects in S3, at larger scale we can consider additional caching mechanisms for the stored files, so that we do not have to read from the origin every time.