System requirements


Functional:

  • A user can create an account; an account is required to create pastes
  • A user can create text-based pastes
  • A user can generate a unique URL for a paste
  • A user can set an expiry date for the paste (default 30 days, max 365 days)
  • Anyone with a paste's URL can view the paste
  • A paste can be at most 1 MB in size
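
The limits above can be sketched as a few small validation helpers. This is a minimal sketch, not a definitive implementation: the names `new_paste_id`, `resolve_expiry` and `validate_content` are hypothetical, 1 MB is assumed to mean 1,000,000 bytes, and a random URL-safe token is only one possible way to make the URL unique.

```python
import secrets
from datetime import datetime, timedelta, timezone

MAX_PASTE_BYTES = 1_000_000   # assumption: 1 MB = 1,000,000 bytes
DEFAULT_TTL_DAYS = 30         # default expiry from the requirements
MAX_TTL_DAYS = 365            # maximum expiry from the requirements


def new_paste_id() -> str:
    """Return a random URL-safe token (8 random bytes, 11 characters)."""
    return secrets.token_urlsafe(8)


def resolve_expiry(now: datetime, ttl_days=None) -> datetime:
    """Apply the default TTL and reject values outside 1..365 days."""
    ttl = DEFAULT_TTL_DAYS if ttl_days is None else ttl_days
    if not 1 <= ttl <= MAX_TTL_DAYS:
        raise ValueError(f"ttl_days must be between 1 and {MAX_TTL_DAYS}")
    return now + timedelta(days=ttl)


def validate_content(text: str) -> bytes:
    """Encode the paste as UTF-8 and enforce the size limit on the bytes."""
    data = text.encode("utf-8")
    if len(data) > MAX_PASTE_BYTES:
        raise ValueError("paste exceeds the 1 MB limit")
    return data
```

The size check is done on the encoded bytes rather than on the character count, because that is what ends up in object storage.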



Non-Functional:

  • Low latency: pastes should be accessed with minimal latency
  • Durability: users expect a paste to remain reachable, consistent and unmodified
  • Availability: users should be able to access a paste at any time
  • Scalability: system should handle a growing number of users and pastes



Capacity estimation

  • Monthly users: 1M
  • Monthly created pastes/user: 5
  • Monthly created pastes: 5M
  • Annual created pastes: 60M
  • Storage / yr: 60M x 1 MB = 60 TB (worst case)
  • Metadata is left out of the estimate: the object storage figure is already a worst case, and the metadata should be very small by comparison
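
The arithmetic behind these numbers can be reproduced in a few lines. The user and paste counts come from the list above; the 30-day month and the 500 bytes of metadata per paste are assumptions added only to give a rough write rate and to sanity-check that metadata is negligible.

```python
MONTHLY_USERS = 1_000_000
PASTES_PER_USER_PER_MONTH = 5
MAX_PASTE_BYTES = 1_000_000          # 1 MB, worst case per paste
METADATA_BYTES_PER_PASTE = 500       # assumption, not from the requirements
SECONDS_PER_MONTH = 30 * 24 * 3600   # assumption: 30-day month

monthly_pastes = MONTHLY_USERS * PASTES_PER_USER_PER_MONTH   # 5,000,000
annual_pastes = monthly_pastes * 12                          # 60,000,000

# Worst-case object storage per year, in terabytes (10^12 bytes).
annual_storage_tb = annual_pastes * MAX_PASTE_BYTES / 1e12

# Metadata per year under the assumed row size, in gigabytes (10^9 bytes).
annual_metadata_gb = annual_pastes * METADATA_BYTES_PER_PASTE / 1e9

# Average write rate; real traffic will have peaks above this.
avg_writes_per_second = monthly_pastes / SECONDS_PER_MONTH

print(f"{annual_storage_tb:.0f} TB/yr, {annual_metadata_gb:.0f} GB/yr metadata, "
      f"{avg_writes_per_second:.2f} writes/s")
```

Under these assumptions the metadata is about 30 GB per year against 60 TB of content, which supports ignoring it in the storage estimate.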




API design

POST /api/auth/register Register a new user

POST /api/auth/login Authenticate and get a token

POST /api/pastes Create a new paste

GET /api/pastes/{paste_id} Retrieve a paste by its unique URL

DELETE /api/pastes/{paste_id} Delete a paste (only the owner)

PATCH /api/pastes/{paste_id} Update a paste (only the owner)



Database design

Core entities:

  • user
  • paste

Relationship between entities is one to many (user to paste)

User Schema:

  • id
  • username
  • email
  • password (hashed)
  • created_at
  • updated_at

Paste Schema

  • id
  • user_id (set index)
  • title
  • storage_key
  • created_at
  • updated_at
  • expires_at (set index)

The content of a paste is stored in S3. A separate database holds the user and paste tables. The paste table only references the object through its storage key and does not contain the saved text itself.
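
The two schemas can be written down as DDL. The sketch below uses SQLite from the Python standard library purely so that it runs anywhere; the design does not name a database engine, and the column types, constraints and index names are assumptions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    id         INTEGER PRIMARY KEY,
    username   TEXT NOT NULL UNIQUE,
    email      TEXT NOT NULL UNIQUE,
    password   TEXT NOT NULL,          -- stores the hash, never the plain text
    created_at TEXT NOT NULL,
    updated_at TEXT NOT NULL
);

CREATE TABLE pastes (
    id          TEXT PRIMARY KEY,      -- the unique ID used in the paste URL
    user_id     INTEGER NOT NULL REFERENCES users(id),
    title       TEXT,
    storage_key TEXT NOT NULL,         -- key of the content object in S3
    created_at  TEXT NOT NULL,
    updated_at  TEXT NOT NULL,
    expires_at  TEXT NOT NULL
);

-- Indexes called out in the schema: list a user's pastes, find expired pastes.
CREATE INDEX idx_pastes_user_id    ON pastes(user_id);
CREATE INDEX idx_pastes_expires_at ON pastes(expires_at);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
```

The index on expires_at is what lets a cleanup job find expired pastes with a range scan instead of reading the whole table.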



High-level design

flowchart TD

n0["Frontend Application"];

n1["Backend Server"];

n2[("S3")];

n3[("MetaData")];

n0 --> n1;

n1 --> n2;

n1 --> n3;







Request flows

The high-level design shows the main components and their interactions. A frontend application handles user requests and forwards them to the API server. The API server is responsible for the CRUD operations listed in the API design. A paste's content is stored in S3, and the metadata database stores everything else, including users and paste metadata. Each object in S3 is referenced from the metadata, which links a paste to its actual content.
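
The create and read flows can be sketched with in-memory stand-ins for the two stores. Everything here is illustrative: `object_store` and `metadata` are plain dictionaries standing in for S3 and the metadata database, and `create_paste`, `get_paste` and the `pastes/<id>` key layout are hypothetical.

```python
import secrets

object_store: dict[str, bytes] = {}   # stand-in for S3
metadata: dict[str, dict] = {}        # stand-in for the metadata database


def create_paste(user_id: int, title: str, text: str) -> str:
    """Store the content, then record the metadata row that points at it."""
    paste_id = secrets.token_urlsafe(8)
    storage_key = f"pastes/{paste_id}"
    # Write the content first so a metadata row never points at a missing object.
    object_store[storage_key] = text.encode("utf-8")
    metadata[paste_id] = {
        "user_id": user_id,
        "title": title,
        "storage_key": storage_key,
    }
    return paste_id


def get_paste(paste_id: str):
    """Look up the metadata row, then follow storage_key to the content."""
    row = metadata.get(paste_id)
    if row is None:
        return None
    return object_store[row["storage_key"]].decode("utf-8")
```

Writing the object before the metadata row means a failure between the two steps leaves at most an orphaned object, which is easier to clean up than a dangling reference.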



Detailed component design

flowchart TD

n0["Client"];

n1["CDN"];

n2[("Object Storage")];

n3["Load Balancer"];

n4["Write Service"];

n6["Read Service"];

n10[("Master Metadata")];

n11[("Slave Metadata")];

n13[("Slave Metadata")];

n16[("Metadata Cache")];

n17["[Async] Cleanup Service"];

n18["Message Broker [Replicator]"];

n0 --> n1;

n1 --> n2;

n0 --> n3;

n3 --> n4;

n3 --> n6;

n4 --> n10;

n4 --> n2;

n6 --> n11;

n6 --> n13;

n6 --> n16;

n17 --> n10;

n17 --> n2;

n4 --> n18;

n18 --> n13;

n18 --> n11;






Trade offs/Tech choices

First, we put a CDN in front of the S3 object storage. Objects are served from S3 through the CDN, so we benefit from the CDN's cache and from geographically optimized delivery.


The client hits a load balancer / API gateway because the backend is split into a write service and a read service. The main reason for the split is the leader-based replication we use for the metadata store. The system is very read-heavy and the data needs to be highly available, so a single database instance would hit scalability and availability limits.

We choose one master and multiple slaves (2 in this case, but this can easily be extended). All write requests go exclusively to the master.

Next, the replicas need to be kept in sync. Here we favor asynchronous replication and accept eventual consistency as the trade-off. Consistency matters in our case, but it does not need to be immediate or strong; availability and durability are more important. So we introduce a message broker that notifies all replicas about changes on the master.
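
The effect of asynchronous replication can be illustrated with a toy model. This is a sketch under simplifying assumptions: the `Leader`, `Replica` and `drain` names are hypothetical, and a `deque` stands in for the message broker, with no networking, ordering guarantees or failure handling.

```python
from collections import deque


class Leader:
    """Accepts writes and queues each change for the replicas."""

    def __init__(self):
        self.data = {}
        self.log = deque()   # stand-in for the message broker

    def write(self, key, value):
        self.data[key] = value           # the write is acknowledged here
        self.log.append((key, value))    # replicas are updated later


class Replica:
    """Serves reads from its own copy, which may lag behind the leader."""

    def __init__(self):
        self.data = {}

    def apply(self, change):
        key, value = change
        self.data[key] = value


def drain(leader, replicas):
    """Deliver all queued changes to every replica, in write order."""
    while leader.log:
        change = leader.log.popleft()
        for replica in replicas:
            replica.apply(change)
```

Between `write` and `drain`, a read from a replica returns stale or missing data. That window is the eventual consistency being accepted in exchange for writes that do not wait on the replicas.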

Because the system is read-heavy, we introduce a cache for the metadata, using Redis. We aim for a high cache hit ratio.
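
A common way to use such a cache is the cache-aside pattern: check the cache, fall back to a replica on a miss, then populate the cache. The sketch below uses a small in-process TTL cache instead of Redis so that it is self-contained; `TTLCache`, `read_metadata` and `load_from_replica` are hypothetical names, and the pattern itself is an assumption since the design does not specify one.

```python
import time


class TTLCache:
    """Tiny in-process cache with per-entry expiry (stand-in for Redis)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]   # entry is stale, treat it as a miss
            return None
        return value

    def set(self, key, value):
        self._items[key] = (value, self._clock() + self._ttl)


def read_metadata(paste_id, cache, load_from_replica):
    """Cache-aside read: try the cache, then a replica, then fill the cache."""
    row = cache.get(paste_id)
    if row is None:
        row = load_from_replica(paste_id)
        if row is not None:
            cache.set(paste_id, row)
    return row
```

The cache TTL bounds how long a deleted or updated paste can still be served from the cache, so it should be chosen with the expiry and update behavior in mind.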


Lastly, we need to handle the expiry date. Another service is introduced that handles cleanup asynchronously. It can be triggered by a cron job, or it can be notified about paste expiry dates through the same message broker that the write side already publishes to. It then deletes the expired pastes and cleans up both the object storage and the metadata.
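
One pass of such a cleanup job could look roughly like the following. It is a minimal sketch with dictionaries standing in for the metadata database and the object storage; `cleanup_expired` is a hypothetical name, and a real job would query by the expires_at index in batches rather than scan everything.

```python
from datetime import datetime, timedelta, timezone


def cleanup_expired(metadata, object_store, now):
    """Delete every paste whose expires_at is at or before `now`.

    Returns the list of deleted paste IDs.
    """
    expired = [pid for pid, row in metadata.items() if row["expires_at"] <= now]
    for pid in expired:
        # Remove the content first; pop() tolerates an already-missing object,
        # so the job is safe to re-run after a partial failure.
        object_store.pop(metadata[pid]["storage_key"], None)
        del metadata[pid]
    return expired


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
metadata = {
    "old": {"storage_key": "pastes/old", "expires_at": now - timedelta(days=1)},
    "new": {"storage_key": "pastes/new", "expires_at": now + timedelta(days=1)},
}
object_store = {"pastes/old": b"stale", "pastes/new": b"fresh"}
deleted = cleanup_expired(metadata, object_store, now)
```

Because cleanup is asynchronous, the read path should still check expires_at itself so that an expired paste is not served before the job gets to it.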




Failure scenarios/bottlenecks

Because we use leader-based replication with a single leader, we need to plan for leader failover so that writes are not blocked when the leader fails. The single leader is effectively a single point of failure, so we need to make it resilient.
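
One simple failover policy is to promote the healthy replica that has applied the most changes, which minimizes the writes lost under asynchronous replication. The sketch below only shows that selection step; `Node`, `applied_offset` and `promote_new_leader` are hypothetical, and failure detection, fencing of the old leader and client re-routing are all out of scope here.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    applied_offset: int    # how many replicated changes this node has applied
    healthy: bool = True


def promote_new_leader(replicas):
    """Pick the healthy replica that is furthest along in the change log."""
    candidates = [r for r in replicas if r.healthy]
    if not candidates:
        raise RuntimeError("no healthy replica available for promotion")
    return max(candidates, key=lambda r: r.applied_offset)


replicas = [
    Node("replica-1", applied_offset=120),
    Node("replica-2", applied_offset=135),
    Node("replica-3", applied_offset=140, healthy=False),
]
new_leader = promote_new_leader(replicas)
```

Any changes the failed leader acknowledged but had not yet replicated are lost with this approach, which is the durability cost of asynchronous replication.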


The number of slave replicas may need to grow in the future.





Future improvements

Even though we leverage the CDN cache for the objects in S3, at larger scale we can consider additional caching mechanisms for the files stored in S3, so that we don't need to read from the origin every time.