Design Pastebin - System Design

System requirements

Functional:

user should be able to paste a given text and get a short url in return to access the text
text once pasted cannot be modified
user when pastes the short url in the brower should be redirected to the pasted content

Non-Functional:

system should be available all the time
system should be reliable
text should not be lost before the expiration time

Capacity estimation

Assuming there are 10 million daily active users, a user pastes 2 texts daily, max size of a paste is 1MB, a paste will be accessed 10 times throughout the day

Total storage = 20 petabytes (10 * 10^6 * 2 * 10^6)

Writes per second = 232 ((10^7 * 2) / 86400)

232 * 1MB = 232MB = 2.32 * 10^8 bytes of data per second

Reads per second = 2315 (2 * 10^8 / 86400)

2315 * 1MB = 2.3 * 10^9 bytes of data per second read

Cache data storage = taking 1/5th of the actual storage

20 * 10^12 / 5 = 4 * 10^12 = 4 peta bytes

Evidently the reads are over writes by a factor of 10. So, reads >>> writes. Since the data is huge, I took 1/5th of the actual storage for the cache

API design

We will use REST Apis. One api for pasting the text and getting short url. Other to access the paste using the short url.

POST /paste

Request Body:

{

"userId": "U123",

"expirationTime:": 12342354423,

"content": "some text",

"accessControl": "PUBLIC"

}

Response Body:

200, OK

{

"pasteUrl": "www.pastebin.com/a2457ze"

}

500 - internal server errors

GET /paste/{pasteUrl}

Response Body:

302 http redirection to the actual pasted content

400 - when the url is expired

500 - internal server errors

Database design

Users table:

userId -> primary key, varchar(7)

email -> varchar(255)

password -> varchar(15)

Paste Info table:

pasteId -> primary key

userId -> same as above

pasteUrl -> index on this for faster lookup, short url with 8 chars

expirationTime -> date

accessControl -> varchar(7) (has values either public or private)

createdAt -> date

Paste Table (like Amazon S3):

pasteId -> same as above

pasteContent -> usually a blob

Since the reads are going to be heavy compared to writes, we can have B+ trees for indexing. The read will be efficient sacrificing the writes because writes are done on disk i/o space. Also, single leader replication strategy with asynchronous replication or eventual consistency will be used to eliminate the write conflicts.

High-level design

There will be a CDN distributed across regions storing the content users requested with the given pasteUrl. There will be an api gateway, load balancer sitting next to it to route the requests to any of the healthy nodes. I am assuming that 100 healthy nodes are required in this cases because the text size, volume of reads, writes are heavy. Then comes the cache, application server, database in sequence. We will have a zookeeper to give you the address of a healthy server instance when requested.

Components in order:

CDN: CDN stores static data. Since, the text once pasted cannot be modified the text becomes static and hence caching it in the CDN is a great choice.it makes sense if the CDNs are distributed globally across regions because users in one area might fetch similar urls/texts.

Api Gateway: This will usually take care of the authorization part whether a user has right privileges to view the text or not. Once verified, this routes the request to a load balancer.

Load Balancer: Load balancer will distribute the load evenly across the 100 healthy instances. For this, we can use round robin or IP hashing algorithms.

Cache: In memory cache like Redis can be used for this.

Application server: For simplicity, I haven't split the server into 2 services: write and read.

Data store for content: Since the text is huge a blob storage like Amazon s3 can be used. The corresponding (upload) url of a paste will be stored in the database along with its meta data. The data in transit and at rest will be encrypted.

Database: MySQL can be used due to the simplicity and the need for B+ tree indexes. Data will be sharded and partitioned. The sharding key for the users table is userId and paste info table is pasteUrl. The partitioning key used is the range of the short urls / paste urls. Since the database is partitioned, the cache will also be partitioned.

Generating short url / paste url: Since the DAU is quite high, assuming we should support 1 trillion pastes, we can use 8 characters to generate the paste url. the number of characters utilised is lowercase and 10 digits (26 + 10 = 36, 36^8 ~2 trillion choices). this way, we would never run out of choices for the paste urls.

A periodic cron job will run periodically (once a week or 10 days) to remove the expired urls from the cache and from the s3 data store. This way we would not serve expired paste url requests.

Additionally we can have a rate limiter sitting right at the api gateway layer to block any duplicate requests coming from malicious users.

Write flow:

user sends a content to create a paste.
Server receives a request from the load balancer.
We will generate a short url and put a lock on it.
upload it to s3 and get the pasteId
write to database to the leader node.
leader node will propagate this write to all its replicas

Read flow:

user requests a short url
check if it is available in CDN. If so, return
check if it is available in the cache. If so, return
fetch from the database. If the entry is expired, return 400.
Get the data from the s3 store using the pasteId
Put it in the cache
Return it to the user

Request flows

same as above

Detailed component design

same as above.

Trade offs/Tech choices

Single vs Multi leader: Since writes are not that huge and basically we don't case if two users uploads the same content, multi leader replication makes the system more complex.
MySQL as primary database with Amazon S3 as blob storage instead of using a document based database like MongoDB to store the text on disk. This is better since the text size is quite big.

Failure scenarios/bottlenecks

I don't really know how to do this, so i am skipping this. But i would learn how to do this.

Future improvements

I don't really know how to do this, so i am skipping this. But i would learn how to do this.