Design Pastebin - System Design

Requirements

Functional Requirements:

Allow users to upload and store text or code snippets.
Generate a unique shareable URL for each paste.
Generated URLs should not be easily guessable by unintended users.
Enable retrieval of paste content by URL.
Support expiration and TTL for pastes. Pastes would be completely removed from the system on expiry/deletion
Allow paste owners or the system to delete a paste before its natural expiration.
Allow paste owners to edit a paste before it's deleted or expires.

Non-Functional Requirements:

High Availability: The system must remain continuously accessible to users, making sure there is minimal downtime. When a paste is created and shared, it should reliably be retrievable at all times, even in the event of partial system failures. The system should maintain availability through redundancy, failover mechanisms and a well-designed distributed architecture.
Performance: The system should provide low-latency responses when retrieving "bins", while prioritizing high availability, it must ensure a quick response time through efficient caching strategies and the use of reliable and high-performance database systems. the System should maintain low-latency for paste retrievals with targets such as:
- Average Response Time: 100ms
- P95 latency below 200 ms
- p99 latency below 500 ms
Scalability: The user base of such a system could grow quickly, therefor we should ensure the system can scale horizontally to cater to the growing number of users
Reliability: Reliability and durability in this system are key to ensuring data consistency. When a "bin" is created and shared, the text in that bin cannot be lost and should be retrievable by both the owner and the receivers. Once bins have expired or been deleted, the system should automatically and successfully purge the expired pastes.

Capacity Estimation:

Traffic:

Estimate Number of Users at 10 000 000
Estimate Daily Active Users 1000 000
Each User Creates 2 Writes per day, estimating to 2000 000 writes per day
Each paste would be accessed on average 20 times per day, which estimates 2000 000 * 20 = 40 000 000 reads per day
If we estimate the average paste size at 10 Kb per day
The peak traffic would be 5x the average
2000000 writes per day equals 24 writes per second
40 000 000 reads per day equals 463 reads per second

Storage:

Data to store: name (255), code (8 bytes), text (10 KB => 10000 characters), created_timestamp (8 bytes), updated_timestamp (8 bytes), expiration_timestamp (8 bytes), remaining_reads (4 bytes) this would account to 10291 bytes of storage or approximately 10.5 KB per request to round up to 11 - 12 KB per request
If we store 2000 000 pastes per day for 12 KB per paste this would equal approximately 22.8 GB per day considering data replication for high availability and Durability, the replication factor by 3 would make around 69 GB per day which would be around 25 TB per year

Bandwith:

Approximately 24 writes per second => 24 * 12 = 288 KB/s => for peak times 288 * 5 = 1.4 MB/s
Approximately 463 reads per second => 463 * 12 = 5556 KB/s = 5.43 MB/s => for peak times 5.43 * 5 = 27.15 MB/s

API Design

Create Paste:

Endpoint: POST /v1/bin

Description: Create a new paste and generate a unique URL code

Message Body:

name: string - the paste name
text: string (limit 10000 characters) - the paste text
burn_after_read: true or false - whether the paste is readable only once

Success Response:

201 Created
{ "code": "xusj8ajd", "name": "paste name", "text": "example text", "burn_after_read": 1, url: "https://example.com/xusj8ajd"}

Error Response:

400 Bad Request
{ "error": "Invalid paste text" }
{ "error": "Invalid paste name" }

Authentication: Token in the Authorization header. authentication is optional for creating pastes, the rate limit for creating pastes is higher and better managed when users are authenticated, also users would be able to edit and delete the pastes they create.

Rate Limiting:

Guest Users: 10 Requests per minute per IP

Authenticated Users: 20 Requests per minute per user

Returns 429 (Too Many Requests) when exceeded, with a Retry-After header indicating when the client can try again.

Validation: The server should validate the body fields to make sure they are correctly formatted and to avoid any abuse

Retrieve Paste:

Endpoint: GET /v1/bin/{code}

Description: Get a bin by code, if it was set to burn on read, also change the status do inactive

Path Parameters:

code: string - the paste code

Success Response:

200 OK
{ "name": "xusj8ajd", "text": "example text", "burn_after_read": 1}

Error Response:

404 Not Found
{ "error": "Paste not found" }

Rate Limiting:

Guest Users: 20 Requests per minute per IP

Authenticated Users: 100 Requests per minute per user

Returns 429 (Too Many Requests) when exceeded, with a Retry-After header indicating when the client can try again.

Edit Paste:

Endpoint: PATCH /v1/bin/{code}

Description: Edit a paste text and name

Message Body:

name: string - the paste name
text: string (limit 10000 characters) - the paste text

Success Response:

200 OK
{ "code": "xusj8ajd", "name": "paste name", "text": "example text", "burn_after_read": 1, url: "https://example.com/xusj8ajd"}

Error Response:

400 Bad Request
- { "error": "Invalid paste text" }
- { "error": "Invalid paste name" }
404 Not Found
- { "error": "Paste not found" }

Authentication: Token in the Authorization header. authentication is required for editing pastes.

Rate Limiting:

Authenticated Users: 20 Requests per minute per user

Returns 429 (Too Many Requests) when exceeded, with a Retry-After header indicating when the client can try again.

Validation: The server should validate the body fields to make sure they are correctly formatted and to avoid any abuse

Delete Paste:

Endpoint: DELETE /v1/bin/{code}

Description: Delete a paste

Path Parameters:

code: string - the paste code

Success Response:

200 OK
{ "message": "Paste deleted successfully", "code": "xusj8ajd" }

Error Response:

404 Not Found
- { "error": "Paste not found" }

Authentication: Token in the Authorization header. authentication is required for deleting pastes.

Rate Limiting:

Authenticated Users: 30 Requests per minute per user ( at a rate of deleting 2 pastes per second )

Returns 429 (Too Many Requests) when exceeded, with a Retry-After header indicating when the client can try again.

High-Level Design

Database:

The data flow relies heavily on generating URL short codes and on creating paste text related to these codes; the main workload would be focused on the generation of these short codes and then on retrieving specific paste text that is related to them. The flow will be read-heavy, retrieving shared pastes through the code from the URL. The estimation is 20:1 for 20 reads per 1 created paste.

In this scenario, the use of a key-value store database makes the most sense as the data would be heavily reliant on retrieving pastes by short code or on writing to the database with the short code as the key. No heavy join operations are required, relational constraints, and the data does not need to be normalized.

Write Flow:

The database writes need to be consistent; a small delay would be ok, yet consistency and ensuring the data is available and correct are of high importance. The flow would be Client --> Load Balancer --> Code Generation Service --> DynamoDB --> Redis. The write flow should be synchronous, as when a user creates a paste, they should only get a successful response after the paste has been persisted.

DynamoDB should use conditional writes to make sure no collisions in URL generation occur as another safety mechanism on top of the code generation service.

Read Flow:

The Read flow requires low latency and eventual consistency, we need to make sure the data is available when a URL is used, in the case of a burn after read event, we need to make sure that the paste is read only once, we need strong consistency for this case and this would take a synchronouse path with atomic database read and update. we need to make sure no cache is stored for burn after read pastes, yet for the normal flow of normal pastes using cache is key to ensure optimal reads and low server load.

Normal Flow: Client --> CDN --> Load Balancer --> Redis Cache --> DynamoDB

Burn After Read Flow: Client --> Load Balancer --> API --> DynamoDB (Cache-Control: no-store)

Cache:

Different layers of caching to make sure best response time for reads while making sure no cache is stored for burn-after-read operations.

CDN level caching with regional edge locations.

Application layer cache with a distributed Redis cluster, implementing cache partitioning by code, cache-aside with locking to make sure that only 1 request reaches the database in case of a cache miss, and writing the value to the Redis cache for subsequent queries, and preventing the risk of a cache stampede.

burn-after-read pastes bypass caching and go directly to the database.

Application Layer:

Code Generation Service (Multiple Instances):

Generate URL short codes for pastes at write time. The key purpose is to avoid collisions in code generation while still keeping low latency when creating pastes.

Collision handling at the level of the database is also an extra layer of protection. Conditional writes using PutItem with the condition.

Range-based allocation for ID assignment means writes succeed on the first time, as no 2 generated IDs can be the same, while each server has an ID pool to use, and the pools are tracked in the database

Paste Retrieval Service (Multiple Instances):

Look up paste URLs and return results to the user.

The request goes through CDN -> Load Balancers -> Application Servers -> Redis Cache -> Database

The first cache layer is the CDN for the fastest retrieval of request information. If the CDN cache is a miss, the request continues to the LB, which directs the traffic to the Retrieval service, which then looks up the system cache layer (Redis), in case of another miss, the service performs a database lookup for the data.

Redis:

A cache-aside strategy is used to ensure data is cached only after it is requested, and cached data does not pollute the cache with unused pastes.

Cache Management:

Eviction Policy: LFU (Least Frequently Used) to retain high traffic pastes, explicit invalidation is performed on deleted or expired entries.

Detailed Component Design

Caching:

Multiple layers of caching are implemented to ensure the load is managed at the application layer and at the Database layer, with the database connections being the bottleneck.

1- CDN (hot tier):

The CDN caches user requests at edge locations closer to users, reducing latency and server load. Cache policies are implemented to manage the stored requests using the Cache-Control header to make sure only appropriate data is cached. TTL based expiration ensures updates are timely propagated.

2- Application Layer, Redis (warm tier):

The Redis layer ensures warm requests are served from this cache. multiple shards based on the short code, a cache-aside pattern to ensure data is cached only when used, and no cache pollution with never-requested data. LRU (Least Recently Used) eviction policy to ensure the cache has data that is used most frequently and most recently.

Partitioning:

Data in both Redis and the database is partitioned by shortcode; this means each lookup path would hit one Redis shard, and if not found, one database partition to find the requested data.

Database:

DynamoDB, an AWS-managed key-value store type database, is recommended in this scenario for the simplicity of its use, and the data is also best stored in such a database for fast lookups and reliable writes.

CassandraDB would be another choice, but it would require much heavier operational resources and knowledge.

Both databases are highly scalable and support automatic partitioning/sharding out of the box, without the need for manual sharding like in SQL databases.

Code Generation Service:

Collision Handling:

Collision handling using generated range-based pools of IDs and unique writes to the database. This approach allows simplicity as there will be minimal runtime coordination between machines, and ranges would be tracked in the database.