Design Pastebin - System Design

System requirements

Functional:

User login
Create a pastebin specifying the text and expiration. There is a limit of 64 KB per pastebin
Get a pastebin from a URL (no login required)
Delete pastebin

Non-Functional:

High availability and failover
Low latency when querying a pastebin
Eventual consistency is acceptable; a delay of a few seconds is tolerated
High scalability so user base and requests can increase
Security: users must be authenticated to create pastebins. The information is protected in transit and at rest

Capacity estimation

Storage

Pastebin (10 KB) --> 12.5 KB

Id (UUID 36): 36 bytes
Owner (UUID 36): 36 bytes
Creation timestamp: 4 bytes
Expiration timestamp: 4 bytes
Content (VARCHAR 64K): 10 KB (mean size)

First year

100,000 users
1,000 pastebins x user x year: 100 M pastebins x year
Average expiration: 0.1 years
1st year storage: 125 GB (100 M x 0.1 x 12.5 KB)
275 K pastebin creations x day
6 creations x second at peak
5 requests x user x day: 0.5 M requests x day
250,000 peak requests x hour: 70 requests x second

Third year

Previous figures x 10

1 M users
1,000 M pastebins x year
Average expiration: 0.1 years
3rd year storage: 1.25 TB (1,000 M x 0.1 x 12.5 KB)
5 M requests x day
Peak 700 requests x second
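The arithmetic behind these estimates can be checked with a short script. This is only a sketch: it takes the yearly paste totals (100 M and 1,000 M), the 12.5 KB record size and the 0.1 year average lifetime as given, and assumes that only unexpired pastes occupy storage. The function names are illustrative.

```python
def live_storage_bytes(pastes_per_year: int, avg_lifetime_years: float,
                       paste_size_bytes: int) -> float:
    """Storage held at any moment, assuming expired pastes are deleted."""
    return pastes_per_year * avg_lifetime_years * paste_size_bytes


def per_second(events: int, window_seconds: int) -> float:
    """Average rate of events over a time window."""
    return events / window_seconds


PASTE_SIZE_BYTES = 12_500  # 12.5 KB per stored paste (decimal units)

first_year = live_storage_bytes(100_000_000, 0.1, PASTE_SIZE_BYTES)
third_year = live_storage_bytes(1_000_000_000, 0.1, PASTE_SIZE_BYTES)

print(f"1st year storage: {first_year / 1e9:.0f} GB")
print(f"3rd year storage: {third_year / 1e12:.2f} TB")
print(f"Creations per day: {100_000_000 / 365:,.0f}")
print(f"Peak requests per second: {per_second(250_000, 3600):.0f}")
```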

API design

POST /v1/pastes

The body of the request is a JSON with the expiration and the content of the pastebin

{
  "expiration_in_minutes": 10,
  "content": ""
}

Returns the paste id:

HTTP/1.1 201 Created

Location: /v1/pastes/a1b2c3d4
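As an illustration of how the request body could be checked before a paste is stored, here is a minimal sketch. The function name and the 400 and 413 status codes for invalid bodies are assumptions, not part of the API above; the 64 KB limit comes from the functional requirements, and a missing expiration is treated as "never expires".

```python
import json

MAX_CONTENT_BYTES = 64 * 1024  # 64 KB limit from the functional requirements


def validate_create_request(raw_body: str) -> tuple[int, dict]:
    """Return (HTTP status, payload) for a POST /v1/pastes body."""
    try:
        body = json.loads(raw_body)
    except json.JSONDecodeError:
        return 400, {"error": "body is not valid JSON"}
    if not isinstance(body, dict):
        return 400, {"error": "body must be a JSON object"}

    content = body.get("content")
    minutes = body.get("expiration_in_minutes")  # None means no expiration

    if not isinstance(content, str) or content == "":
        return 400, {"error": "content must be a non-empty string"}
    # The limit applies to the encoded size, not the character count.
    if len(content.encode("utf-8")) > MAX_CONTENT_BYTES:
        return 413, {"error": "content exceeds 64 KB"}
    if minutes is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(minutes, bool) or not isinstance(minutes, int) or minutes <= 0:
            return 400, {"error": "expiration_in_minutes must be a positive integer"}

    return 201, {"content": content, "expiration_in_minutes": minutes}
```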

GET /v1/pastes/{paste_id}

Returns the pastebin information (creation date, expiration date and content)

GET /v1/pastes?limit=nn&offset=nn

Returns the list of pastebins for the logged-in user.

Optional URL parameters define the maximum number of results (limit) and the offset (offset).

For performance reasons, limit has a hard maximum of 50 and defaults to 10.

Example response

{
  "pagination": {
    "total_items": 135,
    "limit": 25,
    "offset": 50
  },
  "data": [
    {
      "_id": "x7y8z9w0",
      "created_at": "2025-07-15T12:00:00Z",
      "expires_at": "2025-07-15T13:00:00Z"
    }
  ]
}
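The limit and offset handling described for this endpoint could be sketched as follows. The helper names are illustrative, and clamping out-of-range values (rather than rejecting them with an error) is an assumption.

```python
DEFAULT_LIMIT = 10
MAX_LIMIT = 50


def clamp_paging(limit=None, offset=None) -> tuple[int, int]:
    """Apply the default and the hard maximum to the paging parameters."""
    limit = DEFAULT_LIMIT if limit is None else max(1, min(int(limit), MAX_LIMIT))
    offset = 0 if offset is None else max(0, int(offset))
    return limit, offset


def paginate(items: list, limit=None, offset=None) -> dict:
    """Build a response with the same shape as the example response."""
    limit, offset = clamp_paging(limit, offset)
    return {
        "pagination": {"total_items": len(items), "limit": limit, "offset": offset},
        "data": items[offset:offset + limit],
    }
```

In the real service the slicing would be pushed down to the database query instead of being applied to an in-memory list.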

DELETE /v1/pastes/{paste_id}

Deletes a pastebin if it belongs to the currently logged-in user

Common response codes:

200 OK.
201 Created: if the paste has been created.
204 No Content: for successful delete.
404 Not Found: if a paste doesn't exist.
401 Unauthorized: if the user is not logged in.
403 Forbidden: if the user is not the owner.
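For DELETE, the order in which these codes are evaluated matters. A minimal sketch of one possible order follows; the function and its parameters are hypothetical, and checking authentication before existence is an assumption.

```python
from typing import Optional


def delete_status(user_id: Optional[str], paste_owner_id: Optional[str]) -> int:
    """Pick the response code for DELETE /v1/pastes/{paste_id}.

    user_id is None when the request carries no valid login;
    paste_owner_id is None when the paste does not exist.
    """
    if user_id is None:
        return 401  # not logged in
    if paste_owner_id is None:
        return 404  # paste does not exist
    if paste_owner_id != user_id:
        return 403  # logged in, but not the owner
    return 204  # deleted successfully
```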

Database design

Pastebin collection

{
  "_id": ObjectId("67890abcdef1234567890abc"), // Handled by MongoDB
  "owner_id": "user-uuid-1234", // The owner's identifier
  "created_at": ISODate("2025-07-15T12:00:00Z"),
  "expires_at": ISODate("2025-07-15T13:00:00Z"), // Can be null if it never expires
  "content": "Your multi-line content goes here."
}

Indexes

ID (_id) is the default collection index. Fast, unique lookups for a single paste (GET /pastes/{id}). No action needed.
Owner and creation date. { owner_id: 1, created_at: -1 }. Compound index. Efficiently retrieves the pastes of a specific user, sorted by creation date.
Expiration date. { expires_at: 1 } (with filter). For retrieving expired documents.

Partition key

_id is used as the partition key for sharding, expecting an even distribution of data and load.
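An even distribution depends on how the key is mapped to shards. Because ObjectId values grow over time, a range-based key would send new pastes to the same shard, so the sketch below assumes a hashed shard key. It only illustrates the idea: MD5 stands in for the hash function and this is not MongoDB's actual routing code.

```python
import hashlib
from collections import Counter


def shard_for(paste_id: str, shard_count: int) -> int:
    """Map an id to a shard by hashing it (illustrative stand-in for a hashed key)."""
    digest = hashlib.md5(paste_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def distribution(ids, shard_count: int) -> Counter:
    """Count how many ids land on each shard."""
    return Counter(shard_for(paste_id, shard_count) for paste_id in ids)


# Sequential ids still spread across the shards because they are hashed first.
counts = distribution((f"paste-{n}" for n in range(10_000)), shard_count=4)
print(sorted(counts.items()))
```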

High-level design

Edge layer

DNS

Translates domain names to IPs.

Load balancer

Distributes requests to the ingress controller of the Kubernetes cluster. It can balance between different regions for disaster recovery.

Application layer

API Gateway

Validates the JWT and redirects to IAM when login is required (for example, when posting).

Manages rate limiting / throttling.

Central request logging point.
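The design does not say which rate limiting algorithm the gateway applies. As one common option, a token bucket could look like the sketch below; the class, its parameters and the injected clock are assumptions made for illustration.

```python
class TokenBucket:
    """Allows bursts up to `capacity` and a sustained rate of `refill_per_second`."""

    def __init__(self, capacity: int, refill_per_second: float, now: float = 0.0):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)  # start with a full bucket
        self.last = now

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` (seconds) may pass."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last = max(self.last, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(capacity=2, refill_per_second=1.0)
print([bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)])
```

A gateway would typically keep one bucket per user or per client IP and answer rejected requests with 429 Too Many Requests.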

Identity and Access Management

Implemented with Keycloak.

It can provide different flows for different clients (browser, mobile app).

Offloads user management and authentication. A dedicated, specialized application improves security and regulatory compliance.

Enables social login and identity federation with other IAM systems.

REST API

Runs as Spring Boot microservices that write and read pastes to and from the Pastes database.

It also uses a cache to store and retrieve recently used pastebins.

Expiration service

Periodically deletes expired pastes from the database and Redis.

Implemented as a Spring Batch Kubernetes CronJob.
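The core of each run can be sketched as follows, with plain dictionaries standing in for MongoDB and Redis. The function name and the data layout are assumptions; pastes whose expiration is null are never removed.

```python
from datetime import datetime, timezone


def purge_expired(db: dict, cache: dict, now: datetime) -> list:
    """Delete expired pastes from the database and the cache; return their ids."""
    expired = [
        paste_id
        for paste_id, paste in db.items()
        if paste.get("expires_at") is not None and paste["expires_at"] <= now
    ]
    for paste_id in expired:
        del db[paste_id]
        cache.pop(paste_id, None)  # the paste may not be cached
    return expired


now = datetime(2025, 7, 15, 13, 30, tzinfo=timezone.utc)
db = {
    "old": {"expires_at": datetime(2025, 7, 15, 13, 0, tzinfo=timezone.utc)},
    "live": {"expires_at": datetime(2025, 7, 16, 13, 0, tzinfo=timezone.utc)},
    "forever": {"expires_at": None},
}
cache = {"old": "cached content"}
print(purge_expired(db, cache, now))  # ['old']
```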

Pastes database

Storage for pastes information. Implemented with MongoDB.

Cache

Stores cached pastes to reduce latency and database load. Implemented with Redis.

Kubernetes cluster

The Application, Monitoring & Alerting, and Logging & Analytics layers run in a Kubernetes cluster.

For high availability, several nodes are run in 2 availability zones.

For failover and disaster recovery, another cluster can be started in a different region to take over.

Telemetry Collection

Log Collector

Fluent Bit service that collects application logs and forwards them to the Logging & Analytics system.

Telemetry Collector

OpenTelemetry Collector that forwards traces to the Logging & Analytics system.

Logging & Analytics

Centralizes logs and traces to help SRE / DevOps teams analyse issues and system behaviour.

Data Prepper

Prepares log content before inserting it into OpenSearch.

OpenSearch

Indexes logs and traces to allow easy search and analysis.

It also sends alerts to Alertmanager when specific error logs are found.

Dashboards

Provides a user interface with dashboards to search and analyse logs and traces.

Monitoring & Alerting

Prometheus

Scrapes metrics from applications, and stores them in a TSDB.

Grafana

Provides dashboards for metrics visualization.

Alertmanager

Handles alerts, providing the following functionality:

Deduplication
Grouping
Inhibition
Routing
Integration with different channels (Email, Slack, ...)

Security

Some requests require authentication. It is enforced in the API Gateway, which checks that a valid JWT is present in the request, and in the REST API implementation, which checks for the appropriate scope.

Data is encrypted in transit and at rest.

Request flows

Sequence diagrams are provided for request flows of the following APIs:

POST /v1/pastes
GET /v1/pastes/{paste_id}

Detailed component design

Pastes database - MongoDB

High availability and durability are achieved with 5 replicas distributed across 3 availability zones (2-2-1).

Scalability is achieved with partitioning.

When creating or updating a paste, strong consistency is achieved by writing to 3 nodes, a majority of the 5 replicas.

For reading, eventual consistency is allowed. Reads of recent pastes are served from the cache. If a paste is not cached, either it has already been stored in the secondary replicas or the cache is down; in both cases a secondary replica is used to avoid overloading the primary. So the secondaryPreferred read preference with the available read concern is selected.
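These settings map onto standard MongoDB connection string options (w, readPreference, readConcernLevel). The sketch below only builds the URIs and does not open a connection; the host names and the database name are hypothetical, and in practice the options may be set per operation instead of per connection.

```python
from urllib.parse import urlencode


def majority(replicas: int) -> int:
    """Smallest number of nodes that forms a majority of the replicas."""
    return replicas // 2 + 1


def mongo_uri(hosts: list, database: str, options: dict) -> str:
    """Build a MongoDB connection string from hosts, database and options."""
    return f"mongodb://{','.join(hosts)}/{database}?{urlencode(options)}"


HOSTS = ["mongo-1.example.internal:27017", "mongo-2.example.internal:27017"]  # hypothetical

# Writes wait for acknowledgement from 3 of the 5 replicas.
write_uri = mongo_uri(HOSTS, "pastebin", {"w": majority(5)})

# Reads prefer secondaries and accept possibly stale data.
read_uri = mongo_uri(
    HOSTS,
    "pastebin",
    {"readPreference": "secondaryPreferred", "readConcernLevel": "available"},
)

print(write_uri)
print(read_uri)
```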

Cache - Redis

High availability and scalability are achieved with a Redis Cluster, which provides failover for high availability and partitioning for scalability.

The cache is used in 2 scenarios:

Reads are usually performed a short time after writing, so a write-through pattern is used for this scenario.
Some pastes are shared with many people, for example on social networks. To tackle the celebrity problem, a paste that is retrieved multiple times in a short period of time is also cached. A cache-aside pattern with a hit counter is used.
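Both scenarios can be sketched with in-memory dictionaries standing in for MongoDB and Redis. The class name and the hot_threshold parameter are assumptions, and the sketch leaves out what a real deployment would add: TTLs on cached entries and a time window on the hit counter.

```python
class PasteCache:
    """Write-through on create, cache-aside with a hit counter on read."""

    def __init__(self, db: dict, hot_threshold: int = 3):
        self.db = db              # stand-in for the Pastes database
        self.cache = {}           # stand-in for Redis
        self.hits = {}            # database reads per uncached paste
        self.hot_threshold = hot_threshold

    def create(self, paste_id: str, content: str) -> None:
        # Write-through: store in the database and the cache in the same request.
        self.db[paste_id] = content
        self.cache[paste_id] = content

    def get(self, paste_id: str):
        if paste_id in self.cache:
            return self.cache[paste_id]
        content = self.db.get(paste_id)
        if content is None:
            return None
        # Cache-aside: only cache a paste once it has proven to be popular.
        count = self.hits.get(paste_id, 0) + 1
        self.hits[paste_id] = count
        if count >= self.hot_threshold:
            self.cache[paste_id] = content
            del self.hits[paste_id]
        return content
```

The hit counter keeps pastes that are read once or twice from evicting more useful entries, at the cost of a few extra database reads before a popular paste is cached.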

Kubernetes cluster

For high availability, a 5-node cluster is spread across 3 availability zones in a region.

Trade-offs/Tech choices

Database selection

The data requirements include:

Simple queries.
Large amount of data, > 1 TB.
High availability is preferred to strong consistency. Eventual consistency is allowed.

Taking all of this into account, a NoSQL database is selected.

MongoDB is selected because it is widely known, provides a flexible data schema, allows the consistency level to be chosen per request depending on the operation, and has low operational overhead. A wide-column store such as Cassandra is not required given the data volume and write throughput.

Cache updates

Writes to the cache are performed synchronously instead of asynchronously through queues because:

It is a simpler solution with fewer components
Write time to Redis is short, so it does not add significant latency to user requests

Failure scenarios/bottlenecks

Pastes database

If many queries are received in a short period of time the following measures can be applied:

Use cache-aside strategy with hit counter, as previously proposed, for caching frequently requested pastes.
Use a circuit breaker pattern in the API implementation to avoid overloading the database, replying with an error in case the database takes too long to answer.
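A minimal sketch of the circuit breaker follows. It assumes that a database timeout surfaces as an exception; the class, its thresholds and the injected clock are illustrative, and a production service would more likely rely on an existing resilience library.

```python
class CircuitOpenError(Exception):
    """Raised when calls are rejected without reaching the database."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_after_seconds: float = 30.0):
        self.failure_threshold = failure_threshold
        self.reset_after_seconds = reset_after_seconds
        self.failures = 0
        self.opened_at = None  # time the circuit opened, None while closed

    def call(self, operation, now: float):
        """Run `operation` unless the circuit is open at time `now` (seconds)."""
        if self.opened_at is not None and now - self.opened_at < self.reset_after_seconds:
            raise CircuitOpenError("database circuit is open")
        # Closed, or half-open after the reset period: let the call through.
        try:
            result = operation()
        except Exception:
            self.failures += 1
            # Open on reaching the threshold, or re-open after a failed trial call.
            if self.failures >= self.failure_threshold or self.opened_at is not None:
                self.opened_at = now
            raise
        self.failures = 0
        self.opened_at = None
        return result
```

While the circuit is open, the API can reply immediately with an error such as 503 instead of queuing more work on a database that is already struggling.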

Redis cache cluster

In case of a high demand spike, Redis can become a bottleneck. In this situation, new nodes for new partitions can be added. This can be performed:

Manually, by adding shards or replicas to the Redis Cluster
Automatically, with Kubernetes Horizontal Pod Autoscaling to increase the number of pods and the KubeBlocks Kubernetes operator to manage automatic resharding.

Region outage

To be able to recover from a regional outage, a disaster recovery strategy could be used:

Use Infrastructure as Code to quickly restore the infrastructure in a different region
Use Percona Backup for MongoDB to make backups to S3 in multiple regions
Another option is to use MongoDB as a managed service (MongoDB Atlas)

Future improvements

Include functionality for pastes visibility.

Async notifications informing users that a paste will expire soon. This could be implemented with SSE for the web application and push notifications for mobile devices.