System requirements
Functional:
- User login
- Create a pastebin specifying the text and expiration. There is a limit of 64 KB per pastebin
- Get a pastebin from a URL (no login required)
- Delete pastebin
Non-Functional:
- High availability and failover
- Low latency when querying a pastebin
- Eventual consistency is allowed; a delay of a few seconds is acceptable
- High scalability so the user base and request volume can grow
- Security: users must be authenticated to create pastebins. The information is protected in transit and at rest
Capacity estimation
Storage
Pastebin (about 10 KB of fields) --> estimated at 12.5 KB per document including overhead
- Id (UUID 36): 36 bytes
- Owner (UUID 36): 36 bytes
- Creation timestamp: 4 bytes
- Expiration timestamp: 4 bytes
- Content (VARCHAR 64K): 10 KB (mean size)
First year
- 100,000 users
- 1,000 pastebins x user x year: 100 M pastebins x year
- Average expiration: 0.1 years, so about 10 M pastebins are stored at any time
- 1st year storage: 10 M x 12.5 KB = 125 GB
- 275 K pastebin creations x day (about 3 x second on average)
- 6 creations x second at peak
- 5 requests x user x day: 0.5 M requests x day
- 250,000 peak requests x hour: 70 requests x second
Third year
Previous figures x 10
- 1 M users
- 1,000 M pastebins x year
- Average expiration: 0.1 years
- 3rd year storage: 1.25 TB
- 5 M requests x day
- Peak 700 requests x second
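The estimates can be rechecked with a short script. This is only a sketch: it assumes that stored data equals pastes created per year times the average lifetime times the per-paste size, uses decimal units, and all function and variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60
PASTE_BYTES = 12_500  # 12.5 KB per stored paste, as estimated above


def live_storage_bytes(pastes_per_year, avg_lifetime_years, bytes_per_paste):
    """Bytes held at any moment: only unexpired pastes occupy storage."""
    return round(pastes_per_year * avg_lifetime_years * bytes_per_paste)


def per_second(events, seconds):
    """Average rate over a window of the given length."""
    return events / seconds


# First year: 100 M pastes per year, 0.1 years average lifetime.
year1_storage = live_storage_bytes(100_000_000, 0.1, PASTE_BYTES)
year1_creations_per_day = 100_000_000 / 365
year1_avg_creations_per_second = per_second(year1_creations_per_day, SECONDS_PER_DAY)
year1_peak_rps = per_second(250_000, 3_600)  # peak hour spread over 3,600 s

# Third year: every input multiplied by 10.
year3_storage = live_storage_bytes(1_000_000_000, 0.1, PASTE_BYTES)

print(f"year 1 storage: {year1_storage / 1e9:.0f} GB")
print(f"year 3 storage: {year3_storage / 1e12:.2f} TB")
print(f"year 1 creations per day: {year1_creations_per_day:,.0f}")
print(f"year 1 peak requests per second: {year1_peak_rps:.1f}")
```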
API design
POST /v1/pastes
The request body is a JSON object with the expiration and the content of the pastebin
{
"expiration_in_minutes" : 10,
"content" : ""
}
Returns the paste id in the Location header:
HTTP/1.1 201 Created
Location: /v1/pastes/a1b2c3d4
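As an illustration of the checks this endpoint implies, the sketch below validates a request body and derives the expiration time. It assumes the 64 KB limit applies to the UTF-8 encoded content and that a missing expiration means the paste never expires; the function name and error messages are hypothetical.

```python
from datetime import datetime, timedelta, timezone

MAX_CONTENT_BYTES = 64 * 1024  # 64 KB limit per pastebin


def validate_create_request(body, now):
    """Return the expiration datetime (or None) for a valid body, else raise ValueError."""
    content = body.get("content")
    if not isinstance(content, str):
        raise ValueError("content must be a string")
    # Assumption: the limit is measured on the UTF-8 encoded bytes.
    if len(content.encode("utf-8")) > MAX_CONTENT_BYTES:
        raise ValueError("content exceeds 64 KB")

    minutes = body.get("expiration_in_minutes")
    if minutes is None:
        return None  # assumption: no expiration means the paste never expires
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(minutes, bool) or not isinstance(minutes, int) or minutes <= 0:
        raise ValueError("expiration_in_minutes must be a positive integer")
    return now + timedelta(minutes=minutes)


now = datetime(2025, 7, 15, 12, 0, tzinfo=timezone.utc)
expires_at = validate_create_request(
    {"expiration_in_minutes": 10, "content": "hello"}, now
)
print(expires_at.isoformat())
```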
GET /v1/pastes/{paste_id}
Returns the pastebin information (creation date, expiration date and content)
GET /v1/pastes?limit=nn&offset=nn
Returns the list of pastebins for the logged-in user.
Optional URL parameters define the maximum number of results (limit) and the offset (offset).
For performance reasons, 50 is a hard maximum for limit, and the default is 10.
Example response
{
"pagination": {
"total_items": 135,
"limit": 25,
"offset": 50
},
"data": [
{
"_id": "x7y8z9w0",
"created_at": "2025-07-15T12:00:00Z",
"expires_at": "2025-07-15T13:00:00Z"
}
]
}
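The limit and offset handling could look like the following sketch. It assumes that a limit above the hard maximum is clamped rather than rejected, which the text does not specify; the function name is hypothetical and a plain list stands in for the query result.

```python
DEFAULT_LIMIT = 10
MAX_LIMIT = 50


def paginate(items, limit=None, offset=None):
    """Build a response shaped like the example, applying default and maximum limit."""
    # Assumption: out-of-range values are clamped instead of returning an error.
    limit = DEFAULT_LIMIT if limit is None else max(1, min(limit, MAX_LIMIT))
    offset = 0 if offset is None else max(0, offset)
    return {
        "pagination": {"total_items": len(items), "limit": limit, "offset": offset},
        "data": items[offset:offset + limit],
    }


pastes = [{"_id": f"paste-{i}"} for i in range(135)]
page = paginate(pastes, limit=25, offset=50)
print(page["pagination"], len(page["data"]))
```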
DELETE /v1/pastes/{paste_id}
Deletes a pastebin if it belongs to the currently logged-in user
Common response codes:
- 200 OK.
- 201 Created: if the paste has been created.
- 204 No Content: for successful delete.
- 404 Not found: if a paste doesn't exist.
- 401 Unauthorized: if the user is not logged in.
- 403 Forbidden: if the user is not the owner.
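To make the mapping concrete for DELETE, the sketch below picks a status code from the caller and the stored paste. The order of the checks (authentication, then existence, then ownership) is an assumption, and the function name is hypothetical.

```python
def delete_status(user_id, paste):
    """Return the HTTP status for DELETE /v1/pastes/{paste_id}.

    user_id is None when the caller is not logged in;
    paste is None when no paste exists for the requested id.
    """
    if user_id is None:
        return 401  # Unauthorized: not logged in
    if paste is None:
        return 404  # Not found
    if paste["owner_id"] != user_id:
        return 403  # Forbidden: caller is not the owner
    return 204  # No Content: successful delete


paste = {"_id": "x7y8z9w0", "owner_id": "user-uuid-1234"}
print(delete_status("user-uuid-1234", paste))
```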
Database design
Pastebin collection
{
"_id": ObjectId("67890abcdef1234567890abc"), // Handled by MongoDB
"owner_id": "user-uuid-1234", // The owner's identifier
"created_at": ISODate("2025-07-15T12:00:00Z"),
"expires_at": ISODate("2025-07-15T13:00:00Z"), // Can be null if it never expires
"content": "Your multi-line content goes here."
}
Indexes
- ID (_id) is the default collection index. Fast, unique lookups for a single paste (GET /v1/pastes/{paste_id}). No action needed.
- Owner and creation date: { owner_id: 1, created_at: -1 }. Compound index. Efficiently retrieves the pastes of a specific user, sorted by creation date.
- By expiration date: { expires_at: 1 } (with filter). For retrieving expired documents.
Partition key
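A hashed index on _id is used as the partition (shard) key, expecting an even distribution of data and load. Hashing matters because ObjectId values increase over time, so a range-based key on _id would direct all new inserts to a single shard.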
High-level design
Edge layer
DNS
Translates domain names to IPs.
Load balancer
Distributes requests to the ingress controller of the Kubernetes cluster. It can balance across regions for disaster recovery.
Application layer
API Gateway
Validates the JWT and redirects to IAM when login is required (for example, when posting).
Manages rate limiting / throttling.
Acts as the central request logging point.
Identity and Access Management
Implemented with Keycloak.
It can provide different flows for different clients (browser, mobile app).
Offloads user management and authentication. A dedicated, specialized application improves security and regulatory compliance.
Enables social login and identity federation with other IAM systems.
REST API
Runs as Spring Boot microservices that write pastes to and read pastes from the Pastes database.
It also uses a cache to store and retrieve last used pastebins.
Expiration service
Periodically deletes expired pastes from database and Redis.
Implemented as a Spring Batch Kubernetes CronJob.
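The core of one expiration run can be sketched as follows. This is not the Spring Batch job itself: plain dictionaries stand in for the MongoDB collection and Redis, and the function name is hypothetical.

```python
from datetime import datetime, timezone


def purge_expired(pastes, cache, now):
    """Delete pastes whose expires_at has passed from both stores; return their ids."""
    expired_ids = [
        paste_id
        for paste_id, paste in pastes.items()
        # A null expires_at means the paste never expires.
        if paste["expires_at"] is not None and paste["expires_at"] <= now
    ]
    for paste_id in expired_ids:
        del pastes[paste_id]
        cache.pop(paste_id, None)  # the paste may or may not be cached
    return expired_ids


now = datetime(2025, 7, 15, 13, 30, tzinfo=timezone.utc)
pastes = {
    "a": {"expires_at": datetime(2025, 7, 15, 13, 0, tzinfo=timezone.utc)},
    "b": {"expires_at": datetime(2025, 7, 15, 14, 0, tzinfo=timezone.utc)},
    "c": {"expires_at": None},
}
cache = {"a": "cached content", "b": "cached content"}
print(purge_expired(pastes, cache, now))
```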
Pastes database
Storage for pastes information. Implemented with MongoDB.
Cache
Stores cached pastes to reduce latency and database load. Implemented with Redis.
Kubernetes cluster
The Application layer, the Monitoring & Alerting layer, and the Logging & Analytics layer run in a Kubernetes (k8s) cluster.
For high availability, several nodes run across 3 availability zones.
For failover and disaster recovery, another cluster can be started in a different region to take over.
Telemetry Collection
Log Collector
Fluent Bit service that collects application logs and forwards them to the Logging & Analytics system.
Telemetry Collector
OpenTelemetry Collector that forwards traces to the Logging & Analytics system.
Logging & Analytics
Centralizes logs and traces to help SRE / DevOps teams analyse issues and system behaviour.
Data Prepper
Prepares log content before it is inserted into OpenSearch.
OpenSearch
Indexes logs and traces to allow easy search and analysis.
It also sends alerts to Alertmanager when specific error logs are found.
Dashboards
Provide a productive user interface with dashboards to analyse and search for logs and traces.
Monitoring & Alerting
Prometheus
Scrapes metrics from applications, and stores them in a TSDB.
Grafana
Provides dashboards for metrics visualization
Alertmanager
Handles alarms, providing the following functionality:
- Deduplication
- Grouping
- Inhibition
- Routing
- Integration with different channels (Email, Slack, ...)
Security
Some requests require authentication. It is enforced in the API Gateway, which checks that a valid JWT is present in the request, and in the REST API implementation, which checks for the appropriate scope.
Data is encrypted in transit and at rest.
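The two checks can be sketched as below. The sketch assumes the JWT signature has already been verified and decoded into a claims dictionary, and that scopes arrive as a space-separated "scope" claim, as is common for OAuth 2.0 access tokens. The scope name pastes:write and all function names are hypothetical.

```python
PUBLIC_PREFIX = "/v1/pastes/"


def requires_login(method, path):
    """Only fetching a single paste by id is public; every other call needs a login."""
    is_single_paste = path.startswith(PUBLIC_PREFIX) and len(path) > len(PUBLIC_PREFIX)
    return not (method == "GET" and is_single_paste)


def has_scope(claims, required_scope):
    """Check a space-separated 'scope' claim for the required scope."""
    return required_scope in claims.get("scope", "").split()


def authorize(method, path, claims, required_scope):
    """Return None when the request may proceed, otherwise 401 or 403."""
    if not requires_login(method, path):
        return None
    if claims is None:  # no valid JWT was presented
        return 401
    if not has_scope(claims, required_scope):
        return 403
    return None


claims = {"sub": "user-uuid-1234", "scope": "openid pastes:write"}
print(authorize("POST", "/v1/pastes", claims, "pastes:write"))
print(authorize("POST", "/v1/pastes", None, "pastes:write"))
```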
Request flows
Sequence diagrams are provided for request flows of the following APIs:
- POST /v1/pastes
- GET /v1/pastes/{paste_id}
Detailed component design
Pastes database - MongoDB
High availability and durability are achieved with 5 replica nodes distributed across 3 availability zones (2-2-1).
Scalability is achieved with partitioning.
When creating or updating a paste, strong consistency is achieved by writing to 3 nodes (a majority of the 5) before the write is acknowledged.
For reading, eventual consistency is allowed. Reads of recent pastes are served from the cache. If a paste is not cached, it is either old enough to have been replicated to the secondaries or the cache is down; in both cases a secondary replica is used to avoid overloading the primary. So the secondaryPreferred read preference with the available read concern is selected.
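The node counts can be checked with a little arithmetic. The sketch assumes that "writing to 3 nodes" corresponds to a majority of a 5-member replica set; the function names are illustrative.

```python
def majority(members):
    """Smallest number of members that forms a strict majority."""
    return members // 2 + 1


def survives_any_zone_loss(nodes_per_zone):
    """True if a majority remains after losing any single availability zone."""
    total = sum(nodes_per_zone)
    return all(total - lost >= majority(total) for lost in nodes_per_zone)


print(majority(5))                        # nodes that must acknowledge a write
print(survives_any_zone_loss([2, 2, 1]))  # the 2-2-1 layout across 3 zones
print(survives_any_zone_loss([3, 2]))     # the same 5 nodes in only 2 zones
```

Under these assumptions the 2-2-1 layout keeps a majority when any one zone fails, while placing the same 5 nodes in 2 zones does not.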
Cache - Redis
High availability and scalability are achieved with a Redis Cluster, which provides replication with failover for availability and partitioning for scalability.
The cache is used in 2 scenarios:
- Reads are usually performed a short time after writing, so a write-through pattern is used for this scenario.
- Some pastes are shared with many people, for example on social networks. To tackle the celebrity problem, a paste that is retrieved multiple times in a short period of time is also cached. A cache-aside pattern with a hit counter is used.
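Both scenarios can be sketched with dictionaries standing in for Redis and MongoDB. The class name, the threshold of 3 hits, and the db_reads counter are assumptions for illustration; a real implementation would also expire the hit counters so that only hits within a short window count.

```python
class PasteStore:
    """Write-through on create, cache-aside with a hit counter on read."""

    def __init__(self, hot_threshold=3):
        self.db = {}      # stand-in for the MongoDB collection
        self.cache = {}   # stand-in for Redis
        self.hits = {}    # database reads per paste id
        self.hot_threshold = hot_threshold
        self.db_reads = 0

    def create(self, paste_id, content):
        # Write-through: the new paste goes to the database and the cache together.
        self.db[paste_id] = content
        self.cache[paste_id] = content

    def get(self, paste_id):
        if paste_id in self.cache:
            return self.cache[paste_id]
        # Cache miss: fall back to the database and count the hit.
        self.db_reads += 1
        content = self.db.get(paste_id)
        if content is None:
            return None
        self.hits[paste_id] = self.hits.get(paste_id, 0) + 1
        if self.hits[paste_id] >= self.hot_threshold:
            self.cache[paste_id] = content  # the paste is hot, so cache it
        return content


store = PasteStore(hot_threshold=3)
store.db["viral"] = "shared widely"  # an older paste that is no longer cached
for _ in range(5):
    store.get("viral")
print(store.db_reads)
```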
Kubernetes cluster
For high availability, a 5-node cluster is spread across 3 availability zones in a region.
Trade offs/Tech choices
Database selection
The data requirements include:
- Simple queries.
- Large amount of data (> 1 TB).
- High availability preferred to strong consistency. Eventual consistency is allowed.
Taking all of this into account, a NoSQL database is selected.
MongoDB is selected because it is widely known, offers a flexible data schema, allows consistency to be tuned per request depending on the operation, and has low operational overhead. A wide-column store such as Cassandra is not required given the data volume and write throughput.
Cache updates
Synchronous writing to the cache is used instead of asynchronous processing with queues because:
- It is a simpler solution with fewer components
- Write time to Redis is short, so it does not add significant latency to user requests
Failure scenarios/bottlenecks
Pastes database
If many queries are received in a short period of time the following measures can be applied:
- Use cache-aside strategy with hit counter, as previously proposed, for caching frequently requested pastes.
- Use a circuit breaker pattern in the API implementation to avoid overloading the database, replying with an error in case the database takes too long to answer.
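A minimal circuit breaker could look like the sketch below. The thresholds, the class and exception names, and the injectable clock are assumptions; in the Spring Boot service a library would normally provide this behaviour.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, operation):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Open: fail fast without touching the database.
                raise CircuitOpenError("database circuit is open")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = operation()
        except Exception:
            self.failures += 1
            # Open on reaching the threshold, or reopen after a failed trial call.
            if self.failures >= self.failure_threshold or self.opened_at is not None:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result


def slow_query():
    raise TimeoutError("database took too long")


breaker = CircuitBreaker(failure_threshold=2, reset_timeout=10.0)
for _ in range(3):
    try:
        breaker.call(slow_query)
    except (TimeoutError, CircuitOpenError) as error:
        print(type(error).__name__)
```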
Redis cache cluster
In case of a high demand spike, Redis can become a bottleneck. In this situation new nodes for new partitions can be added. This can be performed:
- Manually, adding shards or replicas to the Redis Cluster
- Automatically, as far as possible, using Kubernetes Horizontal Pod Autoscaling to increase the number of pods and the KubeBlocks Kubernetes operator to manage resharding.
Region outage
To be able to recover from a regional outage, a disaster recovery strategy could be used:
- Use Infrastructure as Code to quickly restore the infrastructure in a different region
- Use Percona Backup for MongoDB to store backups in S3 in multiple regions
- Another option is to use MongoDB as a managed service (MongoDB Atlas)
Future improvements
Include functionality for paste visibility.
Async notifications informing users that a paste will expire soon. This could be implemented with SSE for the web application and push notifications for mobile devices.