Design Pastebin - System Design

Requirements

Functional Requirements:

Allow users to upload and store text or code snippets.
Generate a unique shareable URL for each paste.
Enable retrieval of paste content by URL.
Support expiration and TTL for pastes.
Allow paste owners or the system to delete a paste before its natural expiration.

Non-Functional Requirements:

Availability: The service must be highly available; users should always be able to read existing pastes.

Performance: Read operations must have extremely low latency.
Scalability: Must handle a high read-to-write ratio (e.g., 100:1) and massive amounts of concurrent requests.
Reliability: Uploaded pastes must not be lost before their expiration time.
Resilience & Fault Tolerance: The system must continue to operate smoothly even if individual nodes or datacenters fail.

API Design

To ensure robust data handling and address critical operational edge cases, the RESTful APIs are structured as follows:

1. CreatePaste(api_dev_key, paste_content, expiration_date, idempotency_key)

Edge cases handled (idempotency and rate limiting): The API Gateway enforces a token bucket rate limit on this endpoint to block spam and bulk-creation abuse. The idempotency_key ensures that if a client retries after a network timeout, the system recognizes the duplicate request and returns the already-created paste without performing a second write.
2. GetPaste(api_paste_key)
Returns the raw paste text. Edge case handled (rate limiting): reads are throttled with a sliding-window rate limit at the API Gateway to deter malicious scraping and data harvesting.
3. DeletePaste(api_dev_key, api_paste_key)
Deletes the paste from storage before its natural expiration and immediately invalidates its entry in the distributed cache.
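
The CreatePaste behavior described above can be sketched roughly as follows. This is a minimal in-memory illustration, not the gateway's actual implementation: the class names (TokenBucket, PasteService, RateLimited), the per-developer bucket, the default capacity and refill rate, and the use of secrets.token_urlsafe as a stand-in for a KGS key are all assumptions made for the example.

```python
import secrets
import time
from dataclasses import dataclass, field


class RateLimited(Exception):
    """Raised when a caller has no tokens left (would map to HTTP 429)."""


@dataclass
class TokenBucket:
    """Holds up to `capacity` tokens, refilled at `refill_rate` tokens per second."""
    capacity: float
    refill_rate: float
    tokens: float = field(init=False)
    last_refill: float = field(init=False)

    def __post_init__(self):
        self.tokens = self.capacity
        self.last_refill = time.monotonic()

    def allow(self, now=None):
        """Refill based on elapsed time, then try to spend one token."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PasteService:
    """Hypothetical in-memory stand-in for the gateway plus app server."""

    def __init__(self, capacity=5, refill_rate=1.0):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.buckets = {}  # api_dev_key -> TokenBucket
        self.pastes = {}   # paste_key -> paste_content
        self.seen = {}     # (api_dev_key, idempotency_key) -> paste_key

    def create_paste(self, api_dev_key, paste_content, idempotency_key):
        # Rate limit first, as the gateway sits in front of the app servers.
        bucket = self.buckets.setdefault(
            api_dev_key, TokenBucket(self.capacity, self.refill_rate))
        if not bucket.allow():
            raise RateLimited("too many CreatePaste calls")
        # A retried request returns the existing paste without a second write.
        dedupe_key = (api_dev_key, idempotency_key)
        if dedupe_key in self.seen:
            return self.seen[dedupe_key]
        paste_key = secrets.token_urlsafe(6)  # placeholder for a key from the KGS
        self.pastes[paste_key] = paste_content
        self.seen[dedupe_key] = paste_key
        return paste_key
```

Calling create_paste twice with the same api_dev_key and idempotency_key returns the same paste key and stores the content once. In this sketch a retry still consumes a token, because the rate limit is applied before the idempotency check.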

High-Level Design

The architecture relies on decoupled microservices to optimize both the read and write paths while guaranteeing data security:

1. API Gateway: Centralized entry point handling SSL termination, global rate limiting, and request routing.

2. App Servers: Horizontally scalable, stateless instances that run the core paste business logic and are equipped with circuit breakers.

3. Key Generation Service (KGS): An isolated, asynchronous service that uses a CSPRNG to pre-generate non-guessable Base62 strings and stores them in a dedicated KGS database.

4. Caching Layer: A distributed Redis cluster acting as the primary read interface and temporary fallback data source.

5. Message Queue: A distributed queue (e.g., Apache Kafka) acting as a dead-letter and retry buffer during storage outages.

6. Metadata Database: A NoSQL datastore (e.g., DynamoDB) housing paste metadata.

7. Object Storage: Cloud-native blob storage containing the raw text snippets.

Detailed Component Design

To comprehensively solve the identified architectural gaps, the core components are engineered as follows:

1. URL Guessability & Security:

Previous iterations risked predictable sequential IDs. To prevent enumeration attacks, the KGS is upgraded to use a Cryptographically Secure Pseudo-Random Number Generator (CSPRNG) whose output is encoded as Base62. Given a sufficiently long key, this makes it computationally impractical for malicious actors to guess valid URLs or scrape the system sequentially.
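
As a rough illustration of CSPRNG-backed Base62 keys, the sketch below uses Python's standard secrets module. The key length of 8 and the helper names generate_key and keyspace_bits are assumptions for the example; the design above does not fix a key length.

```python
import math
import secrets
import string

# 10 digits + 52 ASCII letters = 62 symbols.
BASE62_ALPHABET = string.digits + string.ascii_letters


def generate_key(length=8):
    """Draw each character uniformly from the Base62 alphabet using the OS CSPRNG."""
    return "".join(secrets.choice(BASE62_ALPHABET) for _ in range(length))


def keyspace_bits(length):
    """Entropy of a uniformly random Base62 key of the given length, in bits."""
    return length * math.log2(len(BASE62_ALPHABET))


if __name__ == "__main__":
    print(generate_key())               # a random 8-character key
    print(round(keyspace_bits(8), 1))   # roughly 47.6 bits for length 8
```

An 8-character key gives 62^8 (about 2.2 x 10^14) possible values. Longer keys enlarge the space an attacker must search, at the cost of longer URLs.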
2. Key Generation & Collision Avoidance:
Key collisions are handled entirely offline. The background KGS worker continually generates CSPRNG strings and attempts to insert them into the KGS Database.
The KGS Database enforces a strict UNIQUE constraint on the key column. If a collision occurs naturally, the database simply rejects the insertion. The KGS worker discards the duplicate and generates a new one. Because the App Servers only ever fetch pre-validated keys from this pool, users experience zero latency penalties from collision resolution.
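
The offline collision handling can be sketched with an in-memory SQLite table standing in for the KGS database. The table name key_pool, the used flag, and the helper names are hypothetical. The claim step shown here is only safe for a single connection; a real multi-server deployment would need an atomic claim.

```python
import secrets
import sqlite3
import string

ALPHABET = string.digits + string.ascii_letters  # Base62


def open_key_pool():
    """Create the pool table; UNIQUE makes the database reject duplicate keys."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE key_pool ("
        " key TEXT NOT NULL UNIQUE,"
        " used INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def try_insert_key(conn, key):
    """Return True if the key was stored, False if it collided with an existing one."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO key_pool (key) VALUES (?)", (key,))
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate: the worker discards it and tries again


def refill_pool(conn, target, length=8):
    """Background worker loop: keep generating until `target` new keys are stored."""
    inserted = 0
    while inserted < target:
        key = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if try_insert_key(conn, key):
            inserted += 1
    return inserted


def claim_key(conn):
    """App-server side: take one unused, already-validated key from the pool."""
    with conn:
        row = conn.execute(
            "SELECT key FROM key_pool WHERE used = 0 LIMIT 1").fetchone()
        if row is None:
            return None  # pool exhausted
        conn.execute("UPDATE key_pool SET used = 1 WHERE key = ?", (row[0],))
        return row[0]
```

Because uniqueness is checked when the worker inserts, claim_key never has to retry on a collision. That is the property the design relies on to keep collision handling off the request path.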
3. Resilience & Graceful Degradation:
To maintain high availability during partial outages (e.g., NoSQL or S3 downtime), the App Servers implement the Circuit Breaker pattern.
Fallback for Reads: If the primary datastore times out, the circuit breaker opens, and the system degrades gracefully by serving potentially stale but highly available data directly from the Redis distributed cache.
Fallback for Writes: If the storage layer fails during paste creation, the system does not drop the user's data. Instead, the raw payload is buffered in the Kafka message queue for asynchronous retry, allowing the API to return a success response immediately. The paste reaches durable storage once a retry succeeds (eventual consistency).
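
A minimal circuit breaker with the two fallbacks might look like the sketch below. The thresholds, the injectable clock, the helper names read_paste and write_paste, and the plain list and dict standing in for Kafka and Redis are all assumptions for illustration.

```python
import time


class CircuitBreaker:
    """Opens after `failure_threshold` consecutive failures; retries after `reset_timeout`."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def is_open(self):
        if self.opened_at is None:
            return False
        # After the timeout, allow one trial call through (half-open state).
        return self.clock() - self.opened_at < self.reset_timeout

    def call(self, primary, fallback):
        if self.is_open():
            return fallback()  # skip the failing dependency entirely
        try:
            result = primary()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            return fallback()
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result


def read_paste(breaker, key, datastore_get, cache):
    """Read path: serve the possibly stale cached copy when the datastore fails."""
    return breaker.call(lambda: datastore_get(key), lambda: cache.get(key))


def write_paste(breaker, key, content, datastore_put, retry_queue):
    """Write path: buffer the payload for asynchronous retry when storage fails."""
    def store():
        datastore_put(key, content)
        return "stored"

    def buffer():
        retry_queue.append((key, content))  # stand-in for publishing to Kafka
        return "queued"

    return breaker.call(store, buffer)
```

While the circuit is open, the failing datastore is not called at all, which gives it room to recover. The first call after reset_timeout acts as a probe that either closes the circuit or reopens it.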
4. Edge Case Handling (Thundering Herd Scenario):
The Threat: When the cache TTL of a highly viral paste expires, millions of concurrent read requests can miss the empty cache and hit the NoSQL database simultaneously, causing a cache stampede.
The Solution: We deploy a mutex lock (cache lock). Upon a cache miss, only the first requesting thread is permitted to query the NoSQL database. All other concurrent threads block until the lock is released. Once the first thread repopulates the Redis cache and releases the lock, the waiting threads read from the freshly warmed cache.
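
The locking scheme can be sketched in a single process with Python's threading module. The class name StampedeGuardedCache and the load_from_db callback are hypothetical, and a dictionary stands in for Redis. Across multiple app servers the lock itself would have to be distributed, which this sketch does not attempt.

```python
import threading


class StampedeGuardedCache:
    """On a miss, only one thread per key loads from the database; the rest wait."""

    def __init__(self, load_from_db):
        self._load_from_db = load_from_db
        self._cache = {}                      # stand-in for the Redis cache
        self._locks = {}                      # one mutex per paste key
        self._locks_guard = threading.Lock()  # protects the _locks dictionary

    def _lock_for(self, key):
        with self._locks_guard:
            return self._locks.setdefault(key, threading.Lock())

    def get(self, key):
        value = self._cache.get(key)
        if value is not None:
            return value  # fast path: cache hit, no locking
        with self._lock_for(key):
            # Re-check: another thread may have filled the cache while we waited.
            value = self._cache.get(key)
            if value is None:
                value = self._load_from_db(key)
                self._cache[key] = value
            return value
```

The second lookup inside the lock is what prevents the stampede. Threads that were blocked find the value already cached and never reach the database.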